[meta-virtualization] [PATCH v2] libvirtd not properly shutting down on oVirt node install
Bruce Ashfield
bruce.ashfield at gmail.com
Tue Oct 7 20:37:40 PDT 2014
On Tue, Oct 7, 2014 at 4:33 PM, Zibo Zhao <zibo.zhao at windriver.com> wrote:
> From: Prateek Kumar <prateek.kumar at windriver.com>
>
> On registering the ovirt node form the ovirt engine web
> interface the ovirt node installation goes fine but the
> node will suffer a "communication failure" and the
> status of the node will fail to go up status.
> Its due to the race condition in libvirtd shutdown where
> pid file is deleted before the real libvirtd process get
> terminated.
> The issue is fixed by making sure the libvirtd process
> is terminated before deteting the pid file and providing
"deleting"
> sufficient delay.
>
What's the difference between the two patches that came out hours apart ?
Also, can we make these lines closer to 72 chars long .. it makes the log easier
to read.
... and finally, while that is a fine description of the problem, it
is talking about
packages that we don't have in meta-virt. So I'd like the log to just
describe the
race, and the fix, and drop the oVirt parts.
> Signed-off-by: Prateek Kumar <prateek.kumar at windriver.com>
> Signed-off-by: Zibo Zhao <zibo.zhao at windriver.com>
> ---
> recipes-extended/libvirt/libvirt/libvirtd.sh | 36 +++++++++++++++++++++-------
> 1 file changed, 28 insertions(+), 8 deletions(-)
>
> diff --git a/recipes-extended/libvirt/libvirt/libvirtd.sh b/recipes-extended/libvirt/libvirt/libvirtd.sh
> index b4fa674..beb77ec 100755
> --- a/recipes-extended/libvirt/libvirt/libvirtd.sh
> +++ b/recipes-extended/libvirt/libvirt/libvirtd.sh
> @@ -9,6 +9,8 @@
>
> . /etc/init.d/functions
>
> +PID=/var/run/libvirtd.pid
> +
> if [ -f /lib/lsb/init-functions ]
> then
> . /lib/lsb/init-functions
> @@ -72,24 +74,42 @@ fi
>
> case "$1" in
> start)
> - if [ -e /var/run/libvirtd.pid ]; then
> - if [ -d /proc/$(cat /var/run/libvirtd.pid) ]; then
> + if [ -e $PID ]; then
> + if [ -d /proc/$(cat $PID) ]; then
> echo "virtualization library already started; not starting."
> else
> - echo "Removing stale PID file /var/run/libvirtd.pid."
> - rm -f /var/run/libvirtd.pid
> + echo "Removing stale PID file $PID."
> + rm -f $PID
The switch to using $PID really looks like it should be a separate patch. Since
is has nothing to do with the race condition fix.
> fi
> fi
> log_begin_msg "Starting virtualization library daemon: libvirtd"
> start-stop-daemon -K -x /usr/bin/dnsmasq
> - start-stop-daemon --start --quiet --pidfile /var/run/libvirtd.pid --exec /usr/sbin/libvirtd -- --daemon --listen
> + start-stop-daemon --start --quiet --pidfile $PID --exec /usr/sbin/libvirtd -- --daemon --listen
> log_end_msg $?
> ;;
> stop)
> + if [ -f $PID ]; then
> + libvirtd_pid=$(cat $PID)
> + fi
> log_begin_msg "Stopping virtualization library daemon: libvirtd"
> - start-stop-daemon --stop --quiet --retry 3 --exec /usr/sbin/libvirtd --pidfile /var/run/libvirtd.pid
> - log_end_msg $?
> - rm -f /var/run/libvirtd.pid
> + start-stop-daemon --stop --quiet --retry 3 --exec /usr/sbin/libvirtd --pidfile $PID
> + CHECK=0
> + K=0
> + if [ "x$libvirtd_pid" != "x" ]; then
> + while /bin/kill -0 $libvirtd_pid >/dev/null 2>&1; do
we shouldn't hardcode /bin/kill.
Bruce
> + K=$(($K+1))
> + sleep 1
> + if [ $K -ge 60 ]; then
> + log_end_msg "Stopping libvirtd: timed out"
> + CHECK=1
> + break
> + fi
> + done
> + fi
> + if [ $CHECK -eq 0 ]; then
> + log_end_msg "Removing pid file $PID"
> + rm -f $PID
> + fi
> ;;
> restart)
> $0 stop
> --
> 1.9.1
>
> --
> _______________________________________________
> meta-virtualization mailing list
> meta-virtualization at yoctoproject.org
> https://lists.yoctoproject.org/listinfo/meta-virtualization
--
"Thou shalt not follow the NULL pointer, for chaos and madness await
thee at its end"
More information about the meta-virtualization
mailing list