[yocto] cannot re-use shared state cache between build hosts
Patrick Ohly
patrick.ohly at intel.com
Mon Sep 4 00:33:27 PDT 2017
On Fri, 2017-09-01 at 17:04 +0200, Andrea Galbusera wrote:
> Hi Maciej,
>
> On Fri, Sep 1, 2017 at 4:08 PM, Maciej Borzęcki <maciej.borzecki at rndity.com> wrote:
> > On Fri, Sep 1, 2017 at 3:54 PM, Andrea Galbusera <gizero at gmail.com> wrote:
> > > Hi!
> > >
> > > I was trying to share sstate between different hosts, but the consumer
> > > build system seems to be unable to re-use any sstate object. My
> > > scenario is set up as follows:
> > >
> > > * The cache was populated by a pristine qemux86 core-image-minimal
> > >   build of morty. This was done in a crops/poky container (running in
> > >   docker on Mac)
> > > * The cache was then served via HTTP
> >
> > Make sure that you use a decent HTTP server. Simple `python3 -m
> > http.server` will quickly choke when the mirror is being checked.
> > Also running bitbake -DDD -v makes investigating this much easier.
>
> To be honest, the current server was indeed set up with python's
> SimpleHTTPServer... As you suggest, I checked the verbose debug log
> and noticed what's happening behind the apparently happy "Checking
> sstate mirror object availability" step. After a first "SState:
> Successful fetch test for" that I see correctly served with 200 on
> the server side, tests for any other sstate object suddenly and
> systematically fail with logs like this:
...
> DEBUG: checkstatus() urlopen failed: <urlopen error [Errno 9] Bad
> file descriptor>

More recent bitbake should not fail like that anymore. It's still
better to use an HTTP server that performs better, though.

commit 6fa07752bbd3ac345cd8617da49a70e0b2dd565f
Author: Patrick Ohly <patrick.ohly at intel.com>
Date:   Mon Jul 17 15:25:10 2017 +0200

    fetch2/wget.py: improve error handling during sstate check

    When the sstate is accessed via HTTP, the existence check can fail due
    to network issues, in which case bitbake silently continues without
    sstate.

    One such network issue is an HTTP server like Python's own SimpleHTTP
    which closes the TCP connection despite an explicit "Keep-Alive" in
    the HTTP request header. The server does that without a "close" in the
    HTTP response header, so the socket remains in the connection cache,
    leading to "urlopen failed: <urlopen error [Errno 9] Bad file
    descriptor>" (only visible in "bitbake -D -D" output) when trying to
    use the cached connection again.

    The connection might also get closed for other reasons (proxy,
    timeouts, etc.), so this is something that the client should be able
    to handle.

    This is achieved by checking for the error, removing the bad
    connection, and letting the check_status() method try again with a new
    connection. It is necessary to let the second attempt fail
    permanently, because bad proxy setups have been observed to also lead
    to such broken connections. In that case, we need to abort for real
    after trying twice, otherwise a build would just hang forever.

    [YOCTO #11782]
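
For readers who want to see the shape of that retry logic, here is a
minimal sketch in Python. It is not the code from fetch2/wget.py: the
ConnectionCache class, the head() method on connection objects, and the
max_attempts parameter are all hypothetical names made up for this
illustration. It only shows the idea described in the commit message:
detect the "Bad file descriptor" error, drop the cached connection, try
once more with a new one, and then give up.

```python
import errno


class ConnectionCache:
    """Toy stand-in for a keep-alive connection cache (hypothetical)."""

    def __init__(self, connect):
        self._connect = connect  # callable: host -> connection object
        self._cache = {}

    def get(self, host):
        # Reuse a cached connection if there is one, else open a new one.
        if host not in self._cache:
            self._cache[host] = self._connect(host)
        return self._cache[host]

    def remove(self, host):
        self._cache.pop(host, None)


def check_status(cache, host, path, max_attempts=2):
    """Return True if the object exists, retrying on a stale socket.

    Assumes connection objects offer head(path) returning an HTTP
    status code (an illustrative interface, not a real library API).
    """
    for _ in range(max_attempts):
        conn = cache.get(host)
        try:
            return conn.head(path) == 200
        except OSError as err:
            if err.errno != errno.EBADF:
                raise  # some other failure: not handled in this sketch
            # The peer closed the socket without telling us. Drop the
            # dead connection so the next attempt opens a fresh one.
            cache.remove(host)
    # Every attempt hit a broken connection (e.g. a bad proxy).
    # Fail for real instead of looping forever.
    return False
```

The attempt limit is the important design choice here: a single retry
covers the server that silently closes a keep-alive connection, while the
hard stop keeps a permanently broken path from hanging the build.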
--
Best Regards, Patrick Ohly
The content of this message is my personal opinion only and although
I am an employee of Intel, the statements I make here in no way
represent Intel's position on the issue, nor am I authorized to speak
on behalf of Intel on this matter.