[yocto] [pseudo] Pseudo 1.8+ xattr sqlite corruption

Wed Sep 19 09:25:15 PDT 2018

> On Wed, 19 Sep 2018 12:33:37 +0100
> "Burton, Ross" <ross.burton at intel.com> wrote:
> 
> > On Tue, 18 Sep 2018 at 22:21, Seebs <seebs at seebs.net> wrote:
> > > > Are the databases supposed to be shareable between different build
> > > > machines? IIRC, the answer is no. Could you store the native inode
> > > > type as a sqlite BLOB? Not necessarily a good idea.... Just an
> > > > idea.
> > >
> > > I think coercing the values into range is probably safer. It should
> > > be trivial enough...
> > 
> > That is an excellent catch and I'm hopeful that this explains the
> > failures in glibc-locales too that I still see occasionally.
> > 
> > Is anyone actually writing a patch?
> 
> I can try to get a proposed patch out sometime soon, I don't have an
> easy way to check it.
> 
> -s

You can send the patch to me; the problem is easy to reproduce here.

I think coercing the values is a good path. When the inodes are high, they are usually all high, except in the rare case where the inode values fall just on either side of the signed limit.  That happened once, and it was the final proof of why this was happening.
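
To illustrate the "either side of the signed limit" case, here is a minimal sketch of what happens when an inode number is reinterpreted as a signed integer. It assumes a 32-bit signed field purely for illustration; the actual width used in the database is not stated in this thread, and to_signed / to_unsigned are hypothetical helper names, not functions from pseudo.

```python
def to_signed(value: int, bits: int = 32) -> int:
    """Reinterpret a non-negative integer as a two's-complement signed value.

    ASSUMPTION: bits=32 is chosen only for illustration; the real field
    width is not given in the thread.
    """
    mask = (1 << bits) - 1
    value &= mask                      # keep only the low `bits` bits
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value


def to_unsigned(value: int, bits: int = 32) -> int:
    """Inverse mapping: recover the original low `bits` bits."""
    return value & ((1 << bits) - 1)


limit = 2**31 - 1                      # largest signed 32-bit value
just_below = limit                     # stays positive when coerced
just_above = limit + 1                 # wraps to a negative value

print(just_below, "->", to_signed(just_below))
print(just_above, "->", to_signed(just_above))
```

Two inodes that differ by one end up with opposite signs, which is why a directory whose inodes straddle the limit behaves differently from one where they are all high. As long as the coercion is applied consistently on both store and lookup, the mapping is reversible within the chosen width.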

A little more background: we have two build environments, the developer workstations and the automated build cluster.  The developers never saw this because their machines are stand-alone boxes with typical 4 TB hard drives and maximum inode counts of around a couple of million. Our automated build system is a set of virtual machines that are set up and torn down with each build.  What we noticed, though, is that each newly created builder gets a new starting inode value, incremented from the previous ones.  This is why completely restarting the build cluster 'fixed' the issue for a while: it reset the inode numbering. So each builder is assigned a new, non-overlapping block of inodes, e.g. 1 - 1M, 1M+1 - 2M, 2M+1 - 3M, etc.  Eventually this climbs into the range above 1.5B and the problems begin. The build managers are investigating whether the cluster config offers a way to cap the inode upper limit and force the numbers to wrap around sooner.
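
The block numbering described above can be sketched as simple arithmetic. This is only an illustration: the 1M block size comes from the "e.g." figures in this message, the 1.5B threshold is the range where we started seeing problems, and inode_block / first_builder_reaching are hypothetical names, not anything from the cluster software.

```python
BLOCK_SIZE = 1_000_000  # ASSUMPTION: taken from the "e.g." blocks above


def inode_block(builder_index: int, block_size: int = BLOCK_SIZE) -> tuple[int, int]:
    """Return the (first, last) inode, inclusive, for a 1-based builder index."""
    first = (builder_index - 1) * block_size + 1
    last = builder_index * block_size
    return first, last


def first_builder_reaching(threshold: int, block_size: int = BLOCK_SIZE) -> int:
    """Smallest 1-based builder index whose block contains an inode >= threshold."""
    return -(-threshold // block_size)  # ceiling division


print(inode_block(1))
print(inode_block(3))
print(first_builder_reaching(1_500_000_000))
```

With these illustrative numbers, the cluster would have to create on the order of 1500 builders since the last full restart before any builder is handed inodes in the problem range, which fits the pattern of the issue going away after a restart and coming back later.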

One thing I am curious about is that Pseudo 1.6.x never gave us this problem. Was the reference inside the database different? Or maybe it's just a case of never hitting the issue.

Thanks,

Jack Fewx
jack.fewx at dell.com