[yocto] configure optimization feature update

Wed Jun 15 17:57:50 PDT 2011

Hi Richard,

Recently I have been working on the "configure optimization" feature and collecting data for it.

The main logic of this feature is straightforward:

1. Use a diff file as the autoreconf cache. (I generate it with "diff -ruN SOURCE-ORIG SOURCE", where "SOURCE-ORIG" is the source directory before running autoreconf and "SOURCE" is the same directory after running autoreconf.)
2. Add SRC_URI checksums for all patches of the source code.
3. Tag each autoreconf cache file with ${PN} and the SRC_URI checksums of the source code and all its patches.
4. If the current SRC_URI checksums match the cached checksums, we can apply the cached diff instead of running the "autoreconf" stage.
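
To illustrate steps 3 and 4, here is a minimal Python sketch of the tagging and lookup. It is not the actual implementation: the function names cache_tag and can_use_cache, the use of SHA-256 to combine the checksums, and the tag layout are all assumptions made for the example.

```python
import hashlib

def cache_tag(pn, checksums):
    """Build a tag from the recipe name and the checksums of the
    source and its patches (hypothetical layout: "<PN>-<digest>")."""
    h = hashlib.sha256()
    for c in checksums:              # order matters: patches apply in sequence
        h.update(c.encode("ascii"))
        h.update(b"\n")              # separator so "ab","c" differs from "a","bc"
    return f"{pn}-{h.hexdigest()}"

def can_use_cache(pn, current_checksums, cached_tags):
    """Return True if a cached autoreconf diff exists for exactly
    this recipe name and this set of checksums."""
    return cache_tag(pn, current_checksums) in cached_tags
```

If any single checksum changes, or the patch order changes, the tag no longer matches and autoreconf has to run as usual.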

I ran some tests with a sato build, and the results are not as good as we expected:

On a server build machine (Genuine Intel(R) CPU @ 2.40GHz, 2 sockets with 6 cores each and hyperthreading, thus 24 logical CPUs in all, 66G memory):

w/o the optimization:
real    83m40.963s
user    496m58.550s
sys     329m1.590s

w/ the optimization:
real    79m1.062s
user    460m58.600s
sys     347m42.120s

This is a gain of about 5.6% in wall-clock ("real") time.

I also tested the patch on a desktop Core i7 machine (Intel(R) Core(TM) i7 CPU 870 @ 2.93GHz, 4 cores, 8 logical CPUs, 4G memory):

w/o the optimization:
real    105m25.436s
user    372m48.040s
sys     51m23.950s

w/ the optimization:
real    103m38.314s
user    332m35.770s
sys     49m4.520s

This is a gain of only about 1.7% in wall-clock ("real") time.
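
For reference, the wall-clock gains can be recomputed from the "real" times reported above. The short Python sketch below does this; the helper names to_seconds and gain_percent are made up for the example.

```python
def to_seconds(t):
    """Convert a `time`-style string such as '83m40.963s' to seconds."""
    minutes, seconds = t.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)

def gain_percent(before, after):
    """Relative reduction in wall-clock time, in percent."""
    b, a = to_seconds(before), to_seconds(after)
    return (b - a) / b * 100

# "real" times from the two measurements above
server = gain_percent("83m40.963s", "79m1.062s")
desktop = gain_percent("105m25.436s", "103m38.314s")
print(f"server: {server:.1f}%  desktop: {desktop:.1f}%")
# prints: server: 5.6%  desktop: 1.7%
```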

The result is not encouraging.

There are also some other things we need to take into consideration for this feature:

1. If we add this feature, the first build will take longer than it does now, since it has to build the autoreconf cache.
2. Maintainers will need to maintain SRC_URI checksums not only for the source code but also for all of its patches. Some recipes have more than 20 patches, which means significant maintenance effort.
3. Distributing the caches will be a problem. The total size of the caches is about 900M before compression and 200M after compression. Since that is not small, distributing them with the Poky source code doesn't make sense. Alternatively, we could use a mechanism like "sstate". But since we already have the sstate cache, I don't think it is worth enabling another similar cache mechanism for so little improvement.

Therefore my opinion is that we should give up on this feature. What are your comments and suggestions?

Thanks,
Dongxiao