[yocto] [Openembedded-architecture] Proposal: dealing with language-specific build tools/dependency management tools

Paul Eggleton paul.eggleton at linux.intel.com
Mon Mar 13 13:58:42 PDT 2017


Alex - thanks for kicking off the discussion here, this is something we 
definitely need to get a better handle on.

My involvement with this is perhaps somewhat accidental - I ended up working 
on the fetcher that was implemented by someone else because I needed it to 
work for devtool integration, and got sucked into fixing a number of issues I 
found in the process. Unfortunately - other than python, naturally - I'm not 
that familiar with the actual languages or their package managers; I've only 
come into contact with node.js and npm through the work that I've done with 
the build system.

This is a tough problem to solve. Maybe some of the other language package 
managers are more cooperative, I'm not sure, but npm *really* doesn't want to 
be used in the manner we're trying to use it - it insists on being able to go 
out to the registry and things get a bit ugly if you tell it not to. 
RSS-related issues aside (since I kind of have a path to fixing those), the 
latest pain was in 90cb980a1c49de99a0aec00c0cd5fc1e165490a7 when we shifted 
from a cp -a to a second invocation of npm install in order to get a more 
accurate install step. The side effect is that we broke that step for certain 
modules, as certain directives in the package.json file trigger a query to 
the registry, but we've told npm not to do that so it errors out.

Another thing that is still a stumbling block for the "one package represents 
a tree of modules" approach, and that I've yet to properly resolve, is how to 
deal with a "partial" tree, where you have one or more modules that you want 
to actually work on, i.e. you have another recipe to satisfy the dependency, 
possibly under the control of devtool in your workspace. npm doesn't seem to 
provide any specific mechanism to help with this either. At the moment I'm 
not sure 
there's a solution with npm other than hacking the package.json file.
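
To make the "hacking the package.json file" idea concrete, here is a minimal 
sketch of one form it could take. It assumes npm's local-path dependency 
syntax ("file:" followed by a directory) is acceptable in our offline setup, 
which has not been verified here; the function name and the example paths are 
hypothetical, and only direct dependencies are handled.

```python
import json


def redirect_dependency(package_json_text, module_name, local_path):
    """Return package.json text with one direct dependency pointed at a
    local source tree (hypothetical helper, not existing devtool code)."""
    data = json.loads(package_json_text)
    deps = data.get("dependencies", {})
    if module_name not in deps:
        raise KeyError(f"{module_name} is not a direct dependency")
    # Assumption: npm resolves "file:<dir>" specs without contacting the registry.
    deps[module_name] = "file:" + local_path
    return json.dumps(data, indent=2) + "\n"


if __name__ == "__main__":
    original = '{"name": "app", "dependencies": {"left-pad": "^1.1.0"}}'
    print(redirect_dependency(original, "left-pad", "/path/to/workspace/left-pad"))
```

Modules further down the tree would need the same treatment in their own 
package.json files, which is part of why this remains a hack.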

Since it was brought up by Trevor a while back, I still have a todo item to go 
and look at yarn [1] to see if it really solves some of these issues for 
node.js in a less nasty way than npm.

One thing I will say - I really want to see the fetchers (or at least, the 
custom bits) for these language package managers implemented in the metadata 
rather than in bitbake. This stuff moves fast, there are a growing number of 
these package managers, and it's awkward to have part of the implementation in 
one place and a significant portion of the rest of it (class and supporting 
recipes) in another, without which the fetcher is useless. However, at the 
same time we want to make sure we don't lose the ability to have mirror 
tarballs which are implemented on the bitbake side.

With regard to the recipes generated by devtool, they do end up being a bit 
ugly because we package each underlying npm package individually. The reason I 
did it that way is to have each source and therefore each license represented 
in the image manifests. I am open to having a mode where we have it all in one 
package, though, but it seems to me that LICENSE must include all the licenses 
at parse time. I'm open to suggestions. Perhaps we could save the package.json 
file next to the recipe - is that going to be practical? It might be a bit 
easier to update at least. I agree that in addition to providing the lockdown 
we do definitely need to make updating these recipes easier, it's a bit 
awkward right now.
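
As a rough illustration of the single-package mode, the sketch below builds 
one LICENSE value from the recipe's own license plus the licenses of the 
bundled modules, joined with "&" as is usual for multiply-licensed recipes. 
It assumes the per-module license strings are already available at parse time 
(for example from a file saved next to the recipe); the function name is 
hypothetical.

```python
def combined_license(own_license, dependency_licenses):
    """Join the recipe license and its dependencies' licenses into one
    LICENSE-style string, dropping duplicates but keeping first-seen order."""
    seen = []
    for lic in [own_license] + list(dependency_licenses):
        lic = lic.strip()
        if not lic:
            continue
        # Parenthesise compound expressions so the outer '&' stays unambiguous.
        if "|" in lic or "&" in lic:
            lic = "(" + lic + ")"
        if lic not in seen:
            seen.append(lic)
    return " & ".join(seen)


if __name__ == "__main__":
    print(combined_license("MIT", ["ISC", "MIT", "BSD-3-Clause"]))
```

This only covers the LICENSE string; LIC_FILES_CHKSUM would need equivalent 
handling.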

Cheers,
Paul

[1] https://yarnpkg.com/

On Saturday, 11 March 2017 2:49:01 AM NZDT Alexander Kanavin wrote:
> Hello all,
> 
> *Introduction*
> 
> The new generation of programming languages (think node.js, Go, Rust) is
> a poor fit for the Yocto build model which follows the traditional Unix
> model. In particular, those new development environments have no problem
> with 'grabbing random stuff from the Internet' as a part of development
> and build process. However, Yocto has very strict rules about the build
> steps and what they can and cannot do, and also a strict enforcement of
> license and version checks for every component that gets built. Those
> two models clash, and this is a proposal of how they could be reconciled.
> 
> I'll also send a separate email that talks specifically about the MEAN
> stack and how it could be supported in Yocto - take it as a specific example
> for all of the below.
> 
> *Background*
> 
> The traditional development model on Unix clearly separates installation
> of dependencies needed to develop a project from the development process
> itself. Typically, when one wants to build some project, first the
> project README needs to be inspected, and any required dependencies
> installed system-wide using the distribution package management's tool.
> When those dependencies change, usually this manifests itself in a
> previously unseen build failure which is again manually resolved by
> figuring out the missing dependency and installing it. This can be
> awkward, but it's how things have been done for decades, and Yocto's
> build system (with separate steps for fetching, unpacking, building,
> packaging etc.) is built around the assumption that most software can be
> built this way.
> 
> Unfortunately, this situation is changing. The new development
> environments, such as Go, Rust or node.js see this approach as
> cumbersome and getting-in-the-way for developers. They want projects'
> setup to be as quick and automatic as possible - and it should also be
> cross-platform. So each such environment comes with a specialized tool
> which handles installation of dependencies and bypasses the distribution
> package management altogether. Typically these dependencies are fetched
> from the Internet and installed into the project tree. The details are
> hidden; it's assumed that developers don't want to know or care. In
> particular, specific versions of dependencies can be only weakly
> specified or ignored altogether (that is, the latest commit is always
> fetched), licensing is totally overlooked, a list of what was installed
> cannot be trivially obtained, and repeating the procedure the next day
> may result in a different set of code being pulled in, because someone
> somewhere added a commit to their github repo.
> 
> This does not work well in the Yocto context. The Yocto Project prides
> itself on being specific and exact about what gets built, how it gets built
> and what license is attached to each component. So we need to somehow
> enforce that with the new model, and avoid the situation where separate,
> incompatible, and difficult to grasp solutions are developed for each
> language environment.
> 
> *Design considerations*
> 
> 1. I would like recipes to remain short and sweet and clear. In
> particular, node.js projects can pull in hundreds of dependencies; I
> want to keep their metadata out of the recipe and somewhere else, for
> readability, clarity, and maintainability.
> 
> 2. I don't want to implement custom fetchers, or otherwise re-implement
> (poorly) those language-specific build and dependency management tools.
> Let's use npm, cargo and go as much as we possibly can and let them do
> their job - yes, that also includes them fetching things from the
> internet for us.
> 
> 3. When things need to be updated to a new version, manual editing of
> metadata should be avoided: when there are hundreds of dependencies, a
> tool should modify the metadata, and a human should only inspect the changes.
> 
> *How do we deal with this?*
> 
> By introducing a lockdown file that lives next to the recipe. The
> concept is already implemented in npm, but needs to be made generic and
> come with a common API that uses the file to verify the build.
> 
> *What is a lockdown file?*
> 
> The file captures all of the recipe dependencies that are pulled in by
> the build tool. For each such dependency the following information is
> provided (this is really similar to what is in recipes, and that is on
> purpose):
> 
> - name
> - description (optional)
> - verification data (this is specific to each language, but can be
> version, git commit id, a checksum and so on). The only requirement is
> that it maps to a unique set of code.
> - license string
> - license file path
> - license checksum
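
A minimal sketch of how one such entry could be represented and loaded, 
assuming a JSON file with a top-level "dependencies" list. The proposal does 
not define a file format, so the format, the field names and the function 
name here are all hypothetical.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class LockdownEntry:
    """One dependency as listed in the proposal (field names are assumed)."""
    name: str
    verification: str      # version, git commit id, checksum, ...
    license: str
    license_file: str
    license_checksum: str
    description: str = ""  # optional


def load_lockdown(text):
    """Parse lockdown JSON into a dict keyed by dependency name."""
    entries = {}
    for raw in json.loads(text)["dependencies"]:
        entry = LockdownEntry(**raw)
        if entry.name in entries:
            raise ValueError(f"duplicate lockdown entry: {entry.name}")
        entries[entry.name] = entry
    return entries
```

Keying on the name alone assumes only one version of each dependency is 
present, which does not hold for every npm tree.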
> 
> *How is the lockdown file used?*
> 
> 1. It needs to be generated in the first place when adding a new recipe.
> For example:
> 
> bitbake -c generate_lockdown recipe
> 
> would fetch and unpack the recipe code, then run npm/cargo/go to pull in
> the dependencies, then walk the project tree and generate the lockdown
> metadata. Sometimes the tools can help here somewhat, but other times
> they can be used only for fetching, and verification data has to be
> figured out by inspecting the tree with our custom-written code. This is
> the hard part that we have to deal with.
> 
> 2. It can be used to perform a 'loose' build of the recipe that does not
> guarantee reproducibility.
> 
> We have to accept this: some projects just don't care about it, and
> offer no support to those who want reproducibility. We should at least
> provide a way to build such projects in Yocto. The information in the
> lockdown file is not enforced; it's merely compared against the actual
> build and any differences presented to the user as warnings. This is a
> recipe setting.
> 
> 3. It can also be used to perform a 'strict' build of the recipe that
> enforces what is in the lockdown file.
> 
> The information in the lockdown file is given to the language-specific
> tool to help it fetch the right things (whenever the tool makes it
> possible), and then is used to compare to what was fetched, but this
> time any mismatches stop the build. Exactly how this happens is specific
> to each language, and again, it is the hard bit that we need to deal with.
> 
> 4. When a recipe is updated to a new version, the lockdown file needs to
> be updated as well.
> 
> One possibility is to generate a new lockdown file (as in point 1), and
> then a human can compare that against the old lockdown file.
> 
> bitbake -c update_lockdown recipe
> 
> 5. Packaging
> 
> Go by default compiles everything into a static executable, so there
> are no separate packages. All dependencies' licenses should be rolled
> into the package: the lockdown file tells what they are and where they
> are in the build tree.
> 
> Other environments do install the dependencies somewhere in the system,
> so those should be packaged separately: the lockdown file is used to get
> a list of them and attach licenses to them. Installation paths (things
> that FILES_ is set to) should typically be easy to figure out from
> dependency names.
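
The 'loose' and 'strict' modes in points 2 and 3 could share one comparison 
step. The sketch below assumes both the lockdown file and the fetched tree 
have already been reduced to name -> verification-data mappings, which is the 
language-specific hard part and is not shown; the function and exception 
names are hypothetical.

```python
class LockdownMismatch(Exception):
    """Raised in strict mode when the fetched tree differs from the lockdown."""


def compare_with_lockdown(locked, fetched, strict):
    """Compare two {name: verification_data} dicts.

    Loose mode returns the differences so the caller can print warnings;
    strict mode raises LockdownMismatch if there are any differences.
    """
    problems = []
    for name in sorted(set(locked) | set(fetched)):
        if name not in fetched:
            problems.append(f"{name}: listed in lockdown file but not fetched")
        elif name not in locked:
            problems.append(f"{name}: fetched but missing from lockdown file")
        elif locked[name] != fetched[name]:
            problems.append(
                f"{name}: expected {locked[name]}, got {fetched[name]}")
    if strict and problems:
        raise LockdownMismatch("\n".join(problems))
    return problems
```

A recipe setting would then only need to choose the value of strict.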
> 
> *Conclusion*
> 
> This is only a preliminary idea: I understand that the devil is in the
> details, and there are plenty of details where things may not work out
> as planned, or there's something else I didn't think of that should be
> accounted for. So flame away!
> 


-- 

Paul Eggleton
Intel Open Source Technology Centre
