[Automated-testing] Farming together - areas of collaboration

Wed Nov 15 14:55:06 PST 2017

> -----Original Message-----
> From: Jan Lübbe [mailto:jlu at pengutronix.de]
> To: Bird, Timothy <Tim.Bird at sony.com>; Kieran Bingham
> > > From: Kieran Bingham on Saturday, November 11, 2017 7:37 AM
> [...]
> > > That is to say - for me - I envisage that a developer should be able
> > > to 'lock' a board, take control of it, and use it and then release
> > > it. (Or a timeout would release the lock as well perhaps)
> >
> > I agree with this concept.  'ttc' has board reservation.  (I'm not
> > saying this to promote 'ttc', but rather to indicate that it was one of
> > the things we added based on our usage models for board farms at
> > Sony.)  I'm not sure where board reservation fits in the layer model -
> > probably at the "cow" layer.  (BTW - the layers will need non-analogy
> > names at some point.)

Yes, agreed.

> 
> > > An automated test system - should then simply be another 'developer'
> > > who just happens to be an automated build and test system. Whether a
> > > real person, or a build pc - the access should be much the same.
> > > (albeit perhaps with their own unique access keys and permissions of
> > > course)
> >
> > I think this makes the model of managing the board for multiple use
> > cases simpler.  If there are differences between human/board
> > interaction and automated-system/board interaction, it would be good
> > to identify those.
> 
> I think the main difference between interactive (during development) and CI
> use is that the CI can wait until the board is free again.
> 
> The "Cow" layer shouldn't handle scheduling like Jenkins or Lava do it (because
> that would cause problems when using those as the top layer).
> We can probably get away with just reservation/locking on the Cow
> layer:
> For Jenkins you could have a pipeline job first run the build, then reserve a
> board and wait in a lightweight executor until it's available and then continue
> with running the tests. That way, the test results would be attached directly to
> the build (and this way to the original SCM changes).

Yes, I think I'm aligned on this as well. I don't see any difference between a real user and an automated test, apart from the reservation issues.

In our farm we've had some difficulty in this area, including: developers not 'releasing' or unlocking their board; broken Jenkins tests not unlocking a board; and long-running Jenkins tests keeping a board locked into the morning, leaving the developer unsure how long they'll need to wait to use it.
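
To illustrate the "broken test never unlocks the board" case: one common approach is to tie the release to the job's exit path, so that it runs whether the tests pass, fail, or crash. The following is a minimal Python sketch under that assumption. The names `reserved`, `run_job`, `acquire` and `release` are hypothetical and are not part of ebfarm, ttc, labgrid or Jenkins; `acquire` and `release` stand in for whatever the farm tool provides.

```python
from contextlib import contextmanager


@contextmanager
def reserved(board, user, acquire, release):
    """Hold `board` for the duration of a with-block, releasing it even on failure."""
    acquire(board, user)          # hypothetical farm call; may block or raise
    try:
        yield board
    finally:
        release(board, user)      # runs on success and on any exception


def run_job(board, user, acquire, release, tests):
    """Run each test callable against the board; return a list of (name, passed)."""
    results = []
    with reserved(board, user, acquire, release):
        for test in tests:
            try:
                test(board)
                results.append((test.__name__, True))
            except AssertionError:
                # A failed check is a test result, not a reason to abort the job.
                results.append((test.__name__, False))
    return results
```

This does not help if the job process is killed outright, which is where a lock timeout would still be needed.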

We've worked around this to some extent with a 'force use' (where we can steal a board from someone) and the ability to run 'ebfarm use wait', which blocks until the board is released (whilst sending a friendly email to the current user of the board telling them their board is in contention). We've never worked on any reservation system or scheduling.

I'm not sure which layer this fits in either. Probably the Cow? I guess the same layer that deals with users/authentication/etc.
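
For comparison across tools, the locking behaviour discussed above (lock, release, timeout, 'force use', and a blocking wait that notifies the current holder) can be sketched in a few dozen lines of Python. This is only an in-memory illustration under stated assumptions, not how ebfarm, ttc or labgrid implement it; the class and method names are hypothetical, and `notify` stands in for the email step.

```python
import time


class BoardBusy(Exception):
    """Raised when a board is held by someone else."""

    def __init__(self, board, holder):
        super().__init__(f"{board} is in use by {holder}")
        self.holder = holder


class BoardLocks:
    """In-memory reservation table: board name -> (owner, expiry time)."""

    def __init__(self, clock=time.monotonic, notify=print):
        self._clock = clock      # injectable so the expiry logic can be tested
        self._notify = notify    # hypothetical hook, e.g. send an email
        self._locks = {}

    def holder(self, board):
        """Return the current owner, or None if the board is free or timed out."""
        entry = self._locks.get(board)
        if entry is None:
            return None
        owner, expires = entry
        if self._clock() >= expires:
            del self._locks[board]   # stale lock: release it automatically
            return None
        return owner

    def use(self, board, user, timeout=3600.0, force=False):
        """Take the board; with force=True, steal it from the current holder."""
        current = self.holder(board)
        if current not in (None, user):
            if not force:
                raise BoardBusy(board, current)
            self._notify(f"{current}: {board} was taken over by {user}")
        self._locks[board] = (user, self._clock() + timeout)

    def release(self, board, user):
        """Release the board if this user holds it."""
        if self.holder(board) == user:
            del self._locks[board]

    def use_wait(self, board, user, timeout=3600.0, poll=1.0, sleep=time.sleep):
        """Block until the board is free; tell the holder once that it is contended."""
        told = False
        while True:
            try:
                return self.use(board, user, timeout)
            except BoardBusy as busy:
                if not told:
                    self._notify(f"{busy.holder}: {user} is waiting for {board}")
                    told = True
                sleep(poll)
```

A CI job would call `use_wait()` after its build step, while an interactive user would call `use()` and see immediately who holds the board. A shared farm would need the table to live in a service or on disk rather than in one process.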

> 
> > > My design goals here are that an individual should be able to access
> > > targets with name based resolution such as:
> > >
> > >   lab beaglebone serial # Load up a shared serial console over tmux/screen
> > >   lab beaglebone on
> > >   lab beaglebone reboot
> > >   lab beaglebone upload Image board.dtb   # Upload boot files using rsync/scp
> > >
> >
> > Just by way of comparison, ttc supports similar things:
> > Assuming a board named 'bbb' for "beagleboneblack".  I use commands
> > like the following on a daily basis:
> >  $ ttc bbb console - get serial console
> >  $ ttc bbb login - get network login (usually ssh these days)
> >  $ ttc bbb reboot - reboot board
> >  $ ttc bbb kinstall - install kernel
> >  $ ttc bbb fsinstall - install file system (or image, as appropriate)
> >  $ ttc run <command> - execute command on board
> >
> > See https://elinux.org/Ttc_Program_Usage_Guide for a list of verbs.
> > (Again - I'm not promoting ttc, just providing data for comparison with
> > other systems).
> > Having tried to push ttc in the past as an industry standard, and
> > gotten nowhere, I am perhaps more sensitive to the issues of how hard
> > it is to build an ecosystem around this stuff.
> 
> One difficulty with these verbs is that they apply to different levels:
> - Simple HW (console, reboot)
> - USB protocols (fastboot, mass storage, SD, bootloader upload)
> - Bootloader control (kernel install/upload, variables, boot device
> selection)
> - Linux shell control (login, running tests, ...)
> - Board state control (switch from linux back to bootloader, boot to a different
> redundant image, ...)
> 
> The Bootloader/Linux levels need a lot more knowledge about how each target
> works. For our use cases, this is often highly project specific. So it needs to be
> relatively simple to customize the behavior, while still making the common
> cases easy.

We once started a test library: we'd have a tcl file with functions (like wait for prompt, log in, etc.). These were pretty much standard, but could be overridden by a particular board file (at our workspace level), for example to indicate what the prompt looks like.
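
The same "standard helpers, overridden per board" idea can be sketched in Python rather than tcl. This is an illustration only, assuming a console object with `read()` and `write()`. `Board`, `MyBoard`, `FakeConsole` and the prompt patterns are hypothetical and not taken from our library or from labgrid.

```python
import re


class Board:
    """Standard helpers; a board file subclasses this and overrides what differs."""

    prompt = r"[#$] $"           # assumed default shell prompt
    login_prompt = r"login: $"
    user = "root"

    def __init__(self, console):
        self.console = console   # any object with read() -> str and write(str)
        self.buffer = ""

    def wait_for(self, pattern, max_reads=100):
        """Read from the console until the buffered output matches pattern."""
        for _ in range(max_reads):
            if re.search(pattern, self.buffer):
                self.buffer = ""
                return True
            self.buffer += self.console.read()
        raise TimeoutError(f"no match for {pattern!r}")

    def wait_for_prompt(self):
        return self.wait_for(self.prompt)

    def login(self):
        self.wait_for(self.login_prompt)
        self.console.write(self.user + "\n")
        return self.wait_for_prompt()


class MyBoard(Board):
    """Hypothetical board file: only the prompt differs from the defaults."""

    prompt = r"myboard> $"


class FakeConsole:
    """Scripted stand-in for a serial console, for trying the helpers offline."""

    def __init__(self, chunks):
        self.chunks = list(chunks)
        self.sent = []

    def read(self):
        return self.chunks.pop(0) if self.chunks else ""

    def write(self, data):
        self.sent.append(data)
```

For example, `MyBoard(FakeConsole(["login: ", "\nmyboard> "])).login()` succeeds, whereas the base `Board` would time out on the same output because its default prompt pattern never matches.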

> 
> > Maybe I should try to convert my lab over to labgrid, and see if it
> > handles the things I want to do (and if not send some patches upstream).
> > For some reason, I could never muster the energy to do this with LAVA,
> > but maybe converting to labgrid wouldn't involve as much perceived pain.
> >
> > Is anyone on this list using labgrid (or especially if they've
> > converted over from something else), and can tell me how painful it was?

Jan is a contributor : )

> 
> I suspect there aren't many people who have tried it already. Also the docs don't
> really have a step-by-step guide how to set up the remote infrastructure. There
> are probably also some traps during setup which are not obvious to me. ;)
> 
> Tim, if you could describe your test setup (which boards, how they are
> connected/powered/…, where they are connected to, what your first use case
> is), I'd write that guide for your case. Later I could generalize it and move it to
> the docs as a guide.
> 
> [...]
> > It would be good if things were discoverable, inspectable, etc.
> > That's one thing that I never finished with ttc - was a web front-end.
> > My question would be if there are existing front-ends that this could
> > plug in to (Jenkins, kernelCI, LAVA - do they all have "board
> > management" screens?)
> 
> LAVA is the only one of those that has a real per-board management UI.
> Kernel-CI shows results filtered by board, but the config comes from text files.
> Jenkins by itself only knows about "build slaves", which don't map naturally to
> boards, but can be extended by plugins (such as Fuego). Labgrid doesn't have a
> web UI either.
> 
> > > I think perhaps we should meet up for a beer to discuss this sometime!
> > > (farmers-con anyone?)
> >
> > Until there's sufficient ecosystem to justify a full event, I can
> > always approve/secure a session or two at ELC or ELCE, for BOFs, talks
> > or even working meetings.
> 
> Yes, we should keep meeting at the larger events, ideally collecting notes.

I'd be happy to join.

Andrew Murray

> 
> Regards,
> Jan
> --
> Pengutronix e.K.                           |                             |
> Industrial Linux Solutions                 | http://www.pengutronix.de/  |
> Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
> Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |