[Automated-testing] ATS 2019 at ELC-E: Raw Notes from Pengutronix

Chris Fiege cfi at pengutronix.de
Mon Nov 11 23:54:07 PST 2019



On 02.11.19 15:40, Tim.Bird at sony.com wrote:
> 
> 
>> -----Original Message-----
>> From: Chris Fiege

>> Jan, Rouven and I took some notes during the ATS today.
>>
>> You can find our Raw notes here:
>> https://pad.stratum0.org/p/ELCE-PTX-ATS
>>
>> Feel free to correct or complete our notes.
>> In the last parts (currently starting at Line 440) some names are missing.
>> Maybe someone can add those.
>>
>> Deadline for your input is 07.11.2019. Afterwards I will post the notes
>> here on the list.
>
> Thank you all *so much* for taking these notes.  I edited the notes in a few
> places, and added a few names (hopefully correctly).
> 
> The minutes really helped to jog my memory of the discussions we had and 
> they'll be a very useful resource in the future.
>  -- Tim

Thanks to everyone for your input. I've attached our meeting minutes.

Regards Chris
-------------- next part --------------
Minutes from sessions at Automated Testing Summit 2019
(held in Lyon, France - October 31, 2019)

https://events19.linuxfoundation.org/events/ats-2019/program/schedule/

09:00 https://ats19.sched.com/event/Uvq8/keynote-welcome-opening-remarks-tim-bird-sr-staff-software-engineer-sony

09:10 https://ats19.sched.com/event/Uvpn/keynote-report-on-recent-testing-meetups-kevin-hilman-co-founder-sr-engineer-baylibre

	* "the bugs are too fast - and why we can't catch them."
	* Summary of test conferences
	* Everybody is testing in their own corner - there is not much communication / upstreaming
	* Test coverage stays on the beaten paths
	* Even with fragmentation: We still find lots of bugs. How can we fix all these bugs?
	* ~10% of kernel commits are bugfixes
	* syzbot: finds ~3 bugs/day; but only 7% coverage
	* => we could find many more bugs! How can we deal with that?
	* Can we use the structured information that comes from bots without flattening it into free-form emails?
	* => Discussion is already going on.
	* ==> estimated 20k bugs/release (!)
	* "This is a lot of bugs. Can we dig out?"
		* Current problem: Fragmentation in CI/CD, test frameworks, test suites, result parsing, pass/fail criteria, log collection, results visualization, bug tracking, kernel developer process for fixes
		* And there are even more closed projects working on this
	* Conclusion:
		* Fragmentation is bad
		* Collaboration is good
		* Work upstream
		* No upstream? Create one!

09:30 https://ats19.sched.com/event/WDIw/lkft-status-update-milosz-wasilewski-linaro

LKFT
	* Covering multiple architectures: arm32, arm64, i386, x86_64
	* Multiple hardware of each type
	* Testing with QEMU
	* Testing multiple LTS branches
	* Testing: latest stable, mainline, next
	* Running multiple test suites: LTP, perf, kselftest, ...
	* ~25,000 tests per push
	* => 1M tests each week, 70M to date
	* Lately also started testing Android with a downstream kernel
	* Future Plans:
		* Boot design:
			* LAVA
			* Choose rootfs, choose kernel
			* Fastboot is avoided where possible
			* Uses NFS based rootfs where possible
			* LAVA job generation abstracted with own tool => Talk after lunch today
		* Test design:
			* Kselftest is built with the kernel and overlaid into the rootfs
			* dmesg logs: improved parsing for kernel warnings and errors (see the sketch after this list)
		* Reporting system:
			* Generates a lot of data => but reporting is not easy
			* Looking for a reporting and analytics layer
			* Analyse across branches and over time
			* How to integrate with multiple data sources?
	* questions:
		* How many tests are failing? How many bugs are found?
			* About 100 bugs are currently open and not yet fixed
			* But no detailed statistics
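
The dmesg parsing mentioned above is essentially pattern matching over the kernel log. A minimal sketch of the idea in Python; the patterns and the log file name are illustrative assumptions, not LKFT's actual implementation:

    import re

    # Illustrative patterns only; a real scanner would cover more kernel message types.
    PATTERNS = {
        "warning": re.compile(r"(^|\])\s*WARNING:"),
        "bug": re.compile(r"(^|\])\s*BUG:"),
        "oops": re.compile(r"Oops:"),
        "panic": re.compile(r"Kernel panic"),
    }

    def scan_dmesg(log_text):
        """Return (category, line) tuples for suspicious kernel messages."""
        findings = []
        for line in log_text.splitlines():
            for category, pattern in PATTERNS.items():
                if pattern.search(line):
                    findings.append((category, line.strip()))
                    break
        return findings

    if __name__ == "__main__":
        # "dmesg.log" is a hypothetical log captured from the DUT.
        with open("dmesg.log") as f:
            for category, line in scan_dmesg(f.read()):
                print(f"{category}: {line}")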


09:40 https://ats19.sched.com/event/WDJ3/fuego-status-update-tim-bird-sony

Status update: fuego
	* Introduction:
		* What is fuego?
		* Debian-based linux distro
		* Jenkins packaged inside
		* Test execution core inside
		* Collection of tests inside
		* all in a docker container
	* it's for high-level integration testing
	* Fuego does embedded testing: Always host and board
	* Fuego cross-builds tests and tools => architecture-neutral
	* Collection of tests:
		* Scripts for test execution
		* Tools for results parsing, analysis, visualization
		* Phases: pre_test, dependency check, build, deploy, run, post_test
	* Fuego has multiple transports between host and target
	* Command line tool if you want to ignore all the GUI
	* Everything in docker to make it reproducible
	* Latest release 1.5:
		* Now: Jenkins-less install, can now be plugged into other CI systems
		* Now installable without a container and can run natively
		* Fuego core needs only bash and python on host
			* Individual tests might require other things
		* Fuego makes sure you only use a limited feature set on the target
			* Requires only posix shell and 20 common linux commands (all available from busybox) on target
			* does not require awk or sed, but does require grep
	* Fuego does not support provisioning of a target!
		* Is an exercise for the user /o\
	* Currently prototype features:
		* Support tests from other frameworks: Functional.Linaro, Functional.ptest
		* Want to support running a test via LAVA, because so many people are running LAVA
		* configurable back end for test results
		* Artifact server support: Pull artifacts from other servers, (like kernelci)
	* Roadmap: Short term: 
		* Board provisioning support (even though Tim wanted someone else to do it)
		* External monitors e.g. power monitors => Adds another dimension of data to each test
		* Continue integration with other systems:
			* Run a test on an external board manager like beaker, labgrid, ...
	* Roadmap: Longer Term:
		* Utilize external artifact server for SUT
			* maybe use kernelci triggers and builds?
		* Hardware testing
			* How to test SPI, CAN bus, USB
			* Currently totally missing
	* fserver support:
		* Test object server
		* Storage of tests, build artifacts, test requests and results
		* Can be used to deliver requests from one host to another and return the results
		* is WIP
		* https://fuegotest.org/wiki/Using_Fuego_with_fserver
	* Fuego has some tests included
		* There are thousands of tests in-house, but neither results nor tests are shared
		* Groups at Fujitsu, Sony, Toshiba, Mitsubishi, Samsung are using it
			* A group at Renesas is using an older version of it (1.2?)
		* Fujitsu is currently contributing the most
	* Fuego is focused on testing complete products or distributions.
		* it is not used to test upstream / master stuff
		* In Tim's lab the newest kernel is 4.4-something
	* Fuego does a lot of work in describing and understanding pass/fail criteria
		* Kevin Hilman: That is really valuable to the upstream-focused test projects
		* Pass/fail criteria depend on the hardware (board, SD card, ...) => a lot of effort goes into creating these criteria for each setup

09:50 https://ats19.sched.com/event/WDJB/kernelci-status-update-kevin-hilman-baylibre

state of kernel-ci
	* goal always was: test all architectures and as many boards as possible
	* 35x SoC vendors
	* 250+ unique boards
	* building multiple kernel trees with multiple compilers (and versions) and with multiple configurations
	* mostly boot testing => getting to a shell counts as a pass
	* some boards have more complex tests (DRM, v4l2-compliance, power (suspend / resume), USB smoke test)
	* not writing own test suites, but using those of other projects
	* With kernel-ci in the linux foundation:
		* becomes more of a forum where people can share code and information
		* There is already a lot of press coverage; it's on kernelci.org
	* Current problems:
		* collecting a huge amount of data
		* want to do some data analytics on this (maybe also with the LF project)
			* currently no new data or graphs to show
	* After plumbers: started to massage test results of multiple projects into common format. more: https://github.com/kernelci/kcidb
	* with kernel-ci cooperating with google and microsoft:
		* much more compute power in the cloud(s); can add much more variants
	* Guillaume Tucker from Collabora has been refactoring the legacy jenkins jobs
	* Microsoft took some of those pipelines and ported them to azure pipelines in a few days


10:00 https://ats19.sched.com/event/WDJI/cki-status-update-veronika-kabatova-red-hat

cki overview
https://cki-project.org/
	* finding bugs before they get into the kernel
	* tracking a few upstream trees
		* newest stable, stable-next, ARM-next, RDMA, RT-devel
		* SCSI, net-next, mainline (on x86_64 only)
		* Not sharing the results publicly, just with developers (?)
			* Results for the first set of trees are sent to the appropriate mailing lists and developers
			* Results for the second group are internal-only, as we don't actively collaborate with the maintainers of those trees; there is no private info in them
				* We should be able to push to kcidb once it's ready; the only reason for not sending these results is to not spam people
	* Focus: Mostly server-systems and hardware used in those systems (storage, nics, infiniband)
	* Using Gitlab-CI, not Jenkins
	* Beaker as Hardware-abstraction backend && provisioning system
	* Compile results in a report and publish on mailing list
	* Also publish on kcidb
	* KPET selects tests to run based on the files touched by the patch => tries to catch only new bugs instead of rediscovering old ones (and to get results quicker); see the sketch after this list
	* kselftest integration on todo list
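
KPET's core idea of mapping changed source files to relevant test cases can be sketched roughly as below; the path patterns, test names, and patch handling are illustrative assumptions, not CKI's actual database or code:

    import fnmatch

    # Hypothetical mapping from source path patterns to test suites.
    TEST_MAP = {
        "drivers/net/*": ["network-smoke", "netdev-selftests"],
        "fs/ext4/*": ["xfstests-ext4"],
        "mm/*": ["ltp-mm"],
    }

    def files_in_patch(patch_text):
        """Extract touched file paths from a unified diff."""
        files = set()
        for line in patch_text.splitlines():
            if line.startswith("+++ b/"):
                files.add(line[len("+++ b/"):])
        return files

    def select_tests(patch_text):
        """Return the test suites relevant to the files touched by the patch."""
        selected = set()
        for path in files_in_patch(patch_text):
            for pattern, tests in TEST_MAP.items():
                if fnmatch.fnmatch(path, pattern):
                    selected.update(tests)
        return selected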


10:10 https://ats19.sched.com/event/WIZr/slav-status-update-pawel-wieczorek-samsung

	* Started 2017 as proof-of-concept
	* Automated testing with interactive sessions on the DUTs
	* Were not able to sell the MuxPi hardware since they are an R&D unit, but the design is public
	* Current state: project is on hold
		* Documenting design decisions
		* Trying to provide a muxpi-less evaluation environment
		* trying to minimize confusion: map repositories to Test_Glossary from last ATS
	* https://github.com/SamsungSLAV/muxpi
		* https://3mdeb.com/products/open-source-hardware/muxpi/
		* ~250 EUR
	* Now the software is completely open source


10:25: Dmitry Vyukov (developer of syzbot and syzkaller) 5-minute announcement 
	* 3000 syzkaller reproducers are available
	* only crash/not-crash result
	* tests may corrupt your system
		* side note: CKI has a flag on the test definition if a test is destructive (i.e. regarding filesystems)
	* some developers want patches to be reviewed but not tested or the other way around

10:30 https://ats19.sched.com/event/VdBs/open-testing-philosophy-kevin-hilman-baylibre

	* Slide from Guillaume Tucker: https://www.linuxplumbersconf.org/event/4/timetable/?view=nicecompact
	* Reminder: Define pipeline-blocks for testing to enable us to share tests, results
	* Find places in that diagram for collaboration

cfi:
    https://github.com/SmithChart/Designing-for-Automated-Testing
    https://designing-for-automated-testing.readthedocs.io/en/latest/

Comment: x86 on TAC needed for android test suite

10:40 coffeebreak

11:00 https://ats19.sched.com/event/UvpM/labgrid-real-world-examples-jan-lubbe-rouven-czerwinski-pengutronix-ek
15 minutes until projector is fixed :/


11:50 https://ats19.sched.com/event/Uvpk/new-ways-out-of-the-struggle-of-testing-embedded-devices-chris-fiege-pengutronix-ek

	* New Ways Out of the Struggle of Testing Embedded Devices - Chris Fiege, Pengutronix e.K.
	* Organized in a Rack
	* Pengutronix Lab
		* USB
		* CAN
		* RS232
		* Ethernet
		* GPIO
		* Power Supply
	* In the Future: more CI for customer projects, more remote access to lab
		* need to increase reliability
	* -> more smaller test controllers/servers

12:30 lunch


14:00 https://ats19.sched.com/event/WR99/beaker-project-automated-testing-at-red-hat-tomas-klohna-red-hat

	* {{Chris arrived late... some notes missing here}}
	* Beaker:
		* is a unified testing platform for quality engineers
		* has support for alternative hardware architectures
		* allows filtering on hardware properties
		* stores logs and results, exposes them in a web UI
		* supports multiple systems for one test (the scheduler is smart enough to wait for multiple resources)
	* Test farm is heavily loaded: 9k machines and 4k users available
	* Beaker is split up:
		* Inventory management
			* Knows machine details
			* Knows machine history
			* Access control and user database
			* Filter of hw-properties
		* Scheduling
			* Job contains recipe sets, recipe sets contain recipes, and recipes contain tasks (see the sketch after this list)
			* Scheduling is done on ??? layer
			* Tasks defined in XML
			* Code can come from Git or RPM
		* Provisioning
			* Only Redhat-like distros
		* Testing
			* tests written in C
				* Tests can be written in any language as long as they handle the Beaker API, it's the test harness that's written in C
			* tasks can be destructive
			* tasks have a timeout, there is a watchdog
			* tasks have metadata
		* result collection
			* a web UI
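
The job / recipe set / recipe / task nesting mentioned above maps onto Beaker's XML job format. A rough sketch of building such a job in Python; the whiteboard text, task name, and empty requirement elements are illustrative placeholders, not a verified Beaker job definition:

    import xml.etree.ElementTree as ET

    # Build the nesting described above: job -> recipeSet -> recipe -> task.
    job = ET.Element("job")
    ET.SubElement(job, "whiteboard").text = "Example kernel smoke test"

    recipe_set = ET.SubElement(job, "recipeSet")
    recipe = ET.SubElement(recipe_set, "recipe")

    # Distro and hardware filters would go here (contents are placeholders).
    ET.SubElement(recipe, "distroRequires")
    ET.SubElement(recipe, "hostRequires")

    # One task; Beaker schedules and runs the tasks inside the recipe.
    ET.SubElement(recipe, "task", name="/examples/smoke-test")

    print(ET.tostring(job, encoding="unicode"))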


14:20 https://ats19.sched.com/event/V959/slav-test-stack-abstraction-layers-pawel-wieczorek-samsung-rd-institute-poland

	* SLAV: Test stack abstraction layers
	* Aims at testing everything embedded
	* Motivation
		* Parallel access for interactive hacking and automated testing
		* Both use cases are close to each other: both need preparation step, interaction and release of resources
	* Abstractions:
		* DUT - Device under test
		* TM / DUT-C 
			* Test manager: Network access
			* DUT Control
		* Test scheduler / Test manager
		* Non-monolithic approach
	* Hardware Layer
		* All schematics published
		* SD-Mux: DUT Control
		* SDWire: like our USB-SD-Mux
		* MuxPi: DUT Control and Target Management
			* Contains SBC
	* Software:
		* REST-API
		* Test Manager
		* Test Scheduler
		* Target Manager
		* Capabilities: 
			* Describe attributes of the DUT and connected devices
			* Can be used to get access to the right board
		* Abstraction: the admin creates defined shell scripts at a defined location on the target manager
	* What did we learn here?
		* Pro:
			* Requires only preparing test plan
			* Test plans can be reused
		* Cons:
			* Keeping compatibility with other formats
			* Other tools are gaining features really fast
			* Capabilities have to be defined when preparing the DUT for the lab


14:50 https://ats19.sched.com/event/UvpJ/how-agl-tests-its-distro-and-what-challenges-we-face-jan-simon-moller-the-linux-foundation (Jan-Simon Möller, AGL Release Manager)

	* AGL initially did infotainment, now working on instrument cluster
	* Tools:
		* Single sign-on via LF Identity for tools:
		* Git with Gerrit code review
		* JIRA, Jenkins
	* Challenges:
		* multiple boards, multiple images -> large test matrix
		* also need timely results (hours not days) => fast turnaround time
		* release builds need a full pass with license/CVE scanning
		* automotive wants full test coverage
	* Architecture:
		* Git <-> Gerrit => Jenkins / CI 
		* Lava: Tests on hardware (Qemu, or real hardware)
			* Multiple LAVA instances controlling multiple DUTs
		* Collecting results in modified kernelCI database
		* Results via E-Mail
		* Also: Results are in the KernelCI web UI, but it is not really helpful
			* Missing trend analysis => currently waiting for new capabilities in kernelCI
		* Feedback-Loop from CI to gerrit: Developer gets +1 for build and CT
		* Every single commit goes through the CI loop
	* Lessons learned:
		* Build step:
			* full builds take 5-6 hours on 16-core cloud machine
				* IO is an issue in Cloud environments
			* run in qemu first (touchstone build)
			* Build Phase: 2.5h but that seems too long
		* Test step:
			* Test centers are decentralized, build is in the cloud.
			* Transfer of artifacts (~800MB) takes time
			* Build jobs tend to finish in bursts => leads to multiple downloads in parallel
			* Lava: Needs better prioritization or round-robin scheduling. Currently one lab seems to be preferred
		* reporting:
			* KernelCI UI is too slow with many tests


14:00 https://ats19.sched.com/event/UvpY/test-plan-templating-for-lava-milosz-wasilewski-linaro-ltd
	* started with 1 device
	* now 8 devices, 20+ tests
	* convoluted definitions
		* interconnections between deploy type and tests
		* Deployments are all different
	* Separation into layers
	* base -> deployment -> type -> test
	* More Android tests (CTS, VTS)
	* Questions:
		* KernelCI similar, migration possible?
			* Possible to migrate kernelCI to this
			* Kevin Hilman: interested in migrating to the lava-test-plans because it's already similar
	* https://github.com/andersson/bootrr to check device tree and loaded kernel

14:50 https://ats19.sched.com/event/Uvpe/a-survey-of-open-source-test-definitions-tim-bird-sony

	* Standards XKCD
	* Test definition store for each framework
	* What's inside:
		* how to run the test
		* lots of metadata
		* parsing
		* analysis
		* whatever ends up in the database
	* Fuego
		* yml and tar variables and parsers
		* a lot of files
		* python, shell, yaml and json
	* Linaro:
		* test.yaml
		* test.sh instructions
	* Yocto:
		* Python file with classes
	* 0Day
		* PKGBUILD, pack in bash
		* yaml for execution and dependencies
	* CKI
		* index files for meta-data and triggering, scheduling, control
		* Makefile with phases
		* metadata with metadata (created by Makefile)
		* README.md metadata (markdown)
	* Jenkins
		* config.xml - meta-data, including instructions
	* SLAV
		* yaml, meta-data, instructions and execution
	* Overview
		* Everyone uses shell snippets for tests
		* YAML often for metadata (see the sketch after this list)
	* Overview of fields in projects (not repeated here)
	* Overview of common intersections
	* Harmonization issues
		* time and location of execution
		* what runs where (target, test server)
		* what is the required software?
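
The common pattern noted above (YAML metadata wrapping shell snippets) could be consumed by a tiny runner like the sketch below; the file layout and field names are assumptions for illustration only, not any of the surveyed frameworks' real format:

    import subprocess
    import yaml   # PyYAML, assumed to be available

    def run_test(definition_path):
        """Load a hypothetical YAML test definition and execute its shell steps."""
        with open(definition_path) as f:
            definition = yaml.safe_load(f)

        print(f"Running {definition['metadata']['name']}")
        for step in definition["run"]["steps"]:
            result = subprocess.run(step, shell=True)
            if result.returncode != 0:
                return "fail"
        return "pass"

    # Example definition (hypothetical field names):
    #   metadata:
    #     name: example-smoke
    #     description: boot into a shell and check basic facts
    #   run:
    #     steps:
    #       - uname -a
    #       - cat /proc/version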


15:20 https://sched.co/WIcw Guide to CIP testing

	* Goal: have an environment to test the SLTS Kernel, SLTS RT Kernel, CIP Core (Deby & ISAR), SW update
	* currently ~30 Kernel Configs for SLTS v4.4 & v4.19
	* Use Gitlab CI, Gitlab runner in AWS with k8s, builders on-demand in AWS
	* LAVA Workers locally
	* currently testing Boot, Spectre/Meltdown checker from linaro
		* LTP in progress
	* next steps:
		* improve reporting
		* improve coverage
			* kselftest, jitterdebugger, linaro test defs, benchmarks, hardware testing (CAN/PCIe/USB etc)
		* add more boards
		* collaboration with automated testing community
	* gitlab cloud ci: gitlab.com/cip-project/cip-testing/gitlab-cloud-ci
		* ISAR needs binfmt-misc (needs privileges because not namespace aware)
			* hard way: make binfmt_misc namespace-aware
			* easy way: use privileged containers
			* went both ways :)
		* reuses kops as a thin wrapper
		* AWS or on-premise
		* scaling supported
	* currently:
		* 100% uptime over last six months
		* auto scaling a bit slow on AWS
		* m5d.4xlarge (16vCPU, 64GB RAM, 2x NVMe SSD)
		* dynamic 0-40 slave nodes
		* on-premise at Siemens
	* contributions/bug reports welcome
	* master always running on AWS (40€ per month)
	* linux-cip-ci:
		* two containers for building and testing Linux SLTS kernels via LAVA:
			* creates job def and waits for the result
	* currently two physical labs


https://ats19.sched.com/event/Uvph/working-together-to-build-a-modular-ci-ecosystem-discussion-session-tim-bird-sony
Tim: Modular Testing Framework
	* Many monolithic test frameworks
	* want to mix and match
	* -> reduce work
	* we need to define APIs / boundaries between our modules / systems
	* we're making progress on the run artifacts (kcidb)
	* how can we integrate our tools so that it is still easy for our users to use our systems?
	* proposal (see the sketch after this list):
		* git-style interface: toolname verb args
		* JSON as return value
		* async via start/stop/collect
	* How should we input data? JSON, ENV, parameters?
	* CLI is ok since we are not time-critical
	* how to get there without breaking existing systems?
		* need to look at each other's systems
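
One way to read the proposal above: each framework exposes a small "toolname verb args" CLI that prints JSON, so another framework can drive it without knowing its internals. A rough sketch of the calling side in Python; the tool name "lab-tool", its verbs, and the JSON fields are invented for illustration and not part of any existing tool:

    import json
    import subprocess

    def call(tool, verb, *args):
        """Invoke a hypothetical 'toolname verb args' CLI and parse its JSON output."""
        result = subprocess.run([tool, verb, *args],
                                capture_output=True, text=True, check=True)
        return json.loads(result.stdout)

    # Hypothetical usage against an imaginary 'lab-tool' board manager:
    #   reservation = call("lab-tool", "reserve", "beaglebone-01")
    #   run = call("lab-tool", "start-test", reservation["id"], "ltp-quick")  # async start
    #   status = call("lab-tool", "status", run["id"])                        # poll
    #   results = call("lab-tool", "collect", run["id"])                      # fetch results
    #   call("lab-tool", "release", reservation["id"])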


https://ats19.sched.com/event/Uvri/defining-a-standard-board-management-api-jan-lubbe-pengutronix-ek-pawel-wieczorek-samsung-rd-institute-poland

TODO: Some Names missing here. See Names in [Brackets]

	* Poll: are there people interested in running multiple test systems in one lab?
		* Linaro: Interactive hacking is a case
		* Fuego, Lava, labgrid seem to need to coexist
		* There should be a common coordinator for all the systems, and all systems must be aware of it
		* Pro: You can use any system for the use cases it is good at and learn the other ones

	* Jan proposes a central resource controller (see the sketch below):
		* Make a reservation, use the board, give it back afterwards
	* Chris suggests having a master scheduler instead of adding hacks to the single-master systems.
		* Must make single-master systems be aware of the new master
	* Jan:
		* Need a separate daemon talking to the systems
		* Schedules time for each system to do its jobs
		* Needs jobs to inspect a queue and the boards' state
	* Remi: That is currently used in LAVA for development.
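
A very rough sketch of the coordinator idea described above: a separate daemon that owns the board state, queues reservation requests, and hands each board to one framework at a time. All class, board, and framework names are invented for illustration; this is not how LAVA or labgrid implement it:

    from collections import deque

    class BoardCoordinator:
        """Toy model of a central reservation controller shared by test frameworks."""

        def __init__(self, boards):
            self.free = set(boards)
            self.in_use = {}          # board -> framework currently holding it
            self.queue = deque()      # pending (framework, board) requests

        def request(self, framework, board):
            self.queue.append((framework, board))
            self._schedule()

        def release(self, framework, board):
            if self.in_use.get(board) == framework:
                del self.in_use[board]
                self.free.add(board)
                self._schedule()

        def _schedule(self):
            # Hand out boards in request order; requests for busy boards stay queued.
            for _ in range(len(self.queue)):
                framework, board = self.queue.popleft()
                if board in self.free:
                    self.free.remove(board)
                    self.in_use[board] = framework
                    print(f"{board} -> {framework}")
                else:
                    self.queue.append((framework, board))

    # Example: LAVA and labgrid sharing one board.
    coordinator = BoardCoordinator(["imx6-01"])
    coordinator.request("lava", "imx6-01")
    coordinator.request("labgrid", "imx6-01")
    coordinator.release("lava", "imx6-01")    # labgrid gets the board next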

	* Tim: How to handle serial port? Fuego wants local device
		* Jan: LAVA can call a tool and use STDIN/STDOUT
		* Jan: We already use that on top of labgrid
	* Kevin: Many labs use ser2net. In newer version ser2net can multiplex multiple clients
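
ser2net exposes a serial console as a TCP port, which is what makes sharing it between tools straightforward. A minimal Python client sketch; the host name and port number are assumptions for a local lab setup, and whether multiple clients can attach depends on the ser2net version and configuration:

    import socket

    # Hypothetical ser2net endpoint: serial console of one DUT exported on TCP port 4001.
    HOST, PORT = "labhost.example", 4001

    with socket.create_connection((HOST, PORT), timeout=10) as conn:
        conn.sendall(b"\n")                 # nudge the console to print a prompt
        data = conn.recv(4096)
        print(data.decode(errors="replace"))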

	* Tim: Using a different model: Test via SSH and store results in files on the target. They don't really use the serial port for test results. They are only used for executing commands if needed.

	* Jan: What about different configurations in all the tools?
		* Jan: For the beginning it could be OK. Later there could be a tool that generates the other tools' versions of the configuration
		* Tim: Could think about using other formats.

	* Jan: Progress: Labgrid accepts no new drivers for power switches; contributors are told to add them to pdudaemon instead.
	* Tim: New release of pdudaemon.
	* Matt Hart:
		* There is now a daemonless-mode
		* Outputs can now be driven by name if they are named in the config file.
	* Remi:
		* pdudaemon with named ports can now be used in LAVA with a small configuration-change
		* there is no need for the commands in the config file any more


https://ats19.sched.com/event/UvyI/summit-wrap-up-tim-bird-sony

	* Tim: Action items:
		* Note Chris: Action items also on: https://elinux.org/Automated_Testing_Summit_2019#Presentations
		* Tim: Everybody: Upload your slides!
			* Agree to best upload it to elinux.org: https://elinux.org/Automated_Testing_Summit_2019#Summit_Artifacts
		* ALL: Send notes to the list
	* Tim: Key decisions:
		* Jan: Upload test results to kcidb
			* Tim: All systems make a kcidb client to upload results
			* Kevin: Extend kcidb schema
			* Tim: Has priority over the test definition unification work
		* Tim: Use LTP metadata format as initial standard
			* add a metadata converter to kselftest (Tim will take care of it)
			* Also plan to add a converter to this format for Fuego (Tim again)
			* This should give us some ideas of how to use it, and tweak it going forward
		* Jan: Will build a prototype to move boards between LAVA and labgrid.
			* Jan: Calls out for other systems to adapt to the prototype once it's there.
		* Tim: Chris should keep working on "Hardware Design for Testing"
			* Try to promote it via: people.kernel.org, corporate blog post, LWN, ...
		* Jan: Suggests people add more information on interesting or not interesting hardware to the Board_Farm page: https://elinux.org/Board_Farm
		* Jan/Tim: Do we do this again?
			* Tim: There was some work to do beforehand: logo, sponsor stuff. But that is done now.
			* Tim: This year there was a lot of testing discussion at other events. That made preparation awkward.
			* Tim: There will be a testing-track at plumbers
			* Jan: There will be an embedded track at FOSDEM
			* Kevin: Testing at plumbers will be a full day
			* Tim/Kevin: There is only little overlap of people. 
			* Tim: There is a lot of VM-testing.
			* Sasha Levin: Please submit your topics for the micro-conference!
			* Tim adds an action item to make a decision in the future.
			* Tomas Klohna: https://opentestcon.org/ is open for all topics
			* Jan: Would like to have more "hackfest" or workshops
			* Tim: Wants to plan for plumbers hackfest in august 2020
				* https://www.linuxplumbersconf.org/
			* Tim: sounds like a decision to focus on that next year, instead of a repeat of ATS in 2020 (October, co-located with ELCE)
				* Some people won't be at Plumbers, so they would miss out
		* Jan suggests continuing to use the mailing list
		* Jan advertises the usb-sd-mux

