[Automated-testing] ATS 2019 at ELC-E: Raw Notes from Pengutronix
Chris Fiege
cfi at pengutronix.de
Mon Nov 11 23:54:07 PST 2019
On 02.11.19 15:40, Tim.Bird at sony.com wrote:
>
>
>> -----Original Message-----
>> From: Chris Fiege
>> Jan, Rouven and I took some notes during the ATS today.
>>
>> You can find our Raw notes here:
>> https://pad.stratum0.org/p/ELCE-PTX-ATS
>>
>> Feel free to correct or complete our notes.
>> In the last parts (currently starting at Line 440) some names are missing.
>> Maybe someone can add those.
>>
>> Deadline for your input is 07.11.2019. Afterwards I will post the notes
>> here on the list.
> Thank you all *so much* for taking these notes. I edited the notes in a few
> places, and added a few names (hopefully correctly).
>
> The minutes really helped to jog my memory of the discussions we had and
> they'll be a very useful resource in the future.
> -- Tim
Thanks to everyone for your input. I've attached our meeting minutes.
Regards Chris
-------------- next part --------------
Minutes from sessions at Automated Testing Summit 2019
(held in Lyon, France - October 31, 2019)
https://events19.linuxfoundation.org/events/ats-2019/program/schedule/
09:00 https://ats19.sched.com/event/Uvq8/keynote-welcome-opening-remarks-tim-bird-sr-staff-software-engineer-sony
09:10 https://ats19.sched.com/event/Uvpn/keynote-report-on-recent-testing-meetups-kevin-hilman-co-founder-sr-engineer-baylibre
* "the bugs are too fast - and why we can't catch them."
* Summary of test conferences
* Everybody is testing in their own corner - there is not much communication / upstreaming
* Test coverage is concentrated on the beaten tracks
* Even with fragmentation: We still find lots of bugs. How can we fix all these bugs?
* ~10% of kernel commits are bugfixes
* syzbot: finds ~3 bugs/day; but only 7% coverage
* => we could find many more bugs! How can we deal with that?
* Can we use the structured information that comes from bots without bringing it in free-form emails?
* => Discussion is already going on.
* ==> estimated 20k bugs/release (!)
* "This is a lot of bugs. Can we dig out?"
* Current problem: Fragmentation in CI/CD, test frameworks, test suites, result parsing, pass/fail criteria, log collection, results visualization, bug tracking, kernel developer process for fixes
* And there are even more closed projects working on this
* Conclusion:
* Fragmentation is bad
* Collaboration is good
* Work upstream
* No upstream? Create one!
09:30 https://ats19.sched.com/event/WDIw/lkft-status-update-milosz-wasilewski-linaro
LKFT
* Covering multiple architectures: arm32, arm64, i386, x86_64
* Multiple hardware of each type
* Testing with QEMU
* Testing multiple LTS branches
* Testing: latest stable, mainline, next
* Running multiple test suites: LTP, perf, kselftest, ...
* ~25,000 tests per push
* => 1M tests each week, 70M up to today
* Lately also started testing Android with a downstream kernel
* Future Plans:
* Boot design:
* LAVA
* Choose rootfs, choose kernel
* Fastboot is avoided where possible
* Uses NFS based rootfs where possible
* LAVA job generation abstracted with own tool => Talk after lunch today
* Test design:
* Kselftest build with kernel and overlayed into rootfs
* dmesg-logs: improved parsing for kernel warnings and errors
* Reporting system:
* Find a lot of data => but reporting is not easy
* Looking for reporting and analytics layer
* Analyse cross branch and cross time
* How to integrate with multiple data sources?
* questions:
* How many tests are failing? How many bugs are found?
* About 100 bugs currently open that have not been fixed
* But no detailed statistics
09:40 https://ats19.sched.com/event/WDJ3/fuego-status-update-tim-bird-sony
Status update: fuego
* Introduction:
* What is fuego?
* Debian-based linux distro
* Jenkins packaged inside
* Test execution core inside
* Collection of tests inside
* all in a docker container
* It's for high-level integration testing
* Fuego does embedded testing: Always host and board
* Fuego cross-builds tests and tools => architecture-neutral
* Collection of tests:
* Scripts for test execution
* Tools for results parsing, analysis, visualization
* Phases: pre_test, dependency check, build, deploy, run, post_test
* Fuego has multiple transports between host and target
* Command line tool if you want to ignore all the GUI
* Everything in docker to make it reproducible
* Latest release 1.5:
* Now: Jenkinsless install, can now be plugged into other CI-systems
* Now installable without a container and can run natively
* Fuego core needs only bash and python on host
* Individual tests might require other things
* Fuego makes sure you only use a limited feature set on the target
* Requires only a POSIX shell and 20 common Linux commands (all available from busybox) on the target
* does not require awk or sed, but does require grep
* Fuego does not support provisioning of a target!
* Is an exercise for the user /o\
* Currently prototype features:
* Support tests from other frameworks: Functional.Linaro, Functional.ptest
* Want to support running a test via LAVA, because so many people are running LAVA
* configurable back end for test results
* Artifact server support: Pull artifacts from other servers, (like kernelci)
* Roadmap: Short term:
* Board provisioning support (even if Tim wanted someone else to do it)
* External monitors e.g. power monitors => Adds another dimension of data to each test
* Continue integration with other systems:
* Run a test on an external board manager like beaker, labgrid, ...
* Roadmap: Longer Term:
* Utilize external artifact server for SUT
* maybe use kernelci triggers and builds?
* Hardware testing
* How to test SPI, CAN bus, USB
* Currently totally missing
* fserver support:
* Test object server
* Storage of tests, build artifacts, test requests and results
* Can be used to deliver requests from one host to another and return the results
* is WIP
* https://fuegotest.org/wiki/Using_Fuego_with_fserver
* Fuego has some tests included
* But there are thousands of tests in-house but neither results nor tests are shared
* groups at Fujitsu, Sony, Toshiba, Mitsubishi, Samsung are using it
* A group at Renesas is using an older version of it (1.2?)
* Fujitsu is contributing the most currently
* fuego is focused on testing complete products or distributions.
* it is not used to test upstream / master stuff
* In Tim's lab the newest kernel is 4.4-something
* fuego does a lot of work in describing and understanding pass/fail criteria
* Kevin Hilman: That is really valuable to the upstream-focused test projects
* pass/fail criteria are dependent on hardware (board, sd-card, ...) => a lot of effort is put into creating these criteria based on the hardware
09:50 https://ats19.sched.com/event/WDJB/kernelci-status-update-kevin-hilman-baylibre
state of kernel-ci
* goal always was: test all architectures and as many boards as possible
* 35x SoC vendors
* 250+ unique boards
* building multiple kernel trees with multiple compilers (and versions) and with multiple configurations
* mostly boot testing => get into a shell is a pass
* some boards have more complex tests (DRM, v4l2-compliance, power (suspend / resume), USB smoke test)
* not writing own test suites, but using the ones of others
* With kernel-ci in the linux foundation:
* gets more of a forum where people can share code and info
* There is already a lot of press coverage. It's on kernelci.org
* Current problems:
* collecting sooo much data
* want to do some data analytics on this (maybe also with the LF project)
* currently no new data or graphs to show
* After plumbers: started to massage test results of multiple projects into common format. more: https://github.com/kernelci/kcidb
* with kernel-ci cooperating with google and microsoft:
* much more compute power in the cloud(s); can add much more variants
* Guillaume Tucker from Collabora has been refactoring the legacy jenkins jobs
* Microsoft took some of those pipelines and ported them to azure pipelines in a few days
10:00 https://ats19.sched.com/event/WDJI/cki-status-update-veronika-kabatova-red-hat
cki overview
https://cki-project.org/
* finding bugs before they get into the kernel
* tracking a few upstream trees
* newest stable, stable-next, ARM-next, RDMA, RT-devel
* SCSI, net-next, mainline (on x86_64 only)
* Not sharing the results publicly, just with developers (?)
* Results for the first set of trees are sent to appropriate mailing lists and developers
* Results for the second group are internal-only as we don't actively collaborate with maintainers of those trees but there's no private info there
* We should be able to push to kcidb once it's ready, the only reason for not sending these results is to not spam people
* Focus: Mostly server-systems and hardware used in those systems (storage, nics, infiniband)
* Using Gitlab-CI, not Jenkins
* Beaker as Hardware-abstraction backend && provisioning system
* Compile results in a report and publish on mailing list
* Also publish on kcidb
* KPET selects tests to run based on files touched by the patch => tries to avoid re-finding old bugs and to report only new ones (and get results quicker)
* kselftest integration on todo list
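The KPET idea noted above (pick tests based on which files a patch touches) can be sketched roughly as a pattern-to-test mapping; the path patterns and suite names below are invented for illustration and are not KPET's real database:

```python
import fnmatch

# Hypothetical mapping from source path patterns to test suites;
# the real KPET maintains its own pattern database.
PATTERNS = {
    "drivers/net/*": ["network-smoke"],
    "fs/*": ["xfstests-quick"],
    "kernel/sched/*": ["scheduler-stress"],
}

def select_tests(touched_files):
    """Return the set of test suites relevant to a patch."""
    selected = set()
    for path in touched_files:
        for pattern, tests in PATTERNS.items():
            if fnmatch.fnmatch(path, pattern):
                selected.update(tests)
    return selected

print(sorted(select_tests(["drivers/net/ethernet/foo.c", "fs/ext4/inode.c"])))
# -> ['network-smoke', 'xfstests-quick']
```

Only suites matching the touched paths run, which is how old, already-known failures in unrelated subsystems stay out of the report.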
10:10 https://ats19.sched.com/event/WIZr/slav-status-update-pawel-wieczorek-samsung
* Started 2017 as proof-of-concept
* Automated testing with interactive sessions on the DUTs
* Were not able to sell the MuxPi hardware since it is R&D, but it is public
* Current state: project is on hold
* Documenting design decisions
* Trying to provide a muxpi-less evaluation environment
* trying to minimize confusion: map repositories to Test_Glossary from last ATS
* https://github.com/SamsungSLAV/muxpi
* https://3mdeb.com/products/open-source-hardware/muxpi/
* ~250 EUR
* Now software is completely open source
10:25: Dmitry Vyukov (developer of syzbot and syzkaller) 5-minute announcement
* 3000 syzkaller reproducers are available
* only crash/not-crash result
* tests may corrupt your system
* side note: CKI has a flag on the test definition if a test is destructive (i.e. regarding filesystems)
* some developers want patches to be reviewed but not tested or the other way around
10:30 https://ats19.sched.com/event/VdBs/open-testing-philosophy-kevin-hilman-baylibre
* Slide from Guillaume Tucker: https://www.linuxplumbersconf.org/event/4/timetable/?view=nicecompact
* Reminder: Define pipeline-blocks for testing to enable us to share tests, results
* Find places in that diagram for collaboration
cfi:
https://github.com/SmithChart/Designing-for-Automated-Testing
https://designing-for-automated-testing.readthedocs.io/en/latest/
Comment: x86 on TAC needed for Android test suite
10:40 coffeebreak
11:00 https://ats19.sched.com/event/UvpM/labgrid-real-world-examples-jan-lubbe-rouven-czerwinski-pengutronix-ek
15 minutes until projector is fixed :/
11:50 https://ats19.sched.com/event/Uvpk/new-ways-out-of-the-struggle-of-testing-embedded-devices-chris-fiege-pengutronix-ek
* New Ways Out of the Struggle of Testing Embedded Devices - Chris Fiege, Pengutronix e.K.
* Organized in a Rack
* Pengutronix Lab
* USB
* CAN
* RS232
* Ethernet
* GPIO
* Power Supply
* In the Future: more CI for customer projects, more remote access to lab
* need to increase reliability
* -> more smaller test controllers/servers
12:30 lunch
14:00 https://ats19.sched.com/event/WR99/beaker-project-automated-testing-at-red-hat-tomas-klohna-red-hat
* {{Chris arrived late... some notes missing here}}
* Beaker:
* is a unified testing platform for quality engineers
* has Support for alternative hw-architectures
* allows filtering on hardware.
* stores logs and results, exposes this in a web-UI
* supports multiple systems for one test. (scheduler is smart enough to wait for multiple resources)
* Test farm is heavily loaded: 9k machines and 4k users available
* Beaker is split up:
* Inventory management
* Knows machine details
* Knows machine history
* Access control and user database
* Filter of hw-properties
* Scheduling
* Job contains recipe sets, recipe sets contain recipes and recipes contain tasks
* Scheduling is done on ??? layer
* Tasks defined in XML
* Code can come from Git or RPM
* Provisioning
* Only Red Hat-like distros
* Testing
* Tests can be written in any language as long as they handle the Beaker API; the test harness itself is written in C
* tasks can be destructive
* tasks have a timeout, there is a watchdog
* tasks have metadata
* result collection
* some web-ui
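The job / recipe-set / recipe / task nesting mentioned above can be sketched as a minimal XML job definition. This is an illustrative fragment, assuming Beaker's documented job XML elements; the whiteboard text and task names are made up, not taken from a real lab:

```xml
<!-- Hypothetical minimal Beaker job: one recipe set, one recipe, two tasks -->
<job>
  <whiteboard>Example kernel smoke test</whiteboard>
  <recipeSet>
    <recipe>
      <task name="/distribution/install" />
      <task name="/kernel/smoke-boot" />
    </recipe>
  </recipeSet>
</job>
```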
14:20 https://ats19.sched.com/event/V959/slav-test-stack-abstraction-layers-pawel-wieczorek-samsung-rd-institute-poland
* SLAV: Test stack abstraction layers
* Aims at testing everything embedded
* Motivation
* Parallel access for interactive hacking and automated testing
* Both use cases are close to each other: both need preparation step, interaction and release of resources
* Abstractions:
* DUT - Device under test
* TM / DUT-C
* Test manager: Network access
* DUT Control
* Test scheduler / Test manager
* Non-monolithic approach
* Hardware Layer
* All schematics published
* SD-Mux: DUT Control
* SDWire: like our USB-SD-Mux
* MuxPi: DUT Control and Target Management
* Contains SBC
* Software:
* REST-API
* Test Manager
* Test Scheduler
* Target Manager
* Capabilities:
* Describe attributes of the DUT and connected devices
* Can be used to get access to the right board
* Abstraction: Admin creates defined shell-scripts at defined location on target manager
* What did we learn here?
* Pro:
* Requires only preparing test plan
* Test plans can be reused
* Cons:
* Keeping compliance with other formats
* Other tools gaining features really fast
* Capabilities have to be defined when preparing DUT for lab
14:50 https://ats19.sched.com/event/UvpJ/how-agl-tests-its-distro-and-what-challenges-we-face-jan-simon-moller-the-linux-foundation
Jan-Simon Möller, AGL Release Manager
* AGL initially did infotainment, now working on instrument cluster
* Tools:
* Single Signon via LF Identity for tools:
* Git with Gerrit code review
* JIRA, Jenkins
* Challenges:
* multiple boards, multiple images -> large test matrix
* also need timely results (hours not days) => fast turnaround time
* release builds need a full pass with license/CVE scanning
* automotive wants full test coverage
* Architecture:
* Git <-> Gerrit => Jenkins / CI
* Lava: Tests on hardware (Qemu, or real hardware)
* Multiple LAVA instances controlling multiple DUTs
* Collecting results in modified kernelCI database
* Results via E-Mail
* Also: Results in KernelCI web UI, but it is not really helpful
* Missing trend analysis => Currently waiting for new capabilities in kernelCI
* Feedback-Loop from CI to gerrit: Developer gets +1 for build and CT
* Every single commit goes through the CI loop
* Lessons learned:
* Build step:
* full builds take 5-6 hours on 16-core cloud machine
* IO is an issue in Cloud environments
* run in qemu first (touchstone build)
* Build Phase: 2.5h but that seems too long
* Test step:
* Test centers are decentralized, build is in the cloud.
* Transfer of artifacts (~800MB) takes time
* Build jobs tend to finish in bursts => leads to multiple downloads in parallel
* Lava: Needs better prioritizing or round-robin scheduling. Currently one lab seems to be preferred
* reporting:
* KernelCI UI is too slow with many tests
14:00 https://ats19.sched.com/event/UvpY/test-plan-templating-for-lava-milosz-wasilewski-linaro-ltd
* started with 1 device
* now 8 devices, 20+ tests
* convoluted definitions
* interconnections between deploy type and tests
* Deployments are all different
* Separation into layers
* base -> deployment -> type -> test
* More Android tests (CTS, VTS)
* Questions:
* KernelCI similar, migration possible?
* Possible to migrate kernelCI to this
* Kevin Hilman: interested in migrating to the lava-test-plans because it's already similar
* https://github.com/andersson/bootrr to check device tree and loaded kernel
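The base -> deployment -> type -> test layering above can be sketched as fragments merged in order, each layer overriding or extending the previous one. The layer contents here are invented for illustration; the real lava-test-plans tool composes Jinja2 templates rather than Python dicts:

```python
# Each layer is a partial job definition; later layers win on conflicts,
# lists (e.g. actions) are appended instead of replaced.
base = {"timeouts": {"job": {"minutes": 30}}, "visibility": "public"}
deployment = {"actions": [{"deploy": {"to": "tftp"}}]}
device_type = {"device_type": "qemu"}
test = {"actions": [{"test": {"definitions": ["smoke"]}}]}

def merge(*layers):
    """Merge layers left to right into one job definition."""
    result = {}
    for layer in layers:
        for key, value in layer.items():
            if isinstance(value, list) and isinstance(result.get(key), list):
                result[key] = result[key] + value
            else:
                result[key] = value
    return result

job = merge(base, deployment, device_type, test)
```

Keeping the layers separate is what makes the definitions reusable: swapping the device-type fragment retargets the same test to different hardware.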
14:50 https://ats19.sched.com/event/Uvpe/a-survey-of-open-source-test-definitions-tim-bird-sony
* Standards XKCD
* Test definition store for each framework
* What's inside:
* how to run the test
* lots of metadata
* parsing
* analysis
* whatever ends up in the database
* Fuego
* yml and tar variables and parsers
* a lot of files
* python, shell, yaml and json
* Linaro:
* test.yaml
* test.sh instructions
* Yocto:
* Python file with classes
* 0Day
* PKGBUILD, pack in bash
* yaml for execution and dependencies
* CKI
* index files for meta-data and triggering, scheduling, control
* Makefile with phases
* metadata with metadata (created by Makefile)
* README.md metadata (markdown)
* Jenkins
* config.xml - meta-data, including instructions
* SLAV
* yaml, meta-data, instructions and execution
* Overview
* Everyone uses shell snippets for tests
* YAML often for metadata
* Overview of fields in projects (not repeated here)
* Overview of common intersections
* Harmonization issues
* time location for execution
* what runs where (target, test server)
* what is the required software?
15:20 https://sched.co/WIcw Guide to CIP testing
* Goal: have an env to test SLTS Kernel, SLTS RT Kernel, CIP Core (Deby & ISAR), SW update
* currently ~30 Kernel Configs for SLTS v4.4 & v4.19
* Use Gitlab CI, Gitlab runner in AWS with k8s, builders on-demand in AWS
* LAVA Workers locally
* currently testing Boot, Spectre/Meltdown checker from linaro
* LTP in progress
* next steps:
* improve reporting
* improve coverage
* kselftest, jitterdebugger, linaro test defs, benchmarks, hardware testing (CAN/PCIe/USB etc)
* add more boards
* collaboration with automated testing community
* gitlab cloud ci: gitlab.com/cip-project/cip-testing/gitlab-cloud-ci
* ISAR needs binfmt_misc (needs privileges because it is not namespace-aware)
* hard way: make binfmt_misc namespace-aware
* easy way: use privileged containers
* went both ways :)
* reuses kops as a thin wrapper
* AWS or on-premise
* scaling supported
* currently:
* 100% uptime over last six months
* auto scaling a bit slow on AWS
* m5d.4xlarge (16vCPU, 64GB RAM, 2x NVMe SSD)
* dynamic 0-40 slave nodes
* on-premise at Siemens
* contributions/bug reports welcome
* master always running on AWS (40€ per month)
* linux-cip-ci:
* two containers for building and testing linux SLTS kernels via LAVA:
* creates job def and waits for the result
* currently two physical labs
https://ats19.sched.com/event/Uvph/working-together-to-build-a-modular-ci-ecosystem-discussion-session-tim-bird-sony
Tim: Modular Testing Framework
* Many monolithic test frameworks
* want to mix and match
* -> reduce work
* we need to define APIs / boundaries between our modules / systems
* we're making progress on the run artifacts (kcidb)
* how can we integrate our tools so that it is still easy for our users to use our systems?
* proposal:
* git style interface toolname, verb args
* JSON as return value
* async via start/stop/collect
* How should we input data? JSON, ENV, parameters?
* CLI is ok since we are not time-critical
* how to get there without breaking existing systems?
* need to look at each other's systems
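The proposed git-style interface above ("toolname verb args", JSON return value, async via start/stop/collect) could look roughly like this wrapper; the tool name "boardmgr" and its verbs are hypothetical, nothing here is an existing CLI:

```python
import json
import subprocess

def call_tool(toolname, verb, *args):
    """Invoke a test-framework module as '<toolname> <verb> <args...>'
    and parse its JSON reply from stdout."""
    out = subprocess.run(
        [toolname, verb, *args],
        capture_output=True, text=True, check=True,
    ).stdout
    return json.loads(out)

# Async pattern from the discussion: start a long-running job,
# then later stop it or collect the results by job id.
# job = call_tool("boardmgr", "start", "--board", "beaglebone-1")
# result = call_tool("boardmgr", "collect", job["id"])
```

Since the discussion concluded a CLI is acceptable (nothing here is time-critical), any framework exposing this verb convention could be driven by any other without linking code together.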
https://ats19.sched.com/event/Uvri/defining-a-standard-board-management-api-jan-lubbe-pengutronix-ek-pawel-wieczorek-samsung-rd-institute-poland
TODO: Some Names missing here. See Names in [Brackets]
* Poll: are there people interested to run multiple test systems in one lab?
* Linaro: Interactive hacking is a case
* Fuego, Lava, labgrid seem to need to coexist
* There should be a common coordinator for all the systems, and all systems must be aware of it
* Pro: You can use any system for the use-cases it is good in and learn the other ones
* Jan proposes a central resource controller:
* Make reservation, use the board, give it back afterwards
* Chris suggests having a master scheduler and not using hacks on the single-master systems.
* Must make single-master systems aware of the new master
* Jan:
* Need a separate daemon talking to the systems
* Schedules each system's time to do its jobs.
* Needs jobs to inspect a queue and the board's state
* Remi: That is currently used in LAVA for development.
* Tim: How to handle serial port? Fuego wants local device
* Jan: LAVA can call a tool and use STDOUT STDIN
* Jan: We already use that on top of labgrid
* Kevin: Many labs use ser2net. In newer versions, ser2net can multiplex multiple clients
* Tim: Using a different model: Test via SSH and store results in files on the target. They don't really use the serial port for test results. They are only used for executing commands if needed.
* Jan: What about different configurations in all the tools?
* Jan: For the beginning it could be ok. Later there could be a tool that converts the other versions
* Tim: Could think about using other formats.
* Jan: Progress: Labgrid takes no new drivers for power switches. We tell them to add it to pdudaemon.
* Tim: New release of pdudaemon.
* Matt Hart:
* There is now a daemonless mode
* Outputs can now be driven by name if they are named in the config file.
* Remi:
* pdudaemon with named ports can now be used in LAVA with a small configuration-change
* there is no need for the commands in the config file any more
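The ser2net multiplexing mentioned above is a config-file matter. As a hedged sketch, assuming ser2net 4.x's YAML configuration (TCP port, device path, and connection limit are examples, not from a real lab):

```yaml
# One serial port exported over TCP, shared by up to three clients
connection: &board1
  accepter: tcp,3001
  connector: serialdev,/dev/ttyUSB0,115200n81,local
  options:
    max-connections: 3
```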
https://ats19.sched.com/event/UvyI/summit-wrap-up-tim-bird-sony
* Tim: Action items:
* Note Chris: Action items also on: https://elinux.org/Automated_Testing_Summit_2019#Presentations
* Tim: Everybody: Upload your slides!
* Agreed it's best to upload them to elinux.org: https://elinux.org/Automated_Testing_Summit_2019#Summit_Artifacts
* ALL: Send notes to the list
* Tim: Key decisions:
* Jan: Upload test results to kcidb
* Tim: All systems make a kcidb client to upload results
* Kevin: Extend kcidb schema
* Tim: Has priority over test definition unification work
* Tim: Use LTP metadata format as initial standard
* add a metadata converter to kselftest (Tim will take care)
* Also plan to add a converter to this format for Fuego (Tim again)
* This should give us some ideas of how to use it, and tweak it going forward
* Jan: Will build a prototype to move boards between LAVA and labgrid.
* Jan: Calls out for other systems to adapt to the prototype once it's there.
* Tim: Chris should keep working on "Hardware Design for Testing"
* Try to promote it via: people.kernel.org, corporate blog post, LWN, ...
* Jan: Suggests people add more infos on interesting or not interesting hardware to the Board_Farm page: https://elinux.org/Board_Farm
* Jan/Tim: Do we do this again?
* Tim: There was some work to do before: Logo, sponsor-stuff. But that is done now.
* Tim: This year there was a lot of testing discussion at other events. That makes preparation weird.
* Tim: There will be a testing-track at plumbers
* Jan: There will be an embedded track at FOSDEM
* Kevin: Testing at plumbers will be a full day
* Tim/Kevin: There is only little overlap of people.
* Tim: There is a lot of VM-testing.
* Sasha Levin: Please submit your topics for the micro-conference!
* Tim adds an action item to make a decision in the future.
* Tomas Klohna: https://opentestcon.org/ is open for all topics
* Jan: Would like to have more "hackfest" or workshops
* Tim: Wants to plan for a Plumbers hackfest in August 2020
* https://www.linuxplumbersconf.org/
* Tim: sounds like a decision to focus on that next year, instead of repeat of ATS in 2020 (October, co-located with ELCE)
* Some people won't be at Plumbers, so they would miss out
* Jan suggests continuing to use the mailing list
* Jan makes an advertisement for the usb-sd-mux