[Automated-testing] test definitions shared library

daniel.sangorrin at toshiba.co.jp daniel.sangorrin at toshiba.co.jp
Mon Jul 15 19:51:21 PDT 2019


Hi Milosz (and other project maintainers interested in sharing test definitions),

> From: Milosz Wasilewski <milosz.wasilewski at linaro.org>
[...]
> > 2) Language
> >
> > In Fuego, we are using shell scripts for most phases (dependencies, build,
> > deploy, run, cleanup), and depending on the test we use python for parsing.
> > If the test is simple enough we use a shell script.
> > In Linaro, I think that you are using mostly shell scripts but sometimes
> > python. One thing that we could do is to check whether python is available
> > and, if it isn't, provide simpler parsing code based on awk/grep/etc. Another
> > option for Linaro is to add python to the dependencies of the test. This
> > would work on most modern Debian-based OS image builders such as Debos
> > or ISAR.
> 
> I didn't check all of them, but the rule of thumb is that when a test runs on
> the target it should only use shell scripts. There are python scripts in the
> test-definitions repository, but they're meant to be used when the script runs
> on the host (like Android CTS).

I see.

To be honest, I have been using shell scripts as well, but I sometimes wish we could drop that rule of thumb and liberate ourselves.

Some advantages of using python:
- easier to write and read
- we can possibly write more powerful test wrappers and parsers
- some frameworks (avocado) and tests use python already:
  https://avocado-framework.readthedocs.io/en/70.0/WritingTests.html

Some disadvantages:
- can't be used for testing OS images that you can't customize
   -> but if we just want to test software, we can prepare an OS image specifically for testing
   -> actually, we could share the OS images as well (either for boards or for KVM/QEMU)
- size of the OS image will probably increase
   -> but not much
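
Just to illustrate, the python-availability check I mentioned above could be
as simple as the sketch below (parse.py, results.log and the awk expression
are only placeholders, not files that exist in either repository):

#!/bin/sh
# Sketch: prefer the python parser when available, fall back to awk otherwise.
# parse.py and results.log are hypothetical names used only for illustration.
RESULT_FILE=./output/result.txt
mkdir -p ./output

if command -v python3 >/dev/null 2>&1; then
    # full-featured parser (JSON output, metrics, etc.)
    python3 ./parse.py ./results.log > "${RESULT_FILE}"
else
    # minimal fallback: turn "TEST <name>: OK|FAIL" lines into "<name> pass|fail"
    awk '/^TEST / { sub(":", "", $2); print $2, ($3 == "OK" ? "pass" : "fail") }' \
        ./results.log > "${RESULT_FILE}"
fi

Not pretty, but it keeps the portable path working while letting python do the
heavy lifting where it is installed.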

I can live with shell scripts, but do we really need to be so constrained?
Assuming that we have control over the OS images (e.g. using Debos plus a kernel for your board or QEMU), can't we level up the functionality of our test wrappers?

I guess the decision here is: do we assume control over the OS images or not?

***TO OTHER TEST-WRAPPER PROJECTS***: what is your use case in this regard?

> > Finally, an alternative would be to use Go static binaries for the parsing
> > phase. Go would work in Fuego (on the host side) and on Linaro (on the
> > target side) unless you are using an OS or architecture not supported by
> > Go [12].
> 
> This is possible but it will add a bit of maintenance cost.

OK, I thought it would be a good chance to practice Go ;).

> > 3) Directory structure
> >
> > Fuego has a defined structure [13]
> > - fuego_test.sh: script that contains the test wrapper phases (e.g
> > build, run, ..)
> > - optional
> >    - parser.py: parses the output with the help of a library that produces a
> >      JSON file that complies with a schema very close to the kernelci/squad schemas.
> >    - source tarball
> >    - test.yaml with metadata
> >    - test specs: various combinations of parameters that should be
> > tried
> >
> > I think this structure can cover most needs. In Linaro, you also store the
> > LAVA YAML job files; however, I think that those are not strictly necessary.
> 
> As we agreed some time ago, I'm planning to align test-definitions with Fuego
> 'phases' and rename/split the scripts. This isn't hard but it takes time and is
> currently pretty low on my priority list. There are a few example LAVA YAML
> files, but they are only stored there for demonstration purposes and can be
> considered 'documentation' rather than code [15].

OK, I wasn't sure this had been fully agreed.

***TO OTHER TEST-WRAPPER PROJECTS***: do you agree with this?
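
To make the 'phases' alignment a bit more concrete, here is a rough sketch of
how a split test.sh could look (the function names and helper scripts are my
invention, not the agreed interface):

#!/bin/sh
# Sketch of a test wrapper split into Fuego-like phases.
# run_tests.sh and parser.sh are placeholders for the test's own scripts.
set -e

precheck() {
    # check dependencies, kernel config, free disk space, ...
    command -v awk >/dev/null 2>&1 || { echo "awk is required"; exit 1; }
}

build() {
    # unpack the source tarball and compile, if the test needs it
    :
}

deploy() {
    # copy binaries and data files into place on the target
    :
}

run() {
    # execute the test and capture the raw log
    mkdir -p ./output
    ./run_tests.sh > ./output/raw.log 2>&1 || true
}

parse() {
    # convert the raw log into the shared output format
    ./parser.sh ./output/raw.log > ./output/result.txt
}

precheck && build && deploy && run && parse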
 
> > Linaro YAML files have three main sections:
> >
> > * os dependencies: these ones can be added to the test.sh's precheck
> > phase
> 
> correct. This is used by LAVA to install dependencies outside of the lava test
> shell action. We stopped using them in test-definitions because they break
> portability. I'm not sure if this is a useful LAVA feature.
> 
> > * board dependencies: these depend on the lab and the boards you have
> > so they should be handled outside of the test definitions.
> 
> IMHO this section is controversial. It depends on the device type naming
> convention, and this isn't consistent across all LAVA labs. So I would consider
> it guidance rather than a dependency. I once had an idea to use this field
> to compose test plans for devices, but it didn't work.
> 
> > * execution instructions: all test definitions in Linaro seem to follow the
> > same pattern, so you should be able to just abstract that into a template.
> > For example:
> >     steps:
> >         - cd ./automated/linux/busybox/              <-- convert to automated/linux/$TEST
> >         - ./busybox.sh                               <-- convert to ./test.sh
> >         - ../../utils/send-to-lava.sh ./output/result.txt
> 
> This was done to improve portability. Some definitions have slightly more
> complicated scripts, though.
> 

OK, I got it.
Do you think the YAML job files should be stored in the same repository as the test definitions or separately?

> > 4) Output format
> >
> > Tests already have several standards (junit, tap, etc.), but I think that all
> > test frameworks end up creating a parser/converter that collects those
> > "standard" logs and transforms them into a unified per-framework format.
> > For example, a JSON file in Fuego or a set of LAVA serial messages in Linaro.
> >
> > In my super-biased opinion, the test definitions should produce a JSON file
> > similar to what Squad/kernelci (or bigquery!?) can accept. And then, each
> > framework can have some glue code to convert that JSON file into their
> > preferred format (for example, a JSON-to-LAVA script).
> 
> Why not support all frameworks? By 'all' I mean that a test definition can
> produce output depending on the framework it runs in. It can be requested
> as a script parameter. This way the frameworks won't have to do any magic,
> and it would be a task for test maintainers to make sure it works in the
> frameworks they care about.

Good point. Then we need an internal output format, and "plugins" to convert the internal format into each framework's format.
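
To sketch what I mean: the wrapper always writes a simple internal result
file, and a small "plugin" converts it for whichever framework is running it.
The --format parameter and the internal "name result" line format below are
made up for illustration; for the LAVA case I am assuming the lava-test-case
helper is available in the overlay:

#!/bin/sh
# Sketch: convert an internal "testname result" file into a framework format.
# Usage: ./send-results.sh --format lava|json ./output/result.txt
FORMAT=generic
[ "$1" = "--format" ] && { FORMAT=$2; shift 2; }
RESULT_FILE=${1:-./output/result.txt}

case "$FORMAT" in
lava)
    # emit one lava-test-case call per result line
    while read -r name result; do
        lava-test-case "$name" --result "$result"
    done < "$RESULT_FILE"
    ;;
json)
    # Squad-like JSON: { "testsuite/test": "pass", ... }
    echo '{'
    sed 's/^\(.*\) \(.*\)$/  "\1": "\2",/' "$RESULT_FILE" | sed '$ s/,$//'
    echo '}'
    ;;
*)
    cat "$RESULT_FILE"
    ;;
esac

That way each framework only needs to ship (or pick) the converter it cares
about, and the test maintainer can verify the formats they support.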

> > 5) Test naming
> >
> > I think we should adopt the nomenclature introduced by Squad, which
> > includes test results (pass, fail, ...) and metrics (bps, I/O ops/sec,
> > etc.)
> >
> > {
> >   "test1": "pass",
> >   "test2": "pass",
> >   "testsuite1/test1": "pass",
> >   "testsuite1/test2": "fail",
> >   "testsuite2/subgroup1/testA": "pass",
> >   "testsuite2/subgroup2/testA": "pass",
> >   "testsuite2/subgroup2/testA[variant/one]": "pass",
> >   "testsuite2/subgroup2/testA[variant/two]": "pass"
> > }
> >
> > {
> >   "v1": 1,
> >   "v2": 2.5,
> >   "group1/v1": [1.2, 2.1, 3.03],
> >   "group1/subgroup/v1": [1, 2, 3, 2, 3, 1] }
> >
> > If we agree on that, then we can prepare a list of tests that adhere to such
> > nomenclature. For example (just invented):
> >
> > ltp/syscalls/madvice/madvice01
> > busybox/cp/symbolic/sym01
> 
> I'm OK with that, but I'm a bit skeptical about it. It will be hard to convince
> everyone to use the same convention.

***TO OTHER TEST-WRAPPER PROJECTS***: do you agree with this?

I think that a common naming scheme is only required when you share test results; the main goal is to be able to compare results between frameworks.
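
Purely as an illustration (reusing the invented busybox name from above), a
wrapper following that nomenclature would just record lines like these:

#!/bin/sh
# Sketch: record results and metrics with Squad-style hierarchical names.
# The test step, names and values are invented, as in the examples above.
mkdir -p ./output

report() {  # report <suite/group/test> <pass|fail|skip>
    echo "$1 $2" >> ./output/result.txt
}

metric() {  # metric <group/name> <value>
    echo "$1 $2" >> ./output/metrics.txt
}

some_test_step() {
    # stand-in for the real test logic
    ln -sf testfile testfile.link && [ -L testfile.link ]
}

if some_test_step; then
    report busybox/cp/symbolic/sym01 pass
else
    report busybox/cp/symbolic/sym01 fail
fi
metric iozone/write_bps 104857600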

> > 6) Repository
> >
> > Tests and test runners should be where they are already.
> > But the standard set of test definitions can be shared, so it would be nice
> > to put them in a separate directory.
> > I propose gitlab.com because it has an easy-to-use CI system. We could
> > create a new user (automated_testing) and then a project inside it
> > (test_definitions).
> > Note: has anybody here already taken this address?
> > https://gitlab.com/automated_testing
> 
> Sounds like a good idea - reduces fragmentation. I'm +1 on that.

Great.

I would also like to use the GitLab CI to build the artifacts (static test binaries, OS images, ...). What do you think?
I also think we need a shared storage server to store the artifacts.
Not sure how that could be funded, though.
 
Thanks,
Daniel


> >
> > [1] https://elinux.org/Automated_Testing
> > [2] https://github.com/Linaro/test-definitions/blob/master/automated/lib/sh-test-lib
> > [3] https://github.com/sangorrin/test-definitions/blob/master/automated/linux/fuego/fuego.sh
> > [4] https://github.com/sangorrin/test-definitions/tree/master/automated/linux/iozone
> > [5] https://bitbucket.org/fuegotest/fuego-core/src/master/engine/tests/Benchmark.IOzone/
> > [6] https://bitbucket.org/fuegotest/fuego-core/src/next/tests/Functional.linaro/
> > [7] https://github.com/sangorrin/test-definitions/tree/master/automated/linux/ptest
> > [8] https://bitbucket.org/fuegotest/fuego-core/src/next/tests/Functional.ptest/
> > [9] https://github.com/chase-qi/test-definitions/blob/master/automated/linux/fuego-multinode/parser.py
> > [10] https://bitbucket.org/fuegotest/fuego-core/src/next/tests/Functional.busybox/
> > [11] https://github.com/Linaro/test-definitions/tree/master/automated/linux/busybox
> > [12] https://github.com/golang/go/wiki/MinimumRequirements
> > [13] http://fuegotest.org/fuego-1.0/Test_definition
> > [14] https://github.com/Linaro/squad/blob/master/doc/intro.rst#input-file-formats
> 
> [15] https://github.com/Linaro/test-definitions/blob/master/automated/linux/iperf/lava-multinode-job-example-iperf.yaml
> 
> Best Regards,
> milosz
