[Automated-testing] test definitions shared library

daniel.sangorrin at toshiba.co.jp
Thu Jul 11 22:05:10 PDT 2019


Hello Milosz,

Yesterday, during the monthly automated testing call [1], I mentioned that I had used Linaro test definitions' library [2] with a few additional functions [3] to run _some_ Fuego tests.

First of all, there is an easier way to run Fuego tests from LAVA definitions, which is the reason I stopped developing that adaptation layer. The easier way consists of installing the no-jenkins version of Fuego on the target filesystem (install-debian.sh). Once installed, it works just like any other test runner and can be called from a LAVA YAML job (e.g. ftc run-test -t Functional.bzip2), as I showed during my presentation at Linaro Connect (https://www.youtube.com/watch?v=J_Gor9WIr9g).
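
Just for reference, the target-side setup and the command a LAVA job would run look roughly like this (a sketch; the exact location of the installer inside the Fuego sources depends on the version you unpack):

    # on the target filesystem, once (exact path to the installer may vary)
    ./install-debian.sh

    # then, from the run steps of a LAVA YAML job
    ftc run-test -t Functional.bzip2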

Having said that, I do see a benefit in modularizing the test definitions so that they can be re-used on different frameworks or without a framework at all.

From here on, I am going to try to explain what I think a test definition standard could look like, straight out of my head in brainstorming mode. Please bear with me. Hopefully this will hit some areas of consensus and some others that require refining. There are many frameworks out there and I only know a small fraction of them, so I am not sure whether these ideas would work for all of them. I'd love to hear feedback from other projects.

1) What is a test definition?

"meta-data and software that comprise a particular test" [1]

Well, this sounds a bit abstract to me. There are several types of test definitions:

  * test wrappers: these are wrapper scripts that take care of building, deploying, executing and parsing an existing test. For example, IOzone is an existing test, and [4] and [5] are the corresponding test definitions from Linaro and Fuego respectively. I think it's interesting to compare them and extract the similarities and differences. Both of them are able to build the IOzone test; Linaro is able to install build dependencies, while Fuego assumes the toolchain already has them. Both execute the iozone binary, but only Fuego (in this particular case) allows specifying the parameters to pass to iozone. Finally, both parse the test output log. Fuego does all of this using a set of function callbacks (test_run, test_cleanup, ..) while the Linaro test definition has no predefined structure. For Linaro, parsing occurs on the target; for Fuego, it happens on the host.

  * test runner wrappers: these are wrapper scripts for test runners (e.g. ltp, ptest, autopkgtest). Linaro test definitions can also be used as a test runner [6]. Here we can compare the ptest test runner wrappers from Linaro [7] and Fuego [8]. The Linaro one is written in Python, whereas the Fuego one is a shell script plus a Python file for the parser. Both of them call ptest-runner and then parse the output. Fuego converts the parsed output into Fuego's unified test results format (JSON), while Linaro converts the results into LAVA serial port messages. A rough sketch of such a wrapper follows after this list.
Note: there is a script created by Chase to convert a Fuego results JSON into LAVA serial port messages [9].

  * original test scripts: these are usually test scripts that call binaries on the target filesystem and check that they work as expected. For example, this is what the Fuego busybox test [10] and the Linaro busybox test [11] are doing. This type of script tends to be a bit shallow on the testing side, and is mostly a basic confirmation that the binaries work.
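
To make the test runner wrapper case more concrete, here is a rough, framework-neutral sketch of the ptest case. It assumes ptest-runner is installed on the target and that the individual ptests print the usual "PASS:/FAIL:/SKIP: name" lines; the file names are made up:

    #!/bin/sh
    # run all installed ptests and capture the combined output
    ptest-runner > ptest-output.log 2>&1

    # reduce the log to "name result" pairs that any framework can then
    # convert into its own format (Fuego JSON, LAVA serial messages, ...)
    awk '/^(PASS|FAIL|SKIP):/ { r = tolower($1); sub(":", "", r); print $2, r }' \
        ptest-output.log > result.txt

The interesting point is that only the last conversion step is framework specific: Fuego would feed result.txt to its parser, while Linaro would pipe it through send-to-lava.sh.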

2) Language

In Fuego, we use shell scripts for most phases (dependencies, build, deploy, run, cleanup) and, depending on the test, Python for parsing. If the output is simple enough, we parse it with a shell script as well.
In Linaro, I think you use mostly shell scripts but sometimes Python. One thing we could do is check whether Python is available and, if it isn't, fall back to simpler parsing code based on awk/grep/etc. (see the sketch below). Another option for Linaro is to add Python to the dependencies of the test. This would work on most modern Debian-based OS image builders such as Debos or ISAR.
Finally, an alternative would be to use Go static binaries for the parsing phase. Go would work for Fuego (on the host side) and for Linaro (on the target side), unless you are using an OS or architecture not supported by Go [12].
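
As a rough illustration of the Python-availability check (the file names and the parser invocation here are just placeholders):

    # inside the test wrapper's parsing step
    if command -v python3 >/dev/null 2>&1; then
        python3 ./parser.py ./output.log          # full-featured parser
    else
        # minimal fallback: keep only lines already in "name result" form
        grep -E ' (pass|fail|skip)$' ./output.log > ./result.txt
    fi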

3) Directory structure

Fuego has a defined structure [13]:
- fuego_test.sh: script that contains the test wrapper phases (e.g. build, run, ..); a rough skeleton follows below
- optional:
   - parser.py: parses the output with the help of a library that produces a JSON file complying with a schema very close to the kernelci/squad schemas
   - source tarball
   - test.yaml with metadata
   - test specs: various combinations of parameters that should be tried
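
For illustration, a fuego_test.sh reduced to its bare bones could look like the skeleton below. The function names mirror the Fuego phases, but the bodies and the $TEST_DIR/$TEST_ARGS variables are only placeholders (see [13] for the real API):

    # skeleton of a phase-callback style test wrapper
    test_pre_check()  { command -v make >/dev/null; }     # check (build) dependencies
    test_build()      { make; }                           # build from the source tarball
    test_deploy()     { cp ./mytest "$TEST_DIR"; }        # copy the binaries to the board
    test_run()        { "$TEST_DIR"/mytest $TEST_ARGS; }  # execute with parameters from the test spec
    test_processing() { ./parser.py; }                    # convert the log to the unified format
    test_cleanup()    { rm -rf "$TEST_DIR"; }             # remove leftovers from the board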

I think this structure can cover most needs. In Linaro, you also store the LAVA YAML job files; however, I think those are not strictly necessary. Linaro YAML files have three main sections:

* OS dependencies: these can be added to test.sh's precheck phase
* board dependencies: these depend on the lab and the boards you have, so they should be handled outside of the test definitions.
* execution instructions: all test definitions in Linaro seem to follow the same pattern, so you should be able to abstract that into a template (a possible generic driver is sketched after the example below). For example:
    steps:
        - cd ./automated/linux/busybox/ <-- convert to automated/linux/$TEST
        - ./busybox.sh <-- convert to ./test.sh
        - ../../utils/send-to-lava.sh ./output/result.txt
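
With those substitutions, the per-test YAML collapses into something a job template (taking TEST as its only parameter) or a tiny generic driver could provide, e.g.:

    #!/bin/sh
    # generic driver replacing the per-test "steps" section; TEST is the
    # only input, e.g. "busybox"
    TEST="$1"
    cd "./automated/linux/$TEST" || exit 1
    ./test.sh
    ../../utils/send-to-lava.sh ./output/result.txt   # LAVA-specific glue, kept out of the test itself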

4) Output format

Tests already have several output standards (JUnit, TAP, etc.), but I think that all test frameworks end up creating a parser/converter that collects those "standard" logs and transforms them into a unified per-framework format: for example, a JSON file in Fuego or a set of LAVA serial messages in Linaro.

In my super-biased opinion, the test definitions should produce a JSON file similar to what Squad/kernelci (or BigQuery!?) can accept. Then each framework can have some glue code to convert that JSON file into its preferred format (for example, a JSON-to-LAVA script; see the sketch below).
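
For instance, the glue that turns the "tests" part of such a JSON file (a flat name-to-result map, as in point 5 below) into LAVA results could be as small as this sketch, assuming jq is available on the target, LAVA's lava-test-case helper is in the PATH of the test shell, and results.json is just a made-up file name:

    # convert {"testsuite1/test1": "pass", ...} into LAVA test cases
    jq -r 'to_entries[] | "\(.key) \(.value)"' results.json |
    while read -r name result; do
        lava-test-case "$name" --result "$result"
    done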

5) Test naming

I think we should adopt the nomenclature introduced by Squad [14], which includes test results (pass, fail, ..) and metrics (bps, I/O ops/sec, etc.):

{
  "test1": "pass",
  "test2": "pass",
  "testsuite1/test1": "pass",
  "testsuite1/test2": "fail",
  "testsuite2/subgroup1/testA": "pass",
  "testsuite2/subgroup2/testA": "pass",
  "testsuite2/subgroup2/testA[variant/one]": "pass",
  "testsuite2/subgroup2/testA[variant/two]": "pass"
}

{
  "v1": 1,
  "v2": 2.5,
  "group1/v1": [1.2, 2.1, 3.03],
  "group1/subgroup/v1": [1, 2, 3, 2, 3, 1]
}

If we agree on that, then we can prepare a list of tests that adhere to such nomenclature. For example (just invented):

ltp/syscalls/madvise/madvise01
busybox/cp/symbolic/sym01
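
Combined with the output format from point 4, a run of those two (invented) tests would then produce something like:

{
  "ltp/syscalls/madvise/madvise01": "pass",
  "busybox/cp/symbolic/sym01": "pass"
}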

6) Repository

Tests and test runners should stay where they already are.
But the standard set of test definitions can be shared, so it would be nice to put them in a separate repository.
I propose gitlab.com because it has an easy-to-use CI system. We could create a new user (automated_testing) and then a project inside it (test_definitions).
Note: has anybody here already taken this address? https://gitlab.com/automated_testing

Thanks,
Daniel

[1] https://elinux.org/Automated_Testing
[2] https://github.com/Linaro/test-definitions/blob/master/automated/lib/sh-test-lib
[3] https://github.com/sangorrin/test-definitions/blob/master/automated/linux/fuego/fuego.sh
[4] https://github.com/sangorrin/test-definitions/tree/master/automated/linux/iozone
[5] https://bitbucket.org/fuegotest/fuego-core/src/master/engine/tests/Benchmark.IOzone/
[6] https://bitbucket.org/fuegotest/fuego-core/src/next/tests/Functional.linaro/
[7] https://github.com/sangorrin/test-definitions/tree/master/automated/linux/ptest
[8] https://bitbucket.org/fuegotest/fuego-core/src/next/tests/Functional.ptest/
[9] https://github.com/chase-qi/test-definitions/blob/master/automated/linux/fuego-multinode/parser.py
[10] https://bitbucket.org/fuegotest/fuego-core/src/next/tests/Functional.busybox/
[11] https://github.com/Linaro/test-definitions/tree/master/automated/linux/busybox
[12] https://github.com/golang/go/wiki/MinimumRequirements
[13] http://fuegotest.org/fuego-1.0/Test_definition
[14] https://github.com/Linaro/squad/blob/master/doc/intro.rst#input-file-formats

