[Automated-testing] test definitions shared library

Milosz Wasilewski milosz.wasilewski at linaro.org
Fri Jul 12 01:59:55 PDT 2019


Hi Daniel,

On Fri, 12 Jul 2019 at 06:05, <daniel.sangorrin at toshiba.co.jp> wrote:
>
> Hello Milosz,
>
> Yesterday, during the monthly automated testing call [1], I mentioned that I had used Linaro test definitions' library [2] with a few additional functions [3] to run _some_ Fuego tests.
>
> First of all, there is an easier way to run Fuego tests from LAVA definitions. That's the reason I stopped developing that adaptation layer. The easier way consists of installing the no-jenkins version of Fuego on the target filesystem (install-debian.sh). Once installed, it works just like any other test runner and can be called from a LAVA yaml job (e.g. ftc run-test -t Functional.bzip2), as I showed during my presentation at Linaro Connect (https://www.youtube.com/watch?v=J_Gor9WIr9g).
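>
> For illustration, the target-side steps boil down to something like this
> (just a sketch: the repository URL comes from the references below, but
> the exact location and options of install-debian.sh may differ):
>
>     # on the target filesystem (or baked into the image at build time)
>     git clone https://bitbucket.org/fuegotest/fuego-core
>     cd fuego-core && ./install-debian.sh      # no-Jenkins install
>     # later, from the LAVA job's test shell action:
>     ftc run-test -t Functional.bzip2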
>
> Having said that, I do see a benefit in modularizing the test definitions so that they can be re-used on different frameworks or without a framework at all.
>
> From here on, I am going to try to explain what I think of a test definition standard, as it comes to mind in brainstorming mode. Please bear with me. Hopefully, this will hit some areas of consensus and some others that require refining. There are many frameworks out there and I only know a small fraction of them, so I am not sure whether these ideas would work for all of them. I'd love to hear feedback from other projects.
>
> 1) What is a test definition?
>
> "meta-data and software that comprise a particular test" [1]
>
> Well, this sounds a bit abstract to me. There are several types of test definitions:
>
>   * test wrappers: these are wrapper scripts that take care of building, deploying, executing and parsing an existing test. For example, IOzone is an existing test, and [4] and [5] are the test definitions from Linaro and Fuego respectively. I think it's interesting to compare them and extract the similarities and differences. Both of them are able to build the IOzone test; Linaro is able to install build dependencies while Fuego assumes the toolchain has them; both execute the iozone binary, but only Fuego (in this particular case) allows specifying the parameters to pass to iozone; finally, both parse the test output log. Fuego does all of this using a set of function callbacks (test_run, test_cleanup, ...), while the Linaro test definition has no defined structure (see the sketch after this list). For Linaro, parsing occurs on the target, while for Fuego it happens on the host.
>
>   * test runner wrappers: these are wrapper scripts for test runners (e.g., ltp, ptest, autopkgtest). Linaro test definitions can also be used as a test runner [6]. Here we can compare the ptest test runner wrappers from Linaro [7] and Fuego [8]. The one in Linaro is written in python, whereas the one in Fuego is a shell script plus a python file for the parser. Both of them call ptest-runner and then parse the output. Fuego converts the parsed output into Fuego's unified test results format (JSON), while Linaro converts the results into LAVA serial port messages.
> Note: there is a script created by Chase to convert a Fuego results JSON into LAVA serial port messages [9].
>
>   * original test scripts: these are usually test scripts that call binaries on the target filesystem and check that they work as expected. For example, this is what the Fuego busybox test [10] and the Linaro busybox test [11] are doing. These types of scripts tend to be a bit shallow on the testing side, and are mostly a basic confirmation that the binaries work.
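>
> To make the comparison concrete, a Fuego-style wrapper for an IOzone-like
> test is essentially a set of phase callbacks, along these lines (a
> simplified sketch; the helper commands and variable names are
> illustrative, not the exact Fuego API):
>
>     test_build() {
>         tar xf iozone-src.tar.gz && cd iozone-src
>         make linux                           # build with the toolchain
>     }
>     test_deploy() {
>         scp ./iozone "$BOARD:$TEST_DIR/"     # copy the binary to the target
>     }
>     test_run() {
>         ssh "$BOARD" "cd $TEST_DIR && ./iozone $IOZONE_ARGS" > iozone.log
>     }
>     test_processing() {
>         ./parser.py iozone.log               # host-side parsing into JSON
>     }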
>
> 2) Language
>
> In Fuego, we are using shell scripts for most phases (dependencies, build, deploy, run, cleanup) and, depending on the test, we use python for parsing. If the test is simple enough, we use a shell script.
> In Linaro, I think that you are using mostly shell scripts but sometimes python. One thing that we could do is to check whether python is available and, if it isn't, provide simpler parsing code based on awk/grep/etc. Another option for Linaro is to add python to the dependencies of the test. This would work on most modern Debian-based OS image builders such as Debos or ISAR.
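>
> The fallback could be as simple as this (a sketch; the log format and
> file names are hypothetical):
>
>     if command -v python3 >/dev/null 2>&1; then
>         python3 ./parser.py ./output/raw.log > ./output/result.txt
>     else
>         # coarse fallback: turn "PASS: name" / "FAIL: name" lines into
>         # the "name result" lines expected downstream
>         awk '/^PASS: /{print $2" pass"} /^FAIL: /{print $2" fail"}' \
>             ./output/raw.log > ./output/result.txt
>     fi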

I didn't check them all, but the rule of thumb is that when a test runs
on the target it should only use shell scripts. There are python scripts
in the test-definitions repository, but they're meant to be used in case
the script runs on the host (like Android CTS).

> Finally, an alternative would be to use Go static binaries for the parsing phase. Go would work in Fuego (on the host side) and on Linaro (on the target side) unless you are using an OS or architecture not supported by Go [12].

This is possible but it will add a bit of maintenance cost.

>
> 3) Directory structure
>
> Fuego has a defined structure [13]
> - fuego_test.sh: script that contains the test wrapper phases (e.g. build, run, ...)
> - optional
>    - parser.py: parses the output with the help of a library, producing a JSON file that complies with a schema very close to the kernelci/Squad schemas.
>    - source tarball
>    - test.yaml with metadata
>    - test specs: various combinations of parameters that should be tried
>
> I think this structure can cover most needs. In Linaro, you also store the LAVA YAML job files; however, I think that those are not strictly necessary.

As we agreed some time ago, I'm planning to align test-definitions with
Fuego 'phases' and rename/split the scripts. This isn't hard but takes
time and is currently pretty low on my priority list. There are a few
example LAVA YAML files, but they are only stored there for
demonstration purposes and can be considered 'documentation' rather
than code [15].

> Linaro YAML files have three main sections:
>
> * os dependencies: these can be added to test.sh's precheck phase

Correct. This is used by LAVA to install dependencies outside of the
lava test shell action. We stopped using them in test-definitions
because they break portability. I'm not sure if this is a useful LAVA
feature.

> * board dependencies: these depend on the lab and the boards you have so they should be handled outside of the test definitions.

IMHO this section is controversial. It depends on the device type
naming convention, and this isn't consistent across all LAVA labs. So I
would consider it guidance rather than a dependency. I once had an
idea to use this field to compose test plans for devices, but it didn't
work.

> * execution instructions: all test definitions in Linaro seem to follow the same pattern, so you should be able to just abstract that into a template (a sketch of such a runner follows the steps below). For example:
>     steps:
>         - cd ./automated/linux/busybox/ <-- convert to automated/linux/$TEST
>         - ./busybox.sh <-- convert to ./test.sh
>         - ../../utils/send-to-lava.sh ./output/result.txt
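>
> In other words, a single generic runner could replace the per-test copies
> of these steps (a sketch; run-one.sh and the test.sh entry point are the
> proposed convention, not something that exists today):
>
>     # usage: ./run-one.sh <test-name>
>     TEST="$1"
>     cd "./automated/linux/${TEST}/" || exit 1
>     ./test.sh
>     ../../utils/send-to-lava.sh ./output/result.txt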

This was done to improve portability. Some definitions have slightly
more complicated scripts, though.

>
> 4) Output format
>
> Tests already have several output standards (JUnit, TAP, etc.), but I think that all test frameworks end up creating a parser/converter that collects those "standard" logs and transforms them into a unified per-framework format. For example, a JSON file in Fuego or a set of LAVA serial messages in Linaro.
>
> In my super-biased opinion, the test definitions should produce a JSON file similar to what Squad/kernelci (or bigquery!?) can accept. And then, each framework can have some glue code to convert that JSON file into their preferred format (for example, a JSON to LAVA script).
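>
> The glue code can stay tiny. For a flat {"testname": "pass", ...} file,
> a JSON-to-LAVA converter could look like this (a sketch that assumes jq
> and LAVA's lava-test-case helper are available on the target):
>
>     jq -r 'to_entries[] | "\(.key) \(.value)"' results.json |
>     while read -r name result; do
>         lava-test-case "$name" --result "$result"
>     done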

Why not support all frameworks? By 'all' I mean that a test definition
can produce output depending on the framework it runs in. It can be
requested as a script parameter. This way the frameworks won't have to
do any magic, and it would be a task for test maintainers to make sure
it works in the frameworks they care about.
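
Something along these lines, just to illustrate the idea (the
OUTPUT_FORMAT parameter and the file names are made up):

    case "$OUTPUT_FORMAT" in
        lava)  ../../utils/send-to-lava.sh ./output/result.txt ;;
        fuego) ./parser.py ./output/result.txt > ./output/results.json ;;
        *)     cat ./output/result.txt ;;     # plain text by default
    esac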

>
> 5) Test naming
>
> I think we should adopt the nomenclature introduced by Squad [14], which includes test results (pass, fail, ...) and metrics (bps, I/O ops/sec, etc.):
>
> {
>   "test1": "pass",
>   "test2": "pass",
>   "testsuite1/test1": "pass",
>   "testsuite1/test2": "fail",
>   "testsuite2/subgroup1/testA": "pass",
>   "testsuite2/subgroup2/testA": "pass",
>   "testsuite2/subgroup2/testA[variant/one]": "pass",
>   "testsuite2/subgroup2/testA[variant/two]": "pass"
> }
>
> {
>   "v1": 1,
>   "v2": 2.5,
>   "group1/v1": [1.2, 2.1, 3.03],
>   "group1/subgroup/v1": [1, 2, 3, 2, 3, 1]
> }
>
> If we agree on that, then we can prepare a list of tests that adhere to such nomenclature. For example (just invented):
>
> ltp/syscalls/madvice/madvice01
> busybox/cp/symbolic/sym01

I'm OK with that, but I'm a bit skeptical about it. It will be hard to
convince everyone to use the same convention.

>
> 6) Repository
>
> Tests and test runners should be where they are already.
> But the standard set of test definitions can be shared, so it would be nice to put them on a separate directory.
> I propose gitlab.com because it has an easy-to-use CI system. We could create a new user (automated_testing) and then
> a project inside (test_definitions).
> Note: has anybody here already taken this address? https://gitlab.com/automated_testing

Sounds like a good idea - reduces fragmentation. I'm +1 on that.

>
> Thanks,
> Daniel
>
> [1] https://elinux.org/Automated_Testing
> [2] https://github.com/Linaro/test-definitions/blob/master/automated/lib/sh-test-lib
> [3] https://github.com/sangorrin/test-definitions/blob/master/automated/linux/fuego/fuego.sh
> [4] https://github.com/sangorrin/test-definitions/tree/master/automated/linux/iozone
> [5] https://bitbucket.org/fuegotest/fuego-core/src/master/engine/tests/Benchmark.IOzone/
> [6] https://bitbucket.org/fuegotest/fuego-core/src/next/tests/Functional.linaro/
> [7] https://github.com/sangorrin/test-definitions/tree/master/automated/linux/ptest
> [8] https://bitbucket.org/fuegotest/fuego-core/src/next/tests/Functional.ptest/
> [9] https://github.com/chase-qi/test-definitions/blob/master/automated/linux/fuego-multinode/parser.py
> [10] https://bitbucket.org/fuegotest/fuego-core/src/next/tests/Functional.busybox/
> [11] https://github.com/Linaro/test-definitions/tree/master/automated/linux/busybox
> [12] https://github.com/golang/go/wiki/MinimumRequirements
> [13] http://fuegotest.org/fuego-1.0/Test_definition
> [14] https://github.com/Linaro/squad/blob/master/doc/intro.rst#input-file-formats

[15] https://github.com/Linaro/test-definitions/blob/master/automated/linux/iperf/lava-multinode-job-example-iperf.yaml

Best Regards,
milosz

