[Automated-testing] Test Stack Survey - TCF

Tim Orling timothy.t.orling at linux.intel.com
Tue Oct 16 10:57:03 PDT 2018


= TCF Testing survey response =
TCF survey response provided by Iñaky Perez-Gonzalez and Tim Orling

TCF - Test Case Framework

TCF provides support to automate any kind of steps that a human would
do otherwise, typing in the console or issuing command line commands.

TCF will:

* (server side) provide remote access to DUTs (metadata, power
  switching, debug bridges, serial console i/o, networking tunnels,
  flashing support)

* discover testcases

* associate testcases with DUT/groups-of-DUTs where it can run.

* run each testcase/group-of-DUTs combination in parallel and
  evaluate the testcase steps to determine success; report.

Which of the aspects below of the CI loop does your test framework perform?

Jenkins is used as the triggering agent for the CI loop; TCF itself
covers testcase discovery, DUT allocation, execution and reporting.

Does your test framework:
==== source code access ====
* access source code repositories for the software under test?
  No, must be checked out by triggering agent (eg: Jenkins)
  
* access source code repositories for the test software?
  No, must be checked out by triggering agent (eg: Jenkins)
  
* include the source for the test software?
  (if meaning the software that implements the system)
  YES, http://github.com/intel/tcf
  
  (if meaning the software needed to test features)
  NO, that is to be
  provided by the user depending on what they want to test;
  convenience libraries provided.
  
* provide interfaces for developers to perform code reviews?
  NO, left for Gerrit, GitHub, etc. -- can provide reports to such
  systems
  
* detect that the software under test has a new version?
  NO, left to triggering agent (eg: Jenkins)
  
** if so, how? (e.g. polling a repository, a git hook, scanning a mail list, etc.)
* detect that the test software has a new version?
  NO, left to triggering agent (eg: Jenkins) or others

==== test definitions ====
Does your test system:

* have a test definition repository?
  NO, out of scope
  Left to the user to define their tests in any way or form (native
  python, drivers can be created for other formats). TCF will scan the
  provided repository for test definitions
  
** if so, what data format or language is used (e.g. yaml, json, shell script)

Does your test definition include:
* source code (or source code location)?
  Location of the python (or other format) test definition in the filesystem
  
* dependency information?
  Depends on format;
  test interdependency (run X only if Y has passed) is not yet supported
  
* execution instructions?
  YES (optional)
  
* command line variants?
  NO
  
* environment variants?
  YES
  
* setup instructions?
  YES (optional)
  
* cleanup instructions?
  YES (optional)
  
** if anything else, please describe:

  Test cases are defined as a sequence of phases that have to occur
  and expectations that have to be met; each phase is optional and can
  take as long as needed.

  The phases are configuration, building, target acquisition,
  deployment, evaluation and cleanup.
  
  The steps for each phase are implemented as python methods of a
  subclass of class tcfl.tc.tc_c (if using the native format) or by a
  driver for a specific format (eg: Zephyr's testcase.yaml loader).

  Details:


  https://intel.github.io/tcf/doc/07-Architecture.html#the-testcase-finder-and-runner

  https://intel.github.io/tcf/doc/09-api.html#tcf-s-backbone-test-case-finder-and-runner

  https://intel.github.io/tcf/doc/09-api.html#tcfl.tc.tc_c
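
  For illustration only, a minimal native-format testcase might look
  like the sketch below. The decorator, class and calls
  (tcfl.tc.target, tcfl.tc.tc_c, target.power.cycle, target.expect)
  are from the tcfl.tc API linked above; the expected string is just
  an example.

      import tcfl.tc

      # Sketch of a native testcase: ask the server for one DUT, power
      # cycle it in the evaluation phase and wait for expected console
      # output.  configure_*/build_*/deploy_*/clean_* methods could be
      # added for the other phases.
      @tcfl.tc.target()
      class _test(tcfl.tc.tc_c):
          def eval(self, target):
              target.power.cycle()             # power cycle the DUT
              target.expect("Hello World!")    # wait for it on the console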

Does your test system:
* provide a set of existing tests?

YES
examples for reference and self-test

** if so, how many?

~27 examples, 32 self-tests (a combination of TCF tests and Python unittests)

==== build management ====
Does your test system:
* build the software under test (e.g. the kernel)?
  YES if needed
  
* build the test software?
  YES if needed
  
* build other software (such as the distro, libraries, firmware)?
  YES if needed
  
* support cross-compilation?
  YES
  
* require a toolchain or build system for the SUT?
  n/a (depends on what the testcase wants to do)
  
* require a toolchain or build system for the test software?
  n/a (depends on what the testcase wants to do)
  
* come with pre-built toolchains?
  n/a (depends on what the testcase wants to do)
  
* store the build artifacts for generated software?
  TEMPORARILY
  while needed, then deleted; the testcase can decide to archive them
  somewhere else
  
** in what format is the build metadata stored (e.g. json)?
   n/a (testcase specific)
   
** are the build artifacts stored as raw files or in a database?
   n/a (testcase specific)
   
*** if a database, what database?

==== Test scheduling/management ====
Does your test system:
* check that dependencies are met before a test is run?
  UNIMPLEMENTED

* schedule the test for the DUT?
  YES
  
** select an appropriate individual DUT based on SUT or test attributes?
   YES
   
** reserve the DUT?
   YES

** release the DUT?
   YES

* install the software under test to the DUT?
  YES (if required by the testcase)
   
* install required packages before a test is run?
  Testcase specific

* require particular bootloader on the DUT? (e.g. grub, uboot, etc.)
  NO
  Depends on the DUT setup and driver

* deploy the test program to the DUT?
  Depends on the testcase, provides facilities to do so

* prepare the test environment on the DUT?
  DUT driver specific and testcase specific
  Provides facilities to reset DUTs to well-known state

* start a monitor (another process to collect data) on the DUT?
  YES
  serial console by default, any other available by driver

* start a monitor on external equipment?
  Testcase specific
  Other equipment is treated as other DUT hardware that can be used
  and manipulated to accomplish whatever the test needs
  
* initiate the test on the DUT?
  YES
  
* clean up the test environment on the DUT?
  NO
  Allows post-analysis; it is left to tests to create a clean,
  well-known environment before starting

==== DUT control ====
Does your test system:
* store board configuration data?
  YES
** in what format?
   Defined internally in Python dictionaries, exported over REST as JSON
   
* store external equipment configuration data?
  YES, all equipment is treated as a DUT

** in what format?
   same
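
   As an illustration only (the field names below are hypothetical,
   not the server's actual schema), a target record kept as a Python
   dictionary and exported over REST as JSON might look roughly like:

       # Hypothetical sketch of a target configuration record; the
       # server would expose an equivalent JSON document over REST.
       target_config = {
           "id": "board-01",
           "type": "some-soc-devkit",        # illustrative target type tag
           "interfaces": {                   # capabilities the server exposes
               "console": "serial0",
               "power": "pdu-outlet-3",
               "images": ["kernel", "rootfs"],
           },
       }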
   
* power cycle the DUT?
  YES
  
* monitor the power usage during a run?
  CAN do if proper equipment is attached and test configures and
  communicates with it

* gather a kernel trace during a run?
  CAN do if test monitors proper outputs
 
* claim other hardware resources or machines (other than the DUT) for use during a test?
  YES
  Testcase declares resources (DUTs) needed and runner will claim them
  all before manipulating them

* reserve a board for interactive use (ie remove it from automated testing)?
  NO
  Single reservation system -- automation can claim another one

* provide a web-based control interface for the lab?
  NO
  cmdline interfaces; web interface doable as a layer on top
  
* provide a CLI control interface for the lab?
  YES

==== Run artifact handling ====
Does your test system:
* store run artifacts
  NO
  left to trigger layer (eg Jenkins)
** in what format?
* put the run meta-data in a database?
  YES
  plugin-based reporting mechanism; current plugins available for text
  files, MongoDB, Junit
  
** if so, which database?
   MongoDB (plugin)
   
* parse the test logs for results?
  test specific

* convert data from test logs into a unified format?
  test specific -- test can choose to parse internal logs and produce
  reporting using the TCF report API for results and KPIs

** if so, what is the format?
   internal format that gets passed to each reporting plugin that is
   loaded at runtime; such plugins will store it in whatever native
   format they support
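
   For example, a testcase that has parsed its own output could feed
   results and KPIs back through the report API along these lines (a
   sketch: self.report_info and self.report_data are from the tcfl.tc
   reporter API -- check the docs for the exact argument order -- and
   the KPI name and value are made up):

       import tcfl.tc

       class _test(tcfl.tc.tc_c):
           def eval(self):
               # value the test would have extracted from its own logs
               boot_time_s = 12.3
               self.report_info("boot completed")
               # KPIs go to whatever report plugins are loaded
               # (text files, Junit, MongoDB, ...)
               self.report_data("Benchmarks", "boot time (s)", boot_time_s)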
   
* evaluate pass criteria for a test (e.g. ignored results, counts or thresholds)?
  YES
  test specific: the test will sample the outputs from the DUTs during
  test execution for expected, unwelcome or unexpected outputs and
  determine:

** PASS: all went well

** FAILure: deterministic resolution of the test failing, eg: we
   multiply 2*2 and it yields 5, or power measurements from an attached
   gauge while doing op X yielded a power consumption outside of the
   expected band.
   
** ERRoR: unexpected negative output (for example, a kernel crash
   while reading a file)
   
** SKIP: the DUTs lack the capabilities needed to run such a test, and
   we could only determine that once they were configured, set up and
   powered up (vs just looking at the metadata)

** BLoCK: any problem related to infrastructure that prevented
   carrying the test to completion (eg: network failure communicating
   with the server, DUT power switch failing to execute a power up
   command, etc...)
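
   In the native Python format these verdicts map to exceptions a
   testcase step can raise, as in the sketch below (tcfl.tc.failed_e,
   skip_e, error_e and blocked_e are from the tcfl.tc API; the
   "some_capability" keyword is made up for illustration):

       import tcfl.tc

       @tcfl.tc.target()
       class _test(tcfl.tc.tc_c):
           def eval(self, target):
               result = 2 * 2                  # stand-in for the real check
               if result != 4:
                   # deterministic failure of what is being tested -> FAIL
                   raise tcfl.tc.failed_e("2*2 yielded %d, expected 4" % result)
               if not target.kws.get("some_capability"):
                   # capability found missing only once the DUT was set up -> SKIP
                   raise tcfl.tc.skip_e("DUT lacks some_capability")
               # unexpected negative outputs raise tcfl.tc.error_e,
               # infrastructure problems tcfl.tc.blocked_e; returning
               # normally means PASS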

* do you have a common set of result names: (e.g. pass, fail, skip,
  etc.)
  YES
** if so, what are they?
   PASS, ERRR, FAIL, SKIP, BLCK

* How is run data collected from the DUT?

  By any means over which the DUT can provide output, depending on the
  test's needs
  
  Currently we actively use:
  - serial output
  - network output (dep on DUT and its configuration)
  - JTAGs
  
  other interfaces can be added depending on target capabilities

** e.g. by pushing from the DUT, or pulling from a server?

   interface and DUT specific:

   - serial line proxied by the test server
   - network port access tunneled directly through the server
   - JTAG (when the DUT provides it) proxied through the server
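
   By way of illustration, a testcase might consume those channels
   roughly as in the sketch below (target.expect is from the tcfl.tc
   API; the target.tunnel.add call and its return value are an
   assumption to be checked against the tunnel extension docs, and the
   "login:" string and port 22 are just examples):

       import tcfl.tc

       @tcfl.tc.target()
       class _test(tcfl.tc.tc_c):
           def eval(self, target):
               # serial console output, proxied by the test server
               target.expect("login:")
               # network access to the DUT, tunneled through the server
               # (assumed interface; see the tunnel extension docs)
               port = target.tunnel.add(22)
               self.report_info("ssh reachable through server port %d" % port)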
   
* How is run data collected from external equipment?
  Same as DUTs; external equipment can be considered another DUT or
  part of the DUT and the inputs connected to the server equally (eg:
  serial ports or network ports)
  
* Is external equipment data parsed?
  test specific

==== User interface ====
Does your test system:
* have a visualization system?
  NO
  left to external reporting interface (eg: TestRails)
  Currently also feeding output to a Google Sheet

  https://docs.google.com/presentation/d/1R9yEEJrQiyGRD2PUkUbYb_HfMKM2vsgAGFWWMCYkY8A/edit#slide=id.g42b5960348_0_634

  https://docs.google.com/presentation/d/1R9yEEJrQiyGRD2PUkUbYb_HfMKM2vsgAGFWWMCYkY8A/edit#slide=id.g42b5960348_0_727

  https://docs.google.com/presentation/d/1R9yEEJrQiyGRD2PUkUbYb_HfMKM2vsgAGFWWMCYkY8A/edit#slide=id.g42b5960348_0_645

  https://docs.google.com/presentation/d/1R9yEEJrQiyGRD2PUkUbYb_HfMKM2vsgAGFWWMCYkY8A/edit#slide=id.g42b5960348_0_716

* show build artifacts to users?
  N/A
  left to external reporting/triggering interface (eg: Jenkins)
  
* show run artifacts to users?
  N/A
  left to external reporting/triggering interface (eg: Jenkins)

* do you have a common set of result colors?
  N/A
  left to the report tool

** if so, what are they?

* generate reports for test runs?
  YES
  
* notify users of test results by e-mail?
  N/A
  left to CI integration; report interface can take a plugin to do it

* can you query (aggregate and filter) the build meta-data?
  YES
  
* can you query (aggregate and filter) the run meta-data?
  YES
  
* what language or data format is used for online results presentation? (e.g. HTML, Javascript, xml, etc.)
  N/A    
  report tool specific
  current report plugins provide text, junit, mongoDB storage
  
* what language or data format is used for reports? (e.g. PDF, excel, etc.)
  N/A
  report tool specific

* does your test system have a CLI control tool?
  YES
  
** what is it called?
   tcf
   
==== Languages: ====
Examples: json, python, yaml, C, javascript, etc.
* what is the base language of your test framework core?
  python
  
What languages or data formats is the user required to learn?
(as opposed to those used internally)

- python-like simple query language for target and testcase filtering

- formats the project uses that are loaded by TCF (eg: Zephyr's
  testcase.yaml), or Python to implement testcases natively

- Jinja2 templating for reports

==== Can a user do the following with your test framework: ====
* manually request that a test be executed (independent of a CI trigger)?
  YES
  
* see the results of recent tests?
  YES
  
* set the pass criteria for a test?
  YES
  
** set the threshold value for a benchmark test?
   YES (if test supports it)
** set the list of testcase results to ignore?
   NO
   but they can be postfiltered

* provide a rating for a test? (e.g. give it 4 stars out of 5)
  NO
  
* customize a test?
  YES if access to the test repository

** alter the command line for the test program?
   YES
   
** alter the environment of the test program?
   YES
   
** specify to skip a testcase?
   YES
   
** set a new expected value for a test?
   YES (if test permits it)

** edit the test program source?
   YES (if access to the repository)

* customize the notification criteria?
  YES (via client config)

** customize the notification mechanism (eg. e-mail, text)
   YES (via client config)

* generate a custom report for a set of runs?
  YES
  Jinja2 templating (not fully completed)
  
* save the report parameters to generate the same report in the future?
  YES
  Jinja2 templating (not fully completed)
  
==== Requirements ====
Does your test framework:

* require minimum software on the DUT?
  NO
  
* require minimum hardware on the DUT (e.g. memory)
  NO
  
** If so, what? (e.g. POSIX shell or some other interpreter, specific libraries, command line tools, etc.)
   The DUT could be a toaster as far as the framework's basic needs go;
   tests are expected to declare which DUTs they can run on, based on
   the interfaces those DUTs provide to interact with them and perform
   the test.

   Most basic tests will always need some kind of output collection
   from targets, so a computing target will likely provide at least a
   serial port interface to take output from it.
   
* require agent software on the DUT? (e.g. extra software besides production software)
  NO
  
** If so, what agent?
* is there optional agent software or libraries for the DUT?
  NO
  
* require external hardware in your labs?
  DEPENDS
  On what the tests need to do with the DUTs (eg: testing the power
  switch interface needs a remote-controlled power switch, etc...)

==== APIS ====
Does your test framework:
* use existing APIs or data formats to interact within itself, or with 3rd-party modules?
  OPTIONAL
  Can provide drivers to load any existing format

* have a published API for any of its sub-module interactions (any of the lines in the diagram)?
  YES
  
** Please provide a link or links to the APIs?

   https://intel.github.io/tcf/doc/09-api.html#module-tcfl.tc

Sorry - this is kind of open-ended...
* What is the nature of the APIs you currently use?
Are they:
** RPCs?
   YES
** Unix-style? (command line invocation, while grabbing sub-tool output)
** compiled libraries?
** interpreter modules or libraries?
** web-based APIs?
** something else?

==== Relationship to other software: ====
* what major components does your test framework use (e.g. Jenkins, MongoDB, Squad, Lava, etc.)

  CI workflow: Jenkins triggers, checks out the SUT and test
  repositories, configures TCF reporting to MongoDB and launches TCF
  on the repositories; it then collects reports and publishes summaries
  to a Google Sheet and TestRails.
  
  https://docs.google.com/presentation/d/1R9yEEJrQiyGRD2PUkUbYb_HfMKM2vsgAGFWWMCYkY8A/edit#slide=id.g42db933815_2_0

  GitHub/Gerrit (WIP) workflow for pull request verification: Jenkins
  triggers, checks out the SUT and test repositories, determines based
  on the change what checks need to be run, launches TCF to run them,
  and reports back to GitHub/Gerrit.

  https://docs.google.com/presentation/d/1R9yEEJrQiyGRD2PUkUbYb_HfMKM2vsgAGFWWMCYkY8A/edit#slide=id.g42db933815_2_32

* does your test framework interoperate with other test frameworks or software?
  PENDING
  plans discussed to have the runner interact with other target
  servers.
  Can load testcases from other frameworks based on a loader driver.

** which ones?
   Eg: Zephyr's sanitycheck

== Overview ==
Please list the major components of your test system.


* Server: provides remote access to DUTs, exposes whichever interfaces
they provide

** basic interface: metadata, acquisition/release, tunnels
** serial consoles
** debug interfaces
** power on/off/cycle
** imaging/flashing

* Client:

** primitives to access DUTs via server
** testcase runner

Please list your major components here:
* python
* Flask

== Glossary ==
Here is a glossary of terms.  Please indicate if your system uses different terms for these concepts.
Also, please suggest any terms or concepts that are missing.

* Bisection - automatic testing of SUT variations to find the source of a problem
* Boot - to start the DUT from an off state
* Build artifact - item created during build of the software under test
* Build manager (build server) - a machine that performs builds of the software under test
* Dependency - indicates a pre-requisite that must be filled in order for a test to run (e.g. must have root access, must have 100 meg of memory, some program must be installed, etc.)
* Device under test (DUT) - the hardware or product being tested (consists of hardware under test and software under test) (also 'board', 'target')
* Deploy - put the test program or SUT on the DUT
** this one is ambiguous - some people use this to refer to SUT installation, and others to test installation
* Device under Test (DUT) - a product, board or device that is being tested
* DUT controller - program and hardware for controlling a DUT (reboot, provision, etc.)
* DUT scheduler - program for managing access to a DUT (take online/offline, make available for interactive use)
** This is not shown in the CI Loop diagram - it could be the same as the Test Scheduler
* Lab - a collection of resources for testing one or more DUTs (also 'board farm')
* Log - one of the run artifacts - output from the test program or test framework
* Log Parsing - extracting information from a log into a machine-processable format (possibly into a common format)
* Monitor - a program or process to watch some attribute (e.g. power) while the test is running
** This can be on or off the DUT.
* Notification - communication based on results of test (triggered by results and including results)
* Pass criteria - set of constraints indicating pass/fail conditions for a test
* Provision (verb) - arrange the DUT and the lab environment (including other external hardware) for a test
** This may include installing the SUT to the device under test and booting the DUT.
* Report generation - generation of run data into a formatted output
* Request (noun) - a request to execute a test
* Result - the status indicated by a test - pass/fail (or something else) for a Run
* Results query - Selection and filtering of data from runs, to find patterns
* Run (noun) - an execution instance of a test (in Jenkins, a build)
* Run artifact - item created during a run of the test program
* Serial console - the Linux console connected over a serial connection
* Software under test (SUT) - the software being tested
* Test agent - software running on the DUT that assists in test operations (e.g. test deployment, execution, log gathering, debugging)
** One example would be 'adb', for Android-based systems
* Test definition - meta-data and software that comprise a particular test
* Test program - a script or binary on the DUT that performs the test
* Test scheduler - program for scheduling tests (selecting a DUT for a test, reserving it, releasing it)
* Test software - source and/or binary that implements the test
* Transport (noun) - the method of communicating and transferring data between the test system and the DUT
* Trigger (noun) - an event that causes the CI loop to start
* Variant - arguments or data that affect the execution and output of a test (e.g. test program command line; Fuego calls this a 'spec')
* Visualization - allowing the viewing of test artifacts, in aggregated form (e.g. multiple runs plotted in a single diagram)

