[Automated-testing] Glossary words unfamiliar to enterprise testers

Tim.Bird at sony.com
Thu Nov 1 12:08:37 PDT 2018


Thanks for responding with this additional information.
I think it is really helpful to clarify how these terms might not
be obvious, depending on how and what different groups are testing.

Some comments and questions inline below.

Anyone else with comments about what these concepts are
called in their framework can chime in as well.

> -----Original Message-----
> From: Cyril Hrubis 
> 
> Hi!
> > In the recent summit, you mentioned that some of the terms in the
> glossary
> > that I sent with the survey were unfamiliar to you, and that you had to
> > map them onto aspects of LTP that had different names (IIRC).
> >
> > This may not be easy to do now that you know them, but could
> > you put a hash sign (#) before the terms that were originally unfamiliar
> > to you, from the list below?  The reason for this request is that I would like
> to
> > identify the items that may not be obvious to non-embedded testers, and
> > either select a new term, or mark them, or put more information in the
> > description for these terms.
> 
> Here you go. I've tried to explain why some of the terms were unfamiliar,
> or why some do not even make sense in our setting. Mostly that
> is because we do not care about actual hardware most of the time.
> 
> > There was a good discussion at the summit about the "device under test"
> > terminology, and how different people interpreted this in different ways.
> >
> > Thanks,
> >  -- Tim
> >
> > P.S. Anyone else on the list who would like to mark terms that were
> originally
> > unfamiliar to them, can do so as well.  Just reply-all to this e-mail, and mark
> > the unfamiliar or problematic items with a hash sign.
> >
> > Here is the originally-posted glossary:
> >

Just a question up front - are you talking about SUSE distribution testing
or LTP?   Or are you talking about how SUSE specifically uses LTP?
I know LTP is used in a lot of different situations, including hardware testing,
but maybe those are outside your personal experience.  I just want
to clarify.  (This question might explain some of my other questions below).

> > * Bisection - automatic testing of SUT variations to find the source of a
> problem
> > * Boot - to start the DUT from an off state  (addition: point of time when a
> test can be started)
> # * Build artifact - item created during build of the software under test
> 
> Basically we call artifacts assets, and we do not differentiate between
> build and run assets and logs; everything is an asset, including the
> virtual machine disk images.

Interesting.  I'm trying to think if I ever had anything similar in the
physical realm.  The closest thing I can think of is that we once
disassembled a mobile phone to see if we could determine where
it was water-damaged.  That's far outside of software testing, but
we had the disk images for the system, because we had the physical
device itself, with its flash memory.

We should definitely note that when doing VM testing, the VM images
can be a part of the assets (or run artifacts).

To others:
How many other people use the term "asset" as opposed to artifact?

To me "asset" implies something used during the test, and not a product
of it, but it's pretty generic either way.

> 
> # * Build manager (build server) - a machine that performs builds of the
> software under test
> 
> We do product testing, hence we do not build the product ourselves; we
> are handed an installation ISO. Even for our kernel-of-the-day tests we
> are handed an RPM kernel package.

Fuego has a blind spot here as well, due to its focus on product testing.
Fuego more-or-less expects something external to provide the software
under test.  There are exceptions, and we have some mechanisms for building
the kernel, but I think we're in a similar situation.

> 
> > * Dependency - indicates a pre-requisite that must be filled in order for a
> test to run (e.g. must have root access, must have 100 meg of memory, some
> program must be installed, etc.)
> # * Device under test (DUT) - the hardware or product being tested (consists
> of hardware under test and software under test) (also 'board', 'target')
> 
> We call this, more or less, the SUT, i.e. system under test, since we do
> system-wide testing and we mostly do not care about the hardware.

Given this feedback, I think that maybe SUT is a term of art for "System Under Test",
and that we should avoid defining it as "Software Under Test".

To others:
How many people are more familiar with "system under test" than with "device under test"?

I was leaning towards using "software under test" to distinguish the software
part of the system being tested from the hardware part of the system being tested.
DUT is often used (as noted in the current glossary definition) to mean the
whole thing being tested, including both the software and hardware.  But in your
case where there is no hardware (well, it's buried under a lot more software deep
in the system and is really not the subject of the test), I can see how the DUT
terminology would be confusing.

I think at the summit Kevin suggested the word "target".  I've actually used the word
'target' for years, for the system under test, and I had to work to switch to 'board'
and 'DUT' in Fuego.

What do other people think of using 'target' instead of 'DUT'?  Or of switching
'SUT' to "System Under Test"?  Which meaning of "SUT" is preferred?

> 
> # * Deploy - put the test program or SUT on the DUT
> > ** this one is ambiguous - some people use this to refer to SUT installation,
> and others to test program installation
> 
> This is complicated to describe correctly. But basically we do not call
> our OS installation deployment, because the installation is a test
> itself. So instead of deploying anything for kernel tests, these tests
> depend on installation tests. We are handed a QEMU disk image at
> the end of the installation tests, which we boot and then install our
> tests such as LTP. So maybe we could call the act of LTP installation
> deployment, but we do not call it that.

(A) Do you have a term for "installation of the system software"?
(B) Do you have a term for "installation of the test software"?

Fuego has been using "provisioning" for (A) (though rarely, since Fuego doesn't
have (A) formalized in our system yet), and "deploy" for (B).
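
As a concrete (and entirely made-up) illustration of the split I have in
mind, something like the following; the Target class and the file names
are invented, not real Fuego code:

    import subprocess

    class Target:
        # Hypothetical handle for the machine being tested (board or VM).
        def __init__(self, host):
            self.host = host

        def run(self, cmd):
            # Run a command on the target over ssh (assumes key-based login).
            subprocess.run(["ssh", self.host, cmd], check=True)

        def copy(self, local, remote):
            subprocess.run(["scp", local, f"{self.host}:{remote}"], check=True)

    def provision(target):
        # (A) Install the system software (the SUT itself); usually image
        # flashing or VM disk preparation, so not shown here.
        pass

    def deploy(target):
        # (B) Install the test software onto an already-provisioned target.
        target.copy("ltp-full.tar.xz", "/opt/")
        target.run("tar -C /opt -xf /opt/ltp-full.tar.xz")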

Lots of systems seem to not have strict vocabulary around one or the
other of these concepts.  For example, some systems assume the test
software (e.g. sysbench) is already present on the target, and don't really
have nomenclature for how it gets there.

> 
> # * Device under Test (DUT) - a product, board or device that is being tested
> # * DUT controller - program and hardware for controlling a DUT (reboot,
> provision, etc.)
> # * DUT scheduler - program for managing access to a DUT (take
> online/offline, make available for interactive use)
> 
> This does not even make sense in our automated testing; most of the
> time we do not care about hardware. We spawn virtual machines on
> demand.

OK - I have some questions here.  Even though you don't have hardware,
I feel like there must be some layer of software here that maybe you call
something else.  Maybe in your case this is libvirt (a virtual machine manager).

Please forgive these questions because I am unfamiliar with working with
VMs.

You have to spawn the virtual machines.  I assume there's an API for starting
and stopping them, and maybe for taking snapshots?  Do you have a term
for this layer?  (libvirt? VM manager? - I'm making this up; I have no idea).
Is there an API for configuring what appears on the buses inside the VM?
Is there an API for indicating how the networking is configured?  This
would correspond, roughly, to the DUT control layer I think.
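
For what it's worth, here is roughly what I imagine that control layer looks
like when the "DUT" is a VM, using libvirt's Python bindings.  This is only
a guess at your setup (I'm assuming libvirt and a domain named 'sut-vm');
openQA may do this completely differently:

    import libvirt

    # Connect to the local QEMU/KVM hypervisor.
    conn = libvirt.open("qemu:///system")

    # The VM that stands in for the DUT (the name is just an example).
    dom = conn.lookupByName("sut-vm")

    # "Power on" the DUT if it is not already running.
    if not dom.isActive():
        dom.create()

    # Save a known state before the test (roughly, a baseline snapshot).
    dom.snapshotCreateXML(
        "<domainsnapshot><name>pre-test</name></domainsnapshot>", 0)

    # ... run the test ...

    # "Power off" the DUT (a hard stop, analogous to cutting board power).
    dom.destroy()
    conn.close()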

Are all VMs identical?  Or is there the possibility that you may want to
spawn VMs of a particular kind for one type of test, and VMs of a different
kind for other tests?  For example, do you have VMs of different
architectures (ARM vs Intel)?  What layer of your system knows about this
difference, and selects or spawns the correct VM type?

I presume that the pool of VMs is essentially infinite.  But is there any
mechanism for limiting the number of VMs that are spawned at the same
time?  This would correspond, roughly, to the concept of a DUT scheduler,
I believe.  Maybe a term like "target manager" would work better to cover
both cases (control of physical hardware and management of VMs).
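
Even with VMs, I would expect something scheduler-like to exist, if only to
cap how many run at once on the host.  A toy sketch of what I mean (nothing
framework-specific, just a counting semaphore around the spawn; the qemu
options are only an example):

    import subprocess
    import threading

    MAX_CONCURRENT_VMS = 8                      # capacity of the host
    vm_slots = threading.BoundedSemaphore(MAX_CONCURRENT_VMS)

    def run_one_test(disk_image):
        # Spawn a throwaway VM for one test run, limited by available slots.
        with vm_slots:                          # blocks while the host is full
            subprocess.run(
                ["qemu-system-x86_64", "-snapshot",
                 "-hda", disk_image, "-nographic"],
                timeout=3600,                   # crude hang protection, too
            )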

Is there a layer of software that detects that a VM is hung, and terminates
it?  What is that layer called?

> 
> > ** This is not shown in the CI Loop diagram - it could be the same as the
> Test Scheduler
> # * Lab - a collection of resources for testing one or more DUTs (also 'board
> farm')
> 
> For us a Lab is a beefy server that can run a reasonable number of
> virtual machines at one time.
> 
> > * Log - one of the run artifacts - output from the test program or test
> framework
> > * Log Parsing - extracting information from a log into a machine-
> processable format (possibly into a common format)
> > * Monitor - a program or process to watch some attribute (e.g. power)
> while the test is running
> > ** This can be on or off the DUT.
> > * Notification - communication based on results of test (triggered by results
> and including results)
> > * Pass criteria - set of constraints indicating pass/fail conditions for a test
> # * Provision (verb) - arrange the DUT and the lab environment (including
> other external hardware) for a test
> 
> Not sure if this makes sense in our settings. We certainly do not mess
> with any hardware most of the time.
> 
> Well we do create virtual switches that connect our VMs on demand for
> certain tests though.
I think this would indeed fall in the category of what we mean by
the word "provision" here.  Are there other things external to the
VM itself that require setup before the test?
For example, if you do network testing, do you set up the other endpoint
of the network?  Is there a name for this external environment setup?

Here's a concrete example: In Fuego, we run the netperf server on the 
Fuego host.  Reserving the netperf server for exclusive use would be part
of provisioning for the netperf test.
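
Just to illustrate what I mean by reserving an off-target resource (this is
not how Fuego actually does it; the lock file path is made up):

    import fcntl

    def reserve_netperf_server(lockfile="/var/lock/netperf-server.lock"):
        # Hold an exclusive lock for the duration of the test, so that only
        # one test at a time can use the shared netperf server.
        fd = open(lockfile, "w")
        fcntl.flock(fd, fcntl.LOCK_EX)          # blocks until it is free
        return fd                               # keep it open to keep the lock

    def release_netperf_server(fd):
        fcntl.flock(fd, fcntl.LOCK_UN)
        fd.close()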

I'm not super happy with this term.  Maybe something simpler like
"test setup" (including off-target items) would be more obvious?

> 
> > ** This may include installing the SUT to the device under test and booting
> the DUT.
> > * Report generation - collecting run data and putting it into a formatted
> output
> 
> # * Request (noun) - a request to execute a test
> 
> Not sure where this one actually goes; I suppose that we request a test
> to be executed and this is handled by the test scheduler.

Some systems, like Jenkins, have a "build" object, but it only exists in a queue in memory.
Fuego allows you to create a "request" object, which can be passed to another
system for execution.  In that case, it ends up residing on a server until it is dispatched
to the other lab.  I think LAVA also has an object (I'm not sure what they call it)
that serves a similar purpose.

This is needed if the trigger detection can occur on a different system than the
one that executes the tests.  (Basically, a trigger creates a request, which is
then handled by a test runner.)
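
In other words, the request is just a small piece of persistent data that
outlives the trigger.  Something like the following, purely as an
illustration (these fields are not Fuego's actual format):

    import json
    import time
    import uuid

    # A trigger (say, a new kernel package appearing) produces a request...
    request = {
        "id": str(uuid.uuid4()),
        "test": "LTP-syscalls",
        "target": "any-x86_64-vm",
        "created": time.time(),
        "state": "pending",
    }

    # ...which sits in a queue until a test runner (possibly in another
    # lab) claims it and executes the test.
    with open("/var/lib/test-queue/%s.json" % request["id"], "w") as f:
        json.dump(request, f)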

> 
> > * Result - the status indicated by a test - pass/fail (or something else) for a
> Run
> > * Results query - Selection and filtering of data from runs, to find patterns
> > * Run (noun) - an execution instance of a test (in Jenkins, a build)
> # * Run artifact - item created during a run of the test program
> > * Serial console - the Linux console connected over a serial connection
> > * Software under test (SUT) - the software being tested
> # * Test agent - software running on the DUT that assists in test operations
> (e.g. test deployment, execution, log gathering, debugging
> > ** One example would be 'adb', for Android-based systems)
> 
> We just type commands into the virtual serial console, so I guess that we
> can say that our test agent is bash.
Maybe, but I think every test system requires a POSIX shell.  This concept
is more about describing a piece of software on the target side that
is dedicated to supporting test operations.  Eclipse has something
called TCF (Target Communication Framework), which handles program
deployment and debugging, and which I would call a "test agent".  Android has
adb (Android Debug Bridge).  Usually these are special programs that
support not only communication between the system running the
test and the target, but also other operations like instantiating a debugger,
translating log messages, or capturing dumps.
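
To make that concrete, here is roughly what test-agent-style operations look
like with adb (these are real adb sub-commands, but the file names are just
examples):

    import subprocess

    def adb(*args):
        subprocess.run(["adb", *args], check=True)

    # Deploy the test program to the target.
    adb("push", "mytest", "/data/local/tmp/mytest")

    # Execute it on the target.
    adb("shell", "chmod 755 /data/local/tmp/mytest && /data/local/tmp/mytest")

    # Gather the log back from the target.
    adb("pull", "/data/local/tmp/mytest.log", "results/mytest.log")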

> 
> > * Test definition - meta-data and software that comprise a particular test
> > * Test program - a script or binary on the DUT that performs the test
> > * Test scheduler - program for scheduling tests (selecting a DUT for a test,
> reserving it, releasing it)
> > * Test software - source and/or binary that implements the test
> # * Transport (noun) - the method of communicating and transferring data
> between the test system and the DUT
> 
> Again we use virtual serial console but we do not call it transport.

Is there any notion of transferring data to or from the VM during the
test?  How are logs retrieved?  (always through the serial console?)
How would a trace (like from ftrace) be retrieved?

For that matter, how is the virtual machine image retrieved (if that
is saved as part of test assets)?  What is the name of the layer that
handles that, and what APIs are there for that?

> 
> > * Trigger (noun) - an event that causes the CI loop to start
> # * Variant - arguments or data that affect the execution and output of a test
> (e.g. test program command line; Fuego calls this a 'spec')

OK - I'll admit that "variant" is a made-up term, but I think many systems
have this and don't realize it (or maybe just call it something else).
Maybe another term would be better.

LTP has files in the 'runtest' directory that include the program to run, as well
as command-line arguments for the program.  And I believe the runtest file
associates a testcase ID with each combination of program and arguments.

In my mind this conflates two things: the test plan (the list of things to run) and
the (variant) way to run each item.  But maybe I'm confused.  Does someone
"run" a runtest file, or a "scenario" file?

But I'm not sure I understand totally, because there's also the 'scenario_groups'
file, which also seems to serve the purpose of a test plan (a list of things to run).
Is a runtest file a scenario file?  I'm confused.

Maybe it's just a different factorization of the same concept.
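
For reference, my (possibly imperfect) understanding of a runtest file is one
entry per line: a testcase ID, then the command and its arguments.  If that is
right, then each line bundles a test-plan entry with a variant, which is the
conflation I was describing:

    def parse_runtest(path):
        # Parse lines of the form "<testcase-id> <program> [args...]";
        # '#' starts a comment.  (This reflects my reading of the format,
        # not code from LTP itself.)
        entries = []
        with open(path) as f:
            for line in f:
                line = line.strip()
                if not line or line.startswith("#"):
                    continue
                test_id, *cmd = line.split()
                entries.append((test_id, cmd))   # (plan entry, variant)
        return entries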

In Fuego, we have a test, and the test can have a 'spec' file, which can indicate
different sets of command-line arguments and environment variables used by
that test.

We don't really have a good name for these (command line args and env vars),
but I've started to use the term "test variables" in the Fuego documentation. 
"Variant" is the name of a collection of these test variables, and they can be
customized on a per-board, per-user, or per-test basis.
Obviously, which variant of a test you run can affect the results.
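
In other words, something like this (an illustration of the idea, not
Fuego's actual spec file format):

    # A "variant" names one set of test variables (command-line arguments
    # and environment variables) for a given test.
    VARIANTS = {
        "default": {"args": "--time 60", "env": {}},
        "quick":   {"args": "--time 5",  "env": {}},
        "debug":   {"args": "--time 60", "env": {"VERBOSE": "1"}},
    }

    def command_for(test_program, variant_name):
        v = VARIANTS[variant_name]
        return v["env"], "%s %s" % (test_program, v["args"])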

Is it a common use case for people to write new scenario_groups or runtest files for
LTP?  Do LTP users share them with each other?

> > * Visualization - allowing the viewing of test artifacts, in aggregated form
> (e.g. multiple runs plotted in a single diagram)

Thanks.  I hope you don't think I'm belaboring the point.  I'm just trying to 
understand how LTP is used, and what concepts map to the terms we suggested
in the glossary.  In particular, for some of the less familiar ones this discussion
may help us find more understandable terms.
 -- Tim


