[Automated-testing] Glossary words unfamiliar to enterprise testers

Cyril Hrubis chrubis at suse.cz
Thu Nov 1 15:49:49 PDT 2018


Hi!
> Just a question up front - are you talking about SUSE distribution testing
> or LTP?   Or are you talking about how SUSE specifically uses LTP?

These answers are about SUSE distribution testing. You asked why the
terms were unfamiliar to me, and openQA, the SUSE automation framework,
is obviously the one I'm most familiar with, so I'm trying to explain
it to outline the differences.

> I know LTP is used in a lot of different situations, including hardware testing,
> but maybe those are outside your personal experience.  I'm just wanting
> to clarify.  (This question might explain some of my other questions below).

I have only a very vague idea of how LTP is used anywhere other than
SUSE. I have learned a thing or two over the last few years, but
overall I'm not very well informed in that regard.

> > # * Deploy - put the test program or SUT on the DUT
> > > ** this one is ambiguous - some people use this to refer to SUT installation,
> > and others to test program installation
> > 
> > This is complicated to describe correctly. But basically we do not call
> > our OS installation deployment because the installation is a test
> > itself. So instead of deploying anything for kernel tests, these tests
> > depend on installation tests. We are handed a qemu disk image at
> > the end of the installation tests, which we boot, and then we install
> > our tests such as LTP. So maybe we could call the LTP installation
> > 'deployment', but we do not call it that.
> 
> (A) Do you have a term for "installation of the system software"?
> (B) Do you have a term for "installation of the test software"?
> 
> Fuego was using "provisioning" for A (but very rarely as Fuego doesn't
> have (A) formalized in our system yet), and "deploy" for B.
> 
> Lots of systems seem to not have strict vocabulary around one or the
> other of these concepts.  For example, some systems assume the test
> software (e.g. sysbench) is already present on the target, and don't really
> have nomenclature for how it gets there.


(A) is just a parent test for LTP; you can see it here:

https://openqa.opensuse.org/tests/785051

This is a text-mode installation test that boots the ISO in qemu and
then sends keypresses to finish the installation.

(B), i.e. the LTP installation, is another test that runs commands over
a shell to download and install LTP.

And finally, the LTP syscalls run, the CVE run, or any other LTP run
depends on the LTP installation test and is handed the VM image with
LTP installed. You can see the dependencies at the bottom of:

https://openqa.opensuse.org/tests/785116#settings

So we do not have fancy names for the different stages; it's all just a
chain of tests depending on each other.
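
In case it helps to picture it, the chain is glued together by a
handful of job settings, roughly like this (an illustration only, with
made-up values, and with the settings of the three jobs squashed into
one listing):

    # installation test: publish the disk image it produced
    PUBLISH_HDD_1=textmode.qcow2

    # LTP installation test: start after the installation, boot its
    # image, publish a new image with LTP installed
    START_AFTER_TEST=textmode
    HDD_1=textmode.qcow2
    PUBLISH_HDD_1=ltp.qcow2

    # LTP syscalls or CVE run: boot the image with LTP preinstalled
    START_AFTER_TEST=install_ltp
    HDD_1=ltp.qcow2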

> > # * Device under Test (DUT) - a product, board or device that is being tested
> > # * DUT controller - program and hardware for controlling a DUT (reboot,
> > provision, etc.)
> > # * DUT scheduler - program for managing access to a DUT (take
> > online/offline, make available for interactive use)
> > 
> > This does not even make any sense in our automated testing; most of
> > the time we do not care about hardware. We spawn virtual machines on
> > demand.
> 
> OK - I have some questions here.  Even though you don't have hardware,
> I feel like there must be some layer of software here that maybe you call
> something else.  Maybe in your case this is libvirt (a virtual machine manager).
> 
> Please forgive these questions because I am unfamiliar with working with
> VMs.
> 
> You have to spawn the virtual machines.  I assume there's an API for starting
> and stopping them, and maybe for taking snapshots?  Do you have a term
> for this layer?  (libvirt?, VM manager? - I'm making this up, I have no idea).
> Is there an API for configuring what appears on the buses inside the VM?
> Is there an API for indicating how the networking is configured?  This
> would correspond, roughly, to the DUT control layer I think.

Basically there is a Perl library that builds up the qemu command line,
runs the process and also kills it, can take a snapshot, etc.

https://github.com/os-autoinst/os-autoinst/blob/master/backend/qemu.pm

So I guess you are right that we do something along these lines.

We also have one for IPMI that can control physical servers:

https://github.com/os-autoinst/os-autoinst/blob/master/backend/ipmi.pm

And looking at the paths we obviously call them backends.
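
To give an idea of what such a backend amounts to, here is a minimal
sketch in Python (the real os-autoinst code is Perl and does much more;
the class and the qemu options here are made up for illustration):

    import subprocess

    class QemuBackend:
        """Toy backend: build a qemu command line, start and stop the VM."""

        def __init__(self, image, ram_mb=1024):
            self.image = image
            self.ram_mb = ram_mb
            self.proc = None

        def start(self):
            cmd = ["qemu-system-x86_64", "-m", str(self.ram_mb),
                   "-hda", self.image, "-nographic"]
            self.proc = subprocess.Popen(cmd)

        def stop(self):
            if self.proc:
                self.proc.kill()
                self.proc.wait()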

> Are all VMs identical?  Or is there the possibility that you may want to
> spawn VMs of a particular kind for one type of test, and VMs of a different
> kind for other tests.  For example, do you have VMs of different
> architectures (ARM vs Intel)?  What layer of your system knows about this
> difference, and selects or spawns the correct VM type?

There are some variables in the settings (see the link above) that can
be processed by the backend to change the RAM size, etc.

We do run tests for aarch64, x86_64, ppc64le and s390x. The s390x is a
completely different beast; the rest run on qemu-kvm.

> I presume that the pool of VMs is essentially infinite.  But is there any
> mechanism for limiting the number of VMs that are spawned at the same
> time.  This would correspond, roughly, to the concept of a DUT scheduler,
> I believe.  Maybe a term like "target manager" would work better to cover
> both cases (control of physical hardware and management of VMs).

I do not think that we separate a test scheduler and a DUT scheduler.
As far as I know, the whole thing is just a Perl daemon that runs tests
when they are triggered, which mostly means that a new ISO has been
built, and it only has to do simple bookkeeping to ensure we are not
overbooked.
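
The bookkeeping amounts to something like this (just an illustration of
the idea, not how the daemon is actually written):

    import threading, time

    MAX_WORKERS = 8                  # how many jobs we allow at once
    slots = threading.BoundedSemaphore(MAX_WORKERS)

    def run_job(job):
        with slots:                  # blocks while all worker slots are busy
            print("running", job)    # stand-in for spawning the VM and running the test
            time.sleep(1)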

Well, it gets a bit more complicated when there are tests that need
more than one machine, but it shouldn't be more complicated than that.

> Is there a layer of software that detects that a VM is hung, and terminates
> it?  What is that layer called?

Each command we run on the VM has a timeout; when the timeout is
reached, the VM is killed and the test fails. No idea if that even has a
name.
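
The mechanism boils down to this (a toy sketch; in reality the command
goes over the virtual serial console, not through a local subprocess):

    import subprocess

    def run_in_vm(cmd, timeout=90):
        """Run one test command; kill the VM if it does not finish in time."""
        try:
            return subprocess.run(cmd, timeout=timeout, check=True)
        except subprocess.TimeoutExpired:
            subprocess.run(["pkill", "-f", "qemu-system"])  # kill the hung VM
            raise                             # the test is then marked as failed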

> > # * Provision (verb) - arrange the DUT and the lab environment (including
> > other external hardware) for a test
> > 
> > Not sure if this makes sense in our settings. We certainly do not mess
> > with any hardware most of the time.
> > 
> > Well we do create virtual switches that connect our VMs on demand for
> > certain tests though.
> I think this would indeed fall in the category of what we mean by
> the word "provision" here.  Are there other things external to the
> VM itself, that require setup before the test?
> For example, if you do network testing, do you set up the other endpoint
> of the network.  Is there a name for this external environment setup?

We do not really separate test setup from the rest. So if we have a
network test that runs on two machines, we just execute different
commands on each, and we have a mechanism that allows the execution to
be synchronized between the VMs at certain points.
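
Conceptually that synchronization is just a barrier: each VM announces
that it has reached a certain point and waits for the other before
continuing. A toy sketch of the idea, with threads standing in for the
two VMs (the real mechanism works over the network):

    import threading

    barrier = threading.Barrier(2)           # two parties have to arrive

    def server_side():
        print("VM1: server is set up")       # stand-in for the real setup step
        barrier.wait()                       # wait here until VM2 arrives too

    def client_side():
        print("VM2: network is configured")  # stand-in for the real setup step
        barrier.wait()                       # both sides continue together

    threading.Thread(target=server_side).start()
    threading.Thread(target=client_side).start()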

> Here's a concrete example: In Fuego, we run the netperf server on the 
> Fuego host.  Reserving the netperf server for exclusive use would be part
> of provisioning for the netperf test.

I'm not aware of anything that would map to reserving a support server.

> I'm not super happy with this term.  Maybe something simpler like
> "test setup" (including off-target items) would be more obvious?

Term "test setup" sounds more obvious to me.

> > > ** This may include installing the SUT to the device under test and booting
> > the DUT.
> > > * Report generation - collecting run data and putting it into a formatted
> > output
> > 
> > # * Request (noun) - a request to execute a test
> > 
> > Not sure where this one actually goes; I suppose that we request a
> > test to be executed and this is handled by the test scheduler.
> 
> Some systems, like Jenkins, have a "build" object, but it only exists in a queue in memory.
> Fuego allows you to create a "request" object, which can be passed to another
> system for execution.  In that case, it ends up residing on a server until it is dispatched
> to the other lab.  I think LAVA also has an object (I'm not sure what they call it)
> that serves a similar purpose.
> 
> This is needed if the trigger detection can occur on a different system than the
> one that executes the tests.  (Basically, a trigger creates a request, which is
> then handled by a test runner.)

Thanks, now it's clear to me.

> > > * Test definition - meta-data and software that comprise a particular test
> > > * Test program - a script or binary on the DUT that performs the test
> > > * Test scheduler - program for scheduling tests (selecting a DUT for a test,
> > reserving it, releasing it)
> > > * Test software - source and/or binary that implements the test
> > # * Transport (noun) - the method of communicating and transferring data
> > between the test system and the DUT
> > 
> > Again, we use a virtual serial console but we do not call it a transport.
> 
> Is there any notion of transferring data to or from the VM during the
> test?  How are logs retrieved?  (always through the serial console?)
> How would a trace (like from ftrace) be retrieved?

I had to look this up, and it looks like we do upload files from the VM
with curl. So the VM relies on having a network connection to the host
for that.

> For that matter, how is the virtual machine image retrieved (if that
> is saved as part of test assets)?  What is the name of the layer that
> handles that, and what APIs are there for that?

What exactly does 'retrieved' mean here?

If I want to download it, I can download it over HTTP, and it's shown
in the list of assets in the web UI.

> > > * Trigger (noun) - an event that causes the CI loop to start
> > # * Variant - arguments or data that affect the execution and output of a test
> > (e.g. test program command line; Fuego calls this a 'spec')
> 
> OK - I'll admit that "variant" is a made-up term, but I think many systems
> have this and don't realize it (or maybe just call it something else).
> Maybe another term would be better.
> 
> LTP has files in the 'runtest' directory, that include the program to run, as well
> as command line arguments for the program.  And I believe the runtest file
> associates a testcase id to each combination of program and arguments.

Yes, which is a nightmare to maintain, since we have to make sure the
testcase IDs are unique.

Basically, one runtest file should contain related testcases; the
question here is what 'related' really means. It also means that some of
the runtest files are subsets of others, e.g. syscalls-ipc is a subset
of syscalls, which is a pain to maintain as well.
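
For reference, a runtest file is just one line per testcase, mapping a
unique testcase id to the command and its arguments; something like
this (the last entry is made up purely to show where arguments go):

    abort01     abort01
    accept01    accept01
    fakeipc01   fakeipc01 -i 10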

Basically, organizing test groups in a set of files is a pain. Maybe we
should just define a list of tests with tags, so that one test could be
tagged with both 'syscall' and 'ipc', and the test runner could then be
asked to run only tests with particular tags.
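
Something along these lines, as a rough sketch of the idea only
(nothing like this exists in LTP today; the test names and tags here
are just examples):

    # hypothetical tagged test list instead of overlapping runtest files
    TESTS = {
        "msgctl01": {"syscall", "ipc"},
        "madvise01": {"syscall", "mm"},
        "cve-2017-1000364": {"cve", "mm"},
    }

    def select(*wanted):
        """Return the tests carrying all of the requested tags."""
        return [name for name, tags in TESTS.items() if set(wanted) <= tags]

    print(select("syscall", "ipc"))   # -> ['msgctl01']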

> In my mind this conflates two things: the test plan (list of things to run) and
> (variant) way to run each item.  But maybe I'm confused.  Does someone
> "run" a runtest file, or a "scenario" file.

I do agree that the runtest files are kind of both.

> But I'm not sure I understand totally, because there's also the 'scenario_groups'
> file, which seems to serve the purpose of test plan (list of things to run) also.
> Is a runtest file a scenario file?  I'm confused.

This is mostly a historical thing, I would say; there are only two
scenarios at the moment, default and network.

The default scenario should contain only stable testcases, or maybe a
list of maintained testcases, but apart from that it does not serve any
other purpose.

> Maybe it's just a different factorization of the same concept.
> 
> In Fuego, we have a test, and the test can have a 'spec' file, which can indicate
> different sets of command-line arguments and environment variables used by
> that test.
> 
> We don't really have a good name for these (command line args and env vars),
> but I've started to use the term "test variables" in the Fuego documentation. 
> "Variant" is the name of a collection of these test variables, and they can be
> customized on a per-board, per-user, or per-test basis.
> Obviously, which variant of a test you run can affect the results.
> 
> Is it a common use-case for people to write new scenario_group or runtest files for
> LTP?  Do LTP users share them with each other?

I have no idea; anybody else care to comment?

> > > * Visualization - allowing the viewing of test artifacts, in aggregated form
> > (e.g. multiple runs plotted in a single diagram)
> 
> Thanks.  I hope you don't think I'm belaboring the point.  I'm just trying to 
> understand how LTP is used, and what concepts map to the terms we suggested
> in the glossary.  In particular, for some of the less familiar ones this discussion
> may help us find more understandable terms.

No problem. Actually, you made me think about how we organize tests in
LTP; maybe we can even manage to figure out something better than the
status quo we have now.

-- 
Cyril Hrubis
chrubis at suse.cz

