[Automated-testing] power-control standard (was RE: Linaro Connect Report)

Tim.Bird at sony.com Tim.Bird at sony.com
Thu Apr 11 10:53:35 PDT 2019


> -----Original Message-----
> From: Matt Hart
> 
> On Thu, 11 Apr 2019 at 15:08, Dave Pigott <dave.pigott at linaro.org> wrote:
> >
> >
> > > On 11 Apr 2019, at 01:51, <daniel.sangorrin at toshiba.co.jp>
> <daniel.sangorrin at toshiba.co.jp> wrote:
> > >
> > > Thanks for the report Dan!
> > >
> > >> From: Dan Rue
> > > [...]
> > >> - Core pdu api actions: on, off, status
> > >> - Core relay api actions: on, off, ???
> > >
> > > I think that we also talked about "reset" (or "reboot", but I think that
> "reset" is a better verb because it implies that the operation is _forced_
> unlike "reboot" which is usually done by the OS).
> > > I am not an expert about PDUs/Relays but it seems that "reset" can be
> realized by combining the "on" and "off" (experts, please correct me if I am
> wrong) operations. That should work if the driver knows the limitations of
> the PDU and can add the necessary delays. In that case, I guess we can safely
> leave out the "reset" operation.

Just to put in my 2 cents about the verb name.

ttc uses 'reset' to mean a board reset.  Sometimes this corresponds to
the Linux software 'reboot' command, with no hardware/power intervention, and
sometimes it corresponds to toggling the hardware 'reset' button (or pin).
Note that for some boards the hardware reset button has a different effect
than a power cycle would, which is why these operations are separate in ttc.
(For example, in some cases it may be possible to retrieve the kernel log
messages from memory after a reset but not after a reboot.)

ttc uses 'reboot' to mean a hardware power cycle, with associated
re-loading of the kernel and rootfs.

'ttc reboot' is a composite operation: it does the entire process of
power cycle, firmware bootstrap, and kernel boot.  It would call out to a
power-control layer for portions of the whole reboot operation.

ttc uses the verbs 'on' and 'off' for power control.  'ttc status' covers
more than just power control; it reports 4 things:
 1) power status
 2) network status (is device pingable?)
 3) operational status (can a command be executed on the board?)
 4) reservation status (does a test or a user have the board reserved, and for how long?)
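
As a rough illustration of those four items, here is a minimal Python sketch.
The names BoardStatus and summarize, the field names, and the output layout
are all hypothetical; this is not ttc's actual code or output format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BoardStatus:
    """Hypothetical container for the four things 'ttc status' reports."""
    power: str                             # single word from the power helper
    pingable: bool                         # network status
    command_ok: bool                       # can a command run on the board?
    reserved_by: Optional[str]             # None when nobody holds the board
    reserved_seconds_left: Optional[int]   # None when not reserved


def summarize(status: BoardStatus) -> str:
    """Render the four status items, one per line (layout is an assumption)."""
    reservation = "free"
    if status.reserved_by is not None:
        reservation = (f"reserved by {status.reserved_by} "
                       f"for {status.reserved_seconds_left}s")
    return "\n".join([
        f"power: {status.power}",
        f"network: {'pingable' if status.pingable else 'unreachable'}",
        f"operational: {'yes' if status.command_ok else 'no'}",
        f"reservation: {reservation}",
    ])
```

Keeping the four items in separate fields lets a caller tell "powered on but
not pingable" apart from "powered off", which a single status word could not.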

See https://elinux.org/Ttc_Program_Usage_Guide (this is somewhat dated, unfortunately)

The power status reported by 'ttc status' (and what it expects from subsidiary
helper scripts and apps) is a single word from the set ('ON', 'OFF', 'UNKOWN')
(exactly as spelled, and in all uppercase).
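
A caller that consumes such a helper's output could validate it along these
lines.  This is only a sketch: parse_power_status is a hypothetical name, not
a ttc function, and the three words are copied verbatim from the set above.

```python
# Status words copied verbatim (including spelling) from the set listed above.
VALID_POWER_STATES = ("ON", "OFF", "UNKOWN")


def parse_power_status(helper_output: str) -> str:
    """Return the single status word, or raise if the helper printed
    anything else.  Matching is case-sensitive on purpose, since the
    words are expected in all uppercase."""
    word = helper_output.strip()
    if word not in VALID_POWER_STATES:
        raise ValueError(f"unexpected power status: {helper_output!r}")
    return word
```

For example, parse_power_status("ON\n") returns 'ON', while
parse_power_status("on") raises ValueError.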

Fuego internally uses the following functions:

rootfs_reboot - for a software or distribution-initiated reboot (i.e. Linux 'reboot' command)
board_control_reboot - for a hardware reboot (i.e. PDU power cycle)

Fuego doesn't use the 'reset' terminology.

I think I was lobbying for 'reboot' in the meeting, but on further thought I don't think
that word is good, as most people associate that with the full bootup process
and not just the power control aspect of it.

One other option for the name of the operation of turning the power off and
on again might be 'cycle'.  I've seen that used in some documentation for PDUs,
and it would not conflict with names used elsewhere in the board control stack.
So: "power-control minnowboard1 cycle" would be the command for turning the
power to the board named minnowboard1 off and on again.
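
To make that concrete, here is a minimal Python sketch of 'cycle' built from
'off', a pause, and 'on'.  Everything in it is an assumption for illustration:
the PowerControl class, the switch callable standing in for a real PDU driver,
the run() dispatcher, and the 5 second default delay.  None of it is taken
from ttc, PDUDaemon or any existing tool.

```python
import time
from typing import Callable


class PowerControl:
    """Hypothetical power-control layer with the verbs on, off and cycle."""

    def __init__(self, switch: Callable[[str, bool], None],
                 off_delay: float = 5.0,
                 sleep: Callable[[float], None] = time.sleep):
        self._switch = switch        # driver hook: switch(board, powered)
        self._off_delay = off_delay  # assumed delay; real PDUs differ
        self._sleep = sleep          # injectable so tests need not wait

    def on(self, board: str) -> None:
        self._switch(board, True)

    def off(self, board: str) -> None:
        self._switch(board, False)

    def cycle(self, board: str) -> None:
        # 'cycle' is just off, pause, on; the delay lives in the driver
        # layer so that clients do not need to know the PDU's limits.
        self.off(board)
        self._sleep(self._off_delay)
        self.on(board)


def run(pc: PowerControl, board: str, verb: str) -> None:
    """Dispatch a command such as: power-control minnowboard1 cycle"""
    actions = {"on": pc.on, "off": pc.off, "cycle": pc.cycle}
    if verb not in actions:
        raise ValueError(f"unknown verb: {verb!r}")
    actions[verb](board)
```

With a driver hook that just prints, run(pc, "minnowboard1", "cycle") would
switch the board off, wait off_delay seconds, and switch it back on.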

> >
> > Hi Daniel,
> >
> > I’ve added you to the PDU control document so you can comment.
> >
> > We can implement reset within the daemon by, as you say, sending off,
> pause and then on.
> 
> PDUDaemon already uses "reboot" but I'm happy to switch to "reset", or
> more likely just accept both so as not to break existing users.
> 
> "Reset" is just a wrapper around off/on but it's a common operation in
> automated testing so I think it's fine to have it specified in this
> document, and at the same time a client/user doesn't need to know if
> the PDU can support it directly as the drivers already handle it for
> them.
> I'm only expecting to need to add "status" as a command to PDUDaemon,
> when the devices support it. On/Off/Reboot are already accepted, it
> wouldn't be a very useful power control daemon otherwise.
> 
> >
> > >
> > > By the way, should we still use the word PDU or something like "power
> control unit"? I think that there was a discussion about that, but I am not sure
> if there was a consensus.
> >
> > I’m already planning on calling it “power_control” or something similar. This
> can also include IPMI type interfaces.

I think other important people to get feedback from on this are
the r4d people (Linutronix), the labgrid people (Pengutronix), and the SLAV
people (Samsung).  It may also be worth checking what the verb name is in libvirt.

I'm cc-ing Manuel Traut and Jan Lubbe and Pawel Wieczorek (even though
they're on the automated-testing list) so that this catches their attention.
 -- Tim

