[yocto] Yocto Realtime tests on beaglebone black

Bruce Ashfield bruce.ashfield at windriver.com
Wed Feb 18 06:57:23 PST 2015


On 15-02-17 05:57 PM, Stephen Flowers wrote:
>
> I loaded the system effectively and also changed my rt application to
> use asynchronous IO - I find the rt kernel is much tighter at periodic
> latency yet seems to be worse in the interrupt latency measurements. I'm
> assuming the non-deterministic nature of userland file IO operations is
> the additional latency, even when using aio. Setting the IO scheduler
> did not have an effect.
>
> Results show periodic timer latency in microseconds & interrupt latency
> in microseconds.

The results are still puzzling, since that max value really shouldn't
be higher in the -rt kernel.

What sort of device is backing the filesystem and IO?  There are some
well-known latency issues with USB and flash .. so that could very well
be causing issues with -rt, and would explain why you are getting what
we expect in the cyclictest results, but not in this run.

Consider running cyclictest at the same time, and enabling the latency
tracing .. that will allow you to peek under the covers and see if
there's an obvious latency issue being triggered.
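As a command-line sketch (flags taken from the cyclictest help text; -b N
aborts the run and leaves the kernel latency trace behind once a wakeup
latency above N microseconds is seen -- this assumes ftrace support is
compiled into your kernel):

```shell
# Stop and dump the kernel latency tracer on the first wakeup latency
# above 200 us; run as root while your rt application is active.
cyclictest -a 0 -p 99 -m -n -b 200
```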

Bruce


>
> Realtime       Periodic (us)   Interrupt (us)
> Min            -324.0833333    159.75
> Max            367.8333333     526.4166667
> Avg            0.587306337     206.8056595
>
> Standard       Periodic (us)   Interrupt (us)
> Min            -608.6666667    123.75
> Max            612             448.0833333
> Avg            0.5557039       153.5281784
>
> All help appreciated,
> Steve
>
> On 13/02/2015 05:08, Bruce Ashfield wrote:
>> On 2015-02-12 7:20 PM, William Mills wrote:
>>>
>>>
>>> On 02/12/2015 05:05 PM, Stephen Flowers wrote:
>>>>
>>>> So I ran cyclictest with an idle system and loaded with multiple
>>>> instances of cat /dev/zero > /dev/null &
>>>>
>>>
>>> When I suggested filesystem activity, I meant getting a kernel
>>> filesystem and a physical I/O device active.
>>> The load above only touches two character devices, so not much kernel
>>> code is exercised.
>>>
>>> If you are interested in pursuing this further I would write a script
>>> that writes multiple files to MMC and then deletes them and do this in
>>> a loop.
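A minimal script along those lines might look like this (a sketch, assuming
a POSIX shell on the target; TARGET is a placeholder -- point it at a
directory on the MMC card, it defaults to /tmp here only so the sketch runs
anywhere):

```shell
#!/bin/sh
# Sketch of an MMC I/O load: write several files, sync, delete, repeat.
# TARGET is a placeholder for a directory on the MMC card (e.g. /media/card).
TARGET=${TARGET:-/tmp/mmc-load}
mkdir -p "$TARGET"
n=0
while [ "$n" -lt 3 ]; do            # raise the bound for a sustained load
    for f in f1 f2 f3 f4; do
        dd if=/dev/urandom of="$TARGET/$f" bs=64k count=16 2>/dev/null
    done
    sync                            # force the writes out to the device
    rm -f "$TARGET"/f1 "$TARGET"/f2 "$TARGET"/f3 "$TARGET"/f4
    n=$((n + 1))
done
rmdir "$TARGET"
```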
>>
>> The mmc/flash/usb are definitely hot paths for any -rt kernel
>> and will really show any lurking latency issues.
>>
>>>
>>> Perhaps Bruce knows if there is already a test like this in the
>>> rt-tests.
>>
>> It seems like everyone has their own set of scripts that load
>> cpu, io and memory. I know that we have a few @ Wind River that
>> really kick the crap out of a system.
>>
>> rt-tests itself doesn't have any packaged, but it really sounds
>> like something we should pull together.
>>
>> In the meantime, using a combo of lmbench, an application that
>> allocates and frees memory and a "find /" will generate a pretty
>> good load on the system.
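For instance, a rough, bounded sketch of such a mixed load (assuming a
POSIX shell; for a real soak test the loops would run indefinitely while
cyclictest measures on another core):

```shell
# Parallel filesystem walks plus streaming memory traffic.
for i in 1 2 3; do
    find /usr -xdev -type f >/dev/null 2>&1 &
done
dd if=/dev/zero of=/dev/null bs=1M count=64 2>/dev/null &
wait                 # let all the background load finish
loadgen_done=1
```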
>>
>>>
>>>> #cyclictest -a 0 -p 99 -m -n -l 100000 -q
>>>>
>>>> I ran this command as shown by Toyooka at the 2014 LinuxCon Japan
>>>> [http://events.linuxfoundation.org/sites/events/files/slides/toyooka_LCJ2014_v10.pdf]
>>>>
>>>>
>>>>
>>>> to compare against his results for the BBB.  I also threw in xenomai
>>>> with kernel 3.8 for comparison.  For the standard kernel HR timers were
>>>> disabled.
>>>
>>> I believe cyclictest requires HR timers for proper operation.
>>
>> You are correct.
>>
>>> This may explain the very strange numbers for standard kernel below.
>>>
>>>>
>>>> [idle]
>>>> preempt_rt: min 12 avg: 20 max: 59
>>>> standard: min: 8005 avg: 309985,955 max: 619963985
>>>> xenomai: min: 8 avg: 16: max 803
>>>>
>>>> [loaded]
>>>> preempt_rt: min 16 avg: 21 max: 47
>>>> standard: min: 15059 avg: 67769851 max: 135530885
>>>> xenomai: min: 10 avg: 15: max 839
>>>>
>>>
>>> Yes, the RT numbers now look reasonable.
>>>
>>> The standard kernel numbers are way out.  I can't believe the average
>>> latency on an idle system was 5 minutes. Perhaps the dependency on HR
>>> timers is more than I expect and without it the numbers are just
>>> bonkers. I would have expected the numbers to have a floor near the tick
>>> rate w/o HR.
>>> Bruce: Is that really what that number means??
>>
>> Without hrtimers, the results really can get out of whack.
>> cyclictest should be yelling when it starts if they aren't found in
>> the system. While I would expect them to be worse (i.e. jiffies
>> granularity ~ 10ms without HRT), I wouldn't expect them to be that
>> bad .. it smells more like cyclictest is using an uninitialized
>> variable when high res timers aren't in play.
>>
>>>
>>> The loaded numbers are smaller for RT and std.  Strange.
>>> It might be that the "load" is not very significant.
>>
>> Or the cache is staying hot, and hence -mm is staying out of the way.
>> We've seen variants of this as well: keeping one cpu in a tight
>> loop and then measuring interrupt latency on a second cpu results
>> in better latencies.
>>
>>> It's not really the CPU load that we're after.  Instead we are trying to
>>> activate code paths that have preemption disabled due to critical
>>> sections and locks.
>>>
>>> I don't know if you are interested in taking this to ground, but if so
>>> I would enable HR timers in the standard kernel and try a load as I
>>> suggested above, or one already included in rt-tests.
>>> Bruce certainly knows more about this than I do and might suggest a
>>> load script.
>>
>> See above.
>>
>> Also, let cyclictest trigger ftrace on your behalf, and the pathological
>> case triggering the biggest spikes will be caught.
>>
>> Cheers,
>>
>> Bruce
>>
>>>
>>>> Actually the preempt_rt results tie up pretty well with Toyooka's above,
>>>> leading me to conclude there's something off in my code that could be
>>>> optimised - what do you guys think?
>>>
>>> Is your test code userspace or kernel space?
>>> You can look at cyclictest to see if you missed something.
>>> The RT wiki also has some examples for RT apps.
>>>
>>> https://rt.wiki.kernel.org/index.php/HOWTO:_Build_an_RT-application
>>>
>>>> Also, I ran a test with preempt_rt at 100Hz and there was maybe 10%
>>>> improvement in latency.
>>>>
>>> That sounds reasonable to me.
>>>
>>>
>>>> Steve
>>>>
>>>> On 12/02/2015 00:35, William Mills wrote:
>>>>> + meta-ti
>>>>> Please keep meta-ti in the loop.
>>>>>
>>>>> [Sorry for the shortening.  Thunderbird kept locking up when I tried
>>>>> to reply-all in plain text to this message.]
>>>>>
>>>>> ~ 15-02-11, Stephen Flowers wrote:
>>>>> > Thanks for your input.  Here are results of 1000 samples over a
>>>>> > 10 second period:
>>>>> >
>>>>> > Interrupt response (microseconds)
>>>>> > standard: min: 81, max:118, average: 84
>>>>> > rt: min: 224, max: 289, average: 231
>>>>> >
>>>>> >Will share the .config later once I get on that machine.
>>>>>
>>>>> Steve I agree the numbers look strange.
>>>>> There may well be something funny for RT going on for BBB.
>>>>> TI is just starting to look into RT for BBB.
>>>>>
>>>>> I would like to see the cyclictest results under heavy system load for
>>>>> standard and RT kernels.  The whole point of RT is to limit the max
>>>>> latency when the system is doing *anything*.
>>>>>
>>>>> I am not surprised that the standard kernel has good latency when
>>>>> idle.
>>>>> As you add load (filesystem is usually a good load) you should see
>>>>> that max goes up a lot.
>>>>>
>>>>> Also, as Bruce says, some degradation of min and average and also
>>>>> general system throughput is expected for RT.  That is the trade-off.
>>>>> I still think the numbers you are getting for RT seem high but I don't
>>>>> know what your test is doing in detail.  (I did read your
>>>>> explanation.)
>>>>> cyclictest should give us a standard baseline.
>>>>>
>>>>>
>>>>> On 02/11/2015 10:25 AM, Bruce Ashfield wrote:
>>>>>> On 15-02-11 03:50 AM, Stephen Flowers wrote:
>>>>>>>
>>>>>>> my bad, here is the patch set.
>>>>>>> As for load, only system idle load for the results I posted
>>>>>>> previously.
>>>>>>> Will run some cyclic test next.
>>>>>>
>>>>>> One thing that did jump out was the difference in CONFIG_HZ: you
>>>>>> are taking a lot more ticks in the preempt-rt configuration. If
>>>>>> you run both at the same HZ, or with NO_HZ enabled, it would be
>>>>>> interesting to see if there's a difference.
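As an illustration only (example values, not a recommendation), matching
the tick settings in both kernels' .config could look like:

```
# Example fragment: run both kernels at the same tick rate
CONFIG_HZ_250=y
CONFIG_HZ=250
# or enable the tickless option for comparison
CONFIG_NO_HZ=y
```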
>>>>>>
>>>>>> Bruce
>>>>
>>
>

