[yocto] gdb built with musl libc segfault

Paul Barker paul at betafive.co.uk
Wed Apr 3 07:16:18 PDT 2019


On 02/04/2019 14:51, Lluis Campos wrote:
> Hi Paul,
> 
> 
> On 02.04.2019 14:49, Paul Barker wrote:
>> On 02/04/2019 12:45, Lluis Campos wrote:
>>> Hi all,
>>>
>>> This is my very first question on the Yocto mailing list. Very 
>>> excited! Please let me know if I should use another list for this.
>>>
>>>
>>> I am building an image using musl libc instead of GNU libc. I am not 
>>> using the poky-tiny distro; instead I achieve this by setting the 
>>> following in my local.conf:
>>>
>>> TCLIBC = "musl"
>>>
>>>
>>> My app (mender) segfaults immediately on startup. See the output from 
>>> strace:
>>>
>>> root@raspberrypi3:~# strace mender
>>> execve("/usr/bin/mender", ["mender"], 0x7ee65e10 /* 13 vars */) = 0
>>> set_tls(0x76f1bffc)                     = 0
>>> set_tid_address(0x76f1bfa0)             = 3020
>>> --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_ACCERR, si_addr=0x530bc8} ---
>>> +++ killed by SIGSEGV +++
>>> Segmentation fault
>>>
>>>
>>> To be able to debug the process, I added gdb to my image by adding the 
>>> following to my local.conf:
>>>
>>> CORE_IMAGE_EXTRA_INSTALL += "packagegroup-core-buildessential packagegroup-core-tools-debug"
>>>
>>>
>>> Then, ironically, gdb itself also segfaults:
>>>
>>> root@raspberrypi3:~# strace gdb 2>&1 | tail
>>> fcntl64(3, F_SETFD, FD_CLOEXEC)         = 0
>>> getdents64(3, /* 6 entries */, 2048)    = 144
>>> getdents64(3, /* 0 entries */, 2048)    = 0
>>> close(3)                                = 0
>>> ioctl(0, TIOCGWINSZ, {ws_row=25, ws_col=74, ws_xpixel=0, ws_ypixel=0}) = 0
>>> getcwd("/home/root", 4096)              = 11
>>> access("/usr/local/bin/gdb", X_OK)      = -1 ENOENT (No such file or directory)
>>> access("/usr/bin/gdb", X_OK)            = 0
>>> --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x7e35aff0} ---
>>> +++ killed by SIGSEGV +++
>>>
>>>
>>> So, what is going on here? My guess is that some recipes are being 
>>> wrongly linked with gnu libc instead of musl, and then cannot run in 
>>> my device.
>>>
>>> Any ideas on how to debug the issue?
>>>
>>
>> Hi Lluis,
>>
>> This is an issue I've seen before with runc and gdb.
>>
>> In runc we saw SIGILL which I tracked down to some hideous 
>> setjmp/longjmp magic written in C. cgo is used to include this C code 
>> in with the Go code that comprises the rest of the application.
> 
> Our application is written in Go and we use CGO as well. So it sounds 
> quite similar.
> 
> 

Does your application use setjmp/longjmp or C++ exceptions in the 
sections built with CGO?

I've dug into the runc disassembly and added extra prints, and I can see 
that it's at calls to those functions that the program counter jumps off 
into the weeds, resulting in SIGILL. In gdb, C++ exception handling is 
used around gdb_main(), and strace shows that the crash happens very 
early in execution, so I suspect it's setjmp/longjmp again, used by C++ 
exception handling.

>>
>> In gdb we saw SIGSEGV which is what you've got above.
>>
>> I think things are being correctly linked against musl but then 
>> there's some runtime issue in recent musl versions, possibly in 
>> conjunction with recent kernel headers.
>>
>> Are you using the thud or master branch?
>>
> I am using the thud branch. I haven't tried master yet, but I will do 
> so later today.
> 

I've reproduced the issue on master. The issue does not occur on the 
sumo branch, even if we uprev musl to the same version used on master. 
It's likely a gcc/musl incompatibility of some kind introduced in a 
recent gcc version.

I'm passing this one to a colleague for now but I'll try to have another 
look myself next week. It's captured in our issue tracker for Oryx here: 
https://gitlab.com/oryx/oryx/issues/14.

Thanks,

-- 
Paul Barker
Managing Director & Principal Engineer
Beta Five Ltd

