[meta-freescale] Chromium file descriptor leak and crash issue on imx6 with Weston

Thu Oct 8 11:11:10 PDT 2015

Hello everyone,

We're currently developing a product that relies heavily on Chromium to
display content in a loop (including WebGL and videos) and are running into
a few stability issues. At this point we're unsure whether they are WebGL or
video related (or both).

One thing we noticed is that a lot of file descriptors seem to be leaking
from the gpu-process. Whenever a video is playing, reloading the page,
loading a new URL, or closing the tab seems to leak file descriptors
according to /proc/[gpu-process-pid]/fd. However, if we wait for the video
to end completely before doing anything else, the leak does not seem to
happen. I've been able to reproduce the issue on our custom hardware
(imx6qdl based) using Freescale Yocto 1.6, as well as on a Nitrogen6X using
the community's 1.7 and 1.8. We tried with both Chromium 38.0.2125.101 and
40.0.2214.91. Galcore version is 5.0.11:25762.

The simplest way to reproduce the issue is to start Chromium with the
following video:
(Note: we run Chromium in incognito mode as it seems more stable, but
running as a normal session behaves the same)
google-chrome --incognito http://techslides.com/demos/sample-videos/small.mp4

Once the page is loaded, playing the video a couple of times while
monitoring the number of fds in /proc/[gpu-process-pid]/fd should show that
there are 3 more fds while the video is playing, and they disappear once the
video ends. For example:
(while playing)
root@imx6qh120:~# ls -al /proc/2007/fd | wc -l
63
(video ended)
root@imx6qh120:~# ls -al /proc/2007/fd | wc -l
60

If you then start the video and hit reload while the video is playing, here
is what you should see (result should be similar if you close the tab
instead):
(while playing)
root@imx6qh120:~# ls -al /proc/2007/fd | wc -l
70
(video ended)
root@imx6qh120:~# ls -al /proc/2007/fd | wc -l
68
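
For anyone wanting to repeat the measurement, here is a minimal sketch of a
sampler. Note that `ls -al | wc -l` also counts the "total" line plus the "."
and ".." entries, so the numbers above are 3 higher than the real fd count;
the relative differences are unaffected. The function names `count_fds` and
`watch_fds` and their parameters are made up for this illustration.

```shell
#!/bin/sh
# Count the entries in an fd directory such as /proc/<pid>/fd.
# `ls -A` omits "." and ".." and prints no "total" line, so the
# result is the actual number of open descriptors.
count_fds() {
    # $1: directory to inspect (hypothetical parameter, e.g. /proc/2007/fd)
    ls -A "$1" | wc -l | tr -d ' '
}

# Print a timestamped fd count a fixed number of times.
watch_fds() {
    # $1: fd directory, $2: seconds between samples, $3: number of samples
    i=0
    while [ "$i" -lt "$3" ]; do
        echo "$(date +%H:%M:%S) $(count_fds "$1")"
        i=$((i + 1))
        if [ "$i" -lt "$3" ]; then
            sleep "$2"
        fi
    done
}
```

Something like `watch_fds /proc/2007/fd 1 30` run in a second terminal while
playing and reloading the video should make the steps in the count visible.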

Each time this happens, fds appear to be left trailing behind, looking like
this:
(here, the chrome process flagged as "--type=gpu-process" is 1176)

ls -al /proc/1176/fd
[…snip]
lrwx------ 1 root root 64 Oct  8 12:03 50 -> /dev/shm/.org.chromium.Chromium.yqa9hs (deleted)
lrwx------ 1 root root 64 Oct  8 12:03 51 -> /dev/shm/.org.chromium.Chromium.rxZ7gw (deleted)
lrwx------ 1 root root 64 Oct  8 12:03 53 -> /dev/shm/.org.chromium.Chromium.wQhgYz (deleted)
lrwx------ 1 root root 64 Oct  8 12:03 54 -> /dev/shm/.org.chromium.Chromium.Sd1bnl (deleted)
lrwx------ 1 root root 64 Oct  8 12:03 55 -> /dev/shm/.org.chromium.Chromium.roolL0 (deleted)
lrwx------ 1 root root 64 Oct  8 12:03 56 -> /dev/shm/.org.chromium.Chromium.8b1eGN (deleted)
lrwx------ 1 root root 64 Oct  8 12:03 59 -> /dev/shm/.org.chromium.Chromium.k8DJ8O (deleted)
lr-x------ 1 root root 64 Oct  8 12:03 6 -> /dev/urandom
lrwx------ 1 root root 64 Oct  8 12:03 63 -> /dev/shm/.org.chromium.Chromium.dC0JSb (deleted)
lrwx------ 1 root root 64 Oct  8 12:03 64 -> /dev/shm/.org.chromium.Chromium.kvYJfm (deleted)
lrwx------ 1 root root 64 Oct  8 12:03 66 -> /dev/shm/.org.chromium.Chromium.3Uvica (deleted)
lrwx------ 1 root root 64 Oct  8 12:03 69 -> /dev/shm/.org.chromium.Chromium.n0KK3C (deleted)
lrwx------ 1 root root 64 Oct  8 12:03 7 -> socket:[9825]
lrwx------ 1 root root 64 Oct  8 12:03 70 -> /dev/shm/.org.chromium.Chromium.4frFLT (deleted)
lrwx------ 1 root root 64 Oct  8 12:03 73 -> /dev/shm/.org.chromium.Chromium.T3zH8f (deleted)
lrwx------ 1 root root 64 Oct  8 12:03 8 -> /run/user/root/weston-shared-mt7RJ3 (deleted)
lrwx------ 1 root root 64 Oct  8 12:06 84 -> socket:[45599]
lrwx------ 1 root root 64 Oct  8 12:06 88 -> socket:[47512]
lrwx------ 1 root root 64 Oct  8 12:03 9 -> /dev/fb0
lrwx------ 1 root root 64 Oct  8 12:06 92 -> /dev/shm/.org.chromium.Chromium.Vjrn54 (deleted)
lrwx------ 1 root root 64 Oct  8 12:03 77 -> /dev/shm/.org.chromium.Chromium.l7PDMZ (deleted)
[…snip…]
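
To track just the suspicious entries rather than the total, a rough sketch
like the following counts only the descriptors whose link target looks like
a deleted Chromium shared-memory segment, as in the listing above. The
function name `count_deleted_shm` is made up for this illustration, and it
assumes `readlink` is available on the target (it is in both coreutils and
BusyBox).

```shell
#!/bin/sh
# Count fds whose symlink target matches the deleted Chromium
# shared-memory pattern seen in the listing.
count_deleted_shm() {
    # $1: fd directory (hypothetical parameter, e.g. /proc/1176/fd)
    n=0
    for fd in "$1"/*; do
        # Skip anything that is not a symlink (also covers an empty directory,
        # where the glob stays unexpanded).
        [ -L "$fd" ] || continue
        case "$(readlink "$fd")" in
            /dev/shm/.org.chromium.Chromium.*" (deleted)") n=$((n + 1)) ;;
        esac
    done
    echo "$n"
}
```

Running `count_deleted_shm /proc/1176/fd` before and after a reload during
playback should show whether the leftover fds are all of this kind.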

There are 2 crash outputs we see frequently. The most common one:
----------------------------------------------------------------------
[5101:5101:0930/182607:ERROR:power_save_blocker_ozone.cc(32)] Not implemented reached in virtual content::PowerSaveBlockerImpl::~PowerSaveBlockerImpl()
[8314:8314:0930/182608:ERROR:sandbox_linux.cc(301)] InitializeSandbox() called with multiple threads in process gpu-process
[5101:5101:0930/182609:ERROR:command_buffer_proxy_impl.cc(150)] Could not send GpuCommandBufferMsg_Initialize.
[5101:5101:0930/182609:ERROR:webgraphicscontext3d_command_buffer_impl.cc(213)] CommandBufferProxy::Initialize failed.
[5101:5101:0930/182609:ERROR:webgraphicscontext3d_command_buffer_impl.cc(230)] Failed to initialize command buffer.
[8258:8265:0930/182609:ERROR:webgraphicscontext3d_command_buffer_impl.cc(274)] Failed to initialize GLES2Implementation.
[5152:5177:0930/182609:ERROR:webgraphicscontext3d_command_buffer_impl.cc(274)] Failed to initialize GLES2Implementation.
[8320:8320:0930/182609:ERROR:sandbox_linux.cc(301)] InitializeSandbox() called with multiple threads in process gpu-process
[5152:5177:0930/182609:ERROR:webgraphicscontext3d_command_buffer_impl.cc(274)] Failed to initialize GLES2Implementation.
[8258:8265:0930/182609:ERROR:command_buffer_proxy_impl.cc(150)] Could not send GpuCommandBufferMsg_Initialize.
[5101:5101:0930/182609:ERROR:command_buffer_proxy_impl.cc(150)] Could not send GpuCommandBufferMsg_Initialize.
[5101:5101:0930/182609:ERROR:webgraphicscontext3d_command_buffer_impl.cc(213)] CommandBufferProxy::Initialize failed.
[5101:5101:0930/182609:ERROR:webgraphicscontext3d_command_buffer_impl.cc(230)] Failed to initialize command buffer.
[8258:8265:0930/182609:ERROR:webgraphicscontext3d_command_buffer_impl.cc(213)] CommandBufferProxy::Initialize failed.
[8258:8265:0930/182609:ERROR:webgraphicscontext3d_command_buffer_impl.cc(230)] Failed to initialize command buffer.
[5101:5101:0930/182609:ERROR:surface_factory_ozone.cc(53)] Not implemented reached in virtual scoped_ptr<ui::SurfaceOzoneCanvas> ui::SurfaceFactoryOzone::CreateCanvasForWidget(gfx::AcceleratedWidget)
[5101:5101:0930/182609:FATAL:software_output_device_ozone.cc(22)] Failed to initialize canvas
----------------------------------------------------------------------

And this one:
----------------------------------------------------------------------
[16856:16856:1006/110817:ERROR:power_save_blocker_ozone.cc(29)] Not implemented reached in content::PowerSaveBlockerImpl::PowerSaveBlockerImpl(content::PowerSaveBlocker::PowerSaveBlockerType, const string&)
[17672:17672:1006/110822:ERROR:display.cc(87)] WaylandDisplay failed to initialize hardware
[17672:17672:1006/110822:FATAL:ozone_platform_wayland.cc(106)] failed to initialize display hardware
[16856:16856:1006/110822:ERROR:surface_factory_ozone.cc(53)] Not implemented reached in virtual scoped_ptr<ui::SurfaceOzoneCanvas> ui::SurfaceFactoryOzone::CreateCanvasForWidget(gfx::AcceleratedWidget)
[16856:16856:1006/110822:FATAL:software_output_device_ozone.cc(22)] Failed to initialize canvas
----------------------------------------------------------------------

Also, unless we perform the dreaded trick of dropping caches every now and
then, we get the usual:
[742:742:1006/162906:ERROR:texture_manager.cc(1706)] [.GPU-VideoAccelerator-Offscreen-0x78323380]GL ERROR :GL_OUT_OF_MEMORY : glTexImage2D:

I'm basically looking to see if anyone knows how "problematic" the file
descriptor leaks can be to the system, and whether they could explain why we
get so many crashes while loading content. Someone here did some tracing on
a debug build when Chrome crashed and the crash was in libvpu, but couldn't
get more useful info. From what we understand, libvpu maps CMA memory for
the application, and we suspect that it's possibly being misused, leading to
a crash.
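
One concrete way an fd leak can become a problem is the process reaching its
open-file limit, after which calls such as open() and socket() fail with
EMFILE. Whether that has anything to do with the crashes above is not
established; the sketch below only shows how the current usage could be
compared against the soft limit reported in /proc/[pid]/limits. The function
names `soft_fd_limit` and `fd_headroom` are made up for this illustration.

```shell
#!/bin/sh
# Extract the soft "Max open files" limit from a limits file
# such as /proc/<pid>/limits.
soft_fd_limit() {
    # $1: path to the limits file; the soft limit is the 4th field
    # of the "Max open files" row.
    awk '/^Max open files/ { print $4 }' "$1"
}

# Report how many fds are in use relative to the soft limit.
fd_headroom() {
    # $1: fd directory (e.g. /proc/1176/fd)
    # $2: limits file  (e.g. /proc/1176/limits)
    used=$(ls -A "$1" | wc -l | tr -d ' ')
    limit=$(soft_fd_limit "$2")
    echo "$used of $limit fds in use"
}
```

For example, `fd_headroom /proc/1176/fd /proc/1176/limits` could be logged
periodically to see how quickly the gpu-process approaches its limit.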

Any pointers are appreciated.
Thanks!

--
Dominique Bureau
