wgpu::Instance fd leak #3813

Closed
wez opened this issue May 29, 2023 · 3 comments
Labels
api: gles Issues with GLES or WebGL platform: x11 Issues with integration with linux/x11 type: bug Something isn't working

Comments

@wez
Contributor

wez commented May 29, 2023

Description

wgpu::Instance::new leaks a unix domain socket on each call

Repro steps

Strip down the hello example:

async fn run() {
    for _ in 0..10 {
        let _instance = wgpu::Instance::default();
    }
}

fn main() {
    env_logger::init();
    pollster::block_on(run());
    std::thread::sleep(std::time::Duration::from_secs(600));
}

Then run lsof -p PID where PID is the pid of the hello example:

$ ps -ef | grep hello
wez      1741941 1686246  5 11:09 pts/2    00:00:00 /home/wez/wez-personal/github/wgpu/target/debug/examples/hello
wez      1742031 1703851  0 11:09 pts/3    00:00:00 grep --color=auto hello
$ lsof -p 1741941
COMMAND     PID USER   FD   TYPE             DEVICE  SIZE/OFF     NODE NAME
...
hello   1741941  wez    3u  unix 0x00000000c9baa492       0t0 77484943 type=STREAM (CONNECTED)
hello   1741941  wez    4u  unix 0x00000000371d2572       0t0 77488864 type=STREAM (CONNECTED)
hello   1741941  wez    5u  unix 0x00000000d88f2c4d       0t0 77488866 type=STREAM (CONNECTED)
hello   1741941  wez    6u  unix 0x000000007168298a       0t0 77488868 type=STREAM (CONNECTED)
hello   1741941  wez    7u  unix 0x00000000cdf7593e       0t0 77488870 type=STREAM (CONNECTED)
hello   1741941  wez    8u  unix 0x000000000fd77226       0t0 77488872 type=STREAM (CONNECTED)
hello   1741941  wez    9u  unix 0x00000000f7236e0d       0t0 77488874 type=STREAM (CONNECTED)
hello   1741941  wez   10u  unix 0x00000000e0391444       0t0 77488876 type=STREAM (CONNECTED)
hello   1741941  wez   11u  unix 0x00000000b0d2e05d       0t0 77488878 type=STREAM (CONNECTED)
hello   1741941  wez   12u  unix 0x000000009376cc41       0t0 77488880 type=STREAM (CONNECTED)

The number of unix domain socket lines corresponds to the number of times that wgpu::Instance::new is called.
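The same check can be done from inside the process instead of with lsof. The following is a minimal sketch, assuming Linux (it reads `/proc/self/fd`); the helper names `open_socket_fds` and `leaked_sockets` are hypothetical and not part of wgpu.

```rust
use std::fs;
use std::io;

/// Returns the file descriptors of this process that refer to sockets,
/// based on the symlink targets in /proc/self/fd (Linux only).
fn open_socket_fds() -> io::Result<Vec<i32>> {
    let mut fds = Vec::new();
    for entry in fs::read_dir("/proc/self/fd")? {
        let entry = entry?;
        // Each entry is a symlink; socket targets look like "socket:[77488864]".
        let target = match fs::read_link(entry.path()) {
            Ok(t) => t,
            Err(_) => continue, // the fd may have been closed while iterating
        };
        if target.to_string_lossy().starts_with("socket:") {
            if let Some(fd) = entry.file_name().to_str().and_then(|s| s.parse().ok()) {
                fds.push(fd);
            }
        }
    }
    fds.sort_unstable();
    Ok(fds)
}

/// Runs `f` and reports how many socket fds that were not open before
/// the call are still open after it returned.
fn leaked_sockets<F: FnOnce()>(f: F) -> io::Result<usize> {
    let before = open_socket_fds()?;
    f();
    let after = open_socket_fds()?;
    Ok(after.iter().filter(|fd| !before.contains(fd)).count())
}
```

Wrapping the loop body from the repro in `leaked_sockets(|| { let _instance = wgpu::Instance::default(); })` would then be expected to report a non-zero count on an affected system, and zero once the instance releases everything on drop.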

Expected vs observed behavior

I'd expect all resources to be released when the Instance is dropped. Instead, resources are leaked.

Extra materials

See above

Platform

This was reported in:

but reproduces with 4478c52 of this repo, independent of wezterm.

This is running on Fedora 38 under X11 with the following adapters:

> wezterm.gui.enumerate_gpus()
[
    {
        "backend": "Vulkan",
        "device": 29730,
        "device_type": "DiscreteGpu",
        "driver": "radv",
        "driver_info": "Mesa 23.0.3",
        "name": "AMD Radeon Pro W6400 (RADV NAVI24)",
        "vendor": 4098,
    },
    {
        "backend": "Vulkan",
        "device": 0,
        "device_type": "Cpu",
        "driver": "llvmpipe",
        "driver_info": "Mesa 23.0.3 (LLVM 16.0.1)",
        "name": "llvmpipe (LLVM 16.0.1, 256 bits)",
        "vendor": 65541,
    },
    {
        "backend": "Gl",
        "device": 0,
        "device_type": "Other",
        "name": "AMD Radeon Pro W6400 (navi24, LLVM 16.0.1, DRM 3.49, 6.2.14-300.fc38.x86_64)",
        "vendor": 4098,
    },
]

The user reporting the issue to wezterm was running X11 with awesome on an nvidia system; I don't believe that this is driver specific.

@cwfitzgerald
Member

Is there any possible way to get the backtrace where these FDs are made?

@wez
Contributor Author

wez commented Jun 1, 2023

From experimenting with strace, I found that these are X11 display sockets.
Setting the DISPLAY environment variable to an invalid value prevents the leak, but it also prevents using wgpu with X11 at all.

gdb suggests:

#0  __GI_socket () at ../sysdeps/unix/syscall-template.S:120
#1  0x00007ffff6ed1a8f in _xcb_socket (family=family@entry=1, type=type@entry=1, proto=proto@entry=0) at /usr/src/debug/libxcb-1.13.1-11.fc38.x86_64/src/xcb_util.c:317
#2  0x00007ffff6ed3f91 in _xcb_open_abstract (filelen=17, file=0x5555566505e0 "/tmp/.X11-unix/X0", protocol=0x0)
    at /usr/src/debug/libxcb-1.13.1-11.fc38.x86_64/src/xcb_util.c:476
#3  _xcb_open (display=0, protocol=0x0, host=0x55555672f090 "") at /usr/src/debug/libxcb-1.13.1-11.fc38.x86_64/src/xcb_util.c:291
#4  xcb_connect_to_display_with_auth_info (displayname=displayname@entry=0x0, auth=auth@entry=0x0, screenp=screenp@entry=0x0)
    at /usr/src/debug/libxcb-1.13.1-11.fc38.x86_64/src/xcb_util.c:515
#5  0x00007ffff6ed46ee in xcb_connect (displayname=displayname@entry=0x0, screenp=screenp@entry=0x0) at /usr/src/debug/libxcb-1.13.1-11.fc38.x86_64/src/xcb_util.c:489
#6  0x00007fffed2ff47a in _XConnectXCB (dpy=0x5555567335d0, display=0x0, screenp=0x7fffffff7bcc) at /usr/src/debug/libX11-1.8.4-1.fc38.x86_64/src/xcb_disp.c:78
#7  0x00007fffed2f04de in XOpenDisplay (display=0x0) at /usr/src/debug/libX11-1.8.4-1.fc38.x86_64/src/OpenDis.c:129
#8  0x0000555555d2086f in wgpu_hal::gles::egl::open_x_display () at wgpu-hal/src/gles/egl.rs:120
#9  0x0000555555d25a14 in wgpu_hal::gles::egl::{impl#12}::init (desc=0x7fffffffa748) at wgpu-hal/src/gles/egl.rs:685
#10 0x0000555555c7bedd in wgpu_core::instance::{impl#0}::new::init<wgpu_hal::gles::Api> (instance_desc=0x7fffffffaa00) at wgpu-core/src/instance.rs:86
#11 0x0000555555c4ded9 in wgpu_core::instance::Instance::new (name=..., instance_desc=...) at wgpu-core/src/instance.rs:103
#12 0x0000555555a87885 in wgpu_core::global::Global<wgpu_core::identity::IdentityManagerFactory>::new<wgpu_core::identity::IdentityManagerFactory> (name=..., factory=...,
    instance_desc=...) at wgpu-core/src/global.rs:36
#13 0x000055555599f801 in wgpu::backend::direct::{impl#7}::init (instance_desc=...) at wgpu/src/backend/direct.rs:543
#14 0x0000555555837d4f in wgpu::Instance::new (instance_desc=...) at wgpu/src/lib.rs:1345
#15 0x0000555555837cfa in wgpu::{impl#21}::default () at wgpu/src/lib.rs:1332
#16 0x0000555555686dfb in hello::run::{async_fn#0} () at wgpu/examples/hello/main.rs:3
#17 0x0000555555686b7b in core::future::from_generator::{impl#1}::poll<hello::run::{async_fn_env#0}> (self=..., cx=0x7fffffffd988)
    at /rustc/a55dd71d5fb0ec5a6a3a9e8c27b2127ba491ce52/library/core/src/future/mod.rs:91
#18 0x0000555555687090 in pollster::block_on<core::future::from_generator::GenFuture<hello::run::{async_fn_env#0}>> (fut=...)
    at /home/wez/.cargo/registry/src/github.com-1ecc6299db9ec823/pollster-0.2.5/src/lib.rs:125
#19 0x0000555555686a95 in hello::main () at wgpu/examples/hello/main.rs:9

It seems as though:

let func: libloading::Symbol<XOpenDisplayFun> = library.get(b"XOpenDisplay").unwrap();

is missing a Drop impl that would guarantee the X display gets closed. There are also two different calls to open_x_display in that file that may leak.

@teoxoy teoxoy added type: bug Something isn't working api: gles Issues with GLES or WebGL platform: x11 Issues with integration with linux/x11 labels Jun 1, 2023
wez added a commit to wez/wgpu that referenced this issue Jul 12, 2023
Introduces a DisplayOwner struct to own both the library
and associated display pointer; their lifetimes are combined
in that struct.

The display pointer is encapsulated in a DisplayRef.

When DisplayOwner is dropped, it ensures that the DisplayRef
is correctly closed prior to unloading the library.

refs: gfx-rs#3813
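
The ownership pattern described in this commit message can be sketched roughly as follows. This is an illustration only, not the code from the actual commit: `FakeLibrary` is a hypothetical stand-in for the dynamically loaded X11 library, the display is modeled as a plain `usize` rather than the `DisplayRef` mentioned above, and a shared log records the order of operations. It relies on the Rust rule that `Drop::drop` for a struct runs before its fields are dropped.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Shared log used only to make the order of operations observable.
type Log = Rc<RefCell<Vec<&'static str>>>;

/// Hypothetical stand-in for a dynamically loaded libX11.
struct FakeLibrary {
    log: Log,
}

impl FakeLibrary {
    /// Stand-in for calling XOpenDisplay through the library.
    fn open_display(&self) -> usize {
        self.log.borrow_mut().push("open display");
        1
    }

    /// Stand-in for calling XCloseDisplay through the library.
    fn close_display(&self, _display: usize) {
        self.log.borrow_mut().push("close display");
    }
}

impl Drop for FakeLibrary {
    fn drop(&mut self) {
        // Stand-in for unloading the shared library.
        self.log.borrow_mut().push("unload library");
    }
}

/// Owns both the library and the display handle it produced,
/// so their lifetimes are tied together.
struct DisplayOwner {
    library: FakeLibrary,
    display: usize,
}

impl DisplayOwner {
    fn open(library: FakeLibrary) -> Self {
        let display = library.open_display();
        DisplayOwner { library, display }
    }
}

impl Drop for DisplayOwner {
    fn drop(&mut self) {
        // This runs before the `library` field is dropped, so the
        // library is still loaded while the display is being closed.
        self.library.close_display(self.display);
    }
}
```

Creating a `DisplayOwner` in a scope and letting it fall out of scope records "open display", "close display", "unload library" in that order, which is the ordering the commit message asks for.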
nical pushed a commit that referenced this issue Jul 13, 2023
Introduces a DisplayOwner struct to own both the library
and associated display pointer; their lifetimes are combined
in that struct.

The display pointer is encapsulated in a DisplayRef.

When DisplayOwner is dropped, it ensures that the DisplayRef
is correctly closed prior to unloading the library.

refs: #3813
@wez
Contributor Author

wez commented Jul 22, 2023

Fixed by #3924
