Suspend to RAM doesn't work with NVidia Prime / Thinkpad P1 Gen2 #73494

Open
mdedetrich opened this issue Nov 16, 2019 · 18 comments

@mdedetrich

mdedetrich commented Nov 16, 2019

Describe the bug
On the new Lenovo ThinkPad P1 Gen 2 series (I suspect it's the same on the X1 Extreme), suspend to RAM no longer works when using NVidia Prime. If you disable hybrid graphics in the BIOS and use only the NVidia graphics card, suspend to RAM works fine (although it completely kills your battery and makes the laptop's fans go crazy).

When you suspend to RAM in this configuration and then resume, the screen stays black (the laptop does power up properly and the keyboard lights up). Unlike other suspend-to-RAM black screen problems, the only way out of this one is to physically restart the laptop with the power button. Manually switching to a TTY with Ctrl+Alt+F1 and running systemctl restart display-manager does not work.

To Reproduce
Steps to reproduce the behavior:

  1. Set up NVidia Prime in static mode as described here https://nixos.wiki/wiki/Nvidia
  2. Press suspend to RAM
  3. Wait for laptop to suspend to RAM and then resume by hitting the power button. The screen is now stuck being black.

Expected behavior
Resuming from suspend to RAM works (i.e. it opens your login manager)

Additional context
The relevant NixOS configuration is:

boot.blacklistedKernelModules = [ "nouveau" ];
hardware.nvidia = {
  modesetting.enable = true;
  optimus_prime = {
    enable = true;
    nvidiaBusId = "PCI:1:0.0";
    intelBusId = "PCI:0:2.0";
  };
};
services.xserver.videoDrivers = [ "intel" "nvidiaBeta" ];

Arch Linux has a good resource: https://wiki.archlinux.org/index.php/Lenovo_ThinkPad_X1_Extreme_(Gen_2). According to their documentation, the latest NVIDIA beta drivers should just work (PRIME offloading should also work, although that is covered by PR #66601).

@eadwu noted that the most likely reason behind this is that NixOS uses the default power management rather than one based on systemd. For NVIDIA, it's recommended to use the systemd power management (more info here: https://download.nvidia.com/XFree86/Linux-x86_64/435.17/README/powermanagement.html).

Metadata

 - system: `"x86_64-linux"`
 - host os: `Linux 5.3.11, NixOS, 20.03pre201791.c1966522d7d (Markhor)`
 - multi-user?: `yes`
 - sandbox: `yes`
 - version: `nix-env (Nix) 2.3.1`
 - channels(root): `"nixos-20.03pre201791.c1966522d7d"`
 - channels(mdedetrich): `"nixpkgs-20.03pre201329.f1682a7f126"`
 - nixpkgs: `/home/mdedetrich/.nix-defexpr/channels/nixpkgs`

Maintainer information:

# a list of nixpkgs attributes affected by the problem
attribute:  [ "services.xserver.videoDrivers" "hardware.nvidia" "boot.blacklistedKernelModules" ]
@mdedetrich mdedetrich added the 0.kind: bug Something is broken label Nov 16, 2019
@eadwu
Member

eadwu commented Nov 16, 2019

For me personally, suspend to RAM on NVIDIA has always been "glitchy". I suppose that this issue will remain as long as we use the default power management instead of the systemd-based one NVIDIA recommends.

@mdedetrich
Author

mdedetrich commented Nov 16, 2019

@eadwu

> I suppose that this issue will remain as long as we use the default power management instead of the systemd-based one NVIDIA recommends.

Can you expand on this? What power management does NixOS use? Is it currently possible for NixOS to use the systemd-based power management?

Ah, I suppose you mean https://download.nvidia.com/XFree86/Linux-x86_64/435.17/README/powermanagement.html. I will update the issue with this as a reference.

@mdedetrich
Author

mdedetrich commented Nov 16, 2019

Maybe it makes sense to add a hardware.nvidia.powerManagement.enable attribute along with powerManagement.systemd.enable, so that setting hardware.nvidia.powerManagement.enable to true also sets powerManagement.systemd.enable to true? Alternatively, we could have it so that if powerManagement.systemd.enable = true and hardware.nvidia.enable = true, the necessary systemd units described in the article are installed.

That sounds like a big feature, though. Should another ticket be made specifically for systemd-style power management?
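
A rough sketch of the first idea as a NixOS module fragment. Everything in it is an assumption for illustration: the option is only a proposal in this thread, the unit wiring loosely follows the NVIDIA README linked above, and the location of `nvidia-sleep.sh` inside the driver package is a guess.

```nix
{ config, lib, pkgs, ... }:

let
  # Proposed option from this thread; hypothetical here.
  cfg = config.hardware.nvidia.powerManagement;

  # Assumption: the driver package ships NVIDIA's nvidia-sleep.sh under bin/.
  sleepScript = "${config.boot.kernelPackages.nvidia_x11.out}/bin/nvidia-sleep.sh";

  # Oneshot unit that asks the driver to save or restore its state.
  mkNvidiaService = action: {
    description = "NVIDIA system ${action} actions";
    # Assumption: the script needs VT tools such as chvt from kbd.
    path = [ pkgs.kbd ];
    serviceConfig = {
      Type = "oneshot";
      ExecStart = "${sleepScript} ${action}";
    };
  };
in
{
  options.hardware.nvidia.powerManagement.enable =
    lib.mkEnableOption "systemd-based NVIDIA suspend/resume handling";

  config = lib.mkIf cfg.enable {
    # Assumption based on the NVIDIA README: the driver only preserves
    # all video memory allocations when this module parameter is set.
    boot.kernelParams = [ "nvidia.NVreg_PreserveVideoMemoryAllocations=1" ];

    # Run before the system actually suspends.
    systemd.services.nvidia-suspend = mkNvidiaService "suspend" // {
      before = [ "systemd-suspend.service" ];
      requiredBy = [ "systemd-suspend.service" ];
    };

    # Run once systemd-suspend.service has finished, i.e. after wake-up.
    systemd.services.nvidia-resume = mkNvidiaService "resume" // {
      after = [ "systemd-suspend.service" ];
      requiredBy = [ "systemd-suspend.service" ];
    };
  };
}
```

On a NixOS release that already declares an option with this name, the declaration above would clash, so treat this as an illustration rather than something to import as-is.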

@eadwu
Member

eadwu commented Nov 16, 2019

I don't know of any other drivers that use systemd-based power management, so if it is just an NVIDIA thing, keeping it under hardware.nvidia is probably the better way to do this.

To expand on the "glitchy" behavior: I don't usually have problems resuming within a short time (<1 min) of suspending with NVIDIA online, but when I leave it suspended for long periods of time, I experience the same problem. This also used to happen when the monitor brightness was 0, but I believe that was fixed by HardDPMS and fixes in the driver.

@mdedetrich
Author

mdedetrich commented Nov 16, 2019

> I don't know of any other drivers that use systemd-based power management, so if it is just an NVIDIA thing, keeping it under hardware.nvidia is probably the better way to do this.

Agreed

> This also used to happen when the monitor brightness was 0, but I believe that was fixed by HardDPMS and fixes in the driver.

I have an OLED screen, so I am not sure whether this has an impact (OLED screens technically don't have a backlight, so the standard kernel way of configuring brightness doesn't work on them).

I also looked at dmesg/journalctl and couldn't find any errors when trying to resume from suspend, so the screen just being "off" could be the issue.

Also, judging from what NVIDIA says about this:

> However, these allocations are collectively large, and typically cannot be evicted. Since the amount of system memory available to drivers at suspend time is often insufficient to accommodate large copies of video memory, the NVIDIA kernel drivers are designed to act conservatively, and normally only save essential video memory allocations.

> The resulting loss of video memory contents is partially compensated for by the user-space NVIDIA drivers, and by some applications, but can lead to failures such as rendering corruption and application crashes upon exit from power management cycles.

Another theoretical reason why the default power management doesn't work is the limited system memory available to the kernel driver at suspend time, and my graphics card has quite a lot of memory (4 GB of VRAM). It would be good to find out whether Arch Linux uses the systemd style of power management.

> This also used to happen when the monitor brightness was 0, but I believe that was fixed by HardDPMS and fixes in the driver.

Is HardDPMS on by default now?

Also, FWIW, hibernate works flawlessly; it's only suspend that's the issue.

@eadwu
Member

eadwu commented Nov 16, 2019

The memory limitation is the most likely cause. On the Arch Linux front, it seems like they use it [1], though I'm not sure whether it's enabled by default. As for HardDPMS, it has been enabled by default in the latest few drivers.

Actually, judging from how Arch Linux installations work, the services are installed, so they are used.

[1] https://git.archlinux.org/svntogit/packages.git/tree/trunk/PKGBUILD?h=packages/nvidia-utils#n166

@mdedetrich
Author

mdedetrich commented Nov 16, 2019

Okay, so this seems like a good first step. Is this something you would be willing to look at, or should I make an attempt? Note that I am very new to Nix/NixOS, so it would likely take me some time to learn how it works properly.

I don't think it should be that difficult, since we already have a reference to work from (i.e. Arch). It would be great if we could get it in by the next stable release (20.03).

@eadwu
Member

eadwu commented Nov 16, 2019

I'll see how much time I can spare. If nobody else picks it up, the implementation should be ready by the next stable release (at the very least, I'll work on it over winter break).

@mdedetrich
Author

An update on this: as of a couple of months ago I am no longer getting this issue, although I am not sure whether this is due to an update in NVIDIA/the kernel or the nix module itself.

@nh2
Contributor

nh2 commented May 8, 2020

> the screen just stays black

I have this problem with my ThinkPad T25 since I upgraded from NixOS 19.09 to 20.03.

@mdedetrich
Author

@nh2

What is your config (i.e. the NVIDIA/X server hardware configuration)?

@nh2
Contributor

nh2 commented May 10, 2020

@mdedetrich

  services.xserver.videoDrivers = [ "nvidia" ];
  hardware.nvidia.optimus_prime.enable = true;
  # Bus ID of the NVIDIA GPU. You can find it using lspci, either under 3D or VGA
  hardware.nvidia.optimus_prime.nvidiaBusId = "PCI:2:0:0";
  # Bus ID of the Intel GPU. You can find it using lspci, either under 3D or VGA
  hardware.nvidia.optimus_prime.intelBusId = "PCI:0:2:0";
  #hardware.nvidia.modesetting.enable = true; # tried both, makes no difference

  services.xserver.displayManager.defaultSession = "xfce+i3";

niklas:~/ $ lspci | grep -i nvidia
02:00.0 3D controller: NVIDIA Corporation GM108M [GeForce 940MX] (rev a2)

@nh2
Contributor

nh2 commented Jun 15, 2020

This still does not work for me, even with these patches applied on top of 20.03 in order:

In both the new offload mode and, before that, in sync mode, I get a black screen in Xorg when resuming from standby. I'm pretty sure the screen is fully off, that is, no backlight is on (it turns on when resuming, showing a white caret on black in the top left for 1 second, then turns fully black; it turns on when switching to a virtual terminal, and back off when switching back with Ctrl+Alt+F7). I then have to switch to a virtual terminal with Ctrl+Alt+F1 and restart Xorg with sudo systemctl restart display-manager. Of course, that loses my desktop session.

I also tried adding "modesetting" to the video drivers, i.e. services.xserver.videoDrivers = [ "modesetting" "nvidia" ]; and hardware.nvidia.modesetting.enable = true. The newly added nvidia-resume.service runs through successfully according to the journalctl output, but it doesn't help.
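
For reference, the variant described in this paragraph corresponds roughly to the following fragment. It is a sketch reconstructed from the prose, not a copy of the actual configuration, and it leaves out the PRIME bus ID settings shown in the earlier comment.

```nix
{
  # "modesetting" drives the Intel GPU; "nvidia" is the proprietary driver.
  services.xserver.videoDrivers = [ "modesetting" "nvidia" ];

  # Enable kernel modesetting in the NVIDIA driver.
  hardware.nvidia.modesetting.enable = true;
}
```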

I'm using lightdm with services.xserver.displayManager.defaultSession = "xfce+i3";.

@eadwu is this working for you?

(I also want to point out the "no longer works" from the issue description; this worked for me before, but broke recently, perhaps with the upgrade to 20.03.)

@nh2 nh2 mentioned this issue Jun 15, 2020
10 tasks
@nh2
Contributor

nh2 commented Jun 15, 2020

I found that this helps: https://askubuntu.com/questions/512192/turn-monitor-back-on-after-xrandr/553944#553944

From the VT, run:

sudo chvt 7; sleep 3; xrandr --display :0.0 --auto

That turns the screen back on.

Isn't that what nvidia-resume.service from #73530 is supposed to do? It does a chvt, but no xrandr.


I see

services.xserver.displayManager.setupCommands = optionalString syncCfg.enable ''
# Added by nvidia configuration module for Optimus/PRIME.
${pkgs.xorg.xrandr}/bin/xrandr --setprovideroutputsource modesetting NVIDIA-0
${pkgs.xorg.xrandr}/bin/xrandr --auto
'';

Should something like that also be run upon resume?

@nh2
Contributor

nh2 commented Jun 15, 2020

I found an ugly workaround that makes it work on my laptop:

{
  # Workaround to make standby resume work with nvidia without getting a black screen because the display is off.
  # See https://github.com/NixOS/nixpkgs/issues/73494
  systemd.services.nvidia-resume.serviceConfig = {
    # Requires `xhost +local:` in `sessionCommands` so that root can run X commands.
    ExecStartPost = "${pkgs.xorg.xrandr}/bin/xrandr --display :0.0 --auto";
  };

  services.xserver.displayManager.sessionCommands = ''
    # Needed to fix resume on nvidia, see `nvidia-resume` section.
    # TODO: This is suboptimal but I haven't figured out yet how to make root-commands work with XAUTHORITY
    ${pkgs.xlibs.xhost}/bin/xhost +local:
  '';
}
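
One possible way to narrow the `xhost` grant in this workaround, offered as an untested sketch. `xhost` accepts server-interpreted entries, so access can be limited to the local root user instead of all local users. The sketch assumes that `nvidia-resume.service` already exists (from #73530), that the X display is `:0.0` as above, and it uses `pkgs.xorg.xhost` in place of the `pkgs.xlibs` spelling.

```nix
{ pkgs, ... }:

{
  # Same hook as in the workaround: re-enable the outputs after resume.
  systemd.services.nvidia-resume.serviceConfig.ExecStartPost =
    "${pkgs.xorg.xrandr}/bin/xrandr --display :0.0 --auto";

  services.xserver.displayManager.sessionCommands = ''
    # Allow only the local root user to talk to this X session,
    # rather than every local user as with "+local:".
    ${pkgs.xorg.xhost}/bin/xhost +si:localuser:root
  '';
}
```

This still relies on `xhost` rather than XAUTHORITY, so it only reduces the exposure of the original workaround; it does not resolve the TODO.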

@eadwu
Member

eadwu commented Jun 17, 2020

I'm not entirely sure how suspending works, since I rarely suspend my laptop anyway: there seem to be bugs with waking up from suspend (hardware and/or software?).

The few times I did test it, I never had any problems, though it might be dependent on the situation (nearly all of those times the video card wasn't really being used).

nh2 added a commit to nh2/nixos-configs that referenced this issue Oct 25, 2020
@stale

stale bot commented Dec 15, 2020

I marked this as stale due to inactivity. → More info

@stale stale bot added the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Dec 15, 2020
@nh2
Contributor

nh2 commented Jun 17, 2021

This seems to be fixed for me with 21.05, can anyone confirm?

@stale stale bot removed the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Jun 17, 2021