Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GUI] Support gui.fps_limit and reduce idle power consumption #1611

Merged
merged 20 commits into from
Aug 6, 2020

Conversation

archibate
Copy link
Collaborator

Related issue = #

[Click here for the format server]


Setting gui.fps_limit = None make fractal.py to be ~140 fps on CUDA!
Some tweaks in system/timer.cpp is also done for reducing CPU usage on idle (when actual FPS > our FPS limit).

@archibate archibate requested review from k-ye, yuanming-hu, JYLeeLYJ and taichi-gardener and removed request for k-ye and yuanming-hu July 30, 2020 01:31
Copy link
Contributor

@JYLeeLYJ JYLeeLYJ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice idea !
And here are two platform-specific optimization points for windows .

@@ -69,6 +69,24 @@ void Time::sleep(double s) {
Time::usleep(s * 1e6_f64);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is a mush more accurate milliseconds level sleep implementation instead of Sleep(DWORD(us * 1e-3)) on windows .

#ifdef _WIN64

#include <Windows.h>
#pragma comment(lib, "Winmm.lib")

void win_sleep(DWORD ms) {
  if (ms == 0)
    Sleep(0);
  else {
    HANDLE hEvent = CreateEvent(NULL, FALSE, FALSE, NULL);
    timeSetEvent(ms, 1, (LPTIMECALLBACK)hEvent , 0, TIME_ONESHOT | TIME_CALLBACK_EVENT_SET);
    WaitForSingleObject(hEvent, INFINITE);
    CloseHandle(hEvent);
  }
}
#endif

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds cool! But is that possible to sleep for microseconds (us) on Windows?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds cool! But is that possible to sleep for microseconds (us) on Windows?

It's sad truth that sleep accuracy can not be guaranteed when delay time is lower than 1ms :(
The only possible approach is to use high resolution clock to continually counting in a for loop , but it 's unable to release CPU time slice ...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sleeping for us seems tricky on win, how about just:

#ifndef _WIN64
} while (dt > 1e-4_f64);  // until dt <= 100us
#else
} while (dt > 1e-2_f64);  // until dt <= 10ms
#endif

Copy link
Contributor

@JYLeeLYJ JYLeeLYJ Jul 31, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1ms is more accurate ( 1/60 = 16.6 ms , 1/120 = 8.3ms , 1/40 = 25 ms )

#else
} while (dt > 1e-3_f64);  // until dt <= 1ms
#endif

if (dt <= 0) {
return;
}
Time::sleep(dt * 1e-3);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We set the precision as 100us. But relatively , dt * 1e-3 is too small .
As a result , Sleep(0) is always called on windows , which will not give up the CPU time slice usually.
I think Time::sleep(dt) or Time::sleep(dt * 0.5) is enough.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that the unit of (t - Time::get_time()) is seconds, while the unit of Time::sleep is milliseconds.

Copy link
Contributor

@JYLeeLYJ JYLeeLYJ Jul 31, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Time::sleep should be seconds . See Time::usleep(s * 1e6_f64) .
Here are some actual value while running :
dt = 0.011... , dt = 0.008... and dt * 1e-3 is about 10 us

@codecov

This comment has been minimized.

@archibate archibate self-assigned this Jul 31, 2020
@archibate
Copy link
Collaborator Author

Hi, given that it's impossible to sleep for < 1ms on win, I simply abandoned reduce-idle-power-consumption there, WDYT about this change? Feel free to request change if you found a solution for this :)

@archibate archibate requested a review from JYLeeLYJ August 1, 2020 03:52
Copy link
Contributor

@JYLeeLYJ JYLeeLYJ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dt * 0.4 may be an enough scale . If it's ok , here is no need to ban on Windows .
It's still possible to reduce CPU usage at milliseconds level .

  • while dt is at seconds or milliseconds level , Sleep is work .
  • while dt is under 1ms , using Sleep(0) is just like polling with for-loop.

In addition , for the frame rate accuracy , we just need to using that more accurate implementation win_sleep I suggested above (using WIN multimedia timer api and waitable event ), instead of Sleep which sometimes could lead to 1 ~ 10 ms error .
I can help u take an additional commit and test for this : )

taichi/system/timer.cpp Outdated Show resolved Hide resolved
taichi/system/timer.cpp Show resolved Hide resolved
if (dt <= 0) {
return;
}
Time::sleep(dt * (dt < 4e-2_f64 ? 0.02 : 0.4));
Copy link
Contributor

@JYLeeLYJ JYLeeLYJ Aug 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is any difference here ( using 0.02 or 0.4 )?
Why not just use :

Suggested change
Time::sleep(dt * (dt < 4e-2_f64 ? 0.02 : 0.4));
Time::sleep(dt * 0.4);

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This ensures that the dt of sleep is more accurate when small; without this, I found the FPS to be around 59; with this, I found the FPS to be accurately 60.00.

@archibate archibate requested a review from JYLeeLYJ August 3, 2020 04:37
@archibate
Copy link
Collaborator Author

In addition , for the frame rate accuracy , we just need to using that more accurate implementation win_sleep I suggested above (using WIN multimedia timer api and waitable event ), instead of Sleep which sometimes could lead to 1 ~ 10 ms error .
I can help u take an additional commit and test for this : )

That would be great! I invited you to archibate/taichi so that you could push to this branch, would you mind:

  1. Implement a win_sleep of us precision in this PR?
  2. In another PR and left #ifndef _WIN64 in this PR?

@JYLeeLYJ
Copy link
Contributor

JYLeeLYJ commented Aug 3, 2020

I have added win_usleep , win_msleep for different use cases on Windows .
None behavior changes on Linux. You can try to run on your machine to check if anything is different : )

@@ -57,18 +57,82 @@ double Time::get_time() {
}
#endif

#ifdef _WIN64
#include <Windows.h>
#pragma comment(lib, "Winmm.lib")
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added win_usleep , win_msleep for different use cases on Windows .
None behavior changes on Linux. You can try to run on your machine to check if anything is different : )

Nice work! I can confirm that it's running happily on my machine!
One concern, what is Winmm.lib? Will it cause compat issue?
We'd better include it in CMakeList.txt instead of an inline #pragma comment, WDYT?
I'd like to provide more usage about linking libraries in CMake if you are not familiar to.

Copy link
Contributor

@JYLeeLYJ JYLeeLYJ Aug 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Winmm.lib is just a system multimedia API , which contains high resolution timer event functions .
Winmm.dll could be found in system path . And the minimum support client is winXP . This might not cause incompatibility problem.
See : MSDN

At the last commit I have moved #pragma commit away and used target_link_library instead .

@archibate archibate requested a review from xumingkuan August 4, 2020 15:28
@archibate archibate added the LGTM label Aug 5, 2020
cmake/TaichiCore.cmake Outdated Show resolved Hide resolved
python/taichi/cc_compose.py Outdated Show resolved Hide resolved
@archibate archibate merged commit 5f2f216 into taichi-dev:master Aug 6, 2020
@yuanming-hu yuanming-hu mentioned this pull request Aug 8, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants