Improvement: Multithreaded BPM analysis #8686
Comments
Commented by: JosepMaJAZ I've taken a look at the code. This is what we have now:
- AnalyzerQueue becomes a single-instance class, so PlayerManager and the library again use the same instance.
- AnalyzerQueue gets a method to add a track to the queue, with a priority boolean parameter. If the number of workers in the pool is less than the maximum number of threads allowed, add a new one and let it run. (We can assume that if any other worker exists, it is already working.)
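A minimal sketch of how that queue method could look, assuming Qt-style classes; the names (`AnalyzerQueue`, `AnalyzerWorkerThread`, `m_maxThreads`) are placeholders and not the code from the branch linked later in the thread:

```cpp
#include <QMutexLocker>
#include <QThread>

// Hypothetical single-instance queue: adding a track spawns a new worker only
// while the pool is below the configured maximum; otherwise the track simply
// waits for an existing (and, by construction, busy) worker to pick it up.
void AnalyzerQueue::addTrack(TrackPointer track, bool prioritize) {
    QMutexLocker locker(&m_queueMutex);
    if (prioritize) {
        m_queue.prepend(track);   // deck loads jump ahead of batch analysis
    } else {
        m_queue.append(track);
    }
    if (m_workers.size() < m_maxThreads) {
        auto* worker = new AnalyzerWorkerThread(this);
        m_workers.append(worker);
        worker->start(QThread::LowPriority);
    }
}
```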
Commented by: JosepMaJAZ I forgot to mention one thing. The idea of the worker holding only pointers to the analyzers is not so much an efficiency decision as a way to dynamically allocate the analyzers for a concrete track. This way, when we prioritize a track (which we assume comes from the PlayerManager), the worker that takes it can be configured to include the waveform analyzer. The exact way to implement this can change, but the track queue could perhaps hold instances of another class that carries the priority flag. We cannot assume that we can always grab a worker and make it skip its current work; we should only do that to workers that are working on a non-prioritized track (think of loading three decks on a 2-core machine).
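To make the "queue entry carries the priority flag" idea concrete, here is a rough sketch; the struct and the analyzer class names are only illustrative, not Mixxx's actual types:

```cpp
#include <memory>
#include <vector>

// Hypothetical queue entry: the priority flag travels with the track, so the
// worker that dequeues it can decide which analyzers to allocate for this track.
struct AnalysisRequest {
    TrackPointer track;
    bool prioritized;  // true when the request comes from the player manager
};

void AnalyzerWorker::setupAnalyzersFor(const AnalysisRequest& request) {
    m_analyzers.clear();
    m_analyzers.push_back(std::make_unique<AnalyzerBeats>(m_pConfig));
    m_analyzers.push_back(std::make_unique<AnalyzerGain>(m_pConfig));
    if (request.prioritized) {
        // Deck loads also need the waveform, so only prioritized requests pay for it.
        m_analyzers.push_back(std::make_unique<AnalyzerWaveform>(m_pConfig));
    }
}
```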
Commented by: rryan Sounds good to support a thread pool of analyzers dequeueing work from the pool. The UI interaction is a little tricky, so it'd be nice to discuss that a little more before starting work on the implementation.

I am worried about the plan of sizing the analyzer thread pool based on the number of cores. My concern is that no matter what our heuristic, it will go wrong for some folks. For these people, a problem like this can make Mixxx unsuitable for live use (e.g. Bug #529614). To this end, I think we should expose the number of analyzer threads as a setting in Preferences so users have a recourse if something goes wrong. Maybe we can set the default based on the number of available hardware threads.

Some thoughts on thread priorities: In 2014 we re-organized our thread priorities into priority classes (Bug #1270583). QThread::setPriority(QThread::IdlePriority) translates to the pthread SCHED_IDLE policy with priority 0 (BSD, OSX and Linux) and THREAD_PRIORITY_IDLE on Windows. Let's refer to concrete details like these when discussing priority changes. Anything above QThread::IdlePriority still uses the thread's original scheduling policy (e.g. SCHED_OTHER) with pthreads. We have had problems with IdlePriority in the past causing tracks to take a long time to analyze (e.g. we had one user report that analyzing a 3-minute track took 1 hour, IIRC). I would use caution here.

Ultimately, the "priority group" scheme we introduced in Bug #1270583 addresses expressing the relative importance of Mixxx threads to the OS. I'm not sure we need to do more. Setting thread priorities (especially the SCHED_IDLE policy) on a CPU-bound thread can also have the unintended consequence of causing excessive thread pre-emption, which can slow down analysis massively because every pre-emption flushes the CPU caches and TLB.
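A small sketch of the "expose the thread count in Preferences" suggestion, assuming the stored value comes from a new (hypothetical) preference where 0 means automatic; it is not an actual Mixxx API:

```cpp
#include <QThread>
#include <QtGlobal>

// configuredThreads would come from the new Preferences entry; 0 = automatic.
int analyzerThreadCount(int configuredThreads) {
    if (configuredThreads > 0) {
        return configuredThreads;
    }
    // Default to the number of available hardware threads, never less than one.
    return qMax(1, QThread::idealThreadCount());
}

// Given the IdlePriority problems described above, workers could then be started
// one notch higher, e.g.: worker->start(QThread::LowPriority);
```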
Commented by: daschuer One more thing to consider: track analysis needs a CPU core, memory access and HDD access, which are shared with SQLite and the CachingReaders. We already have users complaining about audio dropouts when analyzing a track on the fly. The assumption "if it is playing, max-1" is probably wrong, since Mixxx already uses several threads. Multithreaded analysis will be a great benefit when Mixxx is not playing. We may also consider splitting the analysis itself into parallel tasks; only one of them should require the HDD (see the sketch below). Does anyone have OpenMP experience? It could also be interesting for us.
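For illustration only, a minimal OpenMP sketch of the "split the analysis itself into parallel tasks" idea, assuming a placeholder `Analyzer` interface and a `readNextBlock()` reader that is the only stage touching the HDD:

```cpp
#include <cstddef>
#include <vector>

// Placeholder interface; the real analyzers (BPM, key, ReplayGain, waveform)
// would each implement process().
struct Analyzer {
    virtual ~Analyzer() = default;
    virtual void process(const float* samples, std::size_t count) = 0;
};

bool readNextBlock(std::vector<float>* block);  // the only stage that needs the HDD

void analyzeTrack(const std::vector<Analyzer*>& analyzers) {
    std::vector<float> block(4096 * 2);  // one stereo analysis block
    while (readNextBlock(&block)) {
        // Every analyzer reads the same immutable block, so no synchronization
        // is needed between them within one iteration (compile with -fopenmp).
        #pragma omp parallel for
        for (int i = 0; i < static_cast<int>(analyzers.size()); ++i) {
            analyzers[i]->process(block.data(), block.size());
        }
    }
}
```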
Commented by: MelGrubb As an end user, I think that keeping total resource usage low during a performance is the most important thing. From a very high level, I would think the prioritization logic should be more along the lines of:
Then again, I don't understand why anyone would load up the analysis queue while performing. That is something I only do offline, and usually only when I know I've recently added a lot of new music to the computer. I rescan the library, select all the "new" tracks and hit Analyze. Then I go get dinner or watch TV for a while.
Commented by: JosepMaJAZ Let's put everything in context first: A) The batch analysis of tracks is only performed when the user explicitly presses Analyze in the analyze view. D) The "on demand" analysis is the one that happens when a track is loaded into a deck and that track has not been analyzed yet.

If a user does "A" during a live performance, we can only provide a degree of glitch-free response that depends on the OS's ability to prioritize threads, and if that is not enough, the user has option "B". About "C": this is about maximizing performance, so we are interested in using the resources if they are available. It is usual to offer the user the option to change the number of threads to use, in case the detected setting isn't working as well as it should. In this context, only "D" can be considered a "live performance" case, and the truth is that this case does not change, because it is a single analysis.

I am not proposing to run each analysis type in a different thread, since that requires thread synchronization, and most probably the waveform analysis takes most of the time, so the overall analysis time would not improve much. (We might try that at some point, but it's not the main idea in this bug.)

If there isn't any value for our "multithreaded_analysis" setting, or its value is zero, calculate the number of threads and set it.

In this context, thread priority was supposed to play a role, since this is by definition a "long-running" task and we still want to keep working on other parts of the application. Still, we can give hints with LOW_PRIORITY, NORMAL_PRIORITY and HIGH_PRIORITY.

About the number of threads: the number of threads only matters when they need to do something, since a lower-priority thread will take longer to get its share of CPU if other, higher-priority threads are active. The real problem appears when we need an immediate response: an OS, except for interrupts, will not pause a running thread before it has used its time-slice. Since this is OS-dependent, our prioritized task might need to wait for a whole time-slice before it starts (and that is assuming there is no other similarly prioritized task or other kind of lock).

Lastly, when I said we should reduce the number of threads by one when a track is playing, it is only because in that case we know playback will require some extra CPU (and the UI graphics count too).
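If those LOW/NORMAL/HIGH hints were kept, one illustrative way to map them onto Qt priorities (deliberately avoiding `QThread::IdlePriority` for the reasons rryan gave above) might be:

```cpp
#include <QThread>

enum class AnalysisPriority { Low, Normal, High };

QThread::Priority toQtPriority(AnalysisPriority hint) {
    switch (hint) {
    case AnalysisPriority::Low:    return QThread::LowPriority;
    case AnalysisPriority::Normal: return QThread::NormalPriority;
    case AnalysisPriority::High:   return QThread::HighPriority;
    }
    return QThread::NormalPriority;
}
```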
Commented by: JosepMaJAZ I started working on this some days ago, and I've got what I consider to be the workers ready. It was a bit complicated to correctly understand how slots/signals operate in a multithreaded scenario and to combine them with mutexes, wait conditions and the like. I've done it this way:
I.e. threads will be alive only while they are required. The scenario of prioritizing tracks (i.e. analysis of the player's tracks) goes as follows:
Most of the manager part is still to do, but the parts of the worker that the manager will use are already there. I hope this will be good enough and not too complicated to understand either (hey, it wasn't that easy anyway).
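As a condensed sketch of the "threads live only while required" behaviour (placeholder names, not the code in the branch linked below): the worker pops tracks under a mutex, ends its thread when the shared queue is empty, and parks on a wait condition when the manager pauses it in favour of a prioritized track.

```cpp
#include <QMutexLocker>

void AnalyzerWorkerThread::run() {
    while (true) {
        TrackPointer track;
        {
            QMutexLocker locker(&s_queueMutex);
            if (s_queue.isEmpty()) {
                return;  // thread ends; the manager spawns a new one when needed
            }
            track = s_queue.dequeue();
        }
        analyze(track);  // reports progress to the manager via queued signals

        QMutexLocker locker(&m_pauseMutex);
        while (m_paused) {
            // The manager paused this worker so a prioritized (deck) track can
            // be analyzed first; wake-up happens through m_pauseCondition.
            m_pauseCondition.wait(&m_pauseMutex);
        }
    }
}
```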
Commented by: sblaisot Can you also please explain the failure/error handling scenarios, e.g. what happens if a worker thread crashes or hangs and never sends any signal back to the manager?
Commented by: JosepMaJAZ Mmmm. I didn't think about that. If I didn't overlook it, there's nothing in this regard right now either. The crashing scenario might require a simple try-catch around the main process method, so that the exception can be captured and the manager informed that the worker has ended. The hang scenario is a bit more controversial. Maybe it could be detected (by checking the periodicity of the calls to progress), but it nevertheless needs a forced stop, which means telling the thread directly to die. Maybe a QTimer, started once the manager has asked the thread to stop, could test whether it is still alive and then force it to stop.
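A rough illustration of both ideas, not actual Mixxx code (the signal names, the 5-second timeout and the `stop()` flag are assumptions): the try/catch turns a crash into an ordinary "finished" notification, and a single-shot timer acts as a watchdog after a stop request.

```cpp
#include <QString>
#include <QTimer>
#include <exception>

void AnalyzerWorkerThread::run() {
    try {
        processQueuedTracks();
    } catch (const std::exception& e) {
        emit workerFailed(QString::fromUtf8(e.what()));  // crash -> error signal
    }
    emit workerFinished();  // the manager always hears back, even after a crash
}

// Manager side: ask nicely first, then force-stop a hung worker.
void AnalyzerManager::requestStop(AnalyzerWorkerThread* worker) {
    worker->stop();  // sets a flag the worker checks between processed blocks
    QTimer::singleShot(5000, this, [worker]() {
        if (worker->isRunning()) {
            worker->terminate();  // last resort for a thread that never answered
            worker->wait();
        }
    });
}
```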
Commented by: JosepMaJAZ First version of the changes here: https://github.com/JosepMaJAZ/mixxx/tree/multithreaded-analysis
Commented by: daschuer Cool!
Commented by: uklotzde Superseded by the responsive analysis framework that supports multi-threading implicitly by design. Sorry, JosepMa, I didn't plan to take this task from you! The multitude of issues around our existing analysis framework just became apparent while digging deeper and deeper into the topic, literally leading to a complete rewrite.
Issue closed with status Fix Released.
Reported by: JosepMaJAZ
Date: 2016-11-11T16:14:20Z
Status: Fix Released
Importance: Wishlist
Launchpad Issue: lp1641153
The current implementation of the BPM and graphic analysis (analysis view) is single threaded (one file at a time).
While it is good that it doesn't hog the CPU for the rest of the application, it is not so good that CPU usage stays at 20% on a high-end machine.
My proposed solution (initial, prior to verifying how it is all actually coded) is as follows:
- Get a thread-synchronized queue of tasks to do (the easiest model is simply the list of tracks selected for analysis).
- Calculate the number of hardware threads the processor has. There are several ways to get this information depending on the platform, but the standard way is this: http://en.cppreference.com/w/cpp/thread/thread/hardware_concurrency
  Note: as a safety measure, programs that can use multiple threads tend to let the user specify the count explicitly. It could be added as an option in the BPM preference page of the Mixxx settings: 0 = automatic, >0 = use that value. There would always be at least one thread.
- Prepare a pool of "hardware_concurrency - 1" threads, all of them with idle thread priority. That is, we will have at most max-1 threads for analysis, so there is always one processor thread free for the rest of the application. This could be improved as in: if Mixxx is not playing, use all available threads; if it is playing, max-1.
- Modify the track analysis done on track load to use this same pool of threads. Concretely, ensure that loading tracks uses the same thread limit. If all threads are in use, take one away from the background analysis and assign it to the track(s) being loaded.
- When a thread finishes its task (i.e. finishes a track's analysis), it pops a new task from the task list and starts processing it. If no more tasks are ready, the thread ends. (A bare-bones sketch of this loop follows the list.)
- This would also require a small change in the UI, in how we report the current track being analyzed.
- We could also study whether it would be good to first do the BPM/beatgrid/ReplayGain analysis and, once that finishes for all tracks, do the graphic analysis of those same tracks. The UI message could indicate "Part 1/2" / "Part 2/2" or any other message to let the user know that this is a two-step action.
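A bare-bones sketch of the proposed worker loop using only the standard library (the `analyzeTrack()` call and the 0-means-automatic preference are placeholders, not Mixxx code):

```cpp
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <vector>

void analyzeTrack(const std::string& trackPath);  // stand-in for the real analysis

void runAnalysisPool(std::queue<std::string> tracks, unsigned configuredThreads) {
    unsigned hw = std::thread::hardware_concurrency();   // may report 0
    unsigned automatic = (hw > 1) ? hw - 1 : 1;           // keep one core free
    unsigned threadCount = (configuredThreads > 0) ? configuredThreads : automatic;

    std::mutex queueMutex;
    std::vector<std::thread> pool;
    for (unsigned i = 0; i < threadCount; ++i) {
        pool.emplace_back([&tracks, &queueMutex]() {
            while (true) {
                std::string track;
                {
                    std::lock_guard<std::mutex> lock(queueMutex);
                    if (tracks.empty()) {
                        return;  // no more tasks: the thread ends
                    }
                    track = std::move(tracks.front());
                    tracks.pop();
                }
                analyzeTrack(track);
            }
        });
    }
    for (auto& t : pool) {
        t.join();
    }
}
```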