-
Notifications
You must be signed in to change notification settings - Fork 717
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Better defaults for new users and other feedback #299
Comments
Please remember my column sizes, preferred sorting, group pats and foldpats.. Allow for horizontal scrolling. |
A 64-bit executable is now available (starting with #194). Is this working for you? For each of the other bullets, a separate issue would be good. |
It is reasonable to provide a popup that tells people how to get/move to the 64 bit version. In fact it is not that hard to make PerfView simply relaunch itself as a 64 bit process (it could be a user-set default as well. You can make an issue for that if you wish (as Sam mentions, you can get a 64 bit version if you want). I have pushed back against simply making PerfView 64 bit because frankly large data sets will lead to a pretty poor (slow) experience, and the better solution is to determine how to get the data volume down for the scenarios in questions. There is a good chance there is a memory leak or a memory inefficiency that moving to 64 bit is simply masking.
More details would be useful here. Fundamentally, ETW is a machine wide data collection mechanism so there is no 'attach' (and generally speaking this is a good thing). The 'Run' command lets you do it for a single executable (but it is still machine wide, it is just that the starting, running and stopping are done for you). If you need 'attach' frankly that is what 'Collect' does. The viewing already allows you to pick a process quickly (it is typically the first (the one that used the most CPU)), so it is pretty quick. If you have a suggestion on improving this please provide details.
Generally speaking the defaults need to be SAFE (that is if it takes longer, that is OK, it is more important that it work without hiccups). If you DID care about a 3.X runtime and this was not checked, then you get a VERY bad experience (no symbols for NGEN images), and there is no great way to let them know the solution. Thus until 3.X scenarios really become very close to 0, I would prefer to keep this the way it is. However we do remember certain values from run to run (like whether you want to merge/zip your file) (see App.ConfigData["Zip"] in the code), and adding the 3.X NGEN to this list is reasonable (thus one a more advanced user changes the default, it will be remembered until he explicitly changes it.
If you unclick the 'merge' box in the gui, PerfView will remember it run-to-run, and thus it will never merge from then on. I do this, and it works great (in fact there are popups that tell you about it). Again I want this way because it is SAFE. Long ago PerfView did not merge by default, and event with LOTS of warnings, users screwed it up ALOT (copied data off the machine without merging). I think the current system works well (users who use it a lot can avoid the overhead, but they have to learn more (read the hints).
PerfView does this for some things (merging), and should probably do it for other (3.X rundown), but it is not a slam dunk to do this uniformly because although there is a 'clear user config' option, it is easy to get into states you don't know how to get out of easily. Thus a case could be made that the amount of persistent state should be kept low. For any particular thing, however it is straightforward to fix, and you should make an issue for it.
I am suspicious that you are running into the ETW 196 frame limit (can happen with deep async calls). Historically this was rare, but it seems things are changing. More investigation is reasonable, and we can probably fix it so that we have a 'BROKEN_TOO_MANY_FRAMES' tag to make diagnosis easier, but actually fixing the broken stacks is much harder. Note that the bottom up approaches suggested as work-around do continue to work.
Sounds like you wish the docs to be better. Specific suggestions are the way forward.
The thing is, unless you have empty grouping, filtering and folding, this could be true. Moreover, because CPU is sampled, you STILL may not see a method you 'KNOW' has been called if it is not on the stack long enough. Perhaps docs would help, but generally speaking people don't read them. I don't have a good solution but if you have a specific suggestion we can discuss it.
Back and forward change the values in the textboxes to their previous values (as a unit). I agree is it not intuitive, but the intuitive thing seems to be hard to implement (I could not do it easily back when this was being implemented). If some experienced GUI programmer can fix it great. Docs are also reasonable.
This is a deficiency in New window. It does set the TextBoxes and the tab to what you had before, but I notice it does not set the focus node for the 'callers and 'callees' view. It could also set the context for the treeview (what nodes where expanded), but that may be harder. This should be straightforward to fix and should have its own issue. Note that my may find the 'Drill Into' feature on the right click context menu helpful in working around this. Basically you can take any set of values (e.g. some chunk of the call tree) and create a new window with just those samples. It is true that you may not have the tree opened in just the same way (but typically getting there is easy because there are no other samples to confuse things.
For what it is worth, this is a FAQ. I am not convinced greying things out is that much better than !?, but if it is easy to do in the GUI, I have not real objection. We can make an issue for it. Thanks for the feedback. |
Vance, I first want to say that I find PerfView to have been invaluable these past few weeks to diagnose performance issues both CPU, UI-blocking delays and GC-wise. It's much more elegant and less invasive way than instrumentation and plain CPU sampling that I've used in the past. Before using it, I watched all the Channel 9 videos that you did around using it - mind you some of my scenarios were a lot more complex than what you covered (blocked threads), but I didn't enter into using PerfView blind. I found myself using PerfView for 3 different scenarios:
There were 3 major things that made these scenarios harder:
This is why I was looking for ways to reduce the amount of data and processing time here - such as being able to limit it to a single process that I absolutely know is the cause of the issue.
I'm not convinced that better docs are way to resolve these issues - there's already very comprehensive but very overwhelming amount of docs, I think a few tweaks of some UI elements and tweaking of the defaults could really make a difference here. I'll file individual issues here for what I believe are bugs, feel free to resolve them if you disagree. |
Just played around with GroupPats, I think I expected Just My App to work like group CLS/OS - hence my troubles. I think if this was remembered between sessions, this would probably resolve that issue. |
I want to echo @davkean, PerfView is a great tool, it lets you do some many things that are hard/impossible to do otherwise. Thankyou soo much for making it!
This has always confused me, why, when you specify the launch of a single executable, does it still collect traces from all processes that are currently running? I know ETW is machine-wide, but in this scenario couldn't the default option be just collection of events from the process that you ask PerfView to launch? (I know you are prompted with a list of processes whenever you look at some events, but it seems strange to need to select a process each time) For instance I use 'Run' quite often (but maybe me use-case isn't that common?) : In this case, couldn't the default behaviour be to only capture data from |
📝 I have comments ready for most of the above issues (in most cases either supporting or supporting with design conditions), but I'm waiting for separate issues to be filed so the discussion doesn't go all over the place attempting to discuss many things in a single GitHub issue. |
Perhaps it is better to wait for individual issues to be made, but just FYI, when PerfView was created ETW did not have the option of collecting for a single process. You had no choice but to collect machine wide. Some per-process filtering was added in Win8, however event today all kernel events are system wide (which are the most interestingly and in particular include CPU and context switch). Also for most investigations besides CPU (e.g. blocked time, disk I/O etc), knowing that other processes are interfering (stealing your processor or disk) is important. Finally, the main value of filtering is to lower file-size/overhead, but this overhead tends to be concentrated in the process of interest anyway (and if other processes are generating lots of events they are probably interesting!). Thus you typically don't save much (the only exception to this that I have seen is a multi-processor server box running many things but you only care about one of the services (and it is a CPU investigation). It does not seem worth tuning for... For what it is worth... |
@vancem That doesn't match my experience, here's why:
I can see where it maybe be interesting investigating and looking at events from all the processes on the box, but that shouldn't be to the detriment of when I only care about a single process case. |
Your scenario is a reasonable argument for support of process level filtering. However as mentioned, at the present time ETW simply does not support filtering of kernel events, which are the most important ones to filter. The work-around is to simply live with the larger files. |
The past 2 weeks I've been using PerfView fairly significantly and I found it took me over a week of real usage to figure out how to better utilize it. I know that @CyrusNajmabadi and @sharwell has individual feedback, but the following is mine. I'll open bugs for those that you think should be tracked via individual items.
cc @vancem
The text was updated successfully, but these errors were encountered: