Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Filtering references to console outside of profvis call() #57

Closed
wch opened this issue May 5, 2016 · 1 comment
Closed

Filtering references to console outside of profvis call() #57

wch opened this issue May 5, 2016 · 1 comment

Comments

@wch
Copy link
Member

wch commented May 5, 2016

The problem

If any functions are defined at the console (as opposed to in a file) but outside of the profvis({ }) block, the profiler output does not include information to tell which expression the call came from. For example, if you run this code:

f <- function() { pause(0.1) }
g <- function() {
  f()
}
profvis({

  g()
}, prof_output = "temp.prof")

The profiler output in temp.prof might look like this (irrelevant stuff has been trimmed):

#File 1: 
"pause" 1#1 "f" 1#2 "g" 1#3

So even though f() and pause() were called from code that was outside of profvis({ }), all of the lines of code are reported to be from a file with no name. When calculating the time taken for each line of code in <expr>, for this line of code profvis will report that there was one sample on line 1, one sample on line 2, and one sample on line 3, even though the time spent on lines 1 and 2 were in expressions that were completely outside the profvis({ }) call.

Additionally, if one of the external functions was long, it might result in, say a 1#10 in the profiler output. But there's no line 10 in the <expr> that profvis has, so this would result in an error when it's displayed.

The current workaround

The current workaround is to use the following heuristic to filter out line number references to <expr>:

  • Search for the label (like "pause" or "f") in the <expr> code at that line. If the label is found, then count that sample toward that line's total time. If not found, don't count the time.
  • Do the same if the label is "", because anonymous functions by definition won't have their text in that line of code.

This strategy is prone to false positives. Even if the code is defined outside of <expr>, the text of the label might happen to be in that line of code. Similarly, an anonymous function might be called outside of <expr>.

It's also prone to false negatives. Sometimes a function label is not the same as the text in that line of code, as is the case for functions like [<-. Sometimes an expression spans lines of code, but only the first line is reported in the profiler. In the code below, line #2 of the <expr>, the line with identity(, will be reported to take the time, but the time is really spent in line #3, with g().

f <- function() { pause(0.1) }
g <- function() {
  f()
}
profvis({
  identity(
    g()
  )
}, prof_output = "temp.prof")

The output:

#File 1: 
"pause" 1#1 "f" 1#2 "g" "identity" 1#2

Solution

The solution is to change the srcfile attribute in the expression passed to profvis() from "" to `"". That way, the profiler will output something like this:

#File 1: 
#File 2: <expr>
"pause" 1#1 "f" 1#2 "g" 2#3
@wch wch closed this as completed in 2cc923d May 5, 2016
@wch
Copy link
Member Author

wch commented May 5, 2016

This is also related to #15.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant