Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

reduce some codegen costs #36600

Merged
merged 3 commits into from
Sep 23, 2020
Merged

reduce some codegen costs #36600

merged 3 commits into from
Sep 23, 2020

Conversation

vtjnash
Copy link
Member

@vtjnash vtjnash commented Jul 9, 2020

Alter some patterns and thresholds that can be causes of minor code bloat.

(partly related to #34270)

@vtjnash vtjnash requested review from c42f and JeffBezanson July 9, 2020 20:57
@JeffBezanson
Copy link
Member

Ah, also implements #36236?

@vtjnash
Copy link
Member Author

vtjnash commented Jul 9, 2020

No, this doesn't implement any part of #36236 (which would require making closures, and would generally thus be rather expensive at compile and run time)

@JeffBezanson
Copy link
Member

I was thinking of #36236 (comment); some similar ideas.

@oxinabox
Copy link
Contributor

@c42f and I were talking about this in the context of FluxML/Zygote.jl#88 (comment)
where @warn and @info etc were breaking Zygote.

@vtjnash
Copy link
Member Author

vtjnash commented Jul 10, 2020

That comment talks about how to use closures to redesign the behavior of logging system, this does not do that and just optimizes the implementation. There's overlap only in that it talks about how sometimes the closures that are allocated would be simple in some of the same cases that I optimize here.

@vtjnash
Copy link
Member Author

vtjnash commented Jul 10, 2020

@nanosoldier runbenchmarks(ALL, vs=":master")

base/logging.jl Outdated Show resolved Hide resolved
@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

@oxinabox
Copy link
Contributor

That comment talks about how to use closures to redesign the behavior of logging system, this does not do that and just optimizes the implementation. There's overlap only in that it talks about how sometimes the closures that are allocated would be simple in some of the same cases that I optimize here.

Yes, we don't have written of the exact discussion anywhere anymore. Slack ate it.
It was related to that issue, but not recorded in that issue.
But it was very much this PR, identifying if a @warn had only simple types, and if so not entering the code-path that has exception handling.
So I think this is a good design, because it aligns with our thoughts

val isa Number && return true
if val isa Expr
val.head === :quote && val[1] isa Symbol && return true
val.head === :inert && return true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to catch tuple and vector and matrix literals where the head is vect tuple etc,
and the args are all themselves issimple

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not really, no

@vtjnash vtjnash force-pushed the jn/codegen-costs branch from 976ae1b to d2f6e9a Compare July 13, 2020 19:55
@c42f
Copy link
Member

c42f commented Jul 13, 2020

That comment talks about how to use closures to redesign the behavior of logging system, this does not do that and just optimizes the implementation.

Actually the changes proposed there were motivated firstly by a desire to improve the implementation (as requested in the OP of that issue) and then later, possibly, to use that to improve the interface. So I'd say this PR is very much in line with what I was wanting to do, and in the order I was imagining doing it 👍

Copy link
Member

@c42f c42f left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Largely looks good to me, I'm very much on board with removing these try-catch in the particular way which is done here. Needs tests covering the various cases of message code generation

I'd point out that there's one semantic change introduced here, which is that errors arising from the logger are no longer caught. I think this is a good improvement though, as catching errors from the logger (rather than the message generation) was a misfeature of the previous design.

msg = try
"Exception while generating log record in module $_module at $filepath:$line"
catch ex
"Exception handling log message: $ex"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally I think we should avoid any nontrivial code in the final fallback code path of message formatting — even formatting the exception — as that itself could cause another exception. The point is that the application should never crash due to broken message formatting.

Suggested change
"Exception handling log message: $ex"
"Exception handling log message: " * try "$ex" catch ; "<unprintable>" end

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, we can't just keep going deep down the error chain. Eventually you have to admit you've broken everything.

Copy link
Member

@c42f c42f Jul 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point is to fall back to emitting a message which can't depend on user-defined code (formatting "$ex" can still run user code).

We can clearly do this. I think you're asserting that you don't want to?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your "correction" still depends on user-defined code being valid, so, yes, clearly we can't do that. The subsequent lines of code need to run yet more user-defined code (and more complex code) so there's little point in adding try/catch here while forgetting to handle most of the actual sources of errors.

Copy link
Member

@c42f c42f Jul 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems we're talking past each other here in some way, because I don't understand this statement.

Let me attempt to spell out my reasoning very explicitly.

The following code formats potentially user defined types in both branches, and these arise from the logging macro

msg = try
    # The user may have passed some odd type for _module or filepath or line here
    # so this may fail.
    "Exception while generating log record in module $_module at $filepath:$line"
catch ex
    # `ex` here is the potentially-user-defined exception resulting from running user-defined `string` functions,
    # so may fail.
    # Granted, this is unlikely
    "Exception handling log message: $ex"
end

The following invocation of user-defined code is related to the installed logger (which may internally protect itself with a try-catch of its own, should it so choose)

    handle_message(
        logger, Error, msg, _module, :logevent_error, id, filepath, line;
        exception=err)

So to me it seems there's a clear separation of the user code which

  • emits the log record, vs
  • consumes the log record

and the former is what I'm arguing should be wrapped in various stupidly-nested try-catch statements.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"protect" is just a relatively useless concept here, so I'm intentionally not attempting to pretend to do it excessively

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well I don't understand that comment; saying it's a "useless concept" just leaves me guessing here which is a bit disappointing. I want to know the why here, not what.

But anyway, the actual technicality here is not a hill I'm willing to die on.

base/logging.jl Show resolved Hide resolved
base/logging.jl Outdated Show resolved Hide resolved
@@ -96,38 +96,41 @@ function termlength(str)
end

function handle_message(logger::ConsoleLogger, level, message, _module, group, id,
filepath, line; maxlog=nothing, kwargs...)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why remove maxlog=nothing from here? It seems to make the code below rather more complicated.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Saves the compiler/frontend from needing to create a keyword sorter and corresponding structdiff methods

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why this is worthwhile... unless something about this API forces the amount of compiler work to scale with the number of logging statements in the program?

Is the point that kwargs can have many different types (one per logging statement) and that leads to trouble?

If so that connects to another idea I've been thinking about: That perhaps we should be using a data structure more like Dict{Symbol,Any}, or ImmutableDict{Symbol,Any} for passing the key-value pairs of log records rather than (ab)using keyword arguments?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this change, kwargs can already be (already is) an AbstractDict{Symbol,<:Any}, without making any further changes to the code

@vtjnash
Copy link
Member Author

vtjnash commented Jul 14, 2020

 I'd point out that there's one semantic change introduced here, which is that errors arising from the logger are no longer caught. I think this is a good improvement though, as catching errors from the logger (rather than the message generation) was a misfeature of the previous design.

Thanks for highlighting that change. I'd meant to do that as I was typing the change, and apparently forgot to add that to the issue text here.

@vtjnash vtjnash force-pushed the jn/codegen-costs branch from d2f6e9a to 68aad67 Compare July 20, 2020 20:07
Copy link
Member

@c42f c42f left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine. Just needs tests covering the various codegen cases in logmsg_code.

@vtjnash
Copy link
Member Author

vtjnash commented Sep 2, 2020

What are you expect there is to test? The idea here is that these are common enough patterns that they're already being tested, and this just optimizes them.

@c42f
Copy link
Member

c42f commented Sep 2, 2020

What are you expect there is to test?

That @info msg a=x throws an UndefVarError when either msg or x are undefined. And that this can be caught by the logger.

These are now handled by special case code paths which I doubt we have any tests to cover.

@vtjnash
Copy link
Member Author

vtjnash commented Sep 3, 2020

Ah, that's what you meant. Okay, added.

Copy link
Member

@c42f c42f left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I appreciate the extra tests, thanks.

Added in #22210 (and earlier begun in #20853), this is no longer
necessary to avoid heap allocations, and thus serves little purpose now.
If we have not eliminated them already, codegen is unlikely to do so as well.
This should avoid the try/catch in some cases, where the message is
simply a constant value or simple variable. It also tries to slightly
reduce required specialization in some other cases too.
@c42f
Copy link
Member

c42f commented Sep 23, 2020

I went ahead and rebased this to fix the conflicts. Should we merge it?

@vtjnash vtjnash merged commit ee93bd8 into master Sep 23, 2020
@vtjnash vtjnash deleted the jn/codegen-costs branch September 23, 2020 15:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants