Create a LazyString type, for places (like errors) which we would prefer to defer the actual work #33711

Keno · 2019-10-29T00:03:31Z

One common theme when looking at inference logs is that formatting error messages
tends to pull in the whole I/O and printing system which a) can take some time to
infer and b) adds backedges if that stuff gets invalidated. Both seem wasteful.
This is an attempt to try to improve the situation. For various error structs
that used to contain a string, they now take either a String or a closure that
will format the required string on demand when showing the error.

I have some tooling to count the total number of methods that inference looks
at and went through and switched over all the error messages that are used
by *(::Matrix{Float64}, ::Matrix{Float64} to the new setup. Doing this
reduces the total number of methods inference looks at by about 20%,
which seems like a worthwhile improvement.

Obviously there are a lot of error sites that could potentially throw, but this approach is backwards compatible, so if we like the direction, we can merge it and then we can gradually go through and clean up the callsite (or somebody could write a femtocleaner pass to do it).

JeffBezanson · 2019-10-29T02:39:10Z

This is not my favorite way to deal with this, since the large number of types generated for the closures is also pretty sub-optimal for the compiler. A possible alternative is to have a macro that makes string interpolation lazy, i.e. returning an object with the arguments. That might be useful for other purposes as well.

Keno · 2019-10-29T06:29:57Z

Alright, how do you like this version? Seems to still give the same benefits in inference (modulo a few extra LazyString constructors).

Keno · 2019-10-29T22:39:43Z

@JeffBezanson do you like this approach better? If so, I'll fix up this PR and get it ready.

StefanKarpinski · 2019-10-30T13:38:04Z

I kind of like this as an implementation, but it suggests to me that having a macro for throwing errors would be beneficial, e.g. @throw DomainError "blah $x blah".

JeffBezanson · 2019-10-30T16:49:07Z

base/strings/lazy.jl

+    :(LazyString($(parts...)))
+end
+
+function show(io::IO, s::LazyString)


Suggested change

function show(io::IO, s::LazyString)

function print(io::IO, s::LazyString)

JeffBezanson · 2019-10-30T16:55:01Z

Yes I like this. Unfortunately implementing the normal string interface for this type is not very natural, but oh well.

base/strings/lazy.jl

c42f · 2019-10-31T08:55:03Z

@StefanKarpinski related to a possible @throw, I've sometimes thought it would be useful to be able to attach arbitrary key-value pairs to exceptions in much the same way we attach them to log records. I think people sometimes have a temptation to log-then-throw for adding context which is really not ideal.

(See also this crazy idea #7026 (comment) on how we might unify return codes with exceptions. It would make even more sense if combined with pattern matching of key value pairs.)

StefanKarpinski · 2019-10-31T13:10:31Z

I've been using the logging infrastructure for PkgServer and it's quite nice to just list key-value pairs for values that may be of interest when looking at logs. Doing something similar for errors would make sense to me as well.

User-764Q · 2021-10-22T18:47:11Z

This has been open for a while, is it worth closing it now?

KristofferC · 2021-10-22T19:03:30Z

@User-764Q please stop commenting on issues/PRs when the only information you bring up is that they are "old". It creates notifications for a lot of people.

Roger-luo · 2021-10-22T21:43:59Z

Just saw this issue it seems this is also useful for solving KristofferC/Crayons.jl#47 since when interpolating crayons we are lack of the IO context. Doing the interpolation lazily would defer this information when we actually print it which solves the problem.

I'm wondering if people still want it in Base for other reasons or maybe we can have this in a package first.

base/strings/lazy.jl

Keno · 2022-01-19T22:44:08Z

I think this is good to go. Will merge in a bit so if there are any objections, say so now.

c42f

I think this is good useful utility, and I like the @lazy_str export. For minimalism I could live without the LazyString export, but leaving it in also seems harmless enough.

c42f · 2022-01-19T23:58:10Z

base/abstractarray.jl

@@ -914,19 +914,21 @@ end
 # copy from an some iterable object into an AbstractArray
 function copyto!(dest::AbstractArray, dstart::Integer, src, sstart::Integer)
    if (sstart < 1)
-        throw(ArgumentError(string("source start offset (",sstart,") is < 1")))
+        throw(ArgumentError(LazyString("source start offset (",sstart,") is < 1")))


Should these cases be using the string macro for consistency? It seems you've included strings/lazy.jl before abstractarray.jl during bootstrap so I guess it should work?

[edit] ah, I see that some uses of string() are just for keeping indentation under control. It sure would be nice if it was easier to reuse the parsing logic from #40753 so we had consistency of escaping rules for these kind of "almost but not quite a normal string" type of custom string literals. Rather than needing the raw string escaping rules.

I just did the straightforward replacement. We can change these for consistency in a followup.

Certainly fine by me.

vtjnash · 2022-01-20T01:40:25Z

This seems like it might potentially move in the opposite direction of the recent escape analysis-driven design changes, where we wanted to capture objects less frequently.

vtjnash · 2022-01-20T01:43:23Z

base/strings/lazy.jl

+as much work as possible to either the macro or later printing operations.
+"""
+struct LazyString <: AbstractString
+    parts::Tuple


It might be better to allocate an Array here, so we do not need to create a whole new Tuple type and can examine and iterate it more easily later?

It can be be an immutable array once that's merged, but currently the analysis in #43852 is not strong enough to model the mutation of the mutable array at the moment. I think for the current use cases, a tuple is mostly fine, since the types will be known at compile time and are stable. Plus it's the error path anyway, which we heavily pessimize and in particular, don't infer, so it'll allocate tons of tuple types anyway.

vtjnash · 2022-01-20T01:46:10Z

base/strings/lazy.jl

+This type is designed to be cheap to construct at runtime, trying to offload
+as much work as possible to either the macro or later printing operations.
+"""
+struct LazyString <: AbstractString


It feels a bit cheeky to declare that this is a kind of String, then implement none of the interface for one.

We could potentially make this object mutable, then swap the field for the string'd value when you first access it, perhaps?

Yes, I think that's reasonable. Let me do that.

Keno · 2022-01-20T02:00:48Z

This seems like it might potentially move in the opposite direction of the recent escape analysis-driven design changes, where we wanted to capture objects less frequently.

Well, as I said in those discussions, I don't like the idea to string summarize on error construction and if there are escaping issues, we should insert maybecopies.

KristofferC · 2022-01-20T22:46:30Z

base/strings/lazy.jl

+end
+
+function String(l::LazyString)
+    if !isdefined(l, :str)


Is there any worry about race conditions for code like this?

Yes. We can use an atomic acquire and cmpswap release here to fix that. (Or perhaps create an "assign_once" type, that manages this pattern for us?)

aviatesk · 2022-01-21T02:28:32Z

This seems like it might potentially move in the opposite direction of the recent escape analysis-driven design changes, where we wanted to capture objects less frequently.

Well, as I said in those discussions, I don't like the idea to string summarize on error construction and if there are escaping issues, we should insert maybecopies.

I'm still not convinced that maybecopy is the right solution. One problem of the maybecopy idea is that it will require a global transformation, or many similar builtin function definitions.
E.g. if we want to allocate a on the stack in the following example, where a can only escape via possible bounds error in arrayref call, we need to change %24 = Base.arrayref(true, %2, 2)::Int64 to something like %24 = Base.arrayref_copy_on_throw(true, %2, 2)::Int, where arrayref_copy_on_throw uses another version of BoundsError constructor that makes a copy of a:

julia> const Lx = Ref{Any}();

julia> @noinline storelen!(xs) = Lx[] = length(xs);

julia> code_escapes() do
          a = [1,2,3]
          storelen!(a)
          a[2]
      end
#9() in Main at REPL[9]:2
2 ◌  1 ─ %1  = Core.tuple(1, 2, 3)::Core.Const((1, 2, 3))                                │╻      vect
  ↑  │   %2  = $(Expr(:foreigncall, :(:jl_alloc_array_1d), Vector{Int64}, svec(Any, Int64), 0, :(:ccall), Vector{Int64}, 3, 3))::Vector{Int64}
  ◌  └──       goto #7 if not true                                                       ││     
  ✓  2 ┄ %4  = φ (#1 => 1, #6 => %16)::Int64                                             ││     
  ✓  │   %5  = φ (#1 => 1, #6 => %17)::Int64                                             ││     
  ✓  │   %6  = φ (#1 => 1, #6 => %9)::Int64                                              ││     
  ◌  │   %7  = Base.getfield(%1, %4, true)::Int64                                        ││╻      getindex
  ◌  │         Base.arrayset(false, %2, %7, %6)::Vector{Int64}                           ││╻      setindex!
  ◌  │   %9  = Base.add_int(%6, 1)::Int64                                                ││     
  ◌  │   %10 = (%5 === 3)::Bool                                                          │││╻      ==
  ◌  └──       goto #4 if not %10                                                        │││    
  ◌  3 ─       Base.nothing::Nothing                                                     │││    
  ◌  └──       goto #5                                                                   │││    
  ◌  4 ─ %14 = Base.add_int(%5, 1)::Int64                                                │││╻      +
  ◌  └──       goto #5                                                                   ││╻      iterate
  ✓  5 ┄ %16 = φ (#4 => %14)::Int64                                                      ││     
  ✓  │   %17 = φ (#4 => %14)::Int64                                                      ││     
  ◌  │   %18 = φ (#3 => true, #4 => false)::Bool                                         ││     
  ◌  │   %19 = Base.not_int(%18)::Bool                                                   ││     
  ◌  └──       goto #7 if not %19                                                        ││     
  ◌  6 ─       goto #2                                                                   ││     
  ◌  7 ┄       goto #8                                                                   ││     
3 ◌  8 ─       invoke Main.storelen!(%2::Vector{Int64})::Any                             │      
4 ◌  │   %24 = Base.arrayref(true, %2, 2)::Int64                                         │╻      getindex
  ◌  └──       return %24                                                                │

This is because the constructor of BoundsError is opaque from this frame, and it's tricky to change the semantics of it based on the analysis on this frame. So probably the most sensible way to implement the maybecopy idea is to define "no-capturing on exception" alternatives of certain builtin functions whose no-throw checks inherently need to be conservative.

To me that sounds more complicated and I'd say it's much simpler if we change the semantics of certain error constructors so that they don't capture and let analyzers and optimizers assume that invariant.

Keno · 2022-01-22T01:59:45Z

Yes, the initial implementation of maybecopy is just to always copy the array, so you can assume no-escape. The issue people had with that is that the array may be big, so we want to have the flexibility to not do that in contexts where memory allocations are supposed to be tightly controlled. That's why the maybecopy is useful, because the optimizer can choose. I agree with you of course that it requires a global transformation, but I also think the extra memory issue is somewhat theoretical, and by the time it becomes a problem we should be able to do the compiler transform.

Keno requested a review from JeffBezanson October 29, 2019 00:03

JeffBezanson reviewed Oct 30, 2019

View reviewed changes

Keno force-pushed the kf/errorclosures branch from 13571c2 to 0d26536 Compare October 31, 2019 02:20

c42f reviewed Oct 31, 2019

View reviewed changes

base/strings/lazy.jl Outdated Show resolved Hide resolved

fredrikekre added needs compat annotation Add !!! compat "Julia x.y" to the docstring needs news A NEWS entry is required for this change labels Oct 31, 2019

Keno force-pushed the kf/errorclosures branch 2 times, most recently from d3b19f5 to 7f56d0e Compare October 31, 2019 11:09

ararslan added the needs tests Unit tests are required for this change label Oct 31, 2019

c42f mentioned this pull request Jun 13, 2020

Can logging macros call code hidden behind a function? #36236

Open

jmert mentioned this pull request Nov 15, 2020

Isolate string interpolation in error paths for sparse matrix construction #38446

Merged

Keno force-pushed the kf/errorclosures branch from 7f56d0e to 1283a39 Compare January 18, 2022 08:19

simeonschaub reviewed Jan 18, 2022

View reviewed changes

base/strings/lazy.jl Outdated Show resolved Hide resolved

Keno force-pushed the kf/errorclosures branch from 190c7d5 to 1da5d4e Compare January 19, 2022 06:23

Keno changed the title ~~RFC: Allow a closure in various kinds of errors~~ Allow a closure in various kinds of errors Jan 19, 2022

c42f approved these changes Jan 20, 2022

View reviewed changes

vtjnash reviewed Jan 20, 2022

View reviewed changes

Keno removed needs tests Unit tests are required for this change needs news A NEWS entry is required for this change needs compat annotation Add !!! compat "Julia x.y" to the docstring labels Jan 20, 2022

Keno force-pushed the kf/errorclosures branch from 849858c to 286f803 Compare January 20, 2022 02:29

vtjnash changed the title ~~Allow a closure in various kinds of errors~~ Create a LazyString type, for places (like errors) which we would prefer to defer the actual work Jan 20, 2022

Add lazy string type

6522370

Keno force-pushed the kf/errorclosures branch from 286f803 to 6522370 Compare January 20, 2022 14:49

Keno merged commit f290339 into master Jan 20, 2022

Keno deleted the kf/errorclosures branch January 20, 2022 22:35

KristofferC reviewed Jan 20, 2022

View reviewed changes

LilithHafner pushed a commit to LilithHafner/julia that referenced this pull request Feb 22, 2022

Add lazy string type (JuliaLang#33711)

6e16b6e

ranocha mentioned this pull request Feb 28, 2022

Possibly interesting features of Julia v1.8 trixi-framework/Trixi.jl#1075

Open

LilithHafner pushed a commit to LilithHafner/julia that referenced this pull request Mar 8, 2022

Add lazy string type (JuliaLang#33711)

eccce1b

wheeheee mentioned this pull request Mar 16, 2024

Support for LazyString and lazy_str JuliaLang/Compat.jl#824

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Create a LazyString type, for places (like errors) which we would prefer to defer the actual work #33711

Create a LazyString type, for places (like errors) which we would prefer to defer the actual work #33711

Keno commented Oct 29, 2019

JeffBezanson commented Oct 29, 2019

Keno commented Oct 29, 2019

Keno commented Oct 29, 2019

StefanKarpinski commented Oct 30, 2019 •

edited

Loading

JeffBezanson Oct 30, 2019

JeffBezanson commented Oct 30, 2019

c42f commented Oct 31, 2019

StefanKarpinski commented Oct 31, 2019

User-764Q commented Oct 22, 2021

KristofferC commented Oct 22, 2021

Roger-luo commented Oct 22, 2021

Keno commented Jan 19, 2022

c42f left a comment

c42f Jan 19, 2022

Keno Jan 20, 2022

c42f Jan 20, 2022

vtjnash commented Jan 20, 2022

vtjnash Jan 20, 2022

Keno Jan 20, 2022

vtjnash Jan 20, 2022

Keno Jan 20, 2022

Keno commented Jan 20, 2022 •

edited

Loading

KristofferC Jan 20, 2022

vtjnash Jan 20, 2022

aviatesk commented Jan 21, 2022 •

edited

Loading

Keno commented Jan 22, 2022

	function show(io::IO, s::LazyString)
	function print(io::IO, s::LazyString)

Create a LazyString type, for places (like errors) which we would prefer to defer the actual work #33711

Create a LazyString type, for places (like errors) which we would prefer to defer the actual work #33711

Conversation

Keno commented Oct 29, 2019

JeffBezanson commented Oct 29, 2019

Keno commented Oct 29, 2019

Keno commented Oct 29, 2019

StefanKarpinski commented Oct 30, 2019 • edited Loading

Choose a reason for hiding this comment

JeffBezanson commented Oct 30, 2019

c42f commented Oct 31, 2019

StefanKarpinski commented Oct 31, 2019

User-764Q commented Oct 22, 2021

KristofferC commented Oct 22, 2021

Roger-luo commented Oct 22, 2021

Keno commented Jan 19, 2022

c42f left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

vtjnash commented Jan 20, 2022

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Keno commented Jan 20, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

aviatesk commented Jan 21, 2022 • edited Loading

Keno commented Jan 22, 2022

StefanKarpinski commented Oct 30, 2019 •

edited

Loading

Keno commented Jan 20, 2022 •

edited

Loading

aviatesk commented Jan 21, 2022 •

edited

Loading