Internal Error/StackOverflowError in type inference #42327

Open
Seelengrab opened this issue Sep 21, 2021 · 4 comments
Labels
bug Indicates an unexpected problem or unintended behavior compiler:inference Type inference

Comments

@Seelengrab
Contributor

Seelengrab commented Sep 21, 2021

The reproducer can be found here. Run it as julia reproducer.jl; it takes care of setting up a temp environment and such.

What I think is happening is that the definition of iterate here kills type inference when trying to iterate over the result of a recursive buildup of Flatten{T}. That definition is copied verbatim from Base.Iterators.flatten and also contains this line:

        y = iterate(Base.tail(state)...)

which potentially splats a very large/deeply nested tuple. I think this is what ultimately kills type inference, and since the definition is copied from Base.Iterators.flatten, the Base version should have the same problem. It's hard to trigger there, though: recursively building up a regular flatten degrades to Any much sooner, and the resulting loss of performance makes it infeasible to reach the crash. That was the motivation for building Flatten{T} in the first place, since I can guarantee here that the eltype will always be the same.
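For readers unfamiliar with the pattern, here is a minimal sketch of what that splat does. The state layout (outer_state, inner_iterator, inner_state) and the concrete values are assumptions made for illustration; they are not taken from the reproducer.

```julia
# Hypothetical state layout, assumed for illustration only:
# (outer_state, inner_iterator, inner_state)
state = (1, [10, 20], 2)

# Base.tail drops the first element of a tuple.
rest = Base.tail(state)

# Splatting `rest` passes its elements as positional arguments,
# so this call is equivalent to iterate([10, 20], 2).
y = iterate(rest...)
```

If the inner iterator is itself a flatten, its state is again a tuple of this shape, so the type inference has to reason about would nest one level deeper per layer of flattening.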

I'm fairly certain this is the cause, as interrupting the "waiting" code before hitting the internal error leads to this stacktrace:

[image: screenshot of the stacktrace]

I know that splatting large things is bad, so I'll move to a queue/Channel based design soon™, but I don't think it should crash/throw that unsightly internal error here.

Initial internal error
Internal error: encountered unexpected error in runtime:
StackOverflowError()                                    
is_derived_type at ./compiler/typelimits.jl:39   # I've also seen 65 here
is_derived_type at ./compiler/typelimits.jl:66          
is_derived_type at ./compiler/typelimits.jl:66          
is_derived_type at ./compiler/typelimits.jl:66          
is_derived_type at ./compiler/typelimits.jl:66          
is_derived_type at ./compiler/typelimits.jl:66          
is_derived_type at ./compiler/typelimits.jl:66          
is_derived_type at ./compiler/typelimits.jl:66          
is_derived_type at ./compiler/typelimits.jl:66          
is_derived_type at ./compiler/typelimits.jl:66          
is_derived_type at ./compiler/typelimits.jl:66          
is_derived_type at ./compiler/typelimits.jl:66          
Transition to `abstract_call`
is_derived_type at ./compiler/typelimits.jl:66                      
is_derived_type at ./compiler/typelimits.jl:66                      
is_derived_type_from_any at ./compiler/typelimits.jl:74             
type_more_complex at ./compiler/typelimits.jl:196                   
limit_type_size at ./compiler/typelimits.jl:21                      
abstract_call_method at ./compiler/abstractinterpretation.jl:454    
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:105
abstract_call_known at ./compiler/abstractinterpretation.jl:1341    
abstract_call at ./compiler/abstractinterpretation.jl:1396          
abstract_apply at ./compiler/abstractinterpretation.jl:997          
abstract_call_known at ./compiler/abstractinterpretation.jl:1258    
abstract_call at ./compiler/abstractinterpretation.jl:1396          
abstract_call at ./compiler/abstractinterpretation.jl:1381          
abstract_eval_statement at ./compiler/abstractinterpretation.jl:1536
typeinf_local at ./compiler/abstractinterpretation.jl:1901          
typeinf_nocycle at ./compiler/abstractinterpretation.jl:2017        
_typeinf at ./compiler/typeinfer.jl:226                             
typeinf at ./compiler/typeinfer.jl:209                              
typeinf_edge at ./compiler/typeinfer.jl:825 [inlined]               
abstract_call_method at ./compiler/abstractinterpretation.jl:504    
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:105
abstract_call_known at ./compiler/abstractinterpretation.jl:1341    
abstract_call at ./compiler/abstractinterpretation.jl:1396          
abstract_apply at ./compiler/abstractinterpretation.jl:997          
abstract_call_known at ./compiler/abstractinterpretation.jl:1258    
abstract_call at ./compiler/abstractinterpretation.jl:1396          
abstract_call at ./compiler/abstractinterpretation.jl:1381          
abstract_eval_statement at ./compiler/abstractinterpretation.jl:1536
typeinf_local at ./compiler/abstractinterpretation.jl:1901          
typeinf_nocycle at ./compiler/abstractinterpretation.jl:2017        
_typeinf at ./compiler/typeinfer.jl:226                             
typeinf at ./compiler/typeinfer.jl:209                              
.
.
.
# this continues for a very long time - I haven't seen it finish printing yet
julia> versioninfo()                           
Julia Version 1.8.0-DEV.548                    
Commit c5f348726c* (2021-09-16 15:09 UTC)      
Platform Info:                                 
  OS: Linux (x86_64-linux-gnu)                 
  CPU: Intel(R) Core(TM) i7-6600U CPU @ 2.60GHz
  WORD_SIZE: 64                                
  LIBM: libopenlibm                            
  LLVM: libLLVM-12.0.1 (ORCJIT, skylake)       
Environment:                                   
  JULIA_PKG_SERVER =                           
  JULIA_NUM_THREADS = 4                        

I'll build 1.7 later today and check if it breaks on there as well, but I'm fairly certain that it will.

@Seelengrab
Contributor Author

Seelengrab commented Sep 21, 2021

Yep, also breaks on 1.7-rc1:

julia> versioninfo()                           
Julia Version 1.7.0-rc1                        
Commit 9eade6195e* (2021-09-12 06:45 UTC)      
Platform Info:                                 
  OS: Linux (x86_64-linux-gnu)                 
  CPU: Intel(R) Core(TM) i7-6600U CPU @ 2.60GHz
  WORD_SIZE: 64                                
  LIBM: libopenlibm                            
  LLVM: libLLVM-12.0.1 (ORCJIT, skylake)       
Environment:                                   
  JULIA_PKG_SERVER =                           
  JULIA_NUM_THREADS = 4                        

though this time in egal_types instead of is_derived_type:

Internal error: encountered unexpected error in runtime:
StackOverflowError()                                    
egal_types at ~/julia/src/builtins.c:130     
egal_types at ~/julia/src/builtins.c:145     
egal_types at ~/julia/src/builtins.c:145     
egal_types at ~/julia/src/builtins.c:145     
egal_types at ~/julia/src/builtins.c:145     
egal_types at ~/julia/src/builtins.c:145     
egal_types at ~/julia/src/builtins.c:145     
egal_types at ~/julia/src/builtins.c:145     
egal_types at ~/julia/src/builtins.c:145     
.
.
.
egal_types at ~/julia/src/builtins.c:145                         
egal_types at ~/julia/src/builtins.c:145                         
egal_types at ~/julia/src/builtins.c:145 [inlined]               
jl_types_egal at ~/julia/src/builtins.c:192                      
jl_types_equal at ~/julia/src/subtype.c:1916                     
== at ./operators.jl:248                                                    
jfptr_EQ.EQ._15392 at ~/julia/usr/lib/julia/sys.so (unknown line)
_jl_invoke at ~/julia/src/gf.c:2245 [inlined]                    
jl_apply_generic at ~/julia/src/gf.c:2427                        
abstract_call_method at ./compiler/abstractinterpretation.jl:408            
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:105        
abstract_call_known at ./compiler/abstractinterpretation.jl:1319            
abstract_call at ./compiler/abstractinterpretation.jl:1374                  
abstract_apply at ./compiler/abstractinterpretation.jl:975                  
abstract_call_known at ./compiler/abstractinterpretation.jl:1236            
abstract_call at ./compiler/abstractinterpretation.jl:1374                  
abstract_call at ./compiler/abstractinterpretation.jl:1359                  
abstract_eval_statement at ./compiler/abstractinterpretation.jl:1514        
typeinf_local at ./compiler/abstractinterpretation.jl:1879                  
typeinf_nocycle at ./compiler/abstractinterpretation.jl:1993                
_typeinf at ./compiler/typeinfer.jl:226                                     
typeinf at ./compiler/typeinfer.jl:209                                      
typeinf_edge at ./compiler/typeinfer.jl:823 [inlined]                       
abstract_call_method at ./compiler/abstractinterpretation.jl:504            
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:105        
abstract_call_known at ./compiler/abstractinterpretation.jl:1319            
abstract_call at ./compiler/abstractinterpretation.jl:1374                  
abstract_apply at ./compiler/abstractinterpretation.jl:975                  
abstract_call_known at ./compiler/abstractinterpretation.jl:1236            
abstract_call at ./compiler/abstractinterpretation.jl:1374                  
abstract_call at ./compiler/abstractinterpretation.jl:1359                  
abstract_eval_statement at ./compiler/abstractinterpretation.jl:1514        
typeinf_local at ./compiler/abstractinterpretation.jl:1879                  
typeinf_nocycle at ./compiler/abstractinterpretation.jl:1993                
_typeinf at ./compiler/typeinfer.jl:226                                     
typeinf at ./compiler/typeinfer.jl:209                                      

@aviatesk aviatesk added the compiler:inference Type inference label Sep 21, 2021
@KristofferC
Member

Dup of #38364 arguably.

Also, I want to highlight this comment: #38364 (comment)

There are various issues in the system when dealing with large tuples: if type inference doesn't get you, then subtyping or codegen probably will. Of course we want to fix all of this eventually, but for now you really just have to avoid big tuples.

@Seelengrab
Contributor Author

Seelengrab commented Sep 21, 2021

Kind of? The MWE by @martinholters at least throws a sensible StackOverflowError and not an internal error for me:

julia> xs = tuple(("a" for _ in 1:2000)...);
                                            
julia> foo(xs) = xs[1:20]                   
foo (generic function with 1 method)        
                                            
julia> @code_typed foo(xs)                  
ERROR: StackOverflowError:                  
Stacktrace:                                 
     [1] _methods_by_ftype                  
       @ ./reflection.jl:908 [inlined]      
     [2] #findall#246                       
...
 [18967] _typeinf(interp::Core.Compiler.NativeInterpreter, frame::Core.Compiler.InferenceState)                                             
       @ Core.Compiler ./compiler/typeinfer.jl:226                                                                                          
 [18968] typeinf(interp::Core.Compiler.NativeInterpreter, frame::Core.Compiler.InferenceState)                                              
       @ Core.Compiler ./compiler/typeinfer.jl:209                                                                                          
 [18969] typeinf_code(interp::Core.Compiler.NativeInterpreter, method::Method, atypes::Any, sparams::Core.SimpleVector, run_optimizer::Bool)
       @ Core.Compiler ./compiler/typeinfer.jl:845                                                                                          
 [18970] code_typed_by_type(tt::Type; optimize::Bool, debuginfo::Symbol, world::UInt64, interp::Core.Compiler.NativeInterpreter)            
       @ Base ./reflection.jl:1213                                                                                                          
 [18971] code_typed(f::Any, types::Any; optimize::Bool, debuginfo::Symbol, world::UInt64, interp::Core.Compiler.NativeInterpreter)          
       @ Base ./reflection.jl:1181                                                                                                          
 [18972] code_typed(f::Any, types::Any)                                                                                                     
       @ Base ./reflection.jl:1168                                                                                                          

The original example by @fonsp, however, also throws an internal error (a different one, but it also ends in type inference).

I'd be happy with this getting a non-internal error and otherwise being a duplicate (though I'm not sure the cause is the same, as the errors seem to be handled differently...). The use case in the reproducer in my OP, while valid code, can be worked around on my end by not creating those large tuples internally.

I guess the difference between the two issues is that in my case, the code creating those Flatten actually runs fine and inference is only hit once the result is iterated over, while in the other issue inference immediately throws. If anything, this could indicate that maybe Base.Iterators.flatten shouldn't splat here either 🤷‍♂️
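As a rough sketch of the "avoid building large nested tuples" workaround, the snippet below contrasts recursive wrapping with keeping all sources in one Vector. It uses plain Base.Iterators.flatten rather than the Flatten{T} from the reproducer, and the variable names are made up for illustration; whether this shape fits the real use case is an assumption.

```julia
using Base.Iterators: flatten

# Nested construction: each step wraps the previous iterator in a new tuple,
# so the iterator's type nests one level deeper every time.
nested = flatten(([1, 2], [3]))
nested = flatten((nested, [4]))

# Flat construction: the sources live in a Vector, so the iterator's type
# stays the same no matter how many sources are pushed.
sources = Vector{Vector{Int}}()
push!(sources, [1, 2], [3], [4])
flat = flatten(sources)

# Both yield the same elements in the same order.
same_elements = collect(nested) == collect(flat)
```

With the flat form, inference only ever sees one fixed type, at the cost of requiring every source to have the same concrete type.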

@KristofferC
Member

Like the comment says, the exact point where things explode might vary but in the end, it still gets you.

@nsajko nsajko added the bug Indicates an unexpected problem or unintended behavior label May 12, 2024