Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

unexpected/sporadic testDefaultDeconv test error? #1482

Closed
dehann opened this issue Feb 10, 2022 · 2 comments · Fixed by #1486
Closed

unexpected/sporadic testDefaultDeconv test error? #1482

dehann opened this issue Feb 10, 2022 · 2 comments · Fixed by #1486
Assignees
Milestone

Comments

@dehann
Copy link
Member

dehann commented Feb 10, 2022

during development of #1481 , the follow test failure happened in test, but not during separate REPL testing:

Test Summary:            | Pass  Total
deconv through multihypo |    7      7
[ Info: Ensure variables are all initialized (graphinit)
[ Info: x0 is not initialized, and will do so now...
[ Info: try doautoinit! of x0
[ Info: init with useinitfct [:x0f1]
[ Info: do init of x0
[ Info: x1 is not initialized, and will do so now...
[ Info: try doautoinit! of x1
[ Info: init with useinitfct [:x0x1f1]
[ Info: do init of x1
[ Info: x2 is not initialized, and will do so now...
[ Info: try doautoinit! of x2
[ Info: init with useinitfct [:x1x2f1]
[ Info: do init of x2
[ Info: x3 is not initialized, and will do so now...
[ Info: try doautoinit! of x3
[ Info: init with useinitfct [:x2x3f1]
[ Info: do init of x3
[ Info: x4 is not initialized, and will do so now...
[ Info: try doautoinit! of x4
[ Info: init with useinitfct [:x3x4f1]
[ Info: do init of x4
[ Info: x5 is not initialized, and will do so now...
[ Info: try doautoinit! of x5
[ Info: init with useinitfct [:x4x5f1]
[ Info: do init of x5
[ Info: x6 is not initialized, and will do so now...
[ Info: try doautoinit! of x6
[ Info: init with useinitfct [:x5x6f1]
[ Info: do init of x6
[ Info: l1 is not initialized, and will do so now...
[ Info: try doautoinit! of l1
[ Info: init with useinitfct [:x0l1f1, :x6l1f1]
[ Info: do init of l1
[ Info: Solving over the Bayes (Junction) tree.
Building Bayes net...
Find potential functions for each clique
[ Info: Bayes Tree Complete
[ Info: Do tree based init-ference
FSM cliq1, iter=1 -- setCliqueRecycling_StateMachine, NULL
FSM cliq1, iter=2 -- buildCliqSubgraph_StateMachine, NULL
FSM cliq1, iter=3 -- presolveChecklist_StateMachine, NULL
FSM cliq1, iter=4 -- waitForUp_StateMachine, NULL
FSM cliq2, iter=1 -- setCliqueRecycling_StateMachine, NULL
FSM cliq2, iter=2 -- buildCliqSubgraph_StateMachine, NULL
FSM cliq2, iter=3 -- presolveChecklist_StateMachine, NULL
FSM cliq2, iter=4 -- waitForUp_StateMachine, NULL
FSM cliq3, iter=1 -- setCliqueRecycling_StateMachine, NULL
FSM cliq3, iter=2 -- buildCliqSubgraph_StateMachine, NULL
FSM cliq3, iter=3 -- presolveChecklist_StateMachine, NULL
FSM cliq3, iter=4 -- waitForUp_StateMachine, NULL
FSM cliq3, iter=5 -- preUpSolve_StateMachine, NULL
FSM cliq3, iter=6 -- solveUp_StateMachine, NULL
FSM cliq3, iter=7 -- postUpSolve_StateMachine, UPSOLVED
FSM cliq4, iter=1 -- setCliqueRecycling_StateMachine, NULL
FSM cliq4, iter=2 -- buildCliqSubgraph_StateMachine, NULL
FSM cliq4, iter=3 -- presolveChecklist_StateMachine, NULL
FSM cliq4, iter=4 -- waitForUp_StateMachine, NULL
FSM cliq4, iter=5 -- preUpSolve_StateMachine, NULL
FSM cliq4, iter=6 -- solveUp_StateMachine, NULL
FSM cliq4, iter=7 -- postUpSolve_StateMachine, UPSOLVED
FSM cliq5, iter=1 -- setCliqueRecycling_StateMachine, NULL
FSM cliq5, iter=2 -- buildCliqSubgraph_StateMachine, NULL
FSM cliq5, iter=3 -- presolveChecklist_StateMachine, NULL
FSM cliq5, iter=4 -- waitForUp_StateMachine, NULL
FSM cliq5, iter=5 -- preUpSolve_StateMachine, NULL
FSM cliq5, iter=6 -- solveUp_StateMachine, NULL
FSM cliq5, iter=7 -- postUpSolve_StateMachine, UPSOLVED
FSM cliq6, iter=1 -- setCliqueRecycling_StateMachine, NULL
FSM cliq6, iter=2 -- buildCliqSubgraph_StateMachine, NULL
FSM cliq6, iter=3 -- presolveChecklist_StateMachine, NULL
FSM cliq6, iter=4 -- waitForUp_StateMachine, NULL
FSM cliq6, iter=5 -- preUpSolve_StateMachine, NULL
FSM cliq6, iter=6 -- solveUp_StateMachine, NULL
FSM cliq6, iter=7 -- postUpSolve_StateMachine, UPSOLVED
FSM cliq3, iter=8 -- waitForDown_StateMachine, UPSOLVED
FSM cliq4, iter=8 -- waitForDown_StateMachine, UPSOLVED
FSM cliq5, iter=8 -- waitForDown_StateMachine, UPSOLVED
FSM cliq6, iter=8 -- waitForDown_StateMachine, UPSOLVED
FSM cliq2, iter=5 -- preUpSolve_StateMachine, NULL
FSM cliq2, iter=6 -- solveUp_StateMachine, NULL
FSM cliq2, iter=7 -- postUpSolve_StateMachine, UPSOLVED

FSM cliq2, iter=8 -- waitForDown_StateMachine, UPSOLVEDInterruptException:

Stacktrace:
  [1] try_yieldto(undo::typeof(Base.ensure_rescheduled))
    @ Base ./task.jl:777
  [2] wait()
    @ Base ./task.jl:837
  [3] wait(c::Base.GenericCondition{Base.Threads.SpinLock})
    @ Base ./condition.jl:123
  [4] _wait(t::Task)
    @ Base ./task.jl:293
  [5] sync_end(c::Channel{Any})
    @ Base ./task.jl:361
  [6] macro expansion
    @ ./task.jl:400 [inlined]
  [7] waitForUp_StateMachine(csmc::CliqStateMachineContainer{BayesTreeNodeData, GraphsDFG{SolverParams, DFGVariable, DFGFactor}, GraphsDFG{SolverParams, DFGVariable, DFGFactor}, MetaBayesTree})
    @ IncrementalInference ~/.julia/dev/IncrementalInference/src/CliqueStateMachine.jl:181
  [8] (::StateMachine{CliqStateMachineContainer})(userdata::CliqStateMachineContainer{BayesTreeNodeData, GraphsDFG{SolverParams, DFGVariable, DFGFactor}, GraphsDFG{SolverParams, DFGVariable, DFGFactor}, MetaBayesTree}, timeout::Int64; pollinterval::Float64, breakafter::Function, verbose::Bool, verbosefid::Base.TTY, verboseXtra::IncrementalInference.CliqStatus, iterlimit::Int64, injectDelayBefore::Nothing, recordhistory::Bool, housekeeping_cb::IncrementalInference.var"#475#477"{IncrementalInference.TreeClique, CliqStateMachineContainer{BayesTreeNodeData, GraphsDFG{SolverParams, DFGVariable, DFGFactor}, GraphsDFG{SolverParams, DFGVariable, DFGFactor}, MetaBayesTree}})
    @ FunctionalStateMachine ~/.julia/packages/FunctionalStateMachine/2JZFG/src/StateMachine.jl:94
  [9] initStartCliqStateMachine!(dfg::GraphsDFG{SolverParams, DFGVariable, DFGFactor}, tree::MetaBayesTree, cliq::IncrementalInference.TreeClique, timeout::Int64; oldcliqdata::BayesTreeNodeData, verbose::Bool, verbosefid::Base.TTY, drawtree::Bool, show::Bool, incremental::Bool, limititers::Int64, upsolve::Bool, downsolve::Bool, recordhistory::Bool, delay::Bool, logger::Base.CoreLogging.SimpleLogger, solve_progressbar::Nothing, algorithm::Symbol, solveKey::Symbol)
    @ IncrementalInference ~/.julia/dev/IncrementalInference/src/CliqueStateMachine.jl:64
 [10] tryCliqStateMachineSolve!(dfg::GraphsDFG{SolverParams, DFGVariable, DFGFactor}, treel::MetaBayesTree, cliqKey::Int64, timeout::Int64; oldtree::MetaBayesTree, verbose::Bool, verbosefid::Base.TTY, drawtree::Bool, limititers::Int64, downsolve::Bool, incremental::Bool, delaycliqs::Vector{Symbol}, recordcliqs::Vector{Symbol}, solve_progressbar::Nothing, algorithm::Symbol, solveKey::Symbol)
    @ IncrementalInference ~/.julia/dev/IncrementalInference/src/SolverAPI.jl:112
 [11] (::IncrementalInference.var"#526#529"{MetaBayesTree, Bool, Bool, Base.TTY, Bool, Bool, Vector{Symbol}, Vector{Symbol}, Symbol, Symbol, GraphsDFG{SolverParams, DFGVariable, DFGFactor}, MetaBayesTree, Int64, Nothing, Int64, Int64})()
    @ IncrementalInference ./task.jl:423┌ Warning: printCliqHistorySummary -- No CSM history found.
└ @ IncrementalInference ~/.julia/dev/IncrementalInference/src/TreeDebugTools.jl:191
┌ Warning: printCliqHistorySummary -- No CSM history found.
└ @ IncrementalInference ~/.julia/dev/IncrementalInference/src/TreeDebugTools.jl:191
┌ Error: Task 1 failed, sending error to all cliques
└ @ IncrementalInference ~/.julia/dev/IncrementalInference/src/CliqStateMachineUtils.jl:185
[ Info: All cliques should have exited
┌ Error: CSM clique 5: Exit with error state
└ @ IncrementalInference ~/.julia/dev/IncrementalInference/src/CliqStateMachineUtils.jl:152
┌ Error: CSM clique 4: Exit with error state
└ @ IncrementalInference ~/.julia/dev/IncrementalInference/src/CliqStateMachineUtils.jl:152
┌ Error: CSM clique 6: Exit with error state
└ @ IncrementalInference ~/.julia/dev/IncrementalInference/src/CliqStateMachineUtils.jl:152
┌ Error: CSM clique 3: Exit with error state
└ @ IncrementalInference ~/.julia/dev/IncrementalInference/src/CliqStateMachineUtils.jl:152
FSM cliq5, iter=9 -- exitStateMachine, UPSOLVED
FSM cliq4, iter=9 -- exitStateMachine, UPSOLVED
FSM cliq6, iter=9 -- exitStateMachine, UPSOLVED
FSM cliq3, iter=9 -- exitStateMachine, UPSOLVED
┌ Error: Task 1 failed, sending error to all cliques
└ @ IncrementalInference ~/.julia/dev/IncrementalInference/src/CliqStateMachineUtils.jl:185
┌ Error: CSM clique 2: Exit with error state
└ @ IncrementalInference ~/.julia/dev/IncrementalInference/src/CliqStateMachineUtils.jl:152
FSM cliq2, iter=9 -- exitStateMachine, UPSOLVED
deconvolution tools via differential factors: Error During Test at /home/dehann/.julia/dev/IncrementalInference/test/testDefaultDeconv.jl:89
[ Info: All cliques should have exited
  Got exception outside of a @test
  TaskFailedException
  
      nested task error: InterruptException:
      Stacktrace:
        [1] try_yieldto(undo::typeof(Base.ensure_rescheduled))
          @ Base ./task.jl:777
        [2] wait()
          @ Base ./task.jl:837
        [3] wait(c::Base.GenericCondition{Base.Threads.SpinLock})
          @ Base ./condition.jl:123
        [4] _wait(t::Task)
          @ Base ./task.jl:293
        [5] sync_end(c::Channel{Any})
          @ Base ./task.jl:361
        [6] macro expansion
          @ ./task.jl:400 [inlined]
        [7] waitForUp_StateMachine(csmc::CliqStateMachineContainer{BayesTreeNodeData, GraphsDFG{SolverParams, DFGVariable, DFGFactor}, GraphsDFG{SolverParams, DFGVariable, DFGFactor}, MetaBayesTree})
          @ IncrementalInference ~/.julia/dev/IncrementalInference/src/CliqueStateMachine.jl:181
        [8] (::StateMachine{CliqStateMachineContainer})(userdata::CliqStateMachineContainer{BayesTreeNodeData, GraphsDFG{SolverParams, DFGVariable, DFGFactor}, GraphsDFG{SolverParams, DFGVariable, DFGFactor}, MetaBayesTree}, timeout::Int64; pollinterval::Float64, breakafter::Function, verbose::Bool, verbosefid::Base.TTY, verboseXtra::IncrementalInference.CliqStatus, iterlimit::Int64, injectDelayBefore::Nothing, recordhistory::Bool, housekeeping_cb::IncrementalInference.var"#475#477"{IncrementalInference.TreeClique, CliqStateMachineContainer{BayesTreeNodeData, GraphsDFG{SolverParams, DFGVariable, DFGFactor}, GraphsDFG{SolverParams, DFGVariable, DFGFactor}, MetaBayesTree}})
          @ FunctionalStateMachine ~/.julia/packages/FunctionalStateMachine/2JZFG/src/StateMachine.jl:94
        [9] initStartCliqStateMachine!(dfg::GraphsDFG{SolverParams, DFGVariable, DFGFactor}, tree::MetaBayesTree, cliq::IncrementalInference.TreeClique, timeout::Int64; oldcliqdata::BayesTreeNodeData, verbose::Bool, verbosefid::Base.TTY, drawtree::Bool, show::Bool, incremental::Bool, limititers::Int64, upsolve::Bool, downsolve::Bool, recordhistory::Bool, delay::Bool, logger::Base.CoreLogging.SimpleLogger, solve_progressbar::Nothing, algorithm::Symbol, solveKey::Symbol)
          @ IncrementalInference ~/.julia/dev/IncrementalInference/src/CliqueStateMachine.jl:64
       [10] tryCliqStateMachineSolve!(dfg::GraphsDFG{SolverParams, DFGVariable, DFGFactor}, treel::MetaBayesTree, cliqKey::Int64, timeout::Int64; oldtree::MetaBayesTree, verbose::Bool, verbosefid::Base.TTY, drawtree::Bool, limititers::Int64, downsolve::Bool, incremental::Bool, delaycliqs::Vector{Symbol}, recordcliqs::Vector{Symbol}, solve_progressbar::Nothing, algorithm::Symbol, solveKey::Symbol)
          @ IncrementalInference ~/.julia/dev/IncrementalInference/src/SolverAPI.jl:112
       [11] (::IncrementalInference.var"#526#529"{MetaBayesTree, Bool, Bool, Base.TTY, Bool, Bool, Vector{Symbol}, Vector{Symbol}, Symbol, Symbol, GraphsDFG{SolverParams, DFGVariable, DFGFactor}, MetaBayesTree, Int64, Nothing, Int64, Int64})()
          @ IncrementalInference ./task.jl:423
  Stacktrace:
    [1] sync_end(c::Channel{Any})
      @ Base ./task.jl:381
    [2] macro expansion
      @ ./task.jl:400 [inlined]
    [3] taskSolveTree!(dfg::GraphsDFG{SolverParams, DFGVariable, DFGFactor}, treel::MetaBayesTree, timeout::Int64; oldtree::MetaBayesTree, drawtree::Bool, verbose::Bool, verbosefid::Base.TTY, limititers::Int64, limititercliqs::Vector{Pair{Symbol, Int64}}, downsolve::Bool, incremental::Bool, multithread::Bool, skipcliqids::Vector{Symbol}, recordcliqs::Vector{Symbol}, delaycliqs::Vector{Symbol}, smtasks::Vector{Task}, algorithm::Symbol, solveKey::Symbol)
      @ IncrementalInference ~/.julia/dev/IncrementalInference/src/SolverAPI.jl:50
    [4] solveTree!(dfgl::GraphsDFG{SolverParams, DFGVariable, DFGFactor}, oldtree::MetaBayesTree; timeout::Int64, storeOld::Bool, verbose::Bool, verbosefid::Base.TTY, delaycliqs::Vector{Symbol}, recordcliqs::Vector{Symbol}, limititercliqs::Vector{Pair{Symbol, Int64}}, injectDelayBefore::Nothing, skipcliqids::Vector{Symbol}, eliminationOrder::Vector{Symbol}, eliminationConstraints::Vector{Symbol}, smtasks::Vector{Task}, dotreedraw::Vector{Int64}, runtaskmonitor::Bool, algorithm::Symbol, solveKey::Symbol, multithread::Bool)
      @ IncrementalInference ~/.julia/dev/IncrementalInference/src/SolverAPI.jl:356
    [5] macro expansion
      @ ~/.julia/dev/IncrementalInference/test/testDefaultDeconv.jl:102 [inlined]
    [6] macro expansion
      @ /usr/local/share/julia-1.7.2/share/julia/stdlib/v1.7/Test/src/Test.jl:1283 [inlined]
    [7] top-level scope
      @ ~/.julia/dev/IncrementalInference/test/testDefaultDeconv.jl:93
    [8] include(fname::String)
      @ Base.MainInclude ./client.jl:451
    [9] top-level scope
      @ ~/.julia/dev/IncrementalInference/test/runtests.jl:45
   [10] include(fname::String)
      @ Base.MainInclude ./client.jl:451
   [11] top-level scope
      @ none:6
   [12] eval
      @ ./boot.jl:373 [inlined]
   [13] exec_options(opts::Base.JLOptions)
      @ Base ./client.jl:268
   [14] _start()
      @ Base ./client.jl:495
Test Summary:                                | Error  Total
deconvolution tools via differential factors |     1      1
ERROR: [ Info: monitorCSMs: all tasks done
LoadError: Some tests did not pass: 0 passed, 0 failed, 1 errored, 0 broken.
in expression starting at /home/dehann/.julia/dev/IncrementalInference/test/testDefaultDeconv.jl:89
in expression starting at /home/dehann/.julia/dev/IncrementalInference/test/runtests.jl:45
ERROR: Package IncrementalInference errored during testing
@dehann dehann added this to the v0.27.0 milestone Feb 10, 2022
@dehann dehann added the bug label Feb 10, 2022
@dehann dehann added the testing label Feb 10, 2022
@dehann dehann modified the milestones: v0.27.0, v0.0.x Feb 10, 2022
dehann added a commit that referenced this issue Feb 10, 2022
runtest sequence hides sporadic error #1482
@dehann dehann modified the milestones: v0.0.x, v0.27.0 Feb 10, 2022
@dehann dehann changed the title testDefaultDeconv sporadic test error? testDefaultDeconv test error? Feb 11, 2022
@dehann
Copy link
Member Author

dehann commented Feb 11, 2022

Happens on both JL v1.6.5 and v1.7.2. Only happens with user REPL to Pkg, call on test IncrementalInference. Does not seem to happen on CI calls.

@dehann
Copy link
Member Author

dehann commented Feb 12, 2022

Ah, figured it out. There was a FSM timeout, and my machine was throttled down. So the FSM "watchdog" uses Interrupt Exception to stop CSMs:

tree = solveTree!(fg, eliminationOrder=vo, timeout=40, verbose=true)

See

@dehann dehann changed the title testDefaultDeconv test error? unexpected/sporadic testDefaultDeconv test error? Feb 12, 2022
@dehann dehann self-assigned this Feb 12, 2022
@dehann dehann moved this from In Progress to Done in Factors/Distr all JSON (DFG18 / IIF27) MOVED Feb 12, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
No open projects
Development

Successfully merging a pull request may close this issue.

1 participant