Some Viz Calculations Being Dropped #11882

jdunkerley · 2024-12-16T17:07:47Z

Reading a 1mm row Excel file, the visualization is not always rendered.
Interrupted exception also shows up (but thats another issue).

from Standard.Base import all
from Standard.Table import all
import Standard.Visualization

main =
    node1 = Data.read 'C:\\Users\\jdunk\\Downloads\\1m.xlsx' ..Sheet
    integer1 = node1.row_count
    table1 = node1.column_info

https://docs.google.com/spreadsheets/d/1hGx9W7qeIUvBvJMlR-a2fy6k_TtRFh_f/edit?usp=sharing&ouid=105802918667196204721&rtpof=true&sd=true

Logs:
enso-project-manager-2024-12-16.0.zip

The text was updated successfully, but these errors were encountered:

hubertp · 2024-12-18T11:05:07Z

I can't seem to reproduce that problem. I would need to see it live, I'm afraid. Also, the exchanges between GUI and LS would be much appreciated.

hubertp · 2024-12-18T14:23:59Z

I think I got it now and also got a bit more logs.
I believe what happens is:

Computation gets interrupted and visualizations get the missing method on exception, as reported in InterruptException/ClosedByInterruptException are still showing up in long running computations #11896
That value becomes a bit sticky and somehow expression doesn't get updated
Eventually, visualization will get calculated (not really dropped despite the title of this ticket) but it may take a reaaaaaly long time until that happens due to numerous interrupts

enso-bot · 2024-12-19T10:17:16Z

Hubert Plociniczak reports a new STANDUP for the provided date (2024-12-17):

Progress: Investigating vis issues, can't seem to reproduce the main problem but created #11896 for a related issue. Continued doing small tweaks to native image - needs testing with electron. It should be finished by 2024-12-19.

Next Day: Next day I will be working on the #11896 task. Continue investigating the issue

hubertp · 2024-12-20T11:25:11Z

Just to prove my point, I can see execution of visualizations being done once

[org.apache.poi.openxml4j.opc.OPCPackage] The close() method is intended to SAVE a package. This package is open in READ ONLY mode, use the revert() method instead!

is being reported in the logs.

Note that:
a) in IDE it seems that table visualization of the read Excel file is already being shown for a while so it seems as if the computation hung
b) nothing is being reported in the logs (apart from usual ping/pong) for over 2 minutes and suddenly boom, The close() method.... method appears and execution of the visualization is finished.

hubertp · 2024-12-23T16:32:30Z

Another important note:
IDE issues a number of Standard.Visualization.Preprocessor.default_preprocessor requests on startup. While of only limited use, it provides a glimpse at the data. In the above example the default preprocessor will show first 10 results. Currently it seems that data is simply ignored by IDE and a user stares at a blank visualization until full results arrive (2 minutes later?). Similarly row_count - the results are available early on but they are not being displayed until the second visualization comes: Standard.Visualization.Table.Visualization.prepare_visualization.

enso-bot · 2024-12-23T22:15:54Z

Hubert Plociniczak reports a new STANDUP for the provided date (2024-12-20):

Progress: Back to investigating slow visualizations after fixing #11896. Definitely seeing odd behavior when reading large files. Trying to deploy self-hosted runners for Bazel integration but blocked by permissions problem. It should be finished by 2024-12-23.

Next Day: Next day I will be working on the #11882 task. Continue investigating the issue.

hubertp · 2024-12-27T17:18:53Z

There isn't much that backend can do about UI not using the partial visualization results (will raise as a separate topic to GUI team).
But I noticed one important thing: in the presence of interruptions it may well be that we perform the same execution of expressions more than once, despite the presence of caches. This is due to the logic that runs cache invalidation when upserting the visualization. So when running long computations, we artificially may run them longer than necessary which makes things much worse.

The scenario is as follows:

A long running execution is triggered and interrupted, meaning that the Data.read expression is not cached
Re-trigger of that job is started and is well pass runtime hooks part
Visualization upsert is triggered, and it sees that the expression is not cached so it triggers cache invalidation
Execute job finishes, and caches the result of the long running expression
Further execute job is triggered, this time for the visualization. It runs runtime hooks that clean the cache.
We have to start Data.read again..

enso-bot · 2024-12-30T19:52:39Z

Hubert Plociniczak reports a new STANDUP for the provided date (2024-12-27):

Progress: Cache invalidation done when upserting visualizations has a negative effect on long-running computations. There is a race between caching the initial result and immediately reseting it due to a schedule visualization job. That may significantly delay startup. There doesn't appear to be an easy way to disable it without loosing existing functionality. Need to weight on pros and cons. It should be finished by 2024-12-27.

Next Day: Next day I will be working on the #11882 task. Continue investigating the issue.

enso-bot · 2024-12-30T19:59:13Z

Hubert Plociniczak reports a new STANDUP for the provided date (2024-12-23):

Progress: Continued investigating odd visualization problems. Surprisingly backend does send partial results to GUI with default_processor but the results aren't being shown in IDE. Additionally it looks like some computations are being done unnecessairly when interrupt is triggered. Needs more investigation after a break. It should be finished by 2024-12-27.

Next Day: Next day I will be working on the #11882 task. Continue investigating the issue.

enso-bot · 2025-01-01T22:55:19Z

Hubert Plociniczak reports a new STANDUP for the provided date (2024-12-30):

Progress: Prepared a PR that disables cache invalidation when upserting visualization, at the cost of not being able to attach one to subexpression. This will need to be revisited with observables approach. Back to native image generation for Language Server. Got weird classpath failures when including database jars that need investigating. It should be finished by 2024-12-30.

Next Day: Next day I will be working on the #11882 task. Continue on native image, address PR review.

The possibility of attaching visualizations to subexpressions meant that we (currently) have to invalidate caches for their parent expression every time a request comes. That was an acceptable cost when an expression is relatively fast to compute but unacceptable when dealing with slow computations that would have to be repeated. Since currently attaching visualizations is not used at all and we can rely on caching RHS and self, it is _safe_ to disable it. An observable pattern is better suited for visualizations and would mitigate this problem by design, something that we planned for a while in #10525 Should help with long running visualizations in #11882. Partial visualization results should be addressed on GUI side. # Important Notes For the example in #11882 we would at least re-read the large file at least twice, which adds 40-60seconds to the overall startup. Exchanges before the change: ![Screenshot from 2024-12-27 16-52-46](https://github.com/user-attachments/assets/63e7a6db-73f8-48dd-a24a-a44e197e4ee6) Responses after the change: ![Screenshot from 2024-12-30 12-18-28](https://github.com/user-attachments/assets/08020b1c-58f0-4c0f-b06d-1d904373b946) Results in the same (final) data: ![Screenshot from 2024-12-30 12-24-02](https://github.com/user-attachments/assets/280f6ef5-6691-4744-b67a-2a0898f55b8b)

github-project-automation bot added this to Issues Board Dec 16, 2024

github-project-automation bot moved this to ❓New in Issues Board Dec 16, 2024

jdunkerley added --bug Type: bug -language-server labels Dec 16, 2024

JaroslavTulach assigned hubertp Dec 17, 2024

hubertp mentioned this issue Dec 17, 2024

InterruptException/ClosedByInterruptException are still showing up in long running computations #11896

Closed

enso-bot bot mentioned this issue Dec 19, 2024

Slow openFile request may disrupt initial loading of a project #11721

Open

hubertp mentioned this issue Dec 30, 2024

Disable visualizations for subexpressions #11949

Merged

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Some Viz Calculations Being Dropped #11882

Some Viz Calculations Being Dropped #11882

jdunkerley commented Dec 16, 2024 •

edited

Loading

hubertp commented Dec 18, 2024

hubertp commented Dec 18, 2024

enso-bot bot commented Dec 19, 2024

hubertp commented Dec 20, 2024

hubertp commented Dec 23, 2024 •

edited

Loading

enso-bot bot commented Dec 23, 2024

hubertp commented Dec 27, 2024

enso-bot bot commented Dec 30, 2024

enso-bot bot commented Dec 30, 2024

enso-bot bot commented Jan 1, 2025

Some Viz Calculations Being Dropped #11882

Some Viz Calculations Being Dropped #11882

Comments

jdunkerley commented Dec 16, 2024 • edited Loading

hubertp commented Dec 18, 2024

hubertp commented Dec 18, 2024

enso-bot bot commented Dec 19, 2024

hubertp commented Dec 20, 2024

hubertp commented Dec 23, 2024 • edited Loading

enso-bot bot commented Dec 23, 2024

hubertp commented Dec 27, 2024

enso-bot bot commented Dec 30, 2024

enso-bot bot commented Dec 30, 2024

enso-bot bot commented Jan 1, 2025

jdunkerley commented Dec 16, 2024 •

edited

Loading

hubertp commented Dec 23, 2024 •

edited

Loading