CLI: Enable Live Debugging #849

Open
josephjclark opened this issue Jan 3, 2025 · 3 comments


@josephjclark
Collaborator

It would be incredibly useful to be able to drop a breakpoint in the CLI and pause execution for debugging. Something like:

openfn workflow.json --debug

Note that if we add a --debug flag, then we can't use "shortcut" log levels like --info or --debug (which is something I was going to do a while back but slipped off the agenda).

Node supports this natively through its inspector, which the Chromium DevTools debugger can attach to, so from one perspective it's not a huge deal.

I think there are a few big questions here:

1. How and when does the user insert a breakpoint?

This is a problem particularly when running a workflow.

The easiest starting point would be to just break at the start of every step. Easy.

The command line argument could take a job name and line number to break on that line (although this will rarely make sense, see point 2 below). For example, to break on line 22 of the get-data job:

openfn workflow.json --debug get-data:22

You could pass multiple breakpoints.
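To make the `job:line` idea concrete, here is a rough sketch of how the values might be parsed. Nothing like this exists in the CLI today: `parseBreakpoints` and the `{ step, line }` shape are made-up names, and treating a bare step name as "break at the start of that step" is my own assumption.

```javascript
// Hypothetical helper, not part of the CLI.
// Parses values such as "get-data:22" into { step, line } pairs.
function parseBreakpoints(values) {
  return values.map((value) => {
    const idx = value.lastIndexOf(':');

    // No colon: assume the user wants to break at the start of the step
    if (idx === -1) {
      return { step: value, line: null };
    }

    const step = value.slice(0, idx);
    const lineText = value.slice(idx + 1);

    // Only accept a non-empty step name and a positive integer line number
    if (!step || !/^[1-9]\d*$/.test(lineText)) {
      throw new Error(`Invalid breakpoint "${value}", expected <step>:<line>`);
    }

    return { step, line: Number(lineText) };
  });
}

console.log(parseBreakpoints(['get-data:22', 'transform']));
```

Splitting on the last colon keeps step names that contain a colon working, at the cost of a slightly less obvious parse.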

But this is all pretty clumsy and awkward. In VSC I just click a line number and the code will break there.

The user could also add debugger statements to the job code. But part of the benefit of debugging is that you don't need to modify your source - if we rely on debugger statements then this is no longer true.

2. How do we handle sourcemapping?

Chromium devtools support debugging with sourcemaps, and the sourcemapping work is well under way.

But there's one big gotcha: job code doesn't really execute. Putting a breakpoint on the get() function on line 2 is meaningless: the get() function is just a constructor for a function which gets added to a pipeline and executed later. All the top-level job code gets run synchronously at initialisation time. So if you want to debug that line of code and see the top-level state of the program at that time, well, you sort of can't.
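A stripped-down illustration of the problem. This is not the real adaptor or runtime code: `get`, `pipeline` and `run` here are simplified stand-ins written only to show the two phases.

```javascript
// Simplified stand-ins for an adaptor and the runtime (not the real code)
const pipeline = [];

// An operation like get() is a factory: calling it only builds a function
function get(url) {
  return (state) => ({ ...state, data: `response from ${url}` });
}

// Top-level job code. This runs synchronously at initialisation time,
// so a breakpoint on this line fires before any state exists.
pipeline.push(get('/patients'));

// Later, the runtime walks the pipeline. This is where the work happens
// and where state is actually available.
async function run(initialState) {
  let state = initialState;
  for (const operation of pipeline) {
    state = await operation(state);
  }
  return state;
}

run({}).then((state) => console.log(state));
```

A breakpoint on the `pipeline.push(...)` line only ever sees the factory being called; the returned function runs later, inside `run`.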

Breakpoints DO work well in callback code because that's all called within the pipeline, and we can sourcemap to it. It's just the operations that won't work.

Could we wrap the operations in some meaningful way to make them debuggable? You'd have to use the compiler to defer the creation of the factory function, but then also immediately execute it within the pipeline when the breakpoint is triggered. And the debug code stack will look nothing like the original source.
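Roughly what I have in mind for the wrapping, as a sketch only. `defer`, the `position` argument and the `breakpoints` set are all invented for illustration, and the "compiled" line is hand-written, not real compiler output.

```javascript
// Lines the user asked to break on (hypothetical, would come from the CLI)
const breakpoints = new Set([2]);

// Simplified stand-in for an adaptor operation
function get(url) {
  return (state) => ({ ...state, data: url });
}

// Hypothetical wrapper the compiler could emit around each operation.
// The factory is not called until the pipeline reaches this step.
function defer(factory, position) {
  return function deferredOperation(state) {
    if (breakpoints.has(position.line)) {
      // Pauses only when a debugger is attached; otherwise a no-op
      debugger;
    }
    // Create the operation and run it immediately, with live state
    const operation = factory();
    return operation(state);
  };
}

// Original job code, line 2:   get('/patients')
// Hand-written version of what the compiler might produce:
const step = defer(() => get('/patients'), { line: 2 });

console.log(step({ configuration: {} }));
```

The stack at the pause point is still `deferredOperation` inside the runtime, which is the "looks nothing like the original source" problem.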

The way our adaptor functions work actually makes debugging and sourcemapping really hard. All the job code is just factory instantiation and initialisation - nothing actually runs. This is definitely something to consider if we radically overhaul the runtime in the next couple of years.

3. Debugging in VSC?

Adding VSC support and allowing debugging from there makes it much easier to add breakpoints. But the sourcemapping problem remains, and VSC support is a different story.

@github-project-automation github-project-automation bot moved this to New Issues in v2 Jan 3, 2025
@doc-han
Collaborator

doc-han commented Jan 3, 2025

This is something I was looking at. Our VS Code extension will also need this functionality when implementing DAP (Debug Adapter Protocol).

@josephjclark
Collaborator Author

Another thing making debugging harder is the fact that the CLI uses a child process for execution. Adding --inspect-brk to the CLI bin command itself doesn't help you - you have to push it down to the child process. I don't know how easy or difficult that will be - presumably we just delegate the debug argument to the child process.

The only reason we use a child process is to mask and default a bunch of node VM args. If we can find a better way to handle that, and if delegating the debugger proves to be difficult, then I'm open to running the CLI in the main process.
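For reference, delegating the flag could look something like this. `fork` and its `execArgv` option are standard Node; `buildExecArgv`, `runInChild`, the `options.debug` shape and the specific default flag are assumptions for the sketch, not what the CLI actually does today.

```javascript
const { fork } = require('node:child_process');

// Hypothetical: work out the node args for the child process.
// The default flag below is a placeholder for whatever VM args the CLI sets.
function buildExecArgv(options) {
  const execArgv = ['--experimental-vm-modules'];
  if (options.debug) {
    // Start the inspector in the child and pause before user code runs
    execArgv.push('--inspect-brk');
  }
  return execArgv;
}

// Hypothetical: spawn the real CLI entry point with those args.
// Passing execArgv replaces the default, which is the parent's process.execArgv.
function runInChild(scriptPath, args, options) {
  return fork(scriptPath, args, { execArgv: buildExecArgv(options) });
}

console.log(buildExecArgv({ debug: true }));
```

The debugger then attaches to the child's inspector port rather than the parent's.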

@josephjclark
Collaborator Author

Hang on, we must be in control of this?

Sourcemapping is basically arbitrary. When the nth operation is being executed, we should be able to sourcemap in the debugger to wherever we want. We can pretend we're in job code line 2, for example, even though we're actually in runtime.dist.js line 19914. Like it doesn't matter.

I'm not sure about HOW we do this, but there must be some way to wrap each operation in the pipeline and map it back to the right line of source in the original job code. And declare this in a way that the debugger understands.
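One knob that definitely exists: Node's `vm.Script` accepts `filename` and `lineOffset` options, and V8 reports positions using them. This is only a sketch of the "pretend we're on line 2" idea, not a proposal for how the runtime should do it; the `compiled` string and the `job.js` name are made up, and real sourcemaps would still be needed for anything non-trivial.

```javascript
const vm = require('node:vm');

// Stand-in for one compiled operation. In reality this code would sit
// somewhere deep inside the bundled runtime.
const compiled =
  '(function operation() { return new Error("probe").stack; })()';

// Tell V8 this code lives in job.js, starting one line down from the top
const script = new vm.Script(compiled, {
  filename: 'job.js',
  lineOffset: 1, // offset added to line numbers, so the first line reports as line 2
});

// The stack frame for `operation` now points at job.js, not the runtime
const stack = script.runInThisContext();
console.log(stack.split('\n')[1]);
```

So positions reported to stack traces and the debugger are under our control, at least at the level of whole scripts.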
