Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

integrations: CML #91

Closed
Tracked by #1036
daavoo opened this issue Jun 15, 2021 · 34 comments · Fixed by #266
Closed
Tracked by #1036

integrations: CML #91

daavoo opened this issue Jun 15, 2021 · 34 comments · Fixed by #266
Labels

Comments

@daavoo
Copy link
Contributor

daavoo commented Jun 15, 2021

From a high level user perspective, I think it would be interesting to have some kind of integration between dvclive and CML.

Just like there is already a custom dvclive behavior when used along with DVC (i.e. generating checkpoints) maybe a similar approach could be used when dvclive is also used along with CML.

I was thinking on a use case like dvclive taking care of using cml-send-comment inside next_step if CML is detected.

Conceptually i think that it would be something similar to the existing make_checkpoint:

https://github.com/iterative/dvclive/blob/master/dvclive/metrics.py#L105

This might be just an specific use case of #90

@dberenbaum
Copy link
Collaborator

This seems like a general need for integration between cml and dvc checkpoints. Do you see any functionality as being specific to dvclive?

@daavoo daavoo added the discussion requires active participation to reach a conclusion label Jul 12, 2021
@daavoo
Copy link
Contributor Author

daavoo commented Jul 13, 2021

This seems like a general need for integration between cml and dvc checkpoints. Do you see any functionality as being specific to dvclive?

Given the discussion so far in iterative/dvc#6182, it looks like the exp push integration won't require any changes from the dvclive side.

I probably didn't express it right but I was also taking about an additional integration, which will be something like using cml-send-comment in the context of a long running experiment (i.e. deep learning) as a way to periodically report the status of the training.

From an user perspective what I would want is, by using dvclive in an environment where CML is installed, getting a P.R. comment reporting the dvclive metrics summary each time an epoch is finished (which would translate to each time dvclive.next_step is called).

This is a feature (sending the metrics to somewhere) that I have been repeatedly implementing through workarounds when using other ML Loggers and I though that ot could be nice to have it working out of the box with dvclive<>CML.

The feature probably doesn't make too much sense in the context of training a lot of short experiments and it's probably not a deal breaker for attracting new users.

@dberenbaum
Copy link
Collaborator

The feature probably doesn't make too much sense in the context of training a lot of short experiments and it's probably not a deal breaker for attracting new users.

Yes, this could be really useful, and absolutely dvclive should address issues for long-running experiments! iterative/dvc#6182 will at least make it possible to pull the latest checkpoint and see the associated metrics and plots, but we could make the experience smoother and better integrate cml.

Seems like this has 2 requirements:

  1. Generate some default report to send.
  2. Trigger cml-send-comment at checkpoints or some other point in code (should it be coupled with checkpoints/next_step?).

@shcheklein
Copy link
Member

@iterative/cml I know that you have been discussing (and even prototyping) something like this? Could you share your thoughts, outcomes?

@pared
Copy link
Contributor

pared commented Jul 14, 2021

@dberenbaum I think we could even start with pretty printing metrics to some env var that CML could pick up and show in the build log.

@casperdcl
Copy link

casperdcl commented Jul 14, 2021

I'd say generic callback support (#90) would be best. The user can use that to run cml-send-comment --update per-checkpoint.

Vaguely related (though this is automatic rather than manual): auto-push per checkpoint iterative/dvc#6182

@DavidGOrtega
Copy link

DavidGOrtega commented Jul 20, 2021

From a high level user perspective, I think it would be interesting to have some kind of integration between dvclive and CML.

That would be awesome. I think that the best integration would be if dvclive would work as tensorboard. Something that you can launch and give you a url in return. We wrapped that under cml-tensorboard-dev.

@iterative/cml I know that you have been discussing (and even prototyping) something like this? Could you share your thoughts, outcomes?

I implemented something like that in cml-publish. The idea was to initiate a daemon when you used cml-publish to watch for file changes of the file that you published with cml-publish. If the daemon see a new change republishes the file again in the same url. It works like badges, so the images in the report change. The daemon was a very simple piece of software using chokidar.

@daavoo daavoo added the A: frameworks Area: ML Framework integration label Jul 20, 2021
@dberenbaum
Copy link
Collaborator

That sounds cool, @DavidGOrtega! Definitely cleaner than a separate report for each checkpoint. In this case, can dvclive republish to the same url at each step without needing a daemon?

@DavidGOrtega
Copy link

DavidGOrtega commented Jul 20, 2021

Yes, it could. Something that actually @pared also liked when I shared the idea in general and gave the same feedback.
Its not implemented in CML (yet), its a local experiment like the extension that we did

@daavoo
Copy link
Contributor Author

daavoo commented Feb 3, 2022

I'm currently working on what I consider the minimum viable integration between DVCLive and CML


The current prototype doesn't depend on DVC (or DVC checkpoints) but still works with both.

It is focused on using cml send-comment and cml publish to share the progress of a training stage while the python process it's being run.

  • If CML is installed, on each step update (i.e. next_step()), DVCLive generates a markdown report using the metrics already logged.
  • The report includes a table (uses tabulate) and rendered images for the plots (uses matplotlib and cml publish).
  • Calls cml send-comment once the report is ready.

It currently uses subprocess.run to call cml commands.

It only works for posting multiple comments (1 per step) because of iterative/cml#888
I still have to try to set up a workflow supporting "P.R. comments" instead of "commit comments".


@iterative/cml I would like to hear some thoughts about the downsides of raw subprocess.run v.s. potential python bindings to see how "blocker" those bindings are.

@DavidGOrtega
Copy link

@daavoo I can't follow up the links.

I did a wrapper for python called CMLpy. Basically it exposes all the CML commands.
I just only need to do the installer. We should sync. I'm going to send you an invite

@daavoo
Copy link
Contributor Author

daavoo commented Feb 3, 2022

@daavoo I can't follow up the links.

Updated the permissions

@dberenbaum
Copy link
Collaborator

@daavoo Could you give me permission, too 🙏 ?

@daavoo
Copy link
Contributor Author

daavoo commented Feb 3, 2022

@daavoo Could you give me permission, too 🙏 ?

Done

@dberenbaum
Copy link
Collaborator

Looks great! Should report be set by config file/environment variable? It seems better to not have to touch the code to toggle this option (similar to html).

@DavidGOrtega
Copy link

@dberenbaum @daavoo I have invited you to pycml we should invite everyone

@dberenbaum
Copy link
Collaborator

I'd say generic callback support (#90) would be best. The user can use that to run cml-send-comment --update per-checkpoint.

Vaguely related (though this is automatic rather than manual): auto-push per checkpoint iterative/dvc#6182

What about this? Can dvclive have some built-in CML callback but also allow users to add their own?

@dberenbaum
Copy link
Collaborator

@daavoo:

I'd say the important requirements from the dvclive user perspective here are:

  1. This should be as close as possible to the same report that gets generated locally in html when running a script with dvclive. Since one is in html and the other in markdown, they can't be identical, but it should seem like the same report whether generated locally or via cml.
  2. I like the metrics table, but this means we should try to find a way to include it in the local html also.
  3. Minimal extra dependencies, and whatever is needed is installed on the default cml image.

@DavidGOrtega:

Base on your earlier comments about how to integrate, I'm wondering if it would be possible to reverse this integration? Could cml host this report, watch for changes to it, and update the comment accordingly?

@DavidGOrtega
Copy link

Base on your earlier comments about how to integrate, I'm wondering if it would be possible to reverse this integration? Could cml host this report, watch for changes to it, and update the comment accordingly?

There are some quirks @dberenbaum the biggest issue is the GH image cache. While redoing the log in every epoch using send-comment --update is right now solved and similar to other logging platform

@daavoo
Copy link
Contributor Author

daavoo commented Feb 10, 2022

What about this? Can dvclive have some built-in CML callback but also allow users to add their own?

It makes sense to me to offer flexibility to the users but only with the limited scope of sharing the report at the end of each step.

Given that DVCLive integrations are already a callback in the ML Frameworks, it feels over-complicated to support custom callbacks inside Live.

@dberenbaum
Copy link
Collaborator

It makes sense to me to offer flexibility to the users but only with the limited scope of sharing the report at the end of each step.

Given that DVCLive integrations are already a callback in the ML Frameworks, it feels over-complicated to support custom callbacks inside Live.

Good point. The problem here and in other issues like #90 and #204 is that users can already add similar callbacks directly in their model training code, so integrations/callbacks only add value if we are using the info specifically captured by DVCLive.

@dberenbaum
Copy link
Collaborator

I mentioned this to @0x2b3bfa0, but I wanted to post it in the issue to get more feedback. Happy to open an issue in CML, but wanted to discuss here first.

So to follow up on your thoughts about CML-DVCLive integration, it seems possible to have a more modular approach to this instead of needing a Python API if:

  1. There is some option like cml send-comment --publish that publishes all referenced images from the markdown and converts the markdown to reference the published images.
  2. There is some file watcher daemon like cml send-comment --watch that performs the action whenever the file is updated.

With those options, something like cml send-comment --publish --watch dvclivereport.md could intermittently send comments whenever DVCLive updates the report.

@0x2b3bfa0
Copy link
Member

0x2b3bfa0 commented May 6, 2022

There is some option like cml send-comment --publish that publishes all referenced images from the markdown and converts the markdown to reference the published images.

Sounds useful. In fact, this is probably the only use case for cml publish (?) 👍🏼

There is some file watcher daemon like cml send-comment --watch that performs the action whenever the file is updated.

The devil is in the [implementation] details. Synchronizing API request latency and rate limits sounds risky. See for instance the file–based API used by dvc exp checkpointing for a sample of the involved challenges.

@casperdcl
Copy link

casperdcl commented May 6, 2022

cml send-comment --publish

I think you mean a new feature request like cml publish --watch <plot.png> (i.e. static URL), constantly updated image.

cml send-comment --watch <report.md>

would technically need a subprocess file-watcher that also checked if the file was mid-way through being updated. Implementation is a bit advanced, but it's definitely great for end-user UX (single CLI flag, no API code).

@0x2b3bfa0
Copy link
Member

I think you mean a new feature request like cml publish --watch <plot.png> (i.e. static URL), constantly updated image.

I think he doesn't quite mean that. GitHub caches images; updating them implies replacing the link by a different one.

@0x2b3bfa0
Copy link
Member

would technically need a subprocess file-watcher that also checked if the file was mid-way through being updated. Implementation is a bit advanced, but it's definitely great for end-user UX (single CLI flag, no API code).

Sounds interesting, but would require users to run something in the background 🤔

@0x2b3bfa0
Copy link
Member

1598352521_128552_1598354440_sumario_normal

@dberenbaum
Copy link
Collaborator

I think you mean a new feature request like cml publish --watch <plot.png> (i.e. static URL), constantly updated image.

I think he doesn't quite mean that. GitHub caches images; updating them implies replacing the link by a different one.

Yeah, I don't think it's necessary to update in place. Forgetting about updates/watchers, I meant for cml send-comment --publish myreport.md to:

  1. Parse myreport.md for references to local images.
  2. Publish each of those images.
  3. Replace the references in myreport.md to point to the published URLs.
  4. Do the usual send-comment stuff.

It's about converting a locally useful markdown report into one ready to be published via cml send-comment.

If cml send-comment --watch is out of scope, having cml send-comment publish still provides a way that DVCLive (or anything else) can produce a generic local markdown report with images and easily put the report into a GH comment.

Again, happy to open an issue in CML for ideas that don't sound too crazy.

@casperdcl
Copy link

casperdcl commented May 7, 2022

Ah. That's yet another use case. Trying to keep track of ideas:

  1. report per-epoch metrics
  2. publish per-epoch plots
  3. auto-parse, publish & substitute local markdown links (is that a common use case? Would also seriously bump priority of Self-host https://assets.cml.dev cml#291)

or is there something else?

@dberenbaum
Copy link
Collaborator

I see it all as one use case: publish per-epoch reports (see #90). The typical DVCLive report includes metrics and plots, and it would be great to publish the entire report as a CML comment.

To achieve CML integration, DVCLive can either call CML directly, or it can generate a markdown report that can be useful anywhere (locally, in CML, and possibly for publishing elsewhere). If DVCLive goes with the option to just generate markdown, we need some way to convert from local markdown file to something that can be published in CML.

I also see it as potentially useful for CML generally. Instead of:

          cat metrics.txt >> report.md
          cml-publish plot.png --md >> report.md
          cml-send-comment report.md

You can do:

          cat metrics.txt >> report.md
          cat plot.png >> report.md
          cml-send-comment --publish report.md

This makes workflows less dependent on CML-specific commands, and it enables users to do things like have Python scripts that generate complex markdown and then publish them as comments with one command.

@0x2b3bfa0
Copy link
Member

See also iterative/cml#1026

@casperdcl
Copy link

Better yet, see iterative/cml#1036 :)

@casperdcl casperdcl added enhancement feature request and removed discussion requires active participation to reach a conclusion labels Jul 1, 2022
@casperdcl
Copy link

casperdcl commented Jul 1, 2022

Just to confirm after discussions:

are not needed for this integration. Instead, the proposal iterative/cml#1036 is

  • DVCLive option to write markdown (instead of html)
## current
while true; do
  if [[ -f report.md ]]; then
    for img in $(extract_images --file=report.md); do
      sed -i s/$img/$(cml publish $img)/ report.md
    done
    cml send-comment --update report.md
  fi
  sleep 5
done &
python train.py --per-epoch-report=report.md

## proposed
cml report watch --publish report.md &
python train.py --per-step-dvclive-output=report.md

## dvc way, no code modification, only dvc exp params
cml report watch --publish report.md &
dvc exp run -S dvclive_output=report.md

@daavoo can you add this option (markdown instead of html) to DVCLive?
Note that report.md can contain local images ![](./image.png) - CML will automagically publish them :) so it looks like we only need a small subset of #215

@daavoo
Copy link
Contributor Author

daavoo commented Jul 1, 2022

@daavoo can you add this option (markdown instead of html) to DVCLive?

Will do. Have to work on some pre-requisites iterative/dvc-render#62

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants