-
Notifications
You must be signed in to change notification settings - Fork 343
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Micro #67
Micro #67
Conversation
@dmpetrov please consider what to be done with the code removal and or docs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great job @DavidGOrtega! This design is much better than the monolith one.
Comments are inline.
README.md
Outdated
test -f requirements.txt && pip3 install -r requirements.txt | ||
# Run report: | ||
dvc_cml_run | ||
cml-setup |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We agreed to extract this code to bash level.
README.md
Outdated
# Run report: | ||
dvc_cml_run | ||
cml-setup | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please replace on: [push, pull_request]
to one event. push
is the best - we discussed this on slack.
README.md
Outdated
cml-setup | ||
|
||
apt-get update && apt-get install -y python-pip && pip install --upgrade pip | ||
pip install tensorflow wget |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should go through the requirements file
test -f requirements.txt && pip3 install -r requirements.txt
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, this is only my project
README.md
Outdated
dvc_cml_run | ||
cml-setup | ||
|
||
apt-get update && apt-get install -y python-pip && pip install --upgrade pip |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not python3? py2 was deprecated already. We should force the best practices :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
its just my project, I have to update wiki with the new approach
README.md
Outdated
pip install tensorflow wget | ||
|
||
dvc pull -f | ||
dvc repro train.dvc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the docs we should use the most general case. So, dvc repro
is enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure how, in the example it should be two steps train and eval. In fact in my example its wrong should be eval to run the whole pipeline. So the two steps pipeline should be train.dvc and Dvcfile?
src/settings.js
Outdated
const DVC_TAG_PREFIX = ''; | ||
const CI_SKIP_MESSAGE = '[ci skip]'; | ||
const METRICS_FORMAT = '0[.][0000000]'; | ||
const BASELINE = 'origin/master'; | ||
const REPRO_TARGETS = ['Dvcfile']; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not needed anymore
src/settings.js
Outdated
const DVC_TAG_PREFIX = ''; | ||
const CI_SKIP_MESSAGE = '[ci skip]'; | ||
const METRICS_FORMAT = '0[.][0000000]'; | ||
const BASELINE = 'origin/master'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To the script level
const DVC_TAG_PREFIX = ''; | ||
const CI_SKIP_MESSAGE = '[ci skip]'; | ||
const METRICS_FORMAT = '0[.][0000000]'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CI_SKIP_MESSAGE and METRICS_FORMAT - to the command option
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we are not dooing skip anymore right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes
src/settings.js
Outdated
const DVC_TAG_PREFIX = ''; | ||
const CI_SKIP_MESSAGE = '[ci skip]'; | ||
const METRICS_FORMAT = '0[.][0000000]'; | ||
const BASELINE = 'origin/master'; | ||
const REPRO_TARGETS = ['Dvcfile']; | ||
const A_REV = BASELINE; | ||
const B_REV = 'HEAD'; | ||
const INPUT_SKIP = '-'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A_REV, B_REV, INPUT_SKIP - all on the script level
src/settings.js
Outdated
const DVC_TAG_PREFIX = ''; | ||
const CI_SKIP_MESSAGE = '[ci skip]'; | ||
const METRICS_FORMAT = '0[.][0000000]'; | ||
const BASELINE = 'origin/master'; | ||
const REPRO_TARGETS = ['Dvcfile']; | ||
const A_REV = BASELINE; | ||
const B_REV = 'HEAD'; | ||
const INPUT_SKIP = '-'; | ||
|
||
exports.DVC_TITLE = DVC_TITLE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Script level
Settings is actually not used by the new architecture but the old one |
Right. This is why we need to remove it. |
@dmpetrov Im going to remove the dvc-cml code in another PR if not this PR would be huge. I have included the send report as comment in gitlab since tags were broken. Not broken but you needed to do lots of stuff manually that differs a lot with GH. The work can be tested with the container at: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good improvements on the comment and cleaning code. But the major issues still need to be addressed.
src/cml.js
Outdated
@@ -1,17 +1,17 @@ | |||
const print = console.log; | |||
// console.log = console.error; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Commented code? Please remove all the commented code without a good reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is outdated... The code that I committed don't have it, maybe an issue in GH UI?
dvc_cml_run | ||
|
||
# needed to be able to do dvc metrics and dvc diff | ||
git fetch --prune --unshallow |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we make it a part of actions/checkout@v2
? I hope it supports all these options.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could do as stated by them:
- uses: actions/checkout@v2
- run: |
git fetch --prune --unshallow
but I think is easier to loose the track. What do you think? should I move it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found it in their docs. yep, it looks good, let's keep it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have understood to keep it in our side and not put it in the checkout action. Also one point to remember is the gitlab similarity. GL does not have that action
README.md
Outdated
dvc push | ||
|
||
echo "# CML report" >> report.md | ||
dvc metrics diff --show-json origin/master HEAD | cml-metrics >> report.md |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason to not to extract BASELINE
as a variable?
bin/cml-files.js
Outdated
|
||
const { metrics_args, diff_run, error_handler } = require('../src/cml'); | ||
const opts = metrics_args(); | ||
diff_run(opts).catch(e => error_handler(e)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All the code needs to be extracted from scm/cml
(except common libs).
Each command is supposed to be as independent as possible. It should simplify command replacement for users.
bin/cml-metrics.js
Outdated
|
||
const { metrics_args, metrics_diff_run, error_handler } = require('../src/cml'); | ||
const opts = metrics_args(); | ||
metrics_diff_run(opts).catch(e => error_handler(e)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same. Please move all then needed code from src/cml
to here.
bin/cml-send-report.js
Outdated
} = require('../src/cml'); | ||
|
||
const opts = send_report_args(); | ||
send_report_run(opts).catch(e => error_handler(e)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same.
bin/cml-setup-env-remote.js
Outdated
@@ -0,0 +1,5 @@ | |||
#!/usr/bin/env node | |||
|
|||
const { setup_env_remote, error_handler } = require('../src/cml'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same
@@ -0,0 +1,36 @@ | |||
const fs = require('fs'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still don't understand why do we need this module. We should just read stdin.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
read the buffer, stdin buffer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do you need a buffer? you can read everything at once. if it fails - totally fine since our limit is way below any machine memory restriction.
I assume there is a 1-3 line of JS code that can do this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume there is a 1-3 line of JS code that can do this.
No, I dont think so, code was much simpler before but with DVC output breaks. I can try a library like https://www.npmjs.com/package/get-stdin
but the code is prettty similar to mine using buffer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok. let's keep it for now
Whats the benefit? You are setting it on there or in the variable all that its in the yaml file. I dont see the difference and its worse since we handle code that its not needed. Maybe Im losing the point |
The point:
|
OK, in the bash or in the ENV?
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great change @DavidGOrtega ✨
One question inline - please answer
README.md
Outdated
dvc metrics diff --show-json origin/master HEAD | cml-metrics >> report.md | ||
dvc diff --show-json origin/master HEAD | cml-files >> report.md | ||
dvc metrics diff --show-json "$BASELINE" HEAD | cml-metrics >> report.md | ||
dvc diff --show-json "$BASELINE" HEAD | cml-files >> report.md |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you need to compare it to HEAD? It means the results need to be committed.
Based on my understanding it should look like dvc diff --show-json "$BASELINE" | ...
.
Could you please make sure which approach is correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will check
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you are right
Introduces the new architecture.
Still to do before merge:
I have uploaded an image with this work to be tested:
dvcorg/dvc-cml:micro
with this workflow: