ResNet CIFAR 10 generates scalar data faster #154
Merged
djarpin approved these changes on Dec 23, 2017
@@ -137,7 +138,8 @@
 "\n",
 "It takes a few minutes to provision containers and start the training job.**TensorBoard** will start to display metrics shortly after that.\n",
 "\n",
-"You can access **TensorBoard** locally at [http://localhost:6006](http://localhost:6006) or using your SageMaker notebook instance [proxy/6006/](/proxy/6006/)(TensorBoard will not work if forget to put the slash, '/', in end of the url). If TensorBoard started on a different port, adjust these URLs to match."
+"You can access **TensorBoard** locally at [http://localhost:6006](http://localhost:6006) or using your SageMaker notebook instance [proxy/6006/](/proxy/6006/)(TensorBoard will not work if forget to put the slash, '/', in end of the url). If TensorBoard started on a different port, adjust these URLs to match.",
+"This example uses the optional hyperparameter **```min_eval_frequency```** to generate training evaluations more often, allowing to visualize **TensorBoard** scalar data faster. You can find the available optional hyperparameters [here](https://github.com/aws/sagemaker-python-sdk#optional-hyperparameters)**."
Are the trailing "**" at the end of line 142 intentional? It's fine either way, but thought it might just be imbalanced markdown bolding.
atqy pushed a commit to atqy/amazon-sagemaker-examples that referenced this pull request on Aug 16, 2022:
* Make outdir optional arg, use default path in sagemaker environment, also change temp location when writing local files
* remove is_s3 import
* add tests and fix case when / is at the front of filepath
* add comments
* change to .tmp suffix
* update testing script to take a tag
atqy pushed a commit to atqy/amazon-sagemaker-examples that referenced this pull request on Aug 16, 2022:
* Add custom rule * updated notebook * Add rule script * Expanding rule monitoring section and improving BYOR notebook (aws#180) * Adding sagemaker example notebook * Remocing unused training script * Tornasole hook from config json (aws#104) * creating tornasole hook from config * making a quick variance fix (aws#99) * Adding the change to convert ndarray to np.ndarray when operator is not available in mxnet. * Cleanup and tests for TF and mxnet * remove rmtree from s3 test * Fixed the function invocation of get_numpy_reduction * Changes to read from hardcoded path * fixing pytorch test * Setting SaveConfig per mode (aws#94) * add doc for passing saveconfig specific to modes * add save config for collection * Create an option to build tornasole with no framework, TORNASOLE_FOR_RULES=1 (aws#95) * add option to build only for rules * Adding support to set save config per mode through json, also copying load collection method to all frameworks as that was missed * remove set -ex from tests script since it prevents upload of reports * move json config out of hooks * Adding tests to create hook from tornasole configs for pytorch * Change link of latest tornasole binaries (aws#120) * change link to binary and introduce latest * make container scripts working again * remove -U * fix path to ts binary in docker * log when single process is to stdout * Addressed the review comments. Added the correct asserts to check the reduction values. Added the test to test the training mode. * Setup versioning (aws#119) * added _verion.py and support * fixed __init__.py * Improve PR template (aws#128) * Setup versioning (aws#134) * added _verion.py and support * fixed __init__.py * using PEP 440 standard versioning it. * Json Config Hook Tests (aws#129) * added json config hook tests * Add LossNotDecreasing rule and change how required tensors API works (aws#126) * add loss rule and tests. refactoring rules api. 
* Adding mxnet tests for hook_from_json (aws#143) * Adding config file for reduce and save_all test scripts * Fixing bug in mxnet reduction util sloved issue aws#142 * Update build script for PT container - modified S3 path to pick up from PT folder - added parameter to enable installation of sagemaker_pytorch_container.whl into image * mode writer support (aws#144) * Add sagemaker docs and notebooks (aws#133) * Changing link of latest binaries for 0.3 (aws#122) * change link to binary and introduce latest * make container scripts working again * remove -U * fix path to ts binary in docker * log when single process is to stdout * uploaded sagemaker docs update analysis docs remove sagemaker docs update TF doc add sagemaker docs update api docs change link for rules binary add files from s3 bucket * refactor positions * minor changes * fix links in old examples * fix paths in integration tests * Update test_training_end.py * Update test_training_end.py * Update integration_testing_rules.py * bring back examples section in analysis readme * create sagemaker-notebooks directory * fix links * remove accidental include of key * update links, and update dev guide rules after changes in alpha * Add new regions for container images (aws#147) * update regions * add check for tag * add regions * Make required tensors optional (aws#148) * make required tensors optional * Update README.md * add a directory to clean in build binaries script * Updating the notebooks to include good and bad exampels. 
* Update scripts to build containers (aws#153) * Update scripts to build containers add a directory to clean in build binaries script add policy working container scripts for TF now added along with other frameworks fix binary in container script * Add script to tag as latest * Sagemaker TF notebook (aws#145) * Changing link of latest binaries for 0.3 (aws#122) * change link to binary and introduce latest * make container scripts working again * remove -U * fix path to ts binary in docker * log when single process is to stdout * uploaded sagemaker docs update analysis docs remove sagemaker docs update TF doc add sagemaker docs update api docs change link for rules binary add files from s3 bucket * refactor positions * minor changes * fix links in old examples * fix paths in integration tests * Update test_training_end.py * Update test_training_end.py * Update integration_testing_rules.py * bring back examples section in analysis readme * create sagemaker-notebooks directory * fix links * updated notebook for tf * fix name of rule * Delete README.md * remove rules scripts * Update tensorflow-simple.ipynb * Update tensorflow-simple.ipynb * add pytorch notebook from s3 (aws#156) * Changes for temp location and out_dir with Sagemaker in mind (aws#154) * Make outdir optional arg, use default path in sagemaker environment, also change temp location when writing local files * remove is_s3 import * add tests and fix case when / is at the front of filepath * add comments * change to .tmp suffix * update testing script to take a tag * Updated the uploader script to include pytorch scripts * Updating the paths to the examples in the notebooks. 
* Removed unnecessary copy * resolving warning mesg of loading yaml (aws#149) * Fix out dir bug (aws#160) * fix out dir bug * print mode.name instead of mode * print mode.name instead of mode * print mode.name instead of mode * parallelize builds for pytorch and mxnet (aws#162) * TF notebook (aws#163) * Changing link of latest binaries for 0.3 (aws#122) * change link to binary and introduce latest * make container scripts working again * remove -U * fix path to ts binary in docker * log when single process is to stdout * uploaded sagemaker docs update analysis docs remove sagemaker docs update TF doc add sagemaker docs update api docs change link for rules binary add files from s3 bucket * refactor positions * minor changes * fix links in old examples * fix paths in integration tests * Update test_training_end.py * Update test_training_end.py * Update integration_testing_rules.py * bring back examples section in analysis readme * create sagemaker-notebooks directory * fix links * updated notebook for tf * fix name of rule * Delete README.md * remove rules scripts * Update tensorflow-simple.ipynb * Update tensorflow-simple.ipynb * add sagemaker args * add model dir to resnet * remove action style args in script and reindent * update resnet example * make num epochs take priority over num_batches * change name of tf notebook * Add updated sagemaker tf notebook * change scripts to include all scripts in tf examples * change names of estimators * update files * Updating the mxnet notebook * Updating the mxnet notebook. * Updated notebook as per review. * Update mxnet.ipynb * Update mxnet.ipynb * Fixed the type of container from TensorFlow to MXNet. 
* Pytorch Notebook Updates (aws#170) * pytorch notebook * Update pytorch.ipynb * Update pytorch.ipynb * Pytorch (aws#171) * pytorch notebook * Update pytorch.ipynb * Update pytorch.ipynb * Heading fix * Expanding rule section and modifying BYOR * make tf notebook same as alpha * undo changes for rules, as that's now going into a different PR * Revert "Expanding rule monitoring section and improving BYOR notebook (aws#180)" This reverts commit 7f7c17c0f73b95f614859fa9ed05b29e50166eec. * Add first party rules file * update cloudwatch section
This change is explained in detail in aws/sagemaker-python-sdk#26.
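
For readers who want to see what the notebook text in this pull request refers to, here is a minimal sketch of how the `min_eval_frequency` hyperparameter and the two TensorBoard URLs might be assembled. Only the hyperparameter name, the default port 6006, and the `/proxy/6006/` path come from the diff; the helper names `build_hyperparameters` and `tensorboard_urls` and the value `10` are hypothetical and chosen for illustration. The call to the SageMaker estimator appears only as a comment, since the exact arguments depend on the SDK version and are an assumption here.

```python
# Sketch only: helper names and the value 10 are illustrative assumptions.
# The hyperparameter name `min_eval_frequency` and the URL shapes come from
# the notebook text changed in this pull request.


def build_hyperparameters(min_eval_frequency=10):
    """Return a hyperparameters dict that requests more frequent evaluations."""
    if isinstance(min_eval_frequency, bool) or not isinstance(min_eval_frequency, int):
        raise ValueError("min_eval_frequency must be an integer")
    if min_eval_frequency < 1:
        raise ValueError("min_eval_frequency must be at least 1")
    return {"min_eval_frequency": min_eval_frequency}


def tensorboard_urls(port=6006):
    """Return the local and notebook-proxy URLs for a TensorBoard port.

    The proxy path keeps its trailing slash, which the notebook text says
    TensorBoard needs in order to load.
    """
    return {
        "local": f"http://localhost:{port}",
        "proxy": f"/proxy/{port}/",
    }


hyperparameters = build_hyperparameters(10)
urls = tensorboard_urls()

# With the SageMaker Python SDK installed, the dict would be handed to the
# estimator roughly like this (assumed shape, not verified against a
# specific SDK release):
#
#   estimator = TensorFlow(entry_point=..., role=...,
#                          hyperparameters=hyperparameters, ...)

print(hyperparameters)
print(urls["local"], urls["proxy"])
```

A smaller `min_eval_frequency` should make evaluation scalars show up in TensorBoard sooner, at the cost of spending more of the training job on evaluation.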