Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[docs] rewrite #5175

Merged
merged 40 commits into from
Aug 6, 2019
Merged

[docs] rewrite #5175

merged 40 commits into from
Aug 6, 2019

Conversation

richardliaw
Copy link
Contributor

@richardliaw richardliaw commented Jul 11, 2019

What do these changes do?

TODOs:

Punting but need to add:

Related issue number

Linter

  • I've run scripts/format.sh to lint the changes in this PR.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-Perf-Integration-PRB/1625/
Test FAILed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/15316/
Test PASSed.

@robertnishihara robertnishihara changed the title [docs] rewrite [WIP] [docs] rewrite Jul 12, 2019
@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/15450/
Test FAILed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/15451/
Test PASSed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/15453/
Test PASSed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/15452/
Test PASSed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/15455/
Test PASSed.


Ray Ecosystem
-------------

Ray comes with libraries that accelerate deep learning and reinforcement learning development:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure about the phrasing "comes with" - makes them sound like Ray add-ons instead of standalone projects built on Ray

Ray comes with libraries that accelerate deep learning and reinforcement learning development:

- `Tune`_: Scalable Hyperparameter Search
- `RLlib`_: Scalable Reinforcement Learning
- `Distributed Training <distributed_training.html>`__

Other projects in development:

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
TODO: add short descriptions and progress

doc/source/index.rst Show resolved Hide resolved
doc/source/index.rst Outdated Show resolved Hide resolved
@@ -103,48 +96,71 @@ Ray comes with libraries that accelerate deep learning and reinforcement learnin
rllib-dev.rst
rllib-package-ref.rst

.. toctree::
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't this going to its own page? or is this just temporary?


install-source.rst
contrib.rst
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
contrib.rst
contributing.rst

nit: other names are spelled out

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/15457/
Test FAILed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/15456/
Test PASSed.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-Perf-Integration-PRB/1715/
Test FAILed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-Perf-Integration-PRB/1716/
Test PASSed.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/15602/
Test FAILed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/15601/
Test PASSed.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/15608/
Test FAILed.

@ericl ericl self-assigned this Jul 30, 2019
@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/15812/
Test PASSed.

Copy link
Collaborator

@robertnishihara robertnishihara left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't get very far. Awesome work so far, this is a huge improvement. Still working through it, but wanted to leave partial comments.

-----------

.. code-block:: python

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest adding the import ray as well so people can copy/paste.


Ray programs can run on a single machine, and can also seamlessly scale to large clusters. To execute the above Ray script in the cloud, just download `this configuration file <https://github.com/ray-project/ray/blob/master/python/ray/autoscaler/aws/example-full.yaml>`__, and run:

``ray submit [CLUSTER.YAML] example.py --start``
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love what you're doing here. I think it will be substantially easier for users if it can be fully done with copy/paste. E.g.,

wget https://.../example-full.yaml  # Get the cluster config file.
wget https://.../example.py  # Get the example script.

ray submit example-full.yaml example.py --start  # Run the script on the cluster.


``ray submit [CLUSTER.YAML] example.py --start``

See more details in the `Cluster Launch page <autoscaling.html>`_.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
See more details in the `Cluster Launch page <autoscaling.html>`_.
Read more about `launching clusters <autoscaling.html>`_.

doc/source/index.rst Outdated Show resolved Hide resolved
doc/source/index.rst Outdated Show resolved Hide resolved
-----------------

`RLlib`_ is an open-source library for reinforcement learning that offers both high scalability and a unified API for a variety of applications.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a link to the main Tune documentation.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I missed the fact that the initial word Tune is a link.

doc/source/index.rst Show resolved Hide resolved
done = self.cur_pos >= self.end_pos
return [self.cur_pos], 1 if done else 0, done, {}

tune.run(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This example takes too long to run, can we make it finish in under 30 seconds?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is expected for RL.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that for a real RL application it will run for a long time, but this is code that people will be copying and pasting from the browser into the terminal just to verify that it runs (e.g., that they managed to install it properly) and it's a better experience if it runs to completion instead of the user having to manually kill it.

@@ -42,103 +42,20 @@ Here are links to the latest wheels (which are built off of master). To install
.. _`MacOS Python 3.5`: https://s3-us-west-2.amazonaws.com/ray-wheels/latest/ray-0.8.0.dev2-cp35-cp35m-macosx_10_6_intel.whl
.. _`MacOS Python 2.7`: https://s3-us-west-2.amazonaws.com/ray-wheels/latest/ray-0.8.0.dev2-cp27-cp27m-macosx_10_6_intel.whl

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also add instructions for installing Ray from any particular recent commit? E.g.,

Suggested change
To install Ray from any recent commit, do the following.
.. code-block:: bash
RAY_COMMIT=63a6b0e710fded230f340a237e55085fb4ace4e9
RAY_VERSION=0.8.0.dev3 # You can find this in python/ray/__init__.py.
RAY_BRANCH=master
pip install -U https://s3-us-west-2.amazonaws.com/ray-wheels/$RAY_BRANCH/$RAY_COMMIT/ray-$RAY_VERSION-cp36-cp36m-manylinux1_x86_64.whl

Actually, the above doesn't quite work because we also need to modify the wheel for Mac/Linux and different Python versions. Maybe it's best to put it in a simple script and then link to it from here.

doc/source/installation.rst Outdated Show resolved Hide resolved
@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/15895/
Test FAILed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/15904/
Test PASSed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/15911/
Test PASSed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/15914/
Test PASSed.

Copy link
Contributor

@ericl ericl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM. Let's merge and iterate.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/15989/
Test PASSed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/16005/
Test PASSed.

@ericl ericl merged commit a08ea09 into ray-project:master Aug 6, 2019
@raulchen
Copy link
Contributor

raulchen commented Aug 6, 2019

@richardliaw, the CI lint failure on the master branch seems related to this PR. Can you fix it? thanks.

2. Fetching results (object IDs) [``ray.put``, ``ray.get``, ``ray.wait``]
3. Using remote classes (actors) [``ray.remote``]

.. list-table:: Title
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the plan for this table?


**Object IDs** can also be passed into remote functions. When the function actually gets executed, **the argument will be a retrieved as a regular Python object**.

.. code:: python
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This block doesn't run.

1. **Invocation:** The regular version is called with ``regular_function()``, whereas the remote version is called with ``remote_function.remote()``.
2. **Return values:** ``regular_function`` immediately executes and returns ``1``, whereas ``remote_function`` immediately returns an object ID (a future) and then creates a task that will be executed on a worker process. The result can be retrieved with ``ray.get``.

.. code:: python
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For as many of these as possible, I would remove the >>> because it makes copy and paste harder. For example, this code block could be rewritten as

regular_function()  # 1

remote_function.remote()  # ObjectID(1c80d6937802cd7786ad25e50caf2f023c95e350)

ray.get(remote_function.remote())  # 1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to push back on this - I think this is non-standard and not as reader friendly as you expect, especially for those who are glancing over documentation.

For examples, this is definitely necessary.

On the other hand, I can leave a note to show that it's actually possible to copy and paste these: https://numpydoc.readthedocs.io/en/latest/format.html#sections

In the future, we can incorporate a modded copybutton: https://docs.scipy.org/doc/numpy-1.16.0/reference/generated/numpy.random.randint.html
https://github.com/scipy/scipy-sphinx-theme/blob/c466764e2231ba132c09826b5b138fffa1cfcec3/_theme/scipy/static/js/copybutton.js

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason why it's not reader friendly is because the in-line comment is not a standard signifier for return value. However >>> and subsequent lines are standard for python interpreter output.

done = self.cur_pos >= self.end_pos
return [self.cur_pos], 1 if done else 0, done, {}

tune.run(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that for a real RL application it will run for a long time, but this is code that people will be copying and pasting from the browser into the terminal just to verify that it runs (e.g., that they managed to install it properly) and it's a better experience if it runs to completion instead of the user having to manually kill it.

@@ -41,104 +42,3 @@ Here are links to the latest wheels (which are built off of master). To install
.. _`MacOS Python 3.6`: https://s3-us-west-2.amazonaws.com/ray-wheels/latest/ray-0.8.0.dev2-cp36-cp36m-macosx_10_6_intel.whl
.. _`MacOS Python 3.5`: https://s3-us-west-2.amazonaws.com/ray-wheels/latest/ray-0.8.0.dev2-cp35-cp35m-macosx_10_6_intel.whl
.. _`MacOS Python 2.7`: https://s3-us-west-2.amazonaws.com/ray-wheels/latest/ray-0.8.0.dev2-cp27-cp27m-macosx_10_6_intel.whl


Building Ray from source
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that you moved this to a different file (which doesn't appear to be linked to from anywhere). However, why not leave it here? This is the obvious place Ray developers would look for it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll move this back.


.. code-block:: bash

conda install libgcc
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably not needed anymore though.

that it can submit tasks to its raylet and get objects from the object
store, but it is different in that the raylet will not assign tasks to
the driver to be executed.
- A **Redis server** maintains much of the system's state. For example, it keeps
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are actually multiple Redis servers.

Also, there are some monitor processes (on the head node) and log monitor process (on each machine).


To get information about the current available resource capacity of your cluster, you can use ``ray.available_resources()``.

.. autofunction:: ray.cluster_resources
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
.. autofunction:: ray.cluster_resources
.. autofunction:: ray.available_resources

:noindex:


Object Information
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this will be useful to people, I'd probably omit it.

Profiling for Ray Users
=======================
How-to: Profile Ray programs
============================

This document is intended for users of Ray who want to know how to evaluate
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This page either needs to be improved a lot or scrapped.

However, the timeline stuff is super important, so I would keep that (maybe make it its own page).

@richardliaw
Copy link
Contributor Author

@robertnishihara I'll make a separate PR for this as this was merged a bit prematurely, though there's no need to revert.

edoakes pushed a commit to edoakes/ray that referenced this pull request Aug 9, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants