Update docstrings all through (#9)
Fix a few more bugs with upload
jkanche authored May 19, 2024
1 parent dbdfdd1 commit b6c8a1c
Showing 26 changed files with 886 additions and 153 deletions.
21 changes: 16 additions & 5 deletions README.md
Expand Up @@ -4,20 +4,31 @@
[![Built Status](https://api.cirrus-ci.com/github/<USER>/gypsum-client.svg?branch=main)](https://cirrus-ci.com/github/<USER>/gypsum-client)
[![ReadTheDocs](https://readthedocs.org/projects/gypsum-client/badge/?version=latest)](https://gypsum-client.readthedocs.io/en/stable/)
[![Coveralls](https://img.shields.io/coveralls/github/<USER>/gypsum-client/main.svg)](https://coveralls.io/r/<USER>/gypsum-client)
[![PyPI-Server](https://img.shields.io/pypi/v/gypsum-client.svg)](https://pypi.org/project/gypsum-client/)
[![Conda-Forge](https://img.shields.io/conda/vn/conda-forge/gypsum-client.svg)](https://anaconda.org/conda-forge/gypsum-client)
[![Monthly Downloads](https://pepy.tech/badge/gypsum-client/month)](https://pepy.tech/project/gypsum-client)
[![Twitter](https://img.shields.io/twitter/url/http/shields.io.svg?style=social&label=Twitter)](https://twitter.com/gypsum-client)
-->

[![Project generated with PyScaffold](https://img.shields.io/badge/-PyScaffold-005CA0?logo=pyscaffold)](https://pyscaffold.org/)
[![PyPI-Server](https://img.shields.io/pypi/v/gypsum-client.svg)](https://pypi.org/project/gypsum-client/)

# Python client to the gypsum REST API


Provides a Python client for the [**gypsum** REST API](https://github.com/ArtifactDB/gypsum-worker).

Readers are referred to the [API's documentation](https://gypsum-test.aaron-lun.workers.dev) or the [user guide](https://bioconductor.org/packages/devel/bioc/vignettes/gypsum/inst/doc/userguide.html) from its R equivalent for more details.

# gypsum-client
***Note: check out the R/Bioconductor package for the gypsum client [here](https://github.com/ArtifactDB/gypsum-R).***

> Add a short description here!
## Installation

A longer description of your project goes here...
The package is published to [PyPI](https://pypi.org/project/gypsum-client/):

```sh
pip install gypsum_client
```

<!-- pyscaffold-notes -->

Expand Down
13 changes: 12 additions & 1 deletion docs/conf.py
Expand Up @@ -72,6 +72,7 @@
"sphinx.ext.ifconfig",
"sphinx.ext.mathjax",
"sphinx.ext.napoleon",
"sphinx_autodoc_typehints",
]

# Add any paths that contain templates here, relative to this directory.
Expand Down Expand Up @@ -166,12 +167,22 @@
# If this is True, todo emits a warning for each TODO entry. The default is False.
todo_emit_warnings = True

autodoc_default_options = {
# 'members': 'var1, var2',
# 'member-order': 'bysource',
"special-members": True,
"undoc-members": True,
"exclude-members": "__weakref__, __dict__, __str__, __module__",
}

autosummary_generate = True
autosummary_imported_members = True

# -- Options for HTML output -------------------------------------------------

# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
html_theme = "alabaster"
html_theme = "furo"

# Theme options are theme-specific and customize the look and feel of a theme
# further. For a list of options available for each theme, see the
Expand Down
2 changes: 2 additions & 0 deletions docs/requirements.txt
Expand Up @@ -4,3 +4,5 @@
# sphinx_rtd_theme
myst-parser[linkify]
sphinx>=3.2.1
furo
sphinx-autodoc-typehints
1 change: 1 addition & 0 deletions src/gypsum_client/__init__.py
Expand Up @@ -18,6 +18,7 @@

from .auth import access_token, set_access_token
from .clone_operations import clone_version
from .config import REQUESTS_MOD
from .create_operations import create_project
from .fetch_metadata_database import fetch_metadata_database
from .fetch_metadata_schema import fetch_metadata_schema
Expand Down
22 changes: 17 additions & 5 deletions src/gypsum_client/auth.py
@@ -1,6 +1,6 @@
import os
import time
from typing import Optional
from typing import Optional, Union

import requests
from filelock import FileLock
Expand All @@ -25,13 +25,19 @@ def access_token(
request: bool = True,
cache_dir: Optional[str] = _cache_directory(),
token_expiration_limit: int = 10,
) -> Optional[str]:
) -> Optional[Union[str, dict]]:
"""Get a GitHub access token for authentication to the gypsum API.
Example:
.. code-block:: python
token = access_token()
Args:
full:
Whether to return the full token details.
Defaults to False.
Defaults to False, only ``token`` is returned.
request:
Whether to request a new token if no token is found or the
Expand All @@ -45,7 +51,10 @@ def access_token(
Integer specifying the number of seconds until the token expires.
Returns:
The GitHub token to access gypsum's resources.
If ``full=False``, a string specifying the GitHub token to
access gypsum's resources.
If ``full=True``, a dictionary containing the full token details.
"""
global TOKEN_CACHE

Expand Down Expand Up @@ -123,7 +132,10 @@ def set_access_token(
Defaults to None, indicating token is not cached to disk.
Returns:
The GitHub token to access gypsum's resources.
Dictionary containing the following keys:
- ``token``, a string containing the token.
- ``name``, the name of the GitHub user authenticated by the token.
- ``expires``, the Unix time at which the token expires.
"""
global TOKEN_CACHE

Expand Down
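
The two return shapes documented for `access_token` can be illustrated with a small stand-alone sketch. It does not contact GitHub or the gypsum API; `select_token` and the sample dictionary are hypothetical, and only the keys ``token``, ``name`` and ``expires`` come from the docstrings above. It also assumes that a token expiring within `token_expiration_limit` seconds is treated as unusable, which the diff does not state explicitly.

```python
import time
from typing import Optional, Union


def select_token(
    cached: Optional[dict],
    full: bool = False,
    token_expiration_limit: int = 10,
    now: Optional[float] = None,
) -> Optional[Union[str, dict]]:
    """Hypothetical helper mirroring the documented return types."""
    if now is None:
        now = time.time()
    # Assumption: a token that expires within the limit is treated as unusable.
    if cached is None or cached["expires"] <= now + token_expiration_limit:
        return None
    # full=True returns the whole dictionary, otherwise only the token string.
    return cached if full else cached["token"]


sample = {"token": "gh_example", "name": "some-user", "expires": 2000.0}
print(select_token(sample, now=1000.0))             # the token string only
print(select_token(sample, full=True, now=1000.0))  # the full dictionary
print(select_token(sample, now=1995.0))             # None: too close to expiry
```

Callers that annotate the result as `Optional[Union[str, dict]]` need to branch on the type, or always pass the same `full` value, before using it.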
90 changes: 54 additions & 36 deletions src/gypsum_client/clone_operations.py
@@ -1,3 +1,43 @@
"""Clone a version's directory structure.
Cloning of a versioned asset involves creating a directory at the destination
that has the same contents as the corresponding project-asset-version directory.
All files in the specified version are represented as symlinks from the
destination to the corresponding file in the cache.
The idea is that, when the destination is used in
:py:func:`~gypsum_client.prepare_directory_for_upload.prepare_directory_upload`,
the symlinks are converted into upload links, i.e., ``links=`` in
:py:func:`~gypsum_client.upload_api_operations.start_upload`.
This allows users to create new versions very cheaply as duplicate files
are not uploaded to/stored in the backend.
Users can more-or-less do whatever they want inside the cloned destination,
but they should treat the symlink targets as read-only.
That is, they should not modify the contents of the linked-to file, as these
refer to assumed-immutable files in the cache.
If a file in the destination needs to be modified, the symlink should be
deleted and replaced with an actual file;
this avoids mutating the cache and it ensures that
:py:func:`~gypsum_client.prepare_directory_for_upload.prepare_directory_upload`
recognizes that a new file actually needs to be uploaded.
Advanced users can set ``download=False``, in which case symlinks are created
even if their targets are not present in the cache.
In such cases, the destination should be treated as write-only due to the
potential presence of dangling symlinks.
This mode is useful for uploading a new version of an asset without
downloading the files from the existing version,
assuming that the modifications associated with the former can be
achieved without reading any of the latter.
On Windows, the user may not have permissions to create symbolic links,
so the function will transparently fall back to creating hard links or
copies instead.
This precludes any optimization by ``prepare_directory_upload`` as the hard
links/copies cannot be converted into upload links.
It also assumes that ``download=True`` as dangling links/copies cannot be created.
"""

import errno
import os
import shutil
Expand Down Expand Up @@ -26,42 +66,20 @@ def clone_version(
Clone the directory structure for a versioned asset into a separate location.
This is typically used to prepare a new version for a lightweight upload.
Cloning of a versioned asset involves creating a directory at the destination
that has the same contents as the corresponding project-asset-version directory.
All files in the specified version are represented as symlinks from the
destination to the corresponding file in the cache.
The idea is that, when the destination is used in
:py:func:`~gypsum_client.prepare_directory_upload.prepare_directory_upload`,
the symlinks are converted into upload links, i.e., links= in
:py:func:`~gypsum_client.start_upload.start_upload`.
This allows users to create new versions very cheaply as duplicate files
are not uploaded to/stored in the backend.
Users can more-or-less do whatever they want inside the cloned destination,
but they should treat the symlink targets as read-only.
That is, they should not modify the contents of the linked-to file, as these
refer to assumed-immutable files in the cache.
If a file in the destination needs to be modified, the symlink should be
deleted and replaced with an actual file;
this avoids mutating the cache and it ensures that
:py:func:`~gypsum_client.prepare_directory_upload.prepare_directory_upload`
recognizes that a new file actually needs to be uploaded.
Advanced users can set download=False, in which case symlinks are created
even if their targets are not present in the cache.
In such cases, the destination should be treated as write-only due to the
potential presence of dangling symlinks.
This mode is useful for uploading a new version of an asset without
downloading the files from the existing version,
assuming that the modifications associated with the former can be
achieved without reading any of the latter.
On Windows, the user may not have permissions to create symbolic links,
so the function will transparently fall back to creating hard links or
copies instead.
This precludes any optimization by prepare_directory_upload as the hard
links/copies cannot be converted into upload links.
It also assumes that download=True as dangling links/copies cannot be created.
See Also:
:py:func:`~gypsum_client.prepare_directory_for_upload.prepare_directory_upload`,
to prepare an upload based on the directory contents.
Example:
.. code-block:: python
import tempfile
cache = tempfile.mkdtemp()
dest = tempfile.mkdtemp()
clone_version("test-R", "basic", "v1", destination=dest, cache_dir=cache)
Args:
project:
Expand Down
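
The symlink-with-fallback behaviour described in the module docstring can be sketched without the gypsum backend. This is only an illustration under stated assumptions: `link_or_copy` and `clone_tree` are hypothetical helpers, not part of `gypsum_client`, and the sketch ignores downloading and the ``download=False`` mode entirely.

```python
import os
import shutil
import tempfile


def link_or_copy(src: str, dest: str) -> str:
    """Try a symlink, then a hard link, then a plain copy (hypothetical helper)."""
    try:
        os.symlink(src, dest)
        return "symlink"
    except OSError:
        pass  # e.g. no permission to create symlinks on Windows
    try:
        os.link(src, dest)
        return "hardlink"
    except OSError:
        shutil.copy(src, dest)
        return "copy"


def clone_tree(cache_version_dir: str, destination: str) -> None:
    """Mirror every file under ``cache_version_dir`` into ``destination``."""
    for root, _dirs, files in os.walk(cache_version_dir):
        rel = os.path.relpath(root, cache_version_dir)
        target_dir = os.path.normpath(os.path.join(destination, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            link_or_copy(os.path.join(root, name), os.path.join(target_dir, name))


# Build a tiny stand-in for a cached project-asset-version directory.
cache = tempfile.mkdtemp()
dest = tempfile.mkdtemp()
os.makedirs(os.path.join(cache, "sub"))
with open(os.path.join(cache, "sub", "data.txt"), "w") as handle:
    handle.write("hello")

clone_tree(cache, dest)
cloned = os.path.join(dest, "sub", "data.txt")
print(os.path.islink(cloned))  # whether a symlink (rather than a fallback) was used
```

To modify a cloned file without touching the cache, the link would first be deleted with `os.remove(cloned)` and a regular file written in its place, as the docstring recommends.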
18 changes: 15 additions & 3 deletions src/gypsum_client/config.py
@@ -1,8 +1,20 @@
"""
Set ``verify`` to False if SSL certificates are not properly set up on your machine.
This essentially sets ``verify=False`` on all requests to the gypsum REST API.
Example:
.. code-block:: python
from gypsum_client import REQUESTS_MOD
# to set verify to False
REQUESTS_MOD["verify"] = False
"""

__author__ = "Jayaram Kancherla"
__copyright__ = "Jayaram Kancherla"
__license__ = "MIT"

## Set this to False if SSL certificates are not properly setup on your machine.
## essentially sets verify=False on all requests going out.

REQUESTS_MOD = {"verify": True}
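
A short sketch of why the documented pattern mutates the dictionary in place rather than rebinding the name. It assumes the client forwards `REQUESTS_MOD` as keyword arguments to each request, which is not shown in this diff; `fake_get` and `call_api` are hypothetical stand-ins so that nothing is sent over the network.

```python
# Stand-in for gypsum_client.config.REQUESTS_MOD (same shape as in the diff).
REQUESTS_MOD = {"verify": True}


def fake_get(url: str, verify: bool = True, **kwargs) -> dict:
    """Stand-in for an HTTP GET that only records the arguments it received."""
    return {"url": url, "verify": verify}


def call_api(url: str) -> dict:
    # Assumption: the shared dictionary is unpacked into every request call.
    return fake_get(url, **REQUESTS_MOD)


print(call_api("https://example.org")["verify"])  # True

# Mutating the shared dictionary affects all later calls.
REQUESTS_MOD["verify"] = False
print(call_api("https://example.org")["verify"])  # False
```

Assigning a new dictionary to an imported name (`REQUESTS_MOD = {...}` after `from gypsum_client import REQUESTS_MOD`) would only rebind the local name, so updating the key in place is the form that reaches the library.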
29 changes: 22 additions & 7 deletions src/gypsum_client/create_operations.py
@@ -1,4 +1,4 @@
from typing import List
from typing import List, Union
from urllib.parse import quote_plus

import requests
Expand All @@ -14,7 +14,7 @@

def create_project(
project: str,
owners: List[str],
owners: Union[str, List[str]],
uploaders: List[str] = [],
baseline: int = None,
growth_rate: int = None,
Expand All @@ -24,6 +24,20 @@ def create_project(
):
"""Create a new project with the associated permissions.
See Also:
:py:func:`~gypsum_client.remove_operations.remove_project`,
to remove the project.
Example:
.. code-block:: python
create_project(
"test-Py-create",
owners="jkanche",
uploaders=[{"id": "ArtifactDB-bot"}]
)
Args:
project:
Project name.
Expand All @@ -32,6 +46,8 @@ def create_project(
List of GitHub users or organizations that are owners of this
project.
May also be a string containing the GitHub user or organization.
uploaders:
List of authorized uploaders for this project.
Defaults to an empty list.
Expand All @@ -53,9 +69,7 @@ def create_project(
token:
GitHub access token to authenticate with the gypsum REST API.
Returns:
True if project is successfully created.
The token must refer to a gypsum administrator account.
"""
url = _remove_slash_url(url)
uploaders = _sanitize_uploaders(uploaders) if uploaders is not None else []
Expand All @@ -73,6 +87,9 @@ def create_project(
if year is not None:
quota["year"] = year

if isinstance(owners, str):
owners = [owners]

body = {"permissions": {"owners": owners, "uploaders": uploaders}}
if len(quota) > 0:
body["quota"] = quota
Expand All @@ -89,5 +106,3 @@ def create_project(
raise Exception(
f"Failed to create a project, {req.status_code} and reason: {req.text}"
) from e

return True
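
The effect of accepting either a string or a list for `owners` can be sketched as a stand-alone body builder. `build_project_body` is a hypothetical helper that sends nothing; the ``baseline`` and ``growth_rate`` quota keys are assumed by analogy with the ``year`` key visible in the diff, and the uploader sanitization step is omitted.

```python
from typing import List, Optional, Union


def build_project_body(
    owners: Union[str, List[str]],
    uploaders: Optional[List[dict]] = None,
    baseline: Optional[int] = None,
    growth_rate: Optional[int] = None,
    year: Optional[int] = None,
) -> dict:
    """Hypothetical helper that assembles the request body without sending it."""
    # A single owner given as a string is wrapped into a one-element list.
    if isinstance(owners, str):
        owners = [owners]

    # Only quota fields that were actually supplied are included.
    quota = {}
    if baseline is not None:
        quota["baseline"] = baseline  # key name is an assumption
    if growth_rate is not None:
        quota["growth_rate"] = growth_rate  # key name is an assumption
    if year is not None:
        quota["year"] = year

    body = {"permissions": {"owners": owners, "uploaders": uploaders or []}}
    if len(quota) > 0:
        body["quota"] = quota
    return body


print(build_project_body("jkanche"))
print(build_project_body(["jkanche"], uploaders=[{"id": "ArtifactDB-bot"}], year=2024))
```

Using `None` as the default for `uploaders` in the sketch avoids sharing one list object between calls.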
31 changes: 25 additions & 6 deletions src/gypsum_client/fetch_metadata_database.py
@@ -1,3 +1,11 @@
"""Fetch the metadata database.
This function will automatically check for updates to the SQLite files
and will download new versions accordingly. New checks are performed when one hour
or more has elapsed since the last check. If the check fails, a warning is raised
and the function returns the currently cached file.
"""

import os
import tempfile
import time
Expand All @@ -21,16 +29,27 @@ def fetch_metadata_database(
) -> str:
"""Fetch the SQLite database containing metadata from the gypsum backend.
This function will automatically check for updates to the SQLite files
and will download new versions accordingly. New checks are performed when one hour
or more has elapsed since the last check. If the check fails, a warning is raised
and the function returns the currently cached file.
See `metadata index <https://github.com/ArtifactDB/bioconductor-metadata-index>`_
for more details.
Each database is generated by aggregating metadata across multiple assets
and/or projects, and can be used to perform searches for interesting objects.
See Also:
:py:func:`~gypsum_client.fetch_metadata_schema.fetch_metadata_schema`, to get
the JSON schema used to define the database tables.
Example:
.. code-block:: python
sql_path = fetch_metadata_database()
Args:
name:
Name of the database.
This can be the name of any SQLite file in
https://github.com/ArtifactDB/bioconductor-metadata-index/releases/tag/latest.
This can be the name of any SQLite file published
`here <https://github.com/ArtifactDB/bioconductor-metadata-index/releases/tag/latest>`_.
Defaults to "bioconductor.sqlite3".
Expand Down
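
The "check at most once per hour" rule from the module docstring can be sketched as follows. The names `should_check`, `mark_checked` and the in-memory `_last_check` record are hypothetical; the actual function may track the last check differently (for example on disk), which this diff does not show.

```python
import time
from typing import Dict, Optional

CHECK_INTERVAL_SECONDS = 60 * 60  # one hour, as stated in the docstring

# Hypothetical in-memory record of when each database was last checked.
_last_check: Dict[str, float] = {}


def should_check(name: str, now: Optional[float] = None) -> bool:
    """Return True if no check was recorded or at least one hour has elapsed."""
    now = time.time() if now is None else now
    last = _last_check.get(name)
    return last is None or (now - last) >= CHECK_INTERVAL_SECONDS


def mark_checked(name: str, now: Optional[float] = None) -> None:
    """Record the time of the most recent update check."""
    _last_check[name] = time.time() if now is None else now


print(should_check("bioconductor.sqlite3", now=0.0))     # True: never checked
mark_checked("bioconductor.sqlite3", now=0.0)
print(should_check("bioconductor.sqlite3", now=1800.0))  # False: only 30 minutes
print(should_check("bioconductor.sqlite3", now=3600.0))  # True: one hour elapsed
```

If the check itself fails, the documented behaviour is to warn and return the cached file, so a caller would wrap the update step in a `try`/`except` and fall back to the existing path.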