DataSet configuration #46

Merged
merged 7 commits into from
May 29, 2022
@c42f c42f commented May 26, 2022

This change allows dataset configuration to be modified via the DataSets.config!() API.

A notable internal change is that this requires each DataSet to be owned by one project, such that dataset changes can be written back to the project Data.toml (or other dataset project storage, such as JuliaHub).
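
To illustrate why single ownership matters for write-back, here is a minimal toy sketch. It is not the DataSets.jl implementation: `ToyProject`, `ToyDataSet`, `save` and `toy_config!` are hypothetical names invented for this example, and the on-disk layout is only loosely modelled on a Data.toml. The point is that a dataset holding a reference to exactly one project always knows where its changes should be persisted.

```julia
using TOML

# Toy stand-ins for illustration only; these are NOT the DataSets.jl types.
struct ToyProject
    path::String                            # where the project file lives
    confs::Dict{String,Dict{String,Any}}    # dataset name => configuration
end

struct ToyDataSet
    project::ToyProject    # the single owning project
    name::String
end

# Write every dataset configuration back to the project file.
function save(proj::ToyProject)
    names = sort(collect(keys(proj.confs)))
    open(proj.path, "w") do io
        TOML.print(io, Dict("datasets" => [proj.confs[n] for n in names]))
    end
end

# Update the configuration, then persist it through the owning project.
function toy_config!(ds::ToyDataSet; kws...)
    conf = ds.project.confs[ds.name]
    for (k, v) in kws
        conf[String(k)] = v
    end
    save(ds.project)
    return ds
end
```

If a dataset could belong to several projects at once, `toy_config!` would have no unambiguous file to write to, which is the motivation for the ownership rule in this sketch.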

Now that datasets can be mutated via the API, I found it necessary to remove and deprecate the crude @__DIR__ templating mechanism for local file storage dataset paths. Instead, relative paths are simply specified to be relative to the location of the Data.toml, and are resolved via the local_data_abspath() mechanism.
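
The relative-path rule can be sketched with Base functions alone. This is an illustration under assumptions, not the actual `local_data_abspath()` code: `resolve_data_path` is a hypothetical helper, and it assumes the project is stored as a Data.toml file on the local filesystem.

```julia
# Hypothetical helper for illustration; not the DataSets.jl implementation.
function resolve_data_path(data_toml::AbstractString, path::AbstractString)
    # Absolute paths are passed through unchanged.
    isabspath(path) && return path
    # Relative paths are joined to the directory containing the Data.toml.
    return joinpath(dirname(abspath(data_toml)), path)
end
```

Because the stored path stays relative, the same configuration can be written back to the Data.toml without baking in a machine-specific absolute path.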


@jeremiedb, here's the user-facing documentation:

    config!(name::AbstractString; kws...)
    config!(proj::AbstractDataProject, name::AbstractString; kws...)

    config!(dataset::DataSet; kws...)

Update the configuration of `dataset` with the given keyword arguments and
persist it in the dataset's project storage. The versions which take a `name`
use that name to search within the given data project.

Examples

Update the description of the dataset named "SomeData" in the global project:

    DataSets.config!("SomeData"; description="This is a description")

Alternatively, metadata can be updated by setting DataSet properties. For
example, to tag the dataset "SomeData" with tags "A" and "B":

    ds = dataset("SomeData")
    ds.tags = ["A", "B"]

c42f added 6 commits May 23, 2022 16:44
Templating paths with `@__DIR__` was always a hack because it was
unaware of the structure of the TOML file and didn't generalize well.

Replace this with querying a `DataSet`'s parent project for its
absolute path (or rather, by asking the project to join its root path to
the relative path contained in the Data.toml).

This allows DataSet instances to be safely saved back to their parent
projects.
@c42f c42f mentioned this pull request May 27, 2022
This improves path handling for data projects which will be saved back
to disk.
@c42f c42f force-pushed the cjf/dataset-config branch from 7c9a70a to bf6b295 Compare May 27, 2022 05:55
@c42f c42f merged commit d35f11e into master May 29, 2022
@c42f c42f deleted the cjf/dataset-config branch May 29, 2022 04:31