uploadDataPackage should support 'common' objects #251

Closed
gothub opened this issue Jun 5, 2020 · 2 comments

@gothub
Collaborator

gothub commented Jun 5, 2020

A typical DataONE use case is to have the same data object in multiple packages.

It should be possible to download a package using getDataPackage(), add a package member for a pid that already exists in DataONE (in another package, for example), and upload the package using uploadDataPackage(). In this case, uploadDataPackage() should not try to upload the 'common' data object to DataONE again, and should include this new package member in the resource map.
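A rough sketch of that requested workflow, for illustration only. The helper name `addSharedObject` and its arguments are hypothetical and not part of the dataone package. The sketch also assumes that the downloaded package contains exactly one metadata member with the given format id, and that a valid authentication token is configured for the member node.

```r
library(dataone)
library(datapack)

# Hypothetical helper (not part of dataone): add an object that already
# exists in DataONE to a downloaded package, then upload the package.
addSharedObject <- function(d1c, packageId, sharedId,
                            metadataFormat = "eml://ecoinformatics.org/eml-2.1.1") {
  # Download the existing package; lazyLoad avoids fetching the data bytes
  pkg <- getDataPackage(d1c, identifier = packageId, lazyLoad = TRUE, quiet = FALSE)

  # Fetch the shared object by pid so its existing identifier is kept
  sharedObj <- getDataObject(d1c, identifier = sharedId, lazyLoad = TRUE,
                             limit = "1GB", quiet = FALSE)

  # Assumption: exactly one metadata member has this format id
  metadataId <- selectMember(pkg, name = "sysmeta@formatId", value = metadataFormat)
  stopifnot(length(metadataId) == 1)

  # Record that the metadata object documents the shared object
  pkg <- addMember(pkg, sharedObj, mo = metadataId)

  uploadDataPackage(d1c, pkg, public = TRUE, quiet = FALSE)
}
```

With this in place, the expectation stated above is that `uploadDataPackage()` skips re-uploading the object identified by `sharedId` and only lists it in the resource map.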

@ranicrab

It would be great to get this implemented - we share species and site tables across multiple datasets and currently do not have a way to upload these datasets in R unless we create different identifiers for the same table; we would prefer not to do that if possible!

@gothub
Collaborator Author

gothub commented Sep 1, 2020

Existing DataONE objects can now be included in new DataPackages. The workflow to use is:

  • create a new DataPackage
  • create and add new DataObjects from local files if desired
  • create and add DataObjects from existing DataONE items if desired
  • upload the package to DataONE

Here is an example that adds one new DataObject and one existing item:

library(dataone)
library(datapack)

d1c <- D1Client("STAGING2", "urn:node:mnTestKNB")
dp <- new("DataPackage")

# Read in and create a metadata object that describes science data
emlFile <- system.file("extdata/strix-pacific-northwest.xml", package="dataone")
metadataObj <- new("DataObject", format="eml://ecoinformatics.org/eml-2.1.1", filename=emlFile)
dp <- addMember(dp, metadataObj)

# Create a new DataObject from a local file
progFile <- system.file("extdata/filterObs.R", package="dataone")
progObj <- new("DataObject", format="application/R", filename=progFile, mediaType="text/x-rsrc")
dp <- addMember(dp, progObj, metadataObj)

# Reuse an existing DataONE object (OwlNightj.csv).
# lazyLoad=TRUE fetches the system metadata without downloading the data bytes.
sourceObj <- getDataObject(d1c, id="urn:uuid:4dc4a896-31c2-4185-b1d7-ebb37f3f9cd6", lazyLoad=TRUE, limit="1GB", quiet=FALSE)
dp <- addMember(dp, sourceObj, metadataObj)

# Upload the data package to DataONE
pkgId <- uploadDataPackage(d1c, dp, public=TRUE, quiet=FALSE)

A minor fix was made to uploadDataPackage, which previously prevented downloaded items from being used. This fix was made in commit ce5f02c

@gothub gothub closed this as completed Sep 1, 2020
@gothub gothub added this to the 2.2.0 milestone Oct 12, 2020