A simple rdataone program error can create a package that is serialized as valid RDF/XML (resmap) but is not indexable by DataONE. rdataone should be able to catch the case shown below before the package is uploaded, and either print a warning or repair the package before upload.
The following program shows this case, with source comments indicating the erroneous lines and the effect.
# Download an existing DataPackage and replace one of its members
library(dataone)
library(datapack)
library(uuid)
d1c_test <- D1Client("STAGING", "urn:node:mnTestARCTIC")
packageID <- "resource_map_urn:uuid:825cee81-e676-4a58-9a32-054884376c0c"
dp <- getDataPackage(d1c_test, packageID, lazyLoad = TRUE, quiet = FALSE)
dataID <- selectMember(dp, name = "sysmeta@fileName", value = "OwlNightj.csv")
# The next line should be:
# dp <- replaceMember(dp, dataID, replacement=system.file("./extdata/pkg-example/binary.csv.zip", package="datapack"), formatId="application/zip")
# The erroneous line below does not assign the result back to 'dp', so the DataPackage 'dp' is not updated correctly; the package relationships become corrupted and the package is not indexable by DataONE
replaceMember(dp, dataID, replacement=system.file("./extdata/pkg-example/binary.csv.zip", package="datapack"), formatId="application/zip")
filePath <- file.path(sprintf("%s/%s.rdf", tempdir(), packageID))
status <- serializePackage(dp, filePath, id=packageID, resolveURI="https://cn-stage.test.dataone.org/cn/v2/resolve")
writeLines(readLines(filePath))
The resource map below shows that the pid urn:uuid:301e805a-66cf-41e1-99a4-2c459638802f does not have the DataONE 'resolve' URL, seen on lines 34 and 66:
This pid was from the original, downloaded package, and should have been deleted (then replaced) from the DataPackage 'dp', along with all its relationships.
When the package is serialized, all package members have their pids 'promoted' to include the DataONE resolve URL. Because this pid was no longer in the package member list but its relationships were, the pid was not 'promoted'. This caused a problem for the D1 indexer, because the pid was shown as 'isDocumentedBy' but was not resolvable.
rdataone should be able to detect this type of pid: one that has isDocumentedBy relationships but is no longer a package member.
Then a warning could be printed, or the offending relationships removed.
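The check proposed above could be sketched roughly as follows. This is only an illustration, not existing rdataone or datapack code: `findOrphanedPids` is a hypothetical helper, the identifiers in the example are made up, and the sketch assumes the relationships are available as a data frame with `subject`, `predicate` and `object` columns holding full CiTO predicate URIs and unpromoted pids.

```r
# Hypothetical helper (not part of rdataone or datapack): return identifiers
# that appear in 'documents' / 'isDocumentedBy' relationships but are not
# in the list of package member identifiers.
findOrphanedPids <- function(relations, memberIds) {
  cito <- "http://purl.org/spar/cito/"
  predicates <- c(paste0(cito, "documents"), paste0(cito, "isDocumentedBy"))
  # Keep only the documentation relationships
  docRels <- relations[relations$predicate %in% predicates, , drop = FALSE]
  # Every pid mentioned on either side of those relationships
  referenced <- unique(c(as.character(docRels$subject), as.character(docRels$object)))
  # Pids that are referenced but are no longer package members
  setdiff(referenced, memberIds)
}

# Made-up example: 'urn:uuid:old-data' was replaced, but its relationships remain
relations <- data.frame(
  subject   = c("urn:uuid:metadata-1", "urn:uuid:old-data"),
  predicate = c("http://purl.org/spar/cito/documents",
                "http://purl.org/spar/cito/isDocumentedBy"),
  object    = c("urn:uuid:old-data", "urn:uuid:metadata-1"),
  stringsAsFactors = FALSE
)
memberIds <- c("urn:uuid:metadata-1", "urn:uuid:new-data")

orphans <- findOrphanedPids(relations, memberIds)
if (length(orphans) > 0) {
  message("Relationships reference pids that are not package members: ",
          paste(orphans, collapse = ", "))
}
```

With a real DataPackage, the inputs might be obtained with datapack's `getRelationships(dp)` and `getIdentifiers(dp)` before calling `serializePackage()`, assuming the relationships still hold unpromoted pids at that point. Whether to warn or to drop the offending rows is left open, as in the proposal above.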