-
Notifications
You must be signed in to change notification settings - Fork 19
/
v01-dataone-overview.Rmd
98 lines (75 loc) · 3.6 KB
/
v01-dataone-overview.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
---
title: "dataone Package Overview"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{dataone Package Overview}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
%\usepackage[utf8]{inputenc}
---
## *dataone* R Package Overview
The *dataone* R package enables R scripts to search, download and upload science data and metadata
to the [DataONE Federation](https://www.dataone.org). This package calls DataONE
web services that allow client programs to interact with [Member Nodes (MN)](https://www.dataone.org/current-member-nodes)
and DataONE Coordinating Nodes (CN).
## Quick Start
See the full manual (`help dataone`) for documentation.
To search the DataONE Federation Member Node *Knowledge Network for Biocomplexity (KNB)* for a dataset:
```{r, warning=FALSE, eval=FALSE}
library(dataone)
cn <- CNode("PROD")
mn <- getMNode(cn, "urn:node:KNB")
mySearchTerms <- list(q="abstract:salmon+AND+keywords:acoustics+AND+keywords:\"Oncorhynchus nerka\"",
fl="id,title,dateUploaded,abstract,size",
fq="dateUploaded:[2013-01-01T00:00:00.000Z TO 2014-01-01T00:00:00.000Z]",
sort="dateUploaded+desc")
result <- query(mn, solrQuery=mySearchTerms, as="data.frame")
result[1,c("id", "title")]
pid <- result[1,'id']
```
The metadata file located in the above search can be downloaded with the commands:
```{r, warning=FALSE, eval=FALSE}
library(XML)
metadata <- rawToChar(getObject(mn, pid))
```
The metadata file that describes the located research can be viewed in an XML viewer or text editor, once
it is written to a disk file. This file details a data file (CSV) that can be obtained using the listed
identifier, using the commands:
```{r, warning=FALSE, eval=FALSE}
dataRaw <- getObject(mn, "df35d.443.1")
dataChar <- rawToChar(dataRaw)
theData <- textConnection(dataChar)
df <- read.csv(theData, stringsAsFactors=FALSE)
df[1,]
```
Uploading a CSV file to a DataONE Member Node requires user authentication. DataONE user
authentication is described in the vignette `DataONE-Federation`.
Once the authentication steps have been followed, uploading is done with:
```{r, warning=FALSE,eval=FALSE}
library(datapack)
library(uuid)
d1c <- D1Client("STAGING", "urn:node:mnStageUCSB2")
id <- paste("urn:uuid:", UUIDgenerate(), sep="")
testdf <- data.frame(x=1:10,y=11:20)
csvfile <- paste(tempfile(), ".csv", sep="")
write.csv(testdf, csvfile, row.names=FALSE)
# Build a DataObject containing the csv, and upload it to the Member Node
d1Object <- new("DataObject", id, format="text/csv", filename=csvfile)
```
```{r, warning=FALSE, eval=FALSE}
uploadDataObject(d1c, d1Object, public=TRUE)
```
## Additional Resources
The `dataone` R package vignettes can be viewed using the R `vignette` command,
for example `vignette("dataone-overview")`.
The `dataone` vignettes describe these topics:
- Searching for data in DataONE described in vignette *searching-dataone*.
- Downloading data from DataONE described in vignette *download-data*
- Uploading data to DataONE is described in vignettes *upload-data* and *update-package*.
- The DataONE Federation is described in the vignettes *DataONE-Federation*.
## Acknowledgements
Work on this package was supported by:
- NSF-ABI grant #1262458 to C. Gries, M. Jones, and S. Collins.
- NSF-DATANET grants #0830944 and #1430508 to W. Michener, M. Jones, D. Vieglais, S. Allard and P. Cruse
Additional support was provided for working group collaboration by the National Center for Ecological Analysis and Synthesis, a Center funded by the University of California, Santa Barbara, and the State of California.