Update endpoint for Fluxnet2015 #3172

Open
meetagrawal09 opened this issue May 22, 2023 · 3 comments

Comments

@meetagrawal09
Collaborator

Bug Description

The endpoint used to retrieve Fluxnet2015 data does not work.

To Reproduce

Run the code block below:

# PEcAn site names carry the site ID in parentheses; the regex below extracts it.
sitename   <- 'Niwot Ridge Forest/LTER NWT1 (US-NR1)'

site <- sub(".* \\((.*)\\)", "\\1", sitename)

url <- "http://wile.lbl.gov:8080/AmeriFlux/DataDownload.svc/datafileURLs"
json_query <- paste0("{
              \"username\":\"", "pecan", "\",
              \"siteList\":[\"", site, "\"],
              \"intendedUse\":\"Research - Land model/Earth system model\",
              \"description\":\"PEcAn download\",
              \"dataProduct\":\"SUBSET\",
              \"policy\":\"TIER1\"}")
              
result <- httr::POST(url, body = json_query, encode = "json", httr::add_headers(`Content-Type` = "application/json"))
link <- httr::content(result)
print(link)

This should print an FTP link, but the current output is:

[1] "This endpoint is no longer available. Please contact [email protected] for assistance."

Expected behavior

We should be able to get back a valid FTP link after a call to the API.

@mdietze
Member

mdietze commented May 22, 2023

FLUXNET2015 data is still available at https://fluxnet.org/data/download-data/. This requires a login but sends you to https://ftp.fluxdata.org for download. I think it would violate the spirit (if not the letter) of their system to attempt to automate the download from their site. I see three options, but am open to other ideas:

  1. Deprecate support for FLUXNET2015 (not ideal since it's an open dataset that includes a lot of valuable tower data)
  2. Provide users who want to use this data with instructions on how to download from fluxnet.org and then how to tell PEcAn where that download is so that the workflow can pick up from a manual download and proceed on to met2CF, gapfilling, and model conversions
  3. Request permission from FLUXNET to download the entire dataset, convert it all to CF, and redistribute internally within the PEcAn workflow (e.g. on one of the PEcAn servers via the PEcAn API or an S3 bucket).

Both 2 and 3 are a nontrivial amount of work, but 2 is probably more sustainable (we don't need to maintain a file store) and more likely acceptable to FLUXNET. If we go down that route, in addition to developing the required documentation and vignette, we should make sure to remove FLUXNET2015 from the PEcAn tutorials. For integration testing we will either need to drop FLUXNET2015 or cache the raw downloaded files for the sites that are part of the test so that we can verify that the remaining steps work correctly. This (partially) recreates some of the option 3 problems, as this data volume would be much larger than a unit test, and thus shouldn't live in the data.atm package itself.
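As a rough illustration of option 2, the workflow could look for a manually downloaded archive before moving on to met2CF. The sketch below is only an assumption about how that lookup might work: `find_fluxnet2015_download` is a hypothetical helper, not an existing PEcAn function, and the file name pattern (`FLX_<site>_FLUXNET2015_<variant>_<years>_<version>.zip`) is assumed from the usual FLUXNET2015 archive naming.

```r
# Hypothetical helper: locate a manually downloaded FLUXNET2015 archive.
# Assumes archives are named FLX_<site>_FLUXNET2015_<variant>_<years>_<version>.zip
find_fluxnet2015_download <- function(site_id, download_dir, variant = "SUBSET") {
  pattern <- paste0("^FLX_", site_id, "_FLUXNET2015_", variant, "_.*\\.zip$")
  hits <- list.files(download_dir, pattern = pattern, full.names = TRUE)
  if (length(hits) == 0) {
    stop("No FLUXNET2015 ", variant, " archive for ", site_id, " in ",
         download_dir, ". Download it manually from ",
         "https://fluxnet.org/data/download-data/ first.")
  }
  # If several versions are present, return the first match.
  hits[[1]]
}
```

A call such as `find_fluxnet2015_download("US-NR1", "~/fluxnet2015")` would then either return the archive path or stop with a message pointing the user to the manual download page.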

@ankurdesai
Contributor

Yes, Ameriflux changed their API for accessing Ameriflux, Fluxnet, and Fluxnet2015 data earlier in 2022. There was a thread on this somewhere, which I will reproduce here. I think we updated this for Ameriflux using Housen's amerifluxr package but not for the others. However, it looks like it isn't too hard a fix:

From [email protected] Sept 23, 2021

We are overhauling our Web service stack, and the download service will be available via a different endpoint, with a slightly different payload and response (nothing major). This is due in part to the support of the newer CC-By4.0 policy as well.

The new service will be available via https://amfcdn.lbl.gov/api/v1/data_download and the payload will be something like the following (everything in the json after the -d flag in the sample curl request):

curl -X POST -d '{"user_id": <user_id>, "user_email": <user_email>, "data_product": <BASE-BADM|FLUXNET2015|FLUXNET-CH4>, "data_variant": <FULLSET|SUBSET> (optional for BASE-BADM), "data_policy": <CCBY4.0|LEGACY|TIER2> (Tier2 for FLUXNET products, LEGACY for BASE_BADM), "site_ids": ["site_id1", "site_id2", ...], "intended_use": <intended_use>, "description": <description>,
"is_test": true (optional argument for testing purposes of the service, download emails will not be sent to site teams)}' https://amfcdn.lbl.gov/api/v1/data_download

It will probably also be necessary to let us know the request IP/domain for where the requests will be coming in so that we can identify valid sources.

Please let us know if you need further assistance and we will work with you or someone else to continue providing access to AmeriFlux data products.

Best, -- You-Wei


Sept 24, 2021 from You-Wei

I poked around the R repo, and this code for FLUXNET2015 will need similar changes if it is still used: https://github.com/PecanProject/pecan/blob/develop/modules/data.atmosphere/R/download.Fluxnet2015.R on lines 34 and 35 as well.

The endpoints to get the policies are here:
https://amfcdn.lbl.gov/api/v1/site_availability/FLUXNET/FLUXNET2015
https://amfcdn.lbl.gov/api/v1/data_availability/FLUXNET/FLUXNET2015/CCBY4.0
https://amfcdn.lbl.gov/api/v1/data_availability/FLUXNET/FLUXNET2015/TIER2
Best, -- You-Wei
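
The availability endpoints listed above could be queried from R roughly as follows. This is a sketch only: the function names are hypothetical, and the structure of the response is not described in this thread, so the parsed content is returned without further processing.

```r
# Hypothetical helper: build the data availability URL for one data policy.
fluxnet2015_availability_url <- function(policy = c("CCBY4.0", "TIER2"),
                                         base = "https://amfcdn.lbl.gov/api/v1") {
  policy <- match.arg(policy)
  paste(base, "data_availability", "FLUXNET", "FLUXNET2015", policy, sep = "/")
}

# Hypothetical helper: fetch the availability listing (requires the httr package).
get_fluxnet2015_availability <- function(policy = "CCBY4.0") {
  result <- httr::GET(fluxnet2015_availability_url(policy))
  httr::stop_for_status(result)  # fail loudly on HTTP errors
  httr::content(result)          # response layout is not documented here
}
```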


From Housen Chu [email protected] Jan 18, 2022

A small update on this. I have worked out the fixes to Pecan to download Ameriflux. However, I learned on twitter that Housen has a new R package for downloading Ameriflux data: https://github.com/chuhousen/amerifluxr. Do you think it would be better if I replaced our Pecan download code with that one, knowing it will be maintained and has the same functionality?


Housen Chu:

I plan to submit it to CRAN soon, but feel free to use it as of now. I think you may need to modify a small piece of code (e.g., remove the data policy acknowledgment prompt) for PEcAn if it runs regularly. Also, I haven't included the FLUXNET product in this version, given that it's still changing frequently.


You might follow up by contacting ameriflux-support.
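
For reference, the payload from the Sept 23, 2021 message could be put together in R along these lines. This is a minimal sketch under several assumptions: the function names are hypothetical, `user_id` and `user_email` must come from a registered account, the `intended_use` and `description` values are simply carried over from the old request and may not be accepted by the new service, and the response format is not known from this thread. The request IP may also need to be registered with AmeriFlux first, as the message notes.

```r
# Hypothetical helper: build the request body for the new download endpoint.
build_fluxnet2015_payload <- function(site_ids, user_id, user_email,
                                      data_variant = "SUBSET",
                                      data_policy = "CCBY4.0",
                                      is_test = TRUE) {
  stopifnot(data_variant %in% c("FULLSET", "SUBSET"),
            data_policy %in% c("CCBY4.0", "TIER2"))
  list(
    user_id = user_id,
    user_email = user_email,
    data_product = "FLUXNET2015",
    data_variant = data_variant,
    data_policy = data_policy,
    site_ids = as.list(site_ids),  # a list stays a JSON array even for one site
    intended_use = "Research - Land model/Earth system model",  # assumed value
    description = "PEcAn download",
    is_test = is_test  # TRUE: no download emails are sent to site teams
  )
}

# Hypothetical helper: send the request (requires the httr package).
request_fluxnet2015 <- function(payload,
                                url = "https://amfcdn.lbl.gov/api/v1/data_download") {
  result <- httr::POST(url, body = payload, encode = "json")
  httr::stop_for_status(result)
  httr::content(result)
}
```

Keeping `is_test = TRUE` as the default means that trial runs during development should not notify site teams. A real download would set it to `FALSE`.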


This issue is stale because it has been open 365 days with no activity.
