Block harvest_source_list API endpoint on catalog #4725

Closed
FuhuXia opened this issue May 1, 2024 · 1 comment
Comments

@FuhuXia
Member

FuhuXia commented May 1, 2024

The harvest_source_list endpoint from ckanext-harvest includes deleted harvest sources in its result, but anonymous users are not supposed to see deleted packages. The API also does not support pagination. To show all of the catalog's harvest sources, we have to set a very high limit (2000?) so that all current (active) and deleted (inactive) sources come back in one API call, which is very slow.

I think we should block this API endpoint and guide users to alternative APIs:

  1. Call this API to get all harvest sources in paginated results:
    https://catalog.data.gov/api/action/package_search?fq=(dataset_type:harvest)&fl=id,name,url,organization&rows=1000

  2. Get details on a specific source with this API. You can use either id or name:
    https://catalog.data.gov/api/action/harvest_source_show?id=energy-json
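
A minimal sketch of how a client could walk the paginated package_search results from option 1. It assumes the standard CKAN `start`/`rows` paging parameters and the usual `result.count` / `result.results` response shape. The `fetch` callable and the function name are hypothetical helpers introduced only for illustration; `fetch` stands in for whatever HTTP client actually issues the request.

```python
from typing import Callable, Dict, Iterator

# Hypothetical helper type: takes query parameters for package_search and
# returns the decoded JSON response body as a dict.
Fetch = Callable[[Dict[str, object]], dict]


def iter_harvest_sources(fetch: Fetch, rows: int = 1000) -> Iterator[dict]:
    """Yield every harvest source, requesting one page of `rows` at a time."""
    start = 0
    while True:
        body = fetch({
            "fq": "(dataset_type:harvest)",      # same filter as in the issue
            "fl": "id,name,url,organization",    # same field list as in the issue
            "rows": rows,
            "start": start,                      # assumed CKAN paging offset
        })
        result = body["result"]
        page = result["results"]
        if not page:
            return                               # empty page: nothing left
        yield from page
        start += len(page)
        if start >= result["count"]:
            return                               # all matching sources seen
```

With the `requests` library, `fetch` could be something like `lambda p: requests.get("https://catalog.data.gov/api/action/package_search", params=p, timeout=30).json()`.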

How to reproduce

https://catalog.data.gov/api/action/harvest_source_list

Search for active: false in the result.
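
The check above could be scripted roughly as follows. This is only a sketch: it assumes the decoded response carries the sources as a list under `result` and that each entry has a boolean `active` field, which is inferred from the reproduction step rather than confirmed here. The function name is hypothetical.

```python
from typing import List


def inactive_sources(response: dict) -> List[dict]:
    """Return the entries whose `active` flag is explicitly False.

    `response` is the decoded JSON body of a harvest_source_list call
    (assumed shape: {"result": [{"active": bool, ...}, ...]}).
    """
    return [
        source
        for source in response.get("result", [])
        if source.get("active") is False
    ]
```

A non-empty return value for an anonymous request would reproduce the problem described in this issue.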

Sketch

We have a list of blocked API endpoints in the nginx config:

https://github.com/GSA/catalog.data.gov/blob/8dda50797980f40d6921aa3e299087ddfe31d8c9/proxy/nginx-common.conf#L27-L44

@FuhuXia FuhuXia added the bug Software defect or bug label May 1, 2024
@gujral-rei

Redirect calls to this endpoint to the package_search API call instead.
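
As a rough illustration of the redirect idea, the mapping from the blocked path to the package_search URL given earlier in this issue might look like the sketch below. The function name and the path check are assumptions made for illustration. The actual redirect would presumably live in the proxy configuration, not in Python.

```python
from typing import Optional
from urllib.parse import urlencode

BASE = "https://catalog.data.gov/api/action"


def redirect_target(request_path: str) -> Optional[str]:
    """Return the package_search URL for harvest_source_list requests.

    Returns None for any other path, meaning no redirect applies.
    """
    if request_path.rstrip("/") != "/api/action/harvest_source_list":
        return None
    # Same query as the alternative API call suggested in the issue.
    query = urlencode({
        "fq": "(dataset_type:harvest)",
        "fl": "id,name,url,organization",
        "rows": 1000,
    })
    return f"{BASE}/package_search?{query}"
```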

@gujral-rei gujral-rei moved this to 📔 Product Backlog in data.gov team board May 2, 2024
@btylerburton btylerburton moved this from 📔 Product Backlog to 📥 Queue in data.gov team board Oct 10, 2024
@FuhuXia FuhuXia self-assigned this Nov 22, 2024
@FuhuXia FuhuXia closed this as completed Nov 25, 2024
@github-project-automation github-project-automation bot moved this from 👀 Needs Review [2] to ✔ Done in data.gov team board Nov 25, 2024