Make user agent string customisable #557

Merged: 5 commits, Nov 28, 2024
3 changes: 2 additions & 1 deletion .github/workflows/test.yml
@@ -32,6 +32,7 @@ jobs:
    runs-on: ubuntu-latest
    container:
      image: ${{ matrix.ckan-image }}
      options: --user root
    services:
      solr:
        image: ckan/ckan-solr:${{ matrix.ckan-version }}-solr9
@@ -63,7 +64,7 @@ jobs:
      - name: Install requirements (2.9)
        run: |
          pip install -U pytest-rerunfailures
        if: ${{ matrix.ckan-version == '2.9' }}
      - name: Setup extension (CKAN >= 2.9)
        run: |
          ckan -c test.ini db init
5 changes: 4 additions & 1 deletion README.rst
@@ -232,7 +232,7 @@ For example, in case you want to retain changes made by the users to the fields
Command line interface
======================

The ``ckan harvester`` command provides utilities to manage harvest operations from the command line.
Please refer to the help message of each command for more details::


@@ -329,6 +329,9 @@ field. The currently supported configuration options are:
*   api_key: If the remote CKAN instance has restricted access to the API, you
    can provide a CKAN API key, which will be sent in any request.

*   user_agent: Set a custom User-Agent string for requests made during the
    gather and fetch stages, for remote servers that allow or block specific
    values.

*   read_only: Create harvested packages in read-only mode. Only the user who
    performed the harvest (the one defined in the previous setting or the
    'harvest' sysadmin) will be able to edit and administer the packages
5 changes: 5 additions & 0 deletions ckanext/harvest/harvesters/ckanharvester.py
@@ -35,6 +35,11 @@ def _get_search_api_offset(self):
    def _get_content(self, url):

        headers = {}

        user_agent = self.config.get('user_agent')
        if user_agent:
            headers['User-Agent'] = str(user_agent)

        api_key = self.config.get('api_key')
        if api_key:
            headers['Authorization'] = api_key
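
The header logic in this hunk can be tried out in isolation. The following is a minimal sketch, not code from ckanext-harvest: it assumes a plain dict stands in for ``self.config``, and the function name ``build_request_headers`` and the sample configuration values are hypothetical, chosen only for illustration.

```python
import json


def build_request_headers(config):
    """Mirror the header logic of _get_content for a plain config dict.

    Hypothetical standalone helper; the real harvester builds the
    headers inline from self.config.
    """
    headers = {}

    # Only send a custom User-Agent when the source config provides one.
    user_agent = config.get('user_agent')
    if user_agent:
        # str() guards against non-string JSON values such as numbers.
        headers['User-Agent'] = str(user_agent)

    # The API key, when present, is sent as the Authorization header.
    api_key = config.get('api_key')
    if api_key:
        headers['Authorization'] = api_key

    return headers


# Hypothetical harvest source configuration, as it might be written
# in the source's JSON configuration field.
source_config = json.loads(
    '{"user_agent": "my-harvester/1.0", "api_key": "example-key"}'
)
print(build_request_headers(source_config))
print(build_request_headers({}))
```

With an empty configuration no headers are set, so whatever default User-Agent the underlying HTTP client uses would presumably apply.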