A collection of scripts that perform tasks using the CKAN API and the https://github.com/GSA/ckan-php-client library.
- PHP 7.0+ : http://php.net
$ git clone https://github.com/GSA/ckan-php-manager.git
Use composer to install/update dependencies
If you don't have Composer, install it:
$ curl -sS https://getcomposer.org/installer | php
$ mv composer.phar /usr/local/bin/composer
$ composer install
Copy inc/config.sample.php to inc/config.php and update it with your own values, if needed.
$ cp inc/config.sample.php inc/config.php
- Update cli/export_packages_by_org.php, editing the title of the exported organization, ORGANIZATION_TO_EXPORT
- Run the script using php
$ php cli/export_packages_by_org.php
The script takes all terms, including sub-agencies, from http://www.data.gov/app/themes/roots-nextdatagov/assets/Json/fed_agency.json and makes CKAN requests to find the packages that belong to the listed organizations.
Once the script has finished, results can be found in the /results/{timestamp} dir, including _{term}.log with package counts for each agency.
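The requests the script makes are not shown here. As a rough illustration only, a per-organization package count can be fetched from the CKAN `package_search` action. The sketch below is in Python rather than PHP; the helper names and the `usda-gov` organization name are assumptions, and the filter expects the organization's CKAN name, not its title.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def package_count_url(ckan_url, org_name):
    """Build a package_search URL that filters by organization name.

    rows=0 asks CKAN for the total count only, without the package records.
    """
    query = urlencode({"fq": f"organization:{org_name}", "rows": 0})
    return f"{ckan_url.rstrip('/')}/api/3/action/package_search?{query}"


def package_count(ckan_url, org_name):
    """Return the number of packages CKAN reports for the organization."""
    with urlopen(package_count_url(ckan_url, org_name)) as resp:
        body = json.load(resp)
    return body["result"]["count"]


if __name__ == "__main__":
    # "usda-gov" is a hypothetical organization name used for illustration.
    print(package_count_url("https://catalog.data.gov", "usda-gov"))
```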
To add the tag add_legacy_dms_and_make_private to all datasets of a group:
- Update ORGANIZATION_TO_TAG in cli/add_legacy_dms_and_make_private.php
- Double check CKAN_URL and CKAN_API_KEY for editing datasets
- Run the script
$ php cli/add_legacy_dms_and_make_private.php
- Put csv files in the /data dir, named assign_<any-title>.csv (the assign_ prefix is required). The format of these files must be: dataset, group, categories
The first line is the header; keep it as the first line of each file:
dataset,group,categories
Then put one dataset per line.
- Dataset can be:
  * Dataset url, e.g. https://catalog.data.gov/dataset/food-access-research-atlas
  * Dataset name, e.g. download-crossing-inventory-data-highway-rail-crossing
  * Dataset id
- Group: just one group per line. If you need to add multiple groups, create another row in the csv with the same dataset and another group, because all the categories in a row are tagged under that row's group. Make sure your group exists in your CKAN instance (to list all existing groups, go to http://catalog.data.gov/api/3/action/group_list?all_fields=true , replacing catalog.data.gov with your CKAN domain)
- Categories: one or multiple categories for the current row's group, separated by a semicolon ;
Example csv file:
dataset, group, categories
https://catalog.data.gov/dataset/food-access-research-atlas,Agriculture,"Natural Resources and Environment"
aerial-image-of-alaskas-arctic-coastal-plain-1955,Climate,"Arctic; Arctic Ocean, Sea Ice and Coasts; Permafrost and Arctic Landscapes"
28d30c1f-75a5-4042-b0fc-de26cc7d70f2,Climate,Arctic; Arctic Development and Transport
- Double check CKAN_URL and CKAN_API_KEY for editing datasets, defined in inc/config.php
- Run the script
$ php cli/tagging/assign_groups_and_tags.php
- Detailed logs and results are stored in the folder results/[time-stamp]_ASSIGN_GROUPS
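To check an assign_ file before running the script, the csv format described above can be read with a few lines of code. This is a minimal Python sketch of that format, not the script's own parser. The function name is hypothetical, and reducing a dataset url to its last path segment is an assumption about how urls map to dataset names.

```python
import csv
import io


def parse_assign_csv(text):
    """Parse csv text into (dataset, group, [categories]) tuples."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    next(reader)  # skip the header line: dataset,group,categories
    rows = []
    for record in reader:
        if not record:
            continue  # ignore blank lines
        # Pad short rows so a missing categories column becomes empty.
        dataset, group, categories = (record + ["", ""])[:3]
        dataset = dataset.strip()
        if dataset.startswith(("http://", "https://")):
            # Assumption: the dataset name is the last path segment of the url.
            dataset = dataset.rstrip("/").rsplit("/", 1)[-1]
        # Categories are separated by semicolons; commas may appear inside one.
        cats = [c.strip() for c in categories.split(";") if c.strip()]
        rows.append((dataset, group.strip(), cats))
    return rows


EXAMPLE = """dataset, group, categories
https://catalog.data.gov/dataset/food-access-research-atlas,Agriculture,"Natural Resources and Environment"
aerial-image-of-alaskas-arctic-coastal-plain-1955,Climate,"Arctic; Arctic Ocean, Sea Ice and Coasts; Permafrost and Arctic Landscapes"
28d30c1f-75a5-4042-b0fc-de26cc7d70f2,Climate,Arctic; Arctic Development and Transport
"""

if __name__ == "__main__":
    for row in parse_assign_csv(EXAMPLE):
        print(row)
```

Note that a categories value containing commas must be quoted, as in the second example row, otherwise the csv reader splits it into extra columns.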
- Prepare csv files in the same format as for the previous script and put them in the /data dir, named remove_<any-title>.csv
$ php cli/tagging/remove_groups_and_tags.php
- This command will remove listed categories from the dataset of the row. If an empty list of categories is provided, this command will remove the group and all categories from the dataset.
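The removal rule above can be illustrated on a plain in-memory mapping. This Python sketch only models the rule as described. The function name and the {group: [categories]} structure are hypothetical, and say nothing about how the script or CKAN actually stores groups and categories.

```python
def remove_groups_and_tags(assignments, group, categories):
    """Apply one csv row to a {group: [categories]} mapping.

    Returns a new mapping; the input is left unchanged.
    """
    updated = {name: list(cats) for name, cats in assignments.items()}
    if group not in updated:
        return updated  # nothing to remove for this dataset
    if not categories:
        # Empty category list: drop the group and all of its categories.
        del updated[group]
    else:
        # Otherwise remove only the listed categories and keep the group.
        updated[group] = [c for c in updated[group] if c not in categories]
    return updated


if __name__ == "__main__":
    before = {"Climate": ["Arctic", "Permafrost"], "Agriculture": ["Food"]}
    print(remove_groups_and_tags(before, "Climate", ["Arctic"]))
    print(remove_groups_and_tags(before, "Climate", []))
```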
CKAN API documentation: http://docs.ckan.org/en/latest/api/index.html
To minimize system requirements, we've added a minimal docker-compose setup. It is intended to replace the usage instructions above as the default workflow.
$ docker-compose build
$ docker-compose run --rm app php cli/harvest_stats_csv.php
Run the tests.
$ docker-compose run --rm app phpunit