Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/issue 186 Implement API keys #188

Merged
merged 17 commits into from
Jun 18, 2024
Merged

Conversation

nikki-t
Copy link
Collaborator

@nikki-t nikki-t commented Jun 5, 2024

Github Issue: #186

Description

Implement API keys to control usage.

  • Enable API keys for the Hydrocron API and create two keys, one for default users and another for more specialized or heavy use.
  • Create usage plans for each API key starting with expected access patterns and cost modelling for defaults.
  • Create a Lambda authorizer which will facilitate the use of API keys and usage plans to throttle requests and set quotas.

NOTE - We cannot implement the trusted key usage plan quite yet as we are waiting on platform to implement a solution that allows the passing of the trusted API key to our API endpoint.

Overview of work done

  • Created a Lambda authorizer function
  • Defined API Keys and Authorizer definition in OpenAPI specification
  • Defined API gateway API keys
  • Defined API gateway usage plans and associated with gateway stage
  • Defined SSM parameters to store API key values

Overview of verification done

Created three new unit tests

  1. Test the connection to SSM client
  2. Test default user API key IAM policy response in Lambda Authorizer
  3. Test trusted user API key IAM policy response in Lambda Authorizer

New and existing unit tests pass.

Overview of integration done

Deployed to SIT environment, reviewed API Gateway architecture, and ran the following commands to test.

Reach CSV (application/json)

curl --location 'https://soto.podaac.sit.earthdatacloud.nasa.gov/hydrocron/v1/timeseries?feature=Reach&feature_id=63470800171&start_time=2024-02-01T00:00:00%2b00:00&end_time=2024-10-30T00:00:00%2b00:00&output=csv&fields=reach_id,time_str,wse'
{
    "status": "200 OK",
    "time": 706.182,
    "hits": 2,
    "results": {
        "csv": "reach_id,time_str,wse,wse_units\n63470800171,2024-02-01T02:26:50Z,3386.9332,m\n63470800171,2024-02-08T13:48:41Z,1453.4136,m\n",
        "geojson": {}
    }
}

Node GeoJSON (application/geo+json)

curl --header "Accept: application/geo+json" --location 'https://soto.podaac.sit.earthdatacloud.nasa.gov/hydrocron/v1/timeseries?feature=Node&feature_id=12228200110861&start_time=2023-11-29T00:00:00Z&end_time=2024-10-30T00:00:00Z&output=geojson&fields=reach_id,node_id,time_str,wse'
{
    "type": "FeatureCollection",
    "features": [
        {
            "id": "0",
            "type": "Feature",
            "properties": {
                "reach_id": [
                    "12228200111",
                    "12228200111"
                ],
                "node_id": [
                    "12228200110861",
                    "12228200110861"
                ],
                "time_str": [
                    "2024-01-30T21:19:19Z",
                    "2024-02-06T08:37:09Z"
                ],
                "wse": [
                    "3389.616",
                    "1346.93836"
                ],
                "wse_units": [
                    "m",
                    "m"
                ]
            },
            "geometry": {
                "type": "Point",
                "coordinates": [
                    35.149314,
                    -10.256285
                ]
            }
        }
    ]
}

CSV accept header (text/csv)

curl --header "Accept: text/csv" --location 'https://soto.podaac.sit.earthdatacloud.nasa.gov/hydrocron/v1/timeseries?feature=Reach&feature_id=63470800171&start_time=2024-02-01T00:00:00%2b00:00&end_time=2024-10-30T00:00:00%2b00:00&fields=reach_id,time_str,wse'
"reach_id,time_str,wse,wse_units\n63470800171,2024-02-01T02:26:50Z,3386.9332,m\n63470800171,2024-02-08T13:48:41Z,1453.4136,m\n"

Sample CloudWatch log for authorizer (execution logs)

2024-06-04T20:37:15.241Z (xxx) Incoming identity: {method.request.header.User-Agent=*****8.6.0}
2024-06-04T20:37:15.241Z (xxx) Endpoint request URI: https://lambda.xxx.amazonaws.com/2015-03-31/functions/arn:aws:lambda:xxx:xxx:function:svc-hydrocron-sit-authorizer-lambda/invocations
2024-06-04T20:37:15.241Z (xxx) Endpoint request headers: [TRUNCATED]
2024-06-04T20:37:15.241Z (xxx) Endpoint request body after transformations: [TRUNCATED]
2024-06-04T20:37:15.241Z (xxx) Sending request to https://lambda.xxx.amazonaws.com/2015-03-31/functions/arn:aws:lambda:xxx:xxx:function:svc-hydrocron-sit-authorizer-lambda/invocations
2024-06-04T20:37:15.363Z (xxx) Authorizer result body before parsing: {"principalId": "default_user", "policyDocument": {"Version": "2012-10-17", "Statement": [{"Action": "execute-api:Invoke", "Effect": "Allow", "Resource": "arn:aws:execute-api:xxx:xxx:xxx/v1/GET/timeseries"}]}, "usageIdentifierKey": "xxx"}
2024-06-04T20:37:15.363Z (xxx) Using valid authorizer policy for principal: ******abc
2024-06-04T20:37:15.363Z (xxx) Successfully completed authorizer execution
2024-06-04T20:37:15.363Z (xxx) Verifying Usage Plan for request: xxx. API Key: **********************************123 API Stage: xxx/v1
2024-06-04T20:37:15.365Z (xxx) Usage Plan check succeeded for API Key **********************************123 and API Stage xxx/v1

Response when usage plan quota is hit:

{"message":"Limit Exceeded"}

PR checklist:

  • Linted
  • Updated unit tests
  • Updated changelog
  • Integration testing

See Pull Request Review Checklist for pointers on reviewing this pull request

@nikki-t nikki-t self-assigned this Jun 5, 2024
@nikki-t
Copy link
Collaborator Author

nikki-t commented Jun 5, 2024

This is a draft PR so that we can figure out starting defaults for throttling and the monthly requests quota.

In the meantime, I wanted to check on some decisions I made:

  1. The location of the Lambda authorizer is in hydrocron.api.controllers.authorizer. Does this seem okay with the current organization of code?
  2. Per this documentation,"A REQUEST authorizer receives the caller's identity in a combination of headers, query string parameters, stageVariables, and $context variables."
    1. The identity source that I selected was the User-Agent header since that gets passed by most (all?) requests and end users won't have to modify their current requests. Are there any thoughts or concerns around using this header? Should we pick a different one?

@frankinspace frankinspace changed the title Feature/issue 186 Feature/issue 186 Implement API keys Jun 6, 2024
@nikki-t
Copy link
Collaborator Author

nikki-t commented Jun 12, 2024

I updated the Lambda authorizer to use a x-hydrocron-key header to differentiate between a default user and trusted partner.

I was able to confirm that the trusted partner key counted against the request quota defined in the trusted partner usage plan. I could not confirm request throttling but would like to deploy this to UAT and test using the benchmark tests. We can manually edit the throttling parameters to confirm throttling and the error response that is returned.

I also confirmed that the user does not need to pass in an API key which will count against the default usage plan.

I think we are ready to deploy this to UAT after we finalize the quota and throttle parameters for the different usage plans.

@nikki-t
Copy link
Collaborator Author

nikki-t commented Jun 12, 2024

I updated the "timeseries" documentation to include a section on API keys. We need to decide:

  1. What we consider heavy usage to be and when a user should request an API key. It is currently defined as: "Heavy usage can be defined as continued used with over x requests per day or continue use which require many requests per second or concurrent requests." What should be consider x to be?

  2. How do users obtain API keys? The documentation currently says: "To request an API key or to discuss your use case, please contact us at x." Should we facilitate the request through an e-mail? If so what address do we use?

@frankinspace @torimcd @cassienickles - Interested in your thoughts!

@nikki-t
Copy link
Collaborator Author

nikki-t commented Jun 13, 2024

I updated the quota and throttle settings per #186 discussion. I set the "trusted_partner" usage plan to have a quota of 5 requests per month and set the rate per second to 1 and the concurrent requests to 1. I don't think this should matter since we aren't rolling the trusted partner API key quite yet but wanted to keep the numbers low. I think this PR is ready for review!

@nikki-t
Copy link
Collaborator Author

nikki-t commented Jun 18, 2024

@frankinspace - I think the updated changes are ready to be reviewed and then possibly deployed to UAT?

@frankinspace frankinspace merged commit 5d86e7a into develop Jun 18, 2024
9 checks passed
@frankinspace frankinspace deleted the feature/issue-186 branch June 18, 2024 17:58
nikki-t added a commit that referenced this pull request Jun 20, 2024
* /version 1.3.0a0

* Update build.yml

* /version 1.3.0a1

* /version 1.3.0a2

* Feature/issue 175 - Update docs to point to OPS (#176)

* changelog

* update examples, remove load_data readme, info moved to wiki

* Dependency update to fix snyk scan

* issues/101: Support for HTTP Accept header (#172)

* Reorganize timeseries code to  prep for Accept header

* Enable Accept header to return response of specific content-type

* Fix whitespace and string continuation

* Make error handling consistent and add an additional test where a reach can't be found

* Update changelog with issue for unreleased version

* Add 415 status code to API definition

* Few minor cleanup items

* Few minor cleanup items

* Update to [email protected]

* Fix dependencies

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.3.0a3

* issues/102: Support compression of API response (#173)

* Enable payload compression

* Update changelog with issue

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.3.0a4

* Feature/issue 100 Add option to 'compact' GeoJSON result into single feature (#177)

* Reorganize timeseries code to  prep for Accept header

* Enable Accept header to return response of specific content-type

* Fix whitespace and string continuation

* Make error handling consistent and add an additional test where a reach can't be found

* Update changelog with issue for unreleased version

* Add 415 status code to API definition

* Few minor cleanup items

* Few minor cleanup items

* Update to [email protected]

* Fix dependencies

* Update required query parameters based on current API functionality

* Enable return of 'compact' GeoJSON response

* Fix linting and add test data

* Update documentation for API accept headers and compact GeoJSON response

* Fix references to incorrect Accept header examples

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.3.0a5

* Feature/issue 183 (#185)

* Provide introduction to timeseries endpoint

* Remove _units in fields list

* Fix typo

* Update examples with Accept headers and compact query parameter

* Add issue to changelog

* Fix typo in timeseries documentation

* Update pymysql

* Update pymysql

* Provide clarity on accept headers and request parameter fields

* /version 1.3.0a6

* Feature/issue 186 Implement API keys (#188)

* API Gateway Lambda authorizer to facilitate API keys and usage plans

* Unit tests to test Lambda authorizer

* Fix terraform file formatting

* API Gateway Lambda Authorizer

- Lambda function
- API Keys and Authorizer definition in OpenAPI spec
- API gateway API keys
- API gateway usage plans
- SSM parameters for API keys

* Fix trailing whitespace

* Set default region environment variable

* Fix SNYK vulnerabilities

* Add issue to changelog

* Implement custom trusted partner header x-hydrocron-key

* Update cryptography for SNYK vulnerability

* Update documentation to include API key usage

* Update quota and throttle settings for API Gateway

* Update API keys documentation to indicate to be implemented

* Move API key lookup to Lambda INIT

* Remove API key authentication and update API key to x-hydrocron-key

* /version 1.3.0a7

* Update changelog for 1.3.0 release

* Update dependencies

* Fix trailing whitespace in README

* /version 1.3.0rc1

* Set api key source to authorizer

* /version 1.3.0rc2

---------

Co-authored-by: nikki-t <[email protected]>
Co-authored-by: Frank Greguska <[email protected]>
Co-authored-by: frankinspace <[email protected]>
Co-authored-by: Victoria McDonald <[email protected]>
@frankinspace frankinspace linked an issue Jun 22, 2024 that may be closed by this pull request
torimcd added a commit that referenced this pull request Sep 25, 2024
* /version 1.3.0a0

* Update build.yml

* /version 1.3.0a1

* /version 1.3.0a2

* Feature/issue 175 - Update docs to point to OPS (#176)

* changelog

* update examples, remove load_data readme, info moved to wiki

* Dependency update to fix snyk scan

* issues/101: Support for HTTP Accept header (#172)

* Reorganize timeseries code to  prep for Accept header

* Enable Accept header to return response of specific content-type

* Fix whitespace and string continuation

* Make error handling consistent and add an additional test where a reach can't be found

* Update changelog with issue for unreleased version

* Add 415 status code to API definition

* Few minor cleanup items

* Few minor cleanup items

* Update to [email protected]

* Fix dependencies

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.3.0a3

* issues/102: Support compression of API response (#173)

* Enable payload compression

* Update changelog with issue

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.3.0a4

* Feature/issue 100 Add option to 'compact' GeoJSON result into single feature (#177)

* Reorganize timeseries code to  prep for Accept header

* Enable Accept header to return response of specific content-type

* Fix whitespace and string continuation

* Make error handling consistent and add an additional test where a reach can't be found

* Update changelog with issue for unreleased version

* Add 415 status code to API definition

* Few minor cleanup items

* Few minor cleanup items

* Update to [email protected]

* Fix dependencies

* Update required query parameters based on current API functionality

* Enable return of 'compact' GeoJSON response

* Fix linting and add test data

* Update documentation for API accept headers and compact GeoJSON response

* Fix references to incorrect Accept header examples

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.3.0a5

* Feature/issue 183 (#185)

* Provide introduction to timeseries endpoint

* Remove _units in fields list

* Fix typo

* Update examples with Accept headers and compact query parameter

* Add issue to changelog

* Fix typo in timeseries documentation

* Update pymysql

* Update pymysql

* Provide clarity on accept headers and request parameter fields

* /version 1.3.0a6

* Feature/issue 186 Implement API keys (#188)

* API Gateway Lambda authorizer to facilitate API keys and usage plans

* Unit tests to test Lambda authorizer

* Fix terraform file formatting

* API Gateway Lambda Authorizer

- Lambda function
- API Keys and Authorizer definition in OpenAPI spec
- API gateway API keys
- API gateway usage plans
- SSM parameters for API keys

* Fix trailing whitespace

* Set default region environment variable

* Fix SNYK vulnerabilities

* Add issue to changelog

* Implement custom trusted partner header x-hydrocron-key

* Update cryptography for SNYK vulnerability

* Update documentation to include API key usage

* Update quota and throttle settings for API Gateway

* Update API keys documentation to indicate to be implemented

* Move API key lookup to Lambda INIT

* Remove API key authentication and update API key to x-hydrocron-key

* /version 1.3.0a7

* Update changelog for 1.3.0 release

* /version 1.4.0a0

* Feature/issue 198 (#207)

* Update pylint to deal with errors and fix collection reference

* Initial CMR and Hydrocron queries

- Includes placeholders for other operations needed to track granule
ingest.
- GranuleUR query for Hydrocron tables.

* Add and set up vcrpy for testing CMR API query

* Test track ingest operations

- Test CMR and hydrocron queries
- Test granuleUR query
- Update database to include granuleUR GSI

* Update to use track_ingest naming consistently

* Initial Lambda function and IAM role definition

* Replace deprecated path function with as_file

* Add SSM read IAM permissions

* Add DynamoDB read permissions

* Update track ingest lambda memory

* Remove duplicate IAM permissions

* Add in permissions to query index

* Update changelog

* Update changelog description

* Use python_cmr for CMR API queries

* /version 1.4.0a1

* Add doi to documentation pages (#216)

* Update intro.md with DOI

* Update overview.md with DOI

* /version 1.4.0a2

* issue-193: Add Dynamo DB Table for SWOT Prior Lakes (#209)

* add code to handle prior lakes shapefiles, add test prior lake data

* update terraform to add prior lake table

* fix tests, change to smaller test data file, changelog

* linting

* reconfigure main load_data method to make more readable and pass linting

* lint

* lint

* fix string casting to lower storage req & update test responses to handle different rounding pattern in coords

* update load benchmarking function for linting and add unit test

* try parent collection for lakes

* update version parsing for parent collection

* fix case error

* fix lake id reference

* add logging to troubleshoot too large features

* add item size logging and remove error raise for batch write

* clean up logging statements & move numeric_columns assignment

* update batch logging statement

* Rename constant

* Fix temp dir security risk https://rules.sonarsource.com/python/RSPEC-5443/

* Fix temp dir security risk https://rules.sonarsource.com/python/RSPEC-5443/

* fix code coverage calculation

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a3

* Feature/issue 201 Create a table for tracking granule ingest status (#214)

* Define track ingest database and IAM permissions

* Update changelog with issue

* Modify table structure to support sparse status index

* Updated to only apply PITR in ops

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a4

* Feature/issue 210 - Load large geometry polygons (#219)

* add functions to handle null geometries and convert polygons to points

* update doi in docs

* fix fill null geometries

* fix tests and update changelog

* /version 1.4.0a5

* Feature/issue 222 - Add granule info to track ingest table on load (#223)

* adjust lambdas to populate track ingest table on granule load

* changelog

* remove test cnm

* lint

* change error caught when handling checksum

* update lambda role permissions to write to track ingest table

* fix typo on lake table terraform

* set default fill values for checksum and rev date in track status

* fix checksum handling in bulk load data

* lint

* add logging to debug

* /version 1.4.0a6

* Feature/issue-225: Create one track ingest table per feature type (#226)

* add track ingest tables for each feature type and adjust load data to populate

* changelog

* /version 1.4.0a7

* Feature/issue 196 Add new feature type to query the API for lake data (#224)

* Initial API queries for lake data

* Unit tests for lake data

* Updates after center point calculations

- Removed temp code to calculate a point in API
- Implemented unit test to test lake data retrieval
- Updated fixtures to load in lake data for testing

* Add read lake table permissions to lambda timeseries and track ingest roles

* Update documenation to include lake data

* Updated documentation to include info on lake centerpoints

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a8

* Feature/issue 205 - Add Confluence API key (#221)

* Fix possible variable references before value is assigned

* Define Confluence API key and trusted partner plan limits

* Define a list of trusted partner keys and store under single parameter

* Define API keys as encrypted envrionment variables for Lambda authorizer

* Update authorizer and connection class to use KMS to retrieve API keys

* Hack to force lambda deployment when ssm value changes (#218)

* Add replace_triggered_by to hydrocron_lambda_authorizer

* Introduce environment variable that contains random id which will change whenever an API key value changes. This will force lambda to publish new version of the function.

* Remove unnecessary hash function

* Update to SSM parameter API key storage and null_resource enviroment variable

* Update Terraform and AWS provider

* Update API key documentation

* Set source_code_hash to force deployment of new image

* Downgrade AWS provider to 4.0 to remove inline policy errors

* Update docs/timeseries.md

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a9

* /version 1.4.0a10

* changelog for 1.4.0 release

* update dependencies for 1.4.0 release

* fix CMR query in UAT

* /version 1.4.0rc1

* fix typo in load_data lambda

* /version 1.4.0rc2

* fix index on rev date in load data lambda

* update dependencies

* lint readme

* /version 1.4.0rc3

* /version 1.4.0rc4

* fix cmr env search by venue

* /version 1.4.0rc5

---------

Co-authored-by: nikki-t <[email protected]>
Co-authored-by: Frank Greguska <[email protected]>
Co-authored-by: frankinspace <[email protected]>
Co-authored-by: Nikki Tebaldi <[email protected]>
Co-authored-by: Cassie Nickles <[email protected]>
Co-authored-by: cassienickles <[email protected]>
Co-authored-by: podaac-cicd[bot] <podaac-cicd[bot]@users.noreply.github.com>
Co-authored-by: torimcd <[email protected]>
frankinspace added a commit that referenced this pull request Oct 3, 2024
…gest" status (#227)

* /version 1.3.0a0

* Update build.yml

* /version 1.3.0a1

* /version 1.3.0a2

* Feature/issue 175 - Update docs to point to OPS (#176)

* changelog

* update examples, remove load_data readme, info moved to wiki

* Dependency update to fix snyk scan

* issues/101: Support for HTTP Accept header (#172)

* Reorganize timeseries code to  prep for Accept header

* Enable Accept header to return response of specific content-type

* Fix whitespace and string continuation

* Make error handling consistent and add an additional test where a reach can't be found

* Update changelog with issue for unreleased version

* Add 415 status code to API definition

* Few minor cleanup items

* Few minor cleanup items

* Update to [email protected]

* Fix dependencies

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.3.0a3

* issues/102: Support compression of API response (#173)

* Enable payload compression

* Update changelog with issue

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.3.0a4

* Feature/issue 100 Add option to 'compact' GeoJSON result into single feature (#177)

* Reorganize timeseries code to  prep for Accept header

* Enable Accept header to return response of specific content-type

* Fix whitespace and string continuation

* Make error handling consistent and add an additional test where a reach can't be found

* Update changelog with issue for unreleased version

* Add 415 status code to API definition

* Few minor cleanup items

* Few minor cleanup items

* Update to [email protected]

* Fix dependencies

* Update required query parameters based on current API functionality

* Enable return of 'compact' GeoJSON response

* Fix linting and add test data

* Update documentation for API accept headers and compact GeoJSON response

* Fix references to incorrect Accept header examples

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.3.0a5

* Feature/issue 183 (#185)

* Provide introduction to timeseries endpoint

* Remove _units in fields list

* Fix typo

* Update examples with Accept headers and compact query parameter

* Add issue to changelog

* Fix typo in timeseries documentation

* Update pymysql

* Update pymysql

* Provide clarity on accept headers and request parameter fields

* /version 1.3.0a6

* Feature/issue 186 Implement API keys (#188)

* API Gateway Lambda authorizer to facilitate API keys and usage plans

* Unit tests to test Lambda authorizer

* Fix terraform file formatting

* API Gateway Lambda Authorizer

- Lambda function
- API Keys and Authorizer definition in OpenAPI spec
- API gateway API keys
- API gateway usage plans
- SSM parameters for API keys

* Fix trailing whitespace

* Set default region environment variable

* Fix SNYK vulnerabilities

* Add issue to changelog

* Implement custom trusted partner header x-hydrocron-key

* Update cryptography for SNYK vulnerability

* Update documentation to include API key usage

* Update quota and throttle settings for API Gateway

* Update API keys documentation to indicate to be implemented

* Move API key lookup to Lambda INIT

* Remove API key authentication and update API key to x-hydrocron-key

* /version 1.3.0a7

* Update changelog for 1.3.0 release

* /version 1.4.0a0

* Feature/issue 198 (#207)

* Update pylint to deal with errors and fix collection reference

* Initial CMR and Hydrocron queries

- Includes placeholders for other operations needed to track granule
ingest.
- GranuleUR query for Hydrocron tables.

* Add and set up vcrpy for testing CMR API query

* Test track ingest operations

- Test CMR and hydrocron queries
- Test granuleUR query
- Update database to include granuleUR GSI

* Update to use track_ingest naming consistently

* Initial Lambda function and IAM role definition

* Replace deprecated path function with as_file

* Add SSM read IAM permissions

* Add DynamoDB read permissions

* Update track ingest lambda memory

* Remove duplicate IAM permissions

* Add in permissions to query index

* Update changelog

* Update changelog description

* Use python_cmr for CMR API queries

* /version 1.4.0a1

* Add doi to documentation pages (#216)

* Update intro.md with DOI

* Update overview.md with DOI

* /version 1.4.0a2

* issue-193: Add Dynamo DB Table for SWOT Prior Lakes (#209)

* add code to handle prior lakes shapefiles, add test prior lake data

* update terraform to add prior lake table

* fix tests, change to smaller test data file, changelog

* linting

* reconfigure main load_data method to make more readable and pass linting

* lint

* lint

* fix string casting to lower storage req & update test responses to handle different rounding pattern in coords

* update load benchmarking function for linting and add unit test

* try parent collection for lakes

* update version parsing for parent collection

* fix case error

* fix lake id reference

* add logging to troubleshoot too large features

* add item size logging and remove error raise for batch write

* clean up logging statements & move numeric_columns assignment

* update batch logging statement

* Rename constant

* Fix temp dir security risk https://rules.sonarsource.com/python/RSPEC-5443/

* Fix temp dir security risk https://rules.sonarsource.com/python/RSPEC-5443/

* fix code coverage calculation

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a3

* Feature/issue 201 Create a table for tracking granule ingest status (#214)

* Define track ingest database and IAM permissions

* Update changelog with issue

* Modify table structure to support sparse status index

* Updated to only apply PITR in ops

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a4

* Feature/issue 210 - Load large geometry polygons (#219)

* add functions to handle null geometries and convert polygons to points

* update doi in docs

* fix fill null geometries

* fix tests and update changelog

* /version 1.4.0a5

* Feature/issue 222 - Add granule info to track ingest table on load (#223)

* adjust lambdas to populate track ingest table on granule load

* changelog

* remove test cnm

* lint

* change error caught when handling checksum

* update lambda role permissions to write to track ingest table

* fix typo on lake table terraform

* set default fill values for checksum and rev date in track status

* fix checksum handling in bulk load data

* lint

* add logging to debug

* /version 1.4.0a6

* Add SSM parameter read for last run time

* Feature/issue-225: Create one track ingest table per feature type (#226)

* add track ingest tables for each feature type and adjust load data to populate

* changelog

* /version 1.4.0a7

* Feature/issue 196 Add new feature type to query the API for lake data (#224)

* Initial API queries for lake data

* Unit tests for lake data

* Updates after center point calculations

- Removed temp code to calculate a point in API
- Implemented unit test to test lake data retrieval
- Updated fixtures to load in lake data for testing

* Add read lake table permissions to lambda timeseries and track ingest roles

* Update documenation to include lake data

* Updated documentation to include info on lake centerpoints

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a8

* Feature/issue 205 - Add Confluence API key (#221)

* Fix possible variable references before value is assigned

* Define Confluence API key and trusted partner plan limits

* Define a list of trusted partner keys and store under single parameter

* Define API keys as encrypted envrionment variables for Lambda authorizer

* Update authorizer and connection class to use KMS to retrieve API keys

* Hack to force lambda deployment when ssm value changes (#218)

* Add replace_triggered_by to hydrocron_lambda_authorizer

* Introduce environment variable that contains random id which will change whenever an API key value changes. This will force lambda to publish new version of the function.

* Remove unnecessary hash function

* Update to SSM parameter API key storage and null_resource enviroment variable

* Update Terraform and AWS provider

* Update API key documentation

* Set source_code_hash to force deployment of new image

* Downgrade AWS provider to 4.0 to remove inline policy errors

* Update docs/timeseries.md

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a9

* /version 1.4.0a10

* changelog for 1.4.0 release

* /version 1.5.0a0

* Initial track ingest table query

* Fix linting and code style

* Implement feature count operations

* Enable S3 permissions and set environment variable for track lambda

* Fix trailing white spaces and code format

* Update docstrings for class methods

* Implement run time storage in SSM

* Query track table unit tests

* Update CHANGELOG with issue

* Update SSM run time parameter

* Fix trailing whitespace

* Fix reference to IAM policy

* Enable specification of temporal range to search revision date by

* Fix SSM put parameter policy

* Update IAM permissions for reading track ingest

* Enable full temporal search on CMR granules

* Add capability to download shapefile granule to count features

* Update granule UR to include .zip

* Count features via Hydrocron table query

* Remove unnecessary s3 permissions

* Remove whitespace from blank line

* Update cryptography to 43.0.1

* update dependencies

* upgrade geopandas

* update dependencies

---------

Co-authored-by: nikki-t <[email protected]>
Co-authored-by: Frank Greguska <[email protected]>
Co-authored-by: frankinspace <[email protected]>
Co-authored-by: Victoria McDonald <[email protected]>
Co-authored-by: Cassie Nickles <[email protected]>
Co-authored-by: cassienickles <[email protected]>
Co-authored-by: podaac-cicd[bot] <podaac-cicd[bot]@users.noreply.github.com>
Co-authored-by: Victoria McDonald <[email protected]>
Co-authored-by: torimcd <[email protected]>
frankinspace added a commit that referenced this pull request Oct 3, 2024
* /version 1.3.0a0

* Update build.yml

* /version 1.3.0a1

* /version 1.3.0a2

* Feature/issue 175 - Update docs to point to OPS (#176)

* changelog

* update examples, remove load_data readme, info moved to wiki

* Dependency update to fix snyk scan

* issues/101: Support for HTTP Accept header (#172)

* Reorganize timeseries code to  prep for Accept header

* Enable Accept header to return response of specific content-type

* Fix whitespace and string continuation

* Make error handling consistent and add an additional test where a reach can't be found

* Update changelog with issue for unreleased version

* Add 415 status code to API definition

* Few minor cleanup items

* Few minor cleanup items

* Update to [email protected]

* Fix dependencies

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.3.0a3

* issues/102: Support compression of API response (#173)

* Enable payload compression

* Update changelog with issue

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.3.0a4

* Feature/issue 100 Add option to 'compact' GeoJSON result into single feature (#177)

* Reorganize timeseries code to  prep for Accept header

* Enable Accept header to return response of specific content-type

* Fix whitespace and string continuation

* Make error handling consistent and add an additional test where a reach can't be found

* Update changelog with issue for unreleased version

* Add 415 status code to API definition

* Few minor cleanup items

* Few minor cleanup items

* Update to [email protected]

* Fix dependencies

* Update required query parameters based on current API functionality

* Enable return of 'compact' GeoJSON response

* Fix linting and add test data

* Update documentation for API accept headers and compact GeoJSON response

* Fix references to incorrect Accept header examples

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.3.0a5

* Feature/issue 183 (#185)

* Provide introduction to timeseries endpoint

* Remove _units in fields list

* Fix typo

* Update examples with Accept headers and compact query parameter

* Add issue to changelog

* Fix typo in timeseries documentation

* Update pymysql

* Update pymysql

* Provide clarity on accept headers and request parameter fields

* /version 1.3.0a6

* Feature/issue 186 Implement API keys (#188)

* API Gateway Lambda authorizer to facilitate API keys and usage plans

* Unit tests to test Lambda authorizer

* Fix terraform file formatting

* API Gateway Lambda Authorizer

- Lambda function
- API Keys and Authorizer definition in OpenAPI spec
- API gateway API keys
- API gateway usage plans
- SSM parameters for API keys

* Fix trailing whitespace

* Set default region environment variable

* Fix SNYK vulnerabilities

* Add issue to changelog

* Implement custom trusted partner header x-hydrocron-key

* Update cryptography for SNYK vulnerability

* Update documentation to include API key usage

* Update quota and throttle settings for API Gateway

* Update API keys documentation to indicate to be implemented

* Move API key lookup to Lambda INIT

* Remove API key authentication and update API key to x-hydrocron-key

* /version 1.3.0a7

* Update changelog for 1.3.0 release

* /version 1.4.0a0

* Feature/issue 198 (#207)

* Update pylint to deal with errors and fix collection reference

* Initial CMR and Hydrocron queries

- Includes placeholders for other operations needed to track granule
ingest.
- GranuleUR query for Hydrocron tables.

* Add and set up vcrpy for testing CMR API query

* Test track ingest operations

- Test CMR and hydrocron queries
- Test granuleUR query
- Update database to include granuleUR GSI

* Update to use track_ingest naming consistently

* Initial Lambda function and IAM role definition

* Replace deprecated path function with as_file

* Add SSM read IAM permissions

* Add DynamoDB read permissions

* Update track ingest lambda memory

* Remove duplicate IAM permissions

* Add in permissions to query index

* Update changelog

* Update changelog description

* Use python_cmr for CMR API queries

* /version 1.4.0a1

* Add doi to documentation pages (#216)

* Update intro.md with DOI

* Update overview.md with DOI

* /version 1.4.0a2

* issue-193: Add Dynamo DB Table for SWOT Prior Lakes (#209)

* add code to handle prior lakes shapefiles, add test prior lake data

* update terraform to add prior lake table

* fix tests, change to smaller test data file, changelog

* linting

* reconfigure main load_data method to make more readable and pass linting

* lint

* lint

* fix string casting to lower storage req & update test responses to handle different rounding pattern in coords

* update load benchmarking function for linting and add unit test

* try parent collection for lakes

* update version parsing for parent collection

* fix case error

* fix lake id reference

* add logging to troubleshoot too large features

* add item size logging and remove error raise for batch write

* clean up logging statements & move numeric_columns assignment

* update batch logging statement

* Rename constant

* Fix temp dir security risk https://rules.sonarsource.com/python/RSPEC-5443/

* Fix temp dir security risk https://rules.sonarsource.com/python/RSPEC-5443/

* fix code coverage calculation

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a3

* Feature/issue 201 Create a table for tracking granule ingest status (#214)

* Define track ingest database and IAM permissions

* Update changelog with issue

* Modify table structure to support sparse status index

* Updated to only apply PITR in ops

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a4

* Feature/issue 210 - Load large geometry polygons (#219)

* add functions to handle null geometries and convert polygons to points

* update doi in docs

* fix fill null geometries

* fix tests and update changelog

* /version 1.4.0a5

* Feature/issue 222 - Add granule info to track ingest table on load (#223)

* adjust lambdas to populate track ingest table on granule load

* changelog

* remove test cnm

* lint

* change error caught when handling checksum

* update lambda role permissions to write to track ingest table

* fix typo on lake table terraform

* set default fill values for checksum and rev date in track status

* fix checksum handling in bulk load data

* lint

* add logging to debug

* /version 1.4.0a6

* Add SSM parameter read for last run time

* Feature/issue-225: Create one track ingest table per feature type (#226)

* add track ingest tables for each feature type and adjust load data to populate

* changelog

* /version 1.4.0a7

* Feature/issue 196 Add new feature type to query the API for lake data (#224)

* Initial API queries for lake data

* Unit tests for lake data

* Updates after center point calculations

- Removed temp code to calculate a point in API
- Implemented unit test to test lake data retrieval
- Updated fixtures to load in lake data for testing

* Add read lake table permissions to lambda timeseries and track ingest roles

* Update documenation to include lake data

* Updated documentation to include info on lake centerpoints

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a8

* Feature/issue 205 - Add Confluence API key (#221)

* Fix possible variable references before value is assigned

* Define Confluence API key and trusted partner plan limits

* Define a list of trusted partner keys and store under single parameter

* Define API keys as encrypted envrionment variables for Lambda authorizer

* Update authorizer and connection class to use KMS to retrieve API keys

* Hack to force lambda deployment when ssm value changes (#218)

* Add replace_triggered_by to hydrocron_lambda_authorizer

* Introduce environment variable that contains random id which will change whenever an API key value changes. This will force lambda to publish new version of the function.

* Remove unnecessary hash function

* Update to SSM parameter API key storage and null_resource enviroment variable

* Update Terraform and AWS provider

* Update API key documentation

* Set source_code_hash to force deployment of new image

* Downgrade AWS provider to 4.0 to remove inline policy errors

* Update docs/timeseries.md

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a9

* /version 1.4.0a10

* changelog for 1.4.0 release

* /version 1.5.0a0

* Initial track ingest table query

* Fix linting and code style

* Implement feature count operations

* Enable S3 permissions and set environment variable for track lambda

* Fix trailing white spaces and code format

* Update docstrings for class methods

* Implement run time storage in SSM

* Query track table unit tests

* Update CHANGELOG with issue

* Update SSM run time parameter

* Fix trailing whitespace

* Fix reference to IAM policy

* Enable specification of temporal range to search revision date by

* Fix SSM put parameter policy

* Update IAM permissions for reading track ingest

* Enable full temporal search on CMR granules

* Add capability to download shapefile granule to count features

* Update granule UR to include .zip

* Count features via Hydrocron table query

* Remove unnecessary s3 permissions

* Remove whitespace from blank line

* Update cryptography to 43.0.1

* Update track ingest table operations

* Update changelog with issue

* update dependencies

* upgrade geopandas

* update dependencies

---------

Co-authored-by: nikki-t <[email protected]>
Co-authored-by: Frank Greguska <[email protected]>
Co-authored-by: frankinspace <[email protected]>
Co-authored-by: Victoria McDonald <[email protected]>
Co-authored-by: Cassie Nickles <[email protected]>
Co-authored-by: cassienickles <[email protected]>
Co-authored-by: podaac-cicd[bot] <podaac-cicd[bot]@users.noreply.github.com>
Co-authored-by: Victoria McDonald <[email protected]>
Co-authored-by: torimcd <[email protected]>
nikki-t added a commit that referenced this pull request Oct 4, 2024
* /version 1.3.0a0

* Update build.yml

* /version 1.3.0a1

* /version 1.3.0a2

* Feature/issue 175 - Update docs to point to OPS (#176)

* changelog

* update examples, remove load_data readme, info moved to wiki

* Dependency update to fix snyk scan

* issues/101: Support for HTTP Accept header (#172)

* Reorganize timeseries code to  prep for Accept header

* Enable Accept header to return response of specific content-type

* Fix whitespace and string continuation

* Make error handling consistent and add an additional test where a reach can't be found

* Update changelog with issue for unreleased version

* Add 415 status code to API definition

* Few minor cleanup items

* Few minor cleanup items

* Update to [email protected]

* Fix dependencies

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.3.0a3

* issues/102: Support compression of API response (#173)

* Enable payload compression

* Update changelog with issue

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.3.0a4

* Feature/issue 100 Add option to 'compact' GeoJSON result into single feature (#177)

* Reorganize timeseries code to  prep for Accept header

* Enable Accept header to return response of specific content-type

* Fix whitespace and string continuation

* Make error handling consistent and add an additional test where a reach can't be found

* Update changelog with issue for unreleased version

* Add 415 status code to API definition

* Few minor cleanup items

* Few minor cleanup items

* Update to [email protected]

* Fix dependencies

* Update required query parameters based on current API functionality

* Enable return of 'compact' GeoJSON response

* Fix linting and add test data

* Update documentation for API accept headers and compact GeoJSON response

* Fix references to incorrect Accept header examples

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.3.0a5

* Feature/issue 183 (#185)

* Provide introduction to timeseries endpoint

* Remove _units in fields list

* Fix typo

* Update examples with Accept headers and compact query parameter

* Add issue to changelog

* Fix typo in timeseries documentation

* Update pymysql

* Update pymysql

* Provide clarity on accept headers and request parameter fields

* /version 1.3.0a6

* Feature/issue 186 Implement API keys (#188)

* API Gateway Lambda authorizer to facilitate API keys and usage plans

* Unit tests to test Lambda authorizer

* Fix terraform file formatting

* API Gateway Lambda Authorizer

- Lambda function
- API Keys and Authorizer definition in OpenAPI spec
- API gateway API keys
- API gateway usage plans
- SSM parameters for API keys

* Fix trailing whitespace

* Set default region environment variable

* Fix SNYK vulnerabilities

* Add issue to changelog

* Implement custom trusted partner header x-hydrocron-key

* Update cryptography for SNYK vulnerability

* Update documentation to include API key usage

* Update quota and throttle settings for API Gateway

* Update API keys documentation to indicate to be implemented

* Move API key lookup to Lambda INIT

* Remove API key authentication and update API key to x-hydrocron-key

* /version 1.3.0a7

* Update changelog for 1.3.0 release

* /version 1.4.0a0

* Feature/issue 198 (#207)

* Update pylint to deal with errors and fix collection reference

* Initial CMR and Hydrocron queries

- Includes placeholders for other operations needed to track granule
ingest.
- GranuleUR query for Hydrocron tables.

* Add and set up vcrpy for testing CMR API query

* Test track ingest operations

- Test CMR and hydrocron queries
- Test granuleUR query
- Update database to include granuleUR GSI

* Update to use track_ingest naming consistently

* Initial Lambda function and IAM role definition

* Replace deprecated path function with as_file

* Add SSM read IAM permissions

* Add DynamoDB read permissions

* Update track ingest lambda memory

* Remove duplicate IAM permissions

* Add in permissions to query index

* Update changelog

* Update changelog description

* Use python_cmr for CMR API queries

* /version 1.4.0a1

* Add doi to documentation pages (#216)

* Update intro.md with DOI

* Update overview.md with DOI

* /version 1.4.0a2

* issue-193: Add Dynamo DB Table for SWOT Prior Lakes (#209)

* add code to handle prior lakes shapefiles, add test prior lake data

* update terraform to add prior lake table

* fix tests, change to smaller test data file, changelog

* linting

* reconfigure main load_data method to make more readable and pass linting

* lint

* lint

* fix string casting to lower storage req & update test responses to handle different rounding pattern in coords

* update load benchmarking function for linting and add unit test

* try parent collection for lakes

* update version parsing for parent collection

* fix case error

* fix lake id reference

* add logging to troubleshoot too large features

* add item size logging and remove error raise for batch write

* clean up logging statements & move numeric_columns assignment

* update batch logging statement

* Rename constant

* Fix temp dir security risk https://rules.sonarsource.com/python/RSPEC-5443/

* Fix temp dir security risk https://rules.sonarsource.com/python/RSPEC-5443/

* fix code coverage calculation

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a3

* Feature/issue 201 Create a table for tracking granule ingest status (#214)

* Define track ingest database and IAM permissions

* Update changelog with issue

* Modify table structure to support sparse status index

* Updated to only apply PITR in ops

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a4

* Feature/issue 210 - Load large geometry polygons (#219)

* add functions to handle null geometries and convert polygons to points

* update doi in docs

* fix fill null geometries

* fix tests and update changelog

* /version 1.4.0a5

* Feature/issue 222 - Add granule info to track ingest table on load (#223)

* adjust lambdas to populate track ingest table on granule load

* changelog

* remove test cnm

* lint

* change error caught when handling checksum

* update lambda role permissions to write to track ingest table

* fix typo on lake table terraform

* set default fill values for checksum and rev date in track status

* fix checksum handling in bulk load data

* lint

* add logging to debug

* /version 1.4.0a6

* Add SSM parameter read for last run time

* Feature/issue-225: Create one track ingest table per feature type (#226)

* add track ingest tables for each feature type and adjust load data to populate

* changelog

* /version 1.4.0a7

* Feature/issue 196 Add new feature type to query the API for lake data (#224)

* Initial API queries for lake data

* Unit tests for lake data

* Updates after center point calculations

- Removed temp code to calculate a point in API
- Implemented unit test to test lake data retrieval
- Updated fixtures to load in lake data for testing

* Add read lake table permissions to lambda timeseries and track ingest roles

* Update documenation to include lake data

* Updated documentation to include info on lake centerpoints

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a8

* Feature/issue 205 - Add Confluence API key (#221)

* Fix possible variable references before value is assigned

* Define Confluence API key and trusted partner plan limits

* Define a list of trusted partner keys and store under single parameter

* Define API keys as encrypted envrionment variables for Lambda authorizer

* Update authorizer and connection class to use KMS to retrieve API keys

* Hack to force lambda deployment when ssm value changes (#218)

* Add replace_triggered_by to hydrocron_lambda_authorizer

* Introduce environment variable that contains random id which will change whenever an API key value changes. This will force lambda to publish new version of the function.

* Remove unnecessary hash function

* Update to SSM parameter API key storage and null_resource enviroment variable

* Update Terraform and AWS provider

* Update API key documentation

* Set source_code_hash to force deployment of new image

* Downgrade AWS provider to 4.0 to remove inline policy errors

* Update docs/timeseries.md

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a9

* /version 1.4.0a10

* changelog for 1.4.0 release

* /version 1.5.0a0

* Initial track ingest table query

* Fix linting and code style

* Implement feature count operations

* Enable S3 permissions and set environment variable for track lambda

* Fix trailing white spaces and code format

* Update docstrings for class methods

* Implement run time storage in SSM

* Query track table unit tests

* Update CHANGELOG with issue

* Update SSM run time parameter

* Fix trailing whitespace

* Fix reference to IAM policy

* Enable specification of temporal range to search revision date by

* Fix SSM put parameter policy

* Update IAM permissions for reading track ingest

* Enable full temporal search on CMR granules

* Add capability to download shapefile granule to count features

* Update granule UR to include .zip

* Count features via Hydrocron table query

* Remove unnecessary s3 permissions

* Remove whitespace from blank line

* Update cryptography to 43.0.1

* Update track ingest table operations

* Update changelog with issue

* update dependencies

* upgrade geopandas

* update dependencies

* Implement operations to publish CNM messages for granules requiring ingest

* Implement unit test of publication operations

* Fix linting

* Add issue to changelog and fix linting

* Add EventBridge schedules with appropriate Lambda permissions

* Set initial schedule expressions and fix assume policy

* Disable eventbridge schedules by default

* Update schedule to run weekly

* Define 1 hour latency to search by revision_date in CMR

---------

Co-authored-by: nikki-t <[email protected]>
Co-authored-by: Frank Greguska <[email protected]>
Co-authored-by: frankinspace <[email protected]>
Co-authored-by: Victoria McDonald <[email protected]>
Co-authored-by: Cassie Nickles <[email protected]>
Co-authored-by: cassienickles <[email protected]>
Co-authored-by: podaac-cicd[bot] <podaac-cicd[bot]@users.noreply.github.com>
Co-authored-by: Victoria McDonald <[email protected]>
Co-authored-by: torimcd <[email protected]>
nikki-t added a commit that referenced this pull request Oct 4, 2024
* /version 1.3.0a0

* Update build.yml

* /version 1.3.0a1

* /version 1.3.0a2

* Feature/issue 175 - Update docs to point to OPS (#176)

* changelog

* update examples, remove load_data readme, info moved to wiki

* Dependency update to fix snyk scan

* issues/101: Support for HTTP Accept header (#172)

* Reorganize timeseries code to  prep for Accept header

* Enable Accept header to return response of specific content-type

* Fix whitespace and string continuation

* Make error handling consistent and add an additional test where a reach can't be found

* Update changelog with issue for unreleased version

* Add 415 status code to API definition

* Few minor cleanup items

* Few minor cleanup items

* Update to [email protected]

* Fix dependencies

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.3.0a3

* issues/102: Support compression of API response (#173)

* Enable payload compression

* Update changelog with issue

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.3.0a4

* Feature/issue 100 Add option to 'compact' GeoJSON result into single feature (#177)

* Reorganize timeseries code to  prep for Accept header

* Enable Accept header to return response of specific content-type

* Fix whitespace and string continuation

* Make error handling consistent and add an additional test where a reach can't be found

* Update changelog with issue for unreleased version

* Add 415 status code to API definition

* Few minor cleanup items

* Few minor cleanup items

* Update to [email protected]

* Fix dependencies

* Update required query parameters based on current API functionality

* Enable return of 'compact' GeoJSON response

* Fix linting and add test data

* Update documentation for API accept headers and compact GeoJSON response

* Fix references to incorrect Accept header examples

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.3.0a5

* Feature/issue 183 (#185)

* Provide introduction to timeseries endpoint

* Remove _units in fields list

* Fix typo

* Update examples with Accept headers and compact query parameter

* Add issue to changelog

* Fix typo in timeseries documentation

* Update pymysql

* Update pymysql

* Provide clarity on accept headers and request parameter fields

* /version 1.3.0a6

* Feature/issue 186 Implement API keys (#188)

* API Gateway Lambda authorizer to facilitate API keys and usage plans

* Unit tests to test Lambda authorizer

* Fix terraform file formatting

* API Gateway Lambda Authorizer

- Lambda function
- API Keys and Authorizer definition in OpenAPI spec
- API gateway API keys
- API gateway usage plans
- SSM parameters for API keys

* Fix trailing whitespace

* Set default region environment variable

* Fix SNYK vulnerabilities

* Add issue to changelog

* Implement custom trusted partner header x-hydrocron-key

* Update cryptography for SNYK vulnerability

* Update documentation to include API key usage

* Update quota and throttle settings for API Gateway

* Update API keys documentation to indicate to be implemented

* Move API key lookup to Lambda INIT

* Remove API key authentication and update API key to x-hydrocron-key

* /version 1.3.0a7

* Update changelog for 1.3.0 release

* /version 1.4.0a0

* Feature/issue 198 (#207)

* Update pylint to deal with errors and fix collection reference

* Initial CMR and Hydrocron queries

- Includes placeholders for other operations needed to track granule
ingest.
- GranuleUR query for Hydrocron tables.

* Add and set up vcrpy for testing CMR API query

* Test track ingest operations

- Test CMR and hydrocron queries
- Test granuleUR query
- Update database to include granuleUR GSI

* Update to use track_ingest naming consistently

* Initial Lambda function and IAM role definition

* Replace deprecated path function with as_file

* Add SSM read IAM permissions

* Add DynamoDB read permissions

* Update track ingest lambda memory

* Remove duplicate IAM permissions

* Add in permissions to query index

* Update changelog

* Update changelog description

* Use python_cmr for CMR API queries

* /version 1.4.0a1

* Add doi to documentation pages (#216)

* Update intro.md with DOI

* Update overview.md with DOI

* /version 1.4.0a2

* issue-193: Add Dynamo DB Table for SWOT Prior Lakes (#209)

* add code to handle prior lakes shapefiles, add test prior lake data

* update terraform to add prior lake table

* fix tests, change to smaller test data file, changelog

* linting

* reconfigure main load_data method to make more readable and pass linting

* lint

* lint

* fix string casting to lower storage req & update test responses to handle different rounding pattern in coords

* update load benchmarking function for linting and add unit test

* try parent collection for lakes

* update version parsing for parent collection

* fix case error

* fix lake id reference

* add logging to troubleshoot too large features

* add item size logging and remove error raise for batch write

* clean up logging statements & move numeric_columns assignment

* update batch logging statement

* Rename constant

* Fix temp dir security risk https://rules.sonarsource.com/python/RSPEC-5443/

* Fix temp dir security risk https://rules.sonarsource.com/python/RSPEC-5443/

* fix code coverage calculation

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a3

* Feature/issue 201 Create a table for tracking granule ingest status (#214)

* Define track ingest database and IAM permissions

* Update changelog with issue

* Modify table structure to support sparse status index

* Updated to only apply PITR in ops

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a4

* Feature/issue 210 - Load large geometry polygons (#219)

* add functions to handle null geometries and convert polygons to points

* update doi in docs

* fix fill null geometries

* fix tests and update changelog

* /version 1.4.0a5

* Feature/issue 222 - Add granule info to track ingest table on load (#223)

* adjust lambdas to populate track ingest table on granule load

* changelog

* remove test cnm

* lint

* change error caught when handling checksum

* update lambda role permissions to write to track ingest table

* fix typo on lake table terraform

* set default fill values for checksum and rev date in track status

* fix checksum handling in bulk load data

* lint

* add logging to debug

* /version 1.4.0a6

* Add SSM parameter read for last run time

* Feature/issue-225: Create one track ingest table per feature type (#226)

* add track ingest tables for each feature type and adjust load data to populate

* changelog

* /version 1.4.0a7

* Feature/issue 196 Add new feature type to query the API for lake data (#224)

* Initial API queries for lake data

* Unit tests for lake data

* Updates after center point calculations

- Removed temp code to calculate a point in API
- Implemented unit test to test lake data retrieval
- Updated fixtures to load in lake data for testing

* Add read lake table permissions to lambda timeseries and track ingest roles

* Update documenation to include lake data

* Updated documentation to include info on lake centerpoints

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a8

* Feature/issue 205 - Add Confluence API key (#221)

* Fix possible variable references before value is assigned

* Define Confluence API key and trusted partner plan limits

* Define a list of trusted partner keys and store under single parameter

* Define API keys as encrypted envrionment variables for Lambda authorizer

* Update authorizer and connection class to use KMS to retrieve API keys

* Hack to force lambda deployment when ssm value changes (#218)

* Add replace_triggered_by to hydrocron_lambda_authorizer

* Introduce environment variable that contains random id which will change whenever an API key value changes. This will force lambda to publish new version of the function.

* Remove unnecessary hash function

* Update to SSM parameter API key storage and null_resource enviroment variable

* Update Terraform and AWS provider

* Update API key documentation

* Set source_code_hash to force deployment of new image

* Downgrade AWS provider to 4.0 to remove inline policy errors

* Update docs/timeseries.md

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a9

* /version 1.4.0a10

* changelog for 1.4.0 release

* update dependencies for 1.4.0 release

* /version 1.5.0a0

* fix CMR query in UAT

* /version 1.4.0rc1

* fix typo in load_data lambda

* /version 1.4.0rc2

* Initial track ingest table query

* Fix linting and code style

* Implement feature count operations

* Enable S3 permissions and set environment variable for track lambda

* Fix trailing white spaces and code format

* Update docstrings for class methods

* Implement run time storage in SSM

* Query track table unit tests

* Update CHANGELOG with issue

* Update SSM run time parameter

* Fix trailing whitespace

* Fix reference to IAM policy

* Enable specification of temporal range to search revision date by

* Fix SSM put parameter policy

* Update IAM permissions for reading track ingest

* Enable full temporal search on CMR granules

* Add capability to download shapefile granule to count features

* Update granule UR to include .zip

* Count features via Hydrocron table query

* Remove unnecessary s3 permissions

* Remove whitespace from blank line

* Update cryptography to 43.0.1

* Update track ingest table operations

* Update changelog with issue

* update dependencies

* upgrade geopandas

* update dependencies

* fix index on rev date in load data lambda

* update dependencies

* lint readme

* /version 1.4.0rc3

* /version 1.4.0rc4

* Implement operations to publish CNM messages for granules requiring ingest

* Implement unit test of publication operations

* Fix linting

* Add issue to changelog and fix linting

* Add EventBridge schedules with appropriate Lambda permissions

* Set initial schedule expressions and fix assume policy

* fix cmr env search by venue

* /version 1.4.0rc5

* Disable eventbridge schedules by default

* Update schedule to run weekly

* Define 1 hour latency to search by revision_date in CMR

* Allow CMR UAT query based on HYDROCRON_ENV environment variable

* Update unit tests to accomodate UAT CMR query

* Add earthdata login credentials to Lambda

* Add issue to changelog

* Fix linting white space

---------

Co-authored-by: nikki-t <[email protected]>
Co-authored-by: Frank Greguska <[email protected]>
Co-authored-by: frankinspace <[email protected]>
Co-authored-by: Victoria McDonald <[email protected]>
Co-authored-by: Cassie Nickles <[email protected]>
Co-authored-by: cassienickles <[email protected]>
Co-authored-by: podaac-cicd[bot] <podaac-cicd[bot]@users.noreply.github.com>
Co-authored-by: Victoria McDonald <[email protected]>
Co-authored-by: torimcd <[email protected]>
nikki-t added a commit that referenced this pull request Nov 20, 2024
* /version 1.5.0a0

* Fix after rebase (#241)

* Update hydrocron-lambda.tf

* Update pyproject.toml

* update lock

* fix lint

* fix lint

* /version 1.5.0a1

* Feature/issue 211 - Query track ingest table for granules with "to_ingest" status (#227)

* /version 1.3.0a0

* Update build.yml

* /version 1.3.0a1

* /version 1.3.0a2

* Feature/issue 175 - Update docs to point to OPS (#176)

* changelog

* update examples, remove load_data readme, info moved to wiki

* Dependency update to fix snyk scan

* issues/101: Support for HTTP Accept header (#172)

* Reorganize timeseries code to  prep for Accept header

* Enable Accept header to return response of specific content-type

* Fix whitespace and string continuation

* Make error handling consistent and add an additional test where a reach can't be found

* Update changelog with issue for unreleased version

* Add 415 status code to API definition

* Few minor cleanup items

* Few minor cleanup items

* Update to [email protected]

* Fix dependencies

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.3.0a3

* issues/102: Support compression of API response (#173)

* Enable payload compression

* Update changelog with issue

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.3.0a4

* Feature/issue 100 Add option to 'compact' GeoJSON result into single feature (#177)

* Reorganize timeseries code to  prep for Accept header

* Enable Accept header to return response of specific content-type

* Fix whitespace and string continuation

* Make error handling consistent and add an additional test where a reach can't be found

* Update changelog with issue for unreleased version

* Add 415 status code to API definition

* Few minor cleanup items

* Few minor cleanup items

* Update to [email protected]

* Fix dependencies

* Update required query parameters based on current API functionality

* Enable return of 'compact' GeoJSON response

* Fix linting and add test data

* Update documentation for API accept headers and compact GeoJSON response

* Fix references to incorrect Accept header examples

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.3.0a5

* Feature/issue 183 (#185)

* Provide introduction to timeseries endpoint

* Remove _units in fields list

* Fix typo

* Update examples with Accept headers and compact query parameter

* Add issue to changelog

* Fix typo in timeseries documentation

* Update pymysql

* Update pymysql

* Provide clarity on accept headers and request parameter fields

* /version 1.3.0a6

* Feature/issue 186 Implement API keys (#188)

* API Gateway Lambda authorizer to facilitate API keys and usage plans

* Unit tests to test Lambda authorizer

* Fix terraform file formatting

* API Gateway Lambda Authorizer

- Lambda function
- API Keys and Authorizer definition in OpenAPI spec
- API gateway API keys
- API gateway usage plans
- SSM parameters for API keys

* Fix trailing whitespace

* Set default region environment variable

* Fix SNYK vulnerabilities

* Add issue to changelog

* Implement custom trusted partner header x-hydrocron-key

* Update cryptography for SNYK vulnerability

* Update documentation to include API key usage

* Update quota and throttle settings for API Gateway

* Update API keys documentation to indicate to be implemented

* Move API key lookup to Lambda INIT

* Remove API key authentication and update API key to x-hydrocron-key

* /version 1.3.0a7

* Update changelog for 1.3.0 release

* /version 1.4.0a0

* Feature/issue 198 (#207)

* Update pylint to deal with errors and fix collection reference

* Initial CMR and Hydrocron queries

- Includes placeholders for other operations needed to track granule
ingest.
- GranuleUR query for Hydrocron tables.

* Add and set up vcrpy for testing CMR API query

* Test track ingest operations

- Test CMR and hydrocron queries
- Test granuleUR query
- Update database to include granuleUR GSI

* Update to use track_ingest naming consistently

* Initial Lambda function and IAM role definition

* Replace deprecated path function with as_file

* Add SSM read IAM permissions

* Add DynamoDB read permissions

* Update track ingest lambda memory

* Remove duplicate IAM permissions

* Add in permissions to query index

* Update changelog

* Update changelog description

* Use python_cmr for CMR API queries

* /version 1.4.0a1

* Add doi to documentation pages (#216)

* Update intro.md with DOI

* Update overview.md with DOI

* /version 1.4.0a2

* issue-193: Add Dynamo DB Table for SWOT Prior Lakes (#209)

* add code to handle prior lakes shapefiles, add test prior lake data

* update terraform to add prior lake table

* fix tests, change to smaller test data file, changelog

* linting

* reconfigure main load_data method to make more readable and pass linting

* lint

* lint

* fix string casting to lower storage req & update test responses to handle different rounding pattern in coords

* update load benchmarking function for linting and add unit test

* try parent collection for lakes

* update version parsing for parent collection

* fix case error

* fix lake id reference

* add logging to troubleshoot too large features

* add item size logging and remove error raise for batch write

* clean up logging statements & move numeric_columns assignment

* update batch logging statement

* Rename constant

* Fix temp dir security risk https://rules.sonarsource.com/python/RSPEC-5443/

* Fix temp dir security risk https://rules.sonarsource.com/python/RSPEC-5443/

* fix code coverage calculation

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a3

* Feature/issue 201 Create a table for tracking granule ingest status (#214)

* Define track ingest database and IAM permissions

* Update changelog with issue

* Modify table structure to support sparse status index

* Updated to only apply PITR in ops

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a4

* Feature/issue 210 - Load large geometry polygons (#219)

* add functions to handle null geometries and convert polygons to points

* update doi in docs

* fix fill null geometries

* fix tests and update changelog

* /version 1.4.0a5

* Feature/issue 222 - Add granule info to track ingest table on load (#223)

* adjust lambdas to populate track ingest table on granule load

* changelog

* remove test cnm

* lint

* change error caught when handling checksum

* update lambda role permissions to write to track ingest table

* fix typo on lake table terraform

* set default fill values for checksum and rev date in track status

* fix checksum handling in bulk load data

* lint

* add logging to debug

* /version 1.4.0a6

* Add SSM parameter read for last run time

* Feature/issue-225: Create one track ingest table per feature type (#226)

* add track ingest tables for each feature type and adjust load data to populate

* changelog

* /version 1.4.0a7

* Feature/issue 196 Add new feature type to query the API for lake data (#224)

* Initial API queries for lake data

* Unit tests for lake data

* Updates after center point calculations

- Removed temp code to calculate a point in API
- Implemented unit test to test lake data retrieval
- Updated fixtures to load in lake data for testing

* Add read lake table permissions to lambda timeseries and track ingest roles

* Update documenation to include lake data

* Updated documentation to include info on lake centerpoints

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a8

* Feature/issue 205 - Add Confluence API key (#221)

* Fix possible variable references before value is assigned

* Define Confluence API key and trusted partner plan limits

* Define a list of trusted partner keys and store under single parameter

* Define API keys as encrypted envrionment variables for Lambda authorizer

* Update authorizer and connection class to use KMS to retrieve API keys

* Hack to force lambda deployment when ssm value changes (#218)

* Add replace_triggered_by to hydrocron_lambda_authorizer

* Introduce environment variable that contains random id which will change whenever an API key value changes. This will force lambda to publish new version of the function.

* Remove unnecessary hash function

* Update to SSM parameter API key storage and null_resource enviroment variable

* Update Terraform and AWS provider

* Update API key documentation

* Set source_code_hash to force deployment of new image

* Downgrade AWS provider to 4.0 to remove inline policy errors

* Update docs/timeseries.md

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a9

* /version 1.4.0a10

* changelog for 1.4.0 release

* /version 1.5.0a0

* Initial track ingest table query

* Fix linting and code style

* Implement feature count operations

* Enable S3 permissions and set environment variable for track lambda

* Fix trailing white spaces and code format

* Update docstrings for class methods

* Implement run time storage in SSM

* Query track table unit tests

* Update CHANGELOG with issue

* Update SSM run time parameter

* Fix trailing whitespace

* Fix reference to IAM policy

* Enable specification of temporal range to search revision date by

* Fix SSM put parameter policy

* Update IAM permissions for reading track ingest

* Enable full temporal search on CMR granules

* Add capability to download shapefile granule to count features

* Update granule UR to include .zip

* Count features via Hydrocron table query

* Remove unnecessary s3 permissions

* Remove whitespace from blank line

* Update cryptography to 43.0.1

* update dependencies

* upgrade geopandas

* update dependencies

---------

Co-authored-by: nikki-t <[email protected]>
Co-authored-by: Frank Greguska <[email protected]>
Co-authored-by: frankinspace <[email protected]>
Co-authored-by: Victoria McDonald <[email protected]>
Co-authored-by: Cassie Nickles <[email protected]>
Co-authored-by: cassienickles <[email protected]>
Co-authored-by: podaac-cicd[bot] <podaac-cicd[bot]@users.noreply.github.com>
Co-authored-by: Victoria McDonald <[email protected]>
Co-authored-by: torimcd <[email protected]>

* /version 1.5.0a2

* Feature/issue 212 - Update track ingest table with granule status (#228)

* /version 1.3.0a0

* Update build.yml

* /version 1.3.0a1

* /version 1.3.0a2

* Feature/issue 175 - Update docs to point to OPS (#176)

* changelog

* update examples, remove load_data readme, info moved to wiki

* Dependency update to fix snyk scan

* issues/101: Support for HTTP Accept header (#172)

* Reorganize timeseries code to  prep for Accept header

* Enable Accept header to return response of specific content-type

* Fix whitespace and string continuation

* Make error handling consistent and add an additional test where a reach can't be found

* Update changelog with issue for unreleased version

* Add 415 status code to API definition

* Few minor cleanup items

* Few minor cleanup items

* Update to [email protected]

* Fix dependencies

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.3.0a3

* issues/102: Support compression of API response (#173)

* Enable payload compression

* Update changelog with issue

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.3.0a4

* Feature/issue 100 Add option to 'compact' GeoJSON result into single feature (#177)

* Reorganize timeseries code to  prep for Accept header

* Enable Accept header to return response of specific content-type

* Fix whitespace and string continuation

* Make error handling consistent and add an additional test where a reach can't be found

* Update changelog with issue for unreleased version

* Add 415 status code to API definition

* Few minor cleanup items

* Few minor cleanup items

* Update to [email protected]

* Fix dependencies

* Update required query parameters based on current API functionality

* Enable return of 'compact' GeoJSON response

* Fix linting and add test data

* Update documentation for API accept headers and compact GeoJSON response

* Fix references to incorrect Accept header examples

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.3.0a5

* Feature/issue 183 (#185)

* Provide introduction to timeseries endpoint

* Remove _units in fields list

* Fix typo

* Update examples with Accept headers and compact query parameter

* Add issue to changelog

* Fix typo in timeseries documentation

* Update pymysql

* Update pymysql

* Provide clarity on accept headers and request parameter fields

* /version 1.3.0a6

* Feature/issue 186 Implement API keys (#188)

* API Gateway Lambda authorizer to facilitate API keys and usage plans

* Unit tests to test Lambda authorizer

* Fix terraform file formatting

* API Gateway Lambda Authorizer

- Lambda function
- API Keys and Authorizer definition in OpenAPI spec
- API gateway API keys
- API gateway usage plans
- SSM parameters for API keys

* Fix trailing whitespace

* Set default region environment variable

* Fix SNYK vulnerabilities

* Add issue to changelog

* Implement custom trusted partner header x-hydrocron-key

* Update cryptography for SNYK vulnerability

* Update documentation to include API key usage

* Update quota and throttle settings for API Gateway

* Update API keys documentation to indicate to be implemented

* Move API key lookup to Lambda INIT

* Remove API key authentication and update API key to x-hydrocron-key

* /version 1.3.0a7

* Update changelog for 1.3.0 release

* /version 1.4.0a0

* Feature/issue 198 (#207)

* Update pylint to deal with errors and fix collection reference

* Initial CMR and Hydrocron queries

- Includes placeholders for other operations needed to track granule
ingest.
- GranuleUR query for Hydrocron tables.

* Add and set up vcrpy for testing CMR API query

* Test track ingest operations

- Test CMR and hydrocron queries
- Test granuleUR query
- Update database to include granuleUR GSI

* Update to use track_ingest naming consistently

* Initial Lambda function and IAM role definition

* Replace deprecated path function with as_file

* Add SSM read IAM permissions

* Add DynamoDB read permissions

* Update track ingest lambda memory

* Remove duplicate IAM permissions

* Add in permissions to query index

* Update changelog

* Update changelog description

* Use python_cmr for CMR API queries

* /version 1.4.0a1

* Add doi to documentation pages (#216)

* Update intro.md with DOI

* Update overview.md with DOI

* /version 1.4.0a2

* issue-193: Add Dynamo DB Table for SWOT Prior Lakes (#209)

* add code to handle prior lakes shapefiles, add test prior lake data

* update terraform to add prior lake table

* fix tests, change to smaller test data file, changelog

* linting

* reconfigure main load_data method to make more readable and pass linting

* lint

* lint

* fix string casting to lower storage req & update test responses to handle different rounding pattern in coords

* update load benchmarking function for linting and add unit test

* try parent collection for lakes

* update version parsing for parent collection

* fix case error

* fix lake id reference

* add logging to troubleshoot too large features

* add item size logging and remove error raise for batch write

* clean up logging statements & move numeric_columns assignment

* update batch logging statement

* Rename constant

* Fix temp dir security risk https://rules.sonarsource.com/python/RSPEC-5443/

* Fix temp dir security risk https://rules.sonarsource.com/python/RSPEC-5443/

* fix code coverage calculation

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a3

* Feature/issue 201 Create a table for tracking granule ingest status (#214)

* Define track ingest database and IAM permissions

* Update changelog with issue

* Modify table structure to support sparse status index

* Updated to only apply PITR in ops

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a4

* Feature/issue 210 - Load large geometry polygons (#219)

* add functions to handle null geometries and convert polygons to points

* update doi in docs

* fix fill null geometries

* fix tests and update changelog

* /version 1.4.0a5

* Feature/issue 222 - Add granule info to track ingest table on load (#223)

* adjust lambdas to populate track ingest table on granule load

* changelog

* remove test cnm

* lint

* change error caught when handling checksum

* update lambda role permissions to write to track ingest table

* fix typo on lake table terraform

* set default fill values for checksum and rev date in track status

* fix checksum handling in bulk load data

* lint

* add logging to debug

* /version 1.4.0a6

* Add SSM parameter read for last run time

* Feature/issue-225: Create one track ingest table per feature type (#226)

* add track ingest tables for each feature type and adjust load data to populate

* changelog

* /version 1.4.0a7

* Feature/issue 196 Add new feature type to query the API for lake data (#224)

* Initial API queries for lake data

* Unit tests for lake data

* Updates after center point calculations

- Removed temp code to calculate a point in API
- Implemented unit test to test lake data retrieval
- Updated fixtures to load in lake data for testing

* Add read lake table permissions to lambda timeseries and track ingest roles

* Update documenation to include lake data

* Updated documentation to include info on lake centerpoints

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a8

* Feature/issue 205 - Add Confluence API key (#221)

* Fix possible variable references before value is assigned

* Define Confluence API key and trusted partner plan limits

* Define a list of trusted partner keys and store under single parameter

* Define API keys as encrypted envrionment variables for Lambda authorizer

* Update authorizer and connection class to use KMS to retrieve API keys

* Hack to force lambda deployment when ssm value changes (#218)

* Add replace_triggered_by to hydrocron_lambda_authorizer

* Introduce environment variable that contains random id which will change whenever an API key value changes. This will force lambda to publish new version of the function.

* Remove unnecessary hash function

* Update to SSM parameter API key storage and null_resource enviroment variable

* Update Terraform and AWS provider

* Update API key documentation

* Set source_code_hash to force deployment of new image

* Downgrade AWS provider to 4.0 to remove inline policy errors

* Update docs/timeseries.md

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a9

* /version 1.4.0a10

* changelog for 1.4.0 release

* /version 1.5.0a0

* Initial track ingest table query

* Fix linting and code style

* Implement feature count operations

* Enable S3 permissions and set environment variable for track lambda

* Fix trailing white spaces and code format

* Update docstrings for class methods

* Implement run time storage in SSM

* Query track table unit tests

* Update CHANGELOG with issue

* Update SSM run time parameter

* Fix trailing whitespace

* Fix reference to IAM policy

* Enable specification of temporal range to search revision date by

* Fix SSM put parameter policy

* Update IAM permissions for reading track ingest

* Enable full temporal search on CMR granules

* Add capability to download shapefile granule to count features

* Update granule UR to include .zip

* Count features via Hydrocron table query

* Remove unnecessary s3 permissions

* Remove whitespace from blank line

* Update cryptography to 43.0.1

* Update track ingest table operations

* Update changelog with issue

* update dependencies

* upgrade geopandas

* update dependencies

---------

Co-authored-by: nikki-t <[email protected]>
Co-authored-by: Frank Greguska <[email protected]>
Co-authored-by: frankinspace <[email protected]>
Co-authored-by: Victoria McDonald <[email protected]>
Co-authored-by: Cassie Nickles <[email protected]>
Co-authored-by: cassienickles <[email protected]>
Co-authored-by: podaac-cicd[bot] <podaac-cicd[bot]@users.noreply.github.com>
Co-authored-by: Victoria McDonald <[email protected]>
Co-authored-by: torimcd <[email protected]>

* /version 1.5.0a3

* Feature/issue 203 - Construct CNM to trigger load data (#232)

* /version 1.3.0a0

* Update build.yml

* /version 1.3.0a1

* /version 1.3.0a2

* Feature/issue 175 - Update docs to point to OPS (#176)

* changelog

* update examples, remove load_data readme, info moved to wiki

* Dependency update to fix snyk scan

* issues/101: Support for HTTP Accept header (#172)

* Reorganize timeseries code to  prep for Accept header

* Enable Accept header to return response of specific content-type

* Fix whitespace and string continuation

* Make error handling consistent and add an additional test where a reach can't be found

* Update changelog with issue for unreleased version

* Add 415 status code to API definition

* Few minor cleanup items

* Few minor cleanup items

* Update to [email protected]

* Fix dependencies

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.3.0a3

* issues/102: Support compression of API response (#173)

* Enable payload compression

* Update changelog with issue

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.3.0a4

* Feature/issue 100 Add option to 'compact' GeoJSON result into single feature (#177)

* Reorganize timeseries code to  prep for Accept header

* Enable Accept header to return response of specific content-type

* Fix whitespace and string continuation

* Make error handling consistent and add an additional test where a reach can't be found

* Update changelog with issue for unreleased version

* Add 415 status code to API definition

* Few minor cleanup items

* Few minor cleanup items

* Update to [email protected]

* Fix dependencies

* Update required query parameters based on current API functionality

* Enable return of 'compact' GeoJSON response

* Fix linting and add test data

* Update documentation for API accept headers and compact GeoJSON response

* Fix references to incorrect Accept header examples

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.3.0a5

* Feature/issue 183 (#185)

* Provide introduction to timeseries endpoint

* Remove _units in fields list

* Fix typo

* Update examples with Accept headers and compact query parameter

* Add issue to changelog

* Fix typo in timeseries documentation

* Update pymysql

* Update pymysql

* Provide clarity on accept headers and request parameter fields

* /version 1.3.0a6

* Feature/issue 186 Implement API keys (#188)

* API Gateway Lambda authorizer to facilitate API keys and usage plans

* Unit tests to test Lambda authorizer

* Fix terraform file formatting

* API Gateway Lambda Authorizer

- Lambda function
- API Keys and Authorizer definition in OpenAPI spec
- API gateway API keys
- API gateway usage plans
- SSM parameters for API keys

* Fix trailing whitespace

* Set default region environment variable

* Fix SNYK vulnerabilities

* Add issue to changelog

* Implement custom trusted partner header x-hydrocron-key

* Update cryptography for SNYK vulnerability

* Update documentation to include API key usage

* Update quota and throttle settings for API Gateway

* Update API keys documentation to indicate to be implemented

* Move API key lookup to Lambda INIT

* Remove API key authentication and update API key to x-hydrocron-key

* /version 1.3.0a7

* Update changelog for 1.3.0 release

* /version 1.4.0a0

* Feature/issue 198 (#207)

* Update pylint to deal with errors and fix collection reference

* Initial CMR and Hydrocron queries

- Includes placeholders for other operations needed to track granule
ingest.
- GranuleUR query for Hydrocron tables.

* Add and set up vcrpy for testing CMR API query

* Test track ingest operations

- Test CMR and hydrocron queries
- Test granuleUR query
- Update database to include granuleUR GSI

* Update to use track_ingest naming consistently

* Initial Lambda function and IAM role definition

* Replace deprecated path function with as_file

* Add SSM read IAM permissions

* Add DynamoDB read permissions

* Update track ingest lambda memory

* Remove duplicate IAM permissions

* Add in permissions to query index

* Update changelog

* Update changelog description

* Use python_cmr for CMR API queries

* /version 1.4.0a1

* Add doi to documentation pages (#216)

* Update intro.md with DOI

* Update overview.md with DOI

* /version 1.4.0a2

* issue-193: Add Dynamo DB Table for SWOT Prior Lakes (#209)

* add code to handle prior lakes shapefiles, add test prior lake data

* update terraform to add prior lake table

* fix tests, change to smaller test data file, changelog

* linting

* reconfigure main load_data method to make more readable and pass linting

* lint

* lint

* fix string casting to lower storage req & update test responses to handle different rounding pattern in coords

* update load benchmarking function for linting and add unit test

* try parent collection for lakes

* update version parsing for parent collection

* fix case error

* fix lake id reference

* add logging to troubleshoot too large features

* add item size logging and remove error raise for batch write

* clean up logging statements & move numeric_columns assignment

* update batch logging statement

* Rename constant

* Fix temp dir security risk https://rules.sonarsource.com/python/RSPEC-5443/

* Fix temp dir security risk https://rules.sonarsource.com/python/RSPEC-5443/

* fix code coverage calculation

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a3

* Feature/issue 201 Create a table for tracking granule ingest status (#214)

* Define track ingest database and IAM permissions

* Update changelog with issue

* Modify table structure to support sparse status index

* Updated to only apply PITR in ops

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a4

* Feature/issue 210 - Load large geometry polygons (#219)

* add functions to handle null geometries and convert polygons to points

* update doi in docs

* fix fill null geometries

* fix tests and update changelog

* /version 1.4.0a5

* Feature/issue 222 - Add granule info to track ingest table on load (#223)

* adjust lambdas to populate track ingest table on granule load

* changelog

* remove test cnm

* lint

* change error caught when handling checksum

* update lambda role permissions to write to track ingest table

* fix typo on lake table terraform

* set default fill values for checksum and rev date in track status

* fix checksum handling in bulk load data

* lint

* add logging to debug

* /version 1.4.0a6

* Add SSM parameter read for last run time

* Feature/issue-225: Create one track ingest table per feature type (#226)

* add track ingest tables for each feature type and adjust load data to populate

* changelog

* /version 1.4.0a7

* Feature/issue 196 Add new feature type to query the API for lake data (#224)

* Initial API queries for lake data

* Unit tests for lake data

* Updates after center point calculations

- Removed temp code to calculate a point in API
- Implemented unit test to test lake data retrieval
- Updated fixtures to load in lake data for testing

* Add read lake table permissions to lambda timeseries and track ingest roles

* Update documenation to include lake data

* Updated documentation to include info on lake centerpoints

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a8

* Feature/issue 205 - Add Confluence API key (#221)

* Fix possible variable references before value is assigned

* Define Confluence API key and trusted partner plan limits

* Define a list of trusted partner keys and store under single parameter

* Define API keys as encrypted envrionment variables for Lambda authorizer

* Update authorizer and connection class to use KMS to retrieve API keys

* Hack to force lambda deployment when ssm value changes (#218)

* Add replace_triggered_by to hydrocron_lambda_authorizer

* Introduce environment variable that contains random id which will change whenever an API key value changes. This will force lambda to publish new version of the function.

* Remove unnecessary hash function

* Update to SSM parameter API key storage and null_resource enviroment variable

* Update Terraform and AWS provider

* Update API key documentation

* Set source_code_hash to force deployment of new image

* Downgrade AWS provider to 4.0 to remove inline policy errors

* Update docs/timeseries.md

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a9

* /version 1.4.0a10

* changelog for 1.4.0 release

* /version 1.5.0a0

* Initial track ingest table query

* Fix linting and code style

* Implement feature count operations

* Enable S3 permissions and set environment variable for track lambda

* Fix trailing white spaces and code format

* Update docstrings for class methods

* Implement run time storage in SSM

* Query track table unit tests

* Update CHANGELOG with issue

* Update SSM run time parameter

* Fix trailing whitespace

* Fix reference to IAM policy

* Enable specification of temporal range to search revision date by

* Fix SSM put parameter policy

* Update IAM permissions for reading track ingest

* Enable full temporal search on CMR granules

* Add capability to download shapefile granule to count features

* Update granule UR to include .zip

* Count features via Hydrocron table query

* Remove unnecessary s3 permissions

* Remove whitespace from blank line

* Update cryptography to 43.0.1

* Update track ingest table operations

* Update changelog with issue

* update dependencies

* upgrade geopandas

* update dependencies

* Implement operations to publish CNM messages for granules requiring ingest

* Implement unit test of publication operations

* Fix linting

* Add issue to changelog and fix linting

* Add EventBridge schedules with appropriate Lambda permissions

* Set initial schedule expressions and fix assume policy

* Disable eventbridge schedules by default

* Update schedule to run weekly

* Define 1 hour latency to search by revision_date in CMR

---------

Co-authored-by: nikki-t <[email protected]>
Co-authored-by: Frank Greguska <[email protected]>
Co-authored-by: frankinspace <[email protected]>
Co-authored-by: Victoria McDonald <[email protected]>
Co-authored-by: Cassie Nickles <[email protected]>
Co-authored-by: cassienickles <[email protected]>
Co-authored-by: podaac-cicd[bot] <podaac-cicd[bot]@users.noreply.github.com>
Co-authored-by: Victoria McDonald <[email protected]>
Co-authored-by: torimcd <[email protected]>

* Feature/issue 236: Update track ingest to allow UAT query of CMR (#240)

* /version 1.3.0a0

* Update build.yml

* /version 1.3.0a1

* /version 1.3.0a2

* Feature/issue 175 - Update docs to point to OPS (#176)

* changelog

* update examples, remove load_data readme, info moved to wiki

* Dependency update to fix snyk scan

* issues/101: Support for HTTP Accept header (#172)

* Reorganize timeseries code to  prep for Accept header

* Enable Accept header to return response of specific content-type

* Fix whitespace and string continuation

* Make error handling consistent and add an additional test where a reach can't be found

* Update changelog with issue for unreleased version

* Add 415 status code to API definition

* Few minor cleanup items

* Few minor cleanup items

* Update to [email protected]

* Fix dependencies

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.3.0a3

* issues/102: Support compression of API response (#173)

* Enable payload compression

* Update changelog with issue

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.3.0a4

* Feature/issue 100 Add option to 'compact' GeoJSON result into single feature (#177)

* Reorganize timeseries code to  prep for Accept header

* Enable Accept header to return response of specific content-type

* Fix whitespace and string continuation

* Make error handling consistent and add an additional test where a reach can't be found

* Update changelog with issue for unreleased version

* Add 415 status code to API definition

* Few minor cleanup items

* Few minor cleanup items

* Update to [email protected]

* Fix dependencies

* Update required query parameters based on current API functionality

* Enable return of 'compact' GeoJSON response

* Fix linting and add test data

* Update documentation for API accept headers and compact GeoJSON response

* Fix references to incorrect Accept header examples

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.3.0a5

* Feature/issue 183 (#185)

* Provide introduction to timeseries endpoint

* Remove _units in fields list

* Fix typo

* Update examples with Accept headers and compact query parameter

* Add issue to changelog

* Fix typo in timeseries documentation

* Update pymysql

* Update pymysql

* Provide clarity on accept headers and request parameter fields

* /version 1.3.0a6

* Feature/issue 186 Implement API keys (#188)

* API Gateway Lambda authorizer to facilitate API keys and usage plans

* Unit tests to test Lambda authorizer

* Fix terraform file formatting

* API Gateway Lambda Authorizer

- Lambda function
- API Keys and Authorizer definition in OpenAPI spec
- API gateway API keys
- API gateway usage plans
- SSM parameters for API keys

* Fix trailing whitespace

* Set default region environment variable

* Fix SNYK vulnerabilities

* Add issue to changelog

* Implement custom trusted partner header x-hydrocron-key

* Update cryptography for SNYK vulnerability

* Update documentation to include API key usage

* Update quota and throttle settings for API Gateway

* Update API keys documentation to indicate to be implemented

* Move API key lookup to Lambda INIT

* Remove API key authentication and update API key to x-hydrocron-key

* /version 1.3.0a7

* Update changelog for 1.3.0 release

* /version 1.4.0a0

* Feature/issue 198 (#207)

* Update pylint to deal with errors and fix collection reference

* Initial CMR and Hydrocron queries

- Includes placeholders for other operations needed to track granule
ingest.
- GranuleUR query for Hydrocron tables.

* Add and set up vcrpy for testing CMR API query

* Test track ingest operations

- Test CMR and hydrocron queries
- Test granuleUR query
- Update database to include granuleUR GSI

* Update to use track_ingest naming consistently

* Initial Lambda function and IAM role definition

* Replace deprecated path function with as_file

* Add SSM read IAM permissions

* Add DynamoDB read permissions

* Update track ingest lambda memory

* Remove duplicate IAM permissions

* Add in permissions to query index

* Update changelog

* Update changelog description

* Use python_cmr for CMR API queries

* /version 1.4.0a1

* Add doi to documentation pages (#216)

* Update intro.md with DOI

* Update overview.md with DOI

* /version 1.4.0a2

* issue-193: Add Dynamo DB Table for SWOT Prior Lakes (#209)

* add code to handle prior lakes shapefiles, add test prior lake data

* update terraform to add prior lake table

* fix tests, change to smaller test data file, changelog

* linting

* reconfigure main load_data method to make more readable and pass linting

* lint

* lint

* fix string casting to lower storage req & update test responses to handle different rounding pattern in coords

* update load benchmarking function for linting and add unit test

* try parent collection for lakes

* update version parsing for parent collection

* fix case error

* fix lake id reference

* add logging to troubleshoot too large features

* add item size logging and remove error raise for batch write

* clean up logging statements & move numeric_columns assignment

* update batch logging statement

* Rename constant

* Fix temp dir security risk https://rules.sonarsource.com/python/RSPEC-5443/

* Fix temp dir security risk https://rules.sonarsource.com/python/RSPEC-5443/

* fix code coverage calculation

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a3

* Feature/issue 201 Create a table for tracking granule ingest status (#214)

* Define track ingest database and IAM permissions

* Update changelog with issue

* Modify table structure to support sparse status index

* Updated to only apply PITR in ops

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a4

* Feature/issue 210 - Load large geometry polygons (#219)

* add functions to handle null geometries and convert polygons to points

* update doi in docs

* fix fill null geometries

* fix tests and update changelog

* /version 1.4.0a5

* Feature/issue 222 - Add granule info to track ingest table on load (#223)

* adjust lambdas to populate track ingest table on granule load

* changelog

* remove test cnm

* lint

* change error caught when handling checksum

* update lambda role permissions to write to track ingest table

* fix typo on lake table terraform

* set default fill values for checksum and rev date in track status

* fix checksum handling in bulk load data

* lint

* add logging to debug

* /version 1.4.0a6

* Add SSM parameter read for last run time

* Feature/issue-225: Create one track ingest table per feature type (#226)

* add track ingest tables for each feature type and adjust load data to populate

* changelog

* /version 1.4.0a7

* Feature/issue 196 Add new feature type to query the API for lake data (#224)

* Initial API queries for lake data

* Unit tests for lake data

* Updates after center point calculations

- Removed temp code to calculate a point in API
- Implemented unit test to test lake data retrieval
- Updated fixtures to load in lake data for testing

* Add read lake table permissions to lambda timeseries and track ingest roles

* Update documenation to include lake data

* Updated documentation to include info on lake centerpoints

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a8

* Feature/issue 205 - Add Confluence API key (#221)

* Fix possible variable references before value is assigned

* Define Confluence API key and trusted partner plan limits

* Define a list of trusted partner keys and store under single parameter

* Define API keys as encrypted envrionment variables for Lambda authorizer

* Update authorizer and connection class to use KMS to retrieve API keys

* Hack to force lambda deployment when ssm value changes (#218)

* Add replace_triggered_by to hydrocron_lambda_authorizer

* Introduce environment variable that contains random id which will change whenever an API key value changes. This will force lambda to publish new version of the function.

* Remove unnecessary hash function

* Update to SSM parameter API key storage and null_resource enviroment variable

* Update Terraform and AWS provider

* Update API key documentation

* Set source_code_hash to force deployment of new image

* Downgrade AWS provider to 4.0 to remove inline policy errors

* Update docs/timeseries.md

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a9

* /version 1.4.0a10

* changelog for 1.4.0 release

* update dependencies for 1.4.0 release

* /version 1.5.0a0

* fix CMR query in UAT

* /version 1.4.0rc1

* fix typo in load_data lambda

* /version 1.4.0rc2

* Initial track ingest table query

* Fix linting and code style

* Implement feature count operations

* Enable S3 permissions and set environment variable for track lambda

* Fix trailing white spaces and code format

* Update docstrings for class methods

* Implement run time storage in SSM

* Query track table unit tests

* Update CHANGELOG with issue

* Update SSM run time parameter

* Fix trailing whitespace

* Fix reference to IAM policy

* Enable specification of temporal range to search revision date by

* Fix SSM put parameter policy

* Update IAM permissions for reading track ingest

* Enable full temporal search on CMR granules

* Add capability to download shapefile granule to count features

* Update granule UR to include .zip

* Count features via Hydrocron table query

* Remove unnecessary s3 permissions

* Remove whitespace from blank line

* Update cryptography to 43.0.1

* Update track ingest table operations

* Update changelog with issue

* update dependencies

* upgrade geopandas

* update dependencies

* fix index on rev date in load data lambda

* update dependencies

* lint readme

* /version 1.4.0rc3

* /version 1.4.0rc4

* Implement operations to publish CNM messages for granules requiring ingest

* Implement unit test of publication operations

* Fix linting

* Add issue to changelog and fix linting

* Add EventBridge schedules with appropriate Lambda permissions

* Set initial schedule expressions and fix assume policy

* fix cmr env search by venue

* /version 1.4.0rc5

* Disable eventbridge schedules by default

* Update schedule to run weekly

* Define 1 hour latency to search by revision_date in CMR

* Allow CMR UAT query based on HYDROCRON_ENV environment variable

* Update unit tests to accomodate UAT CMR query

* Add earthdata login credentials to Lambda

* Add issue to changelog

* Fix linting white space

---------

Co-authored-by: nikki-t <[email protected]>
Co-authored-by: Frank Greguska <[email protected]>
Co-authored-by: frankinspace <[email protected]>
Co-authored-by: Victoria McDonald <[email protected]>
Co-authored-by: Cassie Nickles <[email protected]>
Co-authored-by: cassienickles <[email protected]>
Co-authored-by: podaac-cicd[bot] <podaac-cicd[bot]@users.noreply.github.com>
Co-authored-by: Victoria McDonald <[email protected]>
Co-authored-by: torimcd <[email protected]>

* /version 1.5.0a4

* /version 1.6.0a0

* /version 1.5.0a1

* Feature/issue 235 - Track ingest table can be populated with granules that aren't loaded into Hydrocron (#245)

* Raise an error if collection shortname does not match Hydrocron table names

* Raise an error unsupported lake data in load granule operations

* Remove trailing whitespace

* Fix code formatting

* Update CHANGELOG with issue

* Feature/issue 248 - Track ingest operations need to query UAT for granule files (#249)

* Query to return granule files should query UAT when running in SIT or UAT environments

* SIT execution should return UAT files for load granule operations

* Set venue environment variable before running test of query_cmr

* Add issue to CHANGELOG

* /version 1.5.0a2

* Update readme and changelog (#253)

* /version 1.5.0a3

* Feature/issue 250 - Handle overlapping times with unique CRIDS (#257)

* Handle overlapping times with unique CRIDs

* Define unit test to test when reprocessed granule arrives

* Add issue to CHANGELOG

* Add reprocessed CRID to eventbridge schedule input

* Handle cases with empty reprocessed_crid

* /version 1.5.0a4

* Feature/issue 258 - Granules with large numbers of features cannot be loadedd (#259)

* change assemble attrs function to avoid for loop

* change how attributes are concatenated during shp unpack to avoid slow looping

* remove unused import

* Update API test data with less precise data coordinates

* remove logging every item in batch writer

* lint

---------

Co-authored-by: Nikki <[email protected]>

* /version 1.5.0a5

* 1.5.0 release and unreleased section

* Update dependencies

* Provide better logging and debug logs

* Increase track ingest and load granule lambda timeout and memory

* Allow track ingest status query limit to prevent timeouts

* Handle granule UR product counter increments

* Fix linting for conditional check

* Test product counter cases

* Make DEBUG_LOG environment variable optional

* Add new track ingest environment variables to test

* /version 1.5.0rc1

* Handle cases where more than a single incremented product arrives

- Filter for both Hydrocron SWOT table queries and Track Ingest
table queries
- Remove duplicates prior to product counter increment detection

* Provide reprocessed_crid argument for track ingest table query

* /version 1.5.0rc2

* Fix bug that exists when removing older product counters

* /version 1.5.0rc3

* Remove overlap between ingested and to_ingest

* /version 1.5.0rc4

* Update collection start date for track ingest CMR queries eventbridge schedule

* /version 1.5.0rc5

---------

Co-authored-by: nikki-t <[email protected]>
Co-authored-by: Frank Greguska <[email protected]>
Co-authored-by: frankinspace <[email protected]>
Co-authored-by: Victoria McDonald <[email protected]>
Co-authored-by: Cassie Nickles <[email protected]>
Co-authored-by: cassienickles <[email protected]>
Co-authored-by: podaac-cicd[bot] <podaac-cicd[bot]@users.noreply.github.com>
Co-authored-by: Victoria McDonald <[email protected]>
Co-authored-by: torimcd <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement API keys to control usage
2 participants