Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend locking mechanism to different resources #1093

Closed
anushka-singh opened this issue Mar 7, 2024 · 6 comments · Fixed by #1338
Closed

Extend locking mechanism to different resources #1093

anushka-singh opened this issue Mar 7, 2024 · 6 comments · Fixed by #1338
Assignees
Labels
effort: medium priority: high status: not-picked-yet At the moment we have not picked this item. Anyone can pick it up

Comments

@anushka-singh
Copy link
Contributor

Is your idea related to a problem? Please describe.
We implemented locking for dataset level. However, we are now facing issues when 2 or more shares are getting approved for the same requestor at the same time. This gives rise to a race condition where the IAM policy of requestor is modified at the same time.
We observed an error like:

Failed to update policy SHARE_POLICY_ARN : An error occurred (NoSuchEntity) when calling the DeletePolicyVersion operation: Policy SHARE_POLICY_ARN version v3 does not exist or is not attachable.

Example:

  • UserA in GroupA requests shares to bucket in Dataset B_1 and folder in Dataset B_2
  • GroupB (who owns both datasets) approves the share at the same time
  • Both shares will be processed at the same time and update the IAM share policy of UserA to give S3 and KMS permissions to the resources (here causes the error)

Describe the solution you'd like
Think about a way to extend locking mechanism beyond just dataset locking to different resource types. What if we need to lock resource X during activity Y (i.e. lock datasets from other actions when updating, or similar)

P.S. Don't attach files. Please, prefer add code snippets directly in the message body.

@dlpzx
Copy link
Contributor

dlpzx commented Mar 20, 2024

@anushka-singh thanks for opening an issue. This feature came up during validation of 2.3. Do you think you have capacity to take on the extension? In any case, we have added it to our 2.5 project and try to pick it up as soon as possible.

@dlpzx dlpzx added priority: high status: not-picked-yet At the moment we have not picked this item. Anyone can pick it up effort: medium priority: medium and removed priority: high labels Mar 20, 2024
@anushka-singh
Copy link
Contributor Author

@dlpzx I will be working on expiration of shares feature next. But if I have bandwidth for 2.5 then can pick this up shortly after.

@noah-paige
Copy link
Contributor

Thoughts on the design for this issue below...

Changes to DB Schema

  • Rename dataset_lock to resource_lock
  • Rename column datasetUri to resourceUri
  • Add column acquiredByResourceType to represent a string/enum value for what resource it is acquired by (i.e. share, etc.)
    • This will be additional information alongside the acquiredBy Column to store the shareUri
    • Do we think this would be useful in long-run (i.e. think if we extend locking to not only be a part of the sharing mechanism) ?

Changes to Share Logic

  • Check if either datasetUri is locked OR Requester Principal is locked
    • OPEN ENDED Q: The primary key for Env Group is composite key of groupUri and environmentUri
      • This issue is that GroupA can request access from many dif Envs (need to check lock for groupUri-envUri pair!)
      • Option A - in resourceUri column store concatenated Uri Pair ('groupUri-envUri')
      • Option B - in resourceUri column store environmentIAMRoleArn (in this case maybe resourceUri is the wrong name for the column)
      • Option C - introduce resourceEnvUri column and make it a composite key along with resourceUri ++ backfilling

Please take a look @anushka-singh @dlpzx - I will get on implementing the above shortly but there are some open questions I would appreciate input on

@anushka-singh
Copy link
Contributor Author

Hey Noah,
Thanks for picking this up and coming up with a design on this.
I am OOO tomorrow, but I will get back to you on this by Thursday.

@anushka-singh
Copy link
Contributor Author

Thoughts on the design for this issue below...

Changes to DB Schema

  • Rename dataset_lock to resource_lock

  • Rename column datasetUri to resourceUri

  • Add column acquiredByResourceType to represent a string/enum value for what resource it is acquired by (i.e. share, etc.)

    • This will be additional information alongside the acquiredBy Column to store the shareUri
    • Do we think this would be useful in long-run (i.e. think if we extend locking to not only be a part of the sharing mechanism) ?
  • Agree to the DB schema changes.
  • Yes I think we should extend locking and not just keep it restricted to sharing. We have discussed in the past that we should introduce resource level locking for datasets, environments and organizations etc when they are undergoing updates. It can be used in tandem with the maintenance window work to ensure processes are paused when updates on these resources start and are picked up again when maintenance window is complete.

Changes to Share Logic

  • Check if either datasetUri is locked OR Requester Principal is locked

    • OPEN ENDED Q: The primary key for Env Group is composite key of groupUri and environmentUri

      • This issue is that GroupA can request access from many dif Envs (need to check lock for groupUri-envUri pair!)
      • Option A - in resourceUri column store concatenated Uri Pair ('groupUri-envUri')
      • Option B - in resourceUri column store environmentIAMRoleArn (in this case maybe resourceUri is the wrong name for the column)
      • Option C - introduce resourceEnvUri column and make it a composite key along with resourceUri ++ backfilling
  • For checking env group, I like option A the best. It is easy to implement and does not need a new column introduced either.

Please take a look @anushka-singh @dlpzx - I will get on implementing the above shortly but there are some open questions I would appreciate input on

noah-paige added a commit that referenced this issue Jun 27, 2024
### Feature or Bugfix
<!-- please choose -->
- Feature
- Bugfix
- Refactoring

### Detail
- Convert Dataset Lock Mechanism to Generic Resource Lock
- Extend locking to Share principals (i.e. EnvironmentGroup and
Consumption Roles)

- Making locking a generic component not tied to datasets


### Relates
- #1093 

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Co-authored-by: dlpzx <[email protected]>
@noah-paige
Copy link
Contributor

We have implemented this extension of locking mechanism to be generic for any resource rather than dataset specific in PR #1338

We went with Option A above as a rule of thumb to concatenate Uri Pair for records that have composite primary keys

  • For example for EnvGroups (table = enviornment_group_permission) the composite key is groupUri and environmentUri so the primary key identifier in Resource Lock table is 'groupUri-envUri'

We also changed the model such that acquire_lock will create records in the table and release_lock will delete records - this way there is no requirement to add create_lock and delete_lock logic whenever we create/delete a new resource and any resource can leverage the locking class functions without additional custom logic

I will go ahead and close this issue - thanks for all the design considerations and help in above thread!

cc: @anushka-singh

@noah-paige noah-paige linked a pull request Jul 1, 2024 that will close this issue
noah-paige added a commit that referenced this issue Aug 30, 2024
commit 22a6f6ef 
Author: Noah Paige <[email protected]> 
Date: Mon Jul 08 2024 11:28:07 GMT-0400 (Eastern Daylight Time) 

    Add integ tests


commit 4fb7d653 
Author: Noah Paige <[email protected]> 
Date: Mon Jul 08 2024 11:26:36 GMT-0400 (Eastern Daylight Time) 

    Merge env test changes


commit 4cf42e8 
Author: Petros Kalos <[email protected]> 
Date: Fri Jul 05 2024 08:19:34 GMT-0400 (Eastern Daylight Time) 

    improve docs


commit 65f930a 
Author: Petros Kalos <[email protected]> 
Date: Fri Jul 05 2024 08:10:56 GMT-0400 (Eastern Daylight Time) 

    fix failures


commit 170b7ce 
Author: Petros Kalos <[email protected]> 
Date: Wed Jul 03 2024 10:52:20 GMT-0400 (Eastern Daylight Time) 

    add group/consumption_role invite/remove tests


commit ba77d69 
Author: dlpzx <[email protected]> 
Date: Wed Jul 03 2024 06:51:47 GMT-0400 (Eastern Daylight Time) 

    Rename alias for env_vars kms key in cognito lambdas FE and BE (#1385)

### Feature or Bugfix
- Bugfix

### Detail
For the case in which we deploy FE and BE in us-east-1 the new lambda
env_key alias is the same one for TriggerFunctionCognitoUrlsConfig in FE
and for TriggerFunctionCognitoConfig in BE, which results in a failure
of the CICD in the FE stack because the alias already exists.

This PR changes the name of both aliases to avoid this conflict. It also
adds envname to avoid issues with other deployment environments/tooling
account in the future

### Relates
- <URL or Ticket>

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit e5923a9 
Author: dlpzx <[email protected]> 
Date: Wed Jul 03 2024 04:27:11 GMT-0400 (Eastern Daylight Time) 

    Fix lambda_env_key out of scope for vpc-facing cognito setup (#1384)

### Feature or Bugfix
- Bugfix

### Detail
The KMS key for the Lambda environment variables in the Cognito IdP
stack was defined inside an if-clause for internet facing frontend.
Outside of that if, for vpc-facing architecture the kms key does not
exist and the CICD pipeline fails. This PRs move the creation of the KMS
key outside of the if.

### Relates

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 3ccacfc 
Author: Noah Paige <[email protected]> 
Date: Mon Jul 01 2024 13:56:58 GMT-0400 (Eastern Daylight Time) 

    Add delete docs not found when re indexing in catalog task (#1365)

### Feature or Bugfix
<!-- please choose -->
- Feature

### Detail
- Add logic to Catalog Indexer Task to Delete Docs No Longer in RDS
- TODO: Add Ability to Re-index Catalog Items via Dataall Admin UI

### Relates
- #1078

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit e2817a1 
Author: Noah Paige <[email protected]> 
Date: Mon Jul 01 2024 05:14:07 GMT-0400 (Eastern Daylight Time) 

    Fix/glossary status (#1373)

### Feature or Bugfix
<!-- please choose -->
- Bugfix


### Detail
- Add back `status` to Glossary GQL Object for GQL Operations
(getGlossary, listGlossaries)
- Fix  `listOrganizationGroupPermissions` enforce non null on FE


### Relates


### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit c3c58bd 
Author: Petros Kalos <[email protected]> 
Date: Fri Jun 28 2024 06:55:42 GMT-0400 (Eastern Daylight Time) 

    add environment tests (#1371)

### Feature or Bugfix
Feature

### Detail
* add list_environment tests
* add test for updating an environment (via update_stack)
* generalise the polling functions for stacks

### Relates
#1220 

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit e913d48 
Author: dlpzx <[email protected]> 
Date: Fri Jun 28 2024 04:15:49 GMT-0400 (Eastern Daylight Time) 

    Add search (Autocomplete) in miscellaneous dropdowns (#1367)

### Feature or Bugfix
- Feature

### Detail
Autocomplete for environments and teams in the following frontend views
as requested in #1012. In this case the views required custom dropdowns.

❗ I used `noOptionsText` whenever it was necessary instead of checking
groupOptions lenght >0
- [x] DatasetEditForm.js -> ❗ I kept the stewards field as `freesolo` -
what that means is that users CAN specify options that are not on the
list. I would like the reviewer to confirm this is what we want. At the
end stewardship is a delegation of permissions, it makes sense that
delegation happens to other teams. Also changed DatasetCreateForm
- [X] RequestDashboardAccessModal.js - already implemented, minor
changes
- [X] EnvironmentTeamInviteForm.js - already implemented, minor changes.
-> Kept `freesolo` because invited teams might not be the user teams.
Same reason why there is no check for groupOptions == 0, if there are no
options there is still the free text option.
- [X] EnvironmentRoleAddForm.js
- [X] NetworkCreateModal.js 

### Relates
- #1012 

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit ee71d7b 
Author: Tejas Rajopadhye <[email protected]> 
Date: Thu Jun 27 2024 14:08:27 GMT-0400 (Eastern Daylight Time) 

    [Gh 1301] Enhancement Feature - Bulk share reapply on dataset  (#1363)

### Feature or Bugfix
- Feature


### Detail

- Adds feature to reapply shares in bulk for a dataset. 
- Also contains bugfix for AWS worker lambda errors 

### Relates
- #1301
- #1364

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)? N/A
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization? N/A
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features? N/A
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users? N/A
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Co-authored-by: trajopadhye <[email protected]>

commit 27f1ad7 
Author: Noah Paige <[email protected]> 
Date: Thu Jun 27 2024 13:18:32 GMT-0400 (Eastern Daylight Time) 

    Convert Dataset Lock Mechanism to Generic Resource Lock (#1338)

### Feature or Bugfix
<!-- please choose -->
- Feature
- Bugfix
- Refactoring

### Detail
- Convert Dataset Lock Mechanism to Generic Resource Lock
- Extend locking to Share principals (i.e. EnvironmentGroup and
Consumption Roles)

- Making locking a generic component not tied to datasets


### Relates
- #1093 

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Co-authored-by: dlpzx <[email protected]>

commit e3b8658 
Author: Petros Kalos <[email protected]> 
Date: Thu Jun 27 2024 12:50:59 GMT-0400 (Eastern Daylight Time) 

    ignore ruff change in blame (#1372)

### Feature or Bugfix
<!-- please choose -->
- Feature
- Bugfix
- Refactoring

### Detail
- <feature1 or bug1>
- <feature2 or bug2>

### Relates
- <URL or Ticket>

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 2e80de4 
Author: dlpzx <[email protected]> 
Date: Thu Jun 27 2024 10:59:18 GMT-0400 (Eastern Daylight Time) 

    Generic shares_base module and specific s3_datasets_shares module - part 11 (renaming and cleaning up s3_shares) (#1359)

### Feature or Bugfix
- Refactoring

### Detail
As explained in the design for #1123 and #1283 we are trying to
implement generic `datasets_base` and `shares_base` modules that can be
used by any type of datasets and by any type of shareable object in a
generic way.

This is one of the last PRs focused on renaming files and cleaning-up
the s3_datasets_shares module. The first step is a consolidation of the
file and classes names in the services to clearly refer to s3_shares:
- `services.managed_share_policy_service.SharePolicyService` --->
`services.s3_share_managed_policy_service.S3SharePolicyService`
- `services.dataset_sharing_alarm_service.DatasetSharingAlarmService`
--> `services.s3_share_alarm_service.S3ShareAlarmService`
- `services.managed_share_policy_service.SharePolicyService` -->
`services.s3_share_managed_policy_service.S3SharePolicyService`

👀 The main refactoring happens in what is used to be
`services.dataset_sharing_service`.
- The part that implements the `DatasetServiceInterface` has been moved
to `services/s3_share_dataset_service.py` as the `S3ShareDatasetService`
- The part used in the resolvers and by other methods has been renamed
as `services.s3_share_service.py` and the methods for the folder/table
permissions are also added to the S3ShareService (from
share_item_service)

Lastly, there is one method previously in share_item_service that has
been moved to the GlueClient directly as
`get_glue_database_from_catalog`.


### Relates
- #1283 
- #1123 
- #955 

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 1c09015 
Author: Noah Paige <[email protected]> 
Date: Thu Jun 27 2024 04:16:14 GMT-0400 (Eastern Daylight Time) 

    fix listOrganizationGroupPermissions (#1369)

### Feature or Bugfix
<!-- please choose -->
- Bugfix


### Detail
- Fix listOrganizationGroupPermissions


### Relates
- <URL or Ticket>

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 976ec6b 
Author: dlpzx <[email protected]> 
Date: Thu Jun 27 2024 04:13:14 GMT-0400 (Eastern Daylight Time) 

    Add search (Autocomplete) in create pipelines (#1368)

### Feature or Bugfix
- Feature

### Detail
Autocomplete for environments and teams in the following frontend views
as requested in #1012.
This PR implements it for createPipelines

### Relates
- #1012 

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 6c909a3 
Author: Noah Paige <[email protected]> 
Date: Wed Jun 26 2024 11:18:04 GMT-0400 (Eastern Daylight Time) 

    fix migration to not rely on OrganizationService or RequestContext (#1361)

### Feature or Bugfix
<!-- please choose -->
- Bugfix

### Detail
- Ensure migration script does not need RequestContext - otherwise fails
in migration trigger lambda as context info not set / available


### Relates
- #1306

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 90835fb 
Author: Anushka Singh <[email protected]> 
Date: Wed Jun 26 2024 11:17:22 GMT-0400 (Eastern Daylight Time) 

    Issue1248: Persistent Email Reminders (#1354)

### Feature or Bugfix
- Feature


### Detail
- When a share request is initiated and remains pending for an extended
period, dataset producers will receive automated email reminders at
predefined intervals. These reminders will prompt producers to either
approve or extend the share request, thereby preventing delays in
accessing datasets.

Attaching screenshots for emails:

<img width="1336" alt="Screenshot 2024-06-20 at 5 34 31 PM"
src="https://github.com/data-dot-all/dataall/assets/26413731/d7be28c3-5c98-4146-92b1-295e136137a3">

<img width="1322" alt="Screenshot 2024-06-20 at 5 34 52 PM"
src="https://github.com/data-dot-all/dataall/assets/26413731/047556e8-59ee-4ebf-b8a7-c0a6684e2a63">


- Email will be sent every Monday at 9am UTC. Schedule can be changed in
cron expression in container.py

### Relates
- #1248

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Anushka Singh <[email protected]>
Co-authored-by: trajopadhye <[email protected]>
Co-authored-by: Mohit Arora <[email protected]>
Co-authored-by: rbernota <[email protected]>
Co-authored-by: Rick Bernotas <[email protected]>
Co-authored-by: Raj Chopde <[email protected]>
Co-authored-by: Noah Paige <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: jaidisido <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: mourya-33 <[email protected]>
Co-authored-by: nikpodsh <[email protected]>
Co-authored-by: MK <[email protected]>
Co-authored-by: Manjula <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Daniel Lorch <[email protected]>
Co-authored-by: Tejas Rajopadhye <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Sofia Sazonova <[email protected]>
Co-authored-by: Sofia Sazonova <[email protected]>

commit e477bdf 
Author: Noah Paige <[email protected]> 
Date: Wed Jun 26 2024 10:39:09 GMT-0400 (Eastern Daylight Time) 

    Enforce non null on GQL query string if non null defined (#1362)

### Feature or Bugfix
<!-- please choose -->
- Bugfix


### Detail
- Add `String!` to ensure non null input argument on FE if defined as
such on backend GQL operation for `listS3DatasetsSharedWithEnvGroup`


### Relates

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit d6b59b3 
Author: Noah Paige <[email protected]> 
Date: Wed Jun 26 2024 08:48:52 GMT-0400 (Eastern Daylight Time) 

    Fix Init Share Base (#1360)

### Feature or Bugfix
<!-- please choose -->
- Bugfix

### Detail
- Need to register processors in init for s3 dataset shares API module


### Relates

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit bd3698c 
Author: Petros Kalos <[email protected]> 
Date: Wed Jun 26 2024 05:19:14 GMT-0400 (Eastern Daylight Time) 

    split cognito urls setup and cognito user creation (#1366)

### Feature or Bugfix
- Bugfix
### Details
For more details about the issue read #1353 
In this PR we are solving the problem by splitting the configuration of
Cognito in 2.
* First part (cognito_users_config.py) is setting up the required groups
and users and runs after UserPool deployment
* Second part (cognito_urls_config.py) is setting up Cognito's
callback/logout urls and runs after the CloudFront deployment

We chose to split the functionality because we need to have the
users/groups setup for the integration tests which are run after the
backend deployment.

The other althernative is to keep the config functionality as one but
make the integ tests run after CloudFront stage.

### Relates
- Solves #1353 

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.
noah-paige added a commit that referenced this issue Aug 30, 2024
commit 4425e756 
Author: Noah Paige <[email protected]> 
Date: Mon Jul 08 2024 11:57:31 GMT-0400 (Eastern Daylight Time) 

    Fix


commit 4cd2bf77 
Author: Noah Paige <[email protected]> 
Date: Mon Jul 08 2024 11:56:38 GMT-0400 (Eastern Daylight Time) 

    Fix


commit 22a6f6ef 
Author: Noah Paige <[email protected]> 
Date: Mon Jul 08 2024 11:28:07 GMT-0400 (Eastern Daylight Time) 

    Add integ tests


commit 4fb7d653 
Author: Noah Paige <[email protected]> 
Date: Mon Jul 08 2024 11:26:36 GMT-0400 (Eastern Daylight Time) 

    Merge env test changes


commit 4cf42e8 
Author: Petros Kalos <[email protected]> 
Date: Fri Jul 05 2024 08:19:34 GMT-0400 (Eastern Daylight Time) 

    improve docs


commit 65f930a 
Author: Petros Kalos <[email protected]> 
Date: Fri Jul 05 2024 08:10:56 GMT-0400 (Eastern Daylight Time) 

    fix failures


commit 170b7ce 
Author: Petros Kalos <[email protected]> 
Date: Wed Jul 03 2024 10:52:20 GMT-0400 (Eastern Daylight Time) 

    add group/consumption_role invite/remove tests


commit ba77d69 
Author: dlpzx <[email protected]> 
Date: Wed Jul 03 2024 06:51:47 GMT-0400 (Eastern Daylight Time) 

    Rename alias for env_vars kms key in cognito lambdas FE and BE (#1385)

### Feature or Bugfix
- Bugfix

### Detail
For the case in which we deploy FE and BE in us-east-1 the new lambda
env_key alias is the same one for TriggerFunctionCognitoUrlsConfig in FE
and for TriggerFunctionCognitoConfig in BE, which results in a failure
of the CICD in the FE stack because the alias already exists.

This PR changes the name of both aliases to avoid this conflict. It also
adds envname to avoid issues with other deployment environments/tooling
account in the future

### Relates
- <URL or Ticket>

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit e5923a9 
Author: dlpzx <[email protected]> 
Date: Wed Jul 03 2024 04:27:11 GMT-0400 (Eastern Daylight Time) 

    Fix lambda_env_key out of scope for vpc-facing cognito setup (#1384)

### Feature or Bugfix
- Bugfix

### Detail
The KMS key for the Lambda environment variables in the Cognito IdP
stack was defined inside an if-clause for internet facing frontend.
Outside of that if, for vpc-facing architecture the kms key does not
exist and the CICD pipeline fails. This PRs move the creation of the KMS
key outside of the if.

### Relates

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 3ccacfc 
Author: Noah Paige <[email protected]> 
Date: Mon Jul 01 2024 13:56:58 GMT-0400 (Eastern Daylight Time) 

    Add delete docs not found when re indexing in catalog task (#1365)

### Feature or Bugfix
<!-- please choose -->
- Feature

### Detail
- Add logic to Catalog Indexer Task to Delete Docs No Longer in RDS
- TODO: Add Ability to Re-index Catalog Items via Dataall Admin UI

### Relates
- #1078

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit e2817a1 
Author: Noah Paige <[email protected]> 
Date: Mon Jul 01 2024 05:14:07 GMT-0400 (Eastern Daylight Time) 

    Fix/glossary status (#1373)

### Feature or Bugfix
<!-- please choose -->
- Bugfix


### Detail
- Add back `status` to Glossary GQL Object for GQL Operations
(getGlossary, listGlossaries)
- Fix  `listOrganizationGroupPermissions` enforce non null on FE


### Relates


### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit c3c58bd 
Author: Petros Kalos <[email protected]> 
Date: Fri Jun 28 2024 06:55:42 GMT-0400 (Eastern Daylight Time) 

    add environment tests (#1371)

### Feature or Bugfix
Feature

### Detail
* add list_environment tests
* add test for updating an environment (via update_stack)
* generalise the polling functions for stacks

### Relates
#1220 

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit e913d48 
Author: dlpzx <[email protected]> 
Date: Fri Jun 28 2024 04:15:49 GMT-0400 (Eastern Daylight Time) 

    Add search (Autocomplete) in miscellaneous dropdowns (#1367)

### Feature or Bugfix
- Feature

### Detail
Autocomplete for environments and teams in the following frontend views
as requested in #1012. In this case the views required custom dropdowns.

❗ I used `noOptionsText` whenever it was necessary instead of checking
groupOptions lenght >0
- [x] DatasetEditForm.js -> ❗ I kept the stewards field as `freesolo` -
what that means is that users CAN specify options that are not on the
list. I would like the reviewer to confirm this is what we want. At the
end stewardship is a delegation of permissions, it makes sense that
delegation happens to other teams. Also changed DatasetCreateForm
- [X] RequestDashboardAccessModal.js - already implemented, minor
changes
- [X] EnvironmentTeamInviteForm.js - already implemented, minor changes.
-> Kept `freesolo` because invited teams might not be the user teams.
Same reason why there is no check for groupOptions == 0, if there are no
options there is still the free text option.
- [X] EnvironmentRoleAddForm.js
- [X] NetworkCreateModal.js 

### Relates
- #1012 

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit ee71d7b 
Author: Tejas Rajopadhye <[email protected]> 
Date: Thu Jun 27 2024 14:08:27 GMT-0400 (Eastern Daylight Time) 

    [Gh 1301] Enhancement Feature - Bulk share reapply on dataset  (#1363)

### Feature or Bugfix
- Feature


### Detail

- Adds feature to reapply shares in bulk for a dataset. 
- Also contains bugfix for AWS worker lambda errors 

### Relates
- #1301
- #1364

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)? N/A
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization? N/A
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features? N/A
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users? N/A
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Co-authored-by: trajopadhye <[email protected]>

commit 27f1ad7 
Author: Noah Paige <[email protected]> 
Date: Thu Jun 27 2024 13:18:32 GMT-0400 (Eastern Daylight Time) 

    Convert Dataset Lock Mechanism to Generic Resource Lock (#1338)

### Feature or Bugfix
<!-- please choose -->
- Feature
- Bugfix
- Refactoring

### Detail
- Convert Dataset Lock Mechanism to Generic Resource Lock
- Extend locking to Share principals (i.e. EnvironmentGroup and
Consumption Roles)

- Making locking a generic component not tied to datasets


### Relates
- #1093 

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Co-authored-by: dlpzx <[email protected]>

commit e3b8658 
Author: Petros Kalos <[email protected]> 
Date: Thu Jun 27 2024 12:50:59 GMT-0400 (Eastern Daylight Time) 

    ignore ruff change in blame (#1372)

### Feature or Bugfix
<!-- please choose -->
- Feature
- Bugfix
- Refactoring

### Detail
- <feature1 or bug1>
- <feature2 or bug2>

### Relates
- <URL or Ticket>

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 2e80de4 
Author: dlpzx <[email protected]> 
Date: Thu Jun 27 2024 10:59:18 GMT-0400 (Eastern Daylight Time) 

    Generic shares_base module and specific s3_datasets_shares module - part 11 (renaming and cleaning up s3_shares) (#1359)

### Feature or Bugfix
- Refactoring

### Detail
As explained in the design for #1123 and #1283 we are trying to
implement generic `datasets_base` and `shares_base` modules that can be
used by any type of datasets and by any type of shareable object in a
generic way.

This is one of the last PRs focused on renaming files and cleaning-up
the s3_datasets_shares module. The first step is a consolidation of the
file and classes names in the services to clearly refer to s3_shares:
- `services.managed_share_policy_service.SharePolicyService` --->
`services.s3_share_managed_policy_service.S3SharePolicyService`
- `services.dataset_sharing_alarm_service.DatasetSharingAlarmService`
--> `services.s3_share_alarm_service.S3ShareAlarmService`
- `services.managed_share_policy_service.SharePolicyService` -->
`services.s3_share_managed_policy_service.S3SharePolicyService`

👀 The main refactoring happens in what is used to be
`services.dataset_sharing_service`.
- The part that implements the `DatasetServiceInterface` has been moved
to `services/s3_share_dataset_service.py` as the `S3ShareDatasetService`
- The part used in the resolvers and by other methods has been renamed
as `services.s3_share_service.py` and the methods for the folder/table
permissions are also added to the S3ShareService (from
share_item_service)

Lastly, there is one method previously in share_item_service that has
been moved to the GlueClient directly as
`get_glue_database_from_catalog`.


### Relates
- #1283 
- #1123 
- #955 

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 1c09015 
Author: Noah Paige <[email protected]> 
Date: Thu Jun 27 2024 04:16:14 GMT-0400 (Eastern Daylight Time) 

    fix listOrganizationGroupPermissions (#1369)

### Feature or Bugfix
<!-- please choose -->
- Bugfix


### Detail
- Fix listOrganizationGroupPermissions


### Relates
- <URL or Ticket>

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 976ec6b 
Author: dlpzx <[email protected]> 
Date: Thu Jun 27 2024 04:13:14 GMT-0400 (Eastern Daylight Time) 

    Add search (Autocomplete) in create pipelines (#1368)

### Feature or Bugfix
- Feature

### Detail
Autocomplete for environments and teams in the following frontend views
as requested in #1012.
This PR implements it for createPipelines

### Relates
- #1012 

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 6c909a3 
Author: Noah Paige <[email protected]> 
Date: Wed Jun 26 2024 11:18:04 GMT-0400 (Eastern Daylight Time) 

    fix migration to not rely on OrganizationService or RequestContext (#1361)

### Feature or Bugfix
<!-- please choose -->
- Bugfix

### Detail
- Ensure migration script does not need RequestContext - otherwise fails
in migration trigger lambda as context info not set / available


### Relates
- #1306

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 90835fb 
Author: Anushka Singh <[email protected]> 
Date: Wed Jun 26 2024 11:17:22 GMT-0400 (Eastern Daylight Time) 

    Issue1248: Persistent Email Reminders (#1354)

### Feature or Bugfix
- Feature


### Detail
- When a share request is initiated and remains pending for an extended
period, dataset producers will receive automated email reminders at
predefined intervals. These reminders will prompt producers to either
approve or extend the share request, thereby preventing delays in
accessing datasets.

Attaching screenshots for emails:

<img width="1336" alt="Screenshot 2024-06-20 at 5 34 31 PM"
src="https://github.com/data-dot-all/dataall/assets/26413731/d7be28c3-5c98-4146-92b1-295e136137a3">

<img width="1322" alt="Screenshot 2024-06-20 at 5 34 52 PM"
src="https://github.com/data-dot-all/dataall/assets/26413731/047556e8-59ee-4ebf-b8a7-c0a6684e2a63">


- Email will be sent every Monday at 9am UTC. Schedule can be changed in
cron expression in container.py

### Relates
- #1248

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Anushka Singh <[email protected]>
Co-authored-by: trajopadhye <[email protected]>
Co-authored-by: Mohit Arora <[email protected]>
Co-authored-by: rbernota <[email protected]>
Co-authored-by: Rick Bernotas <[email protected]>
Co-authored-by: Raj Chopde <[email protected]>
Co-authored-by: Noah Paige <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: jaidisido <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: mourya-33 <[email protected]>
Co-authored-by: nikpodsh <[email protected]>
Co-authored-by: MK <[email protected]>
Co-authored-by: Manjula <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Daniel Lorch <[email protected]>
Co-authored-by: Tejas Rajopadhye <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Sofia Sazonova <[email protected]>
Co-authored-by: Sofia Sazonova <[email protected]>

commit e477bdf 
Author: Noah Paige <[email protected]> 
Date: Wed Jun 26 2024 10:39:09 GMT-0400 (Eastern Daylight Time) 

    Enforce non null on GQL query string if non null defined (#1362)

### Feature or Bugfix
<!-- please choose -->
- Bugfix


### Detail
- Add `String!` to ensure non null input argument on FE if defined as
such on backend GQL operation for `listS3DatasetsSharedWithEnvGroup`


### Relates

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit d6b59b3 
Author: Noah Paige <[email protected]> 
Date: Wed Jun 26 2024 08:48:52 GMT-0400 (Eastern Daylight Time) 

    Fix Init Share Base (#1360)

### Feature or Bugfix
<!-- please choose -->
- Bugfix

### Detail
- Need to register processors in init for s3 dataset shares API module


### Relates

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit bd3698c 
Author: Petros Kalos <[email protected]> 
Date: Wed Jun 26 2024 05:19:14 GMT-0400 (Eastern Daylight Time) 

    split cognito urls setup and cognito user creation (#1366)

### Feature or Bugfix
- Bugfix
### Details
For more details about the issue read #1353 
In this PR we are solving the problem by splitting the configuration of
Cognito in 2.
* First part (cognito_users_config.py) is setting up the required groups
and users and runs after UserPool deployment
* Second part (cognito_urls_config.py) is setting up Cognito's
callback/logout urls and runs after the CloudFront deployment

We chose to split the functionality because we need to have the
users/groups setup for the integration tests which are run after the
backend deployment.

The other althernative is to keep the config functionality as one but
make the integ tests run after CloudFront stage.

### Relates
- Solves #1353 

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
effort: medium priority: high status: not-picked-yet At the moment we have not picked this item. Anyone can pick it up
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants