Create generic dataset_base and s3_dataset modules from current datasets #1123
Labels
Comments
dlpzx added type: enhancement (Feature enhancement), priority: high, effort: medium, effort: large and removed effort: medium labels Mar 25, 2024
This was referenced Apr 5, 2024
Closed
15 tasks
Step-by-step implementation plan
Pre-requisites
1. Split datasets | Separate
2. Split config.json | Separate
3. Split dataset sharing | Separate
This was referenced May 7, 2024
dlpzx added a commit that referenced this issue May 7, 2024
…me datasets as s3_datasets) (#1250)
### Feature or Bugfix
- Refactoring
### Detail
- Rename `datasets` module to `s3_datasets` module

This PR is the first step to extract a generic datasets_base module that implements the undifferentiated concepts of Dataset in data.all. s3_datasets will use this base module to implement the specific implementation for S3 datasets.
### Relates
- #1123
- #955
### Security
Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/).
- Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)?
  - Is the input sanitized?
  - What precautions are you taking before deserializing the data you consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires authorization?
  - How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use standard, proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
dlpzx added a commit that referenced this issue May 15, 2024
…te datasets_base and move enums) (#1257)
### Feature or Bugfix
⚠️ This PR should be merged after #1250.
- Refactoring
### Detail
As explained in the design for #1123, we are trying to implement a generic `datasets_base` module that can be used by any type of dataset.

This PR:
- Creates the skeleton of the `datasets_base` module, consisting of 3 packages (`db`, `api`, `services`), and adds the `__init__` file.
- Adds the dependency of `s3_datasets` on `datasets_base` in the `__init__` file of the `s3_datasets` module
- Moves datasets_enums to datasets_base
### Relates
- #1123
- #955
This was referenced May 16, 2024
dlpzx added a commit that referenced this issue May 17, 2024
…te DatasetBase db model and S3Dataset model) (#1258)
### Feature or Bugfix
⚠️ This PR should be merged after #1257.
- Feature
- Refactoring
### Detail
As explained in the design for #1123, we are trying to implement a generic `datasets_base` module that can be used by any type of dataset.

**This PR does**:
- Add a generic `DatasetBase` model in datasets_base.db that is used in s3_datasets.db to build the `S3Dataset` model using joined table inheritance in [sqlalchemy](https://docs.sqlalchemy.org/en/20/orm/inheritance.html)
- Rename all usages of Dataset to S3Dataset (in the future some will be returned to DatasetBase, but for the moment we will keep them as S3Dataset)
- Add migration script that backfills `datasets` table and renames `s3_datasets`

⚠️ In the process of migrating we are doing some "scary" operations on the dataset table. If for any reason the migration encounters any issue, it could result in catastrophic loss of information. For this reason this [PR](#1267) implements RDS snapshots on migrations.

**This PR does not**:
- Migrate the Feed definition. Feed registration stays as `FeedRegistry.register(FeedDefinition('Dataset', S3Dataset))`, using `Dataset` with the `S3Dataset` resource type. Migrating it is out of the scope of this PR.
- Migrate the GlossaryRegistry registration, for exactly the same reason. We keep `object_type='Dataset'` to avoid backwards compatibility issues.
- Change the resourceType for permissions. We keep using a generic `Dataset` as target for S3 permissions. If we are to split permissions into DatasetBase permissions and S3Dataset permissions, we would do it in a different PR.
#### Remarks
Inserting new items of S3Dataset does not require any changes. SQLAlchemy joined inheritance automatically inserts a row in the parent table and then another one in the child table, as explained in this stackoverflow [link](https://stackoverflow.com/questions/39926937/sqlalchemy-how-to-insert-a-joined-table-inherited-class-instance-when-the-pare) (I was not able to find it in the official docs)
### Relates
- #1123
- #955
- #1267
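For readers less familiar with joined table inheritance, here is a minimal sketch of what the pattern amounts to at the SQL level, written with the standard-library `sqlite3` module rather than SQLAlchemy. The table names (`dataset`, `s3_dataset`), the columns, and the helpers `insert_s3_dataset` and `get_s3_dataset` are assumptions made for illustration; they are not the actual data.all schema or migration.

```python
import sqlite3

# In-memory database; the schema below is hypothetical and only illustrates
# the parent/child layout that joined table inheritance relies on.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE dataset (              -- parent: generic dataset columns
        datasetUri  TEXT PRIMARY KEY,
        label       TEXT NOT NULL,
        datasetType TEXT NOT NULL       -- discriminator column
    );
    CREATE TABLE s3_dataset (           -- child: S3-specific columns only
        datasetUri   TEXT PRIMARY KEY REFERENCES dataset(datasetUri),
        S3BucketName TEXT NOT NULL
    );
    """
)


def insert_s3_dataset(conn, uri, label, bucket):
    """Insert one logical S3 dataset: parent row first, then child row."""
    with conn:  # one transaction, rolled back if either insert fails
        conn.execute("INSERT INTO dataset VALUES (?, ?, ?)", (uri, label, "S3"))
        conn.execute("INSERT INTO s3_dataset VALUES (?, ?)", (uri, bucket))


def get_s3_dataset(conn, uri):
    """Read the full S3 dataset by joining child and parent on the shared key."""
    return conn.execute(
        """
        SELECT d.datasetUri, d.label, d.datasetType, s.S3BucketName
        FROM dataset d JOIN s3_dataset s ON s.datasetUri = d.datasetUri
        WHERE d.datasetUri = ?
        """,
        (uri,),
    ).fetchone()
```

With an ORM that supports joined table inheritance, both inserts and the join are issued for you when a child-class instance is saved or loaded, which is why creating new S3 datasets needs no code changes.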
dlpzx added a commit that referenced this issue May 21, 2024
…te DatasetBaseRepository and move DatasetLock) (#1276)
### Feature or Bugfix
⚠️ Merge after #1258
- Refactoring
### Detail
As explained in the design for #1123, we are trying to implement a generic `datasets_base` module that can be used by any type of dataset.

In this small PR we:
- move the generic DatasetLock model to datasets_base
- move the DatasetLock db operations to the datasets_base DatasetBaseRepository
- move activity to DatasetBaseRepository
### Relates
- #1123
- #955
dlpzx added a commit that referenced this issue May 21, 2024
…e DatasetServiceInterface to datasets_base, add property, create first list API for datasets_base) (#1281)
### Feature or Bugfix
- Feature
- Refactoring
### Detail
As explained in the design for #1123, we are trying to implement a generic `datasets_base` module that can be used by any type of dataset.

In this PR we:
- Move DatasetServiceInterface to datasets_base. This interface is used by datasets_sharing to "inject" logic in s3_datasets
- Add property dataset_type to the DatasetServiceInterface interface to distinguish which type of dataset this interface applies to.
- Create first list API for datasets_base. 👀 This is the most important part. When there are multiple types of datasets, users will still list all datasets together in several places in the UI (e.g. in listDatasets in DatasetList view, in listDatasetsEnvironment in Environment view). These API calls are not specific to s3_datasets, but generic to any type of dataset. Thus, they should be part of datasets_base. This PR introduces the datasets_list_service and datasetListRepository, and includes only one example of an API that moves to dataset_base. In the next PRs we will move the rest of the APIs
### Relates
- #1123
- #955
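The interface-injection idea with a `dataset_type` property can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the data.all code: only `DatasetServiceInterface` and `dataset_type` come from the PR description, while `DatasetTypes`, `check_before_delete`, `S3ShareHooks`, `register` and `can_delete` are hypothetical names.

```python
from abc import ABC, abstractmethod
from enum import Enum
from typing import List, Set


class DatasetTypes(Enum):  # hypothetical enum of dataset kinds
    S3 = "S3"


class DatasetServiceInterface(ABC):
    """Lives in the base module; other modules implement it to inject logic."""

    @property
    @abstractmethod
    def dataset_type(self) -> DatasetTypes:
        """Which type of dataset this implementation applies to."""

    @abstractmethod
    def check_before_delete(self, uri: str) -> bool:
        """Hypothetical hook: return False to veto deleting the dataset."""


class S3ShareHooks(DatasetServiceInterface):
    """What a sharing module might register for S3 datasets (illustrative)."""

    def __init__(self, active_shares: Set[str]):
        self.active_shares = active_shares

    @property
    def dataset_type(self) -> DatasetTypes:
        return DatasetTypes.S3

    def check_before_delete(self, uri: str) -> bool:
        return uri not in self.active_shares


_registered: List[DatasetServiceInterface] = []


def register(interface: DatasetServiceInterface) -> None:
    _registered.append(interface)


def can_delete(dataset_type: DatasetTypes, uri: str) -> bool:
    """Called by the dataset module; consults only hooks for this dataset type."""
    return all(
        hook.check_before_delete(uri)
        for hook in _registered
        if hook.dataset_type == dataset_type
    )
```

The dataset module never imports the sharing module here; it only sees the interface, and the property lets it skip hooks registered for other dataset types.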
dlpzx added a commit that referenced this issue May 21, 2024
…ve list queries to dataset_base or rename them) (#1282)
### Feature or Bugfix
- Refactoring
### Detail
As explained in the design for #1123, we are trying to implement a generic `datasets_base` module that can be used by any type of dataset.

In this PR we:
- Restructure listDatasetsOwnedByEnvGroup as listS3DatasetsOwnedByEnvGroup and move it into Worksheets in FE. It is moved to Worksheets because that is the only place where it is used in the FE. One could argue that in the BE listS3DatasetsOwnedByEnvGroup is part of the S3_Dataset module. The way I see it, FE and BE are independent and their modularization strategies fit the type of programming: what makes sense in FE might not make sense in BE. In BE, queries belong to the module whose services/models they are performing actions on, in this case s3_datasets. In FE, queries belong to the module where they are used, and if a query is used by more than one module then it can be placed in the generic `services` directory. What is important is that we define the dependencies. In this case it is important to make Worksheets dependent on S3_Datasets (as we do in the index in `frontend/src/modules/Worksheets/index.js` and in `backend/dataall/modules/worksheets/__init__.py`)
- Move listDatasetsCreatedInEnvironment to datasets_base
### Relates
- #1123
- #955
dlpzx added a commit that referenced this issue May 22, 2024
…art 1 (renaming, enums and permissions) (#1284)
### Feature or Bugfix
- Feature
- Refactoring
### Detail
As explained in the design for #1123 and #1283, we are trying to implement generic `datasets_base` and `shares_base` modules that can be used by any type of dataset and by any type of shareable object.

In this PR:
- Rename `dataset_sharing` as `s3_dataset_shares`
- Create `shares_base` and introduce dependency (`s3_dataset_shares` depends on `shares_base`)
- Move generic enums to shares_base
- Move generic permissions to shares_base
### Relates
- #1283
- #1123
- #955
dlpzx added a commit that referenced this issue Jun 3, 2024
…t config.json) (#1297)
### Feature or Bugfix
- Refactoring

⚠️ ⚠️ When releasing we need to give a heads up to customers, as this change might overwrite their current config.json configurations. It will probably result in some conflicting changes to resolve.
### Detail
As explained in the design for #1123, we are trying to implement a generic `datasets_base` module that can be used by any type of dataset.

This PR is the last one of the series. It moves the generic config.json parameters for datasets into a new field `datasets_base` in the config.json. Specific `s3_datasets` parameters stay in the s3 module configuration.
### Relates
- #1123
- #955
Completed! |
dlpzx added a commit that referenced this issue Jun 4, 2024
…art 3 (share processor and manager interfaces) (#1298)
### Feature or Bugfix
- Feature
- Refactoring
### Detail
As explained in the design for #1123 and #1283, we are trying to implement generic `datasets_base` and `shares_base` modules that can be used by any type of dataset and by any type of shareable object.

In this PR:
- Move ECS handler for sharing tasks to shares_base
- Move sharing tasks to shares_base
- Move DataSharingService to shares_base and rename it as SharingService. Make SharingService generic and remove any reference to specific items. In this process:
  - Attach and delete table and folder permissions are moved into the specific share processor. Delete permissions are processed item by item and moved into the shareItemService
  - Some methods are copied from ShareObjectRepository in s3_datasets_shares to shares_base. They have not been removed from s3_datasets_shares; instead there is a TODO marking those methods that have been copied. The migration and clean-up of shareObjectRepository will be done in a following PR
  - Need to introduce `DatasetBaseRepository.get_dataset_by_uri(session, share.datasetUri)` to avoid future circular dependencies: shares_base depends only on datasets_base.
  - Clean-up and consolidate methods: remove updates of the share items outside of state machine transactions. Only re-used methods and dataset lock handling remain in share_manager --> TODO: dataset_lock manager should be its own service outside of sharing, but this is out of the scope of this PR.
  - Add updates of share_item statuses in except clauses for more robust sharing
- Introduce ShareProcessor and ShareManager interfaces and use them in the SharingService: instead of the processor inheriting the manager class, the processor uses the ShareProcessor interface and constructs a manager when needed.
- Introduce new load ImportMode `SHARES_TASK` and register ShareProcessors in s3_datasets_share (`backend/dataall/modules/s3_datasets_shares/__init__.py`)

![image](https://github.com/data-dot-all/dataall/assets/71252798/af4eafc3-c990-4532-ba25-62ef950791aa)

See full detail of SharingService design in #1283
### Next steps/Open questions
For failures I think we should roll back whatever actions were performed. For example, if we are sharing a table and it fails in one step, it should revert all the steps executed before. @petrkalos @SofiaSazonova @noah-paige what do you think?
### Relates
- #1283
- #1123
- #955
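As a rough illustration of the processor/manager split described above, the sketch below shows a processor that implements an interface and constructs a manager when needed, instead of inheriting from it. It also shows a generic service that only knows about registered processors. All class, method and status names here are assumptions for illustration (only the ShareProcessor, ShareManager and SharingService concepts come from the PR description); the real design is documented in #1283.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterable, List


class ShareManagerInterface(ABC):
    """Low-level operations for one kind of shared item (hypothetical method)."""

    @abstractmethod
    def grant_access(self, item_uri: str) -> None: ...


class ShareProcessorInterface(ABC):
    """What the generic sharing service calls for each registered item type."""

    @abstractmethod
    def process_approved_shares(self, item_uris: List[str]) -> Dict[str, str]: ...


class TableShareManager(ShareManagerInterface):
    def __init__(self, broken_items: Iterable[str] = ()):
        self.broken_items = set(broken_items)  # simulates items that fail
        self.granted: List[str] = []

    def grant_access(self, item_uri: str) -> None:
        if item_uri in self.broken_items:
            raise RuntimeError(f"cannot grant access to {item_uri}")
        self.granted.append(item_uri)


class TableShareProcessor(ShareProcessorInterface):
    def __init__(self, broken_items: Iterable[str] = ()):
        self.broken_items = list(broken_items)

    def process_approved_shares(self, item_uris: List[str]) -> Dict[str, str]:
        manager = TableShareManager(self.broken_items)  # composed, not inherited
        statuses: Dict[str, str] = {}
        for uri in item_uris:
            try:
                manager.grant_access(uri)
                statuses[uri] = "SUCCEEDED"  # hypothetical status value
            except Exception:
                statuses[uri] = "FAILED"  # status is updated in the except clause
        return statuses


class SharingService:
    """Generic service: it knows item types and processors, nothing S3-specific."""

    _processors: Dict[str, ShareProcessorInterface] = {}

    @classmethod
    def register_processor(cls, item_type: str, processor: ShareProcessorInterface) -> None:
        cls._processors[item_type] = processor

    @classmethod
    def approve_share(cls, items_by_type: Dict[str, List[str]]) -> Dict[str, str]:
        statuses: Dict[str, str] = {}
        for item_type, uris in items_by_type.items():
            statuses.update(cls._processors[item_type].process_approved_shares(uris))
        return statuses
```

A module that owns a shareable item type would call `SharingService.register_processor(...)` when it is loaded, so the generic service never has to import that module.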
This was referenced Jun 5, 2024
dlpzx added a commit that referenced this issue Jun 6, 2024
…art 4 (remove s3 info from shareItem db models) (#1311)
### Feature or Bugfix
- Refactoring
### Detail
As explained in the design for #1123 and #1283, we are trying to implement generic `datasets_base` and `shares_base` modules that can be used by any type of dataset and by any type of shareable object.

In this PR:
- Remove the fields `GlueDatabaseName`, `GlueTableName` and `S3AccessPointName` from the ShareObjectItem db model. The goal is to have a generic ShareObjectItem and access the specific item information through its itemUri and itemType.
- Migration script to drop those columns.
- Disclaimer: there is still work to do on the gql types, but that is out of the scope of this PR
### Relates
- #1283
- #1123
- #955
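To make the "generic item plus lookup by itemUri and itemType" idea concrete, here is a small sketch. It is only an illustration: the in-memory `TABLES` and `FOLDERS` dictionaries stand in for the type-specific tables, and the `RESOLVERS` mapping, the `get_item_details` helper and the item type strings are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class ShareObjectItem:
    """Generic share item: no Glue or S3 fields, only a pointer to the item."""

    shareItemUri: str
    itemUri: str
    itemType: str


# Stand-ins for the tables owned by the S3-specific module (illustrative data).
TABLES = {"tbl-1": {"GlueDatabaseName": "sales_db", "GlueTableName": "orders"}}
FOLDERS = {"fld-1": {"S3AccessPointName": "sales-ap"}}

# One resolver per item type; each module could register its own.
RESOLVERS: Dict[str, Callable[[str], dict]] = {
    "Table": TABLES.__getitem__,
    "StorageLocation": FOLDERS.__getitem__,
}


def get_item_details(item: ShareObjectItem) -> dict:
    """Fetch type-specific details on demand instead of storing them on the item."""
    return RESOLVERS[item.itemType](item.itemUri)
```

Because the specific fields are resolved through `itemUri`, adding a new shareable type in this sketch means registering one more resolver rather than adding columns to the generic model.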
dlpzx added a commit that referenced this issue Jun 6, 2024
…art 5 (move exceptions and notifications to shares_base) (#1312)
### Feature or Bugfix
- Refactoring
### Detail
As explained in the design for #1123 and #1283, we are trying to implement generic `datasets_base` and `shares_base` modules that can be used by any type of dataset and by any type of shareable object.

In this PR:
- Move share_exceptions to shares_base
- Move share_notification_service to shares_base
### Relates
- #1283
- #1123
- #955
This was referenced Jun 7, 2024
dlpzx added a commit that referenced this issue Jun 11, 2024
### Feature or Bugfix
- Bugfix
### Detail
After the refactoring of the datasets and datasets_sharing modules, their tests were not being executed. In this PR the corresponding test packages are renamed and the tests inside are fixed to match the latest changes in main. There are still things to improve; for example, we can simplify the conftests of the s3_datasets and the s3_datasets_shares modules, but I am going to leave that for a later PR
### Relates
- #1123
dlpzx added a commit that referenced this issue Jun 17, 2024
…art 6 (Split APIs and graphql types) (#1320)
### Feature or Bugfix
- Feature
- Refactoring
### Detail
As explained in the design for #1123 and #1283, we are trying to implement generic `datasets_base` and `shares_base` modules that can be used by any type of dataset and by any type of shareable object.

- First, this PR splits the query used in Worksheet and Environments to list glue databases and list datasets. They are pretty different queries: the one used in Worksheets is only relevant for S3 datasets, while the one in Environment is focused on the share items in general:
  - Introduce new API `listS3DatasetsSharedWithEnvGroup` to list shared glue databases in Worksheets view. It is part of the s3_datasets_shares module. This new API replaces the usage of `searchEnvironmentDataItems` in Worksheets frontend.
  - Remove Glue-parameters from `searchEnvironmentDataItems` API; this API belongs to shares_base. It is only used in the Environment view > Datasets tab, so I moved the API in frontend to modules/Environment.
- Remove unused parameters (`tables`, `locations`) from statistics in `api/types/ShareObjectStatistics`. Now the statistics are only generic.
- Introduce new API `getS3ConsumptionData` in s3_datasets_shares. This new API call gets the details of gluedatabase/table and s3accesspoint that were previously part of ShareObject. This way the graphql ShareObject does not contain specific S3 info.
- The rest of the APIs have been split between `shares_base` and `s3_datasets_shares`. In general, all the share lifecycle (create, add items, approve...) is part of shares_base. listDatasetShareObjects and verifyDatasetShareObjects, used in the S3-Dataset UI, are part of s3_datasets_shares
- TODO: Review tests and create new tests for get_consumption_data. Currently tests for shares and datasets are placed in the same folder. I will open a separate PR to order the tests a bit before this
### Relates
- #1283
- #1123
- #955
- #1277 ---> This PR needs to be merged and then I will introduce some changes in the ShareView.
dlpzx added a commit that referenced this issue Jun 19, 2024
…art 7 (share_object_service) (#1340)
### Feature or Bugfix
- Refactoring
### Detail
As explained in the design for #1123 and #1283, we are trying to implement generic `datasets_base` and `shares_base` modules that can be used by any type of dataset and by any type of shareable object.

The goal of this PR is to move the `share_object_service` to `shares_base` and refactor any dependency on S3 in the service.
- Move file and fix imports of ShareObjectService
- Use DatasetsBase and DatasetsBaseRepository instead of the S3 equivalents
- ⚠️ Avoid Dashboard check logic in `ShareObjectService.submit_share_object` (see below)
- ⚠️ Avoid SharePolicyService logic in `ShareObjectService.create_share_object` (see below)
- Create ShareLogsService for logs
- Remove unused methods
- I also copied share_item_service to shares_base (it will be used in the next PR)
#### Avoid Dashboard check logic in `ShareObjectService.submit_share_object`
Currently, whenever a share request is submitted, we check if the REQUESTER environment has dashboards enabled, and if there are shared tables we verify that the Quicksight subscription is active.

Alternative: perform this check in the share processor of tables. It solves the issue, but it gives a poorer user experience, as it is difficult for the requester to figure out why the share failed. This can be solved holistically as requested in #1168.

Decision: move the logic to the processor and make the table share fail.
#### Avoid SharePolicyService logic in `ShareObjectService.create_share_object`
When a share request is first created, we perform a series of operations to ensure that an IAM policy for the share requester principal IAM role is created.

Alternative 1: move this logic inside the share processor. Not sure if it is possible. It would be the ideal solution, but the SharePolicyService throws errors in the create share object API if the policy is not attached.

Alternative 2: implement an interface to define share_policies (similar to the dataset-actions that use share logic in `backend/dataall/modules/s3_datasets_shares/services/dataset_sharing_service.py`).

Decision: we want to preserve the user experience of having the IAM policy created before the share request is processed. Plus, it is not an uncommon pattern and it could get extended by other dataset types; for example, redshift sharing might need additional share policies. For this reason this PR implements alternative 2
### Relates
- #1283
- #1123
- #955
dlpzx added a commit that referenced this issue on Jun 19, 2024
…art 8 (sharei_item_service) (#1350)

### Feature or Bugfix
- Refactoring

### Detail
As explained in the design for #1123 and #1283, we are trying to implement generic `datasets_base` and `shares_base` modules that can be used by any type of dataset and by any type of shareable object.

The goal of this PR is to split the `share_item_service` logic into a generic service in `shares_base` and a specific service in `s3_datasets_shares`.
- `ShareItemService` in shares_base only has shareItem logic, without references to S3 or Glue.
- `S3ShareItemService` in s3_datasets_shares has the logic for share items that are tables and folders.

The file names are a bit messy, but I don't want to pollute this PR with more changes. I'll review the file names in part 10.

### Relates
- #1283
- #1123
- #955
dlpzx added a commit that referenced this issue on Jun 21, 2024
…art 9 (share db repositories) (#1351)

### Feature or Bugfix
- Refactoring

### Detail
As explained in the design for #1123 and #1283, we are trying to implement generic `datasets_base` and `shares_base` modules that can be used by any type of dataset and by any type of shareable object.

This PR includes:
- Split the ShareobjectRepository from s3_datasets_shares into:
  - `ShareobjectRepository` (shares_base): generic db operations on share objects, with no references to S3 or Glue.
  - `ShareStatusRepository` (shares_base): db operations related to the sharing state machine states. It is a way to split the db operations into smaller chunks.
  - `S3ShareobjectRepository` (s3_datasets_shares): db operations on s3 share objects, used only in s3_datasets_shares. They might contain references to DatasetTables, S3Datasets... They are used in the clean-up activities and to count resources in an environment.
- Adapt `S3ShareobjectRepository` to S3 objects. Some queries needed additional filters on the type of share items retrieved, so that the code still makes sense if anyone adds a new share type in the future. Some functions are renamed to make clear that they are s3 functions or to describe what they do.
- Make `ShareobjectRepository` completely generic. The following queries needed extra work:
  - ShareObjectRepository.get_share_item: renamed as `get_share_item_details`.
  - `list_shareable_items`: split into 2 parts, `list_shareable_items_of_type` + `paginated_list_shareable_items`. The first function is invoked once for each share processor in the list; instead of querying the DatasetTable, DatasetStorageLocation and DatasetBucket, we query the shareable_type. The challenge is to get all fields from the db Resource object that all of them are built upon. In particular, the field `itemName` does not match the BucketName (in buckets) or the S3Prefix (in folders). For this reason I added a migration script to backfill DatasetBucket.name with DatasetBucket.S3BucketName and DatasetStorageLocation.name with DatasetStorageLocation.S3Prefix. `paginated_list_shareable_items` joins the list of subqueries, filters and paginates.
  - In verify_dataset_share_objects, I replaced list_shareable_items with `ShareObjectRepository.get_all_share_items_in_share`, which does not need tables or storage locations. This avoids the whole S3 logic and unnecessary queries.
- Remove S3 references from shares_base.api.resolvers. Use DatasetBase and DatasetBaseRepository instead. Remove unused `ShareableObject`.
- I had some problems with circular dependencies, so I created the `ShareProcessorManager` in shares_base for the registration of processors. The SharingService uses the manager to get all processors.

Missing items for Part10:
- Lake Formation cross region table references in shares_base/services/share_item_service.py:add_shared_item
- remove table references in shares_base/services/share_item_service.py:remove_shared_item
- remove s3_prefix references in shares_base/services/share_notification_service:notify_new_data_available_from_owners
- RENAMING! Right now the names are a bit misleading

### Relates
- #1283
- #1123
- #955
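The processor registration and the two-step listing described for #1351 can be pictured with the following minimal sketch. It assumes an in-memory stand-in for the database: `ShareProcessorDefinition`, `fetch_items` and the sample item names are hypothetical, and the real functions build SQL subqueries rather than Python lists. Only the names `ShareProcessorManager`, `list_shareable_items_of_type` and `paginated_list_shareable_items` come from the PR description, and their signatures here are assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class ShareProcessorDefinition:
    """Hypothetical record describing one shareable type."""
    type_name: str                        # e.g. 'Table', 'StorageLocation', 'S3Bucket'
    fetch_items: Callable[[], List[str]]  # stand-in for a per-type database query


class ShareProcessorManager:
    """Registry in the generic module; type-specific modules register into it,
    so the generic module never imports them (avoiding circular imports)."""

    _definitions: Dict[str, ShareProcessorDefinition] = {}

    @classmethod
    def register_processor(cls, definition: ShareProcessorDefinition) -> None:
        cls._definitions[definition.type_name] = definition

    @classmethod
    def all_processors(cls) -> List[ShareProcessorDefinition]:
        return list(cls._definitions.values())


def list_shareable_items_of_type(definition: ShareProcessorDefinition) -> List[dict]:
    """Return the items of a single type in a common shape (itemType, itemName)."""
    return [{'itemType': definition.type_name, 'itemName': name} for name in definition.fetch_items()]


def paginated_list_shareable_items(page: int, page_size: int, term: str = '') -> dict:
    """Combine the per-type lists, filter by a search term, then paginate."""
    rows: List[dict] = []
    for definition in ShareProcessorManager.all_processors():
        rows.extend(list_shareable_items_of_type(definition))
    rows = [row for row in rows if term.lower() in row['itemName'].lower()]
    start = (page - 1) * page_size
    return {'count': len(rows), 'page': page, 'nodes': rows[start:start + page_size]}


# Each type-specific module would register its own definition at load time.
ShareProcessorManager.register_processor(ShareProcessorDefinition('Table', lambda: ['orders', 'customers']))
ShareProcessorManager.register_processor(ShareProcessorDefinition('StorageLocation', lambda: ['raw/orders']))
ShareProcessorManager.register_processor(ShareProcessorDefinition('S3Bucket', lambda: ['sales-bucket']))
```

The common `itemName` field is what makes the combined list possible, which is why the PR backfills the generic `name` column for buckets and folders.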
dlpzx added a commit that referenced this issue on Jun 25, 2024
…art 10 (other s3 references in shares_base) (#1357)

### Feature or Bugfix
- Refactoring

### Detail
As explained in the design for #1123 and #1283, we are trying to implement generic `datasets_base` and `shares_base` modules that can be used by any type of dataset and by any type of shareable object.

This PR:
- Removes the delete_resource_policy conditional for Tables in `backend/dataall/modules/shares_base/services/share_item_service.py`. Permissions to the Table in data.all are granted once the share has succeeded, so the conditional that checks for share_failed tables should not exist.
- Removes an unnecessary check in share_item_service: in add_share_item we check, for tables, whether it is a cross-region share. This check is unnecessary because we already check for cross-region shares when the share request object is created.
- Uses `get_share_item_details` in add_share_item. We want to check that the table, folder or bucket exists, so we need to query those tables.
- Moves s3_prefix notifications to the subscription task.
- Fixes an error in a query in `backend/dataall/modules/shares_base/db/share_state_machines_repositories.py`.

### Relates
- #1283
- #1123
- #955
noah-paige added a commit that referenced this issue on Jun 25, 2024
commit 6968e67c
Author: Noah Paige <[email protected]>
Date: Fri May 17 2024 16:12:45 GMT-0400 (Eastern Daylight Time)

Get to v2.5.0

commit 93ff7725
Author: Sofia Sazonova <[email protected]>
Date: Mon May 13 2024 08:00:38 GMT-0400 (Eastern Daylight Time)

Update version.json (#1264)

Release info update

commit e718d861
Author: Sofia Sazonova <[email protected]>
Date: Mon May 13 2024 07:29:27 GMT-0400 (Eastern Daylight Time)

fix permission query (#1263)

### Feature or Bugfix
- Bugfix

### Detail
- The filter is an array of permission NAMES, so a join is needed to query policies correctly.
- The filters `share_type` and `share_item_status` must be strings.
- IMPORTANT: the `finally` block used the `session` variable, but `session` was defined only in the `try` block, so the lock failed to be released.
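The lock-release problem fixed in #1263 belongs to a general class of bug that the sketch below reproduces. It is one way the problem can arise, not the project's code: `Lock`, `Session`, `open_session` and both functions are hypothetical stand-ins.

```python
class Session:
    """Stand-in for a database session."""

    def __init__(self):
        self.statements = []

    def execute(self, statement: str) -> None:
        self.statements.append(statement)


class Lock:
    """Stand-in for a lock whose release has to go through a session."""

    def __init__(self):
        self.held = False

    def acquire(self) -> None:
        self.held = True

    def release(self, session: Session) -> None:
        session.execute('release lock')
        self.held = False


def open_session(fail: bool = False) -> Session:
    if fail:
        raise RuntimeError('cannot open session')
    return Session()


def process_buggy(lock: Lock, fail: bool) -> None:
    lock.acquire()
    try:
        session = open_session(fail)  # if this raises, `session` is never bound
        session.execute('process share')
    finally:
        # Raises UnboundLocalError when open_session failed: the lock stays held.
        lock.release(session)


def process_fixed(lock: Lock, fail: bool) -> None:
    lock.acquire()
    try:
        session = open_session(fail)
        session.execute('process share')
    finally:
        # The cleanup obtains its own session, so it does not depend on the try block.
        lock.release(open_session())
```

In `process_buggy`, a failure inside `try` is replaced by an `UnboundLocalError` from the `finally` block and the lock is never released; `process_fixed` lets the original error propagate and still releases the lock.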
---------

Co-authored-by: Sofia Sazonova <[email protected]>

commit 479b8f3f
Author: mourya-33 <[email protected]>
Date: Wed May 08 2024 10:29:36 GMT-0400 (Eastern Daylight Time)

Add encryption and tag immutability to ECR repository (#1224)

### Feature or Bugfix
- Bugfix

### Detail
- Currently the ECR repositories that are created do not have encryption and tag immutability enabled, which is flagged by checkov scans. This fix enables both.

### Relates
- [#1200](https://github.com/data-dot-all/dataall/issues/1200)

### Security
Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/).
- Does this PR introduce or modify any input fields or queries - this includes N/A
  - Is the input sanitized? N/A
  - What precautions are you taking before deserializing the data you consume? N/A
  - Is injection prevented by parametrizing queries? N/A
  - Have you ensured no `eval` or similar functions are used? N/A
- Does this PR introduce any functionality or component that requires authorization? N/A
  - How have you ensured it respects the existing AuthN/AuthZ mechanisms? N/A
  - Are you logging failed auth attempts? N/A
- Are you using or adding any cryptographic features? N/A
  - Do you use a standard proven implementations? N/A
  - Are the used keys controlled by the customer? Where are they stored? No. This is with default encryption
- Are you introducing any new policies/roles/users? N/A
  - Have you used the least-privilege principle? How? N/A

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
commit 2f885773
Author: Sofia Sazonova <[email protected]>
Date: Wed May 08 2024 09:22:40 GMT-0400 (Eastern Daylight Time)

Multiple permission roots (#1259)

### Feature or Bugfix
- Bugfix

### Detail
- GET_DATASET_TABLE (FOLDER) permissions are granted to the group only if they are not granted already.
- These permissions are removed if the group is not admin|steward and there are no other shares of this item.

### Relates
- #1174
---------

Co-authored-by: Sofia Sazonova <[email protected]>

commit c4cc07ee
Author: Petros Kalos <[email protected]>
Date: Wed May 08 2024 08:54:02 GMT-0400 (Eastern Daylight Time)

explicitly specify dataset_client s3 endpoint_url (#1260)

* AWS requires that the endpoint_url be explicitly specified for some regions
* Remove misleading CORS error message; the upload step can fail for many reasons

### Feature or Bugfix
- Bugfix

### Detail
Resolves #778
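As an illustration of what specifying the endpoint explicitly can mean for #1260, the helper below builds a regional S3 endpoint URL. It is a sketch under assumptions: `regional_s3_endpoint` is a hypothetical name, the commit's actual change is not shown on this page, and only the standard and China partition suffixes are handled.

```python
def regional_s3_endpoint(region: str) -> str:
    """Build the regional S3 endpoint URL for a region such as 'eu-south-1'."""
    # China regions use a different domain suffix; other partitions are not covered here.
    suffix = 'amazonaws.com.cn' if region.startswith('cn-') else 'amazonaws.com'
    return f'https://s3.{region}.{suffix}'


# With boto3 installed, the URL could be passed when creating the client, for example:
#   boto3.client('s3', region_name=region, endpoint_url=regional_s3_endpoint(region))
```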
commit 40defe8e
Author: dlpzx <[email protected]>
Date: Tue May 07 2024 11:52:17 GMT-0400 (Eastern Daylight Time)

Generic dataset module and specific s3_datasets module - part 1 (Rename datasets as s3_datasets) (#1250)

### Feature or Bugfix
- Refactoring

### Detail
- Rename `datasets` module to `s3_datasets` module

This PR is the first step to extract a generic datasets_base module that implements the undifferentiated concepts of Dataset in data.all. s3_datasets will use this base module to provide the specific implementation for S3 datasets.

### Relates
- #1123
- #955

commit 74a303cb
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date: Tue May 07 2024 02:26:09 GMT-0400 (Eastern Daylight Time)

Bump werkzeug from 3.0.1 to 3.0.3 in /tests (#1253)

Bumps [werkzeug](https://github.com/pallets/werkzeug) from 3.0.1 to 3.0.3.
<details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/pallets/werkzeug/releases">werkzeug's releases</a>.</em></p> <blockquote> <h2>3.0.3</h2> <p>This is the Werkzeug 3.0.3 security release, which fixes security issues and bugs but does not otherwise change behavior and should not result in breaking changes.</p> <p>PyPI: <a href="https://pypi.org/project/Werkzeug/3.0.3/">https://pypi.org/project/Werkzeug/3.0.3/</a> Changes: <a href="https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-3">https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-3</a> Milestone: <a href="https://github.com/pallets/werkzeug/milestone/35?closed=1">https://github.com/pallets/werkzeug/milestone/35?closed=1</a></p> <ul> <li>Only allow <code>localhost</code>, <code>.localhost</code>, <code>127.0.0.1</code>, or the specified hostname when running the dev server, to make debugger requests. Additional hosts can be added by using the debugger middleware directly. The debugger UI makes requests using the full URL rather than only the path. GHSA-2g68-c3qc-8985</li> <li>Make reloader more robust when <code>""</code> is in <code>sys.path</code>. <a href="https://redirect.github.com/pallets/werkzeug/issues/2823">#2823</a></li> <li>Better TLS cert format with <code>adhoc</code> dev certs. <a href="https://redirect.github.com/pallets/werkzeug/issues/2891">#2891</a></li> <li>Inform Python < 3.12 how to handle <code>itms-services</code> URIs correctly, rather than using an overly-broad workaround in Werkzeug that caused some redirect URIs to be passed on without encoding. <a href="https://redirect.github.com/pallets/werkzeug/issues/2828">#2828</a></li> <li>Type annotation for <code>Rule.endpoint</code> and other uses of <code>endpoint</code> is <code>Any</code>. 
<a href="https://redirect.github.com/pallets/werkzeug/issues/2836">#2836</a></li> </ul> <h2>3.0.2</h2> <p>This is a fix release for the 3.0.x feature branch.</p> <ul> <li>Changes: <a href="https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-2">https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-2</a></li> </ul> </blockquote> </details> <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/pallets/werkzeug/blob/main/CHANGES.rst">werkzeug's changelog</a>.</em></p> <blockquote> <h2>Version 3.0.3</h2> <p>Released 2024-05-05</p> <ul> <li> <p>Only allow <code>localhost</code>, <code>.localhost</code>, <code>127.0.0.1</code>, or the specified hostname when running the dev server, to make debugger requests. Additional hosts can be added by using the debugger middleware directly. The debugger UI makes requests using the full URL rather than only the path. :ghsa:<code>2g68-c3qc-8985</code></p> </li> <li> <p>Make reloader more robust when <code>""</code> is in <code>sys.path</code>. :pr:<code>2823</code></p> </li> <li> <p>Better TLS cert format with <code>adhoc</code> dev certs. :pr:<code>2891</code></p> </li> <li> <p>Inform Python < 3.12 how to handle <code>itms-services</code> URIs correctly, rather than using an overly-broad workaround in Werkzeug that caused some redirect URIs to be passed on without encoding. :issue:<code>2828</code></p> </li> <li> <p>Type annotation for <code>Rule.endpoint</code> and other uses of <code>endpoint</code> is <code>Any</code>. :issue:<code>2836</code></p> </li> <li> <p>Make reloader more robust when <code>""</code> is in <code>sys.path</code>. :pr:<code>2823</code></p> </li> </ul> <h2>Version 3.0.2</h2> <p>Released 2024-04-01</p> <ul> <li>Ensure setting <code>merge_slashes</code> to <code>False</code> results in <code>NotFound</code> for repeated-slash requests against single slash routes. 
:issue:<code>2834</code></li> <li>Fix handling of <code>TypeError</code> in <code>TypeConversionDict.get()</code> to match <code>ValueError</code>. :issue:<code>2843</code></li> <li>Fix <code>response_wrapper</code> type check in test client. :issue:<code>2831</code></li> <li>Make the return type of <code>MultiPartParser.parse</code> more precise. :issue:<code>2840</code></li> <li>Raise an error if converter arguments cannot be parsed. :issue:<code>2822</code></li> </ul> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/pallets/werkzeug/commit/f9995e967979eb694d6b31536cc65314fd7e9c8c"><code>f9995e9</code></a> release version 3.0.3</li> <li><a href="https://github.com/pallets/werkzeug/commit/3386395b24c7371db11a5b8eaac0c91da5362692"><code>3386395</code></a> Merge pull request from GHSA-2g68-c3qc-8985</li> <li><a href="https://github.com/pallets/werkzeug/commit/890b6b62634fa61224222aee31081c61b054ff01"><code>890b6b6</code></a> only require trusted host for evalex</li> <li><a href="https://github.com/pallets/werkzeug/commit/71b69dfb7df3d912e66bab87fbb1f21f83504967"><code>71b69df</code></a> restrict debugger trusted hosts</li> <li><a href="https://github.com/pallets/werkzeug/commit/d2d3869525a4ffb2c41dfb2c0e39d94dab2d870c"><code>d2d3869</code></a> endpoint type is Any (<a href="https://redirect.github.com/pallets/werkzeug/issues/2895">#2895</a>)</li> <li><a href="https://github.com/pallets/werkzeug/commit/7080b55acd48b68afdda65ee6c7f99e9afafb0ba"><code>7080b55</code></a> endpoint type is Any</li> <li><a href="https://github.com/pallets/werkzeug/commit/7555eff296fbdf12f2e576b6bbb0b506df8417ed"><code>7555eff</code></a> remove iri_to_uri redirect workaround (<a href="https://redirect.github.com/pallets/werkzeug/issues/2894">#2894</a>)</li> <li><a href="https://github.com/pallets/werkzeug/commit/97fb2f722297ae4e12e36dab024e0acf8477b3c8"><code>97fb2f7</code></a> remove _invalid_iri_to_uri workaround</li> <li><a 
href="https://github.com/pallets/werkzeug/commit/249527ff981e7aa22cd714825c5637cc92df7761"><code>249527f</code></a> make cn field a valid single hostname, and use wildcard in SANs field. (<a href="https://redirect.github.com/pallets/werkzeug/issues/2892">#2892</a>)</li> <li><a href="https://github.com/pallets/werkzeug/commit/793be472c9d145eb9be7d4200672d1806289d84a"><code>793be47</code></a> update adhoc tls dev cert format</li> <li>Additional commits viewable in <a href="https://github.com/pallets/werkzeug/compare/3.0.1...3.0.3">compare view</a></li> </ul> </details> <br /> [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=werkzeug&package-manager=pip&previous-version=3.0.1&new-version=3.0.3)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. [//]: # (dependabot-automerge-start) [//]: # (dependabot-automerge-end) --- <details> <summary>Dependabot commands and options</summary> <br /> You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot merge` will merge this PR after your CI passes on it - `@dependabot squash and merge` will squash and merge this PR after your CI passes on it - `@dependabot cancel merge` will cancel a previously requested merge and block automerging - `@dependabot reopen` will reopen this PR if it is closed - `@dependabot close` will close this PR and stop Dependabot recreating it. 
You can achieve the same result by closing it manually - `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency - `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself) You can disable automated security fix PRs for this repo from the [Security Alerts page](https://github.com/data-dot-all/dataall/network/alerts). </details> Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> commit 2f33320c Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue May 07 2024 02:25:03 GMT-0400 (Eastern Daylight Time) Bump werkzeug from 3.0.1 to 3.0.3 in /backend/dataall/base/cdkproxy (#1252) Bumps [werkzeug](https://github.com/pallets/werkzeug) from 3.0.1 to 3.0.3. 
Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> commit 0b49633f Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue May 07 2024 02:24:34 GMT-0400 (Eastern Daylight Time) Bump werkzeug from 3.0.1 to 3.0.3 in /tests_new/integration_tests (#1254) Bumps [werkzeug](https://github.com/pallets/werkzeug) from 3.0.1 to 3.0.3.
Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

commit 08862420
Author: mourya-33 <[email protected]>
Date: Tue May 07 2024 02:15:15 GMT-0400 (Eastern Daylight Time)

Updated lambda_api.py to add encryption for lambda env vars for custo… (#1255)

Feature or Bugfix
Bugfix

Detail
Checkov scans flagged that the environment variables of the Lambda functions defined in CDK are not encrypted. This fix enables KMS encryption for the Lambda environment variables.

Relates

Security
Please answer the questions below briefly where applicable, or write N/A. Based on [OWASP 10](https://owasp.org/Top10/en/). Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? N/A Is the input sanitized? N/A What precautions are you taking before deserializing the data you consume? N/A Is injection prevented by parametrizing queries? N/A Have you ensured no eval or similar functions are used?
N/A Does this PR introduce any functionality or component that requires authorization? N/A How have you ensured it respects the existing AuthN/AuthZ mechanisms? N/A Are you logging failed auth attempts? N/A Are you using or adding any cryptographic features? N/A Do you use a standard proven implementations? N/A Are the used keys controlled by the customer? Where are they stored? The KMS keys are generated by CDK and are used to encrypt the environment variables for all Lambda functions in the lambda-api stack. Are you introducing any new policies/roles/users? - N/A Have you used the least-privilege principle? How? N/A By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

commit ed7cc3eb
Author: Noah Paige <[email protected]>
Date: Mon May 06 2024 09:32:30 GMT-0400 (Eastern Daylight Time)

Add order_by for paginated queries (#1249)

### Feature or Bugfix
- Bugfix

### Detail
This PR aims to solve the following:
- (1) For particular queries (identified as ones that perform `.outerjoin()` operations and have their results paginated with the `paginate()` function), the number of returned results is sometimes *less than* the limit set by the pageSize of the paginate function, even when the total count is greater than the pageSize
  - Ex 1: 11 envs total, `query_user_environments()` returning 9 envs on 1st page + 2 on 2nd page
  - Ex 2: 10 envs total, `query_user_environments()` returning 9 envs on 1st page + no 2nd page
  - We believe this happens because of the way SQLAlchemy is "uniquing" the records resulting from an outerjoin before returning that result to the frontend
  - Adding a `.distinct()` check on the query ensures each distinct record is returned (tested successfully)
- (2) Currently we often do not implement an `.order_by()` condition for the query used in `paginate()`, so there is no stable way of preserving the order of the items returned from a query (i.e. when navigating through pages of the response)
  - A generally good practice seems to be to include an `order_by()` on a column or set of columns
  - For each query used in `paginate()` this PR adds an `order_by()` condition (full list in comments below)

More context is available in the related issue linked below.

### Relates
- https://github.com/data-dot-all/dataall/issues/1241

commit 98e67fa8
Author: Sofia Sazonova <[email protected]>
Date: Fri May 03 2024 12:21:57 GMT-0400 (Eastern Daylight Time)

fix: DATASET_READ_TABLE read permissions (#1237)

### Feature or Bugfix
- Bugfix

### Detail
- backfill DATASET_READ_TABLE permissions
- delete these permissions when dataset tables are revoked or deleted

### Relates
- #1173

---------
Co-authored-by: Sofia Sazonova <[email protected]>

commit 18e2f509
Author: Noah Paige <[email protected]>
Date: Fri May 03 2024 10:14:52 GMT-0400 (Eastern Daylight Time)

Fix local test groups listing for listGroups query (#1239)

### Feature or Bugfix
- Bugfix

### Detail
- Locally, when trying to invite a team to an Env or Org, we call listGroups and the returned `LOCAL_TEST_GROUPS` is not of the expected data type

### Relates
N/A

commit a0be03c4
Author: dlpzx <[email protected]>
Date: Fri May 03 2024 10:12:34 GMT-0400 (Eastern Daylight Time)

Refactor: uncouple datasets and dataset_sharing modules - part 2-5 FINAL DELETE DATASETS_BASE (#1242)

### Feature or Bugfix
- Refactoring

### Detail
After all the previous PRs are merged, there should be no circular dependencies between `datasets` and `datasets_sharing`. We can now proceed to:
- move `datasets_base` models, repositories, permissions and enums to `datasets`
- adjust the `__init__` files to establish that `datasets_sharing` depends on `datasets`
- adjust the Module interfaces to ensure that all necessary dataset models... are imported in the interface for sharing

Next steps:
- share_notifications parameter to dataset_sharing in config.json

### Relates
#955

commit b68b40c1
Author: Sofia Sazonova <[email protected]>
Date: Fri May 03 2024 10:12:11 GMT-0400 (Eastern Daylight Time)

bugfix: EnvironmentGroup can remove other groups (#1234)

### Feature or Bugfix
- Bugfix

### Detail
- Now, if a group cannot update other groups, it also cannot remove them.

### Relates
- #1212
---------
Co-authored-by: Sofia Sazonova <[email protected]>

commit 264539b5
Author: Noah Paige <[email protected]>
Date: Fri May 03 2024 05:23:11 GMT-0400 (Eastern Daylight Time)

Fix Alembic Migration: has table checks (#1240)

### Feature or Bugfix
- Bugfix

### Detail
- Fix the `has_table()` check to ensure the tables are dropped if they exist as part of the Alembic migration upgrade
- Fix `DatasetLock nullable=True`

### Relates
- https://github.com/data-dot-all/dataall/issues/1165

### Security
Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/).
- Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? No
- Is the input sanitized? N/A
- What precautions are you taking before deserializing the data you consume? N/A
- Is injection prevented by parametrizing queries? N/A
- Have you ensured no `eval` or similar functions are used? N/A
- Does this PR introduce any functionality or component that requires authorization? No
- How have you ensured it respects the existing AuthN/AuthZ mechanisms? N/A
- Are you logging failed auth attempts? N/A
- Are you using or adding any cryptographic features? No
- Do you use a standard proven implementations? N/A
- Are the used keys controlled by the customer? Where are they stored? N/A
- Are you introducing any new policies/roles/users? No
- Have you used the least-privilege principle? How? N/A

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
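The check-before-drop guard mentioned in #1240 can be sketched roughly as follows. This is only an illustration of the pattern: it uses the standard library's `sqlite3` as a stand-in for the Alembic/SQLAlchemy connection a real migration would use, and the helper names (`has_table`, `drop_table_if_exists`) and the `dataset_lock` table are hypothetical, not taken from the PR.

```python
import sqlite3


def has_table(conn: sqlite3.Connection, table_name: str) -> bool:
    """Return True if a table with this name exists in the SQLite database."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table_name,),
    ).fetchone()
    return row is not None


def drop_table_if_exists(conn: sqlite3.Connection, table_name: str) -> bool:
    """Drop the table only when it exists; return True if a drop happened."""
    if not has_table(conn, table_name):
        return False
    # Identifiers cannot be bound as parameters, so quote the name explicitly.
    quoted = '"' + table_name.replace('"', '""') + '"'
    conn.execute(f"DROP TABLE {quoted}")
    return True


# Hypothetical table standing in for one handled by the migration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dataset_lock (datasetUri TEXT, isLocked INTEGER)")

first = drop_table_if_exists(conn, "dataset_lock")   # table existed, so it is dropped
second = drop_table_if_exists(conn, "dataset_lock")  # nothing left to drop
```

Guarding the drop this way keeps the upgrade step safe to re-run on databases where the table was never created or was already removed.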
commit 42a5f6bd
Author: dlpzx <[email protected]>
Date: Fri May 03 2024 02:24:09 GMT-0400 (Eastern Daylight Time)

Refactor: uncouple datasets and dataset_sharing modules - part 2-4 (#1214)

### Feature or Bugfix
- Refactoring

⚠️ MERGE AFTER https://github.com/data-dot-all/dataall/pull/1213

### Detail
This is needed as explained in full PR [AFTER 2.4] Refactor: uncouple datasets and dataset_sharing modules #1179
- [X] Use an interface to resolve dataset roles related to shared datasets and implement the logic in the dataset_sharing module
- [X] Extend and clean up stewards' share permissions through the interface

### Relates
- #1179
commit 6d3f2d45
Author: Sofia Sazonova <[email protected]>
Date: Thu May 02 2024 10:55:00 GMT-0400 (Eastern Daylight Time)

[After 2.4]Core Refactoring part5 (#1194)

### Feature or Bugfix
- Refactoring

### Detail
- focus on core/environments
- move logic from resolvers to services
- create s3_client in base/aws --> TO BE REFACTORED. Needs to be merged with dataset_sharing/aws/s3_client

### Relates
- #741
---------
Co-authored-by: Sofia Sazonova <[email protected]>

commit 2ea24cbb
Author: dlpzx <[email protected]>
Date: Thu May 02 2024 08:22:12 GMT-0400 (Eastern Daylight Time)

Refactor: uncouple datasets and dataset_sharing modules - part 2-3 (#1213)

### Feature or Bugfix
- Refactoring

⚠️ MERGE AFTER https://github.com/data-dot-all/dataall/pull/1187

### Detail
This is needed as explained in full PR [AFTER 2.4] Refactor: uncouple datasets and dataset_sharing modules #1179
- [X] Creates an interface to execute checks and clean-ups of data sharing objects when dataset objects are deleted (initially it was going to be a db interface, but I think it is better in the service)
- [X] Move listDatasetShares query to dataset_sharing module in https://github.com/data-dot-all/dataall/pull/1185

### Relates
- #1179
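One way to picture the interface described in #1213 is the sketch below, in which the datasets side knows only an abstract hook and the sharing side registers an implementation. All class and method names here are hypothetical, as is the rule that active shares block a deletion; the actual data.all interface may differ.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class DatasetDeletionHook(ABC):
    """Hypothetical interface that dependent modules implement."""

    @abstractmethod
    def check_before_delete(self, dataset_uri: str) -> bool:
        """Return False to block the deletion of the dataset."""

    @abstractmethod
    def cleanup_after_delete(self, dataset_uri: str) -> None:
        """Remove objects that referenced the deleted dataset."""


class DatasetService:
    """Stand-in for the datasets module; it depends only on the interface."""

    _hooks: List[DatasetDeletionHook] = []

    @classmethod
    def register(cls, hook: DatasetDeletionHook) -> None:
        cls._hooks.append(hook)

    @classmethod
    def delete_dataset(cls, dataset_uri: str) -> bool:
        # Every registered module must agree before anything is deleted.
        if not all(h.check_before_delete(dataset_uri) for h in cls._hooks):
            return False
        # (The dataset itself would be deleted here.)
        for hook in cls._hooks:
            hook.cleanup_after_delete(dataset_uri)
        return True


class SharingDeletionHook(DatasetDeletionHook):
    """Stand-in for the dataset_sharing module's implementation."""

    def __init__(self) -> None:
        # dataset_uri -> active share identifiers (made-up sample data)
        self.shares: Dict[str, List[str]] = {"ds-1": ["share-a"], "ds-2": []}

    def check_before_delete(self, dataset_uri: str) -> bool:
        # Assumed rule: a dataset with active shares cannot be deleted.
        return not self.shares.get(dataset_uri)

    def cleanup_after_delete(self, dataset_uri: str) -> None:
        self.shares.pop(dataset_uri, None)


hook = SharingDeletionHook()
DatasetService.register(hook)
blocked = DatasetService.delete_dataset("ds-1")  # has an active share
deleted = DatasetService.delete_dataset("ds-2")  # no shares, clean-up runs
```

With this shape the import direction is one-way: the sharing module imports the interface from datasets, and datasets never imports sharing code.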
commit 750a5ec8
Author: Anushka Singh <[email protected]>
Date: Wed May 01 2024 12:28:18 GMT-0400 (Eastern Daylight Time)

Feature:1221 - Make visibility of auto-approval toggle configurable based on confidentiality (#1223)

### Feature or Bugfix
- Feature

### Detail
- Users should be able to disable the visibility of the auto-approval toggle through code. For example, at our company, we require that shares always go through the approval process if their confidentiality classification is Secret. We don't even want to give users the option to enable autoApproval, to ensure they don't do so by mistake and end up oversharing.

Video demo: https://github.com/data-dot-all/dataall/issues/1221#issuecomment-2077412044

### Relates
- https://github.com/data-dot-all/dataall/issues/1221
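The rule described in #1223 could look roughly like the sketch below. The mapping, the confidentiality level names other than Secret, and both function names are assumptions made for illustration; the PR itself may implement this differently (for example in the frontend).

```python
from typing import Dict

# Hypothetical configuration: confidentiality level -> whether the toggle is shown.
AUTO_APPROVAL_TOGGLE_VISIBILITY: Dict[str, bool] = {
    "Unclassified": True,
    "Official": True,
    "Secret": False,
}


def is_auto_approval_toggle_visible(confidentiality: str) -> bool:
    """Return True if the auto-approval toggle should be offered to the user."""
    # Unknown levels hide the toggle so shares fall back to manual approval.
    return AUTO_APPROVAL_TOGGLE_VISIBILITY.get(confidentiality, False)


def effective_auto_approval(confidentiality: str, requested: bool) -> bool:
    """Ignore an auto-approval request when the toggle is hidden for this level."""
    return requested and is_auto_approval_toggle_visible(confidentiality)


secret_result = effective_auto_approval("Secret", True)
official_result = effective_auto_approval("Official", True)
```

Checking the rule again when the value is saved, and not only when the form is rendered, means a hidden toggle cannot be bypassed by a hand-crafted request.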
commit 82044689
Author: dlpzx <[email protected]>
Date: Wed May 01 2024 12:26:42 GMT-0400 (Eastern Daylight Time)

Refactor: uncouple datasets and dataset_sharing modules - part 2-2 (#1187)

### Feature or Bugfix
- Refactoring

⚠️ MERGE AFTER https://github.com/data-dot-all/dataall/pull/1185

### Detail
This is needed as explained in full PR [AFTER 2.4] Refactor: uncouple datasets and dataset_sharing modules #1179
- Split the getDatasetAssumeRole API into 2 APIs: one for the dataset owner role (in the datasets module) and another for share requester roles (in the datasets_sharing module)

### Relates
- #1179
commit 5173419f Author: Noah Paige <[email protected]> Date: Wed May 01 2024 12:24:42 GMT-0400 (Eastern Daylight Time) Fix so listValidEnvironments called only once (#1238) ### Feature or Bugfix <!-- please choose --> - Bugfix ### Detail - When request access to a share on data.all the query to `listValidEnvironments` used to be called twice which (depending on how long for query results to return) could cause the environment initially selected to disappear ### Relates - Continuation of https://github.com/data-dot-all/dataall/issues/916 ### Security Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/). - Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? - Is the input sanitized? - What precautions are you taking before deserializing the data you consume? - Is injection prevented by parametrizing queries? - Have you ensured no `eval` or similar functions are used? - Does this PR introduce any functionality or component that requires authorization? - How have you ensured it respects the existing AuthN/AuthZ mechanisms? - Are you logging failed auth attempts? - Are you using or adding any cryptographic features? - Do you use a standard proven implementations? - Are the used keys controlled by the customer? Where are they stored? - Are you introducing any new policies/roles/users? - Have you used the least-privilege principle? How? By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. commit 7656ea86 Author: dlpzx <[email protected]> Date: Tue Apr 30 2024 07:13:01 GMT-0400 (Eastern Daylight Time) Add integration tests on a real API client and integrate the tests in CICD (#1219) ### Feature or Bugfix - Feature ### Detail Add integration tests that use a real Client to execute different validation actions. 
- Define the Client and the way API calls are posted to API Gateway in the conftest - Define the Cognito users and the different fixtures needed for all tests - Write tests for the Organization core module as example - Add feature flag in `cdk.json` called `with_approval_tests` that can be defined at the deployment environment level. If set to True, a CodeBuild stage running the tests is created. ### Relates - https://github.com/data-dot-all/dataall/issues/1220 By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
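As a rough illustration of the `with_approval_tests` flag described in #1219, the sketch below reads a `cdk.json`-like document and lists the deployment environments that would get the extra test stage. Only the flag name comes from the commit message; the `context`, `DeploymentEnvironments` and `envname` keys and the helper name are assumptions made for this example, not a statement of how the pipeline code reads the file.

```python
import json
from typing import List

# Hypothetical cdk.json fragment. Only `with_approval_tests` is taken from the
# commit message; the surrounding keys are assumptions for illustration.
CDK_JSON = """
{
  "context": {
    "DeploymentEnvironments": [
      {"envname": "dev", "with_approval_tests": true},
      {"envname": "prod"}
    ]
  }
}
"""


def environments_with_approval_tests(cdk_json_text: str) -> List[str]:
    """Return the names of deployment environments that enable the flag."""
    context = json.loads(cdk_json_text).get("context", {})
    return [
        env["envname"]
        for env in context.get("DeploymentEnvironments", [])
        # A missing flag is treated as disabled.
        if env.get("with_approval_tests", False)
    ]


if __name__ == "__main__":
    print(environments_with_approval_tests(CDK_JSON))
```

Treating a missing flag as disabled keeps existing deployment environments unchanged unless they opt in, which matches the "If set to True" wording of the commit message.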
commit b963fe81 Author: Sofia Sazonova <[email protected]> Date: Mon Apr 29 2024 09:26:36 GMT-0400 (Eastern Daylight Time) Notification link routes to a share request page (#1227) ### Feature or Bugfix - Feature ### Detail - in notification object field `target_uri = 'shareUri|DataSetUri'` - this value is parsed and used to redirect user to a relevant Share Request page ### Relates - #1115 By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. Co-authored-by: Sofia Sazonova <[email protected]>
commit 6386fe14 Author: dlpzx <[email protected]> Date: Mon Apr 29 2024 07:32:00 GMT-0400 (Eastern Daylight Time) Refactor: uncouple datasets and dataset_sharing modules - part 2 (#1185) ### Feature or Bugfix - Refactoring ### Detail Remove and move logic from dataset to datasets_sharing module.
This is needed as explained in full PR [AFTER 2.4] Refactor: uncouple datasets and dataset_sharing modules #1179 - [X] Moves the verify dataset shares mutation to the datasets_sharing module - [X] Move dataset_subscription task to dataset_sharing - [X] Move listDatasetShares query to dataset_sharing module - [X] Remove unused `shares` field from the Dataset graphql type as it was not used in the frontend: listDatasets, listOwnedDatasets, listDatasetsOwnedByEnvGroup, listDatasetsCreatedInEnvironment and getDataset - [x] Move getSharedDatasetTables to data_sharing module and fix reference to DatasetService I am aware that some of the queries and mutations that this PR moves look a bit odd in the dataset_sharing module, but this will be solved once data sharing is divided into dataset_sharing_base and s3_dataset_sharing. ### Relates #1179 ### Security Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/). - Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? - Is the input sanitized? - What precautions are you taking before deserializing the data you consume? - Is injection prevented by parametrizing queries? - Have you ensured no `eval` or similar functions are used? - Does this PR introduce any functionality or component that requires aut…
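Commit b963fe81 (#1227) above describes a notification `target_uri` of the form `'shareUri|DataSetUri'` that is parsed to redirect the user to the Share Request page. A minimal Python sketch of that parsing step follows; the function names and the `/console/shares/...` route are hypothetical, and the real parsing lives in the frontend and may differ.

```python
from typing import NamedTuple


class ShareTarget(NamedTuple):
    share_uri: str
    dataset_uri: str


def parse_target_uri(target_uri: str) -> ShareTarget:
    """Split a 'shareUri|DataSetUri' value into its two parts."""
    share_uri, separator, dataset_uri = target_uri.partition("|")
    if not separator or not share_uri or not dataset_uri:
        raise ValueError(f"Expected 'shareUri|DataSetUri', got {target_uri!r}")
    return ShareTarget(share_uri, dataset_uri)


def share_request_path(target_uri: str) -> str:
    """Build the page path for a notification (hypothetical route)."""
    return f"/console/shares/{parse_target_uri(target_uri).share_uri}"
```

Rejecting values without both parts makes a malformed notification fail loudly instead of redirecting to a broken page.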
noah-paige added a commit that referenced this issue Jun 25, 2024
commit a06c8cba Author: Noah Paige <[email protected]> Date: Fri May 17 2024 16:37:05 GMT-0400 (Eastern Daylight Time) Merge share logs PR
commit aee98cf7 Author: Noah Paige <[email protected]> Date: Fri May 17 2024 16:34:24 GMT-0400 (Eastern Daylight Time) Merge share logs PR
commit 5ca55303 Author: Sofia Sazonova <[email protected]> Date: Wed May 15 2024 11:57:41 GMT-0400 (Eastern Daylight Time) remove unused imports
commit 8f8bf3dd Author: Sofia Sazonova <[email protected]> Date: Wed May 15 2024 11:56:39 GMT-0400 (Eastern Daylight Time) restrict access to the share logs
commit 9137da9b Author: Sofia Sazonova <[email protected]> Date: Wed May 15 2024 11:28:32 GMT-0400 (Eastern Daylight Time) share Logs button is available only for dataset Admins and stewards
commit fcb16bd9 Author: Sofia Sazonova <[email protected]> Date: Wed May 15 2024 10:46:25 GMT-0400 (Eastern Daylight Time) getShareLogs query
commit 0503a3bb Author: Sofia Sazonova <[email protected]> Date: Wed May 15 2024 10:21:25 GMT-0400 (Eastern Daylight Time) Logs modal in Share View
commit bab2f3e6 Author: Sofia Sazonova <[email protected]> Date: Mon May 13 2024 09:09:18 GMT-0400 (Eastern Daylight Time) Add confirmation pop-ups for deletion of team roles and groups (#1231) ### Feature or Bugfix - Feature ### Detail Pop-ups added for: - deletion of a team from an environment - deletion of a consumption role - deletion of a group from an Organization ### Relates - #942 By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. Co-authored-by: Sofia Sazonova <[email protected]>
commit 93ff7725 Author: Sofia Sazonova <[email protected]> Date: Mon May 13 2024 08:00:38 GMT-0400 (Eastern Daylight Time) Update version.json (#1264) Release info update
commit e718d861 Author: Sofia Sazonova <[email protected]> Date: Mon May 13 2024 07:29:27 GMT-0400 (Eastern Daylight Time) fix permission query (#1263) ### Feature or Bugfix - Bugfix ### Detail - The filter is an array of permission NAMES, so a join is needed to query policies correctly - The filters 'share_type' and 'share_item_status' must be strings - IMPORTANT: the "finally" block used the `session` variable, but `session` was defined only in the "try" block, so the lock failed to be released. By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. --------- Co-authored-by: Sofia Sazonova <[email protected]>
commit 479b8f3f Author: mourya-33 <[email protected]> Date: Wed May 08 2024 10:29:36 GMT-0400 (Eastern Daylight Time) Add encryption and tag immutability to ECR repository (#1224) ### Feature or Bugfix - Bugfix ### Detail - Currently, the ECR repository that is created does not have encryption and tag immutability enabled, which Checkov scans flag. This fix enables both. ### Relates - https://github.com/data-dot-all/dataall/issues/1200 ### Security Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/). - Does this PR introduce or modify any input fields or queries - this includes N/A - Is the input sanitized? N/A - What precautions are you taking before deserializing the data you consume? N/A - Is injection prevented by parametrizing queries? N/A - Have you ensured no `eval` or similar functions are used? N/A - Does this PR introduce any functionality or component that requires authorization? N/A - How have you ensured it respects the existing AuthN/AuthZ mechanisms? N/A - Are you logging failed auth attempts? N/A - Are you using or adding any cryptographic features? N/A - Do you use a standard proven implementations? N/A - Are the used keys controlled by the customer? Where are they stored? No. This is with default encryption - Are you introducing any new policies/roles/users?
N/A - Have you used the least-privilege principle? How? N/A By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
commit 2f885773 Author: Sofia Sazonova <[email protected]> Date: Wed May 08 2024 09:22:40 GMT-0400 (Eastern Daylight Time) Multiple permission roots (#1259) ### Feature or Bugfix - Bugfix ### Detail - GET_DATASET_TABLE (FOLDER) permissions are granted to the group only if they are not granted already - these permissions are removed if group is not admin|steward and there are no other shares of this item. ### Relates - #1174 By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
--------- Co-authored-by: Sofia Sazonova <[email protected]>
commit c4cc07ee Author: Petros Kalos <[email protected]> Date: Wed May 08 2024 08:54:02 GMT-0400 (Eastern Daylight Time) explicitly specify dataset_client s3 endpoint_url (#1260) * AWS requires that the endpoint_url should be explicitly specified for some regions * Remove misleading CORS error message, the upload step can fail for many reasons ### Feature or Bugfix - Bugfix ### Detail Resolves #778 By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
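The `endpoint_url` point in #1260 above can be sketched as follows. This is an assumption-laden illustration rather than the change made in that PR: the helper names are hypothetical, and only the `region_name` and `endpoint_url` arguments of `boto3.client` are relied upon.

```python
def regional_s3_endpoint(region: str) -> str:
    """Build the regional S3 endpoint URL for a given AWS region."""
    # China regions use a different DNS suffix than the other partitions.
    suffix = "amazonaws.com.cn" if region.startswith("cn-") else "amazonaws.com"
    return f"https://s3.{region}.{suffix}"


def build_s3_client(region: str):
    """Create an S3 client pinned to the regional endpoint (hypothetical helper)."""
    # boto3 is a third-party package; it is imported lazily so that the URL
    # helper above stays usable without it.
    import boto3

    return boto3.client(
        "s3",
        region_name=region,
        endpoint_url=regional_s3_endpoint(region),
    )
```

Pinning the client to the regional endpoint means presigned URLs are generated against that host, rather than relying on a default endpoint that may not resolve correctly for every region.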
commit 40defe8e Author: dlpzx <[email protected]> Date: Tue May 07 2024 11:52:17 GMT-0400 (Eastern Daylight Time) Generic dataset module and specific s3_datasets module - part 1 (Rename datasets as s3_datasets) (#1250) ### Feature or Bugfix - Refactoring ### Detail - Rename `datasets` module to `s3_datasets` module This PR is the first step to extract a generic datasets_base module that implements the undifferentiated concepts of Dataset in data.all. s3_datasets will use this base module to provide the specific implementation for S3 datasets. ### Relates - #1123 - #955 By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
commit 74a303cb Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue May 07 2024 02:26:09 GMT-0400 (Eastern Daylight Time) Bump werkzeug from 3.0.1 to 3.0.3 in /tests (#1253) Bumps [werkzeug](https://github.com/pallets/werkzeug) from 3.0.1 to 3.0.3.
<details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/pallets/werkzeug/releases">werkzeug's releases</a>.</em></p> <blockquote> <h2>3.0.3</h2> <p>This is the Werkzeug 3.0.3 security release, which fixes security issues and bugs but does not otherwise change behavior and should not result in breaking changes.</p> <p>PyPI: <a href="https://pypi.org/project/Werkzeug/3.0.3/">https://pypi.org/project/Werkzeug/3.0.3/</a> Changes: <a href="https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-3">https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-3</a> Milestone: <a href="https://github.com/pallets/werkzeug/milestone/35?closed=1">https://github.com/pallets/werkzeug/milestone/35?closed=1</a></p> <ul> <li>Only allow <code>localhost</code>, <code>.localhost</code>, <code>127.0.0.1</code>, or the specified hostname when running the dev server, to make debugger requests. Additional hosts can be added by using the debugger middleware directly. The debugger UI makes requests using the full URL rather than only the path. GHSA-2g68-c3qc-8985</li> <li>Make reloader more robust when <code>""</code> is in <code>sys.path</code>. <a href="https://redirect.github.com/pallets/werkzeug/issues/2823">#2823</a></li> <li>Better TLS cert format with <code>adhoc</code> dev certs. <a href="https://redirect.github.com/pallets/werkzeug/issues/2891">#2891</a></li> <li>Inform Python < 3.12 how to handle <code>itms-services</code> URIs correctly, rather than using an overly-broad workaround in Werkzeug that caused some redirect URIs to be passed on without encoding. <a href="https://redirect.github.com/pallets/werkzeug/issues/2828">#2828</a></li> <li>Type annotation for <code>Rule.endpoint</code> and other uses of <code>endpoint</code> is <code>Any</code>. 
<a href="https://redirect.github.com/pallets/werkzeug/issues/2836">#2836</a></li> </ul> <h2>3.0.2</h2> <p>This is a fix release for the 3.0.x feature branch.</p> <ul> <li>Changes: <a href="https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-2">https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-2</a></li> </ul> </blockquote> </details> <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/pallets/werkzeug/blob/main/CHANGES.rst">werkzeug's changelog</a>.</em></p> <blockquote> <h2>Version 3.0.3</h2> <p>Released 2024-05-05</p> <ul> <li> <p>Only allow <code>localhost</code>, <code>.localhost</code>, <code>127.0.0.1</code>, or the specified hostname when running the dev server, to make debugger requests. Additional hosts can be added by using the debugger middleware directly. The debugger UI makes requests using the full URL rather than only the path. :ghsa:<code>2g68-c3qc-8985</code></p> </li> <li> <p>Make reloader more robust when <code>""</code> is in <code>sys.path</code>. :pr:<code>2823</code></p> </li> <li> <p>Better TLS cert format with <code>adhoc</code> dev certs. :pr:<code>2891</code></p> </li> <li> <p>Inform Python < 3.12 how to handle <code>itms-services</code> URIs correctly, rather than using an overly-broad workaround in Werkzeug that caused some redirect URIs to be passed on without encoding. :issue:<code>2828</code></p> </li> <li> <p>Type annotation for <code>Rule.endpoint</code> and other uses of <code>endpoint</code> is <code>Any</code>. :issue:<code>2836</code></p> </li> <li> <p>Make reloader more robust when <code>""</code> is in <code>sys.path</code>. :pr:<code>2823</code></p> </li> </ul> <h2>Version 3.0.2</h2> <p>Released 2024-04-01</p> <ul> <li>Ensure setting <code>merge_slashes</code> to <code>False</code> results in <code>NotFound</code> for repeated-slash requests against single slash routes. 
:issue:<code>2834</code></li> <li>Fix handling of <code>TypeError</code> in <code>TypeConversionDict.get()</code> to match <code>ValueError</code>. :issue:<code>2843</code></li> <li>Fix <code>response_wrapper</code> type check in test client. :issue:<code>2831</code></li> <li>Make the return type of <code>MultiPartParser.parse</code> more precise. :issue:<code>2840</code></li> <li>Raise an error if converter arguments cannot be parsed. :issue:<code>2822</code></li> </ul> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/pallets/werkzeug/commit/f9995e967979eb694d6b31536cc65314fd7e9c8c"><code>f9995e9</code></a> release version 3.0.3</li> <li><a href="https://github.com/pallets/werkzeug/commit/3386395b24c7371db11a5b8eaac0c91da5362692"><code>3386395</code></a> Merge pull request from GHSA-2g68-c3qc-8985</li> <li><a href="https://github.com/pallets/werkzeug/commit/890b6b62634fa61224222aee31081c61b054ff01"><code>890b6b6</code></a> only require trusted host for evalex</li> <li><a href="https://github.com/pallets/werkzeug/commit/71b69dfb7df3d912e66bab87fbb1f21f83504967"><code>71b69df</code></a> restrict debugger trusted hosts</li> <li><a href="https://github.com/pallets/werkzeug/commit/d2d3869525a4ffb2c41dfb2c0e39d94dab2d870c"><code>d2d3869</code></a> endpoint type is Any (<a href="https://redirect.github.com/pallets/werkzeug/issues/2895">#2895</a>)</li> <li><a href="https://github.com/pallets/werkzeug/commit/7080b55acd48b68afdda65ee6c7f99e9afafb0ba"><code>7080b55</code></a> endpoint type is Any</li> <li><a href="https://github.com/pallets/werkzeug/commit/7555eff296fbdf12f2e576b6bbb0b506df8417ed"><code>7555eff</code></a> remove iri_to_uri redirect workaround (<a href="https://redirect.github.com/pallets/werkzeug/issues/2894">#2894</a>)</li> <li><a href="https://github.com/pallets/werkzeug/commit/97fb2f722297ae4e12e36dab024e0acf8477b3c8"><code>97fb2f7</code></a> remove _invalid_iri_to_uri workaround</li> <li><a 
href="https://github.com/pallets/werkzeug/commit/249527ff981e7aa22cd714825c5637cc92df7761"><code>249527f</code></a> make cn field a valid single hostname, and use wildcard in SANs field. (<a href="https://redirect.github.com/pallets/werkzeug/issues/2892">#2892</a>)</li> <li><a href="https://github.com/pallets/werkzeug/commit/793be472c9d145eb9be7d4200672d1806289d84a"><code>793be47</code></a> update adhoc tls dev cert format</li> <li>Additional commits viewable in <a href="https://github.com/pallets/werkzeug/compare/3.0.1...3.0.3">compare view</a></li> </ul> </details> <br /> [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=werkzeug&package-manager=pip&previous-version=3.0.1&new-version=3.0.3)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. [//]: # (dependabot-automerge-start) [//]: # (dependabot-automerge-end) --- <details> <summary>Dependabot commands and options</summary> <br /> You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot merge` will merge this PR after your CI passes on it - `@dependabot squash and merge` will squash and merge this PR after your CI passes on it - `@dependabot cancel merge` will cancel a previously requested merge and block automerging - `@dependabot reopen` will reopen this PR if it is closed - `@dependabot close` will close this PR and stop Dependabot recreating it. 
You can achieve the same result by closing it manually - `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency - `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself) You can disable automated security fix PRs for this repo from the [Security Alerts page](https://github.com/data-dot-all/dataall/network/alerts). </details> Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
commit 2f33320c Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue May 07 2024 02:25:03 GMT-0400 (Eastern Daylight Time) Bump werkzeug from 3.0.1 to 3.0.3 in /backend/dataall/base/cdkproxy (#1252) Bumps [werkzeug](https://github.com/pallets/werkzeug) from 3.0.1 to 3.0.3. Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
commit 0b49633f Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue May 07 2024 02:24:34 GMT-0400 (Eastern Daylight Time) Bump werkzeug from 3.0.1 to 3.0.3 in /tests_new/integration_tests (#1254) Bumps [werkzeug](https://github.com/pallets/werkzeug) from 3.0.1 to 3.0.3.
<details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/pallets/werkzeug/releases">werkzeug's releases</a>.</em></p> <blockquote> <h2>3.0.3</h2> <p>This is the Werkzeug 3.0.3 security release, which fixes security issues and bugs but does not otherwise change behavior and should not result in breaking changes.</p> <p>PyPI: <a href="https://pypi.org/project/Werkzeug/3.0.3/">https://pypi.org/project/Werkzeug/3.0.3/</a> Changes: <a href="https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-3">https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-3</a> Milestone: <a href="https://github.com/pallets/werkzeug/milestone/35?closed=1">https://github.com/pallets/werkzeug/milestone/35?closed=1</a></p> <ul> <li>Only allow <code>localhost</code>, <code>.localhost</code>, <code>127.0.0.1</code>, or the specified hostname when running the dev server, to make debugger requests. Additional hosts can be added by using the debugger middleware directly. The debugger UI makes requests using the full URL rather than only the path. GHSA-2g68-c3qc-8985</li> <li>Make reloader more robust when <code>""</code> is in <code>sys.path</code>. <a href="https://redirect.github.com/pallets/werkzeug/issues/2823">#2823</a></li> <li>Better TLS cert format with <code>adhoc</code> dev certs. <a href="https://redirect.github.com/pallets/werkzeug/issues/2891">#2891</a></li> <li>Inform Python < 3.12 how to handle <code>itms-services</code> URIs correctly, rather than using an overly-broad workaround in Werkzeug that caused some redirect URIs to be passed on without encoding. <a href="https://redirect.github.com/pallets/werkzeug/issues/2828">#2828</a></li> <li>Type annotation for <code>Rule.endpoint</code> and other uses of <code>endpoint</code> is <code>Any</code>. 
<a href="https://redirect.github.com/pallets/werkzeug/issues/2836">#2836</a></li> </ul> <h2>3.0.2</h2> <p>This is a fix release for the 3.0.x feature branch.</p> <ul> <li>Changes: <a href="https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-2">https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-2</a></li> </ul> </blockquote> </details> <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/pallets/werkzeug/blob/main/CHANGES.rst">werkzeug's changelog</a>.</em></p> <blockquote> <h2>Version 3.0.3</h2> <p>Released 2024-05-05</p> <ul> <li> <p>Only allow <code>localhost</code>, <code>.localhost</code>, <code>127.0.0.1</code>, or the specified hostname when running the dev server, to make debugger requests. Additional hosts can be added by using the debugger middleware directly. The debugger UI makes requests using the full URL rather than only the path. :ghsa:<code>2g68-c3qc-8985</code></p> </li> <li> <p>Make reloader more robust when <code>""</code> is in <code>sys.path</code>. :pr:<code>2823</code></p> </li> <li> <p>Better TLS cert format with <code>adhoc</code> dev certs. :pr:<code>2891</code></p> </li> <li> <p>Inform Python < 3.12 how to handle <code>itms-services</code> URIs correctly, rather than using an overly-broad workaround in Werkzeug that caused some redirect URIs to be passed on without encoding. :issue:<code>2828</code></p> </li> <li> <p>Type annotation for <code>Rule.endpoint</code> and other uses of <code>endpoint</code> is <code>Any</code>. :issue:<code>2836</code></p> </li> <li> <p>Make reloader more robust when <code>""</code> is in <code>sys.path</code>. :pr:<code>2823</code></p> </li> </ul> <h2>Version 3.0.2</h2> <p>Released 2024-04-01</p> <ul> <li>Ensure setting <code>merge_slashes</code> to <code>False</code> results in <code>NotFound</code> for repeated-slash requests against single slash routes. 
:issue:<code>2834</code></li> <li>Fix handling of <code>TypeError</code> in <code>TypeConversionDict.get()</code> to match <code>ValueError</code>. :issue:<code>2843</code></li> <li>Fix <code>response_wrapper</code> type check in test client. :issue:<code>2831</code></li> <li>Make the return type of <code>MultiPartParser.parse</code> more precise. :issue:<code>2840</code></li> <li>Raise an error if converter arguments cannot be parsed. :issue:<code>2822</code></li> </ul> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/pallets/werkzeug/commit/f9995e967979eb694d6b31536cc65314fd7e9c8c"><code>f9995e9</code></a> release version 3.0.3</li> <li><a href="https://github.com/pallets/werkzeug/commit/3386395b24c7371db11a5b8eaac0c91da5362692"><code>3386395</code></a> Merge pull request from GHSA-2g68-c3qc-8985</li> <li><a href="https://github.com/pallets/werkzeug/commit/890b6b62634fa61224222aee31081c61b054ff01"><code>890b6b6</code></a> only require trusted host for evalex</li> <li><a href="https://github.com/pallets/werkzeug/commit/71b69dfb7df3d912e66bab87fbb1f21f83504967"><code>71b69df</code></a> restrict debugger trusted hosts</li> <li><a href="https://github.com/pallets/werkzeug/commit/d2d3869525a4ffb2c41dfb2c0e39d94dab2d870c"><code>d2d3869</code></a> endpoint type is Any (<a href="https://redirect.github.com/pallets/werkzeug/issues/2895">#2895</a>)</li> <li><a href="https://github.com/pallets/werkzeug/commit/7080b55acd48b68afdda65ee6c7f99e9afafb0ba"><code>7080b55</code></a> endpoint type is Any</li> <li><a href="https://github.com/pallets/werkzeug/commit/7555eff296fbdf12f2e576b6bbb0b506df8417ed"><code>7555eff</code></a> remove iri_to_uri redirect workaround (<a href="https://redirect.github.com/pallets/werkzeug/issues/2894">#2894</a>)</li> <li><a href="https://github.com/pallets/werkzeug/commit/97fb2f722297ae4e12e36dab024e0acf8477b3c8"><code>97fb2f7</code></a> remove _invalid_iri_to_uri workaround</li> <li><a 
href="https://github.com/pallets/werkzeug/commit/249527ff981e7aa22cd714825c5637cc92df7761"><code>249527f</code></a> make cn field a valid single hostname, and use wildcard in SANs field. (<a href="https://redirect.github.com/pallets/werkzeug/issues/2892">#2892</a>)</li> <li><a href="https://github.com/pallets/werkzeug/commit/793be472c9d145eb9be7d4200672d1806289d84a"><code>793be47</code></a> update adhoc tls dev cert format</li> <li>Additional commits viewable in <a href="https://github.com/pallets/werkzeug/compare/3.0.1...3.0.3">compare view</a></li> </ul> </details> <br /> [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=werkzeug&package-manager=pip&previous-version=3.0.1&new-version=3.0.3)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. [//]: # (dependabot-automerge-start) [//]: # (dependabot-automerge-end) --- <details> <summary>Dependabot commands and options</summary> <br /> You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot merge` will merge this PR after your CI passes on it - `@dependabot squash and merge` will squash and merge this PR after your CI passes on it - `@dependabot cancel merge` will cancel a previously requested merge and block automerging - `@dependabot reopen` will reopen this PR if it is closed - `@dependabot close` will close this PR and stop Dependabot recreating it. 
You can achieve the same result by closing it manually - `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency - `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself) You can disable automated security fix PRs for this repo from the [Security Alerts page](https://github.com/data-dot-all/dataall/network/alerts). </details> Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> commit 08862420 Author: mourya-33 <[email protected]> Date: Tue May 07 2024 02:15:15 GMT-0400 (Eastern Daylight Time) Updated lambda_api.py to add encryption for lambda env vars for custo… (#1255) Feature or Bugfix Bugfix Detail The environment variables for the lambda functions are not encrypted in cdk which are identified by checkov scans. This fix is to enable kms encryption for the lambda environment variables. Relates Security Please answer the questions below briefly where applicable, or write N/A. Based on [OWASP 10](https://owasp.org/Top10/en/). Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? N/A Is the input sanitized? N/A What precautions are you taking before deserializing the data you consume? N/A Is injection prevented by parametrizing queries? N/A Have you ensured no eval or similar functions are used? 
N/A
Does this PR introduce any functionality or component that requires authorization? N/A
How have you ensured it respects the existing AuthN/AuthZ mechanisms? N/A
Are you logging failed auth attempts? N/A
Are you using or adding any cryptographic features? N/A
Do you use a standard proven implementations? N/A
Are the used keys controlled by the customer? Where are they stored? The KMS keys are generated by CDK and are used to encrypt the environment variables for all Lambda functions in the lambda-api stack.
Are you introducing any new policies/roles/users? N/A
Have you used the least-privilege principle? How? N/A

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

commit ed7cc3eb
Author: Noah Paige <[email protected]>
Date: Mon May 06 2024 09:32:30 GMT-0400 (Eastern Daylight Time)

Add order_by for paginated queries (#1249)

### Feature or Bugfix
- Bugfix

### Detail
This PR aims to solve the following:
- (1) For particular queries (identified as ones that perform `.outerjoin()` operations and have their results paginated with the `paginate()` function), the returned page sometimes contains *fewer* records than the limit set by the pageSize of the paginate function, even when the total count is greater than the pageSize.
  - Ex 1: 11 envs total, `query_user_environments()` returning 9 envs on 1st page + 2 on 2nd page
  - Ex 2: 10 envs total, `query_user_environments()` returning 9 envs on 1st page + no 2nd page
  - This is believed to happen because of the way SQLAlchemy is "uniquing" the records that result from an outerjoin before returning that result to the frontend.
  - Adding a `.distinct()` check on the query ensures each distinct record is returned (tested successfully).
- (2) Currently we often do not implement an `.order_by()` condition for the query used in `paginate()`, so there is no stable way of preserving the order of the items returned from a query (i.e. when navigating through pages of the response).
  - A generally good practice seems to be to include an `order_by()` on a column or set of columns.
  - For each query used in `paginate()` this PR adds an `order_by()` condition (full list in comments below).

More context is available in the related issue linked below.

### Relates
- https://github.com/data-dot-all/dataall/issues/1241

### Security
Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/).
- Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)?
- Is the input sanitized?
- What precautions are you taking before deserializing the data you consume?
- Is injection prevented by parametrizing queries?
- Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
- Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
- Do you use a standard proven implementations?
- Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
- Have you used the least-privilege principle? How?

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

commit 98e67fa8
Author: Sofia Sazonova <[email protected]>
Date: Fri May 03 2024 12:21:57 GMT-0400 (Eastern Daylight Time)

fix: DATASET_READ_TABLE read permissions (#1237)

### Feature or Bugfix
- Bugfix

### Detail
- Backfill DATASET_READ_TABLE permissions
- Delete these permissions when dataset tables are revoked or deleted

### Relates
- #1173

### Security
Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/).
- Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? - Is the input sanitized? - What precautions are you taking before deserializing the data you consume? - Is injection prevented by parametrizing queries? - Have you ensured no `eval` or similar functions are used? - Does this PR introduce any functionality or component that requires authorization? - How have you ensured it respects the existing AuthN/AuthZ mechanisms? - Are you logging failed auth attempts? - Are you using or adding any cryptographic features? - Do you use a standard proven implementations? - Are the used keys controlled by the customer? Where are they stored? - Are you introducing any new policies/roles/users? - Have you used the least-privilege principle? How? By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. --------- Co-authored-by: Sofia Sazonova <[email protected]> commit 18e2f509 Author: Noah Paige <[email protected]> Date: Fri May 03 2024 10:14:52 GMT-0400 (Eastern Daylight Time) Fix local test groups listing for listGroups query (#1239) ### Feature or Bugfix <!-- please choose --> - Bugfix ### Detail - Locally when trying to invite a team to Env or Org we call listGroups and the returned `LOCAL_TEST_GROUPS` is not returning the proper data type expected ### Relates N/A ### Security Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/). - Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? - Is the input sanitized? - What precautions are you taking before deserializing the data you consume? - Is injection prevented by parametrizing queries? - Have you ensured no `eval` or similar functions are used? 
- Does this PR introduce any functionality or component that requires authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
- Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
- Do you use a standard proven implementations?
- Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
- Have you used the least-privilege principle? How?

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

commit a0be03c4
Author: dlpzx <[email protected]>
Date: Fri May 03 2024 10:12:34 GMT-0400 (Eastern Daylight Time)

Refactor: uncouple datasets and dataset_sharing modules - part 2-5 FINAL DELETE DATASETS_BASE (#1242)

### Feature or Bugfix
- Refactoring

### Detail
After all the previous PRs are merged, there should be no circular dependencies between `datasets` and `datasets_sharing`. We can now proceed to:
- move `datasets_base` models, repositories, permissions and enums to `datasets`
- adjust the `__init__` files to establish that `datasets_sharing` depends on `datasets`
- adjust the Module interfaces to ensure that all necessary dataset models... are imported in the interface for sharing

Next steps:
- share_notifications parameter to dataset_sharing in config.json

### Relates
#955

### Security
Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/).
- Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)?
- Is the input sanitized?
- What precautions are you taking before deserializing the data you consume?
- Is injection prevented by parametrizing queries?
- Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
- Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
- Do you use a standard proven implementations?
- Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
- Have you used the least-privilege principle? How?

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

commit b68b40c1
Author: Sofia Sazonova <[email protected]>
Date: Fri May 03 2024 10:12:11 GMT-0400 (Eastern Daylight Time)

bugfix: EnvironmentGroup can remove other groups (#1234)

### Feature or Bugfix
- Bugfix

### Detail
- Now, if a group cannot update other groups, it also cannot remove them.

### Relates
- #1212

### Security
Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/).
- Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)?
- Is the input sanitized?
- What precautions are you taking before deserializing the data you consume?
- Is injection prevented by parametrizing queries?
- Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
- Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
- Do you use a standard proven implementations?
- Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
- Have you used the least-privilege principle? How?

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

---------
Co-authored-by: Sofia Sazonova <[email protected]>

commit 264539b5
Author: Noah Paige <[email protected]>
Date: Fri May 03 2024 05:23:11 GMT-0400 (Eastern Daylight Time)

Fix Alembic Migration: has table checks (#1240)

### Feature or Bugfix
- Bugfix

### Detail
- Fix the `has_table()` check to ensure the tables are dropped if they exist as part of the Alembic migration upgrade
- Fix `DatasetLock nullable=True`

### Relates
- https://github.com/data-dot-all/dataall/issues/1165

### Security
Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/).
- Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? No
- Is the input sanitized? N/A
- What precautions are you taking before deserializing the data you consume? N/A
- Is injection prevented by parametrizing queries? N/A
- Have you ensured no `eval` or similar functions are used? N/A
- Does this PR introduce any functionality or component that requires authorization? No
- How have you ensured it respects the existing AuthN/AuthZ mechanisms? N/A
- Are you logging failed auth attempts? N/A
- Are you using or adding any cryptographic features? No
- Do you use a standard proven implementations? N/A
- Are the used keys controlled by the customer? Where are they stored? N/A
- Are you introducing any new policies/roles/users? No
- Have you used the least-privilege principle? How? N/A

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
commit 42a5f6bd Author: dlpzx <[email protected]> Date: Fri May 03 2024 02:24:09 GMT-0400 (Eastern Daylight Time) Refactor: uncouple datasets and dataset_sharing modules - part 2-4 (#1214) ### Feature or Bugfix - Refactoring ⚠️ MERGE AFTER https://github.com/data-dot-all/dataall/pull/1213 ### Detail This is needed as explained in full PR [AFTER 2.4] Refactor: uncouple datasets and dataset_sharing modules #1179 - [X] Use interface to resolve dataset roles related to datasets shared and implement logic in the dataset_sharing module - [X] Extend and clean-up stewards share permissions through interface ### Relates - #1179 ### Security Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/). - Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? - Is the input sanitized? - What precautions are you taking before deserializing the data you consume? - Is injection prevented by parametrizing queries? - Have you ensured no `eval` or similar functions are used? - Does this PR introduce any functionality or component that requires authorization? - How have you ensured it respects the existing AuthN/AuthZ mechanisms? - Are you logging failed auth attempts? - Are you using or adding any cryptographic features? - Do you use a standard proven implementations? - Are the used keys controlled by the customer? Where are they stored? - Are you introducing any new policies/roles/users? - Have you used the least-privilege principle? How? By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. 
commit 6d3f2d45
Author: Sofia Sazonova <[email protected]>
Date: Thu May 02 2024 10:55:00 GMT-0400 (Eastern Daylight Time)

[After 2.4]Core Refactoring part5 (#1194)

### Feature or Bugfix
- Refactoring

### Detail
- focus on core/environments
- move logic from resolvers to services
- create s3_client in base/aws --> TO BE REFACTORED. Needs to be merged with dataset_sharing/aws/s3_client

### Relates
- #741

### Security
Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/).
- Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)?
- Is the input sanitized?
- What precautions are you taking before deserializing the data you consume?
- Is injection prevented by parametrizing queries?
- Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
- Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
- Do you use a standard proven implementations?
- Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
- Have you used the least-privilege principle? How?

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
--------- Co-authored-by: Sofia Sazonova <[email protected]> commit 2ea24cbb Author: dlpzx <[email protected]> Date: Thu May 02 2024 08:22:12 GMT-0400 (Eastern Daylight Time) Refactor: uncouple datasets and dataset_sharing modules - part 2-3 (#1213) ### Feature or Bugfix - Refactoring ⚠️ MERGE AFTER https://github.com/data-dot-all/dataall/pull/1187 ### Detail This is needed as explained in full PR [AFTER 2.4] Refactor: uncouple datasets and dataset_sharing modules #1179 - [X] Creates an interface to execute checks and clean-ups of data sharing objects when dataset objects are deleted (initially it was going to be an db interface, but I think it is better in the service) - [X] Move listDatasetShares query to dataset_sharing module in https://github.com/data-dot-all/dataall/pull/1185 ### Relates - #1179 ### Security Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/). - Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? - Is the input sanitized? - What precautions are you taking before deserializing the data you consume? - Is injection prevented by parametrizing queries? - Have you ensured no `eval` or similar functions are used? - Does this PR introduce any functionality or component that requires authorization? - How have you ensured it respects the existing AuthN/AuthZ mechanisms? - Are you logging failed auth attempts? - Are you using or adding any cryptographic features? - Do you use a standard proven implementations? - Are the used keys controlled by the customer? Where are they stored? - Are you introducing any new policies/roles/users? - Have you used the least-privilege principle? How? By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. 
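The interface described in the part 2-3 refactor above (checks and clean-ups of sharing objects when dataset objects are deleted, implemented at the service level) could look roughly like the sketch below. All names here (`DatasetDeletionHook`, `DatasetService`, `ShareCleanUpHook`, `register_hook`) are hypothetical and chosen only for illustration; they are not taken from the data.all codebase. The point is the dependency direction: the datasets side owns an abstract interface, and the sharing side registers an implementation, so the datasets code never imports from sharing.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class DatasetDeletionHook(ABC):
    """Hypothetical interface, owned by the datasets module."""

    @abstractmethod
    def can_delete(self, dataset_uri: str) -> bool:
        """Return True if nothing blocks deleting the dataset."""

    @abstractmethod
    def clean_up(self, dataset_uri: str) -> None:
        """Remove objects that referenced the deleted dataset."""


class DatasetService:
    """Hypothetical dataset service that only knows the abstract hook."""

    def __init__(self) -> None:
        self._hooks: List[DatasetDeletionHook] = []

    def register_hook(self, hook: DatasetDeletionHook) -> None:
        self._hooks.append(hook)

    def delete_dataset(self, dataset_uri: str) -> bool:
        # Run every registered check before touching anything.
        if not all(hook.can_delete(dataset_uri) for hook in self._hooks):
            return False
        # ... the dataset itself would be deleted here ...
        for hook in self._hooks:
            hook.clean_up(dataset_uri)
        return True


class ShareCleanUpHook(DatasetDeletionHook):
    """Hypothetical implementation that would live in the sharing module."""

    def __init__(self, active: Dict[str, int], drafts: Dict[str, List[str]]) -> None:
        self.active = active  # dataset_uri -> number of active shares
        self.drafts = drafts  # dataset_uri -> draft share identifiers

    def can_delete(self, dataset_uri: str) -> bool:
        # Block deletion while any share on the dataset is still active.
        return self.active.get(dataset_uri, 0) == 0

    def clean_up(self, dataset_uri: str) -> None:
        # Drop leftover draft shares that pointed at the deleted dataset.
        self.drafts.pop(dataset_uri, None)
```

With this shape, the sharing module calls `register_hook` once at load time, and `delete_dataset` stays unaware of what a share is.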
commit 750a5ec8
Author: Anushka Singh <[email protected]>
Date: Wed May 01 2024 12:28:18 GMT-0400 (Eastern Daylight Time)

Feature:1221 - Make visibility of auto-approval toggle configurable based on confidentiality (#1223)

### Feature or Bugfix
- Feature

### Detail
- Users should be able to disable the visibility of the auto-approval toggle with code. For example, at our company, we require that shares always go through the approval process if their confidentiality classification is Secret. We don't even want to give users the option to enable autoApproval, to ensure they don't do so by mistake and end up oversharing.

Video demo: https://github.com/data-dot-all/dataall/issues/1221#issuecomment-2077412044

### Relates
- https://github.com/data-dot-all/dataall/issues/1221

### Security
Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/).
- Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)?
- Is the input sanitized?
- What precautions are you taking before deserializing the data you consume?
- Is injection prevented by parametrizing queries?
- Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
- Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
- Do you use a standard proven implementations?
- Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
- Have you used the least-privilege principle? How?

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
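The rule described in the auto-approval change above can be illustrated with a small sketch. This is not the code from the PR; the setting name `HIDE_AUTO_APPROVAL_FOR` and both function names are hypothetical, and the level `Official` is used only as an example of a non-Secret classification. It assumes a deployment lists the confidentiality levels for which the toggle must be hidden, and that a hidden toggle also forces auto-approval off.

```python
from typing import FrozenSet

# Hypothetical deployment setting: levels for which the toggle is hidden.
HIDE_AUTO_APPROVAL_FOR: FrozenSet[str] = frozenset({'Secret'})


def is_auto_approval_toggle_visible(
    confidentiality: str,
    hidden_levels: FrozenSet[str] = HIDE_AUTO_APPROVAL_FOR,
) -> bool:
    """Return True if users may see the auto-approval toggle."""
    return confidentiality not in hidden_levels


def effective_auto_approval(
    confidentiality: str,
    requested: bool,
    hidden_levels: FrozenSet[str] = HIDE_AUTO_APPROVAL_FOR,
) -> bool:
    """Ignore a requested auto-approval when the toggle is hidden."""
    return requested and is_auto_approval_toggle_visible(confidentiality, hidden_levels)
```

Checking the rule a second time on the value side, as `effective_auto_approval` does, is a design choice of this sketch: hiding a control in the UI alone would not stop a request that sets the flag directly.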
commit 82044689 Author: dlpzx <[email protected]> Date: Wed May 01 2024 12:26:42 GMT-0400 (Eastern Daylight Time) Refactor: uncouple datasets and dataset_sharing modules - part 2-2 (#1187) ### Feature or Bugfix - Refactoring ⚠️ MERGE AFTER https://github.com/data-dot-all/dataall/pull/1185 ### Detail This is needed as explained in full PR [AFTER 2.4] Refactor: uncouple datasets and dataset_sharing modules #1179 - Split the getDatasetAssumeRole API into 2 APIs, one for dataset owners role (in datasets module) and another one for share requester roles (in datasets_sharing module) ### Relates - #1179 ### Security Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/). - Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? - Is the input sanitized? - What precautions are you taking before deserializing the data you consume? - Is injection prevented by parametrizing queries? - Have you ensured no `eval` or similar functions are used? - Does this PR introduce any functionality or component that requires authorization? - How have you ensured it respects the existing AuthN/AuthZ mechanisms? - Are you logging failed auth attempts? - Are you using or adding any cryptographic features? - Do you use a standard proven implementations? - Are the used keys controlled by the customer? Where are they stored? - Are you introducing any new policies/roles/users? - Have you used the least-privilege principle? How? By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. 
commit 5173419f
Author: Noah Paige <[email protected]>
Date: Wed May 01 2024 12:24:42 GMT-0400 (Eastern Daylight Time)

Fix so listValidEnvironments called only once (#1238)

### Feature or Bugfix
- Bugfix

### Detail
- When requesting access to a share on data.all, the query to `listValidEnvironments` used to be called twice, which (depending on how long the query results took to return) could cause the initially selected environment to disappear

### Relates
- Continuation of https://github.com/data-dot-all/dataall/issues/916

### Security
Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/).
- Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)?
- Is the input sanitized?
- What precautions are you taking before deserializing the data you consume?
- Is injection prevented by parametrizing queries?
- Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
- Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
- Do you use a standard proven implementations?
- Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
- Have you used the least-privilege principle? How?

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

commit 7656ea86
Author: dlpzx <[email protected]>
Date: Tue Apr 30 2024 07:13:01 GMT-0400 (Eastern Daylight Time)

Add integration tests on a real API client and integrate the tests in CICD (#1219)

### Feature or Bugfix
- Feature

### Detail
Add integration tests that use a real Client to execute different validation actions.
- Define the Client and the way API calls are posted to API Gateway in the conftest - Define the Cognito users and the different fixtures needed for all tests - Write tests for the Organization core module as example - Add feature flag in `cdk.json` called `with_approval_tests` that can be defined at the deployment environment level. If set to True, a CodeBuild stage running the tests is created. ### Relates - https://github.com/data-dot-all/dataall/issues/1220 ### Security Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/). - Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? - Is the input sanitized? - What precautions are you taking before deserializing the data you consume? - Is injection prevented by parametrizing queries? - Have you ensured no `eval` or similar functions are used? - Does this PR introduce any functionality or component that requires authorization? - How have you ensured it respects the existing AuthN/AuthZ mechanisms? - Are you logging failed auth attempts? - Are you using or adding any cryptographic features? - Do you use a standard proven implementations? - Are the used keys controlled by the customer? Where are they stored? - Are you introducing any new policies/roles/users? - Have you used the least-privilege principle? How? By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. 
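As a rough illustration of the `with_approval_tests` flag described in #1219, the sketch below reads a `cdk.json`-style document and lists the deployment environments that would get the extra test stage. The `context`, `DeploymentEnvironments`, and `envname` keys and the helper name are assumptions made for this example; only the flag name comes from the commit message.

```python
import json

# Hypothetical cdk.json excerpt. Every key except `with_approval_tests`
# is an assumption made for this sketch.
CDK_JSON = """
{
  "context": {
    "DeploymentEnvironments": [
      {"envname": "dev", "with_approval_tests": true},
      {"envname": "prod"}
    ]
  }
}
"""


def envs_with_approval_tests(cdk_json_text: str) -> list:
    """Return the names of environments whose flag is set to true."""
    context = json.loads(cdk_json_text).get("context", {})
    environments = context.get("DeploymentEnvironments", [])
    # A missing flag is treated as False, so the test stage is opt-in.
    return [
        env["envname"]
        for env in environments
        if env.get("with_approval_tests", False)
    ]


print(envs_with_approval_tests(CDK_JSON))  # ['dev']
```

Treating a missing flag as disabled keeps existing deployments unchanged until they opt in.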
commit b963fe81 Author: Sofia Sazonova <[email protected]> Date: Mon Apr 29 2024 09:26:36 GMT-0400 (Eastern Daylight Time) Notification link routes to a share request page (#1227) ### Feature or Bugfix <!-- please choose --> - Feature ### Detail - in notification object field `target_uri = 'shareUri|DataSetUri'` - this value is parsed and used to redirect user to a relevant Share Request page ### Relates - #1115 ### Security Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/). - Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? - Is the input sanitized? - What precautions are you taking before deserializing the da…
noah-paige added a commit that referenced this issue Jun 25, 2024
commit d8497c55 Author: Noah Paige <[email protected]> Date: Mon May 20 2024 15:19:35 GMT-0400 (Eastern Daylight Time) fix commit 199ab505 Author: Admin/noahpaig-Isengard <Admin/noahpaig-Isengard> Date: Mon May 20 2024 15:16:14 GMT-0400 (Eastern Daylight Time) Conflicts resolved in the console. commit ad415575 Author: Noah Paige <[email protected]> Date: Mon May 20 2024 15:13:06 GMT-0400 (Eastern Daylight Time) Merge branch 'chatbot-test' into noah-main-2 commit 2893efc7 Author: Noah Paige <[email protected]> Date: Mon May 20 2024 15:10:20 GMT-0400 (Eastern Daylight Time) Merge branch 'chatbot-test' into noah-main-2 commit caad12e1 Author: Noah Paige <[email protected]> Date: Mon May 20 2024 15:12:47 GMT-0400 (Eastern Daylight Time) ruff commit a73b7110 Author: Noah Paige <[email protected]> Date: Mon May 20 2024 15:09:19 GMT-0400 (Eastern Daylight Time) Remove hardcoding commit 7b32a8f7 Author: Noah Paige <[email protected]> Date: Mon May 20 2024 15:07:29 GMT-0400 (Eastern Daylight Time) Chatbot POC commit e25e5815 Author: Noah Paige <[email protected]> Date: Mon May 20 2024 15:08:32 GMT-0400 (Eastern Daylight Time) Merge branch '1215-share-logs' into noah-main-2 commit a06c8cba Author: Noah Paige <[email protected]> Date: Fri May 17 2024 16:37:05 GMT-0400 (Eastern Daylight Time) Merge share logs PR commit a5626670 Author: Sofia Sazonova <[email protected]> Date: Fri May 17 2024 17:19:25 GMT-0400 (Eastern Daylight Time) make ruff happy commit aee98cf7 Author: Noah Paige <[email protected]> Date: Fri May 17 2024 16:34:24 GMT-0400 (Eastern Daylight Time) Merge share logs PR commit c3ee2f7c Author: Sofia Sazonova <[email protected]> Date: Fri May 17 2024 17:11:00 GMT-0400 (Eastern Daylight Time) PR comments commit 5ca55303 Author: Sofia Sazonova <[email protected]> Date: Wed May 15 2024 11:57:41 GMT-0400 (Eastern Daylight Time) remove unused imports commit 8f8bf3dd Author: Sofia Sazonova <[email protected]> Date: Wed May 15 2024 11:56:39 GMT-0400 (Eastern Daylight 
Time) restrict access to the share logs commit 9137da9b Author: Sofia Sazonova <[email protected]> Date: Wed May 15 2024 11:28:32 GMT-0400 (Eastern Daylight Time) share Logs button is available only for dataset Admins and stewards commit fcb16bd9 Author: Sofia Sazonova <[email protected]> Date: Wed May 15 2024 10:46:25 GMT-0400 (Eastern Daylight Time) getShareLogs query commit 0503a3bb Author: Sofia Sazonova <[email protected]> Date: Wed May 15 2024 10:21:25 GMT-0400 (Eastern Daylight Time) Logs modal in Share View commit bab2f3e6 Author: Sofia Sazonova <[email protected]> Date: Mon May 13 2024 09:09:18 GMT-0400 (Eastern Daylight Time) Add confirmation pop-ups for deletion of team roles and groups (#1231) ### Feature or Bugfix - Feature ### Detail Pop ups added for: - deletion team from environment - deletion of the consumption role - deletion of group from Organization ### Relates - #942 ### Security Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/). - Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? - Is the input sanitized? - What precautions are you taking before deserializing the data you consume? - Is injection prevented by parametrizing queries? - Have you ensured no `eval` or similar functions are used? - Does this PR introduce any functionality or component that requires authorization? - How have you ensured it respects the existing AuthN/AuthZ mechanisms? - Are you logging failed auth attempts? - Are you using or adding any cryptographic features? - Do you use a standard proven implementations? - Are the used keys controlled by the customer? Where are they stored? - Are you introducing any new policies/roles/users? - Have you used the least-privilege principle? How? 
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

Co-authored-by: Sofia Sazonova <[email protected]>

commit 93ff7725 Author: Sofia Sazonova <[email protected]> Date: Mon May 13 2024 08:00:38 GMT-0400 (Eastern Daylight Time)

Update version.json (#1264)

Release info update

commit e718d861 Author: Sofia Sazonova <[email protected]> Date: Mon May 13 2024 07:29:27 GMT-0400 (Eastern Daylight Time)

fix permission query (#1263)

### Feature or Bugfix
- Bugfix

### Detail
- The filter is an array of permission names, so a join is needed to query the policies correctly.
- The filters `share_type` and `share_item_status` must be strings.
- IMPORTANT: the `finally` block used the `session` variable, but `session` was defined only in the `try` block, so the lock failed to be released.

### Relates
- <URL or Ticket>

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
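The `finally` issue noted in #1263 can be illustrated with a small, self-contained sketch. It is not the data.all code: `Session`, `scoped_session`, `acquire`, `release`, and the `LOCKS` set are hypothetical stand-ins, and the example shows only one way that a session scoped to the `try` block can leave a lock held.

```python
from contextlib import contextmanager

LOCKS = set()  # hypothetical in-memory stand-in for a lock table


class Session:
    """Minimal stand-in for a database session."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


@contextmanager
def scoped_session():
    session = Session()
    try:
        yield session
    finally:
        session.close()


def acquire(session, key):
    if session.closed:
        raise RuntimeError("session is closed")
    LOCKS.add(key)


def release(session, key):
    if session.closed:
        raise RuntimeError("session is closed")
    LOCKS.discard(key)


def process_buggy(key):
    try:
        with scoped_session() as session:
            acquire(session, key)
    finally:
        # The session only lived inside the try block and is closed by now,
        # so this call raises and the lock is never released.
        release(session, key)


def process_fixed(key):
    with scoped_session() as session:
        acquire(session, key)
        try:
            pass  # the actual work would happen here
        finally:
            # The session is still open here, so the release succeeds.
            release(session, key)
```

Calling `process_buggy("a")` raises and leaves `"a"` in `LOCKS`, while `process_fixed("b")` leaves no entry behind.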
---------

Co-authored-by: Sofia Sazonova <[email protected]>

commit 479b8f3f Author: mourya-33 <[email protected]> Date: Wed May 08 2024 10:29:36 GMT-0400 (Eastern Daylight Time)

Add encryption and tag immutability to ECR repository (#1224)

### Feature or Bugfix
- Bugfix

### Detail
- The ECR repository that is currently created does not have encryption or tag immutability enabled, which checkov scans flag. This fix enables both.

### Relates
- https://github.com/data-dot-all/dataall/issues/1200

### Security
Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/).
- Does this PR introduce or modify any input fields or queries - this includes N/A
- Is the input sanitized? N/A
- What precautions are you taking before deserializing the data you consume? N/A
- Is injection prevented by parametrizing queries? N/A
- Have you ensured no `eval` or similar functions are used? N/A
- Does this PR introduce any functionality or component that requires authorization? N/A
- How have you ensured it respects the existing AuthN/AuthZ mechanisms? N/A
- Are you logging failed auth attempts? N/A
- Are you using or adding any cryptographic features? N/A
- Do you use a standard proven implementations? N/A
- Are the used keys controlled by the customer? Where are they stored? No. This is with default encryption
- Are you introducing any new policies/roles/users? N/A
- Have you used the least-privilege principle? How? N/A

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
commit 2f885773 Author: Sofia Sazonova <[email protected]> Date: Wed May 08 2024 09:22:40 GMT-0400 (Eastern Daylight Time) Multiple permission roots (#1259) ### Feature or Bugfix - Bugfix ### Detail - GET_DATASET_TABLE (FOLDER) permissions are granted to the group only if they are not granted already - these permissions are removed if group is not admin|steward and there are no other shares of this item. ### Relates - #1174 ### Security Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/). - Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? - Is the input sanitized? - What precautions are you taking before deserializing the data you consume? - Is injection prevented by parametrizing queries? - Have you ensured no `eval` or similar functions are used? - Does this PR introduce any functionality or component that requires authorization? - How have you ensured it respects the existing AuthN/AuthZ mechanisms? - Are you logging failed auth attempts? - Are you using or adding any cryptographic features? - Do you use a standard proven implementations? - Are the used keys controlled by the customer? Where are they stored? - Are you introducing any new policies/roles/users? - Have you used the least-privilege principle? How? By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. 
---------

Co-authored-by: Sofia Sazonova <[email protected]>

commit c4cc07ee Author: Petros Kalos <[email protected]> Date: Wed May 08 2024 08:54:02 GMT-0400 (Eastern Daylight Time)

explicitly specify dataset_client s3 endpoint_url (#1260)

* AWS requires that the endpoint_url be explicitly specified for some regions
* Remove misleading CORS error message; the upload step can fail for many reasons

### Feature or Bugfix
- Bugfix

### Detail
Resolves #778

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
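A minimal sketch of what pinning an S3 client to an explicit regional endpoint can look like, assuming the standard `aws` partition. The helper name `s3_client_kwargs` is hypothetical and this is not necessarily how #1260 implements the fix; other partitions (for example the China regions) use a different domain suffix.

```python
def s3_client_kwargs(region: str) -> dict:
    """Build keyword arguments for an S3 client tied to a regional endpoint.

    Assumes the standard `aws` partition; the helper itself is hypothetical.
    """
    if not region:
        raise ValueError("region must be a non-empty string")
    return {
        "region_name": region,
        "endpoint_url": f"https://s3.{region}.amazonaws.com",
    }


kwargs = s3_client_kwargs("eu-south-1")
print(kwargs["endpoint_url"])  # https://s3.eu-south-1.amazonaws.com
```

With boto3 installed, the dictionary could be passed on as `boto3.client("s3", **kwargs)`, since `region_name` and `endpoint_url` are both accepted keyword arguments of `boto3.client`.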
commit 40defe8e Author: dlpzx <[email protected]> Date: Tue May 07 2024 11:52:17 GMT-0400 (Eastern Daylight Time) Generic dataset module and specific s3_datasets module - part 1 (Rename datasets as s3_datasets) (#1250) ### Feature or Bugfix - Refactoring ### Detail - Rename `datasets` module to `s3_datasets` module This PR is the first step to extract a generic datasets_base module that implements the undifferentiated concepts of Dataset in data.all. s3_datasets will use this base module to implement the specific implementation for S3 datatasets. ### Relates - #1123 - #955 ### Security Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/). - Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? - Is the input sanitized? - What precautions are you taking before deserializing the data you consume? - Is injection prevented by parametrizing queries? - Have you ensured no `eval` or similar functions are used? - Does this PR introduce any functionality or component that requires authorization? - How have you ensured it respects the existing AuthN/AuthZ mechanisms? - Are you logging failed auth attempts? - Are you using or adding any cryptographic features? - Do you use a standard proven implementations? - Are the used keys controlled by the customer? Where are they stored? - Are you introducing any new policies/roles/users? - Have you used the least-privilege principle? How? By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. commit 74a303cb Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue May 07 2024 02:26:09 GMT-0400 (Eastern Daylight Time) Bump werkzeug from 3.0.1 to 3.0.3 in /tests (#1253) Bumps [werkzeug](https://github.com/pallets/werkzeug) from 3.0.1 to 3.0.3. 
<details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/pallets/werkzeug/releases">werkzeug's releases</a>.</em></p> <blockquote> <h2>3.0.3</h2> <p>This is the Werkzeug 3.0.3 security release, which fixes security issues and bugs but does not otherwise change behavior and should not result in breaking changes.</p> <p>PyPI: <a href="https://pypi.org/project/Werkzeug/3.0.3/">https://pypi.org/project/Werkzeug/3.0.3/</a> Changes: <a href="https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-3">https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-3</a> Milestone: <a href="https://github.com/pallets/werkzeug/milestone/35?closed=1">https://github.com/pallets/werkzeug/milestone/35?closed=1</a></p> <ul> <li>Only allow <code>localhost</code>, <code>.localhost</code>, <code>127.0.0.1</code>, or the specified hostname when running the dev server, to make debugger requests. Additional hosts can be added by using the debugger middleware directly. The debugger UI makes requests using the full URL rather than only the path. GHSA-2g68-c3qc-8985</li> <li>Make reloader more robust when <code>""</code> is in <code>sys.path</code>. <a href="https://redirect.github.com/pallets/werkzeug/issues/2823">#2823</a></li> <li>Better TLS cert format with <code>adhoc</code> dev certs. <a href="https://redirect.github.com/pallets/werkzeug/issues/2891">#2891</a></li> <li>Inform Python < 3.12 how to handle <code>itms-services</code> URIs correctly, rather than using an overly-broad workaround in Werkzeug that caused some redirect URIs to be passed on without encoding. <a href="https://redirect.github.com/pallets/werkzeug/issues/2828">#2828</a></li> <li>Type annotation for <code>Rule.endpoint</code> and other uses of <code>endpoint</code> is <code>Any</code>. 
<a href="https://redirect.github.com/pallets/werkzeug/issues/2836">#2836</a></li> </ul> <h2>3.0.2</h2> <p>This is a fix release for the 3.0.x feature branch.</p> <ul> <li>Changes: <a href="https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-2">https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-2</a></li> </ul> </blockquote> </details> <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/pallets/werkzeug/blob/main/CHANGES.rst">werkzeug's changelog</a>.</em></p> <blockquote> <h2>Version 3.0.3</h2> <p>Released 2024-05-05</p> <ul> <li> <p>Only allow <code>localhost</code>, <code>.localhost</code>, <code>127.0.0.1</code>, or the specified hostname when running the dev server, to make debugger requests. Additional hosts can be added by using the debugger middleware directly. The debugger UI makes requests using the full URL rather than only the path. :ghsa:<code>2g68-c3qc-8985</code></p> </li> <li> <p>Make reloader more robust when <code>""</code> is in <code>sys.path</code>. :pr:<code>2823</code></p> </li> <li> <p>Better TLS cert format with <code>adhoc</code> dev certs. :pr:<code>2891</code></p> </li> <li> <p>Inform Python < 3.12 how to handle <code>itms-services</code> URIs correctly, rather than using an overly-broad workaround in Werkzeug that caused some redirect URIs to be passed on without encoding. :issue:<code>2828</code></p> </li> <li> <p>Type annotation for <code>Rule.endpoint</code> and other uses of <code>endpoint</code> is <code>Any</code>. :issue:<code>2836</code></p> </li> <li> <p>Make reloader more robust when <code>""</code> is in <code>sys.path</code>. :pr:<code>2823</code></p> </li> </ul> <h2>Version 3.0.2</h2> <p>Released 2024-04-01</p> <ul> <li>Ensure setting <code>merge_slashes</code> to <code>False</code> results in <code>NotFound</code> for repeated-slash requests against single slash routes. 
:issue:<code>2834</code></li> <li>Fix handling of <code>TypeError</code> in <code>TypeConversionDict.get()</code> to match <code>ValueError</code>. :issue:<code>2843</code></li> <li>Fix <code>response_wrapper</code> type check in test client. :issue:<code>2831</code></li> <li>Make the return type of <code>MultiPartParser.parse</code> more precise. :issue:<code>2840</code></li> <li>Raise an error if converter arguments cannot be parsed. :issue:<code>2822</code></li> </ul> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/pallets/werkzeug/commit/f9995e967979eb694d6b31536cc65314fd7e9c8c"><code>f9995e9</code></a> release version 3.0.3</li> <li><a href="https://github.com/pallets/werkzeug/commit/3386395b24c7371db11a5b8eaac0c91da5362692"><code>3386395</code></a> Merge pull request from GHSA-2g68-c3qc-8985</li> <li><a href="https://github.com/pallets/werkzeug/commit/890b6b62634fa61224222aee31081c61b054ff01"><code>890b6b6</code></a> only require trusted host for evalex</li> <li><a href="https://github.com/pallets/werkzeug/commit/71b69dfb7df3d912e66bab87fbb1f21f83504967"><code>71b69df</code></a> restrict debugger trusted hosts</li> <li><a href="https://github.com/pallets/werkzeug/commit/d2d3869525a4ffb2c41dfb2c0e39d94dab2d870c"><code>d2d3869</code></a> endpoint type is Any (<a href="https://redirect.github.com/pallets/werkzeug/issues/2895">#2895</a>)</li> <li><a href="https://github.com/pallets/werkzeug/commit/7080b55acd48b68afdda65ee6c7f99e9afafb0ba"><code>7080b55</code></a> endpoint type is Any</li> <li><a href="https://github.com/pallets/werkzeug/commit/7555eff296fbdf12f2e576b6bbb0b506df8417ed"><code>7555eff</code></a> remove iri_to_uri redirect workaround (<a href="https://redirect.github.com/pallets/werkzeug/issues/2894">#2894</a>)</li> <li><a href="https://github.com/pallets/werkzeug/commit/97fb2f722297ae4e12e36dab024e0acf8477b3c8"><code>97fb2f7</code></a> remove _invalid_iri_to_uri workaround</li> <li><a 
href="https://github.com/pallets/werkzeug/commit/249527ff981e7aa22cd714825c5637cc92df7761"><code>249527f</code></a> make cn field a valid single hostname, and use wildcard in SANs field. (<a href="https://redirect.github.com/pallets/werkzeug/issues/2892">#2892</a>)</li> <li><a href="https://github.com/pallets/werkzeug/commit/793be472c9d145eb9be7d4200672d1806289d84a"><code>793be47</code></a> update adhoc tls dev cert format</li> <li>Additional commits viewable in <a href="https://github.com/pallets/werkzeug/compare/3.0.1...3.0.3">compare view</a></li> </ul> </details> <br /> [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=werkzeug&package-manager=pip&previous-version=3.0.1&new-version=3.0.3)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. [//]: # (dependabot-automerge-start) [//]: # (dependabot-automerge-end) --- <details> <summary>Dependabot commands and options</summary> <br /> You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot merge` will merge this PR after your CI passes on it - `@dependabot squash and merge` will squash and merge this PR after your CI passes on it - `@dependabot cancel merge` will cancel a previously requested merge and block automerging - `@dependabot reopen` will reopen this PR if it is closed - `@dependabot close` will close this PR and stop Dependabot recreating it. 
You can achieve the same result by closing it manually - `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency - `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself) You can disable automated security fix PRs for this repo from the [Security Alerts page](https://github.com/data-dot-all/dataall/network/alerts). </details> Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> commit 2f33320c Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue May 07 2024 02:25:03 GMT-0400 (Eastern Daylight Time) Bump werkzeug from 3.0.1 to 3.0.3 in /backend/dataall/base/cdkproxy (#1252) Bumps [werkzeug](https://github.com/pallets/werkzeug) from 3.0.1 to 3.0.3. 
Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> commit 0b49633f Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue May 07 2024 02:24:34 GMT-0400 (Eastern Daylight Time) Bump werkzeug from 3.0.1 to 3.0.3 in /tests_new/integration_tests (#1254) Bumps [werkzeug](https://github.com/pallets/werkzeug) from 3.0.1 to 3.0.3.
<details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/pallets/werkzeug/releases">werkzeug's releases</a>.</em></p> <blockquote> <h2>3.0.3</h2> <p>This is the Werkzeug 3.0.3 security release, which fixes security issues and bugs but does not otherwise change behavior and should not result in breaking changes.</p> <p>PyPI: <a href="https://pypi.org/project/Werkzeug/3.0.3/">https://pypi.org/project/Werkzeug/3.0.3/</a> Changes: <a href="https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-3">https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-3</a> Milestone: <a href="https://github.com/pallets/werkzeug/milestone/35?closed=1">https://github.com/pallets/werkzeug/milestone/35?closed=1</a></p> <ul> <li>Only allow <code>localhost</code>, <code>.localhost</code>, <code>127.0.0.1</code>, or the specified hostname when running the dev server, to make debugger requests. Additional hosts can be added by using the debugger middleware directly. The debugger UI makes requests using the full URL rather than only the path. GHSA-2g68-c3qc-8985</li> <li>Make reloader more robust when <code>""</code> is in <code>sys.path</code>. <a href="https://redirect.github.com/pallets/werkzeug/issues/2823">#2823</a></li> <li>Better TLS cert format with <code>adhoc</code> dev certs. <a href="https://redirect.github.com/pallets/werkzeug/issues/2891">#2891</a></li> <li>Inform Python < 3.12 how to handle <code>itms-services</code> URIs correctly, rather than using an overly-broad workaround in Werkzeug that caused some redirect URIs to be passed on without encoding. <a href="https://redirect.github.com/pallets/werkzeug/issues/2828">#2828</a></li> <li>Type annotation for <code>Rule.endpoint</code> and other uses of <code>endpoint</code> is <code>Any</code>. 
<a href="https://redirect.github.com/pallets/werkzeug/issues/2836">#2836</a></li> </ul> <h2>3.0.2</h2> <p>This is a fix release for the 3.0.x feature branch.</p> <ul> <li>Changes: <a href="https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-2">https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-2</a></li> </ul> </blockquote> </details> <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/pallets/werkzeug/blob/main/CHANGES.rst">werkzeug's changelog</a>.</em></p> <blockquote> <h2>Version 3.0.3</h2> <p>Released 2024-05-05</p> <ul> <li> <p>Only allow <code>localhost</code>, <code>.localhost</code>, <code>127.0.0.1</code>, or the specified hostname when running the dev server, to make debugger requests. Additional hosts can be added by using the debugger middleware directly. The debugger UI makes requests using the full URL rather than only the path. :ghsa:<code>2g68-c3qc-8985</code></p> </li> <li> <p>Make reloader more robust when <code>""</code> is in <code>sys.path</code>. :pr:<code>2823</code></p> </li> <li> <p>Better TLS cert format with <code>adhoc</code> dev certs. :pr:<code>2891</code></p> </li> <li> <p>Inform Python < 3.12 how to handle <code>itms-services</code> URIs correctly, rather than using an overly-broad workaround in Werkzeug that caused some redirect URIs to be passed on without encoding. :issue:<code>2828</code></p> </li> <li> <p>Type annotation for <code>Rule.endpoint</code> and other uses of <code>endpoint</code> is <code>Any</code>. :issue:<code>2836</code></p> </li> <li> <p>Make reloader more robust when <code>""</code> is in <code>sys.path</code>. :pr:<code>2823</code></p> </li> </ul> <h2>Version 3.0.2</h2> <p>Released 2024-04-01</p> <ul> <li>Ensure setting <code>merge_slashes</code> to <code>False</code> results in <code>NotFound</code> for repeated-slash requests against single slash routes. 
:issue:<code>2834</code></li> <li>Fix handling of <code>TypeError</code> in <code>TypeConversionDict.get()</code> to match <code>ValueError</code>. :issue:<code>2843</code></li> <li>Fix <code>response_wrapper</code> type check in test client. :issue:<code>2831</code></li> <li>Make the return type of <code>MultiPartParser.parse</code> more precise. :issue:<code>2840</code></li> <li>Raise an error if converter arguments cannot be parsed. :issue:<code>2822</code></li> </ul> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/pallets/werkzeug/commit/f9995e967979eb694d6b31536cc65314fd7e9c8c"><code>f9995e9</code></a> release version 3.0.3</li> <li><a href="https://github.com/pallets/werkzeug/commit/3386395b24c7371db11a5b8eaac0c91da5362692"><code>3386395</code></a> Merge pull request from GHSA-2g68-c3qc-8985</li> <li><a href="https://github.com/pallets/werkzeug/commit/890b6b62634fa61224222aee31081c61b054ff01"><code>890b6b6</code></a> only require trusted host for evalex</li> <li><a href="https://github.com/pallets/werkzeug/commit/71b69dfb7df3d912e66bab87fbb1f21f83504967"><code>71b69df</code></a> restrict debugger trusted hosts</li> <li><a href="https://github.com/pallets/werkzeug/commit/d2d3869525a4ffb2c41dfb2c0e39d94dab2d870c"><code>d2d3869</code></a> endpoint type is Any (<a href="https://redirect.github.com/pallets/werkzeug/issues/2895">#2895</a>)</li> <li><a href="https://github.com/pallets/werkzeug/commit/7080b55acd48b68afdda65ee6c7f99e9afafb0ba"><code>7080b55</code></a> endpoint type is Any</li> <li><a href="https://github.com/pallets/werkzeug/commit/7555eff296fbdf12f2e576b6bbb0b506df8417ed"><code>7555eff</code></a> remove iri_to_uri redirect workaround (<a href="https://redirect.github.com/pallets/werkzeug/issues/2894">#2894</a>)</li> <li><a href="https://github.com/pallets/werkzeug/commit/97fb2f722297ae4e12e36dab024e0acf8477b3c8"><code>97fb2f7</code></a> remove _invalid_iri_to_uri workaround</li> <li><a 
href="https://github.com/pallets/werkzeug/commit/249527ff981e7aa22cd714825c5637cc92df7761"><code>249527f</code></a> make cn field a valid single hostname, and use wildcard in SANs field. (<a href="https://redirect.github.com/pallets/werkzeug/issues/2892">#2892</a>)</li> <li><a href="https://github.com/pallets/werkzeug/commit/793be472c9d145eb9be7d4200672d1806289d84a"><code>793be47</code></a> update adhoc tls dev cert format</li> <li>Additional commits viewable in <a href="https://github.com/pallets/werkzeug/compare/3.0.1...3.0.3">compare view</a></li> </ul> </details> <br /> [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=werkzeug&package-manager=pip&previous-version=3.0.1&new-version=3.0.3)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. [//]: # (dependabot-automerge-start) [//]: # (dependabot-automerge-end) --- <details> <summary>Dependabot commands and options</summary> <br /> You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot merge` will merge this PR after your CI passes on it - `@dependabot squash and merge` will squash and merge this PR after your CI passes on it - `@dependabot cancel merge` will cancel a previously requested merge and block automerging - `@dependabot reopen` will reopen this PR if it is closed - `@dependabot close` will close this PR and stop Dependabot recreating it. 
You can achieve the same result by closing it manually - `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency - `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself) You can disable automated security fix PRs for this repo from the [Security Alerts page](https://github.com/data-dot-all/dataall/network/alerts). </details> Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> commit 08862420 Author: mourya-33 <[email protected]> Date: Tue May 07 2024 02:15:15 GMT-0400 (Eastern Daylight Time) Updated lambda_api.py to add encryption for lambda env vars for custo… (#1255) Feature or Bugfix Bugfix Detail The environment variables for the lambda functions are not encrypted in cdk which are identified by checkov scans. This fix is to enable kms encryption for the lambda environment variables. Relates Security Please answer the questions below briefly where applicable, or write N/A. Based on [OWASP 10](https://owasp.org/Top10/en/). Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? N/A Is the input sanitized? N/A What precautions are you taking before deserializing the data you consume? N/A Is injection prevented by parametrizing queries? N/A Have you ensured no eval or similar functions are used? 
N/A
Does this PR introduce any functionality or component that requires authorization? N/A
How have you ensured it respects the existing AuthN/AuthZ mechanisms? N/A
Are you logging failed auth attempts? N/A
Are you using or adding any cryptographic features? N/A
Do you use a standard proven implementations? N/A
Are the used keys controlled by the customer? Where are they stored? The KMS keys are generated by cdk and are used to encrypt the environment variables for all lambda functions in the lambda-api stack
Are you introducing any new policies/roles/users? N/A
Have you used the least-privilege principle? How? N/A

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

commit ed7cc3eb
Author: Noah Paige <[email protected]>
Date: Mon May 06 2024 09:32:30 GMT-0400 (Eastern Daylight Time)

Add order_by for paginated queries (#1249)

### Feature or Bugfix
<!-- please choose -->
- Bugfix

### Detail
- This PR aims to solve the following:
  - (1) For particular queries (those that perform `.outerjoin()` operations and have their results paginated with the `paginate()` function), the number of results returned is sometimes *less than* the limit set by the pageSize of the paginate function, even when the total count is greater than the pageSize
    - Ex 1: 11 envs total, `query_user_environments()` returning 9 envs on 1st page + 2 on 2nd page
    - Ex 2: 10 envs total, `query_user_environments()` returning 9 envs on 1st page + no 2nd page
    - We believe this happens because of the way SQLAlchemy is "uniquing" the records resulting from an outerjoin before returning that result to the frontend
    - Adding a `.distinct()` check on the query ensures each distinct record is returned (tested successfully)
  - (2) Currently we often do not implement an `.order_by()` condition for the query used in `paginate()`, so we have no stable way of preserving the order of the items returned from a query (i.e. when navigating through pages of the response)
    - A generally good practice seems to be to include an `order_by()` on a column or set of columns
    - For each query used in `paginate()` this PR adds an `order_by()` condition (full list in comments below)

More context is available in the related issue linked below.

### Relates
- https://github.com/data-dot-all/dataall/issues/1241

### Security
Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/).
- Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)?
- Is the input sanitized?
- What precautions are you taking before deserializing the data you consume?
- Is injection prevented by parametrizing queries?
- Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
- Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
- Do you use a standard proven implementations?
- Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
- Have you used the least-privilege principle? How?

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

commit 98e67fa8
Author: Sofia Sazonova <[email protected]>
Date: Fri May 03 2024 12:21:57 GMT-0400 (Eastern Daylight Time)

fix: DATASET_READ_TABLE read permissions (#1237)

### Feature or Bugfix
- Bugfix

### Detail
- backfill DATASET_READ_TABLE permissions
- delete these permissions when dataset tables are revoked or deleted

### Relates
- #1173

### Security
Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/).
- Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? - Is the input sanitized? - What precautions are you taking before deserializing the data you consume? - Is injection prevented by parametrizing queries? - Have you ensured no `eval` or similar functions are used? - Does this PR introduce any functionality or component that requires authorization? - How have you ensured it respects the existing AuthN/AuthZ mechanisms? - Are you logging failed auth attempts? - Are you using or adding any cryptographic features? - Do you use a standard proven implementations? - Are the used keys controlled by the customer? Where are they stored? - Are you introducing any new policies/roles/users? - Have you used the least-privilege principle? How? By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. --------- Co-authored-by: Sofia Sazonova <[email protected]> commit 18e2f509 Author: Noah Paige <[email protected]> Date: Fri May 03 2024 10:14:52 GMT-0400 (Eastern Daylight Time) Fix local test groups listing for listGroups query (#1239) ### Feature or Bugfix <!-- please choose --> - Bugfix ### Detail - Locally when trying to invite a team to Env or Org we call listGroups and the returned `LOCAL_TEST_GROUPS` is not returning the proper data type expected ### Relates N/A ### Security Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/). - Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? - Is the input sanitized? - What precautions are you taking before deserializing the data you consume? - Is injection prevented by parametrizing queries? - Have you ensured no `eval` or similar functions are used? 
- Does this PR introduce any functionality or component that requires authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
- Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
- Do you use a standard proven implementations?
- Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
- Have you used the least-privilege principle? How?

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

commit a0be03c4
Author: dlpzx <[email protected]>
Date: Fri May 03 2024 10:12:34 GMT-0400 (Eastern Daylight Time)

Refactor: uncouple datasets and dataset_sharing modules - part 2-5 FINAL DELETE DATASETS_BASE (#1242)

### Feature or Bugfix
- Refactoring

### Detail
After all the previous PRs are merged, there should be no circular dependencies between `datasets` and `datasets_sharing`. We can now proceed to:
- move `datasets_base` models, repositories, permissions and enums to `datasets`
- adjust the `__init__` files to establish that `datasets_sharing` depends on `datasets`
- adjust the Module interfaces to ensure that all necessary dataset models... are imported in the interface for sharing

Next steps:
- share_notifications parameter to dataset_sharing in config.json

### Relates
#955

### Security
Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/).
- Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)?
- Is the input sanitized?
- What precautions are you taking before deserializing the data you consume?
- Is injection prevented by parametrizing queries?
- Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
- Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
- Do you use a standard proven implementations?
- Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
- Have you used the least-privilege principle? How?

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

commit b68b40c1
Author: Sofia Sazonova <[email protected]>
Date: Fri May 03 2024 10:12:11 GMT-0400 (Eastern Daylight Time)

bugfix: EnvironmentGroup can remove other groups (#1234)

### Feature or Bugfix
<!-- please choose -->
- Bugfix

### Detail
- Now, if a group cannot update other groups, it also cannot remove them.

### Relates
- #1212

### Security
Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/).
- Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)?
- Is the input sanitized?
- What precautions are you taking before deserializing the data you consume?
- Is injection prevented by parametrizing queries?
- Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
- Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
- Do you use a standard proven implementations?
- Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
- Have you used the least-privilege principle? How?

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

---------

Co-authored-by: Sofia Sazonova <[email protected]>

commit 264539b5
Author: Noah Paige <[email protected]>
Date: Fri May 03 2024 05:23:11 GMT-0400 (Eastern Daylight Time)

Fix Alembic Migration: has table checks (#1240)

### Feature or Bugfix
<!-- please choose -->
- Bugfix

### Detail
- Fix the `has_table()` check so that the tables are dropped, if they exist, as part of the Alembic migration upgrade
- Fix `DatasetLock nullable=True`

### Relates
- https://github.com/data-dot-all/dataall/issues/1165

### Security
Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/).
- Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? No
- Is the input sanitized? N/A
- What precautions are you taking before deserializing the data you consume? N/A
- Is injection prevented by parametrizing queries? N/A
- Have you ensured no `eval` or similar functions are used? N/A
- Does this PR introduce any functionality or component that requires authorization? No
- How have you ensured it respects the existing AuthN/AuthZ mechanisms? N/A
- Are you logging failed auth attempts? N/A
- Are you using or adding any cryptographic features? No
- Do you use a standard proven implementations? N/A
- Are the used keys controlled by the customer? Where are they stored? N/A
- Are you introducing any new policies/roles/users? No
- Have you used the least-privilege principle? How? N/A

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
commit 42a5f6bd Author: dlpzx <[email protected]> Date: Fri May 03 2024 02:24:09 GMT-0400 (Eastern Daylight Time) Refactor: uncouple datasets and dataset_sharing modules - part 2-4 (#1214) ### Feature or Bugfix - Refactoring ⚠️ MERGE AFTER https://github.com/data-dot-all/dataall/pull/1213 ### Detail This is needed as explained in full PR [AFTER 2.4] Refactor: uncouple datasets and dataset_sharing modules #1179 - [X] Use interface to resolve dataset roles related to datasets shared and implement logic in the dataset_sharing module - [X] Extend and clean-up stewards share permissions through interface ### Relates - #1179 ### Security Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/). - Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? - Is the input sanitized? - What precautions are you taking before deserializing the data you consume? - Is injection prevented by parametrizing queries? - Have you ensured no `eval` or similar functions are used? - Does this PR introduce any functionality or component that requires authorization? - How have you ensured it respects the existing AuthN/AuthZ mechanisms? - Are you logging failed auth attempts? - Are you using or adding any cryptographic features? - Do you use a standard proven implementations? - Are the used keys controlled by the customer? Where are they stored? - Are you introducing any new policies/roles/users? - Have you used the least-privilege principle? How? By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. 
commit 6d3f2d45
Author: Sofia Sazonova <[email protected]>
Date: Thu May 02 2024 10:55:00 GMT-0400 (Eastern Daylight Time)

[After 2.4]Core Refactoring part5 (#1194)

### Feature or Bugfix
- Refactoring

### Detail
- focus on core/environments
- move logic from resolvers to services
- create s3_client in base/aws --> TO BE REFACTORED. Needs to be merged with dataset_sharing/aws/s3_client

### Relates
- #741

### Security
Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/).
- Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)?
- Is the input sanitized?
- What precautions are you taking before deserializing the data you consume?
- Is injection prevented by parametrizing queries?
- Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
- Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
- Do you use a standard proven implementations?
- Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
- Have you used the least-privilege principle? How?

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
--------- Co-authored-by: Sofia Sazonova <[email protected]> commit 2ea24cbb Author: dlpzx <[email protected]> Date: Thu May 02 2024 08:22:12 GMT-0400 (Eastern Daylight Time) Refactor: uncouple datasets and dataset_sharing modules - part 2-3 (#1213) ### Feature or Bugfix - Refactoring ⚠️ MERGE AFTER https://github.com/data-dot-all/dataall/pull/1187 ### Detail This is needed as explained in full PR [AFTER 2.4] Refactor: uncouple datasets and dataset_sharing modules #1179 - [X] Creates an interface to execute checks and clean-ups of data sharing objects when dataset objects are deleted (initially it was going to be an db interface, but I think it is better in the service) - [X] Move listDatasetShares query to dataset_sharing module in https://github.com/data-dot-all/dataall/pull/1185 ### Relates - #1179 ### Security Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/). - Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? - Is the input sanitized? - What precautions are you taking before deserializing the data you consume? - Is injection prevented by parametrizing queries? - Have you ensured no `eval` or similar functions are used? - Does this PR introduce any functionality or component that requires authorization? - How have you ensured it respects the existing AuthN/AuthZ mechanisms? - Are you logging failed auth attempts? - Are you using or adding any cryptographic features? - Do you use a standard proven implementations? - Are the used keys controlled by the customer? Where are they stored? - Are you introducing any new policies/roles/users? - Have you used the least-privilege principle? How? By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. 
commit 750a5ec8
Author: Anushka Singh <[email protected]>
Date: Wed May 01 2024 12:28:18 GMT-0400 (Eastern Daylight Time)

Feature:1221 - Make visibility of auto-approval toggle configurable based on confidentiality (#1223)

### Feature or Bugfix
- Feature

### Detail
- Users should be able to disable the visibility of the auto-approval toggle in code. For example, at our company, we require that shares always go through the approval process if their confidentiality classification is Secret. We don't even want to give users the option to enable autoApproval, so that they cannot do so by mistake and end up oversharing.

Video demo: https://github.com/data-dot-all/dataall/issues/1221#issuecomment-2077412044

### Relates
- https://github.com/data-dot-all/dataall/issues/1221

### Security
Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/).
- Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)?
- Is the input sanitized?
- What precautions are you taking before deserializing the data you consume?
- Is injection prevented by parametrizing queries?
- Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
- Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
- Do you use a standard proven implementations?
- Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
- Have you used the least-privilege principle? How?

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
commit 82044689 Author: dlpzx <[email protected]> Date: Wed May 01 2024 12:26:42 GMT-0400 (Eastern Daylight Time) Refactor: uncouple datasets and dataset_sharing modules - part 2-2 (#1187) ### Feature or Bugfix - Refactoring ⚠️ MERGE AFTER https://github.com/data-dot-all/dataall/pull/1185 ### Detail This is needed as explained in full PR [AFTER 2.4] Refactor: uncouple datasets and dataset_sharing modules #1179 - Split the getDatasetAssumeRole API into 2 APIs, one for dataset owners role (in datasets module) and another one for share requester roles (in datasets_sharing module) ### Relates - #1179 ### Security Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/). - Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? - Is the input sanitized? - What precautions are you taking before deserializing the data you consume? - Is injection prevented by parametrizing queries? - Have you ensured no `eval` or similar functions are used? - Does this PR introduce any functionality or component that requires authorization? - How have you ensured it respects the existing AuthN/AuthZ mechanisms? - Are you logging failed auth attempts? - Are you using or adding any cryptographic features? - Do you use a standard proven implementations? - Are the used keys controlled by the customer? Where are they stored? - Are you introducing any new policies/roles/users? - Have you used the least-privilege principle? How? By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. 
commit 5173419f
Author: Noah Paige <[email protected]>
Date: Wed May 01 2024 12:24:42 GMT-0400 (Eastern Daylight Time)

Fix so listValidEnvironments called only once (#1238)

### Feature or Bugfix
<!-- please choose -->
- Bugfix

### Detail
- When requesting access to a share on data.all, the `listValidEnvironments` query used to be called twice, which (depending on how long the query results took to return) could cause the initially selected environment to disappear

### Relates
- Continuation of https://github.com/data-dot-all/dataall/issues/916

### Security
Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/).
- Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)?
- Is the input sanitized?
- What precautions are you taking before deserializing the data you consume?
- Is injection prevented by parametrizing queries?
- Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
- Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
- Do you use a standard proven implementations?
- Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
- Have you used the least-privilege principle? How?

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

commit 7656ea86
Author: dlpzx <[email protected]>
Date: Tue Apr 30 2024 07:13:01 GMT-0400 (Eastern Daylight Time)

Add integration tests on a real API client and integrate the tests in CICD (#1219)

### Feature or Bugfix
- Feature

### Detail
Add integration tests that use a real Client to execute different validation actions.
- Define the Client and the way API calls are posted to API Gateway in the conftest - Define the Cognito users and the different fixtures needed for all tests - Write tests for the Organization core module as example - Add feature flag in `cdk.json` called `with_approval_tests` that can be defined at the deployment environment level. If set to True, a CodeBuild stage running the tests is created. ### Relates - https://github.com/data-dot-all/dataall/issues/1220 ### Security Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/). - Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? - Is the input sanitized? - What precautions are you taking before deserializing the data you c…
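The short-page behaviour described in commit ed7cc3eb (Add order_by for paginated queries, #1249) in the log above can be reproduced in isolation. The sketch below is only an illustration of one plausible mechanism, not the data.all code: it uses Python's built-in `sqlite3` instead of SQLAlchemy, and the table names (`environment`, `environment_group`) and helper functions are hypothetical. It assumes that duplicate rows produced by the outer join are counted by the page limit and collapsed only afterwards.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create 11 environments; two of them belong to two groups each."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE environment (uri TEXT PRIMARY KEY);
        CREATE TABLE environment_group (env_uri TEXT, group_name TEXT);
        """
    )
    conn.executemany(
        "INSERT INTO environment VALUES (?)",
        [(f"env{i:02d}",) for i in range(1, 12)],
    )
    conn.executemany(
        "INSERT INTO environment_group VALUES (?, ?)",
        [("env01", "a"), ("env01", "b"), ("env02", "a"), ("env02", "b")],
    )
    return conn


# Hypothetical join: one output row per (environment, group) pair.
JOIN = (
    "FROM environment e "
    "LEFT OUTER JOIN environment_group g ON g.env_uri = e.uri"
)


def page_without_distinct(conn, page, page_size):
    # LIMIT counts joined rows, so duplicates consume page slots. They are
    # collapsed only afterwards, which leaves the page short.
    # ORDER BY is kept here only to make the demo deterministic.
    rows = conn.execute(
        f"SELECT e.uri {JOIN} ORDER BY e.uri LIMIT ? OFFSET ?",
        (page_size, (page - 1) * page_size),
    ).fetchall()
    return list(dict.fromkeys(uri for (uri,) in rows))


def page_with_distinct(conn, page, page_size):
    # DISTINCT collapses duplicates before LIMIT applies, and ORDER BY keeps
    # the page boundaries stable between requests.
    rows = conn.execute(
        f"SELECT DISTINCT e.uri {JOIN} ORDER BY e.uri LIMIT ? OFFSET ?",
        (page_size, (page - 1) * page_size),
    ).fetchall()
    return [uri for (uri,) in rows]


conn = build_demo_db()
short_page = page_without_distinct(conn, page=1, page_size=10)
full_page = page_with_distinct(conn, page=1, page_size=10)
last_page = page_with_distinct(conn, page=2, page_size=10)
print(len(short_page), len(full_page), last_page)
```

With 11 environments and a page size of 10, the first variant returns fewer than 10 items on page 1, while the `DISTINCT` plus `ORDER BY` variant fills the page and puts the remaining environment on page 2.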
noah-paige added a commit that referenced this issue Jun 25, 2024
commit 1617953c Author: Noah Paige <[email protected]> Date: Mon May 20 2024 16:48:17 GMT-0400 (Eastern Daylight Time) Add open dependency matrix commit d8497c55 Author: Noah Paige <[email protected]> Date: Mon May 20 2024 15:19:35 GMT-0400 (Eastern Daylight Time) fix commit 199ab505 Author: Admin/noahpaig-Isengard <Admin/noahpaig-Isengard> Date: Mon May 20 2024 15:16:14 GMT-0400 (Eastern Daylight Time) Conflicts resolved in the console. commit ad415575 Author: Noah Paige <[email protected]> Date: Mon May 20 2024 15:13:06 GMT-0400 (Eastern Daylight Time) Merge branch 'chatbot-test' into noah-main-2 commit 2893efc7 Author: Noah Paige <[email protected]> Date: Mon May 20 2024 15:10:20 GMT-0400 (Eastern Daylight Time) Merge branch 'chatbot-test' into noah-main-2 commit caad12e1 Author: Noah Paige <[email protected]> Date: Mon May 20 2024 15:12:47 GMT-0400 (Eastern Daylight Time) ruff commit a73b7110 Author: Noah Paige <[email protected]> Date: Mon May 20 2024 15:09:19 GMT-0400 (Eastern Daylight Time) Remove hardcoding commit 7b32a8f7 Author: Noah Paige <[email protected]> Date: Mon May 20 2024 15:07:29 GMT-0400 (Eastern Daylight Time) Chatbot POC commit e25e5815 Author: Noah Paige <[email protected]> Date: Mon May 20 2024 15:08:32 GMT-0400 (Eastern Daylight Time) Merge branch '1215-share-logs' into noah-main-2 commit a06c8cba Author: Noah Paige <[email protected]> Date: Fri May 17 2024 16:37:05 GMT-0400 (Eastern Daylight Time) Merge share logs PR commit a5626670 Author: Sofia Sazonova <[email protected]> Date: Fri May 17 2024 17:19:25 GMT-0400 (Eastern Daylight Time) make ruff happy commit aee98cf7 Author: Noah Paige <[email protected]> Date: Fri May 17 2024 16:34:24 GMT-0400 (Eastern Daylight Time) Merge share logs PR commit c3ee2f7c Author: Sofia Sazonova <[email protected]> Date: Fri May 17 2024 17:11:00 GMT-0400 (Eastern Daylight Time) PR comments commit 5ca55303 Author: Sofia Sazonova <[email protected]> Date: Wed May 15 2024 11:57:41 GMT-0400 (Eastern Daylight 
Time) remove unused imports commit 8f8bf3dd Author: Sofia Sazonova <[email protected]> Date: Wed May 15 2024 11:56:39 GMT-0400 (Eastern Daylight Time) restrict access to the share logs commit 9137da9b Author: Sofia Sazonova <[email protected]> Date: Wed May 15 2024 11:28:32 GMT-0400 (Eastern Daylight Time) share Logs button is available only for dataset Admins and stewards commit fcb16bd9 Author: Sofia Sazonova <[email protected]> Date: Wed May 15 2024 10:46:25 GMT-0400 (Eastern Daylight Time) getShareLogs query commit 0503a3bb Author: Sofia Sazonova <[email protected]> Date: Wed May 15 2024 10:21:25 GMT-0400 (Eastern Daylight Time) Logs modal in Share View commit bab2f3e6 Author: Sofia Sazonova <[email protected]> Date: Mon May 13 2024 09:09:18 GMT-0400 (Eastern Daylight Time) Add confirmation pop-ups for deletion of team roles and groups (#1231) ### Feature or Bugfix - Feature ### Detail Pop ups added for: - deletion team from environment - deletion of the consumption role - deletion of group from Organization ### Relates - #942
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. Co-authored-by: Sofia Sazonova <[email protected]> commit 93ff7725 Author: Sofia Sazonova <[email protected]> Date: Mon May 13 2024 08:00:38 GMT-0400 (Eastern Daylight Time) Update version.json (#1264) Release info update commit e718d861 Author: Sofia Sazonova <[email protected]> Date: Mon May 13 2024 07:29:27 GMT-0400 (Eastern Daylight Time) fix permission query (#1263) ### Feature or Bugfix - Bugfix ### Detail - The filter -- array of permissions' NAMES, so in order to query policies correctly we need to add join - The filter 'share_type' and 'share_item_status' must be string - IMPORTANT: in block "finally" the param session was used, but session was defined only in "try" block. So, the lock failed to be released. ### Relates - <URL or Ticket>
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. --------- Co-authored-by: Sofia Sazonova <[email protected]> commit 479b8f3f Author: mourya-33 <[email protected]> Date: Wed May 08 2024 10:29:36 GMT-0400 (Eastern Daylight Time) Add encryption and tag immutability to ECR repository (#1224) ### Feature or Bugfix - Bugfix ### Detail - Currently the ecr repository created do not have encryption and tag immutability enabled which is identified by checkov scans. This fix is to enable both. ### Relates [- <URL or Ticket>](https://github.com/data-dot-all/dataall/issues/1200) ### Security Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/). - Does this PR introduce or modify any input fields or queries - this includes N/A - Is the input sanitized? N/A - What precautions are you taking before deserializing the data you consume? N/A - Is injection prevented by parametrizing queries? N/A - Have you ensured no `eval` or similar functions are used? N/A - Does this PR introduce any functionality or component that requires authorization? N/A - How have you ensured it respects the existing AuthN/AuthZ mechanisms? N/A - Are you logging failed auth attempts? N/A - Are you using or adding any cryptographic features? N/A - Do you use a standard proven implementations? N/A - Are the used keys controlled by the customer? Where are they stored? No. This is with default encryption - Are you introducing any new policies/roles/users? N/A - Have you used the least-privilege principle? How? N/A By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. 
commit 2f885773 Author: Sofia Sazonova <[email protected]> Date: Wed May 08 2024 09:22:40 GMT-0400 (Eastern Daylight Time) Multiple permission roots (#1259) ### Feature or Bugfix - Bugfix ### Detail - GET_DATASET_TABLE (FOLDER) permissions are granted to the group only if they are not granted already - these permissions are removed if group is not admin|steward and there are no other shares of this item. ### Relates - #1174 By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
--------- Co-authored-by: Sofia Sazonova <[email protected]> commit c4cc07ee Author: Petros Kalos <[email protected]> Date: Wed May 08 2024 08:54:02 GMT-0400 (Eastern Daylight Time) explicitly specify dataset_client s3 endpoint_url (#1260) * AWS requires that the endpoint_url should be explicitly specified for some regions * Remove misleading CORS error message, the upload step can fail for many reason ### Feature or Bugfix - Bugfix ### Detail Resolves #778 By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
commit 40defe8e Author: dlpzx <[email protected]> Date: Tue May 07 2024 11:52:17 GMT-0400 (Eastern Daylight Time) Generic dataset module and specific s3_datasets module - part 1 (Rename datasets as s3_datasets) (#1250) ### Feature or Bugfix - Refactoring ### Detail - Rename `datasets` module to `s3_datasets` module This PR is the first step to extract a generic datasets_base module that implements the undifferentiated concepts of Dataset in data.all. s3_datasets will use this base module to implement the specific implementation for S3 datatasets. ### Relates - #1123 - #955 By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. commit 74a303cb Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue May 07 2024 02:26:09 GMT-0400 (Eastern Daylight Time) Bump werkzeug from 3.0.1 to 3.0.3 in /tests (#1253) Bumps [werkzeug](https://github.com/pallets/werkzeug) from 3.0.1 to 3.0.3.
<details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/pallets/werkzeug/releases">werkzeug's releases</a>.</em></p> <blockquote> <h2>3.0.3</h2> <p>This is the Werkzeug 3.0.3 security release, which fixes security issues and bugs but does not otherwise change behavior and should not result in breaking changes.</p> <p>PyPI: <a href="https://pypi.org/project/Werkzeug/3.0.3/">https://pypi.org/project/Werkzeug/3.0.3/</a> Changes: <a href="https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-3">https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-3</a> Milestone: <a href="https://github.com/pallets/werkzeug/milestone/35?closed=1">https://github.com/pallets/werkzeug/milestone/35?closed=1</a></p> <ul> <li>Only allow <code>localhost</code>, <code>.localhost</code>, <code>127.0.0.1</code>, or the specified hostname when running the dev server, to make debugger requests. Additional hosts can be added by using the debugger middleware directly. The debugger UI makes requests using the full URL rather than only the path. GHSA-2g68-c3qc-8985</li> <li>Make reloader more robust when <code>""</code> is in <code>sys.path</code>. <a href="https://redirect.github.com/pallets/werkzeug/issues/2823">#2823</a></li> <li>Better TLS cert format with <code>adhoc</code> dev certs. <a href="https://redirect.github.com/pallets/werkzeug/issues/2891">#2891</a></li> <li>Inform Python < 3.12 how to handle <code>itms-services</code> URIs correctly, rather than using an overly-broad workaround in Werkzeug that caused some redirect URIs to be passed on without encoding. <a href="https://redirect.github.com/pallets/werkzeug/issues/2828">#2828</a></li> <li>Type annotation for <code>Rule.endpoint</code> and other uses of <code>endpoint</code> is <code>Any</code>. 
<a href="https://redirect.github.com/pallets/werkzeug/issues/2836">#2836</a></li> </ul> <h2>3.0.2</h2> <p>This is a fix release for the 3.0.x feature branch.</p> <ul> <li>Changes: <a href="https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-2">https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-2</a></li> </ul> </blockquote> </details> <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/pallets/werkzeug/blob/main/CHANGES.rst">werkzeug's changelog</a>.</em></p> <blockquote> <h2>Version 3.0.3</h2> <p>Released 2024-05-05</p> <ul> <li> <p>Only allow <code>localhost</code>, <code>.localhost</code>, <code>127.0.0.1</code>, or the specified hostname when running the dev server, to make debugger requests. Additional hosts can be added by using the debugger middleware directly. The debugger UI makes requests using the full URL rather than only the path. :ghsa:<code>2g68-c3qc-8985</code></p> </li> <li> <p>Make reloader more robust when <code>""</code> is in <code>sys.path</code>. :pr:<code>2823</code></p> </li> <li> <p>Better TLS cert format with <code>adhoc</code> dev certs. :pr:<code>2891</code></p> </li> <li> <p>Inform Python < 3.12 how to handle <code>itms-services</code> URIs correctly, rather than using an overly-broad workaround in Werkzeug that caused some redirect URIs to be passed on without encoding. :issue:<code>2828</code></p> </li> <li> <p>Type annotation for <code>Rule.endpoint</code> and other uses of <code>endpoint</code> is <code>Any</code>. :issue:<code>2836</code></p> </li> <li> <p>Make reloader more robust when <code>""</code> is in <code>sys.path</code>. :pr:<code>2823</code></p> </li> </ul> <h2>Version 3.0.2</h2> <p>Released 2024-04-01</p> <ul> <li>Ensure setting <code>merge_slashes</code> to <code>False</code> results in <code>NotFound</code> for repeated-slash requests against single slash routes. 
:issue:<code>2834</code></li> <li>Fix handling of <code>TypeError</code> in <code>TypeConversionDict.get()</code> to match <code>ValueError</code>. :issue:<code>2843</code></li> <li>Fix <code>response_wrapper</code> type check in test client. :issue:<code>2831</code></li> <li>Make the return type of <code>MultiPartParser.parse</code> more precise. :issue:<code>2840</code></li> <li>Raise an error if converter arguments cannot be parsed. :issue:<code>2822</code></li> </ul> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/pallets/werkzeug/commit/f9995e967979eb694d6b31536cc65314fd7e9c8c"><code>f9995e9</code></a> release version 3.0.3</li> <li><a href="https://github.com/pallets/werkzeug/commit/3386395b24c7371db11a5b8eaac0c91da5362692"><code>3386395</code></a> Merge pull request from GHSA-2g68-c3qc-8985</li> <li><a href="https://github.com/pallets/werkzeug/commit/890b6b62634fa61224222aee31081c61b054ff01"><code>890b6b6</code></a> only require trusted host for evalex</li> <li><a href="https://github.com/pallets/werkzeug/commit/71b69dfb7df3d912e66bab87fbb1f21f83504967"><code>71b69df</code></a> restrict debugger trusted hosts</li> <li><a href="https://github.com/pallets/werkzeug/commit/d2d3869525a4ffb2c41dfb2c0e39d94dab2d870c"><code>d2d3869</code></a> endpoint type is Any (<a href="https://redirect.github.com/pallets/werkzeug/issues/2895">#2895</a>)</li> <li><a href="https://github.com/pallets/werkzeug/commit/7080b55acd48b68afdda65ee6c7f99e9afafb0ba"><code>7080b55</code></a> endpoint type is Any</li> <li><a href="https://github.com/pallets/werkzeug/commit/7555eff296fbdf12f2e576b6bbb0b506df8417ed"><code>7555eff</code></a> remove iri_to_uri redirect workaround (<a href="https://redirect.github.com/pallets/werkzeug/issues/2894">#2894</a>)</li> <li><a href="https://github.com/pallets/werkzeug/commit/97fb2f722297ae4e12e36dab024e0acf8477b3c8"><code>97fb2f7</code></a> remove _invalid_iri_to_uri workaround</li> <li><a 
href="https://github.com/pallets/werkzeug/commit/249527ff981e7aa22cd714825c5637cc92df7761"><code>249527f</code></a> make cn field a valid single hostname, and use wildcard in SANs field. (<a href="https://redirect.github.com/pallets/werkzeug/issues/2892">#2892</a>)</li> <li><a href="https://github.com/pallets/werkzeug/commit/793be472c9d145eb9be7d4200672d1806289d84a"><code>793be47</code></a> update adhoc tls dev cert format</li> <li>Additional commits viewable in <a href="https://github.com/pallets/werkzeug/compare/3.0.1...3.0.3">compare view</a></li> </ul> </details> <br /> [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=werkzeug&package-manager=pip&previous-version=3.0.1&new-version=3.0.3)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. [//]: # (dependabot-automerge-start) [//]: # (dependabot-automerge-end) --- <details> <summary>Dependabot commands and options</summary> <br /> You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot merge` will merge this PR after your CI passes on it - `@dependabot squash and merge` will squash and merge this PR after your CI passes on it - `@dependabot cancel merge` will cancel a previously requested merge and block automerging - `@dependabot reopen` will reopen this PR if it is closed - `@dependabot close` will close this PR and stop Dependabot recreating it. 
You can achieve the same result by closing it manually - `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency - `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself) You can disable automated security fix PRs for this repo from the [Security Alerts page](https://github.com/data-dot-all/dataall/network/alerts). </details> Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> commit 2f33320c Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue May 07 2024 02:25:03 GMT-0400 (Eastern Daylight Time) Bump werkzeug from 3.0.1 to 3.0.3 in /backend/dataall/base/cdkproxy (#1252) Bumps [werkzeug](https://github.com/pallets/werkzeug) from 3.0.1 to 3.0.3. 
Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> commit 0b49633f Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue May 07 2024 02:24:34 GMT-0400 (Eastern Daylight Time) Bump werkzeug from 3.0.1 to 3.0.3 in /tests_new/integration_tests (#1254) Bumps [werkzeug](https://github.com/pallets/werkzeug) from 3.0.1 to 3.0.3.
<details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/pallets/werkzeug/releases">werkzeug's releases</a>.</em></p> <blockquote> <h2>3.0.3</h2> <p>This is the Werkzeug 3.0.3 security release, which fixes security issues and bugs but does not otherwise change behavior and should not result in breaking changes.</p> <p>PyPI: <a href="https://pypi.org/project/Werkzeug/3.0.3/">https://pypi.org/project/Werkzeug/3.0.3/</a> Changes: <a href="https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-3">https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-3</a> Milestone: <a href="https://github.com/pallets/werkzeug/milestone/35?closed=1">https://github.com/pallets/werkzeug/milestone/35?closed=1</a></p> <ul> <li>Only allow <code>localhost</code>, <code>.localhost</code>, <code>127.0.0.1</code>, or the specified hostname when running the dev server, to make debugger requests. Additional hosts can be added by using the debugger middleware directly. The debugger UI makes requests using the full URL rather than only the path. GHSA-2g68-c3qc-8985</li> <li>Make reloader more robust when <code>""</code> is in <code>sys.path</code>. <a href="https://redirect.github.com/pallets/werkzeug/issues/2823">#2823</a></li> <li>Better TLS cert format with <code>adhoc</code> dev certs. <a href="https://redirect.github.com/pallets/werkzeug/issues/2891">#2891</a></li> <li>Inform Python < 3.12 how to handle <code>itms-services</code> URIs correctly, rather than using an overly-broad workaround in Werkzeug that caused some redirect URIs to be passed on without encoding. <a href="https://redirect.github.com/pallets/werkzeug/issues/2828">#2828</a></li> <li>Type annotation for <code>Rule.endpoint</code> and other uses of <code>endpoint</code> is <code>Any</code>. 
<a href="https://redirect.github.com/pallets/werkzeug/issues/2836">#2836</a></li> </ul> <h2>3.0.2</h2> <p>This is a fix release for the 3.0.x feature branch.</p> <ul> <li>Changes: <a href="https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-2">https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-2</a></li> </ul> </blockquote> </details> <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/pallets/werkzeug/blob/main/CHANGES.rst">werkzeug's changelog</a>.</em></p> <blockquote> <h2>Version 3.0.3</h2> <p>Released 2024-05-05</p> <ul> <li> <p>Only allow <code>localhost</code>, <code>.localhost</code>, <code>127.0.0.1</code>, or the specified hostname when running the dev server, to make debugger requests. Additional hosts can be added by using the debugger middleware directly. The debugger UI makes requests using the full URL rather than only the path. :ghsa:<code>2g68-c3qc-8985</code></p> </li> <li> <p>Make reloader more robust when <code>""</code> is in <code>sys.path</code>. :pr:<code>2823</code></p> </li> <li> <p>Better TLS cert format with <code>adhoc</code> dev certs. :pr:<code>2891</code></p> </li> <li> <p>Inform Python < 3.12 how to handle <code>itms-services</code> URIs correctly, rather than using an overly-broad workaround in Werkzeug that caused some redirect URIs to be passed on without encoding. :issue:<code>2828</code></p> </li> <li> <p>Type annotation for <code>Rule.endpoint</code> and other uses of <code>endpoint</code> is <code>Any</code>. :issue:<code>2836</code></p> </li> <li> <p>Make reloader more robust when <code>""</code> is in <code>sys.path</code>. :pr:<code>2823</code></p> </li> </ul> <h2>Version 3.0.2</h2> <p>Released 2024-04-01</p> <ul> <li>Ensure setting <code>merge_slashes</code> to <code>False</code> results in <code>NotFound</code> for repeated-slash requests against single slash routes. 
:issue:<code>2834</code></li> <li>Fix handling of <code>TypeError</code> in <code>TypeConversionDict.get()</code> to match <code>ValueError</code>. :issue:<code>2843</code></li> <li>Fix <code>response_wrapper</code> type check in test client. :issue:<code>2831</code></li> <li>Make the return type of <code>MultiPartParser.parse</code> more precise. :issue:<code>2840</code></li> <li>Raise an error if converter arguments cannot be parsed. :issue:<code>2822</code></li> </ul> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/pallets/werkzeug/commit/f9995e967979eb694d6b31536cc65314fd7e9c8c"><code>f9995e9</code></a> release version 3.0.3</li> <li><a href="https://github.com/pallets/werkzeug/commit/3386395b24c7371db11a5b8eaac0c91da5362692"><code>3386395</code></a> Merge pull request from GHSA-2g68-c3qc-8985</li> <li><a href="https://github.com/pallets/werkzeug/commit/890b6b62634fa61224222aee31081c61b054ff01"><code>890b6b6</code></a> only require trusted host for evalex</li> <li><a href="https://github.com/pallets/werkzeug/commit/71b69dfb7df3d912e66bab87fbb1f21f83504967"><code>71b69df</code></a> restrict debugger trusted hosts</li> <li><a href="https://github.com/pallets/werkzeug/commit/d2d3869525a4ffb2c41dfb2c0e39d94dab2d870c"><code>d2d3869</code></a> endpoint type is Any (<a href="https://redirect.github.com/pallets/werkzeug/issues/2895">#2895</a>)</li> <li><a href="https://github.com/pallets/werkzeug/commit/7080b55acd48b68afdda65ee6c7f99e9afafb0ba"><code>7080b55</code></a> endpoint type is Any</li> <li><a href="https://github.com/pallets/werkzeug/commit/7555eff296fbdf12f2e576b6bbb0b506df8417ed"><code>7555eff</code></a> remove iri_to_uri redirect workaround (<a href="https://redirect.github.com/pallets/werkzeug/issues/2894">#2894</a>)</li> <li><a href="https://github.com/pallets/werkzeug/commit/97fb2f722297ae4e12e36dab024e0acf8477b3c8"><code>97fb2f7</code></a> remove _invalid_iri_to_uri workaround</li> <li><a 
href="https://github.com/pallets/werkzeug/commit/249527ff981e7aa22cd714825c5637cc92df7761"><code>249527f</code></a> make cn field a valid single hostname, and use wildcard in SANs field. (<a href="https://redirect.github.com/pallets/werkzeug/issues/2892">#2892</a>)</li> <li><a href="https://github.com/pallets/werkzeug/commit/793be472c9d145eb9be7d4200672d1806289d84a"><code>793be47</code></a> update adhoc tls dev cert format</li> <li>Additional commits viewable in <a href="https://github.com/pallets/werkzeug/compare/3.0.1...3.0.3">compare view</a></li> </ul> </details> <br /> [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=werkzeug&package-manager=pip&previous-version=3.0.1&new-version=3.0.3)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. [//]: # (dependabot-automerge-start) [//]: # (dependabot-automerge-end) --- <details> <summary>Dependabot commands and options</summary> <br /> You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot merge` will merge this PR after your CI passes on it - `@dependabot squash and merge` will squash and merge this PR after your CI passes on it - `@dependabot cancel merge` will cancel a previously requested merge and block automerging - `@dependabot reopen` will reopen this PR if it is closed - `@dependabot close` will close this PR and stop Dependabot recreating it. 
You can achieve the same result by closing it manually - `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency - `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself) You can disable automated security fix PRs for this repo from the [Security Alerts page](https://github.com/data-dot-all/dataall/network/alerts). </details> Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> commit 08862420 Author: mourya-33 <[email protected]> Date: Tue May 07 2024 02:15:15 GMT-0400 (Eastern Daylight Time) Updated lambda_api.py to add encryption for lambda env vars for custo… (#1255) Feature or Bugfix Bugfix Detail The environment variables for the Lambda functions are not encrypted in the CDK code, which checkov scans flag. This fix enables KMS encryption for the Lambda environment variables. Relates Security Please answer the questions below briefly where applicable, or write N/A. Based on [OWASP 10](https://owasp.org/Top10/en/). Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? N/A Is the input sanitized? N/A What precautions are you taking before deserializing the data you consume? N/A Is injection prevented by parametrizing queries? N/A Have you ensured no eval or similar functions are used? 
N/A Does this PR introduce any functionality or component that requires authorization? N/A How have you ensured it respects the existing AuthN/AuthZ mechanisms? N/A Are you logging failed auth attempts? N/A Are you using or adding any cryptographic features? N/A Do you use a standard proven implementations? N/A Are the used keys controlled by the customer? Where are they stored? the KMS keys are generated by cdk and are used to encrypt the environment variables for all lambda functions in the lambda-api stack Are you introducing any new policies/roles/users? - N/A Have you used the least-privilege principle? How? N/A By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. commit ed7cc3eb Author: Noah Paige <[email protected]> Date: Mon May 06 2024 09:32:30 GMT-0400 (Eastern Daylight Time) Add order_by for paginated queries (#1249) ### Feature or Bugfix <!-- please choose --> - Bugfix ### Detail - This PR aims to solve the following: - (1) For particular queries (those that perform `.outerjoin()` operations and have their results paginated with the `paginate()` function), the number of returned results is sometimes *less than* the limit set by the pageSize of the paginate function, even when the total count is greater than the pageSize - Ex 1: 11 envs total, `query_user_environments()` returning 9 envs on 1st page + 2 on 2nd page - Ex 2: 10 envs total, `query_user_environments()` returning 9 envs on 1st page + no 2nd page - We believe this happens because of the way SQLAlchemy is "uniquing" the records resulting from an outerjoin before returning that result to the frontend - Adding a `.distinct()` check on the query ensures each distinct record is returned (tested successfully) - (2) Currently we often do not implement an `.order_by()` condition for the query used in `paginate()` and so have no stable way of preserving the order of the items returned from a query (i.e. 
when navigating through pages of the response) - It is generally good practice to include an `order_by()` on a column or set of columns - For each query used in `paginate()` this PR adds an `order_by()` condition (full list in comments below) More context is available in the related issue linked below ### Relates - https://github.com/data-dot-all/dataall/issues/1241 ### Security Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/). - Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? - Is the input sanitized? - What precautions are you taking before deserializing the data you consume? - Is injection prevented by parametrizing queries? - Have you ensured no `eval` or similar functions are used? - Does this PR introduce any functionality or component that requires authorization? - How have you ensured it respects the existing AuthN/AuthZ mechanisms? - Are you logging failed auth attempts? - Are you using or adding any cryptographic features? - Do you use a standard proven implementations? - Are the used keys controlled by the customer? Where are they stored? - Are you introducing any new policies/roles/users? - Have you used the least-privilege principle? How? By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. commit 98e67fa8 Author: Sofia Sazonova <[email protected]> Date: Fri May 03 2024 12:21:57 GMT-0400 (Eastern Daylight Time) fix: DATASET_READ_TABLE read permissions (#1237) ### Feature or Bugfix - Bugfix ### Detail - backfill DATASET_READ_TABLE permissions - delete these permissions when dataset tables are revoked or deleted ### Relates - #1173 ### Security Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/). 
- Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? - Is the input sanitized? - What precautions are you taking before deserializing the data you consume? - Is injection prevented by parametrizing queries? - Have you ensured no `eval` or similar functions are used? - Does this PR introduce any functionality or component that requires authorization? - How have you ensured it respects the existing AuthN/AuthZ mechanisms? - Are you logging failed auth attempts? - Are you using or adding any cryptographic features? - Do you use a standard proven implementations? - Are the used keys controlled by the customer? Where are they stored? - Are you introducing any new policies/roles/users? - Have you used the least-privilege principle? How? By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. --------- Co-authored-by: Sofia Sazonova <[email protected]> commit 18e2f509 Author: Noah Paige <[email protected]> Date: Fri May 03 2024 10:14:52 GMT-0400 (Eastern Daylight Time) Fix local test groups listing for listGroups query (#1239) ### Feature or Bugfix <!-- please choose --> - Bugfix ### Detail - Locally when trying to invite a team to Env or Org we call listGroups and the returned `LOCAL_TEST_GROUPS` is not returning the proper data type expected ### Relates N/A ### Security Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/). - Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? - Is the input sanitized? - What precautions are you taking before deserializing the data you consume? - Is injection prevented by parametrizing queries? - Have you ensured no `eval` or similar functions are used? 
- Does this PR introduce any functionality or component that requires authorization? - How have you ensured it respects the existing AuthN/AuthZ mechanisms? - Are you logging failed auth attempts? - Are you using or adding any cryptographic features? - Do you use a standard proven implementations? - Are the used keys controlled by the customer? Where are they stored? - Are you introducing any new policies/roles/users? - Have you used the least-privilege principle? How? By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. commit a0be03c4 Author: dlpzx <[email protected]> Date: Fri May 03 2024 10:12:34 GMT-0400 (Eastern Daylight Time) Refactor: uncouple datasets and dataset_sharing modules - part 2-5 FINAL DELETE DATASETS_BASE (#1242) ### Feature or Bugfix - Refactoring ### Detail After all the previous PRs are merged, there should be no circular dependencies between `datasets` and `datasets_sharing`. We can now proceed to: - move `datasets_base` models, repositories, permissions and enums to `datasets` - adjust the `__init__` files to establish that `datasets_sharing` depends on `datasets` - adjust the Module interfaces to ensure that all necessary dataset models... are imported in the interface for sharing Next steps: - share_notifications parameter to dataset_sharing in config.json ### Relates #955 ### Security Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/). - Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? - Is the input sanitized? - What precautions are you taking before deserializing the data you consume? - Is injection prevented by parametrizing queries? - Have you ensured no `eval` or similar functions are used? - Does this PR introduce any functionality or component that requires authorization? 
- How have you ensured it respects the existing AuthN/AuthZ mechanisms? - Are you logging failed auth attempts? - Are you using or adding any cryptographic features? - Do you use a standard proven implementations? - Are the used keys controlled by the customer? Where are they stored? - Are you introducing any new policies/roles/users? - Have you used the least-privilege principle? How? By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. commit b68b40c1 Author: Sofia Sazonova <[email protected]> Date: Fri May 03 2024 10:12:11 GMT-0400 (Eastern Daylight Time) bugfix: EnvironmentGroup can remove other groups (#1234) ### Feature or Bugfix <!-- please choose --> - Bugfix ### Detail - Now, if a group can't update other groups, it also cannot remove them. ### Relates - #1212 ### Security Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/). - Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? - Is the input sanitized? - What precautions are you taking before deserializing the data you consume? - Is injection prevented by parametrizing queries? - Have you ensured no `eval` or similar functions are used? - Does this PR introduce any functionality or component that requires authorization? - How have you ensured it respects the existing AuthN/AuthZ mechanisms? - Are you logging failed auth attempts? - Are you using or adding any cryptographic features? - Do you use a standard proven implementations? - Are the used keys controlled by the customer? Where are they stored? - Are you introducing any new policies/roles/users? - Have you used the least-privilege principle? How? By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. 
--------- Co-authored-by: Sofia Sazonova <[email protected]> commit 264539b5 Author: Noah Paige <[email protected]> Date: Fri May 03 2024 05:23:11 GMT-0400 (Eastern Daylight Time) Fix Alembic Migration: has table checks (#1240) ### Feature or Bugfix <!-- please choose --> - Bugfix ### Detail - Fix the `has_table()` check to ensure the tables are dropped if they exist as part of the Alembic migration upgrade - Fix `DatasetLock nullable=True` ### Relates - https://github.com/data-dot-all/dataall/issues/1165 ### Security Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/). - Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? No - Is the input sanitized? N/A - What precautions are you taking before deserializing the data you consume? N/A - Is injection prevented by parametrizing queries? N/A - Have you ensured no `eval` or similar functions are used? N/A - Does this PR introduce any functionality or component that requires authorization? No - How have you ensured it respects the existing AuthN/AuthZ mechanisms? N/A - Are you logging failed auth attempts? N/A - Are you using or adding any cryptographic features? No - Do you use a standard proven implementations? N/A - Are the used keys controlled by the customer? Where are they stored? N/A - Are you introducing any new policies/roles/users? No - Have you used the least-privilege principle? How? N/A By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. 
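The guard described in #1240 above (only drop a table when it is actually present) can be sketched as follows. This is a minimal illustration, not the migration code from the PR: it swaps in the standard-library `sqlite3` module for Alembic/SQLAlchemy so that it runs anywhere, and the helper names `has_table` and `drop_table_if_present` as well as the table name in the usage note are assumptions made for the example.

```python
import sqlite3


def has_table(conn: sqlite3.Connection, table_name: str) -> bool:
    """Return True if a table with this exact name exists (SQLite catalog lookup)."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table_name,),
    ).fetchone()
    return row is not None


def drop_table_if_present(conn: sqlite3.Connection, table_name: str) -> bool:
    """Drop the table only when it exists; report whether anything was dropped."""
    if not has_table(conn, table_name):
        return False
    # Identifiers cannot be bound as parameters, so quote the name explicitly.
    quoted = '"' + table_name.replace('"', '""') + '"'
    conn.execute(f"DROP TABLE {quoted}")
    return True
```

Calling `drop_table_if_present(conn, "dataset_lock")` twice in a row is safe under this sketch: the second call finds no table and returns `False` instead of raising, which is the property a re-runnable migration upgrade needs.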
commit 42a5f6bd Author: dlpzx <[email protected]> Date: Fri May 03 2024 02:24:09 GMT-0400 (Eastern Daylight Time) Refactor: uncouple datasets and dataset_sharing modules - part 2-4 (#1214) ### Feature or Bugfix - Refactoring ⚠️ MERGE AFTER https://github.com/data-dot-all/dataall/pull/1213 ### Detail This is needed as explained in full PR [AFTER 2.4] Refactor: uncouple datasets and dataset_sharing modules #1179 - [X] Use interface to resolve dataset roles related to datasets shared and implement logic in the dataset_sharing module - [X] Extend and clean-up stewards share permissions through interface ### Relates - #1179 ### Security Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/). - Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? - Is the input sanitized? - What precautions are you taking before deserializing the data you consume? - Is injection prevented by parametrizing queries? - Have you ensured no `eval` or similar functions are used? - Does this PR introduce any functionality or component that requires authorization? - How have you ensured it respects the existing AuthN/AuthZ mechanisms? - Are you logging failed auth attempts? - Are you using or adding any cryptographic features? - Do you use a standard proven implementations? - Are the used keys controlled by the customer? Where are they stored? - Are you introducing any new policies/roles/users? - Have you used the least-privilege principle? How? By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. 
commit 6d3f2d45 Author: Sofia Sazonova <[email protected]> Date: Thu May 02 2024 10:55:00 GMT-0400 (Eastern Daylight Time) [After 2.4]Core Refactoring part5 (#1194) ### Feature or Bugfix - Refactoring ### Detail - focus on core/environments - move logic from resolvers to services - create s3_client in base/aws --> TO BE REFACTORED. Needs to be merged with dataset_sharing/aws/s3_client ### Relates - #741 ### Security Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/). - Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? - Is the input sanitized? - What precautions are you taking before deserializing the data you consume? - Is injection prevented by parametrizing queries? - Have you ensured no `eval` or similar functions are used? - Does this PR introduce any functionality or component that requires authorization? - How have you ensured it respects the existing AuthN/AuthZ mechanisms? - Are you logging failed auth attempts? - Are you using or adding any cryptographic features? - Do you use a standard proven implementations? - Are the used keys controlled by the customer? Where are they stored? - Are you introducing any new policies/roles/users? - Have you used the least-privilege principle? How? By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. 
--------- Co-authored-by: Sofia Sazonova <[email protected]> commit 2ea24cbb Author: dlpzx <[email protected]> Date: Thu May 02 2024 08:22:12 GMT-0400 (Eastern Daylight Time) Refactor: uncouple datasets and dataset_sharing modules - part 2-3 (#1213) ### Feature or Bugfix - Refactoring ⚠️ MERGE AFTER https://github.com/data-dot-all/dataall/pull/1187 ### Detail This is needed as explained in the full PR [AFTER 2.4] Refactor: uncouple datasets and dataset_sharing modules #1179 - [X] Creates an interface to execute checks and clean-ups of data sharing objects when dataset objects are deleted (initially it was going to be a db interface, but I think it is better in the service) - [X] Move listDatasetShares query to dataset_sharing module in https://github.com/data-dot-all/dataall/pull/1185 ### Relates - #1179 ### Security Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/). - Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? - Is the input sanitized? - What precautions are you taking before deserializing the data you consume? - Is injection prevented by parametrizing queries? - Have you ensured no `eval` or similar functions are used? - Does this PR introduce any functionality or component that requires authorization? - How have you ensured it respects the existing AuthN/AuthZ mechanisms? - Are you logging failed auth attempts? - Are you using or adding any cryptographic features? - Do you use a standard proven implementations? - Are the used keys controlled by the customer? Where are they stored? - Are you introducing any new policies/roles/users? - Have you used the least-privilege principle? How? By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. 
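The interface idea in #1213 above (the datasets module runs checks and clean-ups of sharing objects on deletion without importing the sharing module) can be sketched roughly as below. Every class and method name here (`DatasetDeletionHook`, `check_before_delete`, `clean_up_after_delete`, `register_hook`, `SharingDeletionHook`) is hypothetical and chosen for illustration; the sketch does not reproduce the real data.all interface or its method signatures.

```python
from abc import ABC, abstractmethod


class DatasetDeletionHook(ABC):
    """Hypothetical interface owned by the datasets module."""

    @abstractmethod
    def check_before_delete(self, dataset_uri: str) -> None:
        """Raise if the dataset must not be deleted yet."""

    @abstractmethod
    def clean_up_after_delete(self, dataset_uri: str) -> None:
        """Remove dependent objects once the dataset is gone."""


class DatasetService:
    """Knows only the interface, never the sharing module itself."""

    def __init__(self):
        self._hooks = []
        self.datasets = set()

    def register_hook(self, hook: DatasetDeletionHook) -> None:
        self._hooks.append(hook)

    def delete_dataset(self, dataset_uri: str) -> None:
        # Run every check first so a failing check leaves the dataset untouched.
        for hook in self._hooks:
            hook.check_before_delete(dataset_uri)
        self.datasets.discard(dataset_uri)
        for hook in self._hooks:
            hook.clean_up_after_delete(dataset_uri)


class SharingDeletionHook(DatasetDeletionHook):
    """Hypothetical implementation that would live in the sharing module."""

    def __init__(self):
        self.active_shares = {}  # dataset_uri -> number of active shares
        self.cleaned = []        # dataset URIs whose share objects were removed

    def check_before_delete(self, dataset_uri: str) -> None:
        if self.active_shares.get(dataset_uri, 0) > 0:
            raise RuntimeError(f"{dataset_uri} still has active shares")

    def clean_up_after_delete(self, dataset_uri: str) -> None:
        self.active_shares.pop(dataset_uri, None)
        self.cleaned.append(dataset_uri)
```

With this arrangement the dependency points one way only: the sharing side registers itself with the dataset service at load time, so the datasets side can be loaded and tested without the sharing module present.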
commit 750a5ec8 Author: Anushka Singh <[email protected]> Date: Wed May 01 2024 12:28:18 GMT-0400 (Eastern Daylight Time) Feature:1221 - Make visibility of auto-approval toggle configurable based on confidentiality (#1223) ### Feature or Bugfix - Feature ### Detail - Users should be able to disable the visibility of the auto-approval toggle through code. For example, at our company, we require that shares always go through the approval process if their confidentiality classification is Secret. We don't even want to give users the option to enable autoApproval, to ensure they don't do so by mistake and end up oversharing. Video demo: https://github.com/data-dot-all/dataall/issues/1221#issuecomment-2077412044 ### Relates - https://github.com/data-dot-all/dataall/issues/1221 ### Security Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/). - Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? - Is the input sanitized? - What precautions are you taking before deserializing the data you consume? - Is injection prevented by parametrizing queries? - Have you ensured no `eval` or similar functions are used? - Does this PR introduce any functionality or component that requires authorization? - How have you ensured it respects the existing AuthN/AuthZ mechanisms? - Are you logging failed auth attempts? - Are you using or adding any cryptographic features? - Do you use a standard proven implementations? - Are the used keys controlled by the customer? Where are they stored? - Are you introducing any new policies/roles/users? - Have you used the least-privilege principle? How? By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. 
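The rule requested in #1223 above (hide the auto-approval toggle for certain confidentiality levels) comes down to a small predicate. The sketch below is an assumption-laden illustration in Python, wherever the actual check lives in data.all: the config key `auto_approval_hidden_for`, the function name, and the level `Official` used in the usage note are all hypothetical; only the `Secret` example comes from the PR description.

```python
# Hypothetical config shape; the key name is illustrative only.
DEFAULT_CONFIG = {"auto_approval_hidden_for": ["Secret"]}


def is_auto_approval_toggle_visible(confidentiality: str, config: dict = DEFAULT_CONFIG) -> bool:
    """Return False when the dataset's confidentiality level is configured as hidden."""
    hidden = {level.lower() for level in config.get("auto_approval_hidden_for", [])}
    # Compare case-insensitively so "SECRET" and "Secret" behave the same.
    return confidentiality.lower() not in hidden
```

Under this sketch, `is_auto_approval_toggle_visible("Secret")` is false and `is_auto_approval_toggle_visible("Official")` is true; an empty config hides nothing, so deployments that do not set the key keep the toggle visible for every level.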
commit 82044689 Author: dlpzx <[email protected]> Date: Wed May 01 2024 12:26:42 GMT-0400 (Eastern Daylight Time) Refactor: uncouple datasets and dataset_sharing modules - part 2-2 (#1187) ### Feature or Bugfix - Refactoring ⚠️ MERGE AFTER https://github.com/data-dot-all/dataall/pull/1185 ### Detail This is needed as explained in full PR [AFTER 2.4] Refactor: uncouple datasets and dataset_sharing modules #1179 - Split the getDatasetAssumeRole API into 2 APIs, one for dataset owners role (in datasets module) and another one for share requester roles (in datasets_sharing module) ### Relates - #1179 ### Security Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/). - Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? - Is the input sanitized? - What precautions are you taking before deserializing the data you consume? - Is injection prevented by parametrizing queries? - Have you ensured no `eval` or similar functions are used? - Does this PR introduce any functionality or component that requires authorization? - How have you ensured it respects the existing AuthN/AuthZ mechanisms? - Are you logging failed auth attempts? - Are you using or adding any cryptographic features? - Do you use a standard proven implementations? - Are the used keys controlled by the customer? Where are they stored? - Are you introducing any new policies/roles/users? - Have you used the least-privilege principle? How? By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. 
commit 5173419f Author: Noah Paige <[email protected]> Date: Wed May 01 2024 12:24:42 GMT-0400 (Eastern Daylight Time) Fix so listValidEnvironments called only once (#1238) ### Feature or Bugfix <!-- please choose --> - Bugfix ### Detail - When requesting access to a share on data.all, the `listValidEnvironments` query used to be called twice, which (depending on how long the query results took to return) could cause the initially selected environment to disappear ### Relates - Continuation of https://github.com/data-dot-all/dataall/issues/916 ### Security Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/). - Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? - Is the input sanitized? - What precautions are you taking before deserializing the data you consume? - Is injection prevented by parametrizing queries? - Have you ensured no `eval` or similar functions are used? - Does this PR introduce any functionality or component that requires authorization? - How have you ensured it respects the existing AuthN/AuthZ mechanisms? - Are you logging failed auth attempts? - Are you using or adding any cryptographic features? - Do you use a standard proven implementations? - Are the used keys controlled by the customer? Where are they stored? - Are you introducing any new policies/roles/users? - Have you used the least-privilege principle? How? By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. commit 7656ea86 Author: dlpzx <[email protected]> Date: Tue Apr 30 2024 07:13:01 GMT-0400 (Eastern Daylight Time) Add integration tests on a real API client and integrate the tests in CICD (#1219) ### Feature or Bugfix - Feature ### Detail Add integration tests that use a real Client to execute different validation actions. 
- Define the Client and the way API calls are posted to API Gateway in the conftest - Define the Cognito users and the different fixtures needed for all tests - Write tests for the Organization core module as example - Add feature flag in `cdk.json` called `with_approval_tests` that can be defined at the deployment environment level. If set to True, a CodeBuild stage running the tests is created. ### Relates - https://github.com/data-dot-all/dataall/issues/1220 ### Security Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/). - Does this PR introduce or modify any input fields or queries - this includes fetching data from storage …
dlpzx added a commit that referenced this issue on Jun 27, 2024
…art 11 (renaming and cleaning up s3_shares) (#1359) ### Feature or Bugfix - Refactoring ### Detail As explained in the design for #1123 and #1283 we are trying to implement generic `datasets_base` and `shares_base` modules that can be used by any type of dataset and by any type of shareable object in a generic way. This is one of the last PRs focused on renaming files and cleaning up the s3_datasets_shares module. The first step is a consolidation of the file and class names in the services to clearly refer to s3_shares: - `services.managed_share_policy_service.SharePolicyService` --> `services.s3_share_managed_policy_service.S3SharePolicyService` - `services.dataset_sharing_alarm_service.DatasetSharingAlarmService` --> `services.s3_share_alarm_service.S3ShareAlarmService` 👀 The main refactoring happens in what used to be `services.dataset_sharing_service`. - The part that implements the `DatasetServiceInterface` has been moved to `services/s3_share_dataset_service.py` as the `S3ShareDatasetService` - The part used in the resolvers and by other methods has been renamed as `services.s3_share_service.py` and the methods for the folder/table permissions are also added to the S3ShareService (from share_item_service) Lastly, there is one method previously in share_item_service that has been moved to the GlueClient directly as `get_glue_database_from_catalog`. ### Relates - #1283 - #1123 - #955 ### Security Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/). - Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? - Is the input sanitized? - What precautions are you taking before deserializing the data you consume? 
- Is injection prevented by parametrizing queries? - Have you ensured no `eval` or similar functions are used? - Does this PR introduce any functionality or component that requires authorization? - How have you ensured it respects the existing AuthN/AuthZ mechanisms? - Are you logging failed auth attempts? - Are you using or adding any cryptographic features? - Do you use a standard proven implementations? - Are the used keys controlled by the customer? Where are they stored? - Are you introducing any new policies/roles/users? - Have you used the least-privilege principle? How? By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
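To illustrate the split between a generic base module and a type-specific implementation of `DatasetServiceInterface`, here is a minimal sketch. Only the names `DatasetServiceInterface` and `S3ShareDatasetService` come from the PR description; the method `check_before_delete`, the `DatasetBaseService` registry, and the constructor argument are assumptions made up for this example and are not taken from the data.all code base.

```python
from abc import ABC, abstractmethod
from typing import List, Set


class DatasetServiceInterface(ABC):
    """Simplified stand-in for the interface named in the PR (method is hypothetical)."""

    @abstractmethod
    def check_before_delete(self, dataset_uri: str) -> bool:
        """Return True if this module does not object to deleting the dataset."""


class DatasetBaseService:
    """Hypothetical generic service: it knows nothing about S3 or sharing."""

    _interfaces: List[DatasetServiceInterface] = []

    @classmethod
    def register(cls, interface: DatasetServiceInterface) -> None:
        # Type-specific modules plug themselves in when they are loaded.
        cls._interfaces.append(interface)

    @classmethod
    def can_delete(cls, dataset_uri: str) -> bool:
        # Deletion is allowed only if every registered module agrees.
        return all(i.check_before_delete(dataset_uri) for i in cls._interfaces)


class S3ShareDatasetService(DatasetServiceInterface):
    """Sharing-specific implementation (internals are hypothetical)."""

    def __init__(self, shared_dataset_uris: Set[str]) -> None:
        self._shared = set(shared_dataset_uris)

    def check_before_delete(self, dataset_uri: str) -> bool:
        # A dataset that still has shares must not be deleted.
        return dataset_uri not in self._shared


# The sharing module registers itself with the generic base module.
DatasetBaseService.register(S3ShareDatasetService({"dataset-1"}))
```

With this shape the base module never imports the sharing module; it only calls whatever implementations have been registered, which is the kind of decoupling the renaming in this PR prepares for.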
noah-paige
added a commit
that referenced
this issue
Aug 30, 2024
commit 22a6f6ef
Author: Noah Paige
Date: Mon Jul 08 2024 11:28:07 GMT-0400 (Eastern Daylight Time)

Add integ tests

commit 4fb7d653
Author: Noah Paige
Date: Mon Jul 08 2024 11:26:36 GMT-0400 (Eastern Daylight Time)

Merge env test changes

commit 4cf42e8
Author: Petros Kalos
Date: Fri Jul 05 2024 08:19:34 GMT-0400 (Eastern Daylight Time)

improve docs

commit 65f930a
Author: Petros Kalos
Date: Fri Jul 05 2024 08:10:56 GMT-0400 (Eastern Daylight Time)

fix failures

commit 170b7ce
Author: Petros Kalos
Date: Wed Jul 03 2024 10:52:20 GMT-0400 (Eastern Daylight Time)

add group/consumption_role invite/remove tests

commit ba77d69
Author: dlpzx
Date: Wed Jul 03 2024 06:51:47 GMT-0400 (Eastern Daylight Time)

Rename alias for env_vars kms key in cognito lambdas FE and BE (#1385)

### Feature or Bugfix
- Bugfix

### Detail
For the case in which we deploy FE and BE in us-east-1, the new lambda env_key alias is the same for TriggerFunctionCognitoUrlsConfig in FE and for TriggerFunctionCognitoConfig in BE, which results in a failure of the CICD in the FE stack because the alias already exists. This PR changes the name of both aliases to avoid this conflict. It also adds envname to avoid issues with other deployment environments/tooling account in the future.

commit e5923a9
Author: dlpzx
Date: Wed Jul 03 2024 04:27:11 GMT-0400 (Eastern Daylight Time)

Fix lambda_env_key out of scope for vpc-facing cognito setup (#1384)

### Feature or Bugfix
- Bugfix

### Detail
The KMS key for the Lambda environment variables in the Cognito IdP stack was defined inside an if-clause for the internet-facing frontend. Outside of that if, for the vpc-facing architecture, the KMS key does not exist and the CICD pipeline fails. This PR moves the creation of the KMS key outside of the if.

commit 3ccacfc
Author: Noah Paige
Date: Mon Jul 01 2024 13:56:58 GMT-0400 (Eastern Daylight Time)

Add delete docs not found when re indexing in catalog task (#1365)

### Feature or Bugfix
- Feature

### Detail
- Add logic to Catalog Indexer Task to Delete Docs No Longer in RDS
- TODO: Add Ability to Re-index Catalog Items via Dataall Admin UI

### Relates
- #1078

commit e2817a1
Author: Noah Paige
Date: Mon Jul 01 2024 05:14:07 GMT-0400 (Eastern Daylight Time)

Fix/glossary status (#1373)

### Feature or Bugfix
- Bugfix

### Detail
- Add back `status` to Glossary GQL Object for GQL Operations (getGlossary, listGlossaries)
- Fix `listOrganizationGroupPermissions` enforce non null on FE

commit c3c58bd
Author: Petros Kalos
Date: Fri Jun 28 2024 06:55:42 GMT-0400 (Eastern Daylight Time)

add environment tests (#1371)

### Feature or Bugfix
- Feature

### Detail
* add list_environment tests
* add test for updating an environment (via update_stack)
* generalise the polling functions for stacks

### Relates
- #1220

commit e913d48
Author: dlpzx
Date: Fri Jun 28 2024 04:15:49 GMT-0400 (Eastern Daylight Time)

Add search (Autocomplete) in miscellaneous dropdowns (#1367)

### Feature or Bugfix
- Feature

### Detail
Autocomplete for environments and teams in the following frontend views, as requested in #1012. In this case the views required custom dropdowns.

❗ I used `noOptionsText` whenever it was necessary instead of checking that groupOptions length > 0.

- [x] DatasetEditForm.js -> ❗ I kept the stewards field as `freesolo`, which means that users CAN specify options that are not on the list. I would like the reviewer to confirm this is what we want. In the end, stewardship is a delegation of permissions, so it makes sense that delegation happens to other teams. Also changed DatasetCreateForm.
- [x] RequestDashboardAccessModal.js - already implemented, minor changes
- [x] EnvironmentTeamInviteForm.js - already implemented, minor changes. -> Kept `freesolo` because invited teams might not be the user's teams. For the same reason there is no check for groupOptions == 0: if there are no options, there is still the free-text option.
- [x] EnvironmentRoleAddForm.js
- [x] NetworkCreateModal.js

### Relates
- #1012

commit ee71d7b
Author: Tejas Rajopadhye
Date: Thu Jun 27 2024 14:08:27 GMT-0400 (Eastern Daylight Time)

[Gh 1301] Enhancement Feature - Bulk share reapply on dataset (#1363)

### Feature or Bugfix
- Feature

### Detail
- Adds feature to reapply shares in bulk for a dataset.
- Also contains bugfix for AWS worker lambda errors

### Relates
- #1301
- #1364

### Security
- Input fields or queries: N/A
- Authorization: N/A
- Cryptographic features: N/A
- New policies/roles/users: N/A

Co-authored-by: trajopadhye

commit 27f1ad7
Author: Noah Paige
Date: Thu Jun 27 2024 13:18:32 GMT-0400 (Eastern Daylight Time)

Convert Dataset Lock Mechanism to Generic Resource Lock (#1338)

### Feature or Bugfix
- Feature
- Bugfix
- Refactoring

### Detail
- Convert Dataset Lock Mechanism to Generic Resource Lock
- Extend locking to Share principals (i.e. EnvironmentGroup and Consumption Roles)
- Making locking a generic component not tied to datasets

### Relates
- #1093

Co-authored-by: dlpzx

commit e3b8658
Author: Petros Kalos
Date: Thu Jun 27 2024 12:50:59 GMT-0400 (Eastern Daylight Time)

ignore ruff change in blame (#1372)

commit 2e80de4
Author: dlpzx
Date: Thu Jun 27 2024 10:59:18 GMT-0400 (Eastern Daylight Time)

Generic shares_base module and specific s3_datasets_shares module - part 11 (renaming and cleaning up s3_shares) (#1359)

### Feature or Bugfix
- Refactoring

### Detail
As explained in the design for #1123 and #1283, we are trying to implement generic `datasets_base` and `shares_base` modules that can be used by any type of dataset and by any type of shareable object. This is one of the last PRs focused on renaming files and cleaning up the s3_datasets_shares module.

The first step is a consolidation of the file and class names in the services so that they clearly refer to s3_shares:
- `services.managed_share_policy_service.SharePolicyService` --> `services.s3_share_managed_policy_service.S3SharePolicyService`
- `services.dataset_sharing_alarm_service.DatasetSharingAlarmService` --> `services.s3_share_alarm_service.S3ShareAlarmService`

👀 The main refactoring happens in what used to be `services.dataset_sharing_service`:
- The part that implements the `DatasetServiceInterface` has been moved to `services/s3_share_dataset_service.py` as the `S3ShareDatasetService`.
- The part used in the resolvers and by other methods has been renamed as `services.s3_share_service.py`, and the methods for the folder/table permissions (from share_item_service) have also been added to the `S3ShareService`.

Lastly, one method previously in share_item_service has been moved directly to the GlueClient as `get_glue_database_from_catalog`.

### Relates
- #1283
- #1123
- #955

commit 1c09015
Author: Noah Paige
Date: Thu Jun 27 2024 04:16:14 GMT-0400 (Eastern Daylight Time)

fix listOrganizationGroupPermissions (#1369)

### Feature or Bugfix
- Bugfix

### Detail
- Fix listOrganizationGroupPermissions

commit 976ec6b
Author: dlpzx
Date: Thu Jun 27 2024 04:13:14 GMT-0400 (Eastern Daylight Time)

Add search (Autocomplete) in create pipelines (#1368)

### Feature or Bugfix
- Feature

### Detail
Autocomplete for environments and teams in the frontend views, as requested in #1012. This PR implements it for createPipelines.

### Relates
- #1012

commit 6c909a3
Author: Noah Paige
Date: Wed Jun 26 2024 11:18:04 GMT-0400 (Eastern Daylight Time)

fix migration to not rely on OrganizationService or RequestContext (#1361)

### Feature or Bugfix
- Bugfix

### Detail
- Ensure the migration script does not need RequestContext; otherwise it fails in the migration trigger lambda because the context info is not set / available

### Relates
- #1306

commit 90835fb
Author: Anushka Singh
Date: Wed Jun 26 2024 11:17:22 GMT-0400 (Eastern Daylight Time)

Issue1248: Persistent Email Reminders (#1354)

### Feature or Bugfix
- Feature

### Detail
- When a share request is initiated and remains pending for an extended period, dataset producers will receive automated email reminders at predefined intervals. These reminders will prompt producers to either approve or extend the share request, thereby preventing delays in accessing datasets.

Attaching screenshots for emails:
<img width="1336" alt="Screenshot 2024-06-20 at 5 34 31 PM" src="https://github.com/data-dot-all/dataall/assets/26413731/d7be28c3-5c98-4146-92b1-295e136137a3">
<img width="1322" alt="Screenshot 2024-06-20 at 5 34 52 PM" src="https://github.com/data-dot-all/dataall/assets/26413731/047556e8-59ee-4ebf-b8a7-c0a6684e2a63">

- Email will be sent every Monday at 9am UTC. The schedule can be changed in the cron expression in container.py

### Relates
- #1248

Signed-off-by: dependabot[bot]
Co-authored-by: Anushka Singh
Co-authored-by: trajopadhye
Co-authored-by: Mohit Arora
Co-authored-by: rbernota
Co-authored-by: Rick Bernotas
Co-authored-by: Raj Chopde
Co-authored-by: Noah Paige
Co-authored-by: dlpzx
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: jaidisido
Co-authored-by: mourya-33
Co-authored-by: nikpodsh
Co-authored-by: MK
Co-authored-by: Manjula
Co-authored-by: Zilvinas Saltys
Co-authored-by: Daniel Lorch
Co-authored-by: Tejas Rajopadhye
Co-authored-by: Sofia Sazonova

commit e477bdf
Author: Noah Paige
Date: Wed Jun 26 2024 10:39:09 GMT-0400 (Eastern Daylight Time)

Enforce non null on GQL query string if non null defined (#1362)

### Feature or Bugfix
- Bugfix

### Detail
- Add `String!` to ensure a non-null input argument on FE if defined as such on the backend GQL operation for `listS3DatasetsSharedWithEnvGroup`

commit d6b59b3
Author: Noah Paige
Date: Wed Jun 26 2024 08:48:52 GMT-0400 (Eastern Daylight Time)

Fix Init Share Base (#1360)

### Feature or Bugfix
- Bugfix

### Detail
- Need to register processors in init for s3 dataset shares API module

commit bd3698c
Author: Petros Kalos
Date: Wed Jun 26 2024 05:19:14 GMT-0400 (Eastern Daylight Time)

split cognito urls setup and cognito user creation (#1366)

### Feature or Bugfix
- Bugfix

### Details
For more details about the issue, read #1353. In this PR we are solving the problem by splitting the configuration of Cognito in 2:
* The first part (cognito_users_config.py) sets up the required groups and users and runs after the UserPool deployment
* The second part (cognito_urls_config.py) sets up Cognito's callback/logout urls and runs after the CloudFront deployment

We chose to split the functionality because we need to have the users/groups set up for the integration tests, which are run after the backend deployment. The other alternative is to keep the config functionality as one but make the integ tests run after the CloudFront stage.

### Relates
- Solves #1353
noah-paige
added a commit
that referenced
this issue
Aug 30, 2024
commit 4425e756
Author: Noah Paige <[email protected]>
Date: Mon Jul 08 2024 11:57:31 GMT-0400 (Eastern Daylight Time)
Fix

commit 4cd2bf77
Author: Noah Paige <[email protected]>
Date: Mon Jul 08 2024 11:56:38 GMT-0400 (Eastern Daylight Time)
Fix

commit 22a6f6ef
Author: Noah Paige <[email protected]>
Date: Mon Jul 08 2024 11:28:07 GMT-0400 (Eastern Daylight Time)
Add integ tests

commit 4fb7d653
Author: Noah Paige <[email protected]>
Date: Mon Jul 08 2024 11:26:36 GMT-0400 (Eastern Daylight Time)
Merge env test changes

commit 4cf42e8
Author: Petros Kalos <[email protected]>
Date: Fri Jul 05 2024 08:19:34 GMT-0400 (Eastern Daylight Time)
improve docs

commit 65f930a
Author: Petros Kalos <[email protected]>
Date: Fri Jul 05 2024 08:10:56 GMT-0400 (Eastern Daylight Time)
fix failures

commit 170b7ce
Author: Petros Kalos <[email protected]>
Date: Wed Jul 03 2024 10:52:20 GMT-0400 (Eastern Daylight Time)
add group/consumption_role invite/remove tests

commit ba77d69
Author: dlpzx <[email protected]>
Date: Wed Jul 03 2024 06:51:47 GMT-0400 (Eastern Daylight Time)
Rename alias for env_vars kms key in cognito lambdas FE and BE (#1385)
### Feature or Bugfix
- Bugfix
### Detail
For the case in which we deploy FE and BE in us-east-1 the new lambda env_key alias is the same one for TriggerFunctionCognitoUrlsConfig in FE and for TriggerFunctionCognitoConfig in BE, which results in a failure of the CICD in the FE stack because the alias already exists. This PR changes the name of both aliases to avoid this conflict. It also adds envname to avoid issues with other deployment environments/tooling account in the future

commit e5923a9
Author: dlpzx <[email protected]>
Date: Wed Jul 03 2024 04:27:11 GMT-0400 (Eastern Daylight Time)
Fix lambda_env_key out of scope for vpc-facing cognito setup (#1384)
### Feature or Bugfix
- Bugfix
### Detail
The KMS key for the Lambda environment variables in the Cognito IdP stack was defined inside an if-clause for internet facing frontend. Outside of that if, for vpc-facing architecture the kms key does not exist and the CICD pipeline fails. This PRs move the creation of the KMS key outside of the if.

commit 3ccacfc
Author: Noah Paige <[email protected]>
Date: Mon Jul 01 2024 13:56:58 GMT-0400 (Eastern Daylight Time)
Add delete docs not found when re indexing in catalog task (#1365)
### Feature or Bugfix
- Feature
### Detail
- Add logic to Catalog Indexer Task to Delete Docs No Longer in RDS
- TODO: Add Ability to Re-index Catalog Items via Dataall Admin UI
### Relates
- #1078

commit e2817a1
Author: Noah Paige <[email protected]>
Date: Mon Jul 01 2024 05:14:07 GMT-0400 (Eastern Daylight Time)
Fix/glossary status (#1373)
### Feature or Bugfix
- Bugfix
### Detail
- Add back `status` to Glossary GQL Object for GQL Operations (getGlossary, listGlossaries)
- Fix `listOrganizationGroupPermissions` enforce non null on FE

commit c3c58bd
Author: Petros Kalos <[email protected]>
Date: Fri Jun 28 2024 06:55:42 GMT-0400 (Eastern Daylight Time)
add environment tests (#1371)
### Feature or Bugfix
Feature
### Detail
* add list_environment tests
* add test for updating an environment (via update_stack)
* generalise the polling functions for stacks
### Relates
#1220

commit e913d48
Author: dlpzx <[email protected]>
Date: Fri Jun 28 2024 04:15:49 GMT-0400 (Eastern Daylight Time)
Add search (Autocomplete) in miscellaneous dropdowns (#1367)
### Feature or Bugfix
- Feature
### Detail
Autocomplete for environments and teams in the following frontend views as requested in #1012. In this case the views required custom dropdowns. ❗ I used `noOptionsText` whenever it was necessary instead of checking groupOptions lenght >0
- [x] DatasetEditForm.js -> ❗ I kept the stewards field as `freesolo` - what that means is that users CAN specify options that are not on the list. I would like the reviewer to confirm this is what we want. At the end stewardship is a delegation of permissions, it makes sense that delegation happens to other teams. Also changed DatasetCreateForm
- [X] RequestDashboardAccessModal.js - already implemented, minor changes
- [X] EnvironmentTeamInviteForm.js - already implemented, minor changes. -> Kept `freesolo` because invited teams might not be the user teams. Same reason why there is no check for groupOptions == 0, if there are no options there is still the free text option.
- [X] EnvironmentRoleAddForm.js
- [X] NetworkCreateModal.js
### Relates
- #1012

commit ee71d7b
Author: Tejas Rajopadhye <[email protected]>
Date: Thu Jun 27 2024 14:08:27 GMT-0400 (Eastern Daylight Time)
[Gh 1301] Enhancement Feature - Bulk share reapply on dataset (#1363)
### Feature or Bugfix
- Feature
### Detail
- Adds feature to reapply shares in bulk for a dataset.
- Also contains bugfix for AWS worker lambda errors
### Relates
- #1301
- #1364
---------
Co-authored-by: trajopadhye <[email protected]>

commit 27f1ad7
Author: Noah Paige <[email protected]>
Date: Thu Jun 27 2024 13:18:32 GMT-0400 (Eastern Daylight Time)
Convert Dataset Lock Mechanism to Generic Resource Lock (#1338)
### Feature or Bugfix
- Feature
- Bugfix
- Refactoring
### Detail
- Convert Dataset Lock Mechanism to Generic Resource Lock
- Extend locking to Share principals (i.e. EnvironmentGroup and Consumption Roles)
- Making locking a generic component not tied to datasets
### Relates
- #1093
---------
Co-authored-by: dlpzx <[email protected]>

commit e3b8658
Author: Petros Kalos <[email protected]>
Date: Thu Jun 27 2024 12:50:59 GMT-0400 (Eastern Daylight Time)
ignore ruff change in blame (#1372)

commit 2e80de4
Author: dlpzx <[email protected]>
Date: Thu Jun 27 2024 10:59:18 GMT-0400 (Eastern Daylight Time)
Generic shares_base module and specific s3_datasets_shares module - part 11 (renaming and cleaning up s3_shares) (#1359)
### Feature or Bugfix
- Refactoring
### Detail
As explained in the design for #1123 and #1283 we are trying to implement generic `datasets_base` and `shares_base` modules that can be used by any type of datasets and by any type of shareable object in a generic way. This is one of the last PRs focused on renaming files and cleaning-up the s3_datasets_shares module. The first step is a consolidation of the file and classes names in the services to clearly refer to s3_shares:
- `services.managed_share_policy_service.SharePolicyService` ---> `services.s3_share_managed_policy_service.S3SharePolicyService`
- `services.dataset_sharing_alarm_service.DatasetSharingAlarmService` --> `services.s3_share_alarm_service.S3ShareAlarmService`
- `services.managed_share_policy_service.SharePolicyService` --> `services.s3_share_managed_policy_service.S3SharePolicyService`

👀 The main refactoring happens in what is used to be `services.dataset_sharing_service`.
- The part that implements the `DatasetServiceInterface` has been moved to `services/s3_share_dataset_service.py` as the `S3ShareDatasetService`
- The part used in the resolvers and by other methods has been renamed as `services.s3_share_service.py` and the methods for the folder/table permissions are also added to the S3ShareService (from share_item_service)

Lastly, there is one method previously in share_item_service that has been moved to the GlueClient directly as `get_glue_database_from_catalog`.
### Relates
- #1283
- #1123
- #955

commit 1c09015
Author: Noah Paige <[email protected]>
Date: Thu Jun 27 2024 04:16:14 GMT-0400 (Eastern Daylight Time)
fix listOrganizationGroupPermissions (#1369)
### Feature or Bugfix
- Bugfix
### Detail
- Fix listOrganizationGroupPermissions

commit 976ec6b
Author: dlpzx <[email protected]>
Date: Thu Jun 27 2024 04:13:14 GMT-0400 (Eastern Daylight Time)
Add search (Autocomplete) in create pipelines (#1368)
### Feature or Bugfix
- Feature
### Detail
Autocomplete for environments and teams in the following frontend views as requested in #1012. This PR implements it for createPipelines
### Relates
- #1012

commit 6c909a3
Author: Noah Paige <[email protected]>
Date: Wed Jun 26 2024 11:18:04 GMT-0400 (Eastern Daylight Time)
fix migration to not rely on OrganizationService or RequestContext (#1361)
### Feature or Bugfix
- Bugfix
### Detail
- Ensure migration script does not need RequestContext - otherwise fails in migration trigger lambda as context info not set / available
### Relates
- #1306

commit 90835fb
Author: Anushka Singh <[email protected]>
Date: Wed Jun 26 2024 11:17:22 GMT-0400 (Eastern Daylight Time)
Issue1248: Persistent Email Reminders (#1354)
### Feature or Bugfix
- Feature
### Detail
- When a share request is initiated and remains pending for an extended period, dataset producers will receive automated email reminders at predefined intervals. These reminders will prompt producers to either approve or extend the share request, thereby preventing delays in accessing datasets. Attaching screenshots for emails: <img width="1336" alt="Screenshot 2024-06-20 at 5 34 31 PM" src="https://github.com/data-dot-all/dataall/assets/26413731/d7be28c3-5c98-4146-92b1-295e136137a3"> <img width="1322" alt="Screenshot 2024-06-20 at 5 34 52 PM" src="https://github.com/data-dot-all/dataall/assets/26413731/047556e8-59ee-4ebf-b8a7-c0a6684e2a63">
- Email will be sent every Monday at 9am UTC. Schedule can be changed in cron expression in container.py
### Relates
- #1248
---------
Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Anushka Singh <[email protected]>
Co-authored-by: trajopadhye <[email protected]>
Co-authored-by: Mohit Arora <[email protected]>
Co-authored-by: rbernota <[email protected]>
Co-authored-by: Rick Bernotas <[email protected]>
Co-authored-by: Raj Chopde <[email protected]>
Co-authored-by: Noah Paige <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: jaidisido <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: mourya-33 <[email protected]>
Co-authored-by: nikpodsh <[email protected]>
Co-authored-by: MK <[email protected]>
Co-authored-by: Manjula <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Daniel Lorch <[email protected]>
Co-authored-by: Tejas Rajopadhye <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Sofia Sazonova <[email protected]>
Co-authored-by: Sofia Sazonova <[email protected]>

commit e477bdf
Author: Noah Paige <[email protected]>
Date: Wed Jun 26 2024 10:39:09 GMT-0400 (Eastern Daylight Time)
Enforce non null on GQL query string if non null defined (#1362)
### Feature or Bugfix
- Bugfix
### Detail
- Add `String!` to ensure non null input argument on FE if defined as such on backend GQL operation for `listS3DatasetsSharedWithEnvGroup`

commit d6b59b3
Author: Noah Paige <[email protected]>
Date: Wed Jun 26 2024 08:48:52 GMT-0400 (Eastern Daylight Time)
Fix Init Share Base (#1360)
### Feature or Bugfix
- Bugfix
### Detail
- Need to register processors in init for s3 dataset shares API module

commit bd3698c
Author: Petros Kalos <[email protected]>
Date: Wed Jun 26 2024 05:19:14 GMT-0400 (Eastern Daylight Time)
split cognito urls setup and cognito user creation (#1366)
### Feature or Bugfix
- Bugfix
### Details
For more details about the issue read #1353 In this PR we are solving the problem by splitting the configuration of Cognito in 2.
* First part (cognito_users_config.py) is setting up the required groups and users and runs after UserPool deployment
* Second part (cognito_urls_config.py) is setting up Cognito's callback/logout urls and runs after the CloudFront deployment

We chose to split the functionality because we need to have the users/groups setup for the integration tests which are run after the backend deployment. The other althernative is to keep the config functionality as one but make the integ tests run after CloudFront stage.
### Relates
- Solves #1353
Is your feature request related to a problem? Please describe.
It is difficult to add new dataset types with the current implementation of the `datasets` module.
Describe the solution you'd like
If we want to add more types of datasets (e.g. a Redshift dataset), a first step will be the creation of a generic class that defines a data.all Dataset abstraction that can be used by each of the particular dataset implementations.
Describe alternatives you've considered
These are rough design considerations that might change during implementation:
Backend
1. Avoid circular dependencies and remove the current `datasets_base`
- [COMPLETED ✅] There is some work to do to avoid circular dependencies between the initial `datasets` and `dataset_sharing`. There are some API calls that use each other's graphql models. From a code logic perspective, `datasets` should not depend on `dataset_sharing` code, so as a first step we will remove this dependency by always defining the code related to dataset_sharing in dataset_sharing.
2. Define basic code layout
- `datasets_base` --> includes `db` base models for Dataset + list APIs that are generic
  - DEPENDS ON: nothing - we need to remove dependencies with sharing. Anything related to sharing should be moved to the sharing module.
- `dataset_sharing_base` --> includes `db` base models for ShareObject, base permissions + list methods
  - DEPENDS ON: `datasets_base`
- `s3_datasets` --> includes all `db` models of S3 datasets + implementation S3 APIs
  - DEPENDS ON: `datasets_base` and `dataset_sharing_base` (maybe)
- `s3_dataset_sharing` ---> business logic for S3 sharing
  - DEPENDS ON: `s3_dataset`, `dataset_sharing_base`
- `redshift_datasets` and `redshift_dataset_sharing`
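The layout above is essentially a dependency graph, and the goal of step 1 is that it has no cycles. A minimal sketch of how that could be checked, using only the Python standard library: the dictionary below is transcribed from the list above, with two assumptions. The "(maybe)" dependency of `s3_datasets` on `dataset_sharing_base` is included, and `s3_dataset` is read as `s3_datasets`. The function names are hypothetical and not part of data.all.

```python
from graphlib import CycleError, TopologicalSorter

# Module -> modules it depends on, transcribed from the planned layout.
# Assumptions: the "(maybe)" dependency is included, and "s3_dataset"
# is read as "s3_datasets". The redshift modules are omitted because
# the issue does not state their dependencies.
PLANNED_DEPENDENCIES = {
    "datasets_base": set(),
    "dataset_sharing_base": {"datasets_base"},
    "s3_datasets": {"datasets_base", "dataset_sharing_base"},
    "s3_dataset_sharing": {"s3_datasets", "dataset_sharing_base"},
}


def load_order(dependencies):
    """Return an order in which every module comes after its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


def has_cycle(dependencies):
    """Return True if two or more modules depend on each other."""
    try:
        load_order(dependencies)
    except CycleError:
        return True
    return False


order = load_order(PLANNED_DEPENDENCIES)
print(order)
```

The situation described in step 1, where `datasets` and `dataset_sharing` use each other's models, would be reported by `has_cycle` as a cycle, while the planned layout would not.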
3. `datasets_base` module
I thought of splitting the module into 2 modules: `datasets_list` module and `datasets_base` module. Although `datasets_base` would not be a module, it is some base code that will be re-used by other modules. Because of this and because the dataset_list uses dataset_base models I decided to leave it in the same module, but we could change this behavior, it is not a one way door.
Base code for Datasets
- Base service --> DISCARDED: The reason why we wanted to implement this interface was to ensure that the dataset methods were decorated with the CREATE_DATASET... permissions. But as explained below, it does not make much sense to apply the same permissions to the different types of datasets, except for the list permissions, which would not be part of this base class.
  - `dataset_base_service` - Defines abstract class `DatasetBaseService` to be implemented in each dataset type
- `datasets/db/dataset_models`
  - DatasetBase - same as before but WITHOUT all S3-related logic. We add the `datasetType` column. Each dataset module will have to implement their own tables based on this table. For this we can use sqlalchemy *joined table inheritance
  - DatasetLock - it is a generic model that can be used by any type of dataset.
- `DatasetRepository` in `datasets/db/dataset_repositories`: DatasetLock transactions. Each dataset module will have to implement their own repositories.

*Joined table inheritance: In joined table inheritance, each class along a hierarchy of classes is represented by a distinct table. Querying for a particular subclass in the hierarchy will render as a SQL JOIN along all tables in its inheritance path. If the queried class is the base class, the base table is queried instead, with options to include other tables at the same time or to allow attributes specific to sub-tables to load later.
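A minimal sketch of joined table inheritance for this case, assuming SQLAlchemy 1.4 or later. Only `DatasetBase`, `S3Dataset` and the `datasetType` column come from the plan above; the table names, the `label`, `S3BucketName` and `clusterId` columns, the `RedshiftDataset` class and the discriminator values are illustrative assumptions, not the actual data.all schema.

```python
from sqlalchemy import Column, ForeignKey, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class DatasetBase(Base):
    """Generic attributes shared by every dataset type (hypothetical columns)."""
    __tablename__ = 'dataset'

    datasetUri = Column(String, primary_key=True)
    label = Column(String, nullable=False)
    # Discriminator column: records which dataset type a row belongs to.
    datasetType = Column(String, nullable=False)

    __mapper_args__ = {
        'polymorphic_on': datasetType,
        'polymorphic_identity': 'base',
    }


class S3Dataset(DatasetBase):
    """S3-specific attributes live in their own table, joined on the key."""
    __tablename__ = 's3_dataset'

    datasetUri = Column(String, ForeignKey('dataset.datasetUri'), primary_key=True)
    S3BucketName = Column(String, nullable=False)  # illustrative column

    __mapper_args__ = {'polymorphic_identity': 'S3'}


class RedshiftDataset(DatasetBase):
    """Second dataset type, included only to show the generic listing."""
    __tablename__ = 'redshift_dataset'

    datasetUri = Column(String, ForeignKey('dataset.datasetUri'), primary_key=True)
    clusterId = Column(String, nullable=False)  # illustrative column

    __mapper_args__ = {'polymorphic_identity': 'Redshift'}


engine = create_engine('sqlite://')  # in-memory database for the sketch
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        S3Dataset(datasetUri='uri-1', label='sales', S3BucketName='sales-bucket'),
        RedshiftDataset(datasetUri='uri-2', label='finance', clusterId='cluster-1'),
    ])
    session.commit()

    # Querying the base class lists datasets of every type.
    all_labels = sorted(d.label for d in session.query(DatasetBase).all())
    # Querying the subclass JOINs both tables and returns only S3 datasets.
    s3_buckets = [d.S3BucketName for d in session.query(S3Dataset).all()]
```

In this sketch `all_labels` contains both datasets while `s3_buckets` contains only the S3 one, so a generic list API could query `DatasetBase` without knowing about any specific dataset type.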
This means that if we query the `datasets` table (the base class), we get a list of all the generic attributes of all types of datasets. If we query the specific subclass `s3_datasets`, it retrieves the generic details plus the S3 details of the S3 datasets.

List Datasets
- `datasets/api`
- `dataset_list_service` - business logic and permissions checking
- `dataset_base_permissions` - DATASET PERMISSIONS FOR ENVIRONMENT - List Datasets in environment, needed for the service
- `DatasetListRepository` in `datasets/db/dataset_repositories`, independent from the dataset type. ~~It is an EnvironmentResource type to count_resources before deleting an environment.~~ DISCARDED: it makes more sense to count the specific resources before deletion.

4. `s3_datasets` module

Very similar to the previous `datasets_module`, it contains the code to manage S3-GLUE datasets in data.all.

- `S3Dataset` model that uses `Dataset` as base using inheritance in sqlalchemy.
- Other models that are specific to S3/GLUE datasets: tables, folders...
Frontend
for s3_datasets and datasets_base
Config.json
for s3_datasets and datasets_base
Divide the parameters between those that are generic dataset parameters (defined in the `dataset` module) and those that are `s3_datasets`-specific parameters.

Related to #955
🥇 Thanks to @dosiennik for his suggestion of using inheritance in sqlalchemy (https://docs.sqlalchemy.org/en/20/orm/inheritance.html). This will allow us to define a Dataset model and child models.