Create generic dataset_base and s3_dataset modules from current datasets #1123

Closed
dlpzx opened this issue Mar 25, 2024 · 2 comments

dlpzx commented Mar 25, 2024

Is your feature request related to a problem? Please describe.
It is difficult to add new dataset types with the current implementation of the datasets module.

Describe the solution you'd like
If we want to add more types of dataset (e.g. Redshift dataset) a first step will be the creation of a generic class that defines a data.all Dataset abstraction that can be used by each of the particular dataset implementations.

Describe alternatives you've considered
These are rough design considerations that might change during implementation:


Backend

1. Avoid circular dependencies and remove the current datasets_base - [COMPLETED ✅ ]

There is some work to do to avoid circular dependencies between the initial datasets and dataset_sharing. Some API calls use each other's GraphQL models. From a code logic perspective, datasets should not depend on dataset_sharing code, so as a first step we will remove this dependency by always defining the code related to dataset_sharing inside dataset_sharing.

2. Define basic code layout

image

  • datasets_base --> includes db base models for Dataset + list APIs that are generic
    - DEPENDS ON: nothing - we need to remove dependencies with sharing. Anything related to sharing should be moved to the sharing module.
  • dataset_sharing_base --> includes db base models for ShareObject, base permissions + list methods
    - DEPENDS ON: datasets_base
  • s3_datasets --> includes all db models of S3 datasets + implementation S3 APIs
    - DEPENDS ON: datasets_base and dataset_sharing_base (maybe)
  • s3_dataset_sharing ---> business logic for S3 sharing
    - DEPENDS ON: s3_dataset, dataset_sharing_base
  • in the future redshift_datasets and redshift_dataset_sharing
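
The dependency rules in this list can be checked mechanically. Below is a minimal sketch, assuming the modules and dependencies exactly as listed above (including the "maybe" dependency of s3_datasets on dataset_sharing_base). `MODULE_DEPENDENCIES` and `load_order` are hypothetical names for illustration, not part of data.all.

```python
from graphlib import CycleError, TopologicalSorter

# Module -> modules it depends on, transcribed from the list above.
# The "(maybe)" dependency of s3_datasets on dataset_sharing_base is included.
MODULE_DEPENDENCIES = {
    "datasets_base": set(),
    "dataset_sharing_base": {"datasets_base"},
    "s3_datasets": {"datasets_base", "dataset_sharing_base"},
    "s3_dataset_sharing": {"s3_datasets", "dataset_sharing_base"},
}


def load_order(dependencies):
    """Return a valid load order; raises CycleError if modules depend on each other."""
    return list(TopologicalSorter(dependencies).static_order())


order = load_order(MODULE_DEPENDENCIES)
print(order)
```

If a circular dependency were reintroduced (for example datasets_base depending on a sharing module), `load_order` would raise `CycleError` instead of returning an order.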

3. datasets_base module

  1. contains the API calls that list Datasets that are generic (without details to S3/Glue)
  2. contains the base code for other modules to implement a dataset

I thought of splitting the module into two modules: a datasets_list module and a datasets_base module. datasets_base would not really be a module, but base code that is re-used by other modules. Because of this, and because dataset_list uses the dataset_base models, I decided to leave both in the same module. We could change this later; it is not a one-way door.

Base code for Datasets

  • NO API CALLS
  • Base service dataset_base_service - Defines abstract class DatasetBaseService to be implemented in each dataset type --> DISCARDED: The reason why we wanted to implement this interface was to ensure that the dataset methods were decorated with the CREATE_DATASET... permissions. But as explained below, it does not make much sense to apply the same permissions to the different types of datasets, except for the list permissions, which would not be part of this base class.
  • Database models - defined in datasets/db/dataset_models
    - DatasetBase - same as before but WITHOUT all S3-related logic. We add the datasetType column. Each dataset module will have to implement its own tables based on this table. For this we can use SQLAlchemy *joined table inheritance.
    - DatasetLock - it is a generic model that can be used by any type of dataset.
  • Database operations - DatasetRepository in datasets/db/dataset_repositories handles the DatasetLock transactions. Each dataset module will have to implement its own repositories.
  • datasets_enums - re-used by all Datasets

*Joined table inheritance: In joined table inheritance, each class along a hierarchy of classes is represented by a distinct table. Querying for a particular subclass in the hierarchy will render as a SQL JOIN along all tables in its inheritance path. If the queried class is the base class, the base table is queried instead, with options to include other tables at the same time or to allow attributes specific to sub-tables to load later.

This means that if we query the table datasets, base class, we get a list of all the generic attributes of all types of datasets. If we query the specific subclass s3_datasets it retrieves the generic details+the S3 details of the S3 datasets.
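
To make the table-level effect concrete, here is a minimal sketch in plain SQL through Python's built-in `sqlite3`, not the SQLAlchemy mapping itself. The table names `dataset` and `s3_dataset`, the `label` and `S3BucketName` columns, and the Redshift row are assumptions for illustration; only the `datasetType` column comes from the design above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dataset (
        datasetUri  TEXT PRIMARY KEY,
        label       TEXT NOT NULL,
        datasetType TEXT NOT NULL          -- discriminator column
    );
    CREATE TABLE s3_dataset (
        datasetUri   TEXT PRIMARY KEY REFERENCES dataset(datasetUri),
        S3BucketName TEXT NOT NULL         -- S3-only attribute (assumed name)
    );
    INSERT INTO dataset VALUES ('uri-1', 'sales', 'S3');
    INSERT INTO dataset VALUES ('uri-2', 'warehouse', 'Redshift');
    INSERT INTO s3_dataset VALUES ('uri-1', 'sales-bucket');
    """
)

# Querying the base table: generic attributes of every dataset, whatever its type.
all_datasets = conn.execute(
    "SELECT datasetUri, datasetType FROM dataset ORDER BY datasetUri"
).fetchall()

# Querying the subclass: a JOIN along the inheritance path adds the S3 details.
s3_datasets = conn.execute(
    """
    SELECT d.datasetUri, d.label, s.S3BucketName
    FROM dataset AS d
    JOIN s3_dataset AS s ON s.datasetUri = d.datasetUri
    """
).fetchall()
```

With joined table inheritance the ORM is expected to emit roughly the second query when the subclass is queried, and the first one when the base class is queried.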

List Datasets

  • API definitions in datasets/api
  • Service dataset_list_service - business logic and permissions checking
  • Permissions dataset_base_permissions - dataset permissions for the environment (list Datasets in environment), needed by the service
  • Database models - explained above
  • Database operations to query all Datasets, independent of their dataset type - DatasetListRepository in datasets/db/dataset_repositories. ~~It is an EnvironmentResource type to count_resources before deleting an environment.~~ DISCARDED: it makes more sense to count the specific resources before deletion.
  • datasets_enums - explained above

4. s3_datasets module

Very similar to the previous datasets_module, it contains the code to manage S3-GLUE datasets in data.all.

  • API calls - WITHOUT generic listDataset calls
  • AWS clients - includes all previous code as it is S3/GLUE-Dataset specific
  • CDK stacks and permissions
  • Database models/repositories:
    - S3Dataset model that uses Dataset as base using inheritance in sqlalchemy.
    - Other models that are specific to S3/GLUE-Dataset: tables, folders, ...
  • handlers - includes all previous code as it is S3/GLUE-Dataset specific
  • indexers
  • services
  • tasks - includes all previous code as it is S3/GLUE-Dataset specific

Frontend

for s3_datasets and datasets_base

  • The listDatasets view is generic, as are the listDatasets API and analogous APIs
  • All the getDataset detail views are specific to each Dataset type

Config.json

for s3_datasets and datasets_base

Divide the parameters between those that are generic dataset parameters (defined in dataset module) and those that are s3_datasets specific parameters.

        "datasets": {
            "active": true,
            "features": {
                "share_notifications": {
                    "email": {
                        "active": false,
                        "parameters": {
                            "group_notifications": true
                        }
                    }
                },
                "confidentiality_dropdown" : true,
                "topics_dropdown" : true
            }
        },
        "s3_datasets": {
            "active": true,
            "features": {
                "file_uploads": true,
                "file_actions": true,
                "aws_actions": true,
                "preview_data": true,
                "glue_crawler": true
            }
        },
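
A minimal sketch of how code could read the split configuration follows. It assumes the two keys sit at the top level of the parsed document, which the fragment above does not show, and some flag values are changed for illustration. `is_feature_enabled` is a hypothetical helper, not a data.all function.

```python
import json

# Shortened copy of the fragment above; glue_crawler is set to false for illustration.
CONFIG = json.loads(
    """
    {
        "datasets": {
            "active": true,
            "features": {"confidentiality_dropdown": true, "topics_dropdown": true}
        },
        "s3_datasets": {
            "active": true,
            "features": {"file_uploads": true, "glue_crawler": false}
        }
    }
    """
)


def is_feature_enabled(config, module, feature):
    """A feature counts as enabled only if its module is active and the flag is true."""
    module_config = config.get(module, {})
    if not module_config.get("active", False):
        return False
    return module_config.get("features", {}).get(feature, False) is True


# Generic parameters are looked up under "datasets", S3-specific ones under "s3_datasets".
print(is_feature_enabled(CONFIG, "datasets", "topics_dropdown"))
print(is_feature_enabled(CONFIG, "s3_datasets", "glue_crawler"))
```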

Related to #955

🥇 Thanks to @dosiennik for his suggestion of using inheritance in sqlalchemy (https://docs.sqlalchemy.org/en/20/orm/inheritance.html). This will allow us to define a Dataset model and child models.

dlpzx commented May 7, 2024

Step-by-step implementation plan

Pre-requisites

  • decouple datasets and datasets_sharing and remove the initial datasets_base module

1. Split datasets

Separate datasets into datasets_base and s3_datasets
We need to implement it in multiple PRs, otherwise it will be impossible to follow.
*BE=Backend, FE=Frontend

  • (BE + FE) Rename datasets to s3_datasets (PART1 PR)
  • (BE ONLY) Create datasets_base module and update the dependencies (s3_datasets depends on datasets_base) (PART2)
  • (BE ONLY) Move generic dataset enums to datasets_base (PART2)
  • (BE ONLY) Create a base DatasetBase model in datasets_base and an S3Dataset model in s3_dataset that inherits from it (use enum DatasetType). Ensure that deletes remove items in both tables and create migration scripts. (PART3)
  • (BE ONLY) Move generic db models and generic services to datasets_base, i.e. DatasetLock (PART4)
  • (BE+FE) Create list APIs in datasets_base, adapt the frontend and move the generic DatasetInterface to datasets_base (PART5) and additional PR
  • (FE ONLY) Create new Datasets_Base FE module and add generic views and services (PART6)
  • (BE ONLY) Review dataset permissions and, if it makes sense, split permissions into dataset_base permissions and S3_dataset permissions (PART7) ---> ⚠️ DISCARDED. LIST_ENVIRONMENT_DATASETS is the only permission that should be in datasets_base; the rest are specific to each type of dataset. For example, one group might have permissions to CREATE_S3_DATASETS but not to CREATE_REDSHIFT_DATASETS. They protect different API calls, therefore they should be different permissions. The only change that would be nice is renaming the current CREATE_DATASETS permissions to CREATE_S3_DATASETS permissions. It is a cosmetic change, but it involves some effort and carefully dealing with the alembic migration. For the moment I have not implemented this change.
  • (BE ONLY) Review CatalogIndexer and if it makes sense add it to datasets_base (PART8)---> ⚠️ DISCARDED The DatasetCatalogIndexer of S3 datasets upserts tables and folders that are particular to s3 datasets. Despite involving some duplication of code it is good that each dataset module has control over the catalog indexing.
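
The reasoning about permissions in PART7 can be sketched as follows. This is an illustration under assumptions: `CREATE_S3_DATASETS` and `CREATE_REDSHIFT_DATASETS` are the names used in the example above and were not implemented, and the groups, `has_permission` and `create_s3_dataset` are hypothetical.

```python
# Generic permission that belongs to datasets_base.
LIST_ENVIRONMENT_DATASETS = "LIST_ENVIRONMENT_DATASETS"

# Type-specific permissions: each one protects a different API call.
# Both names are hypothetical; the rename was not implemented.
CREATE_S3_DATASETS = "CREATE_S3_DATASETS"
CREATE_REDSHIFT_DATASETS = "CREATE_REDSHIFT_DATASETS"

# Hypothetical grants: every group may list, only some may create a given type.
GROUP_PERMISSIONS = {
    "data-engineers": {LIST_ENVIRONMENT_DATASETS, CREATE_S3_DATASETS},
    "analysts": {LIST_ENVIRONMENT_DATASETS},
}


def has_permission(group, permission):
    """Check a single permission for a group; unknown groups have none."""
    return permission in GROUP_PERMISSIONS.get(group, set())


def create_s3_dataset(group, label):
    """The S3 create call checks the S3 permission, never a generic one."""
    if not has_permission(group, CREATE_S3_DATASETS):
        raise PermissionError(f"{group} lacks {CREATE_S3_DATASETS}")
    return {"label": label, "datasetType": "S3"}
```

A group can then hold the S3 create permission without being able to create Redshift datasets, which a single shared CREATE_DATASETS permission could not express.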

2. Split config.json

Separate config.json parameters into dataset_base parameters and s3_datasets parameters

  • BE+FE - split the config.json params and the references in BE and FE (PART7)

3. Split dataset sharing

Separate datasets_sharing into sharing_base and s3_datasets_sharing
in separate issue: #1283

dlpzx added a commit that referenced this issue May 7, 2024
…me datasets as s3_datasets) (#1250)

### Feature or Bugfix
- Refactoring

### Detail
- Rename `datasets` module to `s3_datasets` module

This PR is the first step to extract a generic datasets_base module that
implements the undifferentiated concepts of Dataset in data.all.
s3_datasets will use this base module to build the specific
implementation for S3 datasets.

### Relates
- #1123 
- #955 

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.
dlpzx added a commit that referenced this issue May 15, 2024
…te datasets_base and move enums) (#1257)

### Feature or Bugfix
⚠️ This PR should be merged after #1250. 
- Refactoring

### Detail
As explained in the design for #1123 we are trying to implement a
generic `datasets_base` module that can be used by any type of datasets
in a generic way.

This PR:
- Creates the skeleton of the `datasets_base` module consisting of 3
packages: `db`, `api`, `services`. And adds the `__init__` file.
- Adds the dependency of `s3_datasets` to `datasets_base` in the
`__init__` file of the `s3_datasets` module
- Moves datasets_enums to datasets_base

### Relates
- #1123 
- #955 

dlpzx added a commit that referenced this issue May 17, 2024
…te DatasetBase db model and S3Dataset model) (#1258)

### Feature or Bugfix
⚠️ This PR should be merged after #1257.
- Feature
- Refactoring

### Detail
As explained in the design for #1123 we are trying to implement a
generic `datasets_base` module that can be used by any type of datasets
in a generic way.

**This PR does**:
- Adds a generic `DatasetBase` model in datasets_base.db that is used in
s3_datasets.db to build the `S3Dataset` model using joined table
inheritance in
[sqlalchemy](https://docs.sqlalchemy.org/en/20/orm/inheritance.html)
- Renames all usages of Dataset to S3Dataset (in the future some will be
returned to DatasetBase, but for the moment we will keep them as
S3Dataset)
- Adds a migration script that backfills the `datasets` table and renames
`s3_datasets` ---> ⚠️ During the migration we perform some
"scary" operations on the dataset table. If the migration
runs into any issue, it could result in catastrophic loss of information.
For this reason this
[PR](#1267) implements RDS
snapshots on migrations.

**This PR does not**:
- Change the Feed registration, which stays as
`FeedRegistry.register(FeedDefinition('Dataset', S3Dataset))` using
`Dataset` with the `S3Dataset` resource type. It is out of the scope of
this PR to migrate the Feed definition.
- Change the GlossaryRegistry registration, for the same reason. We keep
`object_type='Dataset'` to avoid backwards compatibility issues.
- Change the resourceType for permissions. We keep using a
generic `Dataset` as target for S3 permissions. If we are to split
permissions into DatasetBase permissions and S3Dataset permissions we
would do it in a different PR.

#### Remarks
Inserting new S3Dataset items does not require any changes. SQLAlchemy
joined table inheritance automatically inserts a row in the parent
table and then another one in the child table, as explained in this
Stack Overflow
[link](https://stackoverflow.com/questions/39926937/sqlalchemy-how-to-insert-a-joined-table-inherited-class-instance-when-the-pare)
(I was not able to find it in the official docs).
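
For readers who want to see the write path spelled out, here is a minimal hand-written sketch of the parent-then-child insert, and of a delete that removes both rows, using `sqlite3`. It is plain SQL, not the ORM code in this PR; the table and column names and both helper functions are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript(
    """
    CREATE TABLE dataset (
        datasetUri  TEXT PRIMARY KEY,
        label       TEXT NOT NULL,
        datasetType TEXT NOT NULL
    );
    CREATE TABLE s3_dataset (
        datasetUri   TEXT PRIMARY KEY REFERENCES dataset(datasetUri),
        S3BucketName TEXT NOT NULL
    );
    """
)


def insert_s3_dataset(conn, uri, label, bucket):
    """Insert the parent row first, then the child row, in one transaction."""
    with conn:  # commits both rows, or rolls both back on error
        conn.execute("INSERT INTO dataset VALUES (?, ?, 'S3')", (uri, label))
        conn.execute("INSERT INTO s3_dataset VALUES (?, ?)", (uri, bucket))


def delete_s3_dataset(conn, uri):
    """Delete the child row first so the foreign key is never left dangling."""
    with conn:
        conn.execute("DELETE FROM s3_dataset WHERE datasetUri = ?", (uri,))
        conn.execute("DELETE FROM dataset WHERE datasetUri = ?", (uri,))


def count(conn, table):
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
```

The point of the single transaction is that a failure on the child insert leaves no orphan row in the parent table.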


### Relates
- #1123 
- #955 
- #1267

dlpzx added a commit that referenced this issue May 21, 2024
…te DatasetBaseRepository and move DatasetLock) (#1276)

### Feature or Bugfix
⚠️ merge after #1258 
- Refactoring

### Detail
As explained in the design for #1123 we are trying to implement a
generic `datasets_base` module that can be used by any type of datasets
in a generic way.

In this small PR we:
- move the generic DatasetLock model to datasets_base
- move the DatasetLock db operations to the datasets_base
DatasetBaseRepository
- move activity to DatasetBaseRepository

### Relates
- #1123 
- #955 

dlpzx added a commit that referenced this issue May 21, 2024
…e DatasetServiceInterface to datasets_base, add property, create first list API for datasets_base) (#1281)

### Feature or Bugfix
- Feature
- Refactoring

### Detail
As explained in the design for #1123 we are trying to implement a
generic `datasets_base` module that can be used by any type of datasets
in a generic way.

In this PR we:
- Move DatasetServiceInterface to datasets_base. This interface is used
by datasets_sharing to "inject" logic in s3_datasets
- add property dataset_type to the DatasetServiceInterface interface to
distinguish which type of dataset this interface applies to.
- create first list API for datasets_base. 👀 This is the most important
part. When having multiple types of datasets users will still list all
datasets together in several places in the UI (e.g. in listDatasets in
the DatasetList view, in listDatasetsEnvironment in the Environment view). These
API calls are not specific to s3_datasets, but generic to any type of
dataset. Thus, they should be part of datasets_base. This PR introduces
the datasets_list_service and datasetListRepository, and includes only one
example of an API that moves to dataset_base. In the following PRs we will
move the rest of the APIs.
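
A minimal sketch of an interface with a `dataset_type` property, as described in the second bullet above. Only the names `DatasetServiceInterface` and `dataset_type` come from this PR; the `check_before_delete` hook, the `S3SharingHooks` class and the `can_delete` helper are hypothetical and only illustrate how the property lets callers pick the implementations for one dataset type.

```python
from abc import ABC, abstractmethod


class DatasetServiceInterface(ABC):
    """Implemented by other modules (e.g. sharing) to inject logic into dataset operations."""

    @property
    @abstractmethod
    def dataset_type(self) -> str:
        """The dataset type this implementation applies to."""

    @abstractmethod
    def check_before_delete(self, dataset_uri: str) -> bool:
        """Hypothetical hook: return True if the dataset can be deleted."""


class S3SharingHooks(DatasetServiceInterface):
    """Hypothetical implementation registered by the S3 sharing module."""

    def __init__(self, shared_uris):
        self._shared_uris = set(shared_uris)

    @property
    def dataset_type(self) -> str:
        return "S3"

    def check_before_delete(self, dataset_uri: str) -> bool:
        # A dataset that still has shares cannot be deleted.
        return dataset_uri not in self._shared_uris


def can_delete(interfaces, dataset_type, dataset_uri):
    """Only consult the implementations registered for this dataset type."""
    return all(
        interface.check_before_delete(dataset_uri)
        for interface in interfaces
        if interface.dataset_type == dataset_type
    )
```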

### Relates
- #1123 
- #955 

dlpzx added a commit that referenced this issue May 21, 2024
…ve list queries to dataset_base or rename them) (#1282)

### Feature or Bugfix
- Refactoring

### Detail
As explained in the design for #1123 we are trying to implement a
generic `datasets_base` module that can be used by any type of datasets
in a generic way.

In this PR we:
- Restructure listDatasetsOwnedByEnvGroup as
listS3DatasetsOwnedByEnvGroup and move it into Worksheets in the FE. It
is moved to Worksheets because that is the only place where
it is used in the FE. One could argue that in the BE
listS3DatasetsOwnedByEnvGroup is part of the S3_Dataset module. The way
I see it, FE and BE are independent and their modularization strategies
fit the type of programming; what makes sense in the FE might not make
sense in the BE. In the BE, queries belong to the module whose
services/models they act on, in this case s3_datasets. In the FE, queries
belong to the module where they are used, and if a query is used by more
than one module it can be placed in the generic `services` directory. What
is important is that we define the dependencies. In this case it is
important to make Worksheets dependent on S3_Datasets, as we do in the
index in `frontend/src/modules/Worksheets/index.js` and in
`backend/dataall/modules/worksheets/__init__.py`.
- Move listDatasetsCreatedInEnvironment to datasets_base

### Relates
- #1123 
- #955 

dlpzx added a commit that referenced this issue May 22, 2024
…art 1 (renaming, enums and permissions) (#1284)

### Feature or Bugfix
- Feature
- Refactoring

### Detail
As explained in the design for #1123 and #1283 we are trying to
implement generic `datasets_base` and `shares_base` modules that can be
used by any type of datasets and by any type of shareable object in a
generic way.

In this PR:
- Rename `dataset_sharing` as `s3_dataset_shares`
- Create `shares_base` and introduce dependency (`s3_dataset_shares`
depends on `shares_base`)
- Move generic enums to shares_base
- Move generic permissions to shares_base


### Relates
- #1283 
- #1123 
- #955 

dlpzx added a commit that referenced this issue Jun 3, 2024
…t config.json) (#1297)

### Feature or Bugfix
- Refactoring
⚠️ ⚠️ When releasing we need to give a heads up to customers as this
change might overwrite their current config.json configurations. It will
probably result in some conflicting changes to resolve.

### Detail
As explained in the design for #1123 we are trying to implement a
generic `datasets_base` module that can be used by any type of datasets
in a generic way.

This PR is the last one of the series, it moves the generic config.json
parameters for datasets into a new field `datasets_base` in the
config.json. Specific `s3_datasets` parameters stay in the s3 module
configuration.

### Relates
- #1123 
- #955 

dlpzx commented Jun 3, 2024

Completed!

@dlpzx dlpzx closed this as completed Jun 3, 2024
dlpzx added a commit that referenced this issue Jun 4, 2024
…art 3 (share processor and manager interfaces) (#1298)

### Feature or Bugfix
- Feature
- Refactoring

### Detail
As explained in the design for #1123 and #1283 we are trying to
implement generic `datasets_base` and `shares_base` modules that can be
used by any type of datasets and by any type of shareable object in a
generic way.

In this PR:
- Move ECS handler for sharing tasks to shares_base
- Move sharing tasks to shares_base
- Move DataSharingService to shares_base and rename it as
SharingService. Make SharingService generic and remove any reference to
specific items. In this process:
- attach and delete table and folder permissions are moved into the
specific share processor. delete permissions are processed item by item
and moved into the shareItemService
- Some methods are copied from ShareObjectRepository in
s3_datasets_shares to shares_base. They have not been removed from
s3_datasets_shares, instead there is a TODO marking those methods that
have been copied. The migration and clean-up of shareObjectRepository
will be done in a following PR
- Need to introduce `DatasetBaseRepository.get_dataset_by_uri(session,
share.datasetUri)` to avoid future circular dependencies: shares_base
depends only on datasets_base.
- Clean-up and consolidate methods: remove updates of the share items
outside of state machine transactions. Only re-used methods and dataset
lock handling in share_manager --> TODO: dataset_lock manager should be
its own service outside of sharing, but this is out of the scope of this
PR.
- Add updates of share_item statuses in except clauses for more robust
sharing
- Introduce ShareProcessor and ShareManager interfaces and use them in
the SharingService: instead of the processor inheriting the manager
class, the processor uses the ShareProcessor interface and constructs a
manager when needed.
- Introduce new load ImportMode `SHARES_TASK` and register
ShareProcessors in s3_datasets_share
(`backend/dataall/modules/s3_datasets_shares/__init__.py`)
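
The processor/manager split and the status updates in `except` clauses can be sketched as below. This is a simplified illustration under assumptions: only the names `ShareProcessor` and `ShareManager` come from this PR. The method names, the table classes, the registry, `process_share` and the status strings are hypothetical.

```python
from abc import ABC, abstractmethod


class ShareManager(ABC):
    """Performs the low-level grant for one kind of item."""

    @abstractmethod
    def grant(self, item_uri: str) -> None:
        ...


class ShareProcessor(ABC):
    """What the generic sharing service calls; it knows nothing about S3 or Glue."""

    @abstractmethod
    def process_approved_shares(self, item_uris):
        ...


class TableShareManager(ShareManager):
    def grant(self, item_uri):
        # Stand-in for a real grant; URIs starting with "broken" simulate a failure.
        if item_uri.startswith("broken"):
            raise RuntimeError(f"cannot grant access to {item_uri}")


class TableShareProcessor(ShareProcessor):
    def process_approved_shares(self, item_uris):
        manager = TableShareManager()  # constructed when needed, not inherited
        statuses = {}
        for uri in item_uris:
            try:
                manager.grant(uri)
                statuses[uri] = "SUCCEEDED"
            except Exception:
                # Record the failure on the item so one bad item does not hide the rest.
                statuses[uri] = "FAILED"
        return statuses


PROCESSORS = {}  # item type -> processor, filled when the specific module is loaded


def register_processor(item_type, processor):
    PROCESSORS[item_type] = processor


def process_share(items_by_type):
    """Generic entry point: dispatch each group of items to its registered processor."""
    return {
        item_type: PROCESSORS[item_type].process_approved_shares(uris)
        for item_type, uris in items_by_type.items()
    }


register_processor("Table", TableShareProcessor())
result = process_share({"Table": ["table-1", "broken-table"]})
```

Because the generic entry point only sees the registry and the two interfaces, a new dataset type would add processors without touching it.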


![image](https://github.com/data-dot-all/dataall/assets/71252798/af4eafc3-c990-4532-ba25-62ef950791aa)

See full detail of SharingService design in #1283

### Next steps/Open questions
For failures I think we should rollback whatever actions were
performed. For example, if we are sharing a table and it failed in one
step, it should revert all the steps executed before. @petrkalos
@SofiaSazonova @noah-paige what do you think?



### Relates
- #1283 
- #1123 
- #955 

dlpzx added a commit that referenced this issue Jun 6, 2024
…art 4 (remove s3 info from shareItem db models) (#1311)

### Feature or Bugfix
- Refactoring

### Detail
As explained in the design for #1123 and #1283 we are trying to
implement generic `datasets_base` and `shares_base` modules that can be
used by any type of datasets and by any type of shareable object in a
generic way.

In this PR:
- Remove the fields `GlueDatabaseName`, `GlueTableName` and
`S3AccessPointName` from the ShareObjectItem db model. The goal is to
have a generic ShareObjectItem and access the specific item information
through its itemUri and itemType.
- Migration script to drop those columns. 
- Disclaimer, there is still work to do on the gql types; but that is
out of the scope for this PR
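
A minimal sketch of resolving type-specific details through `itemUri` and `itemType` once the S3 columns are gone from ShareObjectItem. The in-memory stores, the item type names `Table` and `StorageLocation`, the `S3Prefix` field and `resolve_item` are assumptions for illustration; the real lookups would go through each module's repositories.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShareObjectItem:
    """Generic share item: no Glue or S3 columns, only a pointer to the specific item."""

    shareItemUri: str
    itemUri: str
    itemType: str


# Hypothetical type-specific stores, keyed by itemUri.
TABLES = {"table-1": {"GlueDatabaseName": "sales_db", "GlueTableName": "orders"}}
FOLDERS = {"folder-1": {"S3Prefix": "raw/"}}

# itemType -> lookup function owned by the module that knows that type.
RESOLVERS = {"Table": TABLES.get, "StorageLocation": FOLDERS.get}


def resolve_item(item):
    """Fetch the specific details of a share item from the store for its type."""
    resolver = RESOLVERS.get(item.itemType)
    if resolver is None:
        raise ValueError(f"Unsupported itemType: {item.itemType}")
    details = resolver(item.itemUri)
    if details is None:
        raise LookupError(f"No {item.itemType} found for {item.itemUri}")
    return details


table_item = ShareObjectItem("share-item-1", "table-1", "Table")
print(resolve_item(table_item)["GlueTableName"])
```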

### Relates
- #1283 
- #1123 
- #955 

dlpzx added a commit that referenced this issue Jun 6, 2024
…art 5 (move exceptions and notifications to shares_base) (#1312)

### Feature or Bugfix
- Refactoring

### Detail
As explained in the design for #1123 and #1283 we are trying to
implement generic `datasets_base` and `shares_base` modules that can be
used by any type of datasets and by any type of shareable object in a
generic way.

In this PR:
- Move share_exceptions to shares_base
- Move share_notification_service to shares_base


### Relates
- #1283 
- #1123 
- #955 

dlpzx added a commit that referenced this issue Jun 11, 2024
### Feature or Bugfix
- Bugfix

### Detail
After the refactoring of datasets and datasets_sharing, their tests were
not being executed. In this PR the corresponding test packages are
renamed and the tests inside are fixed with the latest changes in main.

There are still things to improve. For example, we can simplify the
conftests of the s3_datasets and the s3_datasets_shares modules, but I am
going to leave that for a later PR.

### Relates
- #1123 

dlpzx added a commit that referenced this issue Jun 17, 2024
…art 6 (Split APIs and graphql types) (#1320)

### Feature or Bugfix
- Feature
- Refactoring

### Detail
As explained in the design for #1123 and #1283 we are trying to
implement generic `datasets_base` and `shares_base` modules that can be
used by any type of datasets and by any type of shareable object in a
generic way.

- First, this PR splits the query used in Worksheets and Environments to
list glue databases and list datasets. They are quite different
queries: the one used in Worksheets is only relevant for S3 datasets,
while the one in Environments is focused on the share items in general.
  - Introduce new API `listS3DatasetsSharedWithEnvGroup` to list shared
    glue databases in Worksheets view. It is part of the s3_datasets_shares
    module. This new API replaces the usage of `searchEnvironmentDataItems`
    in Worksheets frontend.
  - Remove Glue parameters from the `searchEnvironmentDataItems` API. This
    API belongs to shares_base. It is only used in the Environment view >
    Datasets tab, so I moved the API in frontend to modules/Environment.
  - Remove unused parameters (`tables`, `locations`) from statistics in
    `api/types/ShareObjectStatistics`. Now the statistics are only generic.

- Introduce new API `getS3ConsumptionData` in s3_datasets_shares. This
new API call gets the details of gluedatabase/table, s3accesspoint that
were previously part of ShareObject. This way the graphql ShareObject
does not contain specific S3 info.

- The rest of the APIs have been split in `shares_base` and
`s3_datasets_shares`. In general, all the share lifecycle (create, add
items, approve...) is part of shares_base. listDatasetShareObjects,
verifyDatasetShareObjects used in the S3-Dataset UI are part of
s3_datasets_shares

- TODO: Review tests and create new tests for get_consumption_data.
Currently tests for shares and datasets are placed in the same folder. I
will open a separate PR to order the tests a bit before this

### Relates
- #1283 
- #1123 
- #955 
- #1277 ---> This PR needs
to be merged and then I will introduce some changes in the ShareView.

dlpzx added a commit that referenced this issue Jun 19, 2024
…art 7 (share_object_service) (#1340)

### Feature or Bugfix
- Refactoring

### Detail
As explained in the design for #1123 and #1283 we are trying to
implement generic `datasets_base` and `shares_base` modules that can be
used by any type of datasets and by any type of shareable object in a
generic way.

The goal of this PR is to move the `share_object_service` to
`shares_base` and refactor any dependency to S3 in the service.

- Move file and fix imports of ShareObjectService
- Use DatasetsBase and DatasetsBaseRepository instead of the S3
equivalents
- ⚠️ Avoid Dashboard check logic in
`ShareObjectService.submit_share_object` see below
- ⚠️ Avoid SharePolicyService logic in
`ShareObjectService.create_share_object` see below
- Create ShareLogsService for logs
- Remove unused methods
- I also copied share_item_service to shares_base (it will be used in
next PR)


#### Avoid Dashboard check logic in
`ShareObjectService.submit_share_object`
Currently, whenever a share request is submitted, we check if the
REQUESTER environment has dashboards enabled and if there are shared
tables we verify that the Quicksight subscription is active.

Alternative: perform this check in the share processor of tables. It
solves the issue, but it gives a poorer user experience as it is
difficult to figure out for the requester why the share failed. This can
be solved holistically as requested in
#1168.

Decision: move the logic to the processor and make the table share fail.

#### Avoid SharePolicyService logic in
`ShareObjectService.create_share_object`

When a share request is first created, we perform a series of operations
to ensure that an IAM policy for the share requester principal IAM role
is created.

Alternative 1: move this logic inside the share processor. Not sure if
it is possible. It would be the ideal solution, but the
SharePolicyService throws errors in the create share object API if the
policy is not attached.

Alternative 2: implement interface to define share_policies (similar to
the dataset-actions that use share logic in
`backend/dataall/modules/s3_datasets_shares/services/dataset_sharing_service.py`).

Decision: we want to preserve the user experience of having the IAM
policy created before the share request is processed. Also, this is not
an uncommon pattern, and other dataset types could extend it: for
example, Redshift sharing might need additional share policies. For this
reason, this PR implements alternative 2.
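
As a rough illustration of alternative 2, the sketch below shows one way such an interface could be shaped. All class and method names here (`SharePolicyInterface`, `SharePolicyRegistry`, `ensure_policy`, `S3SharePolicy`) are hypothetical and are not taken from the data.all code base.

```python
from abc import ABC, abstractmethod
from typing import List


class SharePolicyInterface(ABC):
    """Hypothetical hook that each dataset-type module implements."""

    @abstractmethod
    def ensure_policy(self, principal_role: str, environment_uri: str) -> str:
        """Create or reuse the policy for the requester role; return its name."""


class SharePolicyRegistry:
    """Kept in the generic module; it knows nothing about S3 or Redshift."""

    _hooks: List[SharePolicyInterface] = []

    @classmethod
    def register(cls, hook: SharePolicyInterface) -> None:
        cls._hooks.append(hook)

    @classmethod
    def ensure_all(cls, principal_role: str, environment_uri: str) -> List[str]:
        # Called from the generic create-share path, before processing starts.
        return [h.ensure_policy(principal_role, environment_uri) for h in cls._hooks]


class S3SharePolicy(SharePolicyInterface):
    """Stand-in for the S3-specific implementation (no AWS calls here)."""

    def ensure_policy(self, principal_role: str, environment_uri: str) -> str:
        return f'{environment_uri}-share-policy-{principal_role}'


SharePolicyRegistry.register(S3SharePolicy())
```

The generic service only calls `SharePolicyRegistry.ensure_all(...)`, so a new dataset type would add its own hook without touching shares_base.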


### Relates
- #1283 
- #1123 
- #955 

dlpzx added a commit that referenced this issue Jun 19, 2024
…art 8 (share_item_service) (#1350)

### Feature or Bugfix
- Refactoring

### Detail
As explained in the design for #1123 and #1283 we are trying to
implement generic `datasets_base` and `shares_base` modules that can be
used by any type of datasets and by any type of shareable object in a
generic way.

The goal of this PR is to split the `share_item_service` logic into a
generic service in `shares_base` and a specific service in
s3_datasets_shares.
- `ShareItemService` in shares_base only has shareItem logic without
references to S3 or Glue.
- `S3ShareItemService` in s3_datasets_shares has logic for share items
that are tables and folders.

The file names are a bit messy, but I don't want to pollute this PR
with more changes. I'll do a review of the file names in part 10.

### Relates
- #1283 
- #1123 
- #955 

dlpzx added a commit that referenced this issue Jun 21, 2024
…art 9 (share db repositories) (#1351)

### Feature or Bugfix
- Refactoring

### Detail
As explained in the design for #1123 and #1283 we are trying to
implement generic `datasets_base` and `shares_base` modules that can be
used by any type of datasets and by any type of shareable object in a
generic way.

This PR includes:
- Split the ShareobjectRepository from s3_datasets_shares into:
  - `ShareobjectRepository` (shares_base) - generic db operations on share
    objects - no references to S3, Glue
  - `ShareStatusRepository` (shares_base) - db operations related to the
    sharing state machine states - a way to split the db operations into
    smaller chunks
  - `S3ShareobjectRepository` (s3_datasets_shares) - db operations on s3
    share objects - used only in s3_datasets_shares. They might contain
    references to DatasetTables, S3Datasets... They are used in the clean-up
    activities and to count resources in environment.

- Adapt `S3ShareobjectRepository` to S3 objects. For some queries it was
necessary to add filters on the type of share items retrieved, so that if
anyone adds a new share type in the future the code still makes sense.
To add more meaning, some functions are renamed to clearly point
out that they are s3 functions or what they do.

- Make `ShareobjectRepository` completely generic. The following queries
needed extra work:
  - ShareObjectRepository.get_share_item - renamed as
    `get_share_item_details`
  - `list_shareable_items` - split in 2 parts,
    `list_shareable_items_of_type` + `paginated_list_shareable_items`. The
    first function is invoked recursively over the list of share processors:
    instead of querying the DatasetTable, DatasetStorageLocation and
    DatasetBucket we query the shareable_type. The challenge is to get all
    fields from the db Resource object that all of them are built upon. In
    particular the field `itemName` does not match the BucketName (in
    bucket) or the S3Prefix (in folders). For this reason I added a
    migration script to backfill the DatasetBucket.name as
    DatasetBucket.S3BucketName and the DatasetStorageLocation.name with
    DatasetStorageLocation.S3Prefix. `paginated_list_shareable_items` joins
    the list of subqueries, filters and paginates.
  - In verify_dataset_share_objects, instead of using list_shareable_items,
    I replaced it with `ShareObjectRepository.get_all_share_items_in_share`,
    which does not need tables or storage, avoiding the whole S3 logic and
    avoiding unnecessary queries
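
The split of `list_shareable_items` can be pictured with plain Python lists instead of database subqueries. This is only a sketch under that simplification: the function bodies, the row shape (`uri`, `name`) and the sample data are assumptions, not the repository code.

```python
import math
from typing import Dict, List

Item = Dict[str, str]


def list_shareable_items_of_type(item_type: str, rows: List[Dict[str, str]]) -> List[Item]:
    """Project type-specific rows onto the generic columns shared by all types."""
    return [{'itemUri': r['uri'], 'itemType': item_type, 'itemName': r['name']} for r in rows]


def paginated_list_shareable_items(
    subqueries: List[List[Item]], term: str = '', page: int = 1, page_size: int = 2
) -> dict:
    """Union the per-type results, filter by name, and return one page."""
    items = [i for sub in subqueries for i in sub if term.lower() in i['itemName'].lower()]
    start = (page - 1) * page_size
    return {
        'count': len(items),
        'page': page,
        'pages': math.ceil(len(items) / page_size),
        'nodes': items[start:start + page_size],
    }


# Hypothetical in-memory data standing in for the per-type tables.
SHAREABLE_ROWS = {
    'Table': [{'uri': 't1', 'name': 'orders'}, {'uri': 't2', 'name': 'customers'}],
    'StorageLocation': [{'uri': 'f1', 'name': 'raw/orders'}],
    'S3Bucket': [{'uri': 'b1', 'name': 'sales-bucket'}],
}

subqueries = [list_shareable_items_of_type(t, rows) for t, rows in SHAREABLE_ROWS.items()]
first_page = paginated_list_shareable_items(subqueries, page=1, page_size=2)
```

The sketch also shows why the backfill matters: the generic listing filters on a single name field, so every type needs a meaningful value in it.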

- Remove S3 references from shares_base.api.resolvers. Use DatasetBase
and DatasetBaseRepository instead. Remove unused `ShareableObject`.

- I had some problems with circular dependencies so I created the
`ShareProcessorManager` in shares_base for the registration of
processors. The SharingService uses the manager to get all processors.
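
A minimal sketch of how a manager of this kind can break an import cycle: the base module owns the registry, the type-specific module registers itself, and the generic service only talks to the registry. Apart from the name `ShareProcessorManager`, everything here (the definition dataclass, method names, the sample processor) is assumed for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class ShareProcessorDefinition:
    item_type: str                  # e.g. 'Table'
    process: Callable[[str], str]   # hypothetical entry point taking a share URI


class ShareProcessorManager:
    """Lives in the base module and imports nothing from type-specific modules."""

    _definitions: Dict[str, ShareProcessorDefinition] = {}

    @classmethod
    def register_processor(cls, definition: ShareProcessorDefinition) -> None:
        cls._definitions[definition.item_type] = definition

    @classmethod
    def all_processors(cls) -> List[ShareProcessorDefinition]:
        return list(cls._definitions.values())


# What a type-specific module would do when it is loaded.
def process_table_share(share_uri: str) -> str:
    return f'processed tables of {share_uri}'


ShareProcessorManager.register_processor(
    ShareProcessorDefinition('Table', process_table_share)
)


# What the generic sharing service would do.
def process_share(share_uri: str) -> List[str]:
    return [d.process(share_uri) for d in ShareProcessorManager.all_processors()]
```

The dependency now points one way only: the type-specific module imports the manager, and the manager never imports it back.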

Missing items for Part10:
- Lake Formation cross region table references in
shares_base/services/share_item_service.py:add_shared_item
- remove table references in
shares_base/services/share_item_service.py:remove_shared_item
- remove s3_prefix references
shares_base/services/share_notification_service:notify_new_data_available_from_owners
- RENAMING! Right now the names are a bit misleading

### Relates
- #1283 
- #1123 
- #955 


dlpzx added a commit that referenced this issue Jun 25, 2024
…art 10 (other s3 references in shares_base) (#1357)

### Feature or Bugfix
- Refactoring

### Detail
As explained in the design for #1123 and #1283 we are trying to
implement generic `datasets_base` and `shares_base` modules that can be
used by any type of datasets and by any type of shareable object in a
generic way.

This PR:
- Remove the delete_resource_policy conditional for Tables in
`backend/dataall/modules/shares_base/services/share_item_service.py` -->
Permissions to the Table in data.all are granted once the share has
succeeded, the conditional that checks for share_failed tables should
not exist.
- Remove unnecessary check in share_item_service: in add_share_item we
check, if the item is a table, whether it is a cross-region share. This
check is unnecessary because we already check for cross-region when the
share request object is created.
- Use `get_share_item_details` in add_share_item - we want to check if
the table, folder, bucket exist so we need to query those tables.
- Move s3_prefix notifications to subscription task
- Fix error in query in
`backend/dataall/modules/shares_base/db/share_state_machines_repositories.py`


### Relates
- #1283 
- #1123 
- #955 

noah-paige added a commit that referenced this issue Jun 25, 2024
commit 6968e67c 
Author: Noah Paige
Date: Fri May 17 2024 16:12:45 GMT-0400 (Eastern Daylight Time) 

    Get to v2.5.0


commit 93ff7725 
Author: Sofia Sazonova
Date: Mon May 13 2024 08:00:38 GMT-0400 (Eastern Daylight Time) 

    Update version.json (#1264)

Release info update

commit e718d861 
Author: Sofia Sazonova
Date: Mon May 13 2024 07:29:27 GMT-0400 (Eastern Daylight Time) 

    fix permission query (#1263)

### Feature or Bugfix
- Bugfix


### Detail
- The filter is an array of permission NAMES, so to query policies
correctly we need to add a join.
- The filters 'share_type' and 'share_item_status' must be strings.
- IMPORTANT: the "finally" block used the `session` variable, but
`session` was defined only in the "try" block, so the lock failed to be
released.
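
The failure mode in the last point can be reproduced with a small, self-contained example. It is only an illustration of the pattern: `session_scope`, `release_lock` and the `released` list are stand-ins invented for this sketch, not the functions touched by the fix.

```python
from contextlib import contextmanager

released = []  # records which locks were released


@contextmanager
def session_scope():
    yield 'session'  # stand-in for a database session


def release_lock(session, name):
    released.append(name)


def buggy(fail: bool):
    try:
        if fail:
            raise RuntimeError('failed before the session was opened')
        with session_scope() as session:
            pass
    finally:
        # If the try block failed early, `session` was never assigned, so this
        # line raises UnboundLocalError and the lock is never released.
        release_lock(session, 'lock')


def fixed(fail: bool):
    # Open the session outside the try block so it is always defined in finally.
    with session_scope() as session:
        try:
            if fail:
                raise RuntimeError('processing failed')
        finally:
            release_lock(session, 'lock')
```

In `buggy(True)` the original error is also masked by the `UnboundLocalError`, which makes the stuck lock harder to diagnose.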

### Relates
- <URL or Ticket>


---------

Co-authored-by: Sofia Sazonova

commit 479b8f3f 
Author: mourya-33
Date: Wed May 08 2024 10:29:36 GMT-0400 (Eastern Daylight Time) 

    Add encryption and tag immutability to ECR repository (#1224)

### Feature or Bugfix
- Bugfix

### Detail
- Currently, the ECR repository that is created does not have encryption
and tag immutability enabled, which is flagged by checkov scans. This fix
enables both.

### Relates
- https://github.com/data-dot-all/dataall/issues/1200

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
N/A
  - Is the input sanitized? N/A
- What precautions are you taking before deserializing the data you
consume? N/A
  - Is injection prevented by parametrizing queries? N/A
  - Have you ensured no `eval` or similar functions are used? N/A
- Does this PR introduce any functionality or component that requires
authorization? N/A
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
N/A
  - Are you logging failed auth attempts? N/A
- Are you using or adding any cryptographic features? N/A
  - Do you use standard, proven implementations? N/A
- Are the used keys controlled by the customer? Where are they stored?
No. This is with default encryption
- Are you introducing any new policies/roles/users? N/A
  - Have you used the least-privilege principle? How? N/A


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 2f885773 
Author: Sofia Sazonova
Date: Wed May 08 2024 09:22:40 GMT-0400 (Eastern Daylight Time) 

    Multiple permission roots (#1259)

### Feature or Bugfix
- Bugfix


### Detail
- GET_DATASET_TABLE (FOLDER) permissions are granted to the group only
if they are not granted already
- these permissions are removed if group is not admin|steward and there
are no other shares of this item.

### Relates
- #1174


---------

Co-authored-by: Sofia Sazonova

commit c4cc07ee 
Author: Petros Kalos
Date: Wed May 08 2024 08:54:02 GMT-0400 (Eastern Daylight Time) 

    explicitly specify dataset_client s3 endpoint_url (#1260)

* AWS requires that the endpoint_url should be explicitly specified for
some regions
* Remove misleading CORS error message, the upload step can fail for
many reasons
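
For illustration, passing the endpoint explicitly to a boto3 S3 client could look like the sketch below. The helper names are hypothetical, and the URL format assumes the standard commercial AWS partition; other partitions use different domains.

```python
def s3_client_kwargs(region: str) -> dict:
    """Build keyword arguments for boto3.client('s3', ...)."""
    return {
        'region_name': region,
        # Assumption: regional endpoint format of the standard AWS partition.
        'endpoint_url': f'https://s3.{region}.amazonaws.com',
    }


def make_s3_client(region: str):
    # Imported lazily so the helper above can be used without boto3 installed.
    import boto3

    return boto3.client('s3', **s3_client_kwargs(region))
```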

### Feature or Bugfix
- Bugfix

### Detail
Resolves #778 


commit 40defe8e 
Author: dlpzx
Date: Tue May 07 2024 11:52:17 GMT-0400 (Eastern Daylight Time) 

    Generic dataset module and specific s3_datasets module - part 1 (Rename datasets as s3_datasets) (#1250)

### Feature or Bugfix
- Refactoring

### Detail
- Rename `datasets` module to `s3_datasets` module

This PR is the first step to extract a generic datasets_base module that
implements the undifferentiated concepts of Dataset in data.all.
s3_datasets will use this base module to provide the specific
implementation for S3 datasets.

### Relates
- #1123 
- #955 


commit 74a303cb 
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> 
Date: Tue May 07 2024 02:26:09 GMT-0400 (Eastern Daylight Time) 

    Bump werkzeug from 3.0.1 to 3.0.3 in /tests (#1253)

Bumps [werkzeug](https://github.com/pallets/werkzeug) from 3.0.1 to
3.0.3.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/pallets/werkzeug/releases">werkzeug's
releases</a>.</em></p>
<blockquote>
<h2>3.0.3</h2>
<p>This is the Werkzeug 3.0.3 security release, which fixes security
issues and bugs but does not otherwise change behavior and should not
result in breaking changes.</p>
<p>PyPI: <a
href="https://pypi.org/project/Werkzeug/3.0.3/">https://pypi.org/project/Werkzeug/3.0.3/</a>
Changes: <a
href="https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-3">https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-3</a>
Milestone: <a
href="https://github.com/pallets/werkzeug/milestone/35?closed=1">https://github.com/pallets/werkzeug/milestone/35?closed=1</a></p>
<ul>
<li>Only allow <code>localhost</code>, <code>.localhost</code>,
<code>127.0.0.1</code>, or the specified hostname when running the dev
server, to make debugger requests. Additional hosts can be added by
using the debugger middleware directly. The debugger UI makes requests
using the full URL rather than only the path. GHSA-2g68-c3qc-8985</li>
<li>Make reloader more robust when <code>&quot;&quot;</code> is in
<code>sys.path</code>. <a
href="https://redirect.github.com/pallets/werkzeug/issues/2823">#2823</a></li>
<li>Better TLS cert format with <code>adhoc</code> dev certs. <a
href="https://redirect.github.com/pallets/werkzeug/issues/2891">#2891</a></li>
<li>Inform Python &lt; 3.12 how to handle <code>itms-services</code>
URIs correctly, rather than using an overly-broad workaround in Werkzeug
that caused some redirect URIs to be passed on without encoding. <a
href="https://redirect.github.com/pallets/werkzeug/issues/2828">#2828</a></li>
<li>Type annotation for <code>Rule.endpoint</code> and other uses of
<code>endpoint</code> is <code>Any</code>. <a
href="https://redirect.github.com/pallets/werkzeug/issues/2836">#2836</a></li>
</ul>
<h2>3.0.2</h2>
<p>This is a fix release for the 3.0.x feature branch.</p>
<ul>
<li>Changes: <a
href="https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-2">https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-2</a></li>
</ul>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/pallets/werkzeug/blob/main/CHANGES.rst">werkzeug's
changelog</a>.</em></p>
<blockquote>
<h2>Version 3.0.3</h2>
<p>Released 2024-05-05</p>
<ul>
<li>
<p>Only allow <code>localhost</code>, <code>.localhost</code>,
<code>127.0.0.1</code>, or the specified
hostname when running the dev server, to make debugger requests.
Additional
hosts can be added by using the debugger middleware directly. The
debugger
UI makes requests using the full URL rather than only the path.
:ghsa:<code>2g68-c3qc-8985</code></p>
</li>
<li>
<p>Make reloader more robust when <code>&quot;&quot;</code> is in
<code>sys.path</code>. :pr:<code>2823</code></p>
</li>
<li>
<p>Better TLS cert format with <code>adhoc</code> dev certs.
:pr:<code>2891</code></p>
</li>
<li>
<p>Inform Python &lt; 3.12 how to handle <code>itms-services</code> URIs
correctly, rather
than using an overly-broad workaround in Werkzeug that caused some
redirect
URIs to be passed on without encoding. :issue:<code>2828</code></p>
</li>
<li>
<p>Type annotation for <code>Rule.endpoint</code> and other uses of
<code>endpoint</code> is
<code>Any</code>. :issue:<code>2836</code></p>
</li>
</ul>
<h2>Version 3.0.2</h2>
<p>Released 2024-04-01</p>
<ul>
<li>Ensure setting <code>merge_slashes</code> to <code>False</code>
results in <code>NotFound</code> for
repeated-slash requests against single slash routes.
:issue:<code>2834</code></li>
<li>Fix handling of <code>TypeError</code> in
<code>TypeConversionDict.get()</code> to match
<code>ValueError</code>. :issue:<code>2843</code></li>
<li>Fix <code>response_wrapper</code> type check in test client.
:issue:<code>2831</code></li>
<li>Make the return type of <code>MultiPartParser.parse</code> more
precise.
:issue:<code>2840</code></li>
<li>Raise an error if converter arguments cannot be parsed.
:issue:<code>2822</code></li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/pallets/werkzeug/commit/f9995e967979eb694d6b31536cc65314fd7e9c8c"><code>f9995e9</code></a>
release version 3.0.3</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/3386395b24c7371db11a5b8eaac0c91da5362692"><code>3386395</code></a>
Merge pull request from GHSA-2g68-c3qc-8985</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/890b6b62634fa61224222aee31081c61b054ff01"><code>890b6b6</code></a>
only require trusted host for evalex</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/71b69dfb7df3d912e66bab87fbb1f21f83504967"><code>71b69df</code></a>
restrict debugger trusted hosts</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/d2d3869525a4ffb2c41dfb2c0e39d94dab2d870c"><code>d2d3869</code></a>
endpoint type is Any (<a
href="https://redirect.github.com/pallets/werkzeug/issues/2895">#2895</a>)</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/7080b55acd48b68afdda65ee6c7f99e9afafb0ba"><code>7080b55</code></a>
endpoint type is Any</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/7555eff296fbdf12f2e576b6bbb0b506df8417ed"><code>7555eff</code></a>
remove iri_to_uri redirect workaround (<a
href="https://redirect.github.com/pallets/werkzeug/issues/2894">#2894</a>)</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/97fb2f722297ae4e12e36dab024e0acf8477b3c8"><code>97fb2f7</code></a>
remove _invalid_iri_to_uri workaround</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/249527ff981e7aa22cd714825c5637cc92df7761"><code>249527f</code></a>
make cn field a valid single hostname, and use wildcard in SANs field.
(<a
href="https://redirect.github.com/pallets/werkzeug/issues/2892">#2892</a>)</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/793be472c9d145eb9be7d4200672d1806289d84a"><code>793be47</code></a>
update adhoc tls dev cert format</li>
<li>Additional commits viewable in <a
href="https://github.com/pallets/werkzeug/compare/3.0.1...3.0.3">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=werkzeug&package-manager=pip&previous-version=3.0.1&new-version=3.0.3)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
You can disable automated security fix PRs for this repo from the
[Security Alerts
page](https://github.com/data-dot-all/dataall/network/alerts).

</details>

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

commit 2f33320c 
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> 
Date: Tue May 07 2024 02:25:03 GMT-0400 (Eastern Daylight Time) 

    Bump werkzeug from 3.0.1 to 3.0.3 in /backend/dataall/base/cdkproxy (#1252)

Bumps [werkzeug](https://github.com/pallets/werkzeug) from 3.0.1 to
3.0.3.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/pallets/werkzeug/releases">werkzeug's
releases</a>.</em></p>
<blockquote>
<h2>3.0.3</h2>
<p>This is the Werkzeug 3.0.3 security release, which fixes security
issues and bugs but does not otherwise change behavior and should not
result in breaking changes.</p>
<p>PyPI: <a
href="https://pypi.org/project/Werkzeug/3.0.3/">https://pypi.org/project/Werkzeug/3.0.3/</a>
Changes: <a
href="https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-3">https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-3</a>
Milestone: <a
href="https://github.com/pallets/werkzeug/milestone/35?closed=1">https://github.com/pallets/werkzeug/milestone/35?closed=1</a></p>
<ul>
<li>Only allow <code>localhost</code>, <code>.localhost</code>,
<code>127.0.0.1</code>, or the specified hostname when running the dev
server, to make debugger requests. Additional hosts can be added by
using the debugger middleware directly. The debugger UI makes requests
using the full URL rather than only the path. GHSA-2g68-c3qc-8985</li>
<li>Make reloader more robust when <code>&quot;&quot;</code> is in
<code>sys.path</code>. <a
href="https://redirect.github.com/pallets/werkzeug/issues/2823">#2823</a></li>
<li>Better TLS cert format with <code>adhoc</code> dev certs. <a
href="https://redirect.github.com/pallets/werkzeug/issues/2891">#2891</a></li>
<li>Inform Python &lt; 3.12 how to handle <code>itms-services</code>
URIs correctly, rather than using an overly-broad workaround in Werkzeug
that caused some redirect URIs to be passed on without encoding. <a
href="https://redirect.github.com/pallets/werkzeug/issues/2828">#2828</a></li>
<li>Type annotation for <code>Rule.endpoint</code> and other uses of
<code>endpoint</code> is <code>Any</code>. <a
href="https://redirect.github.com/pallets/werkzeug/issues/2836">#2836</a></li>
</ul>
<h2>3.0.2</h2>
<p>This is a fix release for the 3.0.x feature branch.</p>
<ul>
<li>Changes: <a
href="https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-2">https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-2</a></li>
</ul>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/pallets/werkzeug/blob/main/CHANGES.rst">werkzeug's
changelog</a>.</em></p>
<blockquote>
<h2>Version 3.0.3</h2>
<p>Released 2024-05-05</p>
<ul>
<li>
<p>Only allow <code>localhost</code>, <code>.localhost</code>,
<code>127.0.0.1</code>, or the specified
hostname when running the dev server, to make debugger requests.
Additional
hosts can be added by using the debugger middleware directly. The
debugger
UI makes requests using the full URL rather than only the path.
:ghsa:<code>2g68-c3qc-8985</code></p>
</li>
<li>
<p>Make reloader more robust when <code>&quot;&quot;</code> is in
<code>sys.path</code>. :pr:<code>2823</code></p>
</li>
<li>
<p>Better TLS cert format with <code>adhoc</code> dev certs.
:pr:<code>2891</code></p>
</li>
<li>
<p>Inform Python &lt; 3.12 how to handle <code>itms-services</code> URIs
correctly, rather
than using an overly-broad workaround in Werkzeug that caused some
redirect
URIs to be passed on without encoding. :issue:<code>2828</code></p>
</li>
<li>
<p>Type annotation for <code>Rule.endpoint</code> and other uses of
<code>endpoint</code> is
<code>Any</code>. :issue:<code>2836</code></p>
</li>
</ul>
<h2>Version 3.0.2</h2>
<p>Released 2024-04-01</p>
<ul>
<li>Ensure setting <code>merge_slashes</code> to <code>False</code>
results in <code>NotFound</code> for
repeated-slash requests against single slash routes.
:issue:<code>2834</code></li>
<li>Fix handling of <code>TypeError</code> in
<code>TypeConversionDict.get()</code> to match
<code>ValueError</code>. :issue:<code>2843</code></li>
<li>Fix <code>response_wrapper</code> type check in test client.
:issue:<code>2831</code></li>
<li>Make the return type of <code>MultiPartParser.parse</code> more
precise.
:issue:<code>2840</code></li>
<li>Raise an error if converter arguments cannot be parsed.
:issue:<code>2822</code></li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/pallets/werkzeug/commit/f9995e967979eb694d6b31536cc65314fd7e9c8c"><code>f9995e9</code></a>
release version 3.0.3</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/3386395b24c7371db11a5b8eaac0c91da5362692"><code>3386395</code></a>
Merge pull request from GHSA-2g68-c3qc-8985</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/890b6b62634fa61224222aee31081c61b054ff01"><code>890b6b6</code></a>
only require trusted host for evalex</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/71b69dfb7df3d912e66bab87fbb1f21f83504967"><code>71b69df</code></a>
restrict debugger trusted hosts</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/d2d3869525a4ffb2c41dfb2c0e39d94dab2d870c"><code>d2d3869</code></a>
endpoint type is Any (<a
href="https://redirect.github.com/pallets/werkzeug/issues/2895">#2895</a>)</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/7080b55acd48b68afdda65ee6c7f99e9afafb0ba"><code>7080b55</code></a>
endpoint type is Any</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/7555eff296fbdf12f2e576b6bbb0b506df8417ed"><code>7555eff</code></a>
remove iri_to_uri redirect workaround (<a
href="https://redirect.github.com/pallets/werkzeug/issues/2894">#2894</a>)</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/97fb2f722297ae4e12e36dab024e0acf8477b3c8"><code>97fb2f7</code></a>
remove _invalid_iri_to_uri workaround</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/249527ff981e7aa22cd714825c5637cc92df7761"><code>249527f</code></a>
make cn field a valid single hostname, and use wildcard in SANs field.
(<a
href="https://redirect.github.com/pallets/werkzeug/issues/2892">#2892</a>)</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/793be472c9d145eb9be7d4200672d1806289d84a"><code>793be47</code></a>
update adhoc tls dev cert format</li>
<li>Additional commits viewable in <a
href="https://github.com/pallets/werkzeug/compare/3.0.1...3.0.3">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=werkzeug&package-manager=pip&previous-version=3.0.1&new-version=3.0.3)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)


Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

commit 0b49633f 
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> 
Date: Tue May 07 2024 02:24:34 GMT-0400 (Eastern Daylight Time) 

    Bump werkzeug from 3.0.1 to 3.0.3 in /tests_new/integration_tests (#1254)

Bumps [werkzeug](https://github.com/pallets/werkzeug) from 3.0.1 to
3.0.3.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/pallets/werkzeug/releases">werkzeug's
releases</a>.</em></p>
<blockquote>
<h2>3.0.3</h2>
<p>This is the Werkzeug 3.0.3 security release, which fixes security
issues and bugs but does not otherwise change behavior and should not
result in breaking changes.</p>
<p>PyPI: <a
href="https://pypi.org/project/Werkzeug/3.0.3/">https://pypi.org/project/Werkzeug/3.0.3/</a>
Changes: <a
href="https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-3">https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-3</a>
Milestone: <a
href="https://github.com/pallets/werkzeug/milestone/35?closed=1">https://github.com/pallets/werkzeug/milestone/35?closed=1</a></p>
<ul>
<li>Only allow <code>localhost</code>, <code>.localhost</code>,
<code>127.0.0.1</code>, or the specified hostname when running the dev
server, to make debugger requests. Additional hosts can be added by
using the debugger middleware directly. The debugger UI makes requests
using the full URL rather than only the path. GHSA-2g68-c3qc-8985</li>
<li>Make reloader more robust when <code>&quot;&quot;</code> is in
<code>sys.path</code>. <a
href="https://redirect.github.com/pallets/werkzeug/issues/2823">#2823</a></li>
<li>Better TLS cert format with <code>adhoc</code> dev certs. <a
href="https://redirect.github.com/pallets/werkzeug/issues/2891">#2891</a></li>
<li>Inform Python &lt; 3.12 how to handle <code>itms-services</code>
URIs correctly, rather than using an overly-broad workaround in Werkzeug
that caused some redirect URIs to be passed on without encoding. <a
href="https://redirect.github.com/pallets/werkzeug/issues/2828">#2828</a></li>
<li>Type annotation for <code>Rule.endpoint</code> and other uses of
<code>endpoint</code> is <code>Any</code>. <a
href="https://redirect.github.com/pallets/werkzeug/issues/2836">#2836</a></li>
</ul>
<h2>3.0.2</h2>
<p>This is a fix release for the 3.0.x feature branch.</p>
<ul>
<li>Changes: <a
href="https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-2">https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-2</a></li>
</ul>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/pallets/werkzeug/blob/main/CHANGES.rst">werkzeug's
changelog</a>.</em></p>
<blockquote>
<h2>Version 3.0.3</h2>
<p>Released 2024-05-05</p>
<ul>
<li>
<p>Only allow <code>localhost</code>, <code>.localhost</code>,
<code>127.0.0.1</code>, or the specified
hostname when running the dev server, to make debugger requests.
Additional
hosts can be added by using the debugger middleware directly. The
debugger
UI makes requests using the full URL rather than only the path.
:ghsa:<code>2g68-c3qc-8985</code></p>
</li>
<li>
<p>Make reloader more robust when <code>&quot;&quot;</code> is in
<code>sys.path</code>. :pr:<code>2823</code></p>
</li>
<li>
<p>Better TLS cert format with <code>adhoc</code> dev certs.
:pr:<code>2891</code></p>
</li>
<li>
<p>Inform Python &lt; 3.12 how to handle <code>itms-services</code> URIs
correctly, rather
than using an overly-broad workaround in Werkzeug that caused some
redirect
URIs to be passed on without encoding. :issue:<code>2828</code></p>
</li>
<li>
<p>Type annotation for <code>Rule.endpoint</code> and other uses of
<code>endpoint</code> is
<code>Any</code>. :issue:<code>2836</code></p>
</li>
</ul>
<h2>Version 3.0.2</h2>
<p>Released 2024-04-01</p>
<ul>
<li>Ensure setting <code>merge_slashes</code> to <code>False</code>
results in <code>NotFound</code> for
repeated-slash requests against single slash routes.
:issue:<code>2834</code></li>
<li>Fix handling of <code>TypeError</code> in
<code>TypeConversionDict.get()</code> to match
<code>ValueError</code>. :issue:<code>2843</code></li>
<li>Fix <code>response_wrapper</code> type check in test client.
:issue:<code>2831</code></li>
<li>Make the return type of <code>MultiPartParser.parse</code> more
precise.
:issue:<code>2840</code></li>
<li>Raise an error if converter arguments cannot be parsed.
:issue:<code>2822</code></li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/pallets/werkzeug/commit/f9995e967979eb694d6b31536cc65314fd7e9c8c"><code>f9995e9</code></a>
release version 3.0.3</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/3386395b24c7371db11a5b8eaac0c91da5362692"><code>3386395</code></a>
Merge pull request from GHSA-2g68-c3qc-8985</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/890b6b62634fa61224222aee31081c61b054ff01"><code>890b6b6</code></a>
only require trusted host for evalex</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/71b69dfb7df3d912e66bab87fbb1f21f83504967"><code>71b69df</code></a>
restrict debugger trusted hosts</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/d2d3869525a4ffb2c41dfb2c0e39d94dab2d870c"><code>d2d3869</code></a>
endpoint type is Any (<a
href="https://redirect.github.com/pallets/werkzeug/issues/2895">#2895</a>)</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/7080b55acd48b68afdda65ee6c7f99e9afafb0ba"><code>7080b55</code></a>
endpoint type is Any</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/7555eff296fbdf12f2e576b6bbb0b506df8417ed"><code>7555eff</code></a>
remove iri_to_uri redirect workaround (<a
href="https://redirect.github.com/pallets/werkzeug/issues/2894">#2894</a>)</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/97fb2f722297ae4e12e36dab024e0acf8477b3c8"><code>97fb2f7</code></a>
remove _invalid_iri_to_uri workaround</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/249527ff981e7aa22cd714825c5637cc92df7761"><code>249527f</code></a>
make cn field a valid single hostname, and use wildcard in SANs field.
(<a
href="https://redirect.github.com/pallets/werkzeug/issues/2892">#2892</a>)</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/793be472c9d145eb9be7d4200672d1806289d84a"><code>793be47</code></a>
update adhoc tls dev cert format</li>
<li>Additional commits viewable in <a
href="https://github.com/pallets/werkzeug/compare/3.0.1...3.0.3">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=werkzeug&package-manager=pip&previous-version=3.0.1&new-version=3.0.3)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)


Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

commit 08862420 
Author: mourya-33 <[email protected]> 
Date: Tue May 07 2024 02:15:15 GMT-0400 (Eastern Daylight Time) 

    Updated lambda_api.py to add encryption for lambda env vars for custo… (#1255)

Feature or Bugfix

    Bugfix

Detail

The environment variables for the Lambda functions are not encrypted in
the CDK definition, which Checkov scans flag. This fix enables KMS
encryption for the Lambda environment variables.
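
As a rough illustration of the kind of change described above, the following minimal sketch passes a KMS key to a Lambda function through the `environment_encryption` parameter of the AWS CDK v2 `Function` construct. The stack name, construct ids, runtime, inline handler and environment variable are all hypothetical and are not taken from `lambda_api.py`; the actual stack may define its functions differently.

```python
from aws_cdk import Stack, aws_kms as kms, aws_lambda as _lambda
from constructs import Construct


class LambdaApiSketchStack(Stack):
    """Hypothetical stack; every name and setting here is illustrative only."""

    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # KMS key generated by CDK, used to encrypt the function's environment variables.
        env_key = kms.Key(self, "LambdaEnvVarKey", enable_key_rotation=True)

        _lambda.Function(
            self,
            "ApiHandlerSketch",
            runtime=_lambda.Runtime.PYTHON_3_9,
            handler="index.handler",
            code=_lambda.Code.from_inline(
                "def handler(event, context):\n    return {'statusCode': 200}\n"
            ),
            environment={"LOG_LEVEL": "INFO"},  # assumed example variable
            environment_encryption=env_key,  # encrypt env vars with the key above
        )
```

In this sketch a single key could be shared by several functions in the same stack, which would match the PR's statement that the generated keys cover all Lambda functions in the lambda-api stack.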

Relates


Security

Please answer the questions below briefly where applicable, or write
N/A. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)? N/A
  - Is the input sanitized? N/A
- What precautions are you taking before deserializing the data you
consume? N/A
  - Is injection prevented by parametrizing queries? N/A
  - Have you ensured no eval or similar functions are used? N/A
- Does this PR introduce any functionality or component that requires
authorization? N/A
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
N/A
  - Are you logging failed auth attempts? N/A
- Are you using or adding any cryptographic features? N/A
  - Do you use standard, proven implementations? N/A
  - Are the used keys controlled by the customer? Where are they stored?
The KMS keys are generated by CDK and are used to encrypt the environment
variables for all Lambda functions in the lambda-api stack.
- Are you introducing any new policies/roles/users? N/A
  - Have you used the least-privilege principle? How? N/A

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit ed7cc3eb 
Author: Noah Paige <[email protected]> 
Date: Mon May 06 2024 09:32:30 GMT-0400 (Eastern Daylight Time) 

    Add order_by for paginated queries  (#1249)

### Feature or Bugfix
- Bugfix

### Detail
- This PR aims to solve the following two problems:

- (1) For particular queries (those that perform `.outerjoin()`
operations and have their results paginated with the `paginate()`
function), the number of results returned is sometimes *less than* the
`pageSize` limit of the paginate function, even when the total count is
greater than the `pageSize`
- Ex 1: 11 envs total, `query_user_environments()` returns 9 envs on the
1st page + 2 on the 2nd page
- Ex 2: 10 envs total, `query_user_environments()` returns 9 envs on the
1st page + no 2nd page

- We believe this happens because SQLAlchemy "uniques" the records
resulting from an outerjoin before the result is returned to the
frontend

- Adding a `.distinct()` call to the query ensures each distinct record
is returned (tested successfully)

- (2) Currently we often do not apply an `.order_by()` clause to the
query used in `paginate()`, so there is no stable way of preserving the
order of the items returned from a query (i.e. when navigating through
pages of a response)
- It is generally good practice to include an `order_by()` on a column
or set of columns
- For each query used in `paginate()` this PR adds an `order_by()`
clause (full list in comments below)

More context is available in the related issue linked below.
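
The short-page effect described in (1) can be reproduced outside of data.all with a small sketch. It uses the standard-library `sqlite3` module and plain SQL instead of SQLAlchemy, so it only approximates the behaviour the PR describes: the `environment` and `environment_group` tables, the `fetch_page` helper and the Python-side de-duplication step (standing in for the "uniquing" attributed to SQLAlchemy) are all assumptions made for illustration. An `ORDER BY` is applied in both variants so the pages are stable.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database: 4 environments, env 1 belongs to 2 groups."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE environment (id INTEGER PRIMARY KEY, label TEXT);
        CREATE TABLE environment_group (env_id INTEGER, group_name TEXT);
        """
    )
    conn.executemany(
        "INSERT INTO environment VALUES (?, ?)",
        [(i, f"env-{i}") for i in range(1, 5)],
    )
    conn.executemany(
        "INSERT INTO environment_group VALUES (?, ?)",
        [(1, "team-a"), (1, "team-b"), (2, "team-a")],
    )
    return conn


def fetch_page(conn: sqlite3.Connection, page: int, page_size: int, distinct: bool) -> list:
    """Return one page of environments from an outer-joined query."""
    select = "SELECT DISTINCT" if distinct else "SELECT"
    sql = (
        f"{select} e.id, e.label "
        "FROM environment e "
        "LEFT OUTER JOIN environment_group g ON g.env_id = e.id "
        "ORDER BY e.id "  # stable ordering across pages
        "LIMIT ? OFFSET ?"
    )
    rows = conn.execute(sql, (page_size, (page - 1) * page_size)).fetchall()
    # Assumed stand-in for ORM-side de-duplication: duplicates are dropped
    # only AFTER the LIMIT has already been applied to the joined rows.
    return list(dict.fromkeys(rows))


if __name__ == "__main__":
    conn = build_demo_db()
    print(len(fetch_page(conn, 1, 3, distinct=False)))  # 2 items, although pageSize is 3
    print(len(fetch_page(conn, 1, 3, distinct=True)))   # 3 items, a full page
```

Without `DISTINCT`, the limit counts env 1 twice (once per group), so the page holds fewer unique environments than requested. With `DISTINCT`, duplicates are removed before the limit is applied.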

### Relates
- https://github.com/data-dot-all/dataall/issues/1241

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use standard, proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 98e67fa8 
Author: Sofia Sazonova <[email protected]> 
Date: Fri May 03 2024 12:21:57 GMT-0400 (Eastern Daylight Time) 

    fix: DATASET_READ_TABLE read permissions (#1237)

### Feature or Bugfix
- Bugfix


### Detail
- backfill DATASET_READ_TABLE permissions
- delete these permissions when dataset tables are revoked or deleted
### Relates
- #1173



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Co-authored-by: Sofia Sazonova <[email protected]>

commit 18e2f509 
Author: Noah Paige <[email protected]> 
Date: Fri May 03 2024 10:14:52 GMT-0400 (Eastern Daylight Time) 

    Fix local test groups listing for listGroups query (#1239)

### Feature or Bugfix
- Bugfix


### Detail
- Locally, when trying to invite a team to an Env or Org, we call
listGroups, and the returned `LOCAL_TEST_GROUPS` does not have the
expected data type


### Relates
N/A



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit a0be03c4 
Author: dlpzx <[email protected]> 
Date: Fri May 03 2024 10:12:34 GMT-0400 (Eastern Daylight Time) 

    Refactor: uncouple datasets and dataset_sharing modules - part 2-5 FINAL DELETE DATASETS_BASE (#1242)

### Feature or Bugfix
- Refactoring

### Detail
After all the previous PRs are merged, there should be no circular
dependencies between `datasets` and `dataset_sharing`. We can now
proceed to:
- move `datasets_base` models, repositories, permissions and enums to
`datasets`
- adjust the `__init__` files to establish that `dataset_sharing`
depends on `datasets`
- adjust the Module interfaces to ensure that all necessary dataset
models... are imported in the interface for sharing
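
The one-way dependency described above can be pictured with a small sketch. The `ModuleInterface`, `DatasetsModule` and `DatasetSharingModule` classes and the `load_order` helper below are hypothetical, simplified stand-ins written for illustration; they are not data.all's actual module interfaces or loader. The sketch only shows how a declared "sharing depends on datasets" relationship yields a load order, and how a circular dependency would be rejected.

```python
from graphlib import CycleError, TopologicalSorter
from typing import List, Type


class ModuleInterface:
    """Hypothetical, simplified stand-in for a module interface."""

    @staticmethod
    def depends_on() -> List[Type["ModuleInterface"]]:
        # By default a module depends on nothing.
        return []


class DatasetsModule(ModuleInterface):
    """Stand-in for the `datasets` module: no dependency on sharing."""


class DatasetSharingModule(ModuleInterface):
    """Stand-in for `dataset_sharing`, which depends on `datasets`."""

    @staticmethod
    def depends_on() -> List[Type[ModuleInterface]]:
        return [DatasetsModule]


def load_order(modules: List[Type[ModuleInterface]]) -> List[str]:
    """Return module names so that dependencies come before dependents."""
    graph = {module: set(module.depends_on()) for module in modules}
    try:
        return [m.__name__ for m in TopologicalSorter(graph).static_order()]
    except CycleError as error:
        raise ValueError(f"circular module dependency: {error.args[1]}") from error


if __name__ == "__main__":
    print(load_order([DatasetSharingModule, DatasetsModule]))
    # ['DatasetsModule', 'DatasetSharingModule']
```

If `DatasetsModule` also declared a dependency on `DatasetSharingModule`, `load_order` would raise an error, which is the kind of circular dependency this series of PRs removes.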


Next steps:
- share_notifications parameter to dataset_sharing in config.json

### Relates
#955 



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit b68b40c1 
Author: Sofia Sazonova <[email protected]> 
Date: Fri May 03 2024 10:12:11 GMT-0400 (Eastern Daylight Time) 

    bugfix: EnvironmentGroup can remove other groups (#1234)

### Feature or Bugfix
- Bugfix


### Detail
- A group that cannot update other groups can no longer remove them
either.

### Relates
- #1212 

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Co-authored-by: Sofia Sazonova <[email protected]>

commit 264539b5 
Author: Noah Paige <[email protected]> 
Date: Fri May 03 2024 05:23:11 GMT-0400 (Eastern Daylight Time) 

    Fix Alembic Migration: has table checks (#1240)

### Feature or Bugfix
- Bugfix

### Detail
- Fix the `has_table()` check so that the tables are dropped if they exist
as part of the Alembic migration upgrade
- Fix `DatasetLock nullable=True`

### Relates
- https://github.com/data-dot-all/dataall/issues/1165

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)? No
  - Is the input sanitized? N/A
- What precautions are you taking before deserializing the data you
consume? N/A
  - Is injection prevented by parametrizing queries? N/A
  - Have you ensured no `eval` or similar functions are used? N/A
- Does this PR introduce any functionality or component that requires
authorization? No
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
N/A
  - Are you logging failed auth attempts? N/A
- Are you using or adding any cryptographic features? No
  - Do you use a standard proven implementations? N/A
- Are the used keys controlled by the customer? Where are they stored?
N/A
- Are you introducing any new policies/roles/users? No
  - Have you used the least-privilege principle? How? N/A


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 42a5f6bd 
Author: dlpzx <[email protected]> 
Date: Fri May 03 2024 02:24:09 GMT-0400 (Eastern Daylight Time) 

    Refactor: uncouple datasets and dataset_sharing modules - part 2-4 (#1214)

### Feature or Bugfix
- Refactoring
⚠️ MERGE AFTER https://github.com/data-dot-all/dataall/pull/1213

### Detail
This is needed as explained in full PR [AFTER 2.4] Refactor: uncouple
datasets and dataset_sharing modules #1179
- [X] Use interface to resolve dataset roles related to datasets shared
and implement logic in the dataset_sharing module
- [X] Extend and clean-up stewards share permissions through interface

### Relates
- #1179 

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 6d3f2d45 
Author: Sofia Sazonova <[email protected]> 
Date: Thu May 02 2024 10:55:00 GMT-0400 (Eastern Daylight Time) 

    [After 2.4]Core Refactoring part5 (#1194)

### Feature or Bugfix
- Refactoring

### Detail
- focus on core/environments
- move logic from resolvers to services
- create s3_client in base/aws --> TO BE REFACTORED. Needs to be merged
with dataset_sharing/aws/s3_client

### Relates
- #741 

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Co-authored-by: Sofia Sazonova <[email protected]>

commit 2ea24cbb 
Author: dlpzx <[email protected]> 
Date: Thu May 02 2024 08:22:12 GMT-0400 (Eastern Daylight Time) 

    Refactor: uncouple datasets and dataset_sharing modules - part 2-3 (#1213)

### Feature or Bugfix
- Refactoring
⚠️ MERGE AFTER https://github.com/data-dot-all/dataall/pull/1187

### Detail
This is needed as explained in full PR [AFTER 2.4] Refactor: uncouple
datasets and dataset_sharing modules #1179

- [X] Create an interface to execute checks and clean-ups of data
sharing objects when dataset objects are deleted (initially it was going
to be a db interface, but I think it is better in the service)
- [X] Move listDatasetShares query to dataset_sharing module in
https://github.com/data-dot-all/dataall/pull/1185
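
The deletion interface described in the first item could look roughly like the sketch below. All names here (`DatasetDeletionHook`, `DatasetService.register`, `SharingDeletionHook`) are hypothetical and chosen for illustration; the actual interface in the PR may differ. The idea is that the datasets service only knows the abstract hook, while the sharing module supplies the implementation.

```python
from abc import ABC, abstractmethod
from typing import List, Set


class DatasetDeletionHook(ABC):
    """Hypothetical interface implemented by modules that depend on datasets."""

    @abstractmethod
    def check_before_delete(self, dataset_uri: str) -> None:
        """Raise an error if the dataset must not be deleted."""

    @abstractmethod
    def cleanup_after_delete(self, dataset_uri: str) -> None:
        """Remove objects that referenced the deleted dataset."""


class DatasetService:
    """Hypothetical service; it never imports the sharing module."""

    _hooks: List[DatasetDeletionHook] = []

    @classmethod
    def register(cls, hook: DatasetDeletionHook) -> None:
        cls._hooks.append(hook)

    @classmethod
    def delete_dataset(cls, dataset_uri: str) -> bool:
        for hook in cls._hooks:
            hook.check_before_delete(dataset_uri)
        # The dataset itself would be deleted here.
        for hook in cls._hooks:
            hook.cleanup_after_delete(dataset_uri)
        return True


class SharingDeletionHook(DatasetDeletionHook):
    """Hypothetical implementation living in the sharing module."""

    def __init__(self, datasets_with_active_shares: Set[str]) -> None:
        self.active = set(datasets_with_active_shares)
        self.cleaned: List[str] = []

    def check_before_delete(self, dataset_uri: str) -> None:
        if dataset_uri in self.active:
            raise ValueError(f'{dataset_uri} still has active shares')

    def cleanup_after_delete(self, dataset_uri: str) -> None:
        self.cleaned.append(dataset_uri)
```

Because the sharing module registers itself with the service, the dependency points only from sharing to datasets.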

### Relates
-  #1179

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 750a5ec8 
Author: Anushka Singh <[email protected]> 
Date: Wed May 01 2024 12:28:18 GMT-0400 (Eastern Daylight Time) 

    Feature:1221 - Make visibility of auto-approval toggle configurable based on confidentiality (#1223)

### Feature or Bugfix

- Feature


### Detail
- Users should be able to hide the auto-approval toggle through code.
For example, at our company, we require that shares always go
through the approval process if their confidentiality classification is
Secret. We don't even want to give users the option to enable
autoApproval, so that they don't do so by mistake and end up
oversharing.
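
The rule above can be sketched as follows. This is an assumption-laden illustration rather than the code in this PR: `HIDE_TOGGLE_FOR`, `show_auto_approval_toggle` and `effective_auto_approval` are hypothetical names, and the real change configures the frontend.

```python
from typing import FrozenSet

# Hypothetical configuration: classifications for which the toggle is hidden.
HIDE_TOGGLE_FOR: FrozenSet[str] = frozenset({'Secret'})


def show_auto_approval_toggle(
    confidentiality: str, hidden_for: FrozenSet[str] = HIDE_TOGGLE_FOR
) -> bool:
    """Return True if the auto-approval toggle should be visible."""
    return confidentiality not in hidden_for


def effective_auto_approval(
    confidentiality: str, requested: bool, hidden_for: FrozenSet[str] = HIDE_TOGGLE_FOR
) -> bool:
    """Ignore a requested auto-approval when the toggle is hidden."""
    return requested and show_auto_approval_toggle(confidentiality, hidden_for)
```

Applying the same rule when the value is saved, and not only when the form is rendered, would keep a hidden toggle from being bypassed.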

Video demo:
https://github.com/data-dot-all/dataall/issues/1221#issuecomment-2077412044

### Relates
- https://github.com/data-dot-all/dataall/issues/1221

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 82044689 
Author: dlpzx <[email protected]> 
Date: Wed May 01 2024 12:26:42 GMT-0400 (Eastern Daylight Time) 

    Refactor: uncouple datasets and dataset_sharing modules - part 2-2 (#1187)

### Feature or Bugfix
- Refactoring
⚠️ MERGE AFTER https://github.com/data-dot-all/dataall/pull/1185

### Detail
This is needed as explained in full PR [AFTER 2.4] Refactor: uncouple
datasets and dataset_sharing modules #1179
- Split the getDatasetAssumeRole API into 2 APIs, one for dataset owner
roles (in the datasets module) and another one for share requester roles
(in the dataset_sharing module)

### Relates
-  #1179

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 5173419f 
Author: Noah Paige <[email protected]> 
Date: Wed May 01 2024 12:24:42 GMT-0400 (Eastern Daylight Time) 

    Fix so listValidEnvironments called only once (#1238)

### Feature or Bugfix
- Bugfix

### Detail
- When requesting access to a share on data.all, the
`listValidEnvironments` query used to be called twice, which (depending
on how long the query results took to return) could cause the initially
selected environment to disappear


### Relates
- Continuation of https://github.com/data-dot-all/dataall/issues/916

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 7656ea86 
Author: dlpzx <[email protected]> 
Date: Tue Apr 30 2024 07:13:01 GMT-0400 (Eastern Daylight Time) 

    Add integration tests on a real API client and integrate the tests in CICD (#1219)

### Feature or Bugfix
- Feature

### Detail
Add integration tests that use a real Client to execute different
validation actions.

- Define the Client and the way API calls are posted to API Gateway in
the conftest
- Define the Cognito users and the different fixtures needed for all
tests
- Write tests for the Organization core module as example
- Add feature flag in `cdk.json` called `with_approval_tests` that can
be defined at the deployment environment level. If set to True, a
CodeBuild stage running the tests is created.

### Relates
- https://github.com/data-dot-all/dataall/issues/1220

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit b963fe81 
Author: Sofia Sazonova <[email protected]> 
Date: Mon Apr 29 2024 09:26:36 GMT-0400 (Eastern Daylight Time) 

    Notification link routes to a share request page (#1227)

### Feature or Bugfix
- Feature

### Detail
- the notification object's `target_uri` field has the form
`'shareUri|DataSetUri'`
- this value is parsed and used to redirect the user to the relevant
Share Request page
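
Parsing the `target_uri` value could be sketched as below. The actual parsing happens in the frontend; this Python version only illustrates the format, and the `parse_target_uri` and `share_request_path` helpers as well as the `/console/shares/` route are hypothetical.

```python
from typing import Tuple


def parse_target_uri(target_uri: str) -> Tuple[str, str]:
    """Split 'shareUri|DataSetUri' into its two parts."""
    share_uri, separator, dataset_uri = target_uri.partition('|')
    if not separator or not share_uri or not dataset_uri:
        raise ValueError(f'unexpected target_uri format: {target_uri!r}')
    return share_uri, dataset_uri


def share_request_path(target_uri: str) -> str:
    """Build the page path for the share request (route is an assumption)."""
    share_uri, _dataset_uri = parse_target_uri(target_uri)
    return f'/console/shares/{share_uri}'
```

Rejecting values without a separator keeps older notifications that carry only one URI from producing a broken link.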

### Relates
- #1115 

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

Co-authored-by: Sofia Sazonova <[email protected]>

commit 6386fe14 
Author: dlpzx <[email protected]> 
Date: Mon Apr 29 2024 07:32:00 GMT-0400 (Eastern Daylight Time) 

    Refactor: uncouple datasets and dataset_sharing modules - part 2 (#1185)

### Feature or Bugfix
- Refactoring

### Detail

Remove logic from the datasets module or move it to the dataset_sharing
module. This is needed as explained in full PR [AFTER 2.4] Refactor:
uncouple datasets and dataset_sharing modules #1179
- [X] Move the verify dataset shares mutation to the dataset_sharing
module
- [X] Move dataset_subscription task to dataset_sharing
- [X] Move listDatasetShares query to dataset_sharing module
- [X] Remove the `shares` field from the Dataset graphql type, as it
was not used in the frontend: listDatasets, listOwnedDatasets,
listDatasetsOwnedByEnvGroup, listDatasetsCreatedInEnvironment and
getDataset
- [x] Move getSharedDatasetTables to dataset_sharing module and fix
reference to DatasetService

I am aware that some of the queries and mutations that this PR moves
look a bit odd in the dataset_sharing module, but this will be solved
once data sharing is divided into dataset_sharing_base and
s3_dataset_sharing.


### Relates
#1179

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
aut…
noah-paige added a commit that referenced this issue Jun 25, 2024
commit a06c8cba 
Author: Noah Paige <[email protected]> 
Date: Fri May 17 2024 16:37:05 GMT-0400 (Eastern Daylight Time) 

    Merge share logs PR


commit aee98cf7 
Author: Noah Paige <[email protected]> 
Date: Fri May 17 2024 16:34:24 GMT-0400 (Eastern Daylight Time) 

    Merge share logs PR


commit 5ca55303 
Author: Sofia Sazonova <[email protected]> 
Date: Wed May 15 2024 11:57:41 GMT-0400 (Eastern Daylight Time) 

    remove unused imports


commit 8f8bf3dd 
Author: Sofia Sazonova <[email protected]> 
Date: Wed May 15 2024 11:56:39 GMT-0400 (Eastern Daylight Time) 

    restrict access to the share logs


commit 9137da9b 
Author: Sofia Sazonova <[email protected]> 
Date: Wed May 15 2024 11:28:32 GMT-0400 (Eastern Daylight Time) 

    share Logs button is available only for dataset Admins and stewards


commit fcb16bd9 
Author: Sofia Sazonova <[email protected]> 
Date: Wed May 15 2024 10:46:25 GMT-0400 (Eastern Daylight Time) 

    getShareLogs query


commit 0503a3bb 
Author: Sofia Sazonova <[email protected]> 
Date: Wed May 15 2024 10:21:25 GMT-0400 (Eastern Daylight Time) 

    Logs modal in Share View


commit bab2f3e6 
Author: Sofia Sazonova <[email protected]> 
Date: Mon May 13 2024 09:09:18 GMT-0400 (Eastern Daylight Time) 

    Add confirmation pop-ups for deletion of team roles and groups (#1231)

### Feature or Bugfix

- Feature



### Detail
Confirmation pop-ups added for:
- deletion of a team from an environment
- deletion of a consumption role
- deletion of a group from an Organization

### Relates
- #942 

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

Co-authored-by: Sofia Sazonova <[email protected]>

commit 93ff7725 
Author: Sofia Sazonova <[email protected]> 
Date: Mon May 13 2024 08:00:38 GMT-0400 (Eastern Daylight Time) 

    Update version.json (#1264)

Release info update

commit e718d861 
Author: Sofia Sazonova <[email protected]> 
Date: Mon May 13 2024 07:29:27 GMT-0400 (Eastern Daylight Time) 

    fix permission query (#1263)

### Feature or Bugfix
- Bugfix


### Detail
- The filter is an array of permission NAMES, so a join is needed to
query policies correctly
- The filters 'share_type' and 'share_item_status' must be strings
- IMPORTANT: the "finally" block used the `session` variable, but
`session` was defined only in the "try" block, so the lock failed to be
released.
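
The lock-release problem in the last item can be reproduced with a minimal sketch. It is not the data.all code: `FakeEngine`, `acquire_lock`, `release_lock`, `process_buggy` and `process_fixed` are hypothetical stand-ins that only show why a name bound inside "try" cannot be relied on in "finally".

```python
from contextlib import contextmanager


class FakeEngine:
    """Hypothetical stand-in for a database engine that tracks held locks."""

    def __init__(self):
        self.locks = set()

    @contextmanager
    def scoped_session(self):
        yield self  # the engine doubles as the session in this sketch


def acquire_lock(session, uri):
    session.locks.add(uri)


def release_lock(session, uri):
    session.locks.discard(uri)


def process_buggy(engine, uri, work):
    with engine.scoped_session() as first_session:
        acquire_lock(first_session, uri)
    try:
        work()  # may raise before `session` is bound below
        with engine.scoped_session() as session:
            pass
    finally:
        # If work() raised, `session` is unbound and this line fails,
        # so the lock is never released.
        release_lock(session, uri)


def process_fixed(engine, uri, work):
    with engine.scoped_session() as session:
        acquire_lock(session, uri)
    try:
        work()
    finally:
        # Open a session inside "finally" so the release never depends
        # on a name that was bound in the "try" block.
        with engine.scoped_session() as session:
            release_lock(session, uri)
```

In the buggy variant the failure inside "finally" also replaces the original exception, which makes the root cause harder to see in logs.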

### Relates
- <URL or Ticket>

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Co-authored-by: Sofia Sazonova <[email protected]>

commit 479b8f3f 
Author: mourya-33 <[email protected]> 
Date: Wed May 08 2024 10:29:36 GMT-0400 (Eastern Daylight Time) 

    Add encryption and tag immutability to ECR repository (#1224)

### Feature or Bugfix
- Bugfix

### Detail
- Currently the ECR repository that is created does not have encryption
and tag immutability enabled, which is flagged by Checkov scans. This fix
enables both.

### Relates
- https://github.com/data-dot-all/dataall/issues/1200

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
N/A
  - Is the input sanitized? N/A
- What precautions are you taking before deserializing the data you
consume? N/A
  - Is injection prevented by parametrizing queries? N/A
  - Have you ensured no `eval` or similar functions are used? N/A
- Does this PR introduce any functionality or component that requires
authorization? N/A
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
N/A
  - Are you logging failed auth attempts? N/A
- Are you using or adding any cryptographic features? N/A
  - Do you use a standard proven implementations? N/A
- Are the used keys controlled by the customer? Where are they stored?
No. This is with default encryption
- Are you introducing any new policies/roles/users? N/A
  - Have you used the least-privilege principle? How? N/A


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 2f885773 
Author: Sofia Sazonova <[email protected]> 
Date: Wed May 08 2024 09:22:40 GMT-0400 (Eastern Daylight Time) 

    Multiple permission roots (#1259)

### Feature or Bugfix
- Bugfix


### Detail
- GET_DATASET_TABLE (FOLDER) permissions are granted to the group only
if they are not granted already
- these permissions are removed if group is not admin|steward and there
are no other shares of this item.
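
The grant and revoke rules above can be sketched as follows. This is a simplified illustration with hypothetical helpers (`grant_if_missing`, `revoke_if_unused`) and an in-memory set in place of the permission tables.

```python
from typing import Set, Tuple

Grant = Tuple[str, str]  # (group, item_uri)


def grant_if_missing(granted: Set[Grant], group: str, item_uri: str) -> bool:
    """Grant the permission only if it is not granted already."""
    key = (group, item_uri)
    if key in granted:
        return False
    granted.add(key)
    return True


def revoke_if_unused(
    granted: Set[Grant],
    group: str,
    item_uri: str,
    privileged_groups: Set[str],
    other_shares_of_item: int,
) -> bool:
    """Revoke only if the group is not admin/steward and no other share uses the item."""
    if group in privileged_groups or other_shares_of_item > 0:
        return False
    granted.discard((group, item_uri))
    return True
```

Keeping both checks in one place avoids removing a permission that another share, or the group's admin or steward role, still relies on.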

### Relates
- #1174

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Co-authored-by: Sofia Sazonova <[email protected]>

commit c4cc07ee 
Author: Petros Kalos <[email protected]> 
Date: Wed May 08 2024 08:54:02 GMT-0400 (Eastern Daylight Time) 

    explicitly specify dataset_client s3 endpoint_url (#1260)

* AWS requires the endpoint_url to be explicitly specified for
some regions
* Remove the misleading CORS error message, since the upload step can
fail for many reasons

### Feature or Bugfix
- Bugfix

### Detail
Resolves #778 

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 40defe8e 
Author: dlpzx <[email protected]> 
Date: Tue May 07 2024 11:52:17 GMT-0400 (Eastern Daylight Time) 

    Generic dataset module and specific s3_datasets module - part 1 (Rename datasets as s3_datasets) (#1250)

### Feature or Bugfix
- Refactoring

### Detail
- Rename `datasets` module to `s3_datasets` module

This PR is the first step to extract a generic datasets_base module that
implements the undifferentiated concepts of Dataset in data.all.
s3_datasets will build on this base module to provide the implementation
specific to S3 datasets.

### Relates
- #1123 
- #955 

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 74a303cb 
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> 
Date: Tue May 07 2024 02:26:09 GMT-0400 (Eastern Daylight Time) 

    Bump werkzeug from 3.0.1 to 3.0.3 in /tests (#1253)

Bumps [werkzeug](https://github.com/pallets/werkzeug) from 3.0.1 to
3.0.3.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/pallets/werkzeug/releases">werkzeug's
releases</a>.</em></p>
<blockquote>
<h2>3.0.3</h2>
<p>This is the Werkzeug 3.0.3 security release, which fixes security
issues and bugs but does not otherwise change behavior and should not
result in breaking changes.</p>
<p>PyPI: <a
href="https://pypi.org/project/Werkzeug/3.0.3/">https://pypi.org/project/Werkzeug/3.0.3/</a>
Changes: <a
href="https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-3">https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-3</a>
Milestone: <a
href="https://github.com/pallets/werkzeug/milestone/35?closed=1">https://github.com/pallets/werkzeug/milestone/35?closed=1</a></p>
<ul>
<li>Only allow <code>localhost</code>, <code>.localhost</code>,
<code>127.0.0.1</code>, or the specified hostname when running the dev
server, to make debugger requests. Additional hosts can be added by
using the debugger middleware directly. The debugger UI makes requests
using the full URL rather than only the path. GHSA-2g68-c3qc-8985</li>
<li>Make reloader more robust when <code>&quot;&quot;</code> is in
<code>sys.path</code>. <a
href="https://redirect.github.com/pallets/werkzeug/issues/2823">#2823</a></li>
<li>Better TLS cert format with <code>adhoc</code> dev certs. <a
href="https://redirect.github.com/pallets/werkzeug/issues/2891">#2891</a></li>
<li>Inform Python &lt; 3.12 how to handle <code>itms-services</code>
URIs correctly, rather than using an overly-broad workaround in Werkzeug
that caused some redirect URIs to be passed on without encoding. <a
href="https://redirect.github.com/pallets/werkzeug/issues/2828">#2828</a></li>
<li>Type annotation for <code>Rule.endpoint</code> and other uses of
<code>endpoint</code> is <code>Any</code>. <a
href="https://redirect.github.com/pallets/werkzeug/issues/2836">#2836</a></li>
</ul>
<h2>3.0.2</h2>
<p>This is a fix release for the 3.0.x feature branch.</p>
<ul>
<li>Changes: <a
href="https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-2">https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-2</a></li>
</ul>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/pallets/werkzeug/blob/main/CHANGES.rst">werkzeug's
changelog</a>.</em></p>
<blockquote>
<h2>Version 3.0.3</h2>
<p>Released 2024-05-05</p>
<ul>
<li>
<p>Only allow <code>localhost</code>, <code>.localhost</code>,
<code>127.0.0.1</code>, or the specified
hostname when running the dev server, to make debugger requests.
Additional
hosts can be added by using the debugger middleware directly. The
debugger
UI makes requests using the full URL rather than only the path.
:ghsa:<code>2g68-c3qc-8985</code></p>
</li>
<li>
<p>Make reloader more robust when <code>&quot;&quot;</code> is in
<code>sys.path</code>. :pr:<code>2823</code></p>
</li>
<li>
<p>Better TLS cert format with <code>adhoc</code> dev certs.
:pr:<code>2891</code></p>
</li>
<li>
<p>Inform Python &lt; 3.12 how to handle <code>itms-services</code> URIs
correctly, rather
than using an overly-broad workaround in Werkzeug that caused some
redirect
URIs to be passed on without encoding. :issue:<code>2828</code></p>
</li>
<li>
<p>Type annotation for <code>Rule.endpoint</code> and other uses of
<code>endpoint</code> is
<code>Any</code>. :issue:<code>2836</code></p>
</li>
</ul>
<h2>Version 3.0.2</h2>
<p>Released 2024-04-01</p>
<ul>
<li>Ensure setting <code>merge_slashes</code> to <code>False</code>
results in <code>NotFound</code> for
repeated-slash requests against single slash routes.
:issue:<code>2834</code></li>
<li>Fix handling of <code>TypeError</code> in
<code>TypeConversionDict.get()</code> to match
<code>ValueError</code>. :issue:<code>2843</code></li>
<li>Fix <code>response_wrapper</code> type check in test client.
:issue:<code>2831</code></li>
<li>Make the return type of <code>MultiPartParser.parse</code> more
precise.
:issue:<code>2840</code></li>
<li>Raise an error if converter arguments cannot be parsed.
:issue:<code>2822</code></li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/pallets/werkzeug/commit/f9995e967979eb694d6b31536cc65314fd7e9c8c"><code>f9995e9</code></a>
release version 3.0.3</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/3386395b24c7371db11a5b8eaac0c91da5362692"><code>3386395</code></a>
Merge pull request from GHSA-2g68-c3qc-8985</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/890b6b62634fa61224222aee31081c61b054ff01"><code>890b6b6</code></a>
only require trusted host for evalex</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/71b69dfb7df3d912e66bab87fbb1f21f83504967"><code>71b69df</code></a>
restrict debugger trusted hosts</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/d2d3869525a4ffb2c41dfb2c0e39d94dab2d870c"><code>d2d3869</code></a>
endpoint type is Any (<a
href="https://redirect.github.com/pallets/werkzeug/issues/2895">#2895</a>)</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/7080b55acd48b68afdda65ee6c7f99e9afafb0ba"><code>7080b55</code></a>
endpoint type is Any</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/7555eff296fbdf12f2e576b6bbb0b506df8417ed"><code>7555eff</code></a>
remove iri_to_uri redirect workaround (<a
href="https://redirect.github.com/pallets/werkzeug/issues/2894">#2894</a>)</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/97fb2f722297ae4e12e36dab024e0acf8477b3c8"><code>97fb2f7</code></a>
remove _invalid_iri_to_uri workaround</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/249527ff981e7aa22cd714825c5637cc92df7761"><code>249527f</code></a>
make cn field a valid single hostname, and use wildcard in SANs field.
(<a
href="https://redirect.github.com/pallets/werkzeug/issues/2892">#2892</a>)</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/793be472c9d145eb9be7d4200672d1806289d84a"><code>793be47</code></a>
update adhoc tls dev cert format</li>
<li>Additional commits viewable in <a
href="https://github.com/pallets/werkzeug/compare/3.0.1...3.0.3">compare
view</a></li>
</ul>
</details>

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

commit 2f33320c 
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> 
Date: Tue May 07 2024 02:25:03 GMT-0400 (Eastern Daylight Time) 

    Bump werkzeug from 3.0.1 to 3.0.3 in /backend/dataall/base/cdkproxy (#1252)

Bumps [werkzeug](https://github.com/pallets/werkzeug) from 3.0.1 to
3.0.3.

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

commit 0b49633f 
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> 
Date: Tue May 07 2024 02:24:34 GMT-0400 (Eastern Daylight Time) 

    Bump werkzeug from 3.0.1 to 3.0.3 in /tests_new/integration_tests (#1254)

Bumps [werkzeug](https://github.com/pallets/werkzeug) from 3.0.1 to
3.0.3.

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

commit 08862420 
Author: mourya-33 <[email protected]> 
Date: Tue May 07 2024 02:15:15 GMT-0400 (Eastern Daylight Time) 

    Updated lambda_api.py to add encryption for lambda env vars for custo… (#1255)

Feature or Bugfix

    Bugfix

Detail

The environment variables of the Lambda functions are not encrypted in
the CDK stack, which Checkov scans flag. This fix enables KMS
encryption for the Lambda environment variables.
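
As a rough illustration of what such a scan looks for, the sketch below
checks a CloudFormation-style Lambda resource for a `KmsKeyArn` whenever
environment variables are present. The checker function, the resource
dictionaries and the key ARN are hypothetical; they are not taken from
Checkov or from lambda_api.py. In CDK, the `environment_encryption`
property of a Lambda `Function` is what should end up as `KmsKeyArn` in
the synthesized template.

```python
from typing import Any, Dict


def lambda_env_is_encrypted(resource: Dict[str, Any]) -> bool:
    """Return True unless a Lambda function has env vars but no KMS key."""
    if resource.get("Type") != "AWS::Lambda::Function":
        return True  # not a Lambda function, nothing to check
    props = resource.get("Properties", {})
    variables = props.get("Environment", {}).get("Variables", {})
    # Env vars without a customer-managed key are what the scan would flag.
    return not variables or bool(props.get("KmsKeyArn"))


# Hypothetical resource as it might appear before the fix.
unencrypted = {
    "Type": "AWS::Lambda::Function",
    "Properties": {
        "Handler": "api_handler.handler",
        "Environment": {"Variables": {"envname": "dev"}},
    },
}

# Same resource after the fix; the ARN is a made-up placeholder.
encrypted = {
    "Type": "AWS::Lambda::Function",
    "Properties": {
        **unencrypted["Properties"],
        "KmsKeyArn": "arn:aws:kms:eu-west-1:111122223333:key/example-key-id",
    },
}

print(lambda_env_is_encrypted(unencrypted))  # False
print(lambda_env_is_encrypted(encrypted))    # True
```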

Relates


Security

Please answer the questions below briefly where applicable, or write
N/A. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)? N/A
  - Is the input sanitized? N/A
- What precautions are you taking before deserializing the data you
consume? N/A
  - Is injection prevented by parametrizing queries? N/A
  - Have you ensured no eval or similar functions are used? N/A
- Does this PR introduce any functionality or component that requires
authorization? N/A
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
N/A
  - Are you logging failed auth attempts? N/A
- Are you using or adding any cryptographic features? N/A
  - Do you use standard, proven implementations? N/A
  - Are the used keys controlled by the customer? Where are they stored?
The KMS keys are generated by CDK and are used to encrypt the
environment variables for all Lambda functions in the lambda-api stack.
- Are you introducing any new policies/roles/users? N/A
  - Have you used the least-privilege principle? How? N/A

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit ed7cc3eb 
Author: Noah Paige <[email protected]> 
Date: Mon May 06 2024 09:32:30 GMT-0400 (Eastern Daylight Time) 

    Add order_by for paginated queries  (#1249)

### Feature or Bugfix
- Bugfix

### Detail
- This PR aims to solve the following:

- (1) For particular queries (those that perform `.outerjoin()`
operations and have their results paginated with the `paginate()`
function), the number of returned results is sometimes *less than* the
limit set by the `pageSize` of the paginate function, even when the
total count is greater than the `pageSize`
- Ex 1: 11 envs total, `query_user_environments()` returning 9 envs on
1st page + 2 on 2nd page
- Ex 2: 10 envs total, `query_user_environments()` returning 9 envs on
1st page + no 2nd page

- We believe this happens because of the way SQLAlchemy "uniques" the
records resulting from an outerjoin before that result is returned to
the frontend

- Adding a `.distinct()` call to the query ensures each distinct record
is returned (tested successfully)

- (2) Currently we often do not implement an `.order_by()` condition
for the query used in `paginate()`, so there is no stable way of
preserving the order of the items returned from a query (i.e. when
navigating through pages of the response)
- It is generally good practice to include an `order_by()` on a column
or set of columns
- For each query used in `paginate()` this PR adds an `order_by()`
condition (full list in comments below)

More context is available in the related issue linked below.
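
The short-page behaviour from Ex 1 can be reproduced outside data.all.
The sketch below is a minimal illustration using the standard-library
`sqlite3` module rather than SQLAlchemy; the table names, the group
memberships and both helper functions are hypothetical, and the
de-duplication step only imitates the suspected "uniquing" described
above. It assumes one environment matches two rows on the outer join.

```python
import sqlite3

PAGE_SIZE = 10

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE environment (uri TEXT PRIMARY KEY);
    CREATE TABLE environment_group (env_uri TEXT, group_name TEXT);
    """
)
# 11 environments in total, as in Ex 1.
conn.executemany(
    "INSERT INTO environment VALUES (?)",
    [(f"env-{i:02d}",) for i in range(1, 12)],
)
# env-01 belongs to two groups, so the outer join returns it twice.
conn.executemany(
    "INSERT INTO environment_group VALUES (?, ?)",
    [("env-01", "team-a"), ("env-01", "team-b")],
)


def page_without_distinct(page: int) -> list:
    """LIMIT is applied to joined rows; duplicates are dropped afterwards."""
    rows = conn.execute(
        """
        SELECT e.uri FROM environment e
        LEFT OUTER JOIN environment_group g ON g.env_uri = e.uri
        ORDER BY e.uri LIMIT ? OFFSET ?
        """,
        (PAGE_SIZE, (page - 1) * PAGE_SIZE),
    ).fetchall()
    return list(dict.fromkeys(uri for (uri,) in rows))


def page_with_distinct(page: int) -> list:
    """DISTINCT runs before LIMIT, so every page is filled with unique rows."""
    rows = conn.execute(
        """
        SELECT DISTINCT e.uri FROM environment e
        LEFT OUTER JOIN environment_group g ON g.env_uri = e.uri
        ORDER BY e.uri LIMIT ? OFFSET ?
        """,
        (PAGE_SIZE, (page - 1) * PAGE_SIZE),
    ).fetchall()
    return [uri for (uri,) in rows]


print(len(page_without_distinct(1)), len(page_without_distinct(2)))  # 9 2
print(len(page_with_distinct(1)), len(page_with_distinct(2)))        # 10 1
```

Both helpers use `ORDER BY` so that page boundaries are deterministic,
which is the point of item (2): without a stable ordering, consecutive
pages are not guaranteed to partition the result set.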

### Relates
- https://github.com/data-dot-all/dataall/issues/1241

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use standard, proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 98e67fa8 
Author: Sofia Sazonova <[email protected]> 
Date: Fri May 03 2024 12:21:57 GMT-0400 (Eastern Daylight Time) 

    fix: DATASET_READ_TABLE read permissions (#1237)

### Feature or Bugfix
- Bugfix


### Detail
- backfill DATASET_READ_TABLE permissions
- delete these permissions when dataset tables are revoked or deleted

### Relates
- #1173

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use standard, proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Co-authored-by: Sofia Sazonova <[email protected]>

commit 18e2f509 
Author: Noah Paige <[email protected]> 
Date: Fri May 03 2024 10:14:52 GMT-0400 (Eastern Daylight Time) 

    Fix local test groups listing for listGroups query (#1239)

### Feature or Bugfix
- Bugfix


### Detail
- When running locally and inviting a team to an Env or Org, we call
`listGroups`, but the returned `LOCAL_TEST_GROUPS` does not have the
expected data type


### Relates
N/A



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit a0be03c4 
Author: dlpzx <[email protected]> 
Date: Fri May 03 2024 10:12:34 GMT-0400 (Eastern Daylight Time) 

    Refactor: uncouple datasets and dataset_sharing modules - part 2-5 FINAL DELETE DATASETS_BASE (#1242)

### Feature or Bugfix
- Refactoring

### Detail
After all the previous PRs are merged, there should be no circular
dependencies between `datasets` and `datasets_sharing`. We can now
proceed to:
- move `datasets_base` models, repositories, permissions and enums to
`datasets`
- adjust the `__init__` files to establish that `datasets_sharing`
depends on `datasets`
- adjust the Module interfaces to ensure that all necessary dataset
models... are imported in the interface for sharing
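
A minimal sketch of the dependency direction described above. The names `ModuleInterface`, `depends_on` and `load_order` are assumptions made for illustration, not a copy of the data.all loader. The only point is that the sharing module declares the datasets module as a dependency, and never the reverse.

```python
from typing import List, Type


class ModuleInterface:
    """Hypothetical base class: each module declares what it depends on."""

    name: str = ""

    @staticmethod
    def depends_on() -> List[Type["ModuleInterface"]]:
        return []


class DatasetsModule(ModuleInterface):
    name = "datasets"  # no dependencies: it knows nothing about sharing


class DatasetSharingModule(ModuleInterface):
    name = "dataset_sharing"

    @staticmethod
    def depends_on() -> List[Type[ModuleInterface]]:
        return [DatasetsModule]


def load_order(modules: List[Type[ModuleInterface]]) -> List[str]:
    """Return module names so that dependencies come before dependents."""
    ordered: List[str] = []
    visiting: set = set()

    def visit(module: Type[ModuleInterface]) -> None:
        if module.name in ordered:
            return
        if module.name in visiting:
            raise ValueError(f"circular dependency at {module.name}")
        visiting.add(module.name)
        for dependency in module.depends_on():
            visit(dependency)
        visiting.discard(module.name)
        ordered.append(module.name)

    for module in modules:
        visit(module)
    return ordered


print(load_order([DatasetSharingModule, DatasetsModule]))
```

With this shape, a circular dependency between the two modules would raise an error at load time instead of surfacing later as an import problem.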


Next steps:
- share_notifications parameter to dataset_sharing in config.json

### Relates
#955 



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit b68b40c1 
Author: Sofia Sazonova <[email protected]> 
Date: Fri May 03 2024 10:12:11 GMT-0400 (Eastern Daylight Time) 

    bugfix: EnvironmentGroup can remove other groups (#1234)

### Feature or Bugfix
- Bugfix


### Detail
- Now, if a group cannot update other groups, it also cannot remove
them.

### Relates
- #1212 



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Co-authored-by: Sofia Sazonova <[email protected]>

commit 264539b5 
Author: Noah Paige <[email protected]> 
Date: Fri May 03 2024 05:23:11 GMT-0400 (Eastern Daylight Time) 

    Fix Alembic Migration: has table checks (#1240)

### Feature or Bugfix
- Bugfix

### Detail
- Fix the `has_table()` check so that the tables are dropped if they exist
as part of the Alembic migration upgrade
- Fix `DatasetLock nullable=True`

### Relates
- https://github.com/data-dot-all/dataall/issues/1165

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)? No
  - Is the input sanitized? N/A
- What precautions are you taking before deserializing the data you
consume? N/A
  - Is injection prevented by parametrizing queries? N/A
  - Have you ensured no `eval` or similar functions are used? N/A
- Does this PR introduce any functionality or component that requires
authorization? No
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
N/A
  - Are you logging failed auth attempts? N/A
- Are you using or adding any cryptographic features? No
  - Do you use a standard proven implementations? N/A
- Are the used keys controlled by the customer? Where are they stored?
N/A
- Are you introducing any new policies/roles/users? No
  - Have you used the least-privilege principle? How? N/A


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 42a5f6bd 
Author: dlpzx <[email protected]> 
Date: Fri May 03 2024 02:24:09 GMT-0400 (Eastern Daylight Time) 

    Refactor: uncouple datasets and dataset_sharing modules - part 2-4 (#1214)

### Feature or Bugfix
- Refactoring
⚠️ MERGE AFTER https://github.com/data-dot-all/dataall/pull/1213

### Detail
This is needed as explained in full PR [AFTER 2.4] Refactor: uncouple
datasets and dataset_sharing modules #1179
- [X] Use interface to resolve dataset roles related to datasets shared
and implement logic in the dataset_sharing module
- [X] Extend and clean-up stewards share permissions through interface

### Relates
- #1179 



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 6d3f2d45 
Author: Sofia Sazonova <[email protected]> 
Date: Thu May 02 2024 10:55:00 GMT-0400 (Eastern Daylight Time) 

    [After 2.4]Core Refactoring part5 (#1194)

### Feature or Bugfix
- Refactoring

### Detail
- focus on core/environments
- move logic from resolvers to services
- create s3_client in base/aws --> TO BE REFACTORED. Needs to be merged
with dataset_sharing/aws/s3_client

### Relates
- #741 



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Co-authored-by: Sofia Sazonova <[email protected]>

commit 2ea24cbb 
Author: dlpzx <[email protected]> 
Date: Thu May 02 2024 08:22:12 GMT-0400 (Eastern Daylight Time) 

    Refactor: uncouple datasets and dataset_sharing modules - part 2-3 (#1213)

### Feature or Bugfix
- Refactoring
⚠️ MERGE AFTER https://github.com/data-dot-all/dataall/pull/1187

### Detail
This is needed as explained in full PR [AFTER 2.4] Refactor: uncouple
datasets and dataset_sharing modules #1179

- [X] Creates an interface to execute checks and clean-ups of data
sharing objects when dataset objects are deleted (initially it was going
to be a db interface, but I think it is better in the service)
- [X] Move listDatasetShares query to dataset_sharing module in
https://github.com/data-dot-all/dataall/pull/1185
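
A rough sketch of what such a service-level interface could look like. All class and method names below are hypothetical and chosen for illustration. The datasets side defines and calls the interface, and the sharing side implements it, so the datasets code never imports sharing code.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Set


class DatasetDeletionHook(ABC):
    """Hypothetical interface, owned by the datasets side."""

    @abstractmethod
    def can_delete(self, dataset_uri: str) -> bool:
        """Return True when nothing blocks deleting the dataset."""

    @abstractmethod
    def clean_up(self, dataset_uri: str) -> None:
        """Remove objects that referenced the deleted dataset."""


class DatasetService:
    """Knows only the interface, never the sharing implementation."""

    def __init__(self) -> None:
        self._hooks: List[DatasetDeletionHook] = []

    def register(self, hook: DatasetDeletionHook) -> None:
        self._hooks.append(hook)

    def delete_dataset(self, dataset_uri: str) -> None:
        for hook in self._hooks:
            if not hook.can_delete(dataset_uri):
                raise RuntimeError(f"{dataset_uri} still has active shares")
        # The dataset record itself would be deleted here.
        for hook in self._hooks:
            hook.clean_up(dataset_uri)


class SharingDeletionHook(DatasetDeletionHook):
    """Lives on the sharing side and implements the checks and clean-up."""

    def __init__(self, active: Set[str], drafts: Dict[str, List[str]]) -> None:
        self.active = active  # datasets that still have shared items
        self.drafts = drafts  # dataset -> draft share requests

    def can_delete(self, dataset_uri: str) -> bool:
        return dataset_uri not in self.active

    def clean_up(self, dataset_uri: str) -> None:
        self.drafts.pop(dataset_uri, None)


sharing = SharingDeletionHook(active={"dataset-1"}, drafts={"dataset-2": ["share-a"]})
service = DatasetService()
service.register(sharing)
service.delete_dataset("dataset-2")  # allowed: no active shares
print(sharing.drafts)
```

Deleting `dataset-1` in this sketch would raise, because the sharing hook reports that it still has active shares.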

### Relates
-  #1179



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 750a5ec8 
Author: Anushka Singh <[email protected]> 
Date: Wed May 01 2024 12:28:18 GMT-0400 (Eastern Daylight Time) 

    Feature:1221 - Make visibility of auto-approval toggle configurable based on confidentiality (#1223)

### Feature or Bugfix

- Feature


### Detail
- Users should be able to disable the visibility of the auto-approval toggle
with code. For example, at our company, we require that shares always go
through the approval process if their confidentiality classification is
Secret. We don't even want to give users the option to enable
autoApproval, so that they don't do so by mistake and end up
oversharing.
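
As an illustration only, the rule could be expressed as a lookup from confidentiality level to toggle visibility. The mapping name, its keys other than Secret, and the function below are assumptions, not the actual data.all configuration.

```python
from typing import Dict

# Hypothetical configuration: which confidentiality levels may show the toggle.
AUTO_APPROVAL_VISIBILITY: Dict[str, bool] = {
    "Unclassified": True,
    "Official": True,
    "Secret": False,
}


def show_auto_approval_toggle(
    confidentiality: str,
    visibility: Dict[str, bool] = AUTO_APPROVAL_VISIBILITY,
) -> bool:
    """Return True if the auto-approval toggle should be shown."""
    # Unknown levels hide the toggle, which is the safer default.
    return visibility.get(confidentiality, False)


for level in ("Official", "Secret"):
    print(level, show_auto_approval_toggle(level))
```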

Video demo:
https://github.com/data-dot-all/dataall/issues/1221#issuecomment-2077412044

### Relates
- https://github.com/data-dot-all/dataall/issues/1221



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 82044689 
Author: dlpzx <[email protected]> 
Date: Wed May 01 2024 12:26:42 GMT-0400 (Eastern Daylight Time) 

    Refactor: uncouple datasets and dataset_sharing modules - part 2-2 (#1187)

### Feature or Bugfix
- Refactoring
⚠️ MERGE AFTER https://github.com/data-dot-all/dataall/pull/1185

### Detail
This is needed as explained in full PR [AFTER 2.4] Refactor: uncouple
datasets and dataset_sharing modules #1179
- Split the getDatasetAssumeRole API into 2 APIs, one for dataset owners
role (in datasets module) and another one for share requester roles (in
datasets_sharing module)

### Relates
-  #1179



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 5173419f 
Author: Noah Paige <[email protected]> 
Date: Wed May 01 2024 12:24:42 GMT-0400 (Eastern Daylight Time) 

    Fix so listValidEnvironments called only once (#1238)

### Feature or Bugfix
- Bugfix

### Detail
- When requesting access to a share on data.all, the
`listValidEnvironments` query used to be called twice, which (depending on
how long the query results took to return) could cause the initially
selected environment to disappear


### Relates
- Continuation of https://github.com/data-dot-all/dataall/issues/916



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 7656ea86 
Author: dlpzx <[email protected]> 
Date: Tue Apr 30 2024 07:13:01 GMT-0400 (Eastern Daylight Time) 

    Add integration tests on a real API client and integrate the tests in CICD (#1219)

### Feature or Bugfix
- Feature

### Detail
Add integration tests that use a real Client to execute different
validation actions.

- Define the Client and the way API calls are posted to API Gateway in
the conftest
- Define the Cognito users and the different fixtures needed for all
tests
- Write tests for the Organization core module as an example
- Add feature flag in `cdk.json` called `with_approval_tests` that can
be defined at the deployment environment level. If set to True, a
CodeBuild stage running the tests is created.
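
A small sketch of how the flag could be read per deployment environment. Only the flag name `with_approval_tests` comes from the description above. The surrounding JSON layout (`context`, `DeploymentEnvironments`, `envname`) and the helper function are assumptions for illustration.

```python
import json
from typing import List

# Hypothetical excerpt of a cdk.json file.
CDK_JSON = """
{
  "context": {
    "DeploymentEnvironments": [
      {"envname": "dev", "with_approval_tests": true},
      {"envname": "prod"}
    ]
  }
}
"""


def environments_with_approval_tests(cdk_json_text: str) -> List[str]:
    """Return the environments for which a test stage would be created."""
    context = json.loads(cdk_json_text).get("context", {})
    return [
        env["envname"]
        for env in context.get("DeploymentEnvironments", [])
        # A missing flag is treated as False, so the stage is opt-in.
        if env.get("with_approval_tests", False)
    ]


print(environments_with_approval_tests(CDK_JSON))
```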

### Relates
- https://github.com/data-dot-all/dataall/issues/1220



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit b963fe81 
Author: Sofia Sazonova <[email protected]> 
Date: Mon Apr 29 2024 09:26:36 GMT-0400 (Eastern Daylight Time) 

    Notification link routes to a share request page (#1227)

### Feature or Bugfix
- Feature

### Detail
- The notification object carries the field `target_uri = 'shareUri|DataSetUri'`
- This value is parsed and used to redirect the user to the relevant Share
Request page
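
The parsing step can be sketched as follows. The function names and the route `/console/shares/<shareUri>` are hypothetical. Only the `shareUri|DataSetUri` format comes from the description above.

```python
from typing import NamedTuple


class NotificationTarget(NamedTuple):
    share_uri: str
    dataset_uri: str


def parse_target_uri(target_uri: str) -> NotificationTarget:
    """Split 'shareUri|DataSetUri' into its two parts."""
    share_uri, separator, dataset_uri = target_uri.partition("|")
    if not separator or not share_uri:
        raise ValueError(f"unexpected target_uri: {target_uri!r}")
    return NotificationTarget(share_uri, dataset_uri)


def share_request_path(target_uri: str) -> str:
    """Build the (hypothetical) route of the Share Request page."""
    return f"/console/shares/{parse_target_uri(target_uri).share_uri}"


print(share_request_path("share123|dataset456"))
```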

### Relates
- #1115 

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the da…
noah-paige added a commit that referenced this issue Jun 25, 2024
commit d8497c55 
Author: Noah Paige <[email protected]> 
Date: Mon May 20 2024 15:19:35 GMT-0400 (Eastern Daylight Time) 

    fix


commit 199ab505 
Author: Admin/noahpaig-Isengard <Admin/noahpaig-Isengard> 
Date: Mon May 20 2024 15:16:14 GMT-0400 (Eastern Daylight Time) 

    Conflicts resolved in the console.

commit ad415575 
Author: Noah Paige <[email protected]> 
Date: Mon May 20 2024 15:13:06 GMT-0400 (Eastern Daylight Time) 

    Merge branch 'chatbot-test' into noah-main-2


commit 2893efc7 
Author: Noah Paige <[email protected]> 
Date: Mon May 20 2024 15:10:20 GMT-0400 (Eastern Daylight Time) 

    Merge branch 'chatbot-test' into noah-main-2


commit caad12e1 
Author: Noah Paige <[email protected]> 
Date: Mon May 20 2024 15:12:47 GMT-0400 (Eastern Daylight Time) 

    ruff


commit a73b7110 
Author: Noah Paige <[email protected]> 
Date: Mon May 20 2024 15:09:19 GMT-0400 (Eastern Daylight Time) 

    Remove hardcoding


commit 7b32a8f7 
Author: Noah Paige <[email protected]> 
Date: Mon May 20 2024 15:07:29 GMT-0400 (Eastern Daylight Time) 

    Chatbot POC


commit e25e5815 
Author: Noah Paige <[email protected]> 
Date: Mon May 20 2024 15:08:32 GMT-0400 (Eastern Daylight Time) 

    Merge branch '1215-share-logs' into noah-main-2


commit a06c8cba 
Author: Noah Paige <[email protected]> 
Date: Fri May 17 2024 16:37:05 GMT-0400 (Eastern Daylight Time) 

    Merge share logs PR


commit a5626670 
Author: Sofia Sazonova <[email protected]> 
Date: Fri May 17 2024 17:19:25 GMT-0400 (Eastern Daylight Time) 

    make ruff happy


commit aee98cf7 
Author: Noah Paige <[email protected]> 
Date: Fri May 17 2024 16:34:24 GMT-0400 (Eastern Daylight Time) 

    Merge share logs PR


commit c3ee2f7c 
Author: Sofia Sazonova <[email protected]> 
Date: Fri May 17 2024 17:11:00 GMT-0400 (Eastern Daylight Time) 

    PR comments


commit 5ca55303 
Author: Sofia Sazonova <[email protected]> 
Date: Wed May 15 2024 11:57:41 GMT-0400 (Eastern Daylight Time) 

    remove unused imports


commit 8f8bf3dd 
Author: Sofia Sazonova <[email protected]> 
Date: Wed May 15 2024 11:56:39 GMT-0400 (Eastern Daylight Time) 

    restrict access to the share logs


commit 9137da9b 
Author: Sofia Sazonova <[email protected]> 
Date: Wed May 15 2024 11:28:32 GMT-0400 (Eastern Daylight Time) 

    share Logs button is available only for dataset Admins and stewards


commit fcb16bd9 
Author: Sofia Sazonova <[email protected]> 
Date: Wed May 15 2024 10:46:25 GMT-0400 (Eastern Daylight Time) 

    getShareLogs query


commit 0503a3bb 
Author: Sofia Sazonova <[email protected]> 
Date: Wed May 15 2024 10:21:25 GMT-0400 (Eastern Daylight Time) 

    Logs modal in Share View


commit bab2f3e6 
Author: Sofia Sazonova <[email protected]> 
Date: Mon May 13 2024 09:09:18 GMT-0400 (Eastern Daylight Time) 

    Add confirmation pop-ups for deletion of team roles and groups (#1231)

### Feature or Bugfix

- Feature



### Detail
Pop-ups added for:
- deletion of a team from an environment
- deletion of a consumption role
- deletion of a group from an Organization

### Relates
- #942 



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

Co-authored-by: Sofia Sazonova <[email protected]>

commit 93ff7725 
Author: Sofia Sazonova <[email protected]> 
Date: Mon May 13 2024 08:00:38 GMT-0400 (Eastern Daylight Time) 

    Update version.json (#1264)

Release info update

commit e718d861 
Author: Sofia Sazonova <[email protected]> 
Date: Mon May 13 2024 07:29:27 GMT-0400 (Eastern Daylight Time) 

    fix permission query (#1263)

### Feature or Bugfix
- Bugfix


### Detail
- The filter is an array of permission NAMES, so in order to query
policies correctly we need to add a join
- The filters 'share_type' and 'share_item_status' must be strings
- IMPORTANT: the "finally" block used the param session, but session
was defined only in the "try" block. So, the lock failed to be released.
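
A minimal reproduction of this kind of bug, under the assumption that the session is opened by a context manager inside the `try` block and is unusable once that block exits. `Session` and `scoped_session` below are stand-ins written for this sketch, not SQLAlchemy or data.all code, and the fix shown is one possible approach rather than the exact change in this PR.

```python
from contextlib import contextmanager


class Session:
    """Stand-in for a database session (hypothetical)."""

    def __init__(self):
        self.closed = False

    def execute(self, statement, log):
        if self.closed:
            raise RuntimeError("session is closed")
        log.append(statement)


@contextmanager
def scoped_session():
    session = Session()
    try:
        yield session
    finally:
        session.closed = True  # only usable inside the with-block


def share_buggy(log):
    try:
        with scoped_session() as session:
            session.execute("acquire lock", log)
            session.execute("process share", log)
    finally:
        # Bug: this session belongs to the with-block above and is closed
        # by now, so the call raises and the lock is never released.
        session.execute("release lock", log)


def share_fixed(log):
    try:
        with scoped_session() as session:
            session.execute("acquire lock", log)
            session.execute("process share", log)
    finally:
        # Open a session that is valid in this block before releasing.
        with scoped_session() as session:
            session.execute("release lock", log)


buggy_log, fixed_log = [], []
try:
    share_buggy(buggy_log)
except RuntimeError as error:
    print("buggy:", error)
share_fixed(fixed_log)
print(fixed_log)
```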

### Relates
- <URL or Ticket>



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Co-authored-by: Sofia Sazonova <[email protected]>

commit 479b8f3f 
Author: mourya-33 <[email protected]> 
Date: Wed May 08 2024 10:29:36 GMT-0400 (Eastern Daylight Time) 

    Add encryption and tag immutability to ECR repository (#1224)

### Feature or Bugfix
- Bugfix

### Detail
- Currently the ECR repository that is created does not have encryption or
tag immutability enabled, which Checkov scans flag. This fix enables
both.

### Relates
- https://github.com/data-dot-all/dataall/issues/1200

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
N/A
  - Is the input sanitized? N/A
- What precautions are you taking before deserializing the data you
consume? N/A
  - Is injection prevented by parametrizing queries? N/A
  - Have you ensured no `eval` or similar functions are used? N/A
- Does this PR introduce any functionality or component that requires
authorization? N/A
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
N/A
  - Are you logging failed auth attempts? N/A
- Are you using or adding any cryptographic features? N/A
  - Do you use a standard proven implementations? N/A
- Are the used keys controlled by the customer? Where are they stored?
No. This is with default encryption
- Are you introducing any new policies/roles/users? N/A
  - Have you used the least-privilege principle? How? N/A


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 2f885773 
Author: Sofia Sazonova <[email protected]> 
Date: Wed May 08 2024 09:22:40 GMT-0400 (Eastern Daylight Time) 

    Multiple permission roots (#1259)

### Feature or Bugfix
- Bugfix


### Detail
- GET_DATASET_TABLE (FOLDER) permissions are granted to the group only
if they are not granted already
- these permissions are removed if group is not admin|steward and there
are no other shares of this item.
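
The two rules above can be sketched with an in-memory stand-in for the permission store. The class and method names are hypothetical, and real resource policies live in the database rather than in a set.

```python
from dataclasses import dataclass, field
from typing import Set, Tuple


@dataclass
class PermissionStore:
    """In-memory stand-in for read permissions on tables or folders."""

    granted: Set[Tuple[str, str]] = field(default_factory=set)  # (group, item)

    def grant_if_missing(self, group: str, item_uri: str) -> bool:
        """Grant the permission only when it is not already there."""
        key = (group, item_uri)
        if key in self.granted:
            return False
        self.granted.add(key)
        return True

    def revoke_if_unused(
        self,
        group: str,
        item_uri: str,
        is_admin_or_steward: bool,
        other_shares: int,
    ) -> bool:
        """Revoke only if no other reason for the permission remains."""
        if is_admin_or_steward or other_shares > 0:
            return False
        self.granted.discard((group, item_uri))
        return True


store = PermissionStore()
store.grant_if_missing("team-a", "table-1")
# A second share of the same item does not grant again.
print(store.grant_if_missing("team-a", "table-1"))
# The permission survives while another share of the item exists.
print(store.revoke_if_unused("team-a", "table-1", False, other_shares=1))
```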

### Relates
- #1174



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Co-authored-by: Sofia Sazonova <[email protected]>

commit c4cc07ee 
Author: Petros Kalos <[email protected]> 
Date: Wed May 08 2024 08:54:02 GMT-0400 (Eastern Daylight Time) 

    explicitly specify dataset_client s3 endpoint_url (#1260)

* AWS requires that the endpoint_url should be explicitly specified for
some regions
* Remove the misleading CORS error message, since the upload step can fail
for many reasons
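
A sketch of passing an explicit regional endpoint to the S3 client. The helper names are hypothetical, and the URL pattern assumes the standard AWS partition (other partitions, such as the China regions, use a different domain suffix).

```python
def regional_s3_endpoint(region: str) -> str:
    """Build the regional S3 endpoint (standard AWS partition assumed)."""
    return f"https://s3.{region}.amazonaws.com"


def s3_client_kwargs(region: str) -> dict:
    """Keyword arguments for creating an S3 client with an explicit endpoint."""
    return {
        "service_name": "s3",
        "region_name": region,
        "endpoint_url": regional_s3_endpoint(region),
    }


# With boto3 installed, the client could then be created as:
#   import boto3
#   client = boto3.client(**s3_client_kwargs("eu-south-1"))
print(s3_client_kwargs("eu-south-1"))
```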

### Feature or Bugfix
- Bugfix

### Detail
Resolves #778 



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 40defe8e 
Author: dlpzx <[email protected]> 
Date: Tue May 07 2024 11:52:17 GMT-0400 (Eastern Daylight Time) 

    Generic dataset module and specific s3_datasets module - part 1 (Rename datasets as s3_datasets) (#1250)

### Feature or Bugfix
- Refactoring

### Detail
- Rename `datasets` module to `s3_datasets` module

This PR is the first step to extract a generic datasets_base module that
implements the undifferentiated concepts of Dataset in data.all.
s3_datasets will use this base module to provide the specific
implementation for S3 datasets.

### Relates
- #1123 
- #955 



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 74a303cb 
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> 
Date: Tue May 07 2024 02:26:09 GMT-0400 (Eastern Daylight Time) 

    Bump werkzeug from 3.0.1 to 3.0.3 in /tests (#1253)

Bumps [werkzeug](https://github.com/pallets/werkzeug) from 3.0.1 to
3.0.3.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/pallets/werkzeug/releases">werkzeug's
releases</a>.</em></p>
<blockquote>
<h2>3.0.3</h2>
<p>This is the Werkzeug 3.0.3 security release, which fixes security
issues and bugs but does not otherwise change behavior and should not
result in breaking changes.</p>
<p>PyPI: <a
href="https://pypi.org/project/Werkzeug/3.0.3/">https://pypi.org/project/Werkzeug/3.0.3/</a>
Changes: <a
href="https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-3">https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-3</a>
Milestone: <a
href="https://github.com/pallets/werkzeug/milestone/35?closed=1">https://github.com/pallets/werkzeug/milestone/35?closed=1</a></p>
<ul>
<li>Only allow <code>localhost</code>, <code>.localhost</code>,
<code>127.0.0.1</code>, or the specified hostname when running the dev
server, to make debugger requests. Additional hosts can be added by
using the debugger middleware directly. The debugger UI makes requests
using the full URL rather than only the path. GHSA-2g68-c3qc-8985</li>
<li>Make reloader more robust when <code>&quot;&quot;</code> is in
<code>sys.path</code>. <a
href="https://redirect.github.com/pallets/werkzeug/issues/2823">#2823</a></li>
<li>Better TLS cert format with <code>adhoc</code> dev certs. <a
href="https://redirect.github.com/pallets/werkzeug/issues/2891">#2891</a></li>
<li>Inform Python &lt; 3.12 how to handle <code>itms-services</code>
URIs correctly, rather than using an overly-broad workaround in Werkzeug
that caused some redirect URIs to be passed on without encoding. <a
href="https://redirect.github.com/pallets/werkzeug/issues/2828">#2828</a></li>
<li>Type annotation for <code>Rule.endpoint</code> and other uses of
<code>endpoint</code> is <code>Any</code>. <a
href="https://redirect.github.com/pallets/werkzeug/issues/2836">#2836</a></li>
</ul>
<h2>3.0.2</h2>
<p>This is a fix release for the 3.0.x feature branch.</p>
<ul>
<li>Changes: <a
href="https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-2">https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-2</a></li>
</ul>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/pallets/werkzeug/blob/main/CHANGES.rst">werkzeug's
changelog</a>.</em></p>
<blockquote>
<h2>Version 3.0.3</h2>
<p>Released 2024-05-05</p>
<ul>
<li>
<p>Only allow <code>localhost</code>, <code>.localhost</code>,
<code>127.0.0.1</code>, or the specified
hostname when running the dev server, to make debugger requests.
Additional
hosts can be added by using the debugger middleware directly. The
debugger
UI makes requests using the full URL rather than only the path.
:ghsa:<code>2g68-c3qc-8985</code></p>
</li>
<li>
<p>Make reloader more robust when <code>&quot;&quot;</code> is in
<code>sys.path</code>. :pr:<code>2823</code></p>
</li>
<li>
<p>Better TLS cert format with <code>adhoc</code> dev certs.
:pr:<code>2891</code></p>
</li>
<li>
<p>Inform Python &lt; 3.12 how to handle <code>itms-services</code> URIs
correctly, rather
than using an overly-broad workaround in Werkzeug that caused some
redirect
URIs to be passed on without encoding. :issue:<code>2828</code></p>
</li>
<li>
<p>Type annotation for <code>Rule.endpoint</code> and other uses of
<code>endpoint</code> is
<code>Any</code>. :issue:<code>2836</code></p>
</li>
</ul>
<h2>Version 3.0.2</h2>
<p>Released 2024-04-01</p>
<ul>
<li>Ensure setting <code>merge_slashes</code> to <code>False</code>
results in <code>NotFound</code> for
repeated-slash requests against single slash routes.
:issue:<code>2834</code></li>
<li>Fix handling of <code>TypeError</code> in
<code>TypeConversionDict.get()</code> to match
<code>ValueError</code>. :issue:<code>2843</code></li>
<li>Fix <code>response_wrapper</code> type check in test client.
:issue:<code>2831</code></li>
<li>Make the return type of <code>MultiPartParser.parse</code> more
precise.
:issue:<code>2840</code></li>
<li>Raise an error if converter arguments cannot be parsed.
:issue:<code>2822</code></li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/pallets/werkzeug/commit/f9995e967979eb694d6b31536cc65314fd7e9c8c"><code>f9995e9</code></a>
release version 3.0.3</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/3386395b24c7371db11a5b8eaac0c91da5362692"><code>3386395</code></a>
Merge pull request from GHSA-2g68-c3qc-8985</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/890b6b62634fa61224222aee31081c61b054ff01"><code>890b6b6</code></a>
only require trusted host for evalex</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/71b69dfb7df3d912e66bab87fbb1f21f83504967"><code>71b69df</code></a>
restrict debugger trusted hosts</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/d2d3869525a4ffb2c41dfb2c0e39d94dab2d870c"><code>d2d3869</code></a>
endpoint type is Any (<a
href="https://redirect.github.com/pallets/werkzeug/issues/2895">#2895</a>)</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/7080b55acd48b68afdda65ee6c7f99e9afafb0ba"><code>7080b55</code></a>
endpoint type is Any</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/7555eff296fbdf12f2e576b6bbb0b506df8417ed"><code>7555eff</code></a>
remove iri_to_uri redirect workaround (<a
href="https://redirect.github.com/pallets/werkzeug/issues/2894">#2894</a>)</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/97fb2f722297ae4e12e36dab024e0acf8477b3c8"><code>97fb2f7</code></a>
remove _invalid_iri_to_uri workaround</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/249527ff981e7aa22cd714825c5637cc92df7761"><code>249527f</code></a>
make cn field a valid single hostname, and use wildcard in SANs field.
(<a
href="https://redirect.github.com/pallets/werkzeug/issues/2892">#2892</a>)</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/793be472c9d145eb9be7d4200672d1806289d84a"><code>793be47</code></a>
update adhoc tls dev cert format</li>
<li>Additional commits viewable in <a
href="https://github.com/pallets/werkzeug/compare/3.0.1...3.0.3">compare
view</a></li>
</ul>
</details>
<br />


Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

commit 2f33320c 
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> 
Date: Tue May 07 2024 02:25:03 GMT-0400 (Eastern Daylight Time) 

    Bump werkzeug from 3.0.1 to 3.0.3 in /backend/dataall/base/cdkproxy (#1252)

Bumps [werkzeug](https://github.com/pallets/werkzeug) from 3.0.1 to
3.0.3.


Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

commit 0b49633f 
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> 
Date: Tue May 07 2024 02:24:34 GMT-0400 (Eastern Daylight Time) 

    Bump werkzeug from 3.0.1 to 3.0.3 in /tests_new/integration_tests (#1254)

Bumps [werkzeug](https://github.com/pallets/werkzeug) from 3.0.1 to
3.0.3.


Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

commit 08862420 
Author: mourya-33 <[email protected]> 
Date: Tue May 07 2024 02:15:15 GMT-0400 (Eastern Daylight Time) 

    Updated lambda_api.py to add encryption for lambda env vars for custo… (#1255)

### Feature or Bugfix
- Bugfix

### Detail
checkov scans flagged that the environment variables of the Lambda
functions defined in CDK are not encrypted. This fix enables KMS
encryption for the Lambda environment variables.

### Relates

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)? N/A
  - Is the input sanitized? N/A
- What precautions are you taking before deserializing the data you
consume? N/A
  - Is injection prevented by parametrizing queries? N/A
  - Have you ensured no `eval` or similar functions are used? N/A
- Does this PR introduce any functionality or component that requires
authorization? N/A
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
N/A
  - Are you logging failed auth attempts? N/A
- Are you using or adding any cryptographic features? N/A
  - Do you use a standard proven implementations? N/A
  - Are the used keys controlled by the customer? Where are they stored?
The KMS keys are generated by CDK and are used to encrypt the
environment variables for all Lambda functions in the lambda-api stack
- Are you introducing any new policies/roles/users? N/A
  - Have you used the least-privilege principle? How? N/A

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit ed7cc3eb 
Author: Noah Paige <[email protected]> 
Date: Mon May 06 2024 09:32:30 GMT-0400 (Eastern Daylight Time) 

    Add order_by for paginated queries  (#1249)

### Feature or Bugfix
- Bugfix

### Detail
- This PR aims to solve the following:

- (1) For particular queries (those that perform `.outerjoin()`
operations and have their results paginated with the `paginate()`
function), the number of returned results is sometimes *less than* the
limit set by the pageSize of the paginate function, even when the total
count is greater than the pageSize
- Ex 1: 11 envs total, `query_user_environments()` returning 9 envs on
1st page + 2 on 2nd page
- Ex 2: 10 envs total, `query_user_environments()` returning 9 envs on
1st page + no 2nd page

- We believe this happens because of the way SQLAlchemy "uniques" the
records resulting from an outerjoin before that result is returned to
the frontend

- Adding `.distinct()` to the query ensures each distinct record
is returned (tested successfully)

- (2) Currently we often do not implement an `.order_by()`
condition for the query used in `paginate()`, so there is no stable
way of preserving the order of the items returned from a query (e.g.
when navigating through pages of the response)
- It is generally good practice to include an `order_by()` on a column
or set of columns
- For each query used in `paginate()`, this PR adds an `order_by()`
condition (full list in comments below)
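
The short-page effect described in (1) can be reproduced outside the application. The sketch below is an illustration only: it uses plain SQL through Python's built-in `sqlite3` module instead of SQLAlchemy, and the `environment` and `environment_group` tables, their columns, and both helper functions are hypothetical. It assumes the mechanism is that the page limit is applied to the joined rows and duplicates are dropped afterwards. Both helpers use `ORDER BY` so the output is deterministic.

```python
import sqlite3

PAGE_SIZE = 10

# Hypothetical schema: 11 environments, one of which belongs to two groups,
# so an outer join produces 12 rows for 11 distinct environments.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE environment (uri TEXT PRIMARY KEY);
    CREATE TABLE environment_group (env_uri TEXT, group_name TEXT);
    """
)
conn.executemany(
    "INSERT INTO environment VALUES (?)",
    [(f"env-{i:02d}",) for i in range(11)],
)
conn.executemany(
    "INSERT INTO environment_group VALUES (?, ?)",
    [("env-00", "team-a"), ("env-00", "team-b")],
)

JOIN = (
    "FROM environment e "
    "LEFT OUTER JOIN environment_group g ON g.env_uri = e.uri "
)


def page_deduplicated_after_limit(page):
    """LIMIT is applied to the joined rows; duplicates are dropped afterwards."""
    rows = conn.execute(
        "SELECT e.uri " + JOIN + "ORDER BY e.uri LIMIT ? OFFSET ?",
        (PAGE_SIZE, (page - 1) * PAGE_SIZE),
    ).fetchall()
    # dict.fromkeys removes duplicates while keeping the row order.
    return list(dict.fromkeys(uri for (uri,) in rows))


def page_distinct_and_ordered(page):
    """DISTINCT runs before LIMIT, and ORDER BY keeps the pages stable."""
    rows = conn.execute(
        "SELECT DISTINCT e.uri " + JOIN + "ORDER BY e.uri LIMIT ? OFFSET ?",
        (PAGE_SIZE, (page - 1) * PAGE_SIZE),
    ).fetchall()
    return [uri for (uri,) in rows]


for page in (1, 2):
    print(page, len(page_deduplicated_after_limit(page)), len(page_distinct_and_ordered(page)))
```

With this data the first helper returns 9 environments on page 1 and 2 on page 2, the same pattern as Ex 1, while the second returns a full page of 10 followed by the remaining 1.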

More context is available in the related issue linked below.

### Relates
- https://github.com/data-dot-all/dataall/issues/1241

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 98e67fa8 
Author: Sofia Sazonova <[email protected]> 
Date: Fri May 03 2024 12:21:57 GMT-0400 (Eastern Daylight Time) 

    fix: DATASET_READ_TABLE read permissions (#1237)

### Feature or Bugfix
- Bugfix


### Detail
- Backfill `DATASET_READ_TABLE` permissions
- Delete these permissions when dataset tables are revoked or deleted

### Relates
- #1173

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Co-authored-by: Sofia Sazonova <[email protected]>

commit 18e2f509 
Author: Noah Paige <[email protected]> 
Date: Fri May 03 2024 10:14:52 GMT-0400 (Eastern Daylight Time) 

    Fix local test groups listing for listGroups query (#1239)

### Feature or Bugfix
- Bugfix


### Detail
- When inviting a team to an Env or Org in a local deployment, we call
listGroups, and the returned `LOCAL_TEST_GROUPS` does not have the
expected data type


### Relates
N/A



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit a0be03c4 
Author: dlpzx <[email protected]> 
Date: Fri May 03 2024 10:12:34 GMT-0400 (Eastern Daylight Time) 

    Refactor: uncouple datasets and dataset_sharing modules - part 2-5 FINAL DELETE DATASETS_BASE (#1242)

### Feature or Bugfix
- Refactoring

### Detail
After all the previous PRs are merged, there should be no circular
dependencies between `datasets` and `dataset_sharing`. We can now
proceed to:
- move `datasets_base` models, repositories, permissions and enums to
`datasets`
- adjust the `__init__` files to establish that `dataset_sharing`
depends on `datasets`
- adjust the Module interfaces to ensure that all necessary dataset
models... are imported in the interface for sharing


Next steps:
- share_notifications parameter to dataset_sharing in config.json

### Relates
#955 



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit b68b40c1 
Author: Sofia Sazonova <[email protected]> 
Date: Fri May 03 2024 10:12:11 GMT-0400 (Eastern Daylight Time) 

    bugfix: EnvironmentGroup can remove other groups (#1234)

### Feature or Bugfix
- Bugfix


### Detail
- A group that cannot update other groups can no longer remove them
either.
### Relates
- #1212 



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Co-authored-by: Sofia Sazonova <[email protected]>

commit 264539b5 
Author: Noah Paige <[email protected]> 
Date: Fri May 03 2024 05:23:11 GMT-0400 (Eastern Daylight Time) 

    Fix Alembic Migration: has table checks (#1240)

### Feature or Bugfix
- Bugfix

### Detail
- Fix the `has_table()` check so that the tables are dropped only if
they exist as part of the Alembic migration upgrade
- Fix `DatasetLock nullable=True`
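
A minimal sketch of the guarded-drop pattern described above. It uses the standard-library `sqlite3` module instead of Alembic so that it runs standalone. `has_table` and `drop_if_exists` are hypothetical helpers, and the `dataset_lock` table with its columns is an assumed example, not the migration's actual code.

```python
import sqlite3


def has_table(conn: sqlite3.Connection, name: str) -> bool:
    """Return True if a table called `name` exists (hypothetical helper)."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (name,),
    ).fetchone()
    return row is not None


def drop_if_exists(conn: sqlite3.Connection, name: str) -> bool:
    """Drop `name` only when it is present; report whether a drop happened."""
    if not has_table(conn, name):
        return False
    # Identifiers cannot be bound as parameters, so quote the name explicitly.
    quoted = '"' + name.replace('"', '""') + '"'
    conn.execute(f"DROP TABLE {quoted}")
    return True


conn = sqlite3.connect(":memory:")
# Assumed table name and columns, for illustration only.
conn.execute("CREATE TABLE dataset_lock (datasetUri TEXT, isLocked INTEGER)")
first = drop_if_exists(conn, "dataset_lock")   # table exists, so it is dropped
second = drop_if_exists(conn, "dataset_lock")  # already gone, so nothing happens
```

Checking first keeps the upgrade safe to re-run on databases where the table was never created.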

### Relates
- https://github.com/data-dot-all/dataall/issues/1165

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)? No
  - Is the input sanitized? N/A
- What precautions are you taking before deserializing the data you
consume? N/A
  - Is injection prevented by parametrizing queries? N/A
  - Have you ensured no `eval` or similar functions are used? N/A
- Does this PR introduce any functionality or component that requires
authorization? No
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
N/A
  - Are you logging failed auth attempts? N/A
- Are you using or adding any cryptographic features? No
  - Do you use a standard proven implementations? N/A
- Are the used keys controlled by the customer? Where are they stored?
N/A
- Are you introducing any new policies/roles/users? No
  - Have you used the least-privilege principle? How? N/A


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 42a5f6bd 
Author: dlpzx <[email protected]> 
Date: Fri May 03 2024 02:24:09 GMT-0400 (Eastern Daylight Time) 

    Refactor: uncouple datasets and dataset_sharing modules - part 2-4 (#1214)

### Feature or Bugfix
- Refactoring
⚠️ MERGE AFTER https://github.com/data-dot-all/dataall/pull/1213

### Detail
This is needed as explained in full PR [AFTER 2.4] Refactor: uncouple
datasets and dataset_sharing modules #1179
- [X] Use interface to resolve dataset roles related to datasets shared
and implement logic in the dataset_sharing module
- [X] Extend and clean-up stewards share permissions through interface

### Relates
- #1179 



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 6d3f2d45 
Author: Sofia Sazonova <[email protected]> 
Date: Thu May 02 2024 10:55:00 GMT-0400 (Eastern Daylight Time) 

    [After 2.4]Core Refactoring part5 (#1194)

### Feature or Bugfix
- Refactoring

### Detail
- focus on core/environments
- move logic from resolvers to services
- create s3_client in base/aws --> TO BE REFACTORED. Needs to be merged
with dataset_sharing/aws/s3_client

### Relates
- #741 



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Co-authored-by: Sofia Sazonova <[email protected]>

commit 2ea24cbb 
Author: dlpzx <[email protected]> 
Date: Thu May 02 2024 08:22:12 GMT-0400 (Eastern Daylight Time) 

    Refactor: uncouple datasets and dataset_sharing modules - part 2-3 (#1213)

### Feature or Bugfix
- Refactoring
⚠️ MERGE AFTER https://github.com/data-dot-all/dataall/pull/1187

### Detail
This is needed as explained in full PR [AFTER 2.4] Refactor: uncouple
datasets and dataset_sharing modules #1179

- [X] Creates an interface to execute checks and clean-ups of data
sharing objects when dataset objects are deleted (initially it was going
to be a db interface, but I think it is better in the service)
- [X] Move listDatasetShares query to dataset_sharing module in
https://github.com/data-dot-all/dataall/pull/1185
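
A rough sketch of the idea behind such an interface: the datasets side owns an abstract hook, and the sharing side registers an implementation, so datasets code never imports sharing code. All class and method names below are hypothetical and are not taken from the data.all codebase.

```python
from abc import ABC, abstractmethod
from typing import List


class DatasetDeletionHook(ABC):
    """Hypothetical interface owned by the datasets module."""

    @abstractmethod
    def check_before_delete(self, dataset_uri: str) -> None:
        """Raise if the dataset must not be deleted yet."""

    @abstractmethod
    def clean_up(self, dataset_uri: str) -> None:
        """Remove objects that referenced the dataset."""


class DatasetService:
    """Knows only the interface, never the sharing module itself."""

    _hooks: List[DatasetDeletionHook] = []

    @classmethod
    def register(cls, hook: DatasetDeletionHook) -> None:
        cls._hooks.append(hook)

    @classmethod
    def delete_dataset(cls, dataset_uri: str) -> None:
        # Run every check first so a failing check prevents all clean-ups.
        for hook in cls._hooks:
            hook.check_before_delete(dataset_uri)
        for hook in cls._hooks:
            hook.clean_up(dataset_uri)


class SharingDeletionHook(DatasetDeletionHook):
    """Hypothetical implementation living in the sharing module."""

    def __init__(self) -> None:
        self.active_shares = {"ds-1": 0, "ds-2": 3}
        self.cleaned: List[str] = []

    def check_before_delete(self, dataset_uri: str) -> None:
        if self.active_shares.get(dataset_uri, 0) > 0:
            raise RuntimeError(f"{dataset_uri} still has active shares")

    def clean_up(self, dataset_uri: str) -> None:
        self.active_shares.pop(dataset_uri, None)
        self.cleaned.append(dataset_uri)


hook = SharingDeletionHook()
DatasetService.register(hook)
DatasetService.delete_dataset("ds-1")  # no active shares, so clean-up runs
```

With this shape the dependency points one way only: the sharing module imports the hook base class from datasets and registers itself at load time.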

### Relates
-  #1179



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 750a5ec8 
Author: Anushka Singh <[email protected]> 
Date: Wed May 01 2024 12:28:18 GMT-0400 (Eastern Daylight Time) 

    Feature:1221 - Make visibility of auto-approval toggle configurable based on confidentiality (#1223)

### Feature or Bugfix

- Feature


### Detail
- Users should be able to disable visibility of the auto-approval toggle
with code. For example, at our company, we require that shares always go
through the approval process if their confidentiality classification is
Secret. We don't even want to give users the option to enable
autoApproval, so that they cannot do so by mistake and end up
oversharing.

Video demo:
https://github.com/data-dot-all/dataall/issues/1221#issuecomment-2077412044

### Relates
- https://github.com/data-dot-all/dataall/issues/1221



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 82044689 
Author: dlpzx <[email protected]> 
Date: Wed May 01 2024 12:26:42 GMT-0400 (Eastern Daylight Time) 

    Refactor: uncouple datasets and dataset_sharing modules - part 2-2 (#1187)

### Feature or Bugfix
- Refactoring
⚠️ MERGE AFTER https://github.com/data-dot-all/dataall/pull/1185

### Detail
This is needed as explained in full PR [AFTER 2.4] Refactor: uncouple
datasets and dataset_sharing modules #1179
- Split the getDatasetAssumeRole API into 2 APIs, one for dataset owner
roles (in the datasets module) and another one for share requester roles
(in the dataset_sharing module)

### Relates
-  #1179



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 5173419f 
Author: Noah Paige <[email protected]> 
Date: Wed May 01 2024 12:24:42 GMT-0400 (Eastern Daylight Time) 

    Fix so listValidEnvironments called only once (#1238)

### Feature or Bugfix
- Bugfix

### Detail
- When requesting access to a share on data.all, the
`listValidEnvironments` query used to be called twice, which (depending
on how long the query results took to return) could cause the initially
selected environment to disappear


### Relates
- Continuation of https://github.com/data-dot-all/dataall/issues/916



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 7656ea86 
Author: dlpzx <[email protected]> 
Date: Tue Apr 30 2024 07:13:01 GMT-0400 (Eastern Daylight Time) 

    Add integration tests on a real API client and integrate the tests in CICD (#1219)

### Feature or Bugfix
- Feature

### Detail
Add integration tests that use a real Client to execute different
validation actions.

- Define the Client and the way API calls are posted to API Gateway in
the conftest
- Define the Cognito users and the different fixtures needed for all
tests
- Write tests for the Organization core module as example
- Add feature flag in `cdk.json` called `with_approval_tests` that can
be defined at the deployment environment level. If set to True, a
CodeBuild stage running the tests is created.
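
As an illustration of how a per-environment flag like this can gate stage creation, here is a small sketch. Only the flag name `with_approval_tests` comes from the description above. The JSON layout (`context`, `DeploymentEnvironments`, `envname`) and the helper `environments_with_tests` are assumptions made for the example.

```python
import json
from typing import Any, Dict, List

# Assumed layout: only the flag name comes from the description above.
CDK_JSON = """
{
  "context": {
    "DeploymentEnvironments": [
      {"envname": "dev", "with_approval_tests": true},
      {"envname": "prod"}
    ]
  }
}
"""


def environments_with_tests(config: Dict[str, Any]) -> List[str]:
    """Return the names of environments that should get a test stage."""
    envs = config.get("context", {}).get("DeploymentEnvironments", [])
    # A missing flag is treated as False, so the stage is opt-in.
    return [env["envname"] for env in envs if env.get("with_approval_tests", False)]


stages = environments_with_tests(json.loads(CDK_JSON))
```

Treating a missing flag as disabled means existing deployments are unaffected until they opt in.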

### Relates
- https://github.com/data-dot-all/dataall/issues/1220

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
c…
noah-paige added a commit that referenced this issue Jun 25, 2024
commit 1617953c 
Author: Noah Paige <[email protected]> 
Date: Mon May 20 2024 16:48:17 GMT-0400 (Eastern Daylight Time) 

    Add open dependency matrix


commit d8497c55 
Author: Noah Paige <[email protected]> 
Date: Mon May 20 2024 15:19:35 GMT-0400 (Eastern Daylight Time) 

    fix


commit 199ab505 
Author: Admin/noahpaig-Isengard <Admin/noahpaig-Isengard> 
Date: Mon May 20 2024 15:16:14 GMT-0400 (Eastern Daylight Time) 

    Conflicts resolved in the console.

commit ad415575 
Author: Noah Paige <[email protected]> 
Date: Mon May 20 2024 15:13:06 GMT-0400 (Eastern Daylight Time) 

    Merge branch 'chatbot-test' into noah-main-2


commit 2893efc7 
Author: Noah Paige <[email protected]> 
Date: Mon May 20 2024 15:10:20 GMT-0400 (Eastern Daylight Time) 

    Merge branch 'chatbot-test' into noah-main-2


commit caad12e1 
Author: Noah Paige <[email protected]> 
Date: Mon May 20 2024 15:12:47 GMT-0400 (Eastern Daylight Time) 

    ruff


commit a73b7110 
Author: Noah Paige <[email protected]> 
Date: Mon May 20 2024 15:09:19 GMT-0400 (Eastern Daylight Time) 

    Remove hardcoding


commit 7b32a8f7 
Author: Noah Paige <[email protected]> 
Date: Mon May 20 2024 15:07:29 GMT-0400 (Eastern Daylight Time) 

    Chatbot POC


commit e25e5815 
Author: Noah Paige <[email protected]> 
Date: Mon May 20 2024 15:08:32 GMT-0400 (Eastern Daylight Time) 

    Merge branch '1215-share-logs' into noah-main-2


commit a06c8cba 
Author: Noah Paige <[email protected]> 
Date: Fri May 17 2024 16:37:05 GMT-0400 (Eastern Daylight Time) 

    Merge share logs PR


commit a5626670 
Author: Sofia Sazonova <[email protected]> 
Date: Fri May 17 2024 17:19:25 GMT-0400 (Eastern Daylight Time) 

    make ruff happy


commit aee98cf7 
Author: Noah Paige <[email protected]> 
Date: Fri May 17 2024 16:34:24 GMT-0400 (Eastern Daylight Time) 

    Merge share logs PR


commit c3ee2f7c 
Author: Sofia Sazonova <[email protected]> 
Date: Fri May 17 2024 17:11:00 GMT-0400 (Eastern Daylight Time) 

    PR comments


commit 5ca55303 
Author: Sofia Sazonova <[email protected]> 
Date: Wed May 15 2024 11:57:41 GMT-0400 (Eastern Daylight Time) 

    remove unused imports


commit 8f8bf3dd 
Author: Sofia Sazonova <[email protected]> 
Date: Wed May 15 2024 11:56:39 GMT-0400 (Eastern Daylight Time) 

    restrict access to the share logs


commit 9137da9b 
Author: Sofia Sazonova <[email protected]> 
Date: Wed May 15 2024 11:28:32 GMT-0400 (Eastern Daylight Time) 

    share Logs button is available only for dataset Admins and stewards


commit fcb16bd9 
Author: Sofia Sazonova <[email protected]> 
Date: Wed May 15 2024 10:46:25 GMT-0400 (Eastern Daylight Time) 

    getShareLogs query


commit 0503a3bb 
Author: Sofia Sazonova <[email protected]> 
Date: Wed May 15 2024 10:21:25 GMT-0400 (Eastern Daylight Time) 

    Logs modal in Share View


commit bab2f3e6 
Author: Sofia Sazonova <[email protected]> 
Date: Mon May 13 2024 09:09:18 GMT-0400 (Eastern Daylight Time) 

    Add confirmation pop-ups for deletion of team roles and groups (#1231)

### Feature or Bugfix

- Feature



### Detail
Pop-ups added for:
- deletion of a team from an environment
- deletion of a consumption role
- deletion of a group from an Organization

### Relates
- #942 



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

Co-authored-by: Sofia Sazonova <[email protected]>

commit 93ff7725 
Author: Sofia Sazonova <[email protected]> 
Date: Mon May 13 2024 08:00:38 GMT-0400 (Eastern Daylight Time) 

    Update version.json (#1264)

Release info update

commit e718d861 
Author: Sofia Sazonova <[email protected]> 
Date: Mon May 13 2024 07:29:27 GMT-0400 (Eastern Daylight Time) 

    fix permission query (#1263)

### Feature or Bugfix
- Bugfix


### Detail
- The filter is an array of permission NAMES, so a join is needed to
query policies correctly
- The filters 'share_type' and 'share_item_status' must be strings
- IMPORTANT: the "finally" block used the variable session, but session
was defined only in the "try" block, so the lock failed to be released.
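
The last point is a common Python pitfall. The sketch below reproduces one way it can happen and shows a fix. `Lock`, `open_session`, `process_buggy` and `process_fixed` are hypothetical stand-ins, not the actual data.all code.

```python
from contextlib import contextmanager


class Lock:
    """Stand-in for a dataset lock (hypothetical)."""

    def __init__(self) -> None:
        self.locked = False


@contextmanager
def open_session(fail: bool):
    """Hypothetical session factory that can fail before yielding."""
    if fail:
        raise ConnectionError("could not open session")
    yield object()


def process_buggy(lock: Lock, fail: bool) -> None:
    lock.locked = True
    try:
        with open_session(fail) as session:
            pass  # the share would be processed here
    finally:
        # Bug: `session` is unbound if open_session() raised, so this line
        # raises UnboundLocalError and the lock is never released.
        _ = session
        lock.locked = False


def process_fixed(lock: Lock, fail: bool) -> None:
    lock.locked = True
    try:
        with open_session(fail) as session:
            pass  # the share would be processed here
    finally:
        # Release without relying on names bound inside the try block.
        lock.locked = False


buggy, fixed = Lock(), Lock()
for func, lock in ((process_buggy, buggy), (process_fixed, fixed)):
    try:
        func(lock, fail=True)
    except Exception:
        pass  # both variants propagate an error; only the lock state differs
```

After this runs, `buggy.locked` is still `True` while `fixed.locked` is `False`.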

### Relates
- <URL or Ticket>



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Co-authored-by: Sofia Sazonova <[email protected]>

commit 479b8f3f 
Author: mourya-33 <[email protected]> 
Date: Wed May 08 2024 10:29:36 GMT-0400 (Eastern Daylight Time) 

    Add encryption and tag immutability to ECR repository (#1224)

### Feature or Bugfix
- Bugfix

### Detail
- Currently the ECR repositories that are created do not have encryption
and tag immutability enabled, which is flagged by checkov scans. This
fix enables both.

### Relates
- https://github.com/data-dot-all/dataall/issues/1200

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
N/A
  - Is the input sanitized? N/A
- What precautions are you taking before deserializing the data you
consume? N/A
  - Is injection prevented by parametrizing queries? N/A
  - Have you ensured no `eval` or similar functions are used? N/A
- Does this PR introduce any functionality or component that requires
authorization? N/A
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
N/A
  - Are you logging failed auth attempts? N/A
- Are you using or adding any cryptographic features? N/A
  - Do you use a standard proven implementations? N/A
- Are the used keys controlled by the customer? Where are they stored?
No. This is with default encryption
- Are you introducing any new policies/roles/users? N/A
  - Have you used the least-privilege principle? How? N/A


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 2f885773 
Author: Sofia Sazonova <[email protected]> 
Date: Wed May 08 2024 09:22:40 GMT-0400 (Eastern Daylight Time) 

    Multiple permission roots (#1259)

### Feature or Bugfix
- Bugfix


### Detail
- GET_DATASET_TABLE (FOLDER) permissions are granted to the group only
if they are not granted already
- these permissions are removed if group is not admin|steward and there
are no other shares of this item.
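
A minimal sketch of the grant and revoke rule described above, using plain sets instead of the real permission tables. The function names, the `Grants` type and the example group and table identifiers are hypothetical.

```python
from typing import Dict, Set, Tuple

# (group, resource_uri) pairs that currently hold the read permission
Grants = Set[Tuple[str, str]]


def grant_on_share(grants: Grants, group: str, resource: str) -> bool:
    """Grant the permission only if the group does not hold it yet."""
    key = (group, resource)
    if key in grants:
        return False
    grants.add(key)
    return True


def revoke_on_unshare(
    grants: Grants,
    group: str,
    resource: str,
    owners: Set[str],
    remaining_shares: Dict[Tuple[str, str], int],
) -> bool:
    """Remove the permission unless another root still justifies it."""
    key = (group, resource)
    if group in owners:  # admins and stewards keep access
        return False
    if remaining_shares.get(key, 0) > 0:  # another share still covers the item
        return False
    grants.discard(key)
    return True


grants: Grants = set()
first = grant_on_share(grants, "team-a", "table-1")
second = grant_on_share(grants, "team-a", "table-1")
kept = revoke_on_unshare(
    grants, "team-a", "table-1", {"admins"}, {("team-a", "table-1"): 1}
)
removed = revoke_on_unshare(grants, "team-a", "table-1", {"admins"}, {})
```

The permission is dropped only on the last call, once no other share covers the item and the group is neither admin nor steward.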

### Relates
- #1174



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Co-authored-by: Sofia Sazonova <[email protected]>

commit c4cc07ee 
Author: Petros Kalos <[email protected]> 
Date: Wed May 08 2024 08:54:02 GMT-0400 (Eastern Daylight Time) 

    explicitly specify dataset_client s3 endpoint_url (#1260)

* AWS requires that the endpoint_url should be explicitly specified for
some regions
* Remove misleading CORS error message, the upload step can fail for
many reasons
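
A small sketch of what specifying the endpoint explicitly can look like. `regional_s3_endpoint` is a hypothetical helper, not the function changed in this PR, and it only covers the standard and China DNS suffixes. The boto3 call is shown as a comment so that the snippet runs without boto3 installed.

```python
def regional_s3_endpoint(region: str) -> str:
    """Build the regional S3 endpoint URL for `region` (hypothetical helper)."""
    # China regions live under a different DNS suffix.
    suffix = "amazonaws.com.cn" if region.startswith("cn-") else "amazonaws.com"
    return f"https://s3.{region}.{suffix}"


endpoint = regional_s3_endpoint("eu-south-1")

# With boto3 installed, the URL would be passed explicitly, for example:
#   boto3.client("s3", region_name="eu-south-1", endpoint_url=endpoint)
```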

### Feature or Bugfix
- Bugfix

### Detail
Resolves #778 



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 40defe8e 
Author: dlpzx <[email protected]> 
Date: Tue May 07 2024 11:52:17 GMT-0400 (Eastern Daylight Time) 

    Generic dataset module and specific s3_datasets module - part 1 (Rename datasets as s3_datasets) (#1250)

### Feature or Bugfix
- Refactoring

### Detail
- Rename `datasets` module to `s3_datasets` module

This PR is the first step to extract a generic datasets_base module that
implements the undifferentiated concepts of Dataset in data.all.
s3_datasets will use this base module to provide the specific
implementation for S3 datasets.

### Relates
- #1123 
- #955 



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 74a303cb 
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> 
Date: Tue May 07 2024 02:26:09 GMT-0400 (Eastern Daylight Time) 

    Bump werkzeug from 3.0.1 to 3.0.3 in /tests (#1253)

Bumps [werkzeug](https://github.com/pallets/werkzeug) from 3.0.1 to
3.0.3.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/pallets/werkzeug/releases">werkzeug's
releases</a>.</em></p>
<blockquote>
<h2>3.0.3</h2>
<p>This is the Werkzeug 3.0.3 security release, which fixes security
issues and bugs but does not otherwise change behavior and should not
result in breaking changes.</p>
<p>PyPI: <a
href="https://pypi.org/project/Werkzeug/3.0.3/">https://pypi.org/project/Werkzeug/3.0.3/</a>
Changes: <a
href="https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-3">https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-3</a>
Milestone: <a
href="https://github.com/pallets/werkzeug/milestone/35?closed=1">https://github.com/pallets/werkzeug/milestone/35?closed=1</a></p>
<ul>
<li>Only allow <code>localhost</code>, <code>.localhost</code>,
<code>127.0.0.1</code>, or the specified hostname when running the dev
server, to make debugger requests. Additional hosts can be added by
using the debugger middleware directly. The debugger UI makes requests
using the full URL rather than only the path. GHSA-2g68-c3qc-8985</li>
<li>Make reloader more robust when <code>&quot;&quot;</code> is in
<code>sys.path</code>. <a
href="https://redirect.github.com/pallets/werkzeug/issues/2823">#2823</a></li>
<li>Better TLS cert format with <code>adhoc</code> dev certs. <a
href="https://redirect.github.com/pallets/werkzeug/issues/2891">#2891</a></li>
<li>Inform Python &lt; 3.12 how to handle <code>itms-services</code>
URIs correctly, rather than using an overly-broad workaround in Werkzeug
that caused some redirect URIs to be passed on without encoding. <a
href="https://redirect.github.com/pallets/werkzeug/issues/2828">#2828</a></li>
<li>Type annotation for <code>Rule.endpoint</code> and other uses of
<code>endpoint</code> is <code>Any</code>. <a
href="https://redirect.github.com/pallets/werkzeug/issues/2836">#2836</a></li>
</ul>
<h2>3.0.2</h2>
<p>This is a fix release for the 3.0.x feature branch.</p>
<ul>
<li>Changes: <a
href="https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-2">https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-2</a></li>
</ul>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/pallets/werkzeug/blob/main/CHANGES.rst">werkzeug's
changelog</a>.</em></p>
<blockquote>
<h2>Version 3.0.3</h2>
<p>Released 2024-05-05</p>
<ul>
<li>
<p>Only allow <code>localhost</code>, <code>.localhost</code>,
<code>127.0.0.1</code>, or the specified
hostname when running the dev server, to make debugger requests.
Additional
hosts can be added by using the debugger middleware directly. The
debugger
UI makes requests using the full URL rather than only the path.
:ghsa:<code>2g68-c3qc-8985</code></p>
</li>
<li>
<p>Make reloader more robust when <code>&quot;&quot;</code> is in
<code>sys.path</code>. :pr:<code>2823</code></p>
</li>
<li>
<p>Better TLS cert format with <code>adhoc</code> dev certs.
:pr:<code>2891</code></p>
</li>
<li>
<p>Inform Python &lt; 3.12 how to handle <code>itms-services</code> URIs
correctly, rather
than using an overly-broad workaround in Werkzeug that caused some
redirect
URIs to be passed on without encoding. :issue:<code>2828</code></p>
</li>
<li>
<p>Type annotation for <code>Rule.endpoint</code> and other uses of
<code>endpoint</code> is
<code>Any</code>. :issue:<code>2836</code></p>
</li>
</ul>
<h2>Version 3.0.2</h2>
<p>Released 2024-04-01</p>
<ul>
<li>Ensure setting <code>merge_slashes</code> to <code>False</code>
results in <code>NotFound</code> for
repeated-slash requests against single slash routes.
:issue:<code>2834</code></li>
<li>Fix handling of <code>TypeError</code> in
<code>TypeConversionDict.get()</code> to match
<code>ValueError</code>. :issue:<code>2843</code></li>
<li>Fix <code>response_wrapper</code> type check in test client.
:issue:<code>2831</code></li>
<li>Make the return type of <code>MultiPartParser.parse</code> more
precise.
:issue:<code>2840</code></li>
<li>Raise an error if converter arguments cannot be parsed.
:issue:<code>2822</code></li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/pallets/werkzeug/commit/f9995e967979eb694d6b31536cc65314fd7e9c8c"><code>f9995e9</code></a>
release version 3.0.3</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/3386395b24c7371db11a5b8eaac0c91da5362692"><code>3386395</code></a>
Merge pull request from GHSA-2g68-c3qc-8985</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/890b6b62634fa61224222aee31081c61b054ff01"><code>890b6b6</code></a>
only require trusted host for evalex</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/71b69dfb7df3d912e66bab87fbb1f21f83504967"><code>71b69df</code></a>
restrict debugger trusted hosts</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/d2d3869525a4ffb2c41dfb2c0e39d94dab2d870c"><code>d2d3869</code></a>
endpoint type is Any (<a
href="https://redirect.github.com/pallets/werkzeug/issues/2895">#2895</a>)</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/7080b55acd48b68afdda65ee6c7f99e9afafb0ba"><code>7080b55</code></a>
endpoint type is Any</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/7555eff296fbdf12f2e576b6bbb0b506df8417ed"><code>7555eff</code></a>
remove iri_to_uri redirect workaround (<a
href="https://redirect.github.com/pallets/werkzeug/issues/2894">#2894</a>)</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/97fb2f722297ae4e12e36dab024e0acf8477b3c8"><code>97fb2f7</code></a>
remove _invalid_iri_to_uri workaround</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/249527ff981e7aa22cd714825c5637cc92df7761"><code>249527f</code></a>
make cn field a valid single hostname, and use wildcard in SANs field.
(<a
href="https://redirect.github.com/pallets/werkzeug/issues/2892">#2892</a>)</li>
<li><a
href="https://github.com/pallets/werkzeug/commit/793be472c9d145eb9be7d4200672d1806289d84a"><code>793be47</code></a>
update adhoc tls dev cert format</li>
<li>Additional commits viewable in <a
href="https://github.com/pallets/werkzeug/compare/3.0.1...3.0.3">compare
view</a></li>
</ul>
</details>
<br />



Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

commit 2f33320c 
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> 
Date: Tue May 07 2024 02:25:03 GMT-0400 (Eastern Daylight Time) 

    Bump werkzeug from 3.0.1 to 3.0.3 in /backend/dataall/base/cdkproxy (#1252)

Bumps [werkzeug](https://github.com/pallets/werkzeug) from 3.0.1 to
3.0.3.



Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

commit 0b49633f 
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> 
Date: Tue May 07 2024 02:24:34 GMT-0400 (Eastern Daylight Time) 

    Bump werkzeug from 3.0.1 to 3.0.3 in /tests_new/integration_tests (#1254)

Bumps [werkzeug](https://github.com/pallets/werkzeug) from 3.0.1 to
3.0.3.



Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

commit 08862420 
Author: mourya-33 <[email protected]> 
Date: Tue May 07 2024 02:15:15 GMT-0400 (Eastern Daylight Time) 

    Updated lambda_api.py to add encryption for lambda env vars for custo… (#1255)

Feature or Bugfix

    Bugfix

Detail

The environment variables of the Lambda functions are not encrypted in
the CDK stack, which Checkov scans flag. This fix enables KMS encryption
for the Lambda environment variables.

Relates


Security

Please answer the questions below briefly where applicable, or write
N/A. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes fetching data from storage outside the application (e.g. a
database, an S3 bucket)? N/A
  - Is the input sanitized? N/A
- What precautions are you taking before deserializing the data you
consume? N/A
  - Is injection prevented by parametrizing queries? N/A
  - Have you ensured no eval or similar functions are used? N/A
- Does this PR introduce any functionality or component that requires
authorization? N/A
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
N/A
  - Are you logging failed auth attempts? N/A
- Are you using or adding any cryptographic features? N/A
  - Do you use a standard proven implementations? N/A
  - Are the used keys controlled by the customer? Where are they stored?
The KMS keys are generated by CDK and are used to encrypt the
environment variables for all Lambda functions in the lambda-api stack
- Are you introducing any new policies/roles/users? N/A
  - Have you used the least-privilege principle? How? N/A

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit ed7cc3eb 
Author: Noah Paige <[email protected]> 
Date: Mon May 06 2024 09:32:30 GMT-0400 (Eastern Daylight Time) 

    Add order_by for paginated queries  (#1249)

### Feature or Bugfix
<!-- please choose -->
- Bugfix

### Detail
- This PR addresses the following two problems

- (1) For particular queries (those that perform `.outerjoin()`
operations and have their results paginated with the `paginate()`
function), the number of results returned is sometimes *less than* the
limit set by the pageSize of the paginate function, even when the total
count is greater than the pageSize
  - Ex 1: 11 envs total, `query_user_environments()` returns 9 envs on
the 1st page + 2 on the 2nd page
  - Ex 2: 10 envs total, `query_user_environments()` returns 9 envs on
the 1st page + no 2nd page

- We believe this happens because of the way SQLAlchemy is "uniquing"
the records that result from an outerjoin before returning that result
to the frontend

- Adding a `.distinct()` check on the query ensures each distinct record
is returned (tested successfully)

- (2) We often do not implement an `.order_by()` condition for the
query used in `paginate()`, so there is no stable way of preserving the
order of the items returned from a query (i.e. when navigating through
pages of the response)
  - It is generally good practice to include an `order_by()` on a column
or set of columns
  - For each query used in `paginate()`, this PR adds an `order_by()`
condition (full list in comments below)

More context is available in the related issue linked below
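
The suspected mechanism behind (1) can be reproduced outside of
SQLAlchemy with plain SQL. The sketch below uses the standard-library
`sqlite3` module and a hypothetical two-table schema (`environment`,
`permission`) that is not data.all's real model. It only illustrates how
applying the limit to joined rows and de-duplicating afterwards yields a
short page, and how `DISTINCT` plus `ORDER BY` avoids it.

```python
import sqlite3

# Hypothetical join: one environment can be reachable through several
# group permissions, so the outer join can return it more than once.
JOIN = "FROM environment e LEFT OUTER JOIN permission p ON p.env_uri = e.uri"


def build_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE environment (uri TEXT PRIMARY KEY);
        CREATE TABLE permission (env_uri TEXT, group_name TEXT);
        """
    )
    uris = [f"env-{i:02d}" for i in range(1, 12)]  # 11 environments
    conn.executemany("INSERT INTO environment VALUES (?)", [(u,) for u in uris])
    # env-01 is granted to two groups, so it appears twice after the join.
    perms = [(u, "team-a") for u in uris] + [("env-01", "team-b")]
    conn.executemany("INSERT INTO permission VALUES (?, ?)", perms)
    return conn


def page_naive(conn, page, page_size):
    """LIMIT counts joined rows; duplicates are only dropped afterwards."""
    rows = conn.execute(
        f"SELECT e.uri {JOIN} ORDER BY e.uri LIMIT ? OFFSET ?",
        (page_size, (page - 1) * page_size),
    ).fetchall()
    return list(dict.fromkeys(uri for (uri,) in rows))


def page_fixed(conn, page, page_size):
    """DISTINCT runs before LIMIT, and ORDER BY keeps the pages stable."""
    rows = conn.execute(
        f"SELECT DISTINCT e.uri {JOIN} ORDER BY e.uri LIMIT ? OFFSET ?",
        (page_size, (page - 1) * page_size),
    ).fetchall()
    return [uri for (uri,) in rows]


if __name__ == "__main__":
    conn = build_db()
    print(len(page_naive(conn, 1, 10)), len(page_naive(conn, 2, 10)))  # 9 2
    print(len(page_fixed(conn, 1, 10)), len(page_fixed(conn, 2, 10)))  # 10 1
```

With 11 environments and a page size of 10, the naive variant reproduces
Ex 1 above (9 on the first page, 2 on the second). The naive query keeps
an `ORDER BY` only so that the demonstration is reproducible.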

### Relates
- https://github.com/data-dot-all/dataall/issues/1241


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 98e67fa8 
Author: Sofia Sazonova <[email protected]> 
Date: Fri May 03 2024 12:21:57 GMT-0400 (Eastern Daylight Time) 

    fix: DATASET_READ_TABLE read permissions (#1237)

### Feature or Bugfix
- Bugfix


### Detail
- backfill DATASET_READ_TABLE permissions
- delete these permissions when dataset tables are revoked or deleted

### Relates
- #1173


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Co-authored-by: Sofia Sazonova <[email protected]>

commit 18e2f509 
Author: Noah Paige <[email protected]> 
Date: Fri May 03 2024 10:14:52 GMT-0400 (Eastern Daylight Time) 

    Fix local test groups listing for listGroups query (#1239)

### Feature or Bugfix
- Bugfix


### Detail
- Locally, when trying to invite a team to an Env or Org, we call
`listGroups`, but the returned `LOCAL_TEST_GROUPS` does not have the
expected data type


### Relates
N/A



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit a0be03c4 
Author: dlpzx <[email protected]> 
Date: Fri May 03 2024 10:12:34 GMT-0400 (Eastern Daylight Time) 

    Refactor: uncouple datasets and dataset_sharing modules - part 2-5 FINAL DELETE DATASETS_BASE (#1242)

### Feature or Bugfix
- Refactoring

### Detail
After all the previous PRs are merged, there should be no circular
dependencies between `datasets` and `datasets_sharing`. We can now
proceed to:
- move `datasets_base` models, repositories, permissions and enums to
`datasets`
- adjust the `__init__` files to establish that `datasets_sharing`
depends on `datasets`
- adjust the Module interfaces to ensure that all necessary dataset
models... are imported in the interface for sharing
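
The one-way dependency described above can be sketched as follows. This is a minimal illustration, not the data.all implementation: the `ModuleInterface` base class, the `depends_on` method and the `load_order` helper are assumptions made for the example.

```python
from typing import List, Type


class ModuleInterface:
    """Hypothetical base class: each module lists the modules it needs."""

    name: str = ''

    @staticmethod
    def depends_on() -> List[Type['ModuleInterface']]:
        return []


class DatasetsModule(ModuleInterface):
    # Depends on nothing, in particular not on the sharing module.
    name = 'datasets'


class DatasetSharingModule(ModuleInterface):
    name = 'dataset_sharing'

    @staticmethod
    def depends_on() -> List[Type[ModuleInterface]]:
        return [DatasetsModule]


def load_order(modules: List[Type[ModuleInterface]]) -> List[str]:
    """Return module names so that dependencies come first; fail on cycles."""
    ordered: List[str] = []
    visiting = set()

    def visit(module: Type[ModuleInterface]) -> None:
        if module.name in ordered:
            return
        if module.name in visiting:
            raise ValueError(f'circular dependency involving {module.name}')
        visiting.add(module.name)
        for dependency in module.depends_on():
            visit(dependency)
        visiting.discard(module.name)
        ordered.append(module.name)

    for module in modules:
        visit(module)
    return ordered
```

With this layout, `load_order([DatasetSharingModule, DatasetsModule])` puts `datasets` first, and a dependency from `datasets` back to sharing would raise instead of loading silently.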


Next steps:
- share_notifications parameter to dataset_sharing in config.json

### Relates
#955 



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit b68b40c1 
Author: Sofia Sazonova <[email protected]> 
Date: Fri May 03 2024 10:12:11 GMT-0400 (Eastern Daylight Time) 

    bugfix: EnvironmentGroup can remove other groups (#1234)

### Feature or Bugfix
- Bugfix


### Detail
- Now, if a group cannot update other groups, it also cannot remove
them.

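
A rough sketch of the rule, assuming a hypothetical permission name (`UPDATE_GROUPS`) and a plain list of group names; the real check in data.all may be structured differently:

```python
from typing import Iterable, List


class PermissionDenied(Exception):
    """Raised when the acting group lacks the required permission."""


# Hypothetical permission name, used only for this illustration.
UPDATE_GROUPS = 'UPDATE_GROUPS'


def remove_group(acting_permissions: Iterable[str], env_groups: List[str], target: str) -> List[str]:
    """Remove `target` only if the acting group may also update other groups."""
    if UPDATE_GROUPS not in set(acting_permissions):
        raise PermissionDenied('groups that cannot update other groups cannot remove them')
    return [group for group in env_groups if group != target]
```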
### Relates
- #1212 



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Co-authored-by: Sofia Sazonova <[email protected]>

commit 264539b5 
Author: Noah Paige <[email protected]> 
Date: Fri May 03 2024 05:23:11 GMT-0400 (Eastern Daylight Time) 

    Fix Alembic Migration: has table checks (#1240)

### Feature or Bugfix
- Bugfix

### Detail
- Fix the `has_table()` check to ensure the tables are dropped if they
exist as part of the Alembic migration upgrade
- Fix `DatasetLock nullable=True`
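
The guard pattern behind the first fix, shown here with the standard-library `sqlite3` module rather than Alembic so that it runs on its own. The helper names and the `dataset_lock` table name are assumptions for the example, not the migration's actual code:

```python
import sqlite3


def has_table(conn: sqlite3.Connection, name: str) -> bool:
    """Return True if a table called `name` exists in the database."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (name,),
    ).fetchone()
    return row is not None


def drop_table_if_present(conn: sqlite3.Connection, name: str) -> bool:
    """Drop the table only when it exists; report whether anything was dropped.

    `name` must come from trusted migration code, since identifiers
    cannot be bound as query parameters.
    """
    if has_table(conn, name):
        conn.execute(f'DROP TABLE "{name}"')
        return True
    return False
```

Running the guard twice is safe: the second call finds no table and does nothing, which is the behaviour a re-runnable upgrade step needs.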

### Relates
- https://github.com/data-dot-all/dataall/issues/1165

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)? No
  - Is the input sanitized? N/A
- What precautions are you taking before deserializing the data you
consume? N/A
  - Is injection prevented by parametrizing queries? N/A
  - Have you ensured no `eval` or similar functions are used? N/A
- Does this PR introduce any functionality or component that requires
authorization? No
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
N/A
  - Are you logging failed auth attempts? N/A
- Are you using or adding any cryptographic features? No
  - Do you use a standard proven implementations? N/A
- Are the used keys controlled by the customer? Where are they stored?
N/A
- Are you introducing any new policies/roles/users? No
  - Have you used the least-privilege principle? How? N/A


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 42a5f6bd 
Author: dlpzx <[email protected]> 
Date: Fri May 03 2024 02:24:09 GMT-0400 (Eastern Daylight Time) 

    Refactor: uncouple datasets and dataset_sharing modules - part 2-4 (#1214)

### Feature or Bugfix
- Refactoring
⚠️ MERGE AFTER https://github.com/data-dot-all/dataall/pull/1213

### Detail
This is needed as explained in full PR [AFTER 2.4] Refactor: uncouple
datasets and dataset_sharing modules #1179
- [X] Use interface to resolve dataset roles related to datasets shared
and implement logic in the dataset_sharing module
- [X] Extend and clean-up stewards share permissions through interface

### Relates
- #1179 



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 6d3f2d45 
Author: Sofia Sazonova <[email protected]> 
Date: Thu May 02 2024 10:55:00 GMT-0400 (Eastern Daylight Time) 

    [After 2.4]Core Refactoring part5 (#1194)

### Feature or Bugfix
- Refactoring

### Detail
- focus on core/environments
- move logic from resolvers to services
- create s3_client in base/aws --> TO BE REFACTORED. Needs to be merged
with dataset_sharing/aws/s3_client

### Relates
- #741 



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Co-authored-by: Sofia Sazonova <[email protected]>

commit 2ea24cbb 
Author: dlpzx <[email protected]> 
Date: Thu May 02 2024 08:22:12 GMT-0400 (Eastern Daylight Time) 

    Refactor: uncouple datasets and dataset_sharing modules - part 2-3 (#1213)

### Feature or Bugfix
- Refactoring
⚠️ MERGE AFTER https://github.com/data-dot-all/dataall/pull/1187

### Detail
This is needed as explained in full PR [AFTER 2.4] Refactor: uncouple
datasets and dataset_sharing modules #1179

- [X] Creates an interface to execute checks and clean-ups of data
sharing objects when dataset objects are deleted (initially it was going
to be a db interface, but I think it is better in the service)
- [X] Move listDatasetShares query to dataset_sharing module in
https://github.com/data-dot-all/dataall/pull/1185
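
One way such a service-level interface could look. The class and method names below (`DatasetDeletionHook`, `check_before_delete`, `execute_on_delete`, `SharingHook`) are assumptions for illustration and are not taken from the PR:

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class DatasetDeletionHook(ABC):
    """Hypothetical interface that a dependent module (e.g. sharing) implements."""

    @abstractmethod
    def check_before_delete(self, dataset_uri: str) -> None:
        """Raise if the dataset still has objects that block deletion."""

    @abstractmethod
    def execute_on_delete(self, dataset_uri: str) -> None:
        """Clean up the module's own objects once deletion goes ahead."""


class DatasetService:
    """Knows only the interface, never the sharing module itself."""

    _hooks: List[DatasetDeletionHook] = []

    @classmethod
    def register(cls, hook: DatasetDeletionHook) -> None:
        cls._hooks.append(hook)

    @classmethod
    def delete_dataset(cls, dataset_uri: str) -> bool:
        # Run every check first so nothing is cleaned up if any check fails.
        for hook in cls._hooks:
            hook.check_before_delete(dataset_uri)
        for hook in cls._hooks:
            hook.execute_on_delete(dataset_uri)
        return True


class SharingHook(DatasetDeletionHook):
    """Example implementation living in the sharing module."""

    def __init__(self, active_shares: Dict[str, int]) -> None:
        self.active_shares = active_shares  # dataset URI -> number of active shares
        self.cleaned: List[str] = []

    def check_before_delete(self, dataset_uri: str) -> None:
        if self.active_shares.get(dataset_uri, 0) > 0:
            raise RuntimeError(f'{dataset_uri} still has active shares')

    def execute_on_delete(self, dataset_uri: str) -> None:
        self.cleaned.append(dataset_uri)
```

The datasets side calls only `DatasetService.delete_dataset`, so it has no import from the sharing module; the sharing module registers its hook when it is loaded.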

### Relates
-  #1179



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 750a5ec8 
Author: Anushka Singh <[email protected]> 
Date: Wed May 01 2024 12:28:18 GMT-0400 (Eastern Daylight Time) 

    Feature:1221 - Make visibility of auto-approval toggle configurable based on confidentiality (#1223)

### Feature or Bugfix

- Feature


### Detail
- Users should be able to disable the visibility of the auto-approval
toggle with code. For example, at our company, we require that shares
always go through the approval process if their confidentiality
classification is Secret. We don't even want to give users the option to
enable autoApproval, so that they cannot do so by mistake and end up
oversharing.
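
The rule can be expressed as a small predicate. This is only a sketch: the function name, the `hidden_for` parameter and the `Official` level are assumptions, and the actual change lives in the frontend configuration:

```python
from typing import Collection


def show_auto_approval_toggle(confidentiality: str, hidden_for: Collection[str] = ('Secret',)) -> bool:
    """Return True when the auto-approval toggle should be visible.

    `hidden_for` holds the confidentiality levels for which the toggle
    is hidden, so shares at those levels always need manual approval.
    """
    return confidentiality not in hidden_for
```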

Video demo:
https://github.com/data-dot-all/dataall/issues/1221#issuecomment-2077412044

### Relates
- https://github.com/data-dot-all/dataall/issues/1221



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 82044689 
Author: dlpzx <[email protected]> 
Date: Wed May 01 2024 12:26:42 GMT-0400 (Eastern Daylight Time) 

    Refactor: uncouple datasets and dataset_sharing modules - part 2-2 (#1187)

### Feature or Bugfix
- Refactoring
⚠️ MERGE AFTER https://github.com/data-dot-all/dataall/pull/1185

### Detail
This is needed as explained in full PR [AFTER 2.4] Refactor: uncouple
datasets and dataset_sharing modules #1179
- Split the getDatasetAssumeRole API into 2 APIs, one for dataset owners
role (in datasets module) and another one for share requester roles (in
datasets_sharing module)

### Relates
-  #1179



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 5173419f 
Author: Noah Paige <[email protected]> 
Date: Wed May 01 2024 12:24:42 GMT-0400 (Eastern Daylight Time) 

    Fix so listValidEnvironments called only once (#1238)

### Feature or Bugfix
- Bugfix

### Detail
- When requesting access to a share on data.all, the
`listValidEnvironments` query used to be called twice, which (depending
on how long the query results took to return) could cause the initially
selected environment to disappear


### Relates
- Continuation of https://github.com/data-dot-all/dataall/issues/916



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 7656ea86 
Author: dlpzx <[email protected]> 
Date: Tue Apr 30 2024 07:13:01 GMT-0400 (Eastern Daylight Time) 

    Add integration tests on a real API client and integrate the tests in CICD (#1219)

### Feature or Bugfix
- Feature

### Detail
Add integration tests that use a real Client to execute different
validation actions.

- Define the Client and the way API calls are posted to API Gateway in
the conftest
- Define the Cognito users and the different fixtures needed for all
tests
- Write tests for the Organization core module as example
- Add feature flag in `cdk.json` called `with_approval_tests` that can
be defined at the deployment environment level. If set to True, a
CodeBuild stage running the tests is created.
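
How the flag might gate the extra stage, as a sketch only. The flag name `with_approval_tests` comes from the description above; the function, the shape of the environment config dict and the stage names are assumptions:

```python
from typing import Dict, List


def pipeline_stages(env_config: Dict[str, object]) -> List[str]:
    """Return the stage names for one deployment environment."""
    stages = ['deploy-backend']  # hypothetical stage name
    # The test stage is created only when the flag is explicitly enabled.
    if env_config.get('with_approval_tests', False):
        stages.append('approval-tests')  # hypothetical stage name
    return stages
```

Defaulting the flag to `False` keeps existing deployments unchanged until an environment opts in.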

### Relates
- https://github.com/data-dot-all/dataall/issues/1220

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage …
dlpzx added a commit that referenced this issue Jun 27, 2024
…art 11 (renaming and cleaning up s3_shares) (#1359)

### Feature or Bugfix
- Refactoring

### Detail
As explained in the design for #1123 and #1283 we are trying to
implement generic `datasets_base` and `shares_base` modules that can be
used by any type of datasets and by any type of shareable object in a
generic way.

This is one of the last PRs focused on renaming files and cleaning up
the s3_datasets_shares module. The first step is a consolidation of the
file and class names in the services to clearly refer to s3_shares:
- `services.managed_share_policy_service.SharePolicyService` -->
`services.s3_share_managed_policy_service.S3SharePolicyService`
- `services.dataset_sharing_alarm_service.DatasetSharingAlarmService`
--> `services.s3_share_alarm_service.S3ShareAlarmService`

👀 The main refactoring happens in what used to be
`services.dataset_sharing_service`.
- The part that implements the `DatasetServiceInterface` has been moved
to `services/s3_share_dataset_service.py` as the `S3ShareDatasetService`
- The part used in the resolvers and by other methods has been renamed
to `services.s3_share_service.py`, and the methods for the folder/table
permissions have also been added to the S3ShareService (from
share_item_service)

Lastly, there is one method previously in share_item_service that has
been moved to the GlueClient directly as
`get_glue_database_from_catalog`.


### Relates
- #1283 
- #1123 
- #955 



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.
noah-paige added a commit that referenced this issue Aug 30, 2024
commit 22a6f6ef 
Author: Noah Paige <[email protected]> 
Date: Mon Jul 08 2024 11:28:07 GMT-0400 (Eastern Daylight Time) 

    Add integ tests


commit 4fb7d653 
Author: Noah Paige <[email protected]> 
Date: Mon Jul 08 2024 11:26:36 GMT-0400 (Eastern Daylight Time) 

    Merge env test changes


commit 4cf42e8 
Author: Petros Kalos <[email protected]> 
Date: Fri Jul 05 2024 08:19:34 GMT-0400 (Eastern Daylight Time) 

    improve docs


commit 65f930a 
Author: Petros Kalos <[email protected]> 
Date: Fri Jul 05 2024 08:10:56 GMT-0400 (Eastern Daylight Time) 

    fix failures


commit 170b7ce 
Author: Petros Kalos <[email protected]> 
Date: Wed Jul 03 2024 10:52:20 GMT-0400 (Eastern Daylight Time) 

    add group/consumption_role invite/remove tests


commit ba77d69 
Author: dlpzx <[email protected]> 
Date: Wed Jul 03 2024 06:51:47 GMT-0400 (Eastern Daylight Time) 

    Rename alias for env_vars kms key in cognito lambdas FE and BE (#1385)

### Feature or Bugfix
- Bugfix

### Detail
For the case in which we deploy FE and BE in us-east-1 the new lambda
env_key alias is the same one for TriggerFunctionCognitoUrlsConfig in FE
and for TriggerFunctionCognitoConfig in BE, which results in a failure
of the CICD in the FE stack because the alias already exists.

This PR changes the name of both aliases to avoid this conflict. It also
adds envname to avoid issues with other deployment environments/tooling
account in the future
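
A sketch of the naming idea, assuming a hypothetical alias format; the real alias strings in the CDK stacks are not shown in this description:

```python
def lambda_env_key_alias(envname: str, stack: str) -> str:
    """Build a KMS alias that is unique per deployment environment and stack.

    The format is an assumption for illustration. Including both `envname`
    and `stack` avoids collisions when frontend and backend are deployed
    to the same region and account.
    """
    return f'alias/{envname}-{stack}-lambda-env-key'
```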

### Relates



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit e5923a9 
Author: dlpzx <[email protected]> 
Date: Wed Jul 03 2024 04:27:11 GMT-0400 (Eastern Daylight Time) 

    Fix lambda_env_key out of scope for vpc-facing cognito setup (#1384)

### Feature or Bugfix
- Bugfix

### Detail
The KMS key for the Lambda environment variables in the Cognito IdP
stack was defined inside an if-clause for the internet-facing frontend.
Outside of that if, for the vpc-facing architecture, the KMS key does not
exist and the CICD pipeline fails. This PR moves the creation of the KMS
key outside of the if.
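
The scoping problem can be reproduced in plain Python. The functions and resource names below are placeholders for illustration, not the CDK code from the stack:

```python
from typing import Dict


def build_stack_buggy(internet_facing: bool) -> Dict[str, str]:
    resources: Dict[str, str] = {}
    if internet_facing:
        # The key only exists when this branch runs.
        lambda_env_key = 'kms-key'
        resources['urls_config_lambda'] = lambda_env_key
    # Fails with UnboundLocalError for the vpc-facing case.
    resources['cognito_config_lambda'] = lambda_env_key
    return resources


def build_stack_fixed(internet_facing: bool) -> Dict[str, str]:
    resources: Dict[str, str] = {}
    # Create the key before branching so both architectures can use it.
    lambda_env_key = 'kms-key'
    if internet_facing:
        resources['urls_config_lambda'] = lambda_env_key
    resources['cognito_config_lambda'] = lambda_env_key
    return resources
```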

### Relates



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 3ccacfc 
Author: Noah Paige <[email protected]> 
Date: Mon Jul 01 2024 13:56:58 GMT-0400 (Eastern Daylight Time) 

    Add delete docs not found when re indexing in catalog task (#1365)

### Feature or Bugfix
- Feature

### Detail
- Add logic to Catalog Indexer Task to Delete Docs No Longer in RDS
- TODO: Add Ability to Re-index Catalog Items via Dataall Admin UI
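
The core of the deletion step is a set difference between what the catalog index holds and what still exists in RDS. A minimal sketch, assuming both sides can be listed as identifier strings (the function name is hypothetical):

```python
from typing import Iterable, List


def docs_to_delete(indexed_doc_ids: Iterable[str], rds_uris: Iterable[str]) -> List[str]:
    """Return the indexed document ids that no longer have a row in RDS."""
    return sorted(set(indexed_doc_ids) - set(rds_uris))
```

Items that exist in RDS but are missing from the index are not affected by this step; those are covered by the re-indexing itself.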

### Relates
- #1078



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit e2817a1 
Author: Noah Paige <[email protected]> 
Date: Mon Jul 01 2024 05:14:07 GMT-0400 (Eastern Daylight Time) 

    Fix/glossary status (#1373)

### Feature or Bugfix
- Bugfix


### Detail
- Add back `status` to Glossary GQL Object for GQL Operations
(getGlossary, listGlossaries)
- Fix `listOrganizationGroupPermissions`: enforce non-null on FE


### Relates




By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit c3c58bd 
Author: Petros Kalos <[email protected]> 
Date: Fri Jun 28 2024 06:55:42 GMT-0400 (Eastern Daylight Time) 

    add environment tests (#1371)

### Feature or Bugfix
Feature

### Detail
* add list_environment tests
* add test for updating an environment (via update_stack)
* generalise the polling functions for stacks
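
A generalised polling helper could look roughly like this. The name `poll_until`, its parameters and the status strings in the usage note are assumptions, not the helpers added in the PR:

```python
import time
from typing import Callable, TypeVar

T = TypeVar('T')


def poll_until(
    fetch: Callable[[], T],
    is_done: Callable[[T], bool],
    timeout_s: float = 600.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> T:
    """Call `fetch` until `is_done` accepts its result or the timeout expires."""
    deadline = clock() + timeout_s
    while True:
        value = fetch()
        if is_done(value):
            return value
        if clock() >= deadline:
            raise TimeoutError(f'gave up after {timeout_s}s, last value: {value!r}')
        sleep(interval_s)
```

A test would pass a function that fetches the stack status as `fetch` and a predicate such as `lambda s: s.endswith('_COMPLETE')` as `is_done`; `sleep` and `clock` are injectable so unit tests do not have to wait.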

### Relates
#1220 



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit e913d48 
Author: dlpzx <[email protected]> 
Date: Fri Jun 28 2024 04:15:49 GMT-0400 (Eastern Daylight Time) 

    Add search (Autocomplete) in miscellaneous dropdowns (#1367)

### Feature or Bugfix
- Feature

### Detail
Autocomplete for environments and teams in the following frontend views
as requested in #1012. In this case the views required custom dropdowns.

❗ I used `noOptionsText` whenever it was necessary instead of checking
groupOptions length > 0
- [x] DatasetEditForm.js -> ❗ I kept the stewards field as `freesolo` -
what that means is that users CAN specify options that are not on the
list. I would like the reviewer to confirm this is what we want. At the
end stewardship is a delegation of permissions, it makes sense that
delegation happens to other teams. Also changed DatasetCreateForm
- [X] RequestDashboardAccessModal.js - already implemented, minor
changes
- [X] EnvironmentTeamInviteForm.js - already implemented, minor changes.
-> Kept `freesolo` because invited teams might not be the user teams.
Same reason why there is no check for groupOptions == 0, if there are no
options there is still the free text option.
- [X] EnvironmentRoleAddForm.js
- [X] NetworkCreateModal.js 

### Relates
- #1012 

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use standard, proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit ee71d7b 
Author: Tejas Rajopadhye <[email protected]> 
Date: Thu Jun 27 2024 14:08:27 GMT-0400 (Eastern Daylight Time) 

    [Gh 1301] Enhancement Feature - Bulk share reapply on dataset  (#1363)

### Feature or Bugfix
- Feature


### Detail

- Adds feature to reapply shares in bulk for a dataset. 
- Also contains bugfix for AWS worker lambda errors 

### Relates
- #1301
- #1364

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)? N/A
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization? N/A
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features? N/A
  - Do you use standard, proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users? N/A
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Co-authored-by: trajopadhye <[email protected]>

commit 27f1ad7 
Author: Noah Paige <[email protected]> 
Date: Thu Jun 27 2024 13:18:32 GMT-0400 (Eastern Daylight Time) 

    Convert Dataset Lock Mechanism to Generic Resource Lock (#1338)

### Feature or Bugfix
- Feature
- Bugfix
- Refactoring

### Detail
- Convert Dataset Lock Mechanism to Generic Resource Lock
- Extend locking to Share principals (i.e. EnvironmentGroup and
Consumption Roles)

- Making locking a generic component not tied to datasets
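A minimal sketch of what a lock that is not tied to datasets could look like. Everything here is an assumption for illustration: the `ResourceLockStore` class, its method names, and the in-memory set standing in for a database table are hypothetical and are not taken from the PR.

```python
from contextlib import contextmanager
from typing import Iterable, Iterator, Set, Tuple

# (resource_uri, resource_type); both fields are hypothetical names.
Resource = Tuple[str, str]


class ResourceLockStore:
    """In-memory stand-in for a lock table (illustrative only)."""

    def __init__(self) -> None:
        self._locked: Set[Resource] = set()

    def acquire(self, resources: Iterable[Resource]) -> bool:
        wanted = set(resources)
        # All-or-nothing: refuse if any requested resource is already locked.
        if wanted & self._locked:
            return False
        self._locked |= wanted
        return True

    def release(self, resources: Iterable[Resource]) -> None:
        self._locked -= set(resources)

    @contextmanager
    def locked(self, resources: Iterable[Resource]) -> Iterator[None]:
        resources = list(resources)
        if not self.acquire(resources):
            raise RuntimeError("one of the resources is already locked")
        try:
            yield
        finally:
            # Always release, even if the guarded work raises.
            self.release(resources)
```

Because the key is a generic (URI, type) pair, the same store can guard a dataset and a share principal (for example an environment group) in a single acquire call.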


### Relates
- #1093 

---------

Co-authored-by: dlpzx <[email protected]>

commit e3b8658 
Author: Petros Kalos <[email protected]> 
Date: Thu Jun 27 2024 12:50:59 GMT-0400 (Eastern Daylight Time) 

    ignore ruff change in blame (#1372)

commit 2e80de4 
Author: dlpzx <[email protected]> 
Date: Thu Jun 27 2024 10:59:18 GMT-0400 (Eastern Daylight Time) 

    Generic shares_base module and specific s3_datasets_shares module - part 11 (renaming and cleaning up s3_shares) (#1359)

### Feature or Bugfix
- Refactoring

### Detail
As explained in the design for #1123 and #1283 we are trying to
implement generic `datasets_base` and `shares_base` modules that can be
used by any type of datasets and by any type of shareable object in a
generic way.

This is one of the last PRs focused on renaming files and cleaning up
the s3_datasets_shares module. The first step is a consolidation of the
file and class names in the services to clearly refer to s3_shares:
- `services.managed_share_policy_service.SharePolicyService` -->
`services.s3_share_managed_policy_service.S3SharePolicyService`
- `services.dataset_sharing_alarm_service.DatasetSharingAlarmService`
--> `services.s3_share_alarm_service.S3ShareAlarmService`

👀 The main refactoring happens in what used to be
`services.dataset_sharing_service`.
- The part that implements the `DatasetServiceInterface` has been moved
to `services/s3_share_dataset_service.py` as the `S3ShareDatasetService`
- The part used in the resolvers and by other methods has been renamed
as `services.s3_share_service.py` and the methods for the folder/table
permissions are also added to the S3ShareService (from
share_item_service)

Lastly, there is one method previously in share_item_service that has
been moved to the GlueClient directly as
`get_glue_database_from_catalog`.


### Relates
- #1283 
- #1123 
- #955 

commit 1c09015 
Author: Noah Paige <[email protected]> 
Date: Thu Jun 27 2024 04:16:14 GMT-0400 (Eastern Daylight Time) 

    fix listOrganizationGroupPermissions (#1369)

### Feature or Bugfix
- Bugfix


### Detail
- Fix listOrganizationGroupPermissions


commit 976ec6b 
Author: dlpzx <[email protected]> 
Date: Thu Jun 27 2024 04:13:14 GMT-0400 (Eastern Daylight Time) 

    Add search (Autocomplete) in create pipelines (#1368)

### Feature or Bugfix
- Feature

### Detail
Autocomplete for environments and teams in the following frontend views
as requested in #1012.
This PR implements it for createPipelines

### Relates
- #1012 

commit 6c909a3 
Author: Noah Paige <[email protected]> 
Date: Wed Jun 26 2024 11:18:04 GMT-0400 (Eastern Daylight Time) 

    fix migration to not rely on OrganizationService or RequestContext (#1361)

### Feature or Bugfix
- Bugfix

### Detail
- Ensure the migration script does not need RequestContext; otherwise it
fails in the migration trigger Lambda because context info is not set or
available


### Relates
- #1306

commit 90835fb 
Author: Anushka Singh <[email protected]> 
Date: Wed Jun 26 2024 11:17:22 GMT-0400 (Eastern Daylight Time) 

    Issue1248: Persistent Email Reminders (#1354)

### Feature or Bugfix
- Feature


### Detail
- When a share request is initiated and remains pending for an extended
period, dataset producers will receive automated email reminders at
predefined intervals. These reminders will prompt producers to either
approve or extend the share request, thereby preventing delays in
accessing datasets.

Attaching screenshots for emails:

<img width="1336" alt="Screenshot 2024-06-20 at 5 34 31 PM"
src="https://github.com/data-dot-all/dataall/assets/26413731/d7be28c3-5c98-4146-92b1-295e136137a3">

<img width="1322" alt="Screenshot 2024-06-20 at 5 34 52 PM"
src="https://github.com/data-dot-all/dataall/assets/26413731/047556e8-59ee-4ebf-b8a7-c0a6684e2a63">


- The email is sent every Monday at 9 AM UTC. The schedule can be changed
via the cron expression in container.py
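
To make the schedule concrete, here is a small sketch that computes the next "Monday at 09:00 UTC" run for a given moment. The helper name `next_monday_9am_utc` is hypothetical; the actual cron expression in container.py is not shown in this PR description and is not reproduced here.

```python
from datetime import datetime, timedelta, timezone


def next_monday_9am_utc(now: datetime) -> datetime:
    """Return the first Monday 09:00 UTC strictly after `now` (timezone-aware)."""
    now = now.astimezone(timezone.utc)
    candidate = now.replace(hour=9, minute=0, second=0, microsecond=0)
    # Monday is weekday 0; move forward to the coming Monday (0 days if today).
    candidate += timedelta(days=(0 - now.weekday()) % 7)
    if candidate <= now:
        # Already past this week's slot, so the next run is a week later.
        candidate += timedelta(days=7)
    return candidate


# Example: the commit timestamp above falls on a Wednesday.
print(next_monday_9am_utc(datetime(2024, 6, 26, 15, 17, tzinfo=timezone.utc)))
```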

### Relates
- #1248

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Anushka Singh <[email protected]>
Co-authored-by: trajopadhye <[email protected]>
Co-authored-by: Mohit Arora <[email protected]>
Co-authored-by: rbernota <[email protected]>
Co-authored-by: Rick Bernotas <[email protected]>
Co-authored-by: Raj Chopde <[email protected]>
Co-authored-by: Noah Paige <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: jaidisido <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: mourya-33 <[email protected]>
Co-authored-by: nikpodsh <[email protected]>
Co-authored-by: MK <[email protected]>
Co-authored-by: Manjula <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Daniel Lorch <[email protected]>
Co-authored-by: Tejas Rajopadhye <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Sofia Sazonova <[email protected]>
Co-authored-by: Sofia Sazonova <[email protected]>

commit e477bdf 
Author: Noah Paige <[email protected]> 
Date: Wed Jun 26 2024 10:39:09 GMT-0400 (Eastern Daylight Time) 

    Enforce non null on GQL query string if non null defined (#1362)

### Feature or Bugfix
- Bugfix


### Detail
- Add `String!` to ensure non null input argument on FE if defined as
such on backend GQL operation for `listS3DatasetsSharedWithEnvGroup`


commit d6b59b3 
Author: Noah Paige <[email protected]> 
Date: Wed Jun 26 2024 08:48:52 GMT-0400 (Eastern Daylight Time) 

    Fix Init Share Base (#1360)

### Feature or Bugfix
- Bugfix

### Detail
- Need to register processors in init for s3 dataset shares API module


commit bd3698c 
Author: Petros Kalos <[email protected]> 
Date: Wed Jun 26 2024 05:19:14 GMT-0400 (Eastern Daylight Time) 

    split cognito urls setup and cognito user creation (#1366)

### Feature or Bugfix
- Bugfix
### Details
For more details about the issue, read #1353.
In this PR we solve the problem by splitting the configuration of
Cognito in two.
* The first part (cognito_users_config.py) sets up the required groups
and users and runs after the UserPool deployment
* The second part (cognito_urls_config.py) sets up Cognito's
callback/logout URLs and runs after the CloudFront deployment

We chose to split the functionality because we need to have the
users/groups set up for the integration tests, which are run after the
backend deployment.

The alternative is to keep the config functionality as one but
make the integ tests run after the CloudFront stage.

### Relates
- Solves #1353 

noah-paige added a commit that referenced this issue Aug 30, 2024
commit 4425e756 
Author: Noah Paige <[email protected]> 
Date: Mon Jul 08 2024 11:57:31 GMT-0400 (Eastern Daylight Time) 

    Fix


commit 4cd2bf77 
Author: Noah Paige <[email protected]> 
Date: Mon Jul 08 2024 11:56:38 GMT-0400 (Eastern Daylight Time) 

    Fix


commit 22a6f6ef 
Author: Noah Paige <[email protected]> 
Date: Mon Jul 08 2024 11:28:07 GMT-0400 (Eastern Daylight Time) 

    Add integ tests


commit 4fb7d653 
Author: Noah Paige <[email protected]> 
Date: Mon Jul 08 2024 11:26:36 GMT-0400 (Eastern Daylight Time) 

    Merge env test changes


commit 4cf42e8 
Author: Petros Kalos <[email protected]> 
Date: Fri Jul 05 2024 08:19:34 GMT-0400 (Eastern Daylight Time) 

    improve docs


commit 65f930a 
Author: Petros Kalos <[email protected]> 
Date: Fri Jul 05 2024 08:10:56 GMT-0400 (Eastern Daylight Time) 

    fix failures


commit 170b7ce 
Author: Petros Kalos <[email protected]> 
Date: Wed Jul 03 2024 10:52:20 GMT-0400 (Eastern Daylight Time) 

    add group/consumption_role invite/remove tests


commit ba77d69 
Author: dlpzx <[email protected]> 
Date: Wed Jul 03 2024 06:51:47 GMT-0400 (Eastern Daylight Time) 

    Rename alias for env_vars kms key in cognito lambdas FE and BE (#1385)

### Feature or Bugfix
- Bugfix

### Detail
For the case in which we deploy FE and BE in us-east-1 the new lambda
env_key alias is the same one for TriggerFunctionCognitoUrlsConfig in FE
and for TriggerFunctionCognitoConfig in BE, which results in a failure
of the CICD in the FE stack because the alias already exists.

This PR changes the name of both aliases to avoid this conflict. It also
adds envname to avoid issues with other deployment environments/tooling
account in the future
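
A rough sketch of the naming idea, assuming the alias is built from a prefix, the environment name, and a per-stack suffix. The function `lambda_env_key_alias`, its parameters, and the resulting alias format are hypothetical; the real alias strings are in the PR diff, not in this description.

```python
def lambda_env_key_alias(resource_prefix: str, envname: str, stack: str) -> str:
    """Build a KMS alias that is unique per deployment environment and per stack.

    `stack` distinguishes the frontend Cognito URLs trigger from the backend
    Cognito config trigger, so both can live in the same region and account.
    """
    return f"{resource_prefix}-{envname}-{stack}-lambda-env-var-key"


# With FE and BE both deployed in us-east-1, the two aliases no longer collide.
fe_alias = lambda_env_key_alias("dataall", "prod", "cognito-urls-config")
be_alias = lambda_env_key_alias("dataall", "prod", "cognito-config")
print(fe_alias, be_alias)
```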

commit e5923a9 
Author: dlpzx <[email protected]> 
Date: Wed Jul 03 2024 04:27:11 GMT-0400 (Eastern Daylight Time) 

    Fix lambda_env_key out of scope for vpc-facing cognito setup (#1384)

### Feature or Bugfix
- Bugfix

### Detail
The KMS key for the Lambda environment variables in the Cognito IdP
stack was defined inside an if-clause for the internet-facing frontend.
Outside of that if, for the vpc-facing architecture, the KMS key does not
exist and the CICD pipeline fails. This PR moves the creation of the KMS
key outside of the if.
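
The failure mode can be reproduced in plain Python. This is only an illustration of the scoping problem; the function names and the string standing in for the KMS key are made up and do not mirror the actual CDK stack code.

```python
def build_stack_buggy(internet_facing: bool) -> str:
    if internet_facing:
        lambda_env_key = "kms-key"  # only defined on the internet-facing path
    # The key is needed on both paths, so this fails for vpc-facing setups.
    return lambda_env_key


def build_stack_fixed(internet_facing: bool) -> str:
    lambda_env_key = "kms-key"  # created unconditionally, before the branch
    if internet_facing:
        pass  # internet-facing-only resources would go here
    return lambda_env_key


try:
    build_stack_buggy(internet_facing=False)
except UnboundLocalError as err:
    print(f"vpc-facing deployment fails: {err}")

print(build_stack_fixed(internet_facing=False))
```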

commit 3ccacfc 
Author: Noah Paige <[email protected]> 
Date: Mon Jul 01 2024 13:56:58 GMT-0400 (Eastern Daylight Time) 

    Add delete docs not found when re indexing in catalog task (#1365)

### Feature or Bugfix
- Feature

### Detail
- Add logic to Catalog Indexer Task to Delete Docs No Longer in RDS
- TODO: Add Ability to Re-index Catalog Items via Dataall Admin UI
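
The core of the cleanup step can be sketched as a set difference between the document IDs currently in the index and the IDs still present in RDS. The function name `delete_stale_docs` and the `delete_doc` callback are hypothetical placeholders, not the task's real API.

```python
from typing import Callable, Iterable, List


def delete_stale_docs(
    indexed_ids: Iterable[str],
    rds_ids: Iterable[str],
    delete_doc: Callable[[str], None],
) -> List[str]:
    """Delete every indexed document whose ID no longer exists in RDS."""
    stale = sorted(set(indexed_ids) - set(rds_ids))
    for doc_id in stale:
        delete_doc(doc_id)  # e.g. a call into the search client
    return stale


deleted: List[str] = []
removed = delete_stale_docs(["a", "b", "c"], ["a", "c", "d"], deleted.append)
print(removed)
```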

### Relates
- #1078

commit e2817a1 
Author: Noah Paige <[email protected]> 
Date: Mon Jul 01 2024 05:14:07 GMT-0400 (Eastern Daylight Time) 

    Fix/glossary status (#1373)

### Feature or Bugfix
- Bugfix


### Detail
- Add back `status` to Glossary GQL Object for GQL Operations
(getGlossary, listGlossaries)
- Fix  `listOrganizationGroupPermissions` enforce non null on FE


commit c3c58bd 
Author: Petros Kalos <[email protected]> 
Date: Fri Jun 28 2024 06:55:42 GMT-0400 (Eastern Daylight Time) 

    add environment tests (#1371)

### Feature or Bugfix
Feature

### Detail
* add list_environment tests
* add test for updating an environment (via update_stack)
* generalise the polling functions for stacks
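
A generalised polling helper might look roughly like the sketch below. It is an assumption about the shape of such a function, not the code added in this PR: `poll_until`, its parameters, and the status strings in the example are illustrative.

```python
import time
from typing import Callable


def poll_until(
    get_status: Callable[[], str],
    is_done: Callable[[str], bool],
    timeout: float = 600.0,
    interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call `get_status` until `is_done(status)` is true or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if is_done(status):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still in status {status!r} after {timeout}s")
        sleep(interval)


# Usage with a fake stack that completes on the third check.
statuses = iter(["CREATE_IN_PROGRESS", "CREATE_IN_PROGRESS", "CREATE_COMPLETE"])
final = poll_until(
    get_status=lambda: next(statuses),
    is_done=lambda s: not s.endswith("IN_PROGRESS"),
    sleep=lambda _: None,  # skip real waiting in the example
)
print(final)
```

Passing the status getter and the completion check as callables lets one helper serve environment stacks, dataset stacks, or any other resource with a status field.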

### Relates
#1220 

commit e913d48 
Author: dlpzx <[email protected]> 
Date: Fri Jun 28 2024 04:15:49 GMT-0400 (Eastern Daylight Time) 

    Add search (Autocomplete) in miscellaneous dropdowns (#1367)

### Feature or Bugfix
- Feature

### Detail
Autocomplete for environments and teams in the following frontend views
as requested in #1012. In this case the views required custom dropdowns.

❗ I used `noOptionsText` wherever it was necessary instead of checking
that the groupOptions length is > 0
- [x] DatasetEditForm.js -> ❗ I kept the stewards field as `freeSolo`,
which means that users CAN specify options that are not on the
list. I would like the reviewer to confirm this is what we want. In the
end, stewardship is a delegation of permissions, so it makes sense that
delegation can happen to other teams. Also changed DatasetCreateForm
- [X] RequestDashboardAccessModal.js - already implemented, minor
changes
- [X] EnvironmentTeamInviteForm.js - already implemented, minor changes.
-> Kept `freeSolo` because invited teams might not be the user's teams.
For the same reason there is no check for groupOptions == 0: if there are
no options, there is still the free-text option.
- [X] EnvironmentRoleAddForm.js
- [X] NetworkCreateModal.js 

### Relates
- #1012 

commit ee71d7b 
Author: Tejas Rajopadhye <[email protected]> 
Date: Thu Jun 27 2024 14:08:27 GMT-0400 (Eastern Daylight Time) 

    [Gh 1301] Enhancement Feature - Bulk share reapply on dataset  (#1363)

### Feature or Bugfix
- Feature


### Detail

- Adds feature to reapply shares in bulk for a dataset. 
- Also contains bugfix for AWS worker lambda errors 

### Relates
- #1301
- #1364

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)? N/A
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization? N/A
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features? N/A
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users? N/A
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Co-authored-by: trajopadhye <[email protected]>

commit 27f1ad7 
Author: Noah Paige <[email protected]> 
Date: Thu Jun 27 2024 13:18:32 GMT-0400 (Eastern Daylight Time) 

    Convert Dataset Lock Mechanism to Generic Resource Lock (#1338)

### Feature or Bugfix
<!-- please choose -->
- Feature
- Bugfix
- Refactoring

### Detail
- Convert Dataset Lock Mechanism to Generic Resource Lock
- Extend locking to Share principals (i.e. EnvironmentGroup and
Consumption Roles)

- Making locking a generic component not tied to datasets
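
A minimal sketch of what a lock keyed by generic resources, rather than by datasets, could look like. This is an in-memory illustration only; `ResourceLockRegistry`, the `(resource_uri, resource_type)` key shape and the method names are assumptions made for the example and are not taken from the data.all code base.

```python
import threading
from contextlib import contextmanager
from typing import Iterable, Iterator, Set, Tuple

# Hypothetical key shape: (resource_uri, resource_type),
# e.g. ("dataset-1", "dataset") or ("team-a", "environment_group").
ResourceKey = Tuple[str, str]


class ResourceLockRegistry:
    """In-memory stand-in for a lock table; for illustration only."""

    def __init__(self) -> None:
        self._held: Set[ResourceKey] = set()
        self._guard = threading.Lock()

    def acquire(self, resources: Iterable[ResourceKey]) -> bool:
        """Lock every requested resource, or none of them."""
        wanted = set(resources)
        with self._guard:
            if wanted & self._held:
                return False  # at least one resource is already locked
            self._held |= wanted
            return True

    def release(self, resources: Iterable[ResourceKey]) -> None:
        """Release the given resources; unknown keys are ignored."""
        with self._guard:
            self._held -= set(resources)

    @contextmanager
    def locked(self, resources: Iterable[ResourceKey]) -> Iterator[None]:
        """Hold the locks for the duration of a `with` block."""
        keys = list(resources)
        if not self.acquire(keys):
            raise RuntimeError(f"could not lock {keys}")
        try:
            yield
        finally:
            self.release(keys)
```

With this shape, a share can lock its dataset and its principal in one call, for example `registry.locked([("dataset-1", "dataset"), ("team-a", "environment_group")])`, and a second share that touches either resource has to wait.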


### Relates
- #1093 

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Co-authored-by: dlpzx <[email protected]>

commit e3b8658 
Author: Petros Kalos <[email protected]> 
Date: Thu Jun 27 2024 12:50:59 GMT-0400 (Eastern Daylight Time) 

    ignore ruff change in blame (#1372)

### Feature or Bugfix
<!-- please choose -->
- Feature
- Bugfix
- Refactoring

### Detail
- <feature1 or bug1>
- <feature2 or bug2>

### Relates
- <URL or Ticket>

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 2e80de4 
Author: dlpzx <[email protected]> 
Date: Thu Jun 27 2024 10:59:18 GMT-0400 (Eastern Daylight Time) 

    Generic shares_base module and specific s3_datasets_shares module - part 11 (renaming and cleaning up s3_shares) (#1359)

### Feature or Bugfix
- Refactoring

### Detail
As explained in the design for #1123 and #1283 we are trying to
implement generic `datasets_base` and `shares_base` modules that can be
used by any type of datasets and by any type of shareable object in a
generic way.

This is one of the last PRs focused on renaming files and cleaning up
the s3_datasets_shares module. The first step is a consolidation of the
file and class names in the services to clearly refer to s3_shares:
- `services.managed_share_policy_service.SharePolicyService` -->
`services.s3_share_managed_policy_service.S3SharePolicyService`
- `services.dataset_sharing_alarm_service.DatasetSharingAlarmService`
--> `services.s3_share_alarm_service.S3ShareAlarmService`

👀 The main refactoring happens in what used to be
`services.dataset_sharing_service`.
- The part that implements the `DatasetServiceInterface` has been moved
to `services/s3_share_dataset_service.py` as the `S3ShareDatasetService`
- The part used in the resolvers and by other methods has been renamed
as `services.s3_share_service.py` and the methods for the folder/table
permissions are also added to the S3ShareService (from
share_item_service)

Lastly, there is one method previously in share_item_service that has
been moved to the GlueClient directly as
`get_glue_database_from_catalog`.


### Relates
- #1283 
- #1123 
- #955 

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 1c09015 
Author: Noah Paige <[email protected]> 
Date: Thu Jun 27 2024 04:16:14 GMT-0400 (Eastern Daylight Time) 

    fix listOrganizationGroupPermissions (#1369)

### Feature or Bugfix
<!-- please choose -->
- Bugfix


### Detail
- Fix listOrganizationGroupPermissions


### Relates
- <URL or Ticket>

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 976ec6b 
Author: dlpzx <[email protected]> 
Date: Thu Jun 27 2024 04:13:14 GMT-0400 (Eastern Daylight Time) 

    Add search (Autocomplete) in create pipelines (#1368)

### Feature or Bugfix
- Feature

### Detail
Autocomplete for environments and teams in the following frontend views
as requested in #1012.
This PR implements it for createPipelines

### Relates
- #1012 

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 6c909a3 
Author: Noah Paige <[email protected]> 
Date: Wed Jun 26 2024 11:18:04 GMT-0400 (Eastern Daylight Time) 

    fix migration to not rely on OrganizationService or RequestContext (#1361)

### Feature or Bugfix
<!-- please choose -->
- Bugfix

### Detail
- Ensure the migration script does not need RequestContext; otherwise it
fails in the migration trigger Lambda because the context info is not
set or available
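
The idea can be sketched as follows, assuming a helper that receives its database connection explicitly instead of reading it from a request-scoped context. All names here (`_request_context`, `list_organization_uris`, the `organization` table layout) are hypothetical and use `sqlite3` only to keep the example self-contained.

```python
import sqlite3
from typing import List, Optional

# Hypothetical request-scoped context: set while serving an API request,
# never set when the code runs inside a migration trigger.
_request_context: Optional[dict] = None


def list_organization_uris_via_context() -> List[str]:
    """Service-style helper: only works while a request context exists."""
    if _request_context is None:
        raise RuntimeError("RequestContext is not set")
    return list_organization_uris(_request_context["connection"])


def list_organization_uris(connection: sqlite3.Connection) -> List[str]:
    """Migration-friendly helper: depends only on the connection it is given."""
    rows = connection.execute(
        "SELECT organizationUri FROM organization ORDER BY organizationUri"
    )
    return [row[0] for row in rows]
```

A migration would call the second function with the connection it already holds, so nothing depends on state that only exists during an API request.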


### Relates
- #1306

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 90835fb 
Author: Anushka Singh <[email protected]> 
Date: Wed Jun 26 2024 11:17:22 GMT-0400 (Eastern Daylight Time) 

    Issue1248: Persistent Email Reminders (#1354)

### Feature or Bugfix
- Feature


### Detail
- When a share request is initiated and remains pending for an extended
period, dataset producers will receive automated email reminders at
predefined intervals. These reminders will prompt producers to either
approve or extend the share request, thereby preventing delays in
accessing datasets.

Attaching screenshots for emails:

<img width="1336" alt="Screenshot 2024-06-20 at 5 34 31 PM"
src="https://github.com/data-dot-all/dataall/assets/26413731/d7be28c3-5c98-4146-92b1-295e136137a3">

<img width="1322" alt="Screenshot 2024-06-20 at 5 34 52 PM"
src="https://github.com/data-dot-all/dataall/assets/26413731/047556e8-59ee-4ebf-b8a7-c0a6684e2a63">


- Emails are sent every Monday at 9am UTC. The schedule can be changed
in the cron expression in container.py
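
To make the schedule concrete, here is a small sketch that computes the next "Monday at 9am UTC" run time from a given moment. It does not reproduce the cron expression used in container.py; the function name `next_monday_9am_utc` is an assumption for illustration.

```python
from datetime import datetime, timedelta, timezone


def next_monday_9am_utc(now: datetime) -> datetime:
    """Return the first Monday 09:00 UTC strictly after `now`.

    `now` must be timezone-aware.
    """
    now = now.astimezone(timezone.utc)
    candidate = now.replace(hour=9, minute=0, second=0, microsecond=0)
    # datetime.weekday() returns 0 for Monday.
    candidate += timedelta(days=(0 - candidate.weekday()) % 7)
    if candidate <= now:
        # It is already Monday 09:00 or later, so move to next week.
        candidate += timedelta(days=7)
    return candidate
```

For example, called with Wednesday 26 June 2024 11:17 UTC it returns Monday 1 July 2024 09:00 UTC.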

### Relates
- #1248

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Anushka Singh <[email protected]>
Co-authored-by: trajopadhye <[email protected]>
Co-authored-by: Mohit Arora <[email protected]>
Co-authored-by: rbernota <[email protected]>
Co-authored-by: Rick Bernotas <[email protected]>
Co-authored-by: Raj Chopde <[email protected]>
Co-authored-by: Noah Paige <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: jaidisido <[email protected]>
Co-authored-by: dlpzx <[email protected]>
Co-authored-by: mourya-33 <[email protected]>
Co-authored-by: nikpodsh <[email protected]>
Co-authored-by: MK <[email protected]>
Co-authored-by: Manjula <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Daniel Lorch <[email protected]>
Co-authored-by: Tejas Rajopadhye <[email protected]>
Co-authored-by: Zilvinas Saltys <[email protected]>
Co-authored-by: Sofia Sazonova <[email protected]>
Co-authored-by: Sofia Sazonova <[email protected]>

commit e477bdf 
Author: Noah Paige <[email protected]> 
Date: Wed Jun 26 2024 10:39:09 GMT-0400 (Eastern Daylight Time) 

    Enforce non null on GQL query string if non null defined (#1362)

### Feature or Bugfix
<!-- please choose -->
- Bugfix


### Detail
- Add `String!` to ensure non null input argument on FE if defined as
such on backend GQL operation for `listS3DatasetsSharedWithEnvGroup`


### Relates

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit d6b59b3 
Author: Noah Paige <[email protected]> 
Date: Wed Jun 26 2024 08:48:52 GMT-0400 (Eastern Daylight Time) 

    Fix Init Share Base (#1360)

### Feature or Bugfix
<!-- please choose -->
- Bugfix

### Detail
- Need to register processors in init for s3 dataset shares API module


### Relates

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit bd3698c 
Author: Petros Kalos <[email protected]> 
Date: Wed Jun 26 2024 05:19:14 GMT-0400 (Eastern Daylight Time) 

    split cognito urls setup and cognito user creation (#1366)

### Feature or Bugfix
- Bugfix
### Details
For more details about the issue read #1353 
In this PR we are solving the problem by splitting the configuration of
Cognito into two parts.
* First part (cognito_users_config.py) is setting up the required groups
and users and runs after UserPool deployment
* Second part (cognito_urls_config.py) is setting up Cognito's
callback/logout urls and runs after the CloudFront deployment

We chose to split the functionality because we need to have the
users/groups setup for the integration tests which are run after the
backend deployment.

The other alternative is to keep the config functionality as one but
make the integ tests run after the CloudFront stage.

### Relates
- Solves #1353 

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.