S3 arguments #8

kimpham54 · 2024-12-19T21:21:43Z

new arguments for you

./bin/docker-run.sh s3-file -h
./bin/docker-run.sh s3 -h

Testing steps:

Install as usual:

python3 -m venv myenv
source myenv/bin/activate
export PYTHONPATH="${PYTHONPATH}:src"
pip install -r requirements.txt
python src/jp2_remediator/main.py -h

1. Move test files into the repo
2. mkdir logs
3. Uncomment the option in ./bin/docker-run.sh that passes AWS credentials, if you want to use AWS
4. Create AWS credentials and place them in a .env file in the bin directory:

AWS_ACCESS_KEY_ID=
AWS_SECRET_ACCESS_KEY=
AWS_SESSION_TOKEN=

5. ./bin/docker-build.sh

docker entrypoint/docker exec

container has to be alive, e.g. docker run --rm --mount type=bind,source=${PWD},target=/data -it --entrypoint /bin/bash artifactory.huit.harvard.edu/lts/jp2_remediator $@

./bin/docker-run.sh file /data/[your_test_images_folder]/[testfile.jp2]
./bin/docker-run.sh directory /data/[your_test_images_folder]

INPUT BUCKET/PREFIX > OUTPUT BUCKET/PREFIX
Only the input bucket name is required. The other arguments are optional; if no output bucket is given, the input bucket is used as the output.

./bin/docker-run.sh s3 \
[input-bucket-name] \
--input-prefix [input/paths] \
--output-bucket [output-bucket-name] \
--output-prefix [output/paths]

INPUT FILE > OUTPUT FILE

./bin/docker-run.sh s3-file \
lts-jp2-remediation-dev \
--input-key [input-path/to/file.jp2] \
--output-bucket lts-jp2-remediation-dev \
--output-key [output-path/to/file.jp2]

INPUT FILE > OUTPUT PREFIX
Specifying an output prefix instead of an output key appends _modified_yyyymmdd.jp2 to the name of the single file:

./bin/docker-run.sh s3-file \
lts-jp2-remediation-dev \
--input-key [input-path/to/file.jp2] \
--output-bucket [output-bucket-name] \
--output-prefix [output/paths]

For both s3 and s3-file, if the "folder"/path given in --output-prefix doesn't exist, it will be created.
A custom filename can be given in --output-key; otherwise the output name defaults to the input name with _modified_yyyymmdd.jp2 appended.
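
The naming rule above can be sketched roughly as follows. This is only an illustration, not the code in this PR: `build_output_key` and its parameters are hypothetical names, and it assumes that an empty prefix places the file at the top of the bucket and that the original extension is replaced by the `_modified_yyyymmdd.jp2` suffix.

```python
import posixpath
from datetime import date


def build_output_key(input_key, output_prefix="", output_key=None, today=None):
    """Hypothetical helper: derive the S3 key for the remediated file."""
    # An explicit --output-key wins over any default naming.
    if output_key:
        return output_key

    # Default name: <basename>_modified_<yyyymmdd>.jp2
    stamp = (today or date.today()).strftime("%Y%m%d")
    base, _ext = posixpath.splitext(posixpath.basename(input_key))
    filename = f"{base}_modified_{stamp}.jp2"

    # S3 has no real folders; a prefix is just the leading part of the key.
    # Assumption: an empty prefix means the key is the bare filename.
    prefix = output_prefix.strip("/")
    return f"{prefix}/{filename}" if prefix else filename


print(build_output_key("input-path/to/file.jp2",
                       output_prefix="output/paths",
                       today=date(2024, 12, 19)))
# output/paths/file_modified_20241219.jp2
```

Because S3 keys are flat strings, "creating the folder" needs no extra step here: uploading to a key with a new prefix is enough.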

awoods · 2024-12-20T01:18:29Z

src/jp2_remediator/processor.py

+
+        # Download the file from S3
+        download_path = f"/tmp/{os.path.basename(input_key)}"
+        print(f"Downloading file: {input_key} from bucket: {input_bucket}")


Please replace all print statements with self.logger.info(...)
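
A minimal sketch of what that change could look like, assuming the processor class holds a `self.logger` built on the standard `logging` module. The class name `Processor` and the method `describe_download` are stand-ins for illustration, not names taken from processor.py.

```python
import logging
import os


class Processor:
    """Hypothetical stand-in for the class in processor.py."""

    def __init__(self, logger=None):
        # Assumption: the real class already sets up self.logger somewhere.
        self.logger = logger or logging.getLogger(__name__)

    def describe_download(self, input_bucket, input_key):
        # Same local path as in the diff above.
        download_path = f"/tmp/{os.path.basename(input_key)}"
        # Lazy %-style arguments: the message is only formatted
        # when the INFO level is actually enabled.
        self.logger.info(
            "Downloading file: %s from bucket: %s", input_key, input_bucket
        )
        return download_path


logging.basicConfig(level=logging.INFO)
Processor().describe_download("example-bucket", "path/to/file.jp2")
```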

awoods · 2024-12-20T01:20:46Z

src/jp2_remediator/main.py

+        "s3-file", help="Process a single JP2 file in S3"
+    )
+    s3_file_parser.add_argument(
+        "input_bucket", help="Name of the AWS S3 bucket containing the JP2 file"


Change from:

"input_bucket", help="Name of the AWS S3 bucket containing the JP2 file"

to:

"--input_bucket", help="Name of the AWS S3 bucket containing the JP2 file", required=True

Should this still be required after adding the -- flag? Also, it will be --input-bucket instead of --input_bucket.
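
For reference, a small standalone sketch of how `argparse` treats this case. The parser below is a cut-down illustration, not the one in main.py; the `prog` name and help strings are placeholders. Flags that start with `--` are optional by default, so `required=True` is what keeps the bucket mandatory, and a dash in the flag name becomes an underscore in the attribute name.

```python
import argparse

parser = argparse.ArgumentParser(prog="jp2_remediator")
subparsers = parser.add_subparsers(dest="command")

s3_file_parser = subparsers.add_parser(
    "s3-file", help="Process a single JP2 file in S3"
)
# Without required=True, omitting --input-bucket would silently yield None.
s3_file_parser.add_argument(
    "--input-bucket", required=True,
    help="Name of the AWS S3 bucket containing the JP2 file",
)
s3_file_parser.add_argument(
    "--input-key", required=True,
    help="Key (path) of the JP2 file in the S3 bucket",
)

args = parser.parse_args(
    ["s3-file", "--input-bucket", "my-bucket", "--input-key", "a/b.jp2"]
)
# "--input-bucket" on the command line is read back as args.input_bucket.
print(args.input_bucket, args.input_key)
```

Note that `required=True` is only valid for `--` flags; passing it to a positional argument raises a `TypeError`.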

awoods · 2024-12-20T01:21:32Z

src/jp2_remediator/main.py

+    s3_file_parser.add_argument(
+        "--input-key", help="Key (path) of the JP2 file in the S3 bucket", required=True
+    )
+    s3_file_parser.add_argument(


Make required

awoods · 2024-12-20T01:21:45Z

src/jp2_remediator/main.py

+    s3_file_parser.add_argument(
+        "--output-bucket", help="Name of the AWS S3 bucket to upload the modified file (optional)"
+    )
+    s3_file_parser.add_argument(


We can remove this argument

Do you want to leave it as an option or remove it completely? If it stays optional and is not passed, the output bucket defaults to the input bucket.

Keep this one, after discussion with Andrew.
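
The fallback described in this thread could be expressed as below. This is a sketch only: `resolve_output_bucket` is a hypothetical helper, and the real code may apply the default elsewhere.

```python
def resolve_output_bucket(input_bucket, output_bucket=None):
    """Hypothetical helper: use the input bucket when no output bucket is given."""
    # argparse leaves an omitted optional flag as None, so "or" covers
    # both None and an empty string.
    return output_bucket or input_bucket


print(resolve_output_bucket("lts-jp2-remediation-dev"))
print(resolve_output_bucket("lts-jp2-remediation-dev", "other-bucket"))
```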

awoods · 2024-12-20T01:22:04Z

src/jp2_remediator/main.py

+    s3_file_parser.add_argument(
+        "--output-prefix", help="Prefix for the uploaded file in the output bucket (optional)", default=""
+    )
+    s3_file_parser.add_argument(


Make required

Right now it is optional. Is there a situation where you wouldn't want to use the --output-prefix option, such as wanting to put the file in the same input directory or the same bucket?

kimpham54 · 2024-12-20T14:46:34Z

Take out the s3 (bucket-only) option; it is not needed.

kimpham54 added 9 commits December 18, 2024 09:38

added s3 output bucket and folder arguments

ea1d09f

added new test for output bucket and dir

9f4cb0d

clean up some flake8 linting flags

d59a03b

clean up test

a8baf0f

clean up test

52b29b8

coverage test again

b49265c

new s3 an s3-file arguments

3f3e2b3

add docker file

fd06b0c

remove hhmmss to timestamp of s3-file

7f3dadf

awoods requested changes Dec 20, 2024
