-
Notifications
You must be signed in to change notification settings - Fork 132
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix gcs hidden error #930
Fix gcs hidden error #930
Changes from 14 commits
0477959
6d6a8ea
e0050a8
ca9f0ee
5a56817
e3e4f4f
f586e1e
a5ddd1f
3c0d07f
fa22728
b52b144
e82b833
5b8f6aa
09c56db
a110359
afd4a1a
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,3 @@ | ||
[flake8] | ||
max-line-length = 100 | ||
max-line-length = 200 | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -110,6 +110,7 @@ venv.bak/ | |
# scratch | ||
scratch* | ||
old!_* | ||
test.ipynb | ||
|
||
# vscode | ||
.vscode/ | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -6,7 +6,7 @@ repos: | |
language_version: python3 | ||
args: [ | ||
'--extend-ignore=E203,W503', | ||
'--max-line-length=100' | ||
'--max-line-length=200' | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Same comment about line length |
||
] | ||
- repo: https://github.com/psf/black | ||
rev: 22.3.0 | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -7,7 +7,6 @@ | |
import logging | ||
import time | ||
import uuid | ||
from grpc import StatusCode | ||
import gzip | ||
import shutil | ||
from typing import Optional | ||
|
@@ -374,7 +373,6 @@ def copy_bucket_to_gcs( | |
aws_secret_access_key (str): | ||
Secret key to authenticate storage transfer | ||
""" | ||
|
||
if source not in ["gcs", "s3"]: | ||
raise ValueError( | ||
f"Blob transfer only supports gcs and s3 sources [source={source}]" | ||
|
@@ -439,12 +437,11 @@ def copy_bucket_to_gcs( | |
|
||
# Create the transfer job | ||
create_result = client.create_transfer_job(create_transfer_job_request) | ||
logger.info(f"Created TransferJob: {create_result.name}") | ||
|
||
polling = True | ||
wait_time = 0 | ||
wait_between_attempts_in_sec = 10 | ||
max_wait_in_sec = 60 * 10 # Ten Minutes | ||
# max_wait_in_sec = 60 * 10 # Ten Minutes | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If this variable isn't being used in the final version of the code, we should probably delete the line. |
||
|
||
# NOTE: This value defaults to an empty string until GCP | ||
# triggers the job internally ... we'll use this value to | ||
|
@@ -453,27 +450,33 @@ def copy_bucket_to_gcs( | |
|
||
while polling: | ||
if latest_operation_name: | ||
|
||
operation = client.get_operation({"name": latest_operation_name}) | ||
|
||
if not operation.done: | ||
logger.debug("Operation still running...") | ||
|
||
else: | ||
if int(operation.error.code) not in StatusCode["OK"].value: | ||
operation_metadata = storage_transfer.TransferOperation.deserialize( | ||
operation.metadata.value | ||
) | ||
error_output = operation_metadata.error_breakdowns | ||
if len(error_output) != 0: | ||
raise Exception( | ||
f"""{blob_storage} to GCS Transfer Job {create_result.name} failed with error: {operation.error.message} | ||
f"""{blob_storage} to GCS Transfer Job {create_result.name} failed with error: {error_output} | ||
""" | ||
) | ||
if operation.response: | ||
else: | ||
sharinetmc marked this conversation as resolved.
Show resolved
Hide resolved
|
||
logger.info(f"TransferJob: {create_result.name} succeeded.") | ||
return | ||
return | ||
|
||
else: | ||
logger.info("Waiting to kickoff operation...") | ||
get_transfer_job_request = storage_transfer.GetTransferJobRequest( | ||
{"job_name": create_result.name, "project_id": self.project} | ||
) | ||
get_result = client.get_transfer_job(request=get_transfer_job_request) | ||
logger.info(f"get_result: {get_result}") | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Could this be a debug log, instead of an info? Is this something that would be helpful to someone looking through the log of a higher-level ETL script after it had an error, or something? |
||
latest_operation_name = get_result.latest_operation_name | ||
|
||
wait_time += wait_between_attempts_in_sec | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Intentional? If so, this should probably be a separate PR because it's pretty opinionated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this is still present. Did we get approval from the Parsons community on doubling the maximum line length to 200?
I noticed you had an issue with a long string. Python allows you to separate strings inside parenthesis with a line break to avoid long strings breaking line length limitations. Like this: