-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Retire SupportsMoveFromTempLocAfterFillingIt from DataUrlTarget #115
Conversation
It's not needed anymore, presumably due to changes in the move() algorithm.
processing_instructions=pis) | ||
return self.move_from_records_directory(directory, | ||
processing_instructions) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function said "Move something to this filesystem by first creating a temporary directory ON THIS SAME FILESYSTEM and then copying from there". Kind of silly, since you can also just copy directly to the intended destination (see the
The purpose of move_from_temp_loc_after_filling_it()
is for systems like Redshift where the fastest way to copy things in is from an S3 bucket. In those cases, even if the source isn't an S3 bucket, we want to copy to a temporary S3 bucket before loading.
I'm guessing this function in this file was implemented at some point of time in the past in order to work around some wart or another in the move()
algorithm. The move()
algorithm uses these Python interfaces to figure out a way to get data from the source to the target in a multiple dispatch way.
There's been a lot of incremental development and refactoring among all of these things, and evolutionary dead-ends are not unknown :)
@@ -16,7 +14,6 @@ | |||
|
|||
|
|||
class DataUrlTarget(SupportsMoveFromDataframes, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Targets in Records Mover represent the second part in a move (e.g., in a Redshift to S3 copy, S3 is the target and Redshift is the source).
DataUrlTarget represents any kind of filesystem-like thing - currently it works with file://
, gs://
and s3://
URLs, but will probably get sftp://
support at some point as well.
Hi @crvena-sonja! Just pinging on this review. This is review is mainly for a second developer to have a little more familiarity with some Records Mover internals. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
☠️
It's not needed anymore - the integration tests still pass, and it doesn't make any sense on its face.
I ran across the need to change this interface (see #114) and noticed that it was being used in a nonsensical way here.