Feature request: create a message from a URL #1311

Open
reidsunderland opened this issue Nov 28, 2024 · 3 comments
Assignees: reidsunderland
Labels: good first issue (Good for newcomers), Sugar (nice to have features... not super important.), wishlist (would be nice, not pressing for any particular client.)

Comments

@reidsunderland
Member

reidsunderland commented Nov 28, 2024

In poll plugins, it's kind of painful to create a message from a URL, especially when the "baseUrl" is different from the pollUrl, and changes or is unknown (so you can't hard code a post_baseUrl in the config).

I've had to do this (or some variant of it) a lot:

href = "https://somedomain.com/example/path/to/file.txt"
# search from position 8 to skip over the slashes in "http(s)://"
first_slash_pos = href.find('/', 8) + 1
baseUrl = href[:first_slash_pos]   # "https://somedomain.com/"
relPath = href[first_slash_pos:]   # "example/path/to/file.txt"
msg = sarracenia.Message.fromFileInfo(relPath, self.o, stat)
msg['baseUrl'] = baseUrl
msg['new_baseUrl'] = baseUrl
msg['_deleteOnPost'] |= {'post_baseUrl'}
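
For comparison, the same split can be sketched with the standard library's `urllib.parse.urlsplit` instead of a hard-coded search offset. The function name `split_url` is hypothetical and not part of sarracenia; the sketch assumes an absolute URL and that the base is always the root of the host.

```python
from urllib.parse import urlsplit


def split_url(url):
    """Split an absolute URL into (baseUrl, relPath).

    Hypothetical helper, not a sarracenia API. Assumes the base is the
    root of the host, e.g. "https://host/" for "https://host/a/b.txt".
    """
    parts = urlsplit(url)
    if not (parts.scheme and parts.netloc):
        raise ValueError(f"not an absolute URL: {url!r}")
    base_url = f"{parts.scheme}://{parts.netloc}/"
    # Drop the leading slash so the path is relative to base_url.
    rel_path = parts.path.lstrip("/")
    # Keep any query string attached to the relative path.
    if parts.query:
        rel_path += "?" + parts.query
    return base_url, rel_path
```

Because `netloc` includes any port or credentials, this also handles URLs such as `http://host:8080/dir/file`, which the fixed offset of 8 handles only by coincidence.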

It's possible that I'm overlooking something and maybe there's an easy way to do this.

But I would like to just do:

msg = sarracenia.Message.fromFileInfo("https://somedomain.com/example/path/to/file.txt", self.o, stat)

I'm creating this mostly as a to-do list item for myself.

Other examples:

from urllib.parse import urlparse  # at the top of the plugin module

# finally create the message!
if data_url:
    # The message is created using the post_baseUrl and relative path
    url = urlparse(data_url)
    baseUrl = url.scheme + "://" + url.netloc + "/"
    self.o.post_baseUrl = baseUrl
    m = sarracenia.Message.fromFileInfo(url.path, self.o)
    # When Sarracenia runs updatePaths again later, from sarracenia.Flow, self.o.post_baseUrl will be
    # different, so set msg['post_baseUrl'] here to override whatever setting it has at that point.
    m['post_baseUrl'] = baseUrl
    m['_deleteOnPost'] |= {'post_baseUrl'}

for endpoint in objects_by_endpoint_bucket:
    for bucket in objects_by_endpoint_bucket[endpoint]:
        for obj in objects_by_endpoint_bucket[endpoint][bucket]:
            stat = paramiko.SFTPAttributes()
            if 'LastModified' in obj:
                t = obj["LastModified"].timestamp()
                stat.st_atime = t
                stat.st_mtime = t
            if 'Size' in obj:
                stat.st_size = obj['Size']
            file_path = bucket + '/' + obj['Key']
            msg = sarracenia.Message.fromFileInfo(file_path, self.o, stat)
            # The (new_)baseUrl field will be set to the post_baseUrl from the config, or pollUrl if
            # post_baseUrl is not set. We need to override it here, because the baseUrl can change if the
            # files are coming from different endpoints.
            msg['baseUrl'] = endpoint
            msg['new_baseUrl'] = endpoint
            # When Sarracenia runs updatePaths again later, from sarracenia.Flow, self.o.post_baseUrl will be
            # different, so set msg['post_baseUrl'] here to override whatever setting it has at that point.
            msg['post_baseUrl'] = endpoint
            msg['_deleteOnPost'] |= {'post_baseUrl'}

reidsunderland added the good first issue, wishlist, and Sugar labels Nov 28, 2024
reidsunderland self-assigned this Nov 28, 2024
@petersilva
Contributor

another option... (not proposing... exploring.):

provide overrides for options.

  • fromFileInfo(path, o, lstat=None, overrides=None):

  • overrides is a dictionary of option overrides (for post_baseUrl in this case.)

  • or have msg['overrides'] = { 'post_baseUrl' : 'http...' }

provide option for specific override:

another option ... automated baseUrl extraction... (assume base is root...)

  • fromFileInfo( ... auto_base_url_extraction=False )
    when true override setting...

because this is the only case that causes problems.

fix code to not override the msg['post_baseUrl'] setting..

there's an issue about that... #951
... code could check if new_post_baseURL is set.. and leave it alone...
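
A rough sketch of that last idea, where a value already present on the message wins over the configured option. This is only an illustration: `resolve_post_base_url` is a hypothetical name, `msg` is treated as a plain dict, and `configured_post_base_url` stands in for `self.o.post_baseUrl`. It does not describe how sarracenia's updatePaths is actually written.

```python
def resolve_post_base_url(msg, configured_post_base_url):
    """Pick the post_baseUrl to use for a message.

    Hypothetical helper: if the message already carries a post_baseUrl
    (for example, one set by a poll plugin), leave it alone; otherwise
    fall back to the value from the configuration.
    """
    existing = msg.get("post_baseUrl")
    if existing:
        return existing
    return configured_post_base_url
```

With a rule like this, a plugin that sets msg['post_baseUrl'] would not need to worry about a later pass replacing it with the configured value.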

@petersilva
Contributor

gradually understanding what Reid was saying... so... if the path passed to the existing "fromFileInfo" is a url, use the root of that url as the baseUrl. ... I can see that ...
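
One way to decide whether the path argument is a URL, sketched with the standard library only. The name `looks_like_url` is hypothetical; the check assumes that anything with both a scheme and a host counts as a URL and everything else is an ordinary path.

```python
from urllib.parse import urlsplit


def looks_like_url(path):
    """Return True if `path` has both a scheme and a network location.

    Hypothetical check, not a sarracenia API. Requiring a netloc keeps
    plain relative or absolute file paths from being treated as URLs.
    """
    parts = urlsplit(path)
    return bool(parts.scheme and parts.netloc)
```

fromFileInfo could branch on a check like this and keep its current behaviour for ordinary paths.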

@reidsunderland
Member Author

Yeah, exactly, just an easy way to do that.

Right now, if you have just a full normal URL string (e.g. "https://example.com/path/to/file.txt") that does not match the pollUrl/post_baseUrl in the config, to create a message you have to:

  1. parse the url string into a baseUrl (https://example.com/) and relPath (path/to/file.txt)
  2. create a message using fromFileInfo("path/to/file.txt", ...) - this sets the relPath correctly, but the baseUrl is wrong
  3. override the baseUrl by setting msg['post_baseUrl'] manually *
  4. add post_baseUrl to the delete on post set

I was thinking a whole new function like fromUrl would make sense, but changing fromFileInfo to handle URLs could work too (although it's already a bit of a big function).

* Technically, post_baseUrl is the only field in the message that you need to set, but I also set baseUrl and new_baseUrl, because I think those values show up in some log output before sr3 replaces baseUrl and new_baseUrl with the contents of post_baseUrl.
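
The four steps could be wrapped up roughly as follows. This is a sketch under stated assumptions, not an existing sarracenia function: `message_from_url` is a hypothetical name, and `make_message` is a stand-in callable for a call such as `sarracenia.Message.fromFileInfo(rel_path, self.o, stat)` so that the example runs without sarracenia installed.

```python
from urllib.parse import urlsplit


def message_from_url(url, make_message):
    """Build a message whose baseUrl is taken from `url` itself.

    Hypothetical helper. `make_message(rel_path)` must return a dict-like
    message that already contains a '_deleteOnPost' set, as messages
    created by fromFileInfo do in the snippets above.
    """
    parts = urlsplit(url)
    if not (parts.scheme and parts.netloc):
        raise ValueError(f"not an absolute URL: {url!r}")

    # Step 1: split into baseUrl and relPath (base assumed to be the root).
    base_url = f"{parts.scheme}://{parts.netloc}/"
    rel_path = parts.path.lstrip("/")

    # Step 2: create the message from the relative path.
    msg = make_message(rel_path)

    # Step 3: override the base URL fields on the message.
    msg["post_baseUrl"] = base_url
    msg["baseUrl"] = base_url
    msg["new_baseUrl"] = base_url

    # Step 4: mark post_baseUrl for deletion when the message is posted.
    msg["_deleteOnPost"] |= {"post_baseUrl"}
    return msg
```

Inside a poll plugin, the call might look like `message_from_url(href, lambda p: sarracenia.Message.fromFileInfo(p, self.o, stat))`.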
