Add command string-to-variable to reuse incoming string as variable #533
Comments
It can not be set by input modules, because they don't know anything about records at that point. OTOH, the source location (URL, path) is not available anymore when the decoder receives the stream and there is (currently) no way to transport it out-of-band. Setting the ID to the source location would also mean that (potentially) multiple records would get the same ID, so it violates the uniqueness guarantee. It might, however, be possible to save the URL in a variable which can then be used in the transformation. Maybe along the following lines:

default inputUrl = "https://phet-dev.colorado.edu/html/build-an-atom/0.0.0-3/simple-text-only-test-page.html";

inputUrl
| open-http(accept="application/xml")
| decode-html
| fix("set_field('_id', '$[inputUrl]')", *)
| change-id
| fix("copy_field('_id', '_id')")
| encode-json(prettyPrinting="true")
| print
;
I would be fine with a variable that could be used in both the Fix and the Flux. It would help in this scenario. It would also be nice to use the variable in, e.g., logging contexts or elsewhere in the Flux, but that would be an additional feature.
So your initial use case is solved?
I'm not sure I understand this part. Do you mean that all variables should be included whenever anything is logged? And what other contexts are you referring to?
I think if I could use the variable in the Fix, my use case would be solved, yes. :)
One scenario where the variable could be handy is configuring the logging message so that it includes the variable in the output. Another: if the file name were passed on as a variable, I could use it to write a file with that name.
But you can. Doesn't the proposed solution work for you?
Ahh, now I see the specific aspect of your approach. Something like this:
Instead you would define the variable beforehand. This would not solve my use case, since you have to provide/configure the variable outside of the Flux workflow itself. Perhaps another, more general solution would be a Flux module that sets the incoming string as a variable.
I suggest we go with this approach.
I don't think that your idea would work: you seem to propose setting a variable globally, i.e. one that could be accessed independently of the modules. This must break, ultimately, when using threads. It would break even before that, because the modules are stream-based, and you cannot guarantee that the variable has not been changed before the data associated with its previous value has been processed by the downstream modules. A possible solution could maybe be, if
Yes, the intent is to set a global variable that can be reused at a later stage, e.g. the use case we had in oersi-marc: opening a folder with files, manipulating them, and later reusing the file names from the incoming string.
The other scenario also comes from oersi: when using a sitemap reader, one cannot get the URL of each subsite. Maybe I am thinking about this in an oversimplified way. BTW, while writing this I realize you are right: my solution would not be good enough and would not solve my scenario, since the incoming string in openFile is a relative path, not the file name... So I go back to my old idea: open-file and open-http should provide the file name/file path or the URL as a variable for later use. But making this thread-safe looks difficult.
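To make the ordering concern from this thread concrete, here is a minimal sketch in plain Java. It does not use Metafacture's classes or API; `currentSource`, `Tagged`, `viaGlobal` and `viaTagged` are hypothetical names invented for illustration, and the lists stand in for any buffering between an opener and a downstream module. The first variant overwrites one shared variable per input, the second lets each payload carry its own source.

```java
import java.util.ArrayList;
import java.util.List;

class SourceVariableSketch {

    // Hypothetical stand-in for a global variable set by an opener.
    static String currentSource;

    // Hypothetical wrapper that transports the source together with the payload.
    static final class Tagged {
        final String source;
        final String payload;

        Tagged(String source, String payload) {
            this.source = source;
            this.payload = payload;
        }
    }

    // Variant 1: the "opener" overwrites the shared variable for every input,
    // while the downstream step only reads it after all inputs were opened.
    static List<String> viaGlobal(List<String> sources) {
        List<String> buffered = new ArrayList<>();
        for (String source : sources) {
            currentSource = source;
            buffered.add("content of " + source);
        }
        List<String> out = new ArrayList<>();
        for (String payload : buffered) {
            // By now currentSource holds the LAST source for every payload.
            out.add(currentSource + " -> " + payload);
        }
        return out;
    }

    // Variant 2: the source travels in-band with each payload,
    // so buffering or reordering cannot mix the two up.
    static List<String> viaTagged(List<String> sources) {
        List<Tagged> buffered = new ArrayList<>();
        for (String source : sources) {
            buffered.add(new Tagged(source, "content of " + source));
        }
        List<String> out = new ArrayList<>();
        for (Tagged tagged : buffered) {
            out.add(tagged.source + " -> " + tagged.payload);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> sources = List.of("a.xml", "b.xml");
        System.out.println(viaGlobal(sources));
        System.out.println(viaTagged(sources));
    }
}
```

Even without threads, the first variant attributes the content of `a.xml` to `b.xml` as soon as anything is buffered between opener and consumer. Whether a real Flux pipeline buffers at a given point depends on the modules involved, so this only illustrates the risk, not actual Metafacture behaviour.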
At the moment we cannot use the incoming URL string after it has been consumed by open-http.
A useful scenario would be scraping a website that does not provide its URL as metadata, where we want to quickly identify the source. Another would be catching errors in a later process, which could then state the _id as the source of the error.
There could also be a more abstract approach, since this would be useful for open-file as well, providing the file name as _id.
e.g.:
https://metafacture.org/playground/?flux=%22https%3A//phet-dev.colorado.edu/html/build-an-atom/0.0.0-3/simple-text-only-test-page.html%22%0A%7C+open-http%28accept%3D%22application/xml%22%29%0A%7C+decode-html%0A%7C+fix%28%22copy_field%28%27_id%27%2C%27_id%27%29%22%29%0A%7C+encode-json%28prettyPrinting%3D%22true%22%29%0A%7C+print%0A%3B
Not sure where the value of _id comes from.
PS: 17.9.24:
I suggest introducing a command, string-to-variable, that would reuse the incoming string as a Java variable. That would be a generic approach, and the command could be put in front of the specific opener.