How to filter XML nodes and see result as XML #13

Open
tomeks666 opened this issue Oct 27, 2023 · 3 comments

Comments

tomeks666 commented Oct 27, 2023

I would like to filter a node set to see the filtered nodes not just something like /root/node[100] but actual nodes in the result in xml, just like in the source. How can I do this?

Member

pgfearo commented Oct 31, 2023

When XPath Notebook was first published, clicking on an XPath location like root/node[100] made the corresponding node in the XML source viewable and selected it. An update to VS Code has unfortunately disabled this functionality (see issue 10).

If you want to see the source and not the XPath of the node, you can use the serialize() function in your XPath. For example:

/xsl:stylesheet/xsl:function[3]!(serialize(.) => normalize-space())

In future, I hope to introduce a built-in 'print()' function that should do a better job than serialize(), with features like truncation for long strings, formatting, and disabling the use of character references like '&lt;' instead of '<'.

Author

tomeks666 commented Jan 2, 2024

Thanks for adding this as a feature request. So far I have not found any method, short of manually mapping every node, that gives readable output.

If I use serialize I am getting something that looks like this:

&lt;job_information> &lt;user_id>91435&lt;/user_id> &lt;change_report> &lt;start_date>2023-12-11&lt;/start_date> &lt;end_date>9999-12-31&lt;/end_date> &lt;end_date_previous>2023-12-10&lt;/end_date_previous> &lt;start_date_previous>2023-07-03&lt;/start_date_previous> &lt;changes> &lt;attachment_id> &lt;current/> &lt;previous>1023&lt;/previous> &lt;/attachment_id> &lt;contract_type> &lt;current>S&lt;/current> &lt;previous/> &lt;/contract_type> &lt;cost_center> &lt;current>WERET1S06&lt;/current> &lt;previous/> &lt;/cost_center> &lt;custom_double6> &lt;current>1.0&lt;/current> &lt;previous/> &lt;/custom_double6>

Instead of something like this:

<changes>
           <custom_date1>
             <current>2024-02-01</current>
             <previous/>
           </custom_date1>
           <event>
             <current>12</current>
             <previous>5</previous>
           </event>
           <event_reason>
             <current>rwer</current>
             <previous>ererer</previous>
           </event_reason>
           <workflow_request_id>
             <current>8388444</current>
             <previous/>
           </workflow_request_id>
         </changes>
       </change_report>

It does not have to be XML. JSON would be OK too, as long as I can see the nodes and values. I am working with large XML files that are hard to filter and navigate. XPath Notebook would be a fabulous tool for ad-hoc filtering and summarizing of such documents.

Maybe it would be easy to add a literal text presentation method that does not escape any '<' or '>' characters? Then serialize() would work as expected and I would see normal XML.

Member

pgfearo commented Jan 2, 2024

I fear the problem with the escaping of '<' to '&lt;' is probably down to my XPath Notebook output rather than SaxonJS. I will look into this.

With the following XPath I still get the unwanted escaping and no indentation:

/books => serialize(map {'method':'xml', 'indent':true()})

In the meantime there isn't really a good workaround I can think of.
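
A rough stopgap outside the notebook, rather than a real workaround, is to copy the escaped cell output into a short script that reverses the '&lt;' escaping and re-indents the result. The Python sketch below is only an illustration and is not part of XPath Notebook or SaxonJS: `readable_xml` is a hypothetical helper name, it assumes the copied text is a single well-formed element once '&lt;' is turned back into '<', it assumes no text value contains a literal '&lt;' of its own, and it needs Python 3.9 or later for `ElementTree.indent`.

```python
import xml.etree.ElementTree as ET


def readable_xml(escaped: str) -> str:
    """Undo the '&lt;' escaping seen in the notebook output and re-indent.

    Hypothetical helper: assumes `escaped` holds one well-formed element
    and that no text value legitimately contains '&lt;'.
    """
    # Only '<' appears escaped in the output shown above, so only that
    # reference is reversed; other references such as '&amp;' are left alone.
    raw = escaped.replace("&lt;", "<")
    root = ET.fromstring(raw)
    # Replace whitespace-only text between elements with indentation
    # (ElementTree.indent is available from Python 3.9).
    ET.indent(root, space="  ")
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Sample shaped like the escaped output quoted earlier in this thread.
    sample = (
        "&lt;changes> &lt;event> &lt;current>12&lt;/current> "
        "&lt;previous>5&lt;/previous> &lt;/event> &lt;/changes>"
    )
    print(readable_xml(sample))
```

Truncated output, such as a fragment missing its closing tags, will fail to parse, so the whole serialized element has to be copied.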
