Replace libxml++ w/ libxml2 #218

Closed
4 tasks done
rakhimov opened this issue Aug 21, 2017 · 1 comment

Comments

@rakhimov
Owner

rakhimov commented Aug 21, 2017

XML parser facilities are only needed to process MEF and config files.
SCRAM does not need all the bells & whistles provided by libxml++.
It should be possible to replace libxml++
with a custom minimal adapter around libxml2,
providing only the needed functionality.
This would free SCRAM from libxml++ and its dependencies
(glib, glibmm, libsigc++).

Moreover, the performance should improve slightly
if the custom wrapper does not manage wrapper node databases
and XML modification as libxml++ does.

The codebase needs to be updated for the new libxml++3 API anyway,
so this is a good opportunity to create the custom adapter and drop libxml++.

Required features:

  • DOM parser & read-only document (from file & stream)
  • XInclude processing (automatic w/ libxml2)
  • RelaxNG validator
  • Element nodes only (w/ attribute & text values)
@rakhimov rakhimov added this to the v0.16.0 milestone Aug 21, 2017
rakhimov added a commit that referenced this issue Aug 23, 2017
This class wraps libxml++ API for convenience.
The minimal API serves only the needs of the SCRAM code,
which doesn't need many functionalities provided by libxml++.

This and other xml helper classes should hide
the complexity of XML libraries.

Issue #218
rakhimov added a commit that referenced this issue Aug 23, 2017
The libxml2 XML document tracks its filename,
so it can be used instead of tracking files separately in a table.

Issue #218
rakhimov added a commit that referenced this issue Aug 23, 2017
The standard XML Node API can be used
when the node type is known to be Element.
This relies on validated XML input.

Unfortunately, since libxml++ uses std::list<Node*> for children,
the performance degrades by around 5%
compared to XPath queries returning std::vector<Node*>.

Issue #218
rakhimov added a commit that referenced this issue Aug 23, 2017
XML child elements are intrusive linked lists,
so it is convenient to reuse this data structure
instead of converting it into C++ std::list<>.

The range interface just wraps the begin and end nodes,
as well as ensuring that child nodes are actually Element nodes.

Issue #218
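
For readers following along, the range idea above can be sketched roughly as follows. This is not the actual SCRAM code: `Node` is a stand-in for libxml2's `xmlNode` (only the `type`, `name`, `next`, and `children` fields are mirrored so the sketch compiles without libxml2), and `ElementRange` is a hypothetical name. The iterator walks the sibling links in place and skips anything that is not an Element node.

```cpp
#include <cstddef>
#include <iterator>

// Stand-in for libxml2's xmlNode: only the fields this sketch touches.
// The enum values mirror XML_ELEMENT_NODE (1) and XML_TEXT_NODE (3).
enum NodeType { kElementNode = 1, kTextNode = 3 };

struct Node {
  NodeType type;
  const char* name;
  Node* next;      // Next sibling (the intrusive link).
  Node* children;  // First child.
};

// Hypothetical forward range over the Element children of a node.
// No container is built; iteration reuses the sibling pointers.
class ElementRange {
 public:
  class iterator {
   public:
    using iterator_category = std::forward_iterator_tag;
    using value_type = Node;
    using difference_type = std::ptrdiff_t;
    using pointer = const Node*;
    using reference = const Node&;

    explicit iterator(const Node* node = nullptr)
        : node_(SkipToElement(node)) {}

    reference operator*() const { return *node_; }
    pointer operator->() const { return node_; }
    iterator& operator++() {
      node_ = SkipToElement(node_->next);
      return *this;
    }
    iterator operator++(int) {
      iterator tmp = *this;
      ++*this;
      return tmp;
    }
    bool operator==(const iterator& other) const {
      return node_ == other.node_;
    }
    bool operator!=(const iterator& other) const {
      return node_ != other.node_;
    }

   private:
    // Advances past text, comment, and other non-Element siblings.
    static const Node* SkipToElement(const Node* node) {
      while (node && node->type != kElementNode) node = node->next;
      return node;
    }
    const Node* node_;
  };

  explicit ElementRange(const Node& parent) : begin_(parent.children) {}
  iterator begin() const { return begin_; }
  iterator end() const { return iterator(nullptr); }

 private:
  iterator begin_;
};
```

With something like this, `for (const Node& child : ElementRange(parent))` visits only Element children, and constructing the range costs one pointer.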
rakhimov added a commit that referenced this issue Aug 23, 2017
xml::Element fully replaces xmlpp::Element*.
Config uses far less of the XML API than Initializer does,
so this is an initial indirect test of the xml::Element API.

Issue #218
rakhimov added a commit that referenced this issue Aug 23, 2017
libxml++ xmlpp::Element, xmlpp::NodeSet, and xmlpp::Node::NodeList
are replaced with xml::Element and xml::Element::Range.
Initializer private interfaces and code are changed appropriately
to accommodate the xml::Element.
xml::Element is also refactored to support the functionalities
required by Initializer.

Issue #218
rakhimov added a commit that referenced this issue Aug 23, 2017
These are all thin and convenient wrappers for libxml++ classes.

Issue #218
rakhimov added a commit that referenced this issue Aug 23, 2017
Initializer is now completely free from libxml++ code.

Issue #218
rakhimov added a commit that referenced this issue Aug 23, 2017
Config is now completely free from libxml++ code.

Issue #218
rakhimov added a commit that referenced this issue Aug 23, 2017
This is a convenience overload to work with C++ input streams.
The overload is needed for tests.

Issue #218
rakhimov added a commit that referenced this issue Aug 23, 2017
Tests are now free from libxml++ code.

Issue #218
rakhimov added a commit that referenced this issue Aug 23, 2017
The GUI mainwindow code is now free from libxml++.

Issue #218
rakhimov added a commit that referenced this issue Aug 23, 2017
xml::Parser provides all the needed functionality.

Issue #218
rakhimov added a commit that referenced this issue Aug 23, 2017
xml::Element adaptor directly uses libxml2 facilities.
The downside of this approach is
that libxml2 C code pollutes
the global namespace with C xml functions.

Issue #218
rakhimov added a commit that referenced this issue Aug 23, 2017
Validator, Parser, and Document are now fully implemented
in terms of libxml2 code.
These are the last facilities to transition to libxml2.

Even though the code has switched to libxml2,
proper error reporting is not yet implemented.

Issue #218
rakhimov added a commit that referenced this issue Aug 23, 2017
Only the libxml2 version is needed.

Issue #218
rakhimov added a commit that referenced this issue Aug 23, 2017
Use the standard FindLibXml2 to find libxml2.

Issue #218
rakhimov added a commit that referenced this issue Aug 23, 2017
Installation of libxml++ is removed.

Issue #218
rakhimov added a commit that referenced this issue Aug 23, 2017
- Remove libxml++ installation instructions.
- The minimum version (2.9.1) for libxml2 is chosen
  to be what is in Ubuntu 14.04.
  Older versions may also work just fine.
  This is rather untested.

Issue #218
rakhimov added a commit that referenced this issue Aug 23, 2017
This provides a thin view to the underlying UTF-8 strings
without excessively copying them into std::string.

Issue #218
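
A minimal sketch of the view idea, assuming C++17 `std::string_view`; the commit may well use a different view type. `XmlChar` stands in for libxml2's `xmlChar` (a typedef for `unsigned char`) and `ToView` is a hypothetical name.

```cpp
#include <cstring>
#include <string_view>

// Stand-in for libxml2's xmlChar, which is a typedef for unsigned char.
using XmlChar = unsigned char;

// Returns a non-owning view of a NUL-terminated UTF-8 string.
// The view is valid only while the buffer that owns `text` is alive.
inline std::string_view ToView(const XmlChar* text) {
  if (!text) return {};
  const char* chars = reinterpret_cast<const char*>(text);
  return std::string_view(chars, std::strlen(chars));
}
```

The view points into the original buffer, so nothing is allocated or copied; the trade-off is that it must not outlive the document that owns the string.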
rakhimov added a commit that referenced this issue Aug 23, 2017
rakhimov added a commit that referenced this issue Aug 23, 2017
This is way faster than boost::lexical_cast.
The overall MEF initialization improves by 10%.

Issue #218
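
The commit title is not shown here, so which conversion routine replaced `boost::lexical_cast` is not stated. As an illustration only, a lightweight conversion of this kind could look like the sketch below, which assumes C++17 `std::from_chars`; `ParseInt` is a hypothetical name.

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Parses the whole of `text` as a base-10 int.
// Returns nullopt on empty input, trailing characters, or overflow.
inline std::optional<int> ParseInt(std::string_view text) {
  int value = 0;
  const char* first = text.data();
  const char* last = first + text.size();
  auto [ptr, ec] = std::from_chars(first, last, value);
  if (ec != std::errc() || ptr != last) return std::nullopt;
  return value;
}
```

Routines like this avoid stream construction and locale handling, which is one common reason they outperform `boost::lexical_cast`.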
rakhimov added a commit that referenced this issue Aug 23, 2017
This is the library time that needs
to be subtracted from overall initialization time.

Issue #218
rakhimov added a commit that referenced this issue Aug 23, 2017
Upon moving to libxml2 and refactoring,
the overall gain is a 2-2.5x speedup.
However, there is only a 1-2% improvement in memory utilization w/ libxml2.

Issue #218
rakhimov added a commit that referenced this issue Aug 23, 2017
GUI does not need to parse the files again
to get the DOM document for validation purposes.

Issue #218
@rakhimov
Owner Author

libxml++ is fully replaced by a custom libxml2 wrapper,
but error message construction is not handled yet.
This is delegated to #219.
