Lesson: Define a Basic Terminology
This Tutorial is known to work with om version 3.0.4.
Please update this wiki to reflect any other versions that have been tested.
- Define a simple OM Terminology for XML metadata
- Create OM Documents based on your Terminology
- Create and update XML nodes using the OM Terminology
- Inspect OM Documents to find out what XPath queries are being used for a given Term
- Use OM's API to access the underlying Nokogiri Document and the Nodesets it returns from XPath queries
For this first example we want to model simple, flat XML. Let's say the root node of our XML documents is called fields and we have elements for title and author.
<fields>
<title>ZOIA! Memoirs of Zoia Horn, Battler for the People's Right to Know.</title>
<author>Horn, Zoia</author>
</fields>
Note that we do not have any namespaces, attributes on elements, schema declarations, or any other joyful XML features. OM does provide ways to handle these, but it does not require them. We will look at each of those separately in other lessons.
Now we'll create a file called book_metadata.rb
Paste the following code into that file:
require "om"
class BookMetadata
# This include statement adds the behaviors of an OM Document to your class
include OM::XML::Document
set_terminology do |t|
t.root(path: "fields")
t.title
t.author
end
# This method is called when you create new XML documents from scratch.
# It must return a Nokogiri::XML::Document. Other than that, you can make your "default" documents look however you want.
def self.xml_template
Nokogiri::XML.parse("<fields/>")
end
end
Open up an irb console (Interactive Ruby). Rather than simply calling irb on the command line, use Bundler to ensure that your dependencies are handled predictably.
bundle console
require "./book_metadata"
newdoc = BookMetadata.new
puts newdoc.to_xml
<?xml version="1.0"?>
<fields/>
Now you have an empty OM document that was initialized using the BookMetadata.xml_template method you defined.
Because this Document is a BookMetadata object, you can use the Terminology to set and retrieve the values of the Terms you've defined.
newdoc.author = "Horn, Zoia"
=> "Horn, Zoia"
newdoc.title = "ZOIA! Memoirs of Zoia Horn, Battler for the People's Right to Know."
=> "ZOIA! Memoirs of Zoia Horn, Battler for the People's Right to Know."
puts newdoc.to_xml
<?xml version="1.0"?>
<fields>
<author>Horn, Zoia</author>
<title>ZOIA! Memoirs of Zoia Horn, Battler for the People's Right to Know.</title>
</fields>
As you can see, calling .to_xml has returned an XML document with the title and author set to the values you provided.
OM makes it easy to update these elements. For example, assigning an array to a Term produces one element per value.
newdoc.author = ["Horn, Zoia", "Hypatia"]
=> ["Horn, Zoia", "Hypatia"]
puts newdoc.to_xml
<?xml version="1.0"?>
<fields>
<author>Horn, Zoia</author>
<title>ZOIA! Memoirs of Zoia Horn, Battler for the People's Right to Know.</title>
<author>Hypatia</author>
</fields>
Each OM Document you create is essentially a wrapper around a Nokogiri Document, and the Document's Terminology is essentially a structure that remembers XPath queries for you. You can access the inner Nokogiri Document by calling .ng_xml on the OM Document, and you can get the stored XPath query by calling .xpath on any of the terms.
Since OM simply runs XPath queries against that underlying Nokogiri document, you don't need to do anything to keep the OM Document in sync with the Nokogiri Document. You can use the Nokogiri API to make any changes you want to the Nokogiri Document and the OM Document will reflect those changes.
newdoc.title.xpath
=> "//title"
newdoc.author.xpath
=> "//author"
newdoc.ng_xml
=> #<Nokogiri::XML::Document:0x80776da4 name="document" children=[#<Nokogiri::XML::Element:0x8090bbb0 name="fields" children=[#<Nokogiri::XML::Element:0x80818e74 name="author" children=[#<Nokogiri::XML::Text:0x80573868 "Horn, Zoia">]>, #<Nokogiri::XML::Element:0x804a22e0 name="title" children=[#<Nokogiri::XML::Text:0x805795c4 "ZOIA! Memoirs of Zoia Horn, Battler for the People's Right to Know.">]>]>]>
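The idea that a Terminology is a set of remembered XPath queries, run against a live document each time you ask for a value, can be sketched without OM at all. The following is a minimal sketch and not OM's implementation: TinyTerminology and its methods are hypothetical names, and it uses Ruby's REXML library in place of Nokogiri only so that it runs without extra gems.

```ruby
require "rexml/document"

# Hypothetical stand-in for an OM Document: term names mapped to XPath queries.
class TinyTerminology
  QUERIES = { title: "//title", author: "//author" }.freeze

  # The underlying XML document (REXML here; OM wraps a Nokogiri document).
  attr_reader :doc

  def initialize(xml)
    @doc = REXML::Document.new(xml)
  end

  # The stored query for a term, comparable in spirit to .xpath on an OM term.
  def xpath_for(term)
    QUERIES.fetch(term)
  end

  # Run the stored query against the current document and return text values.
  def values(term)
    REXML::XPath.match(@doc, xpath_for(term)).map(&:text)
  end
end

book = TinyTerminology.new("<fields><author>Horn, Zoia</author></fields>")
puts book.values(:author).inspect   # => ["Horn, Zoia"]

# Change the underlying document directly, bypassing the wrapper.
book.doc.root.add_element("author").text = "Hypatia"
puts book.values(:author).inspect   # => ["Horn, Zoia", "Hypatia"]
```

Because the wrapper keeps no copy of the values and re-runs the query on every read, there is nothing to synchronize after the direct edit. That is the property the paragraph above describes for OM and its Nokogiri document.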
When you access a Term's values, OM runs an XPath query for you and returns the values of the XML nodes that match. If you want the Nokogiri NodeSet from the XPath query instead of the values from those nodes, call .nodeset on the term.
newdoc.author.nodeset
=> [#<Nokogiri::XML::Element:0x80818e74 name="author" children=[#<Nokogiri::XML::Text:0x80573868 "Horn, Zoia">]>]
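The difference between the matched nodes and their values can be illustrated outside OM as well. This is a rough sketch under the same assumptions as before: it uses REXML rather than Nokogiri so it runs on a plain Ruby install, and matching_nodes is a hypothetical helper, not part of OM or Nokogiri.

```ruby
require "rexml/document"

# Hypothetical helper: return the element nodes matched by an XPath query.
def matching_nodes(doc, query)
  REXML::XPath.match(doc, query)
end

doc = REXML::Document.new(
  "<fields><author>Horn, Zoia</author><author>Hypatia</author></fields>"
)

# The nodes themselves: element objects you can inspect or modify.
nodes = matching_nodes(doc, "//author")
puts nodes.map(&:name).inspect   # => ["author", "author"]

# The values: just the text content pulled out of each node.
values = nodes.map(&:text)
puts values.inspect              # => ["Horn, Zoia", "Hypatia"]
```

Working with nodes is useful when you need more than the text, such as an element's name, attributes, or position in the tree.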
Go on to Lesson: Define a Terminology with a nested hierarchy of Terms or return to the Tame your XML with OM page.