Node text gotchas
zverok edited this page Jun 24, 2015
·
1 revision
"Invisible" nodes: the idea of Node#text is to provide a "plain readable"
version of a page fragment, so some node types intentionally return empty
text. This applies to references (Ref nodes) and templates (the matter of
templates is complicated, though).
para = Infoboxer::Parser.paragraphs('')
para.text
# But
para.lookup(Ref).text
Paragraph-level nodes return text ending with "\n\n". This way, the texts of
several paragraphs can simply be joined with .join to obtain nicely rendered
paragraphs. But if you just want to output a TOC or something similar, the
extra "\n\n"s can be irritating. For such cases there is a method with the
cumbersome name #text_, which is roughly a synonym for node.text.strip:
page = Infoboxer.wp.get('Argentina')
page.headings.each{|h| puts ' ' * h.level + h.text}
# Output:
# ...
# But
page.headings.each{|h| puts ' ' * h.level + h.text_}
# Output:
# ...
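The difference can be reproduced without fetching any page. Below is a minimal sketch using plain Ruby strings as stand-ins for node texts. The sample strings are made up for illustration and are not Infoboxer output; the only assumption carried over from the text above is that paragraph-level text ends with "\n\n" and that #text_ behaves roughly like text.strip.

```ruby
# Plain strings standing in for paragraph-level node texts (hypothetical data,
# not produced by Infoboxer); each ends with "\n\n" as described above.
paragraph_texts = ["First paragraph.\n\n", "Second paragraph.\n\n"]

# Joining with no separator already yields paragraphs separated by blank lines.
rendered = paragraph_texts.join
puts rendered

# For one-line listings such as a TOC, the trailing newlines get in the way.
# Stripping them is roughly what #text_ is described as doing.
heading_texts = ["History\n\n", "Geography\n\n"]
toc_lines = heading_texts.map(&:strip)
toc_lines.each { |line| puts "  " + line }
```

With real nodes, the first half corresponds to joining the .text of each paragraph, and the second to calling h.text_ instead of h.text.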
Tables are rendered (somewhat experimentally) with the [terminal-table] gem. This looks pretty good in a demo, but I'm not at all sure that this approach isn't overkill. Let's try it and decide.
puts Infoboxer.wp.get('Sri Lanka').tables.first.text
# Output:
# +----------------------------------------+--------------+------------+---------+-------------+
# | Administrative Divisions of Sri Lanka |
# +----------------------------------------+--------------+------------+---------+-------------+
# | Province | Capital | Area (km) | Area | Population |
# | | | | (sq mi) | |
# | Central | Kandy | 5,674 | | 2,556,774 |
# | Eastern | Trincomalee | 9,996 | | 1,547,377 |
# | North Central | Anuradhapura | 10,714 | | 1,259,421 |
# | Northern | Jaffna | 8,884 | | 1,060,023 |
# | North Western | Kurunegala | 7,812 | | 2,372,185 |
# | Sabaragamuwa | Ratnapura | 4,902 | | 1,919,478 |
# | Southern | Galle | 5,559 | | 2,465,626 |
# | Uva | Badulla | 8,488 | | 1,259,419 |
# | Western | Colombo | 3,709 | | 5,837,294 |
# +----------------------------------------+--------------+------------+---------+-------------+