Retrieving pages
# One page
Infoboxer.wikipedia.get('Argentina')
# Infoboxer.wp is an alias for Infoboxer.wikipedia
# Several pages (in one API request)
Infoboxer.wikipedia.get('Argentina', 'Bolivia', 'Chile')
# From non-English Wikipedia
Infoboxer.wikipedia('fr').get('Argentine')
# or, if it looks cleaner to you
Infoboxer.wikipedia(:fr).get('Argentine')
Wikimedia sister projects are all the publicly available wikis operated by the Wikimedia Foundation, including Wikipedia. Infoboxer has shortcuts for several of them:
Infoboxer.wiktionary.get('test')
Infoboxer.wikiquote.get('Vonnegut')
Infoboxer.commons.get('Category:Kittens')
Infoboxer.wikivoyage.get('Chiang Mai')
Wikia hosts a lot of interesting wikis, all published under copyleft licenses and well worth studying (the largest and most complete of them are created by fans of books, TV shows, and games). So Infoboxer provides a shortcut for them, too:
# Default language
Infoboxer.wikia('tardis').get('Eleventh Doctor')
# Other language:
Infoboxer.wikia('tardis', :fr).get('Onzième Docteur')
Any other MediaWiki site works as well. It is as simple as that:
Infoboxer.wiki('http://mydomain.com').get('My Product')
Note: this assumes you have api.php installed as usual at /w/api.php.
If that is not the case, use the slightly more verbose version with the full API URL:
Infoboxer.wiki('http://mydomain.com/myapipath/api.php').get('My Product')
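The convention in the note above can be sketched in plain Ruby. This is only an illustration, not Infoboxer's actual implementation: `resolve_api_url` is a hypothetical helper, and it simplifies by treating any URL whose path does not end in api.php as a bare site URL.

```ruby
require 'uri'

# Hypothetical helper, not part of Infoboxer. It mirrors the convention
# described above: a bare site URL is assumed to serve its API at
# /w/api.php, while a URL that already points at api.php is kept as-is.
def resolve_api_url(url)
  uri = URI.parse(url)
  return uri.to_s if uri.path.end_with?('api.php')

  # Simplification: any other path is replaced with the default location.
  uri.path = '/w/api.php'
  uri.to_s
end

puts resolve_api_url('http://mydomain.com')
puts resolve_api_url('http://mydomain.com/myapipath/api.php')
```

If a wiki keeps its API anywhere other than /w/api.php, passing the full URL is the safer choice, since nothing then has to be guessed.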
There are many "page list generators" in the MediaWiki API, but Infoboxer currently supports only some of them. Those it does support are quite useful:
Infoboxer.wp.category('Countries in South America')
# => list of pages from category
Infoboxer.wp.search('intitle:"List of tramway systems"')
# => list of pages matching the search request
Infoboxer.wp.prefixsearch('List of tramway systems')
# => list of pages whose titles start with the given string
You can set a custom User-Agent (you should do so before any significant amount of data extraction, per Wikipedia's terms):
UA = 'MyCoolTool/1.1 (http://example.com/MyCoolTool/; [email protected])'
# All requests to all wikis will be sent with your User-Agent:
Infoboxer.user_agent = UA
# or, alternatively, just for one target site:
client = Infoboxer.wikipedia(user_agent: UA)
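If the tool name and version already live somewhere in your code, the User-Agent string can be assembled from them instead of being typed by hand. A minimal sketch: `build_user_agent` is a hypothetical helper (not part of Infoboxer), and the contact address is a placeholder.

```ruby
# Hypothetical helper, not part of Infoboxer. It assembles a string in the
# "Name/version (URL; contact)" shape used in the example above.
def build_user_agent(name:, version:, url:, contact:)
  "#{name}/#{version} (#{url}; #{contact})"
end

ua = build_user_agent(
  name: 'MyCoolTool',
  version: '1.1',
  url: 'http://example.com/MyCoolTool/',
  contact: 'contact@example.com' # placeholder address
)
puts ua

# The result can then be passed on exactly as shown above, e.g.:
#   Infoboxer.user_agent = ua
```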
Next: Extracting data