Reasons
Wikipedia has lots of information. Some would say it holds all of the world's "common" knowledge (and others would say it is misleading, incomplete, or broken in every way possible).
TODO: information extraction
DBPedia is a great effort at extracting data from Wikipedia and storing it in structured form; it makes appropriate use of Semantic Web technologies (RDF, SPARQL), interoperates with existing ontologies, and is overall awesome.
But DBPedia is also:
- outdated -- at the time of writing (May 26, 2015), the DBPedia resources accessible online are built from a Wikipedia dump of May 02, 2014. Yep, 2014. More than a year old. Fine for some topics, dramatically outdated for others (governments, movies, solar eclipses, births and deaths...);
- incomplete (and the lost information is unrecoverable) -- DBPedia maps only a subset of the properties and areas of Wikipedia pages, and everything left outside that mapping cannot be retrieved through DBPedia at all;
- ambiguous -- trying to interweave existing ontologies, languages, and ways of representing the same properties, DBPedia leaves you with several ways to query even basic properties (like "name" or "type"), and any of them can be broken in a strange way for a very similar page;
- complicated -- to query even the simplest data, you need some understanding of Semantic Web technologies -- RDF, triples, namespaces, literal representations, sometimes SPARQL... (a sketch of what this looks like in practice follows this list).
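To give an idea of what fetching "the simplest data" through DBPedia involves, here is a minimal sketch using Python and the SPARQLWrapper library. The endpoint URL is DBPedia's public one; the choice of dbr:Buenos_Aires and dbo:populationTotal is an illustrative assumption -- the same fact may be exposed under other predicates, which is exactly the ambiguity described above.

```python
# Minimal sketch: "just get the population" via DBPedia's SPARQL endpoint.
# Even this trivial lookup requires knowing prefixes, resource URIs and
# the right property name (dbo:populationTotal is assumed here).
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://dbpedia.org/sparql")
sparql.setQuery("""
    PREFIX dbo: <http://dbpedia.org/ontology/>
    PREFIX dbr: <http://dbpedia.org/resource/>

    SELECT ?population WHERE {
        dbr:Buenos_Aires dbo:populationTotal ?population .
    }
""")
sparql.setReturnFormat(JSON)

results = sparql.query().convert()
for row in results["results"]["bindings"]:
    print(row["population"]["value"])
```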
So, I've tried to implement a simpler and cleaner (and also more up-to-date) way to grab your data.
Still, DBPedia is useful for complex (SPARQL) queries, when you need something like "all Argentinian cities with a population of more than X, to the south of Y", which the Wikipedia API cannot do.
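For completeness, here is a rough sketch of that kind of query against DBPedia's SPARQL endpoint (again via Python/SPARQLWrapper). The concrete predicates (dbo:country, dbo:populationTotal, geo:lat) and the thresholds are illustrative assumptions, not a verified query, but they show the class of question DBPedia handles well.

```python
# Rough sketch: "Argentinian cities with population above X, south of latitude Y".
# Predicates and thresholds are assumptions for illustration only.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://dbpedia.org/sparql")
sparql.setQuery("""
    PREFIX dbo: <http://dbpedia.org/ontology/>
    PREFIX dbr: <http://dbpedia.org/resource/>
    PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>

    SELECT ?city ?population ?lat WHERE {
        ?city a dbo:City ;
              dbo:country dbr:Argentina ;
              dbo:populationTotal ?population ;
              geo:lat ?lat .
        FILTER (?population > 100000 && ?lat < -35.0)
    }
""")
sparql.setReturnFormat(JSON)

for row in sparql.query().convert()["results"]["bindings"]:
    print(row["city"]["value"], row["population"]["value"])
```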