The contents of the World Factbook in open data formats
The World Factbook data in this repo was retrieved on August 31, 2014
The open World Factbook aims to provide the contents of the World Factbook (referred to as WFB) in various open formats so it is easily accessible for re-use.
The tools and procedures used to process and update the data in this repository are provided; see the tools directory.
The World Factbook is in the public domain. Accordingly, it may be copied freely without permission of the Central Intelligence Agency (CIA). The tools provided in this repository are dedicated to the public domain by their respective author(s) as well.
For a brief description of the original World Factbook follow this link.
As GitHub renders files in markdown notation to HTML, the country profiles in the geos.md/ directory provide a nice-looking preview which even includes flags, locators and maps.
Preview index by region • by status
The files related to geographic entities are named strictly as follows:
- two lower case letters containing the GEC (FIPS 10-4 code), e.g. gm for Germany and au for Austria (as opposed to the ISO 3166-1 alpha-2 codes de and at respectively).
- file extension in lower case, e.g. .png for portable network graphics, .md for files in pandoc markdown notation.
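The two naming rules above can be checked mechanically. The following is a minimal sketch, not part of the repository's tools; the helper names `entity_filename` and `is_entity_filename` are hypothetical, and the pattern assumes extensions consist of lower case letters and digits only.

```python
import re

# Two lower case letters (the GEC code), a dot, then a lower case extension.
# Assumption: extensions contain only lower case letters and digits.
_ENTITY_NAME = re.compile(r"[a-z]{2}\.[a-z0-9]+")


def is_entity_filename(name: str) -> bool:
    """Return True if name follows the naming convention, e.g. 'gm.png'."""
    return _ENTITY_NAME.fullmatch(name) is not None


def entity_filename(gec_code: str, extension: str) -> str:
    """Build a file name from a GEC code and an extension (with or without dot)."""
    name = f"{gec_code}.{extension.lstrip('.')}"
    if not is_entity_filename(name):
        raise ValueError(f"not a valid entity file name: {name!r}")
    return name
```

For example, `entity_filename("gm", ".png")` yields the name for the German flag or map image, while `is_entity_filename("DE.md")` is rejected because of the upper case letters.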
Note: when the factbook/ directory is referred to, the top-level directory of the download version of the original WFB is meant. The preferred location of the factbook/ directory in the filesystem tree is the directory containing this README.md file.
The contents of the open World Factbook in JSON format as produced by the World Factbook scraper wfbScraper.py. All other file formats are derived from the JSON files provided here.
read more • index by region • by status
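Since every other format is derived from the JSON files, a consumer will usually start by loading them. The sketch below assumes the JSON files follow the naming convention above (two-letter GEC code plus .json) and are UTF-8 encoded; the function name `load_profiles` and the directory argument are hypothetical, and nothing is assumed about the structure inside each file.

```python
import json
from pathlib import Path


def load_profiles(json_dir):
    """Return {gec_code: parsed_json} for every two-letter .json file in json_dir."""
    profiles = {}
    # "??.json" matches exactly two characters before the extension,
    # so index files or other helpers in the same directory are skipped.
    for path in sorted(Path(json_dir).glob("??.json")):
        with path.open(encoding="utf-8") as handle:  # assumption: UTF-8
            profiles[path.stem] = json.load(handle)
    return profiles
```

The keys of the returned dictionary are the GEC codes, so the profile for Germany would be found under `"gm"`.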
The contents of the open World Factbook converted from JSON format to pandoc markdown notation by wfbJson2x.py. References to flags, locators and maps are all relative (media directories are supposed to be siblings of the geos.md/ directory).
The contents of the open World Factbook converted from JSON format to plain .txt by wfbJson2x.py with option "-f txt". This text version is somewhat easier to grasp than plain text versions created by standard conversion tools (e.g. pandoc -t plain, or pandoc -t html followed by lynx --dump).
The contents of the open World Factbook converted from JSON format to custom LaTeX by wfbJson2x.py with option "-f tex".
read more • index by region • by status
Flags for the open World Factbook, taken from directory factbook/graphics/flags/large/, renamed according to the naming conventions above and converted from .gif to .png format using ImageMagick's convert, image size retained.
Flags for the open World Factbook, scaled down to an area of approximately 15000 pixels regardless of aspect ratio (y/x), primarily for web usage.
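Scaling to a fixed pixel area rather than a fixed width or height keeps wide and tall flags visually similar in size. The sketch below shows the arithmetic, assuming that "approximately 15000 pixels" refers to width times height; the function name `scale_to_area` is hypothetical and this is not necessarily how the images in this repository were produced.

```python
import math


def scale_to_area(width, height, target_area=15000):
    """Return (new_width, new_height) covering about target_area pixels.

    Both sides are multiplied by the same factor, so the aspect ratio
    is preserved up to rounding.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    factor = math.sqrt(target_area / (width * height))
    return max(1, round(width * factor)), max(1, round(height * factor))
```

ImageMagick's area geometry (a pixel count followed by @, as in -resize 15000@) expresses a comparable constraint, which may be convenient when working with convert directly.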
Overview maps for the open World Factbook, from directory factbook/graphics/locator/, moved to a flat directory, renamed according to the naming conventions above and converted from .gif to .png format with white background applied using convert, image size retained.
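The conversion step described above can be scripted. The following is a sketch only: it builds an argument list for ImageMagick's convert, and the use of -background white -flatten to apply the white background is an assumption about one common way to do this, not a record of the exact command used for this repository. The function name `convert_command` and the file names in the usage note are hypothetical.

```python
def convert_command(source_gif, target_png, white_background=False):
    """Build an argument list for ImageMagick's convert (.gif to .png).

    The image size is left untouched; only the format changes and,
    optionally, transparent areas are flattened onto white.
    """
    command = ["convert", source_gif]
    if white_background:
        # Assumption: flattening onto a white canvas replaces transparency.
        command += ["-background", "white", "-flatten"]
    command.append(target_png)
    return command
```

The resulting list can be passed to `subprocess.run(command, check=True)`, for instance with `convert_command("source.gif", "gm.png", white_background=True)` for a locator map.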
Overview maps for the open World Factbook, scaled down where necessary to fit into a 400x400 box, primarily for web usage.
Maps for the open World Factbook, taken from directory factbook/maps/, renamed according to the naming conventions above and converted from .gif to .png format using convert, image size retained.
Maps for the open World Factbook scaled down where necessary to fit into a 400x400 box, primarily for web usage.
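"Scaled down where necessary to fit into a 400x400 box" means that images already inside the box are left alone and larger ones shrink until their longer side fits. A minimal sketch of that rule follows; the function name `fit_in_box` is hypothetical and the rounding behaviour is an assumption.

```python
def fit_in_box(width, height, box=400):
    """Return (new_width, new_height) that fits into a box x box square.

    Images that already fit are returned unchanged (never enlarged);
    larger images are shrunk with the aspect ratio preserved.
    """
    if width <= box and height <= box:
        return width, height
    factor = min(box / width, box / height)
    return max(1, round(width * factor)), max(1, round(height * factor))
```

With ImageMagick, the geometry 400x400> (note the trailing >) applies the same shrink-only behaviour.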
Tools for the open World Factbook:
- wfbScraper.py -- Script that scrapes a single HTML page from the World Factbook and outputs its content in JSON format to stdout. For additional header information the referenced images showing flag, locator and map are read and analyzed.
- wfbJson2x.py -- Script that creates various formats from JSON files created by wfbScraper.py.
- scrape-fieldlist.py -- Script that scrapes the field name from the contents of one or more files in the factbook/fields/ directory.
- scrape-fielddesc.py -- Script that scrapes the field description from the contents of one or more files in the factbook/fields/ directory.
Some useful metadata. To be completed.
The files in directory geos.md/ provide a good starting point for automated conversion of the WFB into other formats, e.g. .html, LaTeX, .odt, .docx (MS Word) and several others. See the Pandoc website for details.
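Such a conversion can be driven from a small script. The sketch below only builds the pandoc argument list; it relies on pandoc inferring the output format from the target file's extension, and the function name `pandoc_command` and the output directory are hypothetical.

```python
from pathlib import Path


def pandoc_command(markdown_file, output_dir, extension="html"):
    """Build a pandoc argument list converting one markdown file.

    The target keeps the source's base name (the GEC code) and gets the
    requested extension, from which pandoc picks the output format.
    """
    source = Path(markdown_file)
    target = Path(output_dir) / f"{source.stem}.{extension.lstrip('.')}"
    return ["pandoc", "--from", "markdown", "--output", str(target), str(source)]
```

Running `subprocess.run(pandoc_command("geos.md/gm.md", "out", "docx"), check=True)` would then write the profile for Germany as a Word document, provided pandoc is installed and the out directory exists. Because image references in the markdown files are relative, running pandoc from within the geos.md/ directory is likely the safest choice.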
Note: Other open file formats may be added to the repository over time.
The open World Factbook project is hosted on GitHub at https://github.com/openfactbook/master
For discussion of topics related to the open World Factbook, visit the openmundi forum and mailing list at https://groups.google.com/forum/#!forum/openmundi (Open World Database - world.db and Friends). Thanks Gerald.