Toolkit to display, analyze, and visualize data and documents based on RDF graphs and the SPARQL query language using Pandas, Jupyter, and other Python ecosystem tools.
Gastrodon links databases that support the SPARQL protocol (more than ten!) to Pandas (http://pandas.pydata.org/), a popular Python library for analysis of tabular data. Pandas, in turn, is connected to a vast number of visualization, statistics, and machine learning tools, all of which work with Jupyter notebooks. The result is an ideal environment for telling stories that reveal the value of data, ontologies, taxonomies, and models.
In addition to remote databases, Gastrodon can run SPARQL queries over in-memory RDF graphs (from rdflib). It has facilities to copy subgraphs from one graph to another, making it possible to assemble local graphs that contain the facts relevant to a particular decision, work on them intimately, and then store the results in a permanent triple store.
Gastrodon mediates between three data models: (1) RDF, (2) Pandas/NumPy,
and (3) native Python. Gastrodon lets you use Python variables in your
SPARQL queries simply by adding ?_ to the front of the variable name.
Unlike many RDF libraries, substitution works with both local and remote
SPARQL endpoints. Gastrodon works with the Python type system to keep
track of details such as "is this variable a URI or a String?" so that
you don't have to.
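
The idea behind variable substitution can be sketched in plain Python. This is an illustration only, not Gastrodon's implementation: the `URI` marker class, `to_sparql`, and `substitute` are hypothetical names, and the sketch takes its variables from an explicit dictionary. It shows how the Python type of a value can decide whether it is written as an IRI or as a string literal.

```python
import re


class URI(str):
    """Hypothetical marker type: a string that should be written as an IRI."""


def to_sparql(value):
    """Render a Python value as a SPARQL term, chosen by its Python type."""
    if isinstance(value, URI):           # check URI first: it is also a str
        return "<" + value + ">"
    if isinstance(value, bool):          # check bool before int: bool is an int
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return repr(value)
    if isinstance(value, str):
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return '"' + escaped + '"'
    raise TypeError("no SPARQL rendering for " + type(value).__name__)


def substitute(query, variables):
    """Replace each ?_name in the query with the rendered variables[name]."""
    def replace(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError("query uses ?_" + name + " but no value was supplied")
        return to_sparql(variables[name])
    return re.sub(r"\?_([A-Za-z][A-Za-z0-9_]*)", replace, query)


city = URI("http://example.org/Paris")   # assumed example data
label = "Paris"
query = "SELECT ?s { ?s ?p ?_city . ?s ?q ?_label }"
print(substitute(query, {"city": city, "label": label}))
```

Because the substitution happens on the query text before it is sent anywhere, the same approach works whether the query is run against a local graph or a remote endpoint.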
Gastrodon always has your back because it understands SPARQL. It
automatically keeps track of namespaces and adds the prefix
declarations your queries need, which keeps them short and sweet. It
also identifies GROUP BY variables and makes them the index of the
resulting Pandas DataFrames, so that common visualizations work
without extra reshaping.
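
As a rough sketch of these two conveniences (not Gastrodon's actual code), the functions below prepend PREFIX declarations for the known prefixes a query uses and extract the GROUP BY variables that would become the DataFrame index. The `NAMESPACES` table, `add_prefixes`, and `group_by_variables` are hypothetical names, and the regular expressions are deliberately simple; a real implementation would parse the query properly.

```python
import re

# Hypothetical namespace table; a real session would register its own.
NAMESPACES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "ex": "http://example.org/",
}


def add_prefixes(query, namespaces):
    """Prepend a PREFIX declaration for each known prefix the query uses."""
    # Rough detection: a name followed by ':' that is not inside <...> or a URL.
    used = set(re.findall(r"(?<![\w<:/])([A-Za-z][\w-]*):", query))
    lines = ["PREFIX %s: <%s>" % (p, namespaces[p])
             for p in sorted(used) if p in namespaces]
    return "\n".join(lines + [query])


def group_by_variables(query):
    """Return the plain variables named in the GROUP BY clause, in order."""
    match = re.search(r"GROUP\s+BY\s+((?:[?$]\w+\s*)+)", query,
                      flags=re.IGNORECASE)
    if match is None:
        return []
    return re.findall(r"[?$](\w+)", match.group(1))


query = """SELECT ?type (COUNT(*) AS ?n)
WHERE { ?s rdf:type ?type }
GROUP BY ?type
ORDER BY DESC(?n)"""

print(add_prefixes(query, NAMESPACES))
print(group_by_variables(query))
```

With the result loaded into a Pandas DataFrame, the list returned by `group_by_variables` could be passed to `DataFrame.set_index`, leaving the aggregate columns ready to plot.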
Many software packages neglect error handling, which is a big mistake, because poor error handling gets in the way of both everyday use and the learning process. Gastrodon instead has intelligent error handling, which adds to the convenience of data analysis and visualization.
Gastrodon requires Python 3.6, is registered in the Python Package Index, and can be installed by typing:
pip install gastrodon
on the command line. Note: Gastrodon downloads the packages it requires via pip. If you are running Anaconda (which works great with Gastrodon), you have a second package manager, conda, running in parallel with pip, which can install better versions of important software packages than the ones you can get from pip. In Anaconda, you should type the following to create an environment for Gastrodon:
conda create -n gastrodonSandbox python=3.6 anaconda
activate gastrodonSandbox
conda install jupyter IPython pandas matplotlib
pip install gastrodon
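
After installing, a quick check from Python can confirm that the environment is usable. The sketch below uses only the standard library; `check_environment` is a hypothetical helper, and treating 3.6 as a minimum version (rather than an exact requirement) is an assumption.

```python
import sys
from importlib import util


def check_environment(minimum=(3, 6), package="gastrodon"):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    # Assumption: any Python at or above the stated version is acceptable.
    if sys.version_info < minimum:
        problems.append("Python %d.%d or later is required" % minimum)
    # find_spec returns None when a top-level package cannot be found.
    if util.find_spec(package) is None:
        problems.append("package %r is not importable; try: pip install %s"
                        % (package, package))
    return problems


for problem in check_environment():
    print(problem)
```

If nothing is printed, the interpreter meets the version check and the package can be imported from the active environment.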
The major documentation resources for Gastrodon itself are:
The following are reference documentation for tools you will use: