This repository has been archived by the owner on Feb 1, 2023. It is now read-only.

[#213] Add initial documentation using Sphinx
It contains:

* Quickstart on GitHub
* Writing a table schema
* Configuring goodtables.io
* A (small) goodtables.yml reference

It's missing docs for AWS, and the layout is still the default Sphinx theme.
vitorbaptista committed Dec 12, 2017
1 parent 5159820 commit 8606670
Showing 13 changed files with 570 additions and 1 deletion.
3 changes: 3 additions & 0 deletions .gitignore
@@ -139,3 +139,6 @@ jspm_packages
# Extra
/static/
/public/

# Temporary files
.#*
1 change: 1 addition & 0 deletions .travis.yml
@@ -47,6 +47,7 @@ install:

script:
- make test
- make docs

after_success:
- pip install coveralls
8 changes: 7 additions & 1 deletion Makefile
@@ -17,7 +17,7 @@ install-backend: ## Install the dependencies for the backend app
pip3 install --upgrade --no-cache-dir --exists-action w -r requirements.txt

install-dev: ## Install the additional development dependencies for the app
-	pip3 install --upgrade --no-cache-dir -r requirements.dev
+	pip3 install --upgrade --no-cache-dir -r requirements.dev -r requirements.doc

install-frontend: ## Install the dependencies for frontend development and compilation
npm install
@@ -116,3 +116,9 @@ server: ## Command to run the app as queue or server

spec:
wget -O frontend/spec.json https://raw.githubusercontent.com/frictionlessdata/data-quality-spec/master/spec.json

docs:
	sphinx-build -b html docs/ docs/_build

docs-watch:
	sphinx-autobuild docs/ docs/_build/html/
20 changes: 20 additions & 0 deletions docs/Makefile
@@ -0,0 +1,20 @@
# Minimal makefile for Sphinx documentation
#

# You can set these variables from the command line.
SPHINXOPTS =
SPHINXBUILD = sphinx-build
SPHINXPROJ = Goodtables
SOURCEDIR = .
BUILDDIR = _build

# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
187 changes: 187 additions & 0 deletions docs/conf.py
@@ -0,0 +1,187 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
#
# Goodtables documentation build configuration file, created by
# sphinx-quickstart on Fri Dec 1 14:50:05 2017.
#
# This file is execfile()d with the current directory set to its
# containing dir.
#
# Note that not all possible configuration values are present in this
# autogenerated file.
#
# All configuration values have a default; values that are commented out
# serve to show the default.

# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#
# import os
# import sys
# sys.path.insert(0, os.path.abspath('.'))
from recommonmark.parser import CommonMarkParser
from recommonmark.transform import AutoStructify


# -- General configuration ------------------------------------------------

# If your documentation needs a minimal Sphinx version, state it here.
#
# needs_sphinx = '1.0'

# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = ['sphinx.ext.viewcode']

# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']

# The suffix(es) of source filenames.
# You can specify multiple suffixes as a list of strings:
#
# source_suffix = ['.rst', '.md']
source_suffix = ['.rst', '.md']

# The master toctree document.
master_doc = 'index'

# General information about the project.
project = 'Goodtables'
copyright = '2017, Open Knowledge International'
author = 'Open Knowledge International'

# The version info for the project you're documenting, acts as replacement for
# |version| and |release|, also used in various other places throughout the
# built documents.
#
# The short X.Y version.
version = ''
# The full version, including alpha/beta/rc tags.
release = ''

# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
#
# This is also used if you do content translation via gettext catalogs.
# Usually you set "language" from the command line for these cases.
language = None

# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# These patterns also affect html_static_path and html_extra_path
exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store']

# The name of the Pygments (syntax highlighting) style to use.
pygments_style = 'sphinx'

# If true, `todo` and `todoList` produce output, else they produce nothing.
todo_include_todos = False


# -- Options for HTML output ----------------------------------------------

# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#
html_theme = 'alabaster'

# Theme options are theme-specific and customize the look and feel of a theme
# further. For a list of options available for each theme, see the
# documentation.
#
html_theme_options = {
    'description': 'Continuous validation for tabular datasets',
    'github_user': 'frictionlessdata',
    'github_repo': 'goodtables.io',
    'github_type': 'star',
    'github_count': False,
}

# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']

# Custom sidebar templates, must be a dictionary that maps document names
# to template names.
#
# This is required for the alabaster theme
# refs: http://alabaster.readthedocs.io/en/latest/installation.html#sidebars
html_sidebars = {
    '**': [
        'about.html',
        'navigation.html',
        'searchbox.html',
    ]
}


# -- Options for HTMLHelp output ------------------------------------------

# Output file base name for HTML help builder.
htmlhelp_basename = 'Goodtablesdoc'


# -- Options for LaTeX output ---------------------------------------------

latex_elements = {
    # The paper size ('letterpaper' or 'a4paper').
    #
    # 'papersize': 'letterpaper',

    # The font size ('10pt', '11pt' or '12pt').
    #
    # 'pointsize': '10pt',

    # Additional stuff for the LaTeX preamble.
    #
    # 'preamble': '',

    # Latex figure (float) alignment
    #
    # 'figure_align': 'htbp',
}

# Grouping the document tree into LaTeX files. List of tuples
# (source start file, target name, title,
# author, documentclass [howto, manual, or own class]).
latex_documents = [
    (master_doc, 'Goodtables.tex', 'Goodtables Documentation',
     'Open Knowledge International', 'manual'),
]


# -- Options for manual page output ---------------------------------------

# One entry per manual page. List of tuples
# (source start file, name, description, authors, manual section).
man_pages = [
    (master_doc, 'goodtables', 'Goodtables Documentation',
     [author], 1)
]


# -- Options for Texinfo output -------------------------------------------

# Grouping the document tree into Texinfo files. List of tuples
# (source start file, target name, title, author,
# dir menu entry, description, category)
texinfo_documents = [
    (master_doc, 'Goodtables', 'Goodtables Documentation',
     author, 'Goodtables', 'One line description of project.',
     'Miscellaneous'),
]

source_parsers = {
    '.md': CommonMarkParser,
}


# app setup hook
def setup(app):
    app.add_config_value('recommonmark_config', {
        'enable_eval_rst': True,
    }, True)
    app.add_transform(AutoStructify)
136 changes: 136 additions & 0 deletions docs/configuring.md
@@ -0,0 +1,136 @@
# Configuration

Goodtables.io is configured via a `goodtables.yml` file in the root directory of your repository. For example, you can define:

* Which files goodtables should validate
* Which sheet of a spreadsheet should be validated
* What delimiter your CSV file uses (e.g. `;`)
* Which validation checks should be executed

The rest of this page is divided into sections covering the settings you are most likely to change. For the full reference, check the [goodtables.yml file reference][gtyml-reference].

## Defining the files to validate

By default goodtables validates all files with extension CSV, ODS, XLS, or XLSX, and all files named `datapackage.json`.

You can override the default files in `goodtables.yml`:

```yaml
files:
  - source: data1.csv
    schema: schema1.json
  - source: data2.xls
    schema: schema2.json
```
Alternatively, you can define a pattern like:
```yaml
files: '*.csv'
```
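
To get a feel for what a pattern such as `*.csv` selects, the sketch below applies shell-style matching with Python's `fnmatch`. It is only an illustration: it assumes goodtables.io treats the pattern as a shell-style glob over repository paths, and the file names and the `select_files` helper are made up.

```python
from fnmatch import fnmatch

# Hypothetical repository contents, for illustration only.
repo_files = ["data1.csv", "data2.xls", "README.md", "reports/2017.csv"]


def select_files(pattern, paths):
    """Return the paths that match a shell-style pattern."""
    return [path for path in paths if fnmatch(path, pattern)]


matched = select_files("*.csv", repo_files)
print(matched)
```

Note that `fnmatch` lets `*` match across `/`, so files in subdirectories are selected too. The service itself may treat directories differently.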
You can also configure how the file is loaded using the options:
```eval_rst
+-----------------------------------+-----------------------------------+
| Option | Description |
+===================================+===================================+
| format | The file format (csv, xls, ...) |
+-----------------------------------+-----------------------------------+
| encoding | The file encoding (utf-8, ...) |
+-----------------------------------+-----------------------------------+
| skip_rows | Either the number of rows to |
| | skip, or an array of strings |
| | (e.g. ``#``, ``//``, ...). Rows |
| | that begin with any of the |
| | strings will be ignored. |
+-----------------------------------+-----------------------------------+
```
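
The two forms of `skip_rows` can be sketched in a few lines of Python. This follows the wording of the table above (an integer skips that many leading rows, a list of strings skips rows starting with any of them) and is not taken from the goodtables source; `keep_row` and the sample lines are hypothetical.

```python
def keep_row(index, line, skip_rows):
    """Decide whether a raw line survives the skip_rows setting.

    Assumption: an integer means "skip this many leading rows"; a list of
    strings means "skip rows that start with any of these strings".
    """
    if isinstance(skip_rows, int):
        return index >= skip_rows
    return not line.startswith(tuple(skip_rows))


# Made-up file contents, one string per row.
lines = ["# generated file", "id,name", "// comment", "1,apple"]

after_count = [l for i, l in enumerate(lines) if keep_row(i, l, 1)]
after_prefixes = [l for i, l in enumerate(lines) if keep_row(i, l, ["#", "//"])]

print(after_count)
print(after_prefixes)
```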

## Validating data packages

By default goodtables validates all files named `datapackage.json`.

You can override this default in `goodtables.yml`:

```yaml
datapackages:
- report1/datapackage.json
- report2/datapackage.json
```
## Validating CSV files with custom dialects

You can configure how the CSV file is loaded by adding any of the following options to `goodtables.yml`:

```yaml
files:
  - source: data.csv
    delimiter: ";"
    doublequote: True
    escapechar: "\\"
    lineterminator: "\r\n"
    quotechar: '"'
```
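
Since these options mirror the formatting parameters of Python's `csv` module, a quick way to check a dialect is to try it there first. The sketch below is an illustration with made-up sample data, and it assumes goodtables passes the options through to a `csv`-style parser unchanged.

```python
import csv
import io

# Made-up sample that uses ";" as the delimiter and '"' as the quote character.
raw = 'id;name\r\n1;"Smith; John"\r\n2;"say ""hi"""\r\n'

reader = csv.reader(
    io.StringIO(raw, newline=""),
    delimiter=";",
    quotechar='"',
    doublequote=True,  # a doubled quote inside a quoted field is a literal quote
)
rows = list(reader)
print(rows)
```

In Python's `csv` module the reader ignores `lineterminator` and recognises `\r` or `\n` as end-of-line on its own, so that option only matters when writing.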

The full list of options is in the [Python CSV formatting reference][python-csv-docs].

## Defining the spreadsheet page to validate

By default goodtables validates the first sheet of a spreadsheet.

You can override the default sheet in `goodtables.yml`:

```yaml
files:
  - source: data.xlsx
    sheet: 3
```

## Changing the limit of rows to validate

By default goodtables validates at most 1,000 rows. You can change this limit in `goodtables.yml`:

```yaml
settings:
  row_limit: 2000
```

## Defining which validation checks are executed

By default goodtables runs all validation checks. You can customize which checks are executed in `goodtables.yml`:

```yaml
settings:
  checks:
    # You can pass check types
    - structure
    - schema
    # ... or individual checks
    - blank-header
    - duplicate-row
    - missing-value
  skip_checks:
    # You can also skip individual checks
    - minimum-constraint
```

Note that if you use the `checks` setting, you have to list every check you want to run; anything you leave out is not executed. Because of this, we recommend using `skip_checks` instead.
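
The difference between the two settings can be modelled in a few lines of Python. This is a simplified sketch, not the goodtables implementation: `ALL_CHECKS` is a hypothetical subset of the available checks, `resolve_checks` is a made-up helper, and check types such as `structure` are left out.

```python
# Hypothetical subset of the available checks, for illustration only.
ALL_CHECKS = ["blank-header", "duplicate-row", "missing-value", "minimum-constraint"]


def resolve_checks(checks=None, skip_checks=None):
    """Return the checks that would run under this simplified model."""
    # With no explicit `checks`, start from everything; otherwise only what is listed.
    selected = list(ALL_CHECKS) if checks is None else list(checks)
    skipped = set(skip_checks or [])
    return [check for check in selected if check not in skipped]


only_listed = resolve_checks(checks=["blank-header", "duplicate-row"])
all_but_one = resolve_checks(skip_checks=["minimum-constraint"])

print(only_listed)
print(all_but_one)
```

With `checks`, anything not listed is silently dropped, whereas `skip_checks` keeps every other check, including ones added in the future.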

The list of validation checks can be found on the [goodtables-py documentation][gtpy-docs].

## Automatically inferring the schema

By default goodtables does not infer the data schema. You can enable inferring in `goodtables.yml`:

```yaml
settings:
  infer_schema: True
  infer_fields: True
```

Goodtables will infer the schema of all files and columns that don't have an explicit schema.

[gtyml-reference]: goodtables_yml.html "goodtables.yml file reference"
[python-csv-docs]: https://docs.python.org/3.6/library/csv.html#csv-fmt-params "Python CSV Formatting docs"
[gtpy-docs]: https://github.com/frictionlessdata/goodtables-py "Goodtables.py documentation"
9 changes: 9 additions & 0 deletions docs/getting_started.md
@@ -0,0 +1,9 @@
# Getting started

```eval_rst
.. toctree::
   :maxdepth: 2

   getting_started_github
   writing_data_schema
```