This repository has been archived by the owner on Jan 6, 2025. It is now read-only.

[MRG + 1] Copyedit all documentation for Camelot #112

Merged
22 changes: 10 additions & 12 deletions CONTRIBUTING.md
Original file line number Diff line number Diff line change
@@ -14,7 +14,7 @@ Kenneth Reitz has also written an [essay](https://www.kennethreitz.org/essays/be

As the [Requests Code Of Conduct](http://docs.python-requests.org/en/master/dev/contributing/#be-cordial) states, **all contributions are welcome**, as long as everyone involved is treated with respect.

## Your First Contribution
## Your first contribution

A great way to start contributing to Camelot is to pick an issue tagged with the [Contributor Friendly](https://github.com/socialcopsdev/camelot/labels/Contributor%20Friendly) tag or the [Level: Easy](https://github.com/socialcopsdev/camelot/labels/Level%3A%20Easy) tag. If you're unable to find a good first issue, feel free to contact the maintainer.

@@ -26,19 +26,17 @@ To install the dependencies needed for development, you can use pip:
$ pip install camelot-py[dev]
</pre>

### Alternatively

You can clone the project repository, and install using pip:
Alternatively, you can clone the project repository, and install using pip:

<pre>
$ pip install .[dev]
$ pip install ".[dev]"
</pre>

## Pull Requests

### Submit a Pull Request
### Submit a pull request

The preferred workflow for contributing to Camelot is to fork the [project repository](https://github.com/socialcopsdev/camelot) on GitHub, clone, develop on a branch and then finally submit a pull request. Steps:
The preferred workflow for contributing to Camelot is to fork the [project repository](https://github.com/socialcopsdev/camelot) on GitHub, clone, develop on a branch and then finally submit a pull request. Here are the steps:

1. Fork the project repository. Click on the ‘Fork’ button near the top of the page. This creates a copy of the code under your account on GitHub.

@@ -73,15 +71,15 @@ $ git push -u origin my-feature

Now it's time to go to your fork of Camelot and create a pull request! You can [follow these instructions](https://help.github.com/articles/creating-a-pull-request-from-a-fork/) to do this.

### Work on your Pull Request
### Work on your pull request

We recommend that your pull request complies with the following rules:

- Make sure your code follows [pep8](http://pep8.org).

- In case your pull request contains function docstrings, make sure you follow the [numpydoc](https://numpydoc.readthedocs.io/en/latest/format.html) format. All function docstrings in Camelot follow this format. Moreover, following the format will make sure that the API documentation is generated flawlessly.

- Make sure your commit messages follow [the seven rules of a great git commit message](https://chris.beams.io/posts/git-commit/).
- Make sure your commit messages follow [the seven rules of a great git commit message](https://chris.beams.io/posts/git-commit/):
    - Separate subject from body with a blank line
    - Limit the subject line to 50 characters
    - Capitalize the subject line
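
The three commit-message rules visible in this hunk can be checked mechanically. The following is a minimal sketch, not part of Camelot or of this pull request; `check_commit_message` is a hypothetical helper name, and it covers only the rules shown above.

```python
def check_commit_message(message):
    """Return a list of rule violations found in a commit message.

    Hypothetical helper: it checks only the three rules listed above.
    """
    problems = []
    lines = message.splitlines()
    subject = lines[0] if lines else ""

    if not subject.strip():
        return ["subject line is empty"]

    # Rule: separate subject from body with a blank line.
    if len(lines) > 1 and lines[1].strip():
        problems.append("separate subject from body with a blank line")

    # Rule: limit the subject line to 50 characters.
    if len(subject) > 50:
        problems.append("limit the subject line to 50 characters")

    # Rule: capitalize the subject line.
    if not subject[0].isupper():
        problems.append("capitalize the subject line")

    return problems


ok = check_commit_message("Fix typo in README\n\nThe word was misspelled.")
bad = check_commit_message("fix typo")
print(ok)   # []
print(bad)  # ['capitalize the subject line']
```

A check like this could run from a local `commit-msg` hook, though that wiring is left out here.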
@@ -104,15 +102,15 @@ Writing documentation, function docstrings, examples and tutorials is a great wa

It is written in [reStructuredText](https://en.wikipedia.org/wiki/ReStructuredText), with [Sphinx](http://www.sphinx-doc.org/en/master/) used to generate these lovely HTML files that you're currently reading (unless you're reading this on GitHub). You can edit the documentation using any text editor and then generate the HTML output by running `make html` in the `docs/` directory.

The function docstrings are written using the [numpydoc](https://numpydoc.readthedocs.io/en/latest/format.html) extension for Sphinx. Make sure you check out its format guidelines, before you start writing one.
The function docstrings are written using the [numpydoc](https://numpydoc.readthedocs.io/en/latest/format.html) extension for Sphinx. Make sure you check out its format guidelines before you start writing one.
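
For readers unfamiliar with the numpydoc layout, here is a minimal sketch of a docstring in that format. The function `scale` is a made-up example and does not exist in Camelot; only the section structure (`Parameters`, `Returns`, underlined headings) is the point.

```python
def scale(values, factor=1.0):
    """Multiply every value by a constant factor.

    Parameters
    ----------
    values : list of float
        Numbers to scale.
    factor : float, optional
        Multiplier applied to each value (default 1.0).

    Returns
    -------
    list of float
        The scaled numbers, in the same order as `values`.

    """
    return [value * factor for value in values]


print(scale([1.0, 2.0], factor=2.0))  # [2.0, 4.0]
```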

## Filing Issues

We use [GitHub issues](https://docs.pytest.org/en/latest/) to keep track of all issues and pull requests. Before opening an issue (which asks a question or reports a bug), it is advisable to use GitHub search to look for existing issues (both open and closed) that may be similar.
We use [GitHub issues](https://docs.pytest.org/en/latest/) to keep track of all issues and pull requests. Before opening an issue (which asks a question or reports a bug), please use GitHub search to look for existing issues (both open and closed) that may be similar.

### Questions

Please don't use GitHub issues for support questions, a better place for them would be [Stack Overflow](http://stackoverflow.com). Make sure you tag them using the `python-camelot` tag.
Please don't use GitHub issues for support questions. A better place for them would be [Stack Overflow](http://stackoverflow.com). Make sure you tag them using the `python-camelot` tag.

### Bug Reports

14 changes: 7 additions & 7 deletions README.md
@@ -7,11 +7,11 @@
[![Build Status](https://travis-ci.org/socialcopsdev/camelot.svg?branch=master)](https://travis-ci.org/socialcopsdev/camelot) [![codecov.io](https://codecov.io/github/socialcopsdev/camelot/badge.svg?branch=master&service=github)](https://codecov.io/github/socialcopsdev/camelot?branch=master)
[![image](https://img.shields.io/pypi/v/camelot-py.svg)](https://pypi.org/project/camelot-py/) [![image](https://img.shields.io/pypi/l/camelot-py.svg)](https://pypi.org/project/camelot-py/) [![image](https://img.shields.io/pypi/pyversions/camelot-py.svg)](https://pypi.org/project/camelot-py/)

**Camelot** is a Python library which makes it easy for *anyone* to extract tables from PDF files!
**Camelot** is a Python library that makes it easy for *anyone* to extract tables from PDF files!

---

**Here's how you can extract tables from PDF files.** Check out the PDF used in this example, [here](https://github.com/socialcopsdev/camelot/blob/master/docs/_static/pdf/foo.pdf).
**Here's how you can extract tables from PDF files.** Check out the PDF used in this example [here](https://github.com/socialcopsdev/camelot/blob/master/docs/_static/pdf/foo.pdf).

<pre>
>>> import camelot
@@ -43,14 +43,14 @@

There's a [command-line interface](https://camelot-py.readthedocs.io/en/latest/user/cli.html) too!

**Note:** Camelot only works with text-based PDFs and not scanned documents. If you can click-and-drag to select text in your table in a PDF viewer, then your PDF is text-based.
**Note:** Camelot only works with text-based PDFs and not scanned documents. If you can click and drag to select text in your table in a PDF viewer, then your PDF is text-based.

## Why Camelot?

- **You are in control**: Unlike other libraries and tools which either give a nice output or fail miserably (with no in-between), Camelot gives you the power to tweak table extraction. (Since everything in the real world, including PDF table extraction, is fuzzy.)
- **Metrics**: *Bad* tables can be discarded based on metrics like accuracy and whitespace, without ever having to manually look at each table.
- Each table is a **pandas DataFrame**, which enables seamless integration into [ETL and data analysis workflows](https://gist.github.com/vinayak-mehta/e5949f7c2410a0e12f25d3682dc9e873).
- **Export** to multiple formats, including json, excel and html.
- **You are in control.** Unlike other libraries and tools which either give a nice output or fail miserably (with no in-between), Camelot gives you the power to tweak table extraction. (This is important since everything in the real world, including PDF table extraction, is fuzzy.)
- *Bad* tables can be discarded based on **metrics** like accuracy and whitespace, without ever having to manually look at each table.
- Each table is a **pandas DataFrame**, which seamlessly integrates into [ETL and data analysis workflows](https://gist.github.com/vinayak-mehta/e5949f7c2410a0e12f25d3682dc9e873).
- **Export** to multiple formats, including JSON, Excel and HTML.

See [comparison with other PDF table extraction libraries and tools](https://github.com/socialcopsdev/camelot/wiki/Comparison-with-other-PDF-Table-Extraction-libraries-and-tools).
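
The metrics-based filtering described in the list above can be sketched without a PDF at hand. This sketch assumes each table comes with a report dict containing `accuracy` and `whitespace` keys, as Camelot's `parsing_report` does; the helper name `keep_good_tables`, the thresholds, and the sample numbers are illustrative assumptions, not values from the project.

```python
def keep_good_tables(reports, min_accuracy=90.0, max_whitespace=30.0):
    """Return only the reports that pass both thresholds.

    Hypothetical helper; the default thresholds are arbitrary.
    """
    return [
        report
        for report in reports
        if report["accuracy"] >= min_accuracy
        and report["whitespace"] <= max_whitespace
    ]


# Made-up reports, shaped like the per-table parsing report.
reports = [
    {"page": 1, "order": 1, "accuracy": 99.0, "whitespace": 12.0},
    {"page": 2, "order": 1, "accuracy": 41.5, "whitespace": 67.8},
]

good = keep_good_tables(reports)
print([report["page"] for report in good])  # [1]
```

With real data, the list of reports would be built from the extracted tables rather than written by hand.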

20 changes: 12 additions & 8 deletions docs/dev/contributing.rst
@@ -24,7 +24,7 @@ As the `Requests Code Of Conduct`_ states, **all contributions are welcome**, as

.. _Requests Code Of Conduct: http://docs.python-requests.org/en/master/dev/contributing/#be-cordial

Your First Contribution
Your first contribution
-----------------------

A great way to start contributing to Camelot is to pick an issue tagged with the `Contributor Friendly`_ or the `Easy`_ tags. If you're unable to find a good first issue, feel free to contact the maintainer.
@@ -39,13 +39,17 @@ To install the dependencies needed for development, you can use pip::

    $ pip install camelot-py[dev]

Alternatively, you can clone the project repository, and install using pip::

    $ pip install ".[dev]"

Pull Requests
-------------

Submit a Pull Request
Submit a pull request
^^^^^^^^^^^^^^^^^^^^^

The preferred workflow for contributing to Camelot is to fork the `project repository`_ on GitHub, clone, develop on a branch and then finally submit a pull request. Steps:
The preferred workflow for contributing to Camelot is to fork the `project repository`_ on GitHub, clone, develop on a branch and then finally submit a pull request. Here are the steps:

.. _project repository: https://github.com/socialcopsdev/camelot

@@ -76,7 +80,7 @@ Now it's time to go to the your fork of Camelot and create a pull request! You c

.. _follow these instructions: https://help.github.com/articles/creating-a-pull-request-from-a-fork/

Work on your Pull Request
Work on your pull request
^^^^^^^^^^^^^^^^^^^^^^^^^

We recommend that your pull request complies with the following guidelines:
@@ -89,7 +93,7 @@ We recommend that your pull request complies with the following guidelines:

.. _numpydoc: https://numpydoc.readthedocs.io/en/latest/format.html

- Make sure your commit messages follow `the seven rules of a great git commit message`_.
- Make sure your commit messages follow `the seven rules of a great git commit message`_:
    - Separate subject from body with a blank line
    - Limit the subject line to 50 characters
    - Capitalize the subject line
@@ -119,7 +123,7 @@ Writing documentation, function docstrings, examples and tutorials is a great wa

The documentation is written in `reStructuredText`_, with `Sphinx`_ used to generate these lovely HTML files that you're currently reading (unless you're reading this on GitHub). You can edit the documentation using any text editor and then generate the HTML output by running ``make html`` in the ``docs/`` directory.

The function docstrings are written using the `numpydoc`_ extension for Sphinx. Make sure you check out how its format guidelines, before you start writing one.
The function docstrings are written using the `numpydoc`_ extension for Sphinx. Make sure you check out its format guidelines before you start writing one.

.. _reStructuredText: https://en.wikipedia.org/wiki/ReStructuredText
.. _Sphinx: http://www.sphinx-doc.org/en/master/
@@ -128,14 +132,14 @@ The function docstrings are written using the `numpydoc`_ extension for Sphinx.
Filing Issues
-------------

We use `GitHub issues`_ to keep track of all issues and pull requests. Before opening an issue (which asks a question or reports a bug), it is advisable to use GitHub search to look for existing issues (both open and closed) that may be similar.
We use `GitHub issues`_ to keep track of all issues and pull requests. Before opening an issue (which asks a question or reports a bug), please use GitHub search to look for existing issues (both open and closed) that may be similar.

.. _GitHub issues: https://docs.pytest.org/en/latest/

Questions
^^^^^^^^^

Please don't use GitHub issues for support questions, a better place for them would be `Stack Overflow`_. Make sure you tag them using the ``python-camelot`` tag.
Please don't use GitHub issues for support questions. A better place for them would be `Stack Overflow`_. Make sure you tag them using the ``python-camelot`` tag.

.. _Stack Overflow: http://stackoverflow.com

20 changes: 10 additions & 10 deletions docs/index.rst
@@ -23,11 +23,11 @@ Release v\ |version|. (:ref:`Installation <install>`)
.. image:: https://img.shields.io/pypi/pyversions/camelot-py.svg
    :target: https://pypi.org/project/camelot-py/

**Camelot** is a Python library which makes it easy for *anyone* to extract tables from PDF files!
**Camelot** is a Python library that makes it easy for *anyone* to extract tables from PDF files!

----

**Here's how you can extract tables from PDF files.** Check out the PDF used in this example, `here`_.
**Here's how you can extract tables from PDF files.** Check out the PDF used in this example `here`_.

.. _here: _static/pdf/foo.pdf

@@ -55,15 +55,15 @@ Release v\ |version|. (:ref:`Installation <install>`)

There's a :ref:`command-line interface <cli>` too!

.. note:: Camelot only works with text-based PDFs and not scanned documents. If you can click-and-drag to select text in your table in a PDF viewer, then your PDF is text-based.
.. note:: Camelot only works with text-based PDFs and not scanned documents. If you can click and drag to select text in your table in a PDF viewer, then your PDF is text-based.

Why Camelot?
------------

- **You are in control**: Unlike other libraries and tools which either give a nice output or fail miserably (with no in-between), Camelot gives you the power to tweak table extraction. (Since everything in the real world, including PDF table extraction, is fuzzy.)
- **Metrics**: *Bad* tables can be discarded based on metrics like accuracy and whitespace, without ever having to manually look at each table.
- Each table is a **pandas DataFrame**, which enables seamless integration into `ETL and data analysis workflows`_.
- **Export** to multiple formats, including json, excel and html.
- **You are in control.** Unlike other libraries and tools which either give a nice output or fail miserably (with no in-between), Camelot gives you the power to tweak table extraction. (This is important since everything in the real world, including PDF table extraction, is fuzzy.)
- *Bad* tables can be discarded based on **metrics** like accuracy and whitespace, without ever having to manually look at each table.
- Each table is a **pandas DataFrame**, which seamlessly integrates into `ETL and data analysis workflows`_.
- **Export** to multiple formats, including JSON, Excel and HTML.

See `comparison with other PDF table extraction libraries and tools`_.

@@ -73,7 +73,7 @@ See `comparison with other PDF table extraction libraries and tools`_.
The User Guide
--------------

This part of the documentation, begins with some background information about why Camelot was created, takes a small dip into the implementation details and then focuses on step-by-step instructions for getting the most out of Camelot.
This part of the documentation begins with some background information about why Camelot was created, takes a small dip into the implementation details and then focuses on step-by-step instructions for getting the most out of Camelot.

.. toctree::
    :maxdepth: 2
@@ -85,8 +85,8 @@ This part of the documentation, begins with some background information about wh
    user/advanced
    user/cli

The API Documentation / Guide
-----------------------------
The API Documentation/Guide
---------------------------

If you are looking for information on a specific function, class, or method,
this part of the documentation is for you.