Commit
Documentation updates
- Removes dependency on Prism for code highlighting
- Corrects indentation on document.rst
- Adds new areas throughout the documentation to generally better explain things
jamie-lemon committed Mar 28, 2024
1 parent d543903 commit 179886f
Showing 14 changed files with 1,660 additions and 1,686 deletions.
3 changes: 2 additions & 1 deletion docs/annot.rst
@@ -5,7 +5,8 @@
================
Annot
================
**This class is supported for PDF documents only.**

|pdf_only_class|

Quote from the :ref:`AdobeManual`: "An annotation associates an object such as a note, sound, or movie with a location on a page of a PDF document, or provides a way to interact with the user by means of the mouse and keyboard."

7 changes: 5 additions & 2 deletions docs/document-writer-class.rst
@@ -6,11 +6,14 @@
DocumentWriter
================

|pdf_only_class|


* New in v1.21.0

This class represents a utility which can output various :ref:`document types supported by MuPDF<Supported_File_Types>`.
This class represents a utility which can output various :ref:`document types supported by PyMuPDF<Supported_File_Types>`.

In PyMuPDF only used for outputting PDF documents whose pages are populated by :ref:`Story` DOMs.
In :title:`PyMuPDF` only used for outputting PDF documents whose pages are populated by :ref:`Story` DOMs.

Using DocumentWriter_ also for other document types might happen in the future.

2,335 changes: 1,172 additions & 1,163 deletions docs/document.rst

Large diffs are not rendered by default.

11 changes: 0 additions & 11 deletions docs/footer.rst
@@ -12,17 +12,6 @@
var a = document.getElementById('feedbackLinkBottom');
a.setAttribute("href", "https://artifex.com/contributor/feedback.php?utm_source=rtd-pymupdf&utm_medium=rtd&utm_content=footer-link&url="+url_string);
Prism.plugins.NormalizeWhitespace.setDefaults({
'remove-trailing': true,
'remove-indent': true,
'left-trim': true,
'right-trim': true,
'break-lines': 100,
'indent': 0,
'remove-initial-line-feed': false,
'tabs-to-spaces': 4,
'spaces-to-tabs': 4
});
</script>

<p style="color:#999" id="footerDisclaimer">This software is provided AS-IS with no warranty, either express or implied. This software is distributed under license and may not be copied, modified or distributed except as expressly authorized under the terms of that license. Refer to licensing information at <a href="https://www.artifex.com?utm_source=rtd-pymupdf&utm_medium=rtd&utm_content=footer-link">artifex.com</a> or contact Artifex Software Inc., 39 Mesa Street, Suite 108A, San Francisco CA 94129, United States for further information.</p>
101 changes: 40 additions & 61 deletions docs/header.rst
@@ -13,83 +13,62 @@

</small></details>

.. |pdf_only_class| raw:: html

<div style="width:100%; text-align:right"><b>This class is for PDF only.</b></div>

.. raw:: html

<link rel="stylesheet" type="text/css" href="_static/prism/prism.css">
.. raw:: html

<style>
<style>
/* Prism Updates */
#languageToggle {
width:25%;
margin:8px 10px 0;
}
.code-toolbar .copy-to-clipboard-button {
background: #007aff !important;
color: white !important;
padding: 10px !important;
border-radius: 5px !important;
font-family: Arial !important;
}
#button-select-en {
padding: 5px 10px;
background-color: #fff;
border: 1px solid #000;
border-radius: 10px 0 0 10px;
font-size: 14px;
}
.code-toolbar pre {
background: #fff;
border: #999 1px dashed;
}
#button-select-ja {
padding: 5px 10px;
background-color: #fff;
border: 1px solid #000;
border-radius: 0px 10px 10px 0;
border-left: 0;
font-size: 14px;
}
.code-toolbar code {
border: 0px !important;
}
#button-select-en , #button-select-ja, #button-select-en:hover , #button-select-ja:hover {
color: #fff;
text-decoration: none;
}
/* small screens */
@media all and (max-width : 768px) {
#languageToggle {
width:25%;
margin:8px 10px 0;
}
#button-select-en {
padding: 5px 10px;
background-color: #fff;
border: 1px solid #000;
border-radius: 10px 0 0 10px;
font-size: 14px;
}
#button-select-ja {
padding: 5px 10px;
background-color: #fff;
border: 1px solid #000;
border-radius: 0px 10px 10px 0;
border-left: 0;
font-size: 14px;
}
#button-select-en , #button-select-ja, #button-select-en:hover , #button-select-ja:hover {
color: #fff;
text-decoration: none;
}
/* small screens */
@media all and (max-width : 768px) {
#languageToggle {
width:50%;
}
width:50%;
}
}
@media all and (max-width : 400px) {
#languageToggle {
width:70%;
}
@media all and (max-width : 400px) {
#languageToggle {
width:70%;
}
}
@media all and (max-width : 375px) {
#button-select-en , #button-select-ja {
font-size: 11px;
}
@media all and (max-width : 375px) {
#button-select-en , #button-select-ja {
font-size: 11px;
}
}
</style>

<script type="text/javascript" src="_static/prism/prism.js"></script>
</style>

<div style="display:flex;justify-content:space-between;align-items: center;">
<form class="sidebar-search-container top" method="get" action="search.html" role="search" style="width:75%">
3 changes: 2 additions & 1 deletion docs/how-to-open-a-file.rst
@@ -14,8 +14,9 @@ Opening Files
Supported File Types
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

:title:`PyMuPDF` can open files other than just :title:`PDF`.

:title:`PyMuPDF` supports the following file types:
The following file types are supported:

.. include:: supported-files-table.rst

6 changes: 5 additions & 1 deletion docs/page.rst
@@ -803,6 +803,8 @@ In a nutshell, this is what you can do with PyMuPDF:
|history_end|


**Drawing Methods**

.. index::
pair: closePath; draw_line
pair: color; draw_line
Expand Down Expand Up @@ -1447,7 +1449,9 @@ In a nutshell, this is what you can do with PyMuPDF:

.. method:: get_textpage_ocr(flags=3, language="eng", dpi=72, full=False, tessdata=None)

Create a :ref:`TextPage` for the page that includes OCRed text. MuPDF will invoke Tesseract-OCR if this method is used. Otherwise this is a normal :ref:`TextPage` object.
**Optical Character Recognition** (**OCR**) technology can be used to extract text data for documents where text is in a raster image format throughout the page. Use this method to **OCR** a page for text extraction.

This method returns a :ref:`TextPage` for the page that includes OCRed text. MuPDF will invoke Tesseract-OCR if this method is used. Otherwise this is a normal :ref:`TextPage` object.

:arg int flags: indicator bits controlling the content available for subsequent text extractions and searches -- see the parameter of :meth:`Page.get_text`.
:arg str language: the expected language(s). Use "+"-separated values if multiple languages are expected, "eng+spa" for English and Spanish.
4 changes: 3 additions & 1 deletion docs/recipes-annotations.rst
@@ -11,7 +11,9 @@ Annotations
How to Add and Modify Annotations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In PyMuPDF, new annotations can be added via :ref:`Page` methods. Once an annotation exists, it can be modified to a large extent using methods of the :ref:`Annot` class.
In :title:`PyMuPDF`, new annotations can be added via :ref:`Page` methods. Once an annotation exists, it can be modified to a large extent using methods of the :ref:`Annot` class.

Annotations can **only** be inserted in :title:`PDF` pages - other document types do not support annotation insertion.

In contrast to many other tools, initial insert of annotations happens with a minimum number of properties. We leave it to the programmer to e.g. set attributes like author, creation date or subject.

36 changes: 32 additions & 4 deletions docs/recipes-drawing-and-graphics.rst
@@ -7,17 +7,17 @@ Drawing and Graphics
==============================


PDF files support elementary drawing operations as part of their syntax. This includes basic geometrical objects like lines, curves, circles, rectangles including specifying colors.
PDF files support elementary drawing operations as part of their syntax. These are **vector graphics** and include basic geometrical objects such as lines, curves, circles and rectangles, together with their color specifications.

The syntax for such operations is defined in "A Operator Summary" on page 643 of the :ref:`AdobeManual`. Specifying these operators for a PDF page happens in its :data:`contents` objects.

PyMuPDF implements a large part of the available features via its :ref:`Shape` class, which is comparable to notions like "canvas" in other packages (e.g. `reportlab <https://pypi.org/project/reportlab/>`_).
:title:`PyMuPDF` implements a large part of the available features via its :ref:`Shape` class, which is comparable to notions like "canvas" in other packages (e.g. `reportlab <https://pypi.org/project/reportlab/>`_).

A shape is always created as a **child of a page**, usually with an instruction like *shape = page.new_shape()*. The class defines numerous methods that perform drawing operations on the page's area. For example, *last_point = shape.draw_rect(rect)* draws a rectangle along the borders of a suitably defined *rect = fitz.Rect(...)*.

The returned *last_point* is **always** the :ref:`Point` where the drawing operation ended ("last point"). Every such elementary drawing requires a subsequent :meth:`Shape.finish` to "close" it, but multiple drawings may share one common *finish()* call.

In fact, :meth:`Shape.finish` *defines* a group of preceding draw operations to form one -- potentially rather complex -- graphics object. PyMuPDF provides several predefined graphics in `shapes_and_symbols.py <https://github.com/pymupdf/PyMuPDF-Utilities/blob/master/shapes/shapes_and_symbols.py>`_ which demonstrate how this works.
In fact, :meth:`Shape.finish` *defines* a group of preceding draw operations to form one -- potentially rather complex -- graphics object. :title:`PyMuPDF` provides several predefined graphics in `shapes_and_symbols.py <https://github.com/pymupdf/PyMuPDF-Utilities/blob/master/shapes/shapes_and_symbols.py>`_ which demonstrate how this works.

If you import this script, you can also directly use its graphics as in the following example::

@@ -84,12 +84,15 @@ This is the script's outcome:

------------------------------


.. _RecipesDrawingAndGraphics_Extract_Drawings:

How to Extract Drawings
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* New in v1.18.0

The drawing commands issued by a page can be extracted. Interestingly, this is possible for :ref:`all supported document types<Supported_File_Types>` -- not just PDF: so you can use it for XPS, EPUB and others as well.
Drawing commands (**vector graphics**) issued by a page can be extracted as a list of dictionaries. Interestingly, this is possible for :ref:`all supported document types<Supported_File_Types>` -- not just PDF: so you can use it for XPS, EPUB and others as well.

The :ref:`Page` method :meth:`Page.get_drawings` accesses draw commands and converts them into a list of Python dictionaries. Each dictionary -- called a "path" -- represents a separate drawing -- it may be simple like a single line, or a complex combination of lines and curves representing one of the shapes of the previous section.

@@ -195,4 +198,29 @@ Here is a comparison between input and output of an example page, created by the

.. note:: You can use the path list to make your own lists of e.g. all lines or all rectangles on the page and subselect them by criteria, like color or position on the page etc.
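The sub-selection mentioned in the note can be sketched in plain Python. This is only an illustration under stated assumptions: the ``paths`` list below is a hand-written stand-in for what :meth:`Page.get_drawings` returns, plain tuples replace the :ref:`Point` and :ref:`Rect` objects, each path carries only a few of its keys, and the helper name ``items_of_kind`` is hypothetical.

```python
# Stand-in for the list returned by page.get_drawings(). Plain tuples replace
# PyMuPDF's Point and Rect objects, and only a few path keys are shown
# (assumptions made so that the sketch runs without a document).
paths = [
    {"items": [("l", (10, 10), (100, 10))], "color": (1, 0, 0), "width": 1.0},
    {"items": [("re", (20, 20, 80, 60))], "color": (0, 0, 1), "width": 2.0},
    {
        "items": [("l", (10, 50), (10, 150)), ("l", (10, 150), (60, 150))],
        "color": (0, 0, 1),
        "width": 0.5,
    },
]


def items_of_kind(paths, kind):
    """Return (path, item) pairs whose item type code equals `kind`.

    "l" marks a line and "re" a rectangle; `items_of_kind` itself is a
    hypothetical helper, not part of PyMuPDF.
    """
    return [
        (path, item)
        for path in paths
        for item in path["items"]
        if item[0] == kind
    ]


lines = items_of_kind(paths, "l")
rects = items_of_kind(paths, "re")

# Sub-select by a criterion, here the stroke color of the enclosing path.
blue_lines = [item for path, item in lines if path["color"] == (0, 0, 1)]

print(len(lines), len(rects), len(blue_lines))
```

With a real page, ``paths = page.get_drawings()`` would take the place of the hand-written list, and the same filtering applies.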



How to Draw Graphics
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To draw graphics, call the :meth:`Drawing Method <Page.draw_line>` for the kind of object you want. You can draw graphics directly on pages or within shape objects.


For example, to draw a circle::

    # center_x, center_y and radius are placeholders for your own values

    # Draw a circle on the page using the Page method
    page.draw_circle((center_x, center_y), radius, color=(1, 0, 0), width=2)

    # Draw a circle on the page using a Shape object
    shape = page.new_shape()
    shape.draw_circle((center_x, center_y), radius)
    shape.finish(color=(1, 0, 0), width=2)
    shape.commit(overlay=True)

The :ref:`Shape` object can be used to combine multiple drawings that should receive common properties as specified by :meth:`Shape.finish`.





.. include:: footer.rst
25 changes: 25 additions & 0 deletions docs/recipes-text.rst
@@ -677,4 +677,29 @@ This example combines multiple requirements:

.. image:: images/img-htmlbox5.*


|
.. _RecipesText_J:


How to Extract Text with Color
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Text color is stored at span level. Iterate through the text blocks, lines and spans of each page and read the ``color`` value of each span, which is an integer in sRGB notation.

::

    for page in doc:
        text_blocks = page.get_text("dict", flags=fitz.TEXTFLAGS_TEXT)["blocks"]
        for block in text_blocks:
            for line in block["lines"]:
                for span in line["spans"]:
                    text = span["text"]
                    color = fitz.sRGB_to_rgb(span["color"])
                    print(f"Text: {text}, Color: {color}")
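For orientation, the ``color`` value of a span is a single integer, and ``fitz.sRGB_to_rgb`` turns it into a red-green-blue triple. The following is a minimal pure-Python sketch of that decoding, assuming the usual ``0xRRGGBB`` packing. The helper names ``srgb_int_to_rgb`` and ``rgb_to_hex`` are hypothetical and not part of PyMuPDF.

```python
def srgb_int_to_rgb(srgb):
    """Split a packed 0xRRGGBB integer into (red, green, blue), each 0-255."""
    red = (srgb >> 16) & 0xFF
    green = (srgb >> 8) & 0xFF
    blue = srgb & 0xFF
    return red, green, blue


def rgb_to_hex(rgb):
    """Format an (r, g, b) triple of 0-255 integers as a CSS-style hex string."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


# A pure red span color, written as a packed integer.
red_span_color = 0xFF0000
print(srgb_int_to_rgb(red_span_color))
print(rgb_to_hex(srgb_int_to_rgb(red_span_color)))
```

The hex form can be handy when grouping spans by color or comparing them against values taken from a style guide.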




.. include:: footer.rst
2 changes: 2 additions & 0 deletions docs/shape.rst
@@ -5,6 +5,8 @@
Shape
================

|pdf_only_class|

This class allows creating interconnected graphical elements on a PDF page. Its methods have the same meaning and name as the corresponding :ref:`Page` methods.

In fact, each :ref:`Page` draw method is just a convenience wrapper for (1) one shape draw method, (2) the :meth:`Shape.finish` method, and (3) the :meth:`Shape.commit` method. For page text insertion, only the :meth:`Shape.commit` method is invoked. If many draw and text operations are executed for a page, you should always consider using a Shape object.
2 changes: 2 additions & 0 deletions docs/textwriter.rst
@@ -6,6 +6,8 @@
TextWriter
================

|pdf_only_class|

* New in v1.16.18

This class represents a MuPDF *text* object. The basic idea is to **decouple (1) text preparation, and (2) text output** to PDF pages.