Tutorials or better demos to manipulate annotations #107

Closed
ubuntuslave opened this issue Jun 11, 2014 · 12 comments · Fixed by #1745
Labels
is-maintenance Anything that is just internal: Simplifying code, syntax changes, updating docs, speed improvements nf-documentation Non-functional change: Documentation

Comments

@ubuntuslave

ubuntuslave commented Jun 11, 2014

edit by Martin to show the current state:

Number of files with at least one annotation in my big dataset:

✔️ /Link: 2694x
❌ /Widget: 431x - #1207
✔️ /Popup: 189x - #1198 - now #1665
✔️ /FreeText: 85x
✔️ /Text: 77x
✔️ /Square: 47x - #1388
/Stamp: 40x
✔️ /Line: 22x
/DGAP:RedaxBox: 3x
✔️ /Circle: 2x - #1556
/FileAttachment: 2x - don't confuse it with /EmbeddedFiles
/Ink: 1x - https://pyfpdf.github.io/fpdf2/Annotations.html#ink-annotations
/Caret: 1x
✔️ /Polygon - #1557
✔️ /PolyLine - see #1726

Text markup annotations:
✔️ /Highlight: 22x - https://stackoverflow.com/q/9099497/562769 - see #1740
/StrikeOut: 2x
/Underline: 2x


Original post:

I'm trying to understand how PyPDF2 works with existing annotation objects, such as highlights and popups. The demos provided don't show how to _add_ new DictionaryObjects to the current list of annotations. I believe my problem is that I have no way to find out the idnum of a new object, which its parent needs in order to refer to it. Here is what I've been doing so far... (To be honest, I gave up reading the code in pdf.py and generic.py because it was taking too long.)

from typing import cast

from PyPDF2 import PdfReader, PdfWriter
from PyPDF2.generic import ArrayObject

writer = PdfWriter()
reader = PdfReader("commented.pdf")

# print how many pages the input document has:
print(f"commented.pdf has {len(reader.pages)} pages.")

# take the first page of the input document
page = reader.pages[0]

annots = cast(ArrayObject, page["/Annots"])
annot0 = annots[0].get_object()
annot1 = annots[1].get_object()
annot2 = annots[2].get_object()
annot3 = annots[3].get_object()
annot4 = annots[4].get_object()

# TEST: changing position of the Popup's rectangle (Works!)
from PyPDF2.generic import (
    BooleanObject,
    DictionaryObject,
    IndirectObject,
    NameObject,
    NumberObject,
    RectangleObject,
)

popup_size = (180, 120)

rect = RectangleObject(annot0["/Rect"])
rect_x0 = rect.left
rect_y0 = rect.top
annot2.update(
    {
        NameObject("/Open"): BooleanObject(False),
        NameObject("/Rect"): RectangleObject(
            [rect_x0, rect_y0, rect_x0 + popup_size[0], rect_y0 + popup_size[1]]
        ),
    }
)

# Test make new Popup Annotation
rect = RectangleObject(annot1["/Rect"])
rect_x0 = rect.left
rect_y0 = rect.top

popup = DictionaryObject()
popup.update(
    {
        NameObject("/Type"): NameObject("/Annot"),
        NameObject("/Subtype"): NameObject("/Popup"),
        NameObject("/Parent"): IndirectObject(
            15, 0, reader
        ),  # TODO: How to find the parent's idnum? Manually: this is 15 for this object
        NameObject("/Open"): BooleanObject(False),
        NameObject("/Rect"): RectangleObject(
            [rect_x0, rect_y0, rect_x0 + popup_size[0], rect_y0 + popup_size[1]]
        ),
        NameObject("/F"): NumberObject(28),  # The type of object? A popup?
    }
)

popup_ref = writer._add_object(popup)

if "/Annots" in page:
    page["/Annots"].append(popup_ref)
else:
    page[NameObject("/Annots")] = ArrayObject([popup_ref])

# Adding the reference to its new popup:
annot1.update(
    {NameObject("/Popup"): popup_ref}  # TODO: put the number for the new reference
)

annots_ref = writer._add_object(annots)

writer.add_page(page)  # FIXME: not adding the new Popup annotation
# finally, write "output" to document-output.pdf
with open("PyPDF2-output.pdf", "wb") as fp:
    writer.write(fp)
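
The comment next to `/F` in the script asks what the value 28 means. `/F` holds the annotation flags, a bit field defined in the PDF specification, so 28 is a combination of flags rather than an object type. A minimal sketch for decoding it, using only the standard library (the class and function names here are made up for illustration):

```python
from enum import IntFlag


class AnnotationFlag(IntFlag):
    """Bit values of the /F entry of an annotation dictionary."""

    INVISIBLE = 1
    HIDDEN = 2
    PRINT = 4
    NO_ZOOM = 8
    NO_ROTATE = 16
    NO_VIEW = 32
    READ_ONLY = 64
    LOCKED = 128
    TOGGLE_NO_VIEW = 256
    LOCKED_CONTENTS = 512


def describe_flags(value: int) -> list[str]:
    """Return the names of all flags that are set in value."""
    return [flag.name for flag in AnnotationFlag if value & flag]


print(describe_flags(28))
```

Under this reading, 28 asks the viewer to print the annotation and to keep it from scaling or rotating with the page.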

P.S.: My motivation for this little script is to give it to Docear users, since mind-mapping highlights requires retroactively adding the missing popups to highlight annotations (in Adobe Acrobat), as the proprietary program by this guy does.

@mstamy2
Collaborator

mstamy2 commented Jun 16, 2014

Thanks for the feedback. I'm currently working on better documentation for PyPDF2, and I intend for it to feature plenty of examples/demos that might answer questions like yours.

@cheetah90

Hi, just want to follow up on this issue. Are there demos/examples for adding annotations?

@agentcooper

Here is an example for creating a highlight:
https://gist.github.com/agentcooper/4c55133f5d95866acdee5017cd318558

@mstamy2 are you interested in adding methods for highlights? I can provide a pull request.

@snakemicro

snakemicro commented Apr 7, 2019

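I tried adding a helper function to pdf.py:

def hex_to_rgb(value):
    """Convert a hex color string such as "ff0000" to an RGB tuple of floats in [0, 1]."""
    return tuple(int(value[i:i + 2], 16) / 255.0 for i in (0, 2, 4))

and a method to the PdfFileWriter class: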
def addFreeTextAnno(self, pagenum, txt, rect, font="Helvetica", bold=False,
                    italic=False, fontSize="14pt", fontColor="ff0000",
                    borderColor="ff0000", bgColor="ffffff"):
    """
    Add a /FreeText annotation to the page with index pagenum.

    font: e.g. "Helvetica", "Times New Roman" or "Courier"
    bold, italic: font style switches
    txt: the annotation text; a carriage return starts a new line
    fontColor, borderColor, bgColor: hex strings such as "ff0000"
    """
    pageLink = self.getObject(self._pages)['/Kids'][pagenum]
    pageRef = self.getObject(pageLink)

    # Build the style string for /DS,
    # e.g. "font: bold italic Times New Roman 20pt;text-align:left;color:#ff0000"
    fontStr = "font: "
    if bold:
        fontStr += "bold "
    if italic:
        fontStr += "italic "
    fontStr += font + " " + fontSize
    fontStr += ";text-align:left;color:#" + fontColor

    # Build the appearance string for /DA, e.g. "1.0 0.0 0.0 rg"
    borderColorStr = " ".join(str(c) for c in hex_to_rgb(borderColor)) + " rg"

    frtxt = DictionaryObject()
    frtxt.update({
        NameObject('/Type'): NameObject('/Annot'),
        NameObject('/Subtype'): NameObject('/FreeText'),
        NameObject('/P'): pageLink,
        NameObject('/Rect'): RectangleObject(rect),
        NameObject('/Contents'): TextStringObject(txt),
        # font, size and color
        NameObject('/DS'): TextStringObject(fontStr),
        # border color
        NameObject('/DA'): TextStringObject(borderColorStr),
        # background color
        NameObject('/C'): ArrayObject([FloatObject(n) for n in hex_to_rgb(bgColor)])
    })

    lnkRef = self._addObject(frtxt)

    # Append to the page's annotation list, creating it if necessary
    if "/Annots" in pageRef:
        pageRef['/Annots'].append(lnkRef)
    else:
        pageRef[NameObject('/Annots')] = ArrayObject([lnkRef])



The method is then called like this:

from PyPDF2 import PdfFileReader, PdfFileWriter


def main():
    output = PdfFileWriter()
    input1 = PdfFileReader(open("aaa.pdf", "rb"))

    numPages = input1.getNumPages()

    for index in range(numPages):
        pageObj = input1.getPage(index)

        output.addPage(pageObj)
        output.addFreeTextAnno(index, "My Free Text Annotate", [100, 100, 350, 150],
            bold=True, italic=True, fontSize="20pt", fontColor="ffffff", borderColor="ffaa00",
            bgColor="000000", font="Times")

        # Print the first two annotations of the page for inspection
        annot = pageObj["/Annots"][0].getObject()
        annot1 = pageObj["/Annots"][1].getObject()
        print(annot)
        print(annot1)

    # Write the result once, after all pages have been processed
    with open("xx.pdf", "wb") as fp:
        output.write(fp)

@rien333

rien333 commented Sep 2, 2019

And @snakemicro? Did it work?

@marksweb

Hoping someone can offer some assistance on font size/style.

I'm trying to detect when a value is too long for a field I've got & adjust the font size, but the font doesn't change.

    # Add data to a page
    page = pdf_writer.getPage(0)
    pdf_writer.updatePageFormFieldValues(page, data_dict)

    for j in range(0, len(page['/Annots'])):
        writer_annot = page['/Annots'][j].getObject()
        writer_annot.update({
            NameObject("/DS"): TextStringObject(
                "font: 12.0pt; color:#E52237"
            )
        })
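
One possible explanation, offered as an assumption rather than a diagnosis: for form fields (`/Widget` annotations), viewers generally take the font and size from the default appearance string `/DA`, while `/DS` only applies to rich text. A small standard-library sketch for building such a string follows. The helper name is made up, and `/Helv` is a hypothetical font resource name that would have to exist in the form's default resources.

```python
def build_default_appearance(font_resource: str, size: float, hex_color: str) -> str:
    """Build a /DA string: font resource, size, then an RGB fill color."""
    r, g, b = (int(hex_color[i:i + 2], 16) / 255 for i in (0, 2, 4))
    return f"{font_resource} {size:g} Tf {r:.3f} {g:.3f} {b:.3f} rg"


da = build_default_appearance("/Helv", 12, "E52237")
print(da)
```

The resulting string could then be stored with `NameObject("/DA"): TextStringObject(da)` in the same `update` call. A viewer may still show the old look until the field's appearance stream is regenerated.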

@MartinThoma MartinThoma added the nf-documentation Non-functional change: Documentation label Apr 8, 2022
@MartinThoma
Member

We might also want to add a PyPDF answer to https://stackoverflow.com/q/47497309/562769

@MartinThoma
Member

I've just added some basic examples. Feel free to add more by creating PRs :-)

@MartinThoma
Member

The demo is here: https://pypdf2.readthedocs.io/en/latest/user/adding-pdf-annotations.html

What is still missing:

  1. How to read comments
  2. How to add comments (text + location)
  3. An overview of the different types of annotations

MartinThoma added a commit that referenced this issue Jun 12, 2022
Full credit to the GitHub user snakemicro:
#107 (comment)

Co-authored-by: snakemicro
@MartinThoma
Member

@snakemicro I would add it in #981 :-) I just want to wait for some feedback.

@MartinThoma MartinThoma added the is-maintenance Anything that is just internal: Simplifying code, syntax changes, updating docs, speed improvements label Jun 26, 2022
MartinThoma added a commit that referenced this issue Jul 22, 2022
…ionBuilder (#1120)

* Add `page.annotations` (getter and setter)
* Add `writer.add_annotation(page_number, annotation_dictionary)`
* Add AnnotationBuilder to generate the `annotation_dictionary` for the different subtypes of annotations. Similarly, we could have an AnnotationsParser.

See #107

Closes #981
MartinThoma added a commit that referenced this issue Aug 3, 2022
@MartinThoma
Member

@ubuntuslave @snakemicro I've started working on improving the situation in #1198. Somehow the popup doesn't show... your help would be much appreciated 🙏

@MartinThoma
Member

MartinThoma commented Aug 5, 2022

Number of files with at least one annotation in my big dataset:

✔️ /Link: 2694x
❌ /Widget: 431x - #1207
⏳ /Popup: 189x - #1198 - now #1665
✔️ /FreeText: 85x
✔️ /Text: 77x
✔️ /Square: 47x - #1388
/Stamp: 40x
/Highlight: 22x - https://stackoverflow.com/q/9099497/562769
✔️ /Line: 22x
/DGAP:RedaxBox: 3x
⏳ /Circle: 2x - #1556
/FileAttachment: 2x - don't confuse it with /EmbeddedFiles
/StrikeOut: 2x
/Underline: 2x
/Ink: 1x
/Caret: 1x
✔️ /Polygon - #1557
/PolyLine

MartinThoma added a commit that referenced this issue Jan 15, 2023
MartinThoma added a commit that referenced this issue Jan 15, 2023
MartinThoma added a commit that referenced this issue Jan 16, 2023
MartinThoma added a commit that referenced this issue Jan 18, 2023
MartinThoma added a commit that referenced this issue Mar 19, 2023
MartinThoma added a commit that referenced this issue Mar 19, 2023
MartinThoma added a commit that referenced this issue Mar 23, 2023
MartinThoma added a commit that referenced this issue Mar 26, 2023
* Annotation module: pypdf/generic/_annotations.py ➔  pypdf/annotation.py
* Create abstract base class AnnotationDictionary
* Create annotation classes: Text, FreeText

See #107

See #1741
  The annotationBuilder design pattern discussion
MartinThoma added a commit that referenced this issue Mar 26, 2023
* Annotation module: pypdf/generic/_annotations.py ➔  pypdf/annotation.py
* Create abstract base class AnnotationDictionary
* Create annotation classes: Text, FreeText
* DOC: Remove AnnotationBuilder from the docs, use AnnotationDictionary
  classes directly

See #107

See #1741
  The annotationBuilder design pattern discussion
MartinThoma pushed a commit that referenced this issue Mar 26, 2023
MartinThoma added a commit that referenced this issue Jul 29, 2023
The goal of this PR is to create a more intuitive interface for creating annotations. The AnnotationBuilder gets deprecated in favor of several annotation classes, e.g.

```python
# old
from pypdf.generic import AnnotationBuilder
annotation = AnnotationBuilder.free_text(
    "Hello World\nThis is the second line!",
    rect=(50, 550, 200, 650),
    font="Arial",
    bold=True,
    italic=True,
    font_size="20pt",
    font_color="00ff00",
    border_color="0000ff",
    background_color="cdcdcd",
)

# new
from pypdf.annotations import FreeText
annotation = FreeText(
    text="Hello World\nThis is the second line!",
    rect=(50, 550, 200, 650),
    font="Arial",
    bold=True,
    italic=True,
    font_size="20pt",
    font_color="00ff00",
    border_color="0000ff",
    background_color="cdcdcd",
)
```

* `pypdf/generic/_annotations.py` ➔ `pypdf/annotations/`
* Create abstract base class AnnotationDictionary
* Create abstract base class MarkupAnnotation which inherits from AnnotationDictionary. Most annotations are MarkupAnnotations.
* Deprecated AnnotationBuilder
* Ensure the AnnotationBuilder is not used in the docs

Closes #107
8 participants