Commit

feat: add USPTO backend parser
Add a backend implementation to parse patent applications and
grants from the United States Patent and Trademark Office (USPTO).

Signed-off-by: Cesar Berrospi Ramis <[email protected]>
ceberam committed Dec 11, 2024
1 parent 458df06 commit 543e13a
Showing 32 changed files with 466,598 additions and 7 deletions.
1,844 changes: 1,837 additions & 7 deletions docling/backend/patent_uspto_backend.py

Large diffs are not rendered by default.

508 changes: 508 additions & 0 deletions tests/data/groundtruth/docling_v2/ipa20110039701.itxt

397,684 changes: 397,684 additions & 0 deletions tests/data/groundtruth/docling_v2/ipa20110039701.json

4,518 changes: 4,518 additions & 0 deletions tests/data/groundtruth/docling_v2/ipa20110039701.md

185 changes: 185 additions & 0 deletions tests/data/groundtruth/docling_v2/ipa20180000016.itxt

5,827 changes: 5,827 additions & 0 deletions tests/data/groundtruth/docling_v2/ipa20180000016.json

380 changes: 380 additions & 0 deletions tests/data/groundtruth/docling_v2/ipa20180000016.md

79 changes: 79 additions & 0 deletions tests/data/groundtruth/docling_v2/ipa20200022300.itxt

1,137 changes: 1,137 additions & 0 deletions tests/data/groundtruth/docling_v2/ipa20200022300.json

155 changes: 155 additions & 0 deletions tests/data/groundtruth/docling_v2/ipa20200022300.md

267 changes: 267 additions & 0 deletions tests/data/uspto/ipa20060127578.xml

25,908 changes: 25,908 additions & 0 deletions tests/data/uspto/ipa20110039701.xml

638 changes: 638 additions & 0 deletions tests/data/uspto/ipa20110256314.xml

6,248 changes: 6,248 additions & 0 deletions tests/data/uspto/ipa20120044998.xml

723 changes: 723 additions & 0 deletions tests/data/uspto/ipa20180000016.xml

579 changes: 579 additions & 0 deletions tests/data/uspto/ipa20180000017.xml

456 changes: 456 additions & 0 deletions tests/data/uspto/ipa20200022300.xml

590 changes: 590 additions & 0 deletions tests/data/uspto/ipa20200022301.xml

4,468 changes: 4,468 additions & 0 deletions tests/data/uspto/ipg07997973.xml

783 changes: 783 additions & 0 deletions tests/data/uspto/ipg08672134.xml

1,151 changes: 1,151 additions & 0 deletions tests/data/uspto/ipgD0701016.xml

447 changes: 447 additions & 0 deletions tests/data/uspto/pa20010031492.xml

501 changes: 501 additions & 0 deletions tests/data/uspto/pftaps039302750.txt

165 changes: 165 additions & 0 deletions tests/data/uspto/pftaps045931038.txt
@@ -0,0 +1,165 @@
PATN
WKU 045931038
SRC 6
APN 5253349
APT 1
ART 154
APD 19830822
TTL Polyoxazoline compounds
ISD 19860603
NCL 2
ECL 1,2
EXA Gibson; S. A.
EXP Lesmes; George F.
INVT
NAM Johnson; Mark R.
CTY Breckenridge
STA MI
ASSG
NAM The Dow Chemical Company
CTY Midland
STA MI
COD 02
CLAS
OCL 548239
XCL 544 88
EDF 4
ICL C07D41312
FSC 548
FSS 239
FSC 544
FSS 88
UREF
PNO 3563920
ISD 19710200
NAM Tomalia et al.
OCL 548239
UREF
PNO 3682948
ISD 19720800
NAM Tomalia et al.
OCL 548239
UREF
PNO 3996237
ISD 19761200
NAM Tomalia
OCL 544 88
OREF
PAL Hackh's Chemical Dictionary, 4ed, 1969, McGraw-Hill Book Company, p. 331.
ABST
PAL Polyoxazoline compounds are prepared by reacting a polymercaptan with a
2-alkenyloxazoline or a 2-alkenyloxazine. The compounds so prepared have
at least two oxazoline or oxazine functionalities and can be employed in
applications where a compound having an oxazoline or oxazine functionality
has useful activity.
BSUM
PAC BACKGROUND OF THE INVENTION
PAR This invention relates to polyoxazoline and polyoxazine compounds and, in
particular, those compounds having at least two pendant oxazoline or
oxazine functionalities.
PAR Compounds having two or more oxazoline rings are useful as crosslinking
agents. Previously, such compounds have been prepared a number of ways,
each having some limitations. Bisoxazolines have been prepared by the
reaction of dicarboxylic acids with monoethanolamine. This reaction is not
especially clean, and requires difficult purification procedures.
Bisoxazolines have also been prepared by reaction of hydrogen sulfide with
isopropenyl oxazoline. This reaction is only capable of producing
bisoxazolines and, thus, is not useful in preparing higher polyoxazolines.
Another method of preparing polyoxazolines is by homo- or copolymerization
of isopropenyl oxazoline. This procedure is frequently unsatisfactory as
it produces polyoxazolines with many pendant groups, resulting in
inefficient use of the oxazoline rings in crosslinking reactions. Also,
since the polyoxazoline has a high molecular weight, it is often not
convenient to handle and use.
PAR In view of the deficiencies of the prior art, it would be highly desirable
to prepare, in a relatively simple and efficient manner, a compound having
a plurality of oxazoline or oxazine rings.
PAC SUMMARY OF THE INVENTION
PAR The present invention is a compound comprising at least two, preferably at
least three, oxazoline or oxazine functionalities which result from the
reaction of a polymercaptan and a 2-alkenyloxazoline or a
2-alkenyloxazine.
PAC DETAILED DESCRIPTION OF THE INVENTION
PAR The 2-alkenyloxazolines and 2-alkenyloxazines of the present invention have
the general formula:
##STR1##
wherein R is hydrogen or lower alkyl, and each of R.sup.1 -R.sup.4 is
independently hydrogen, alkyl, aralkyl, phenyl or inertly substituted
phenyl; and n is zero or one. Examples of suitable 2-alkenyloxazolines and
2-alkenyloxazines and their methods of preparation are catalogued in U.S.
Pat. Nos. 3,505,297 and 4,144,211, which are incorporated herein by
reference. Examples of preferred 2-alkenyloxazolines include
2-isopropenyloxazoline, 2-vinyloxazoline, and
5-methyl-2-isopropenyloxazoline.
PAR The polymercaptans of this invention are selected from a known class of
compounds having many members, and any member of this class which reacts
with vinyl functionalities can be employed. Polymercaptans, for purposes
of this invention, contain at least two, preferably at least three,
mercapto (i.e., --SH) groups. Although the structure of the polymercaptan
is not particularly critical and can vary depending upon the desired
application, preferred polymercaptans are those which correspond to the
formula R(--SH).sub.n wherein R is a hydrocarbyl or inertly substituted
hydrocarbyl group of from 1 to about 24 carbon atoms, most preferably from
1 to about 6 carbon atoms; and n is from 2 to about 10, preferably from 3
to about 10, most preferably from 3 to about 6. Examples of especially
preferred polymercaptans are pentaerythritol tetra-3-mercaptopropionate
and dipentaerythritol hexa-3-mercaptopropionate.
PAR Catalysts useful herein include the known free radical generating
catalysts, such as the organic peroxides, the azobis compounds, actinic
light, electron beams or other high energy radiation. Particularly useful
catalysts include the amine catalysts such as triethyl amine.
PAR The amount of polymercaptan which is employed is most preferably an amount
such that there are an equivalent number of mercapto groups and
2-alkenyloxazolines or 2-alkenyloxazines. Alternatively, there can be a
slight excess of mercapto groups over 2-alkenyloxazolines or
2-alkenyloxazines.
PAR Compounds of this invention which contain two or more oxazine or oxazoline
rings are prepared at very high yield in very high purity. Preferably, the
desired 2-alkenyloxazoline or 2-alkenyloxazine is added to an inert
organic solvent with an effective amount of a suitable catalyst and the
desired polymercaptan. It is desirable to heat the reaction mixture, as
for example by reflux, for a period from 1/2 to about 5 hours. The
compound is prepared quickly and efficiently, is obtained without
significant by-product and no further purification is required.
PAR The compounds of this invention can be reacted with ethylenically
unsaturated carboxylic acids as described in U.S. Pat. No. 3,996,237 which
is incorporated herein by reference. Thus, the compounds of this invention
are particularly useful in introducing crosslinking capabilities to
numerous polymers prepared from ethylenically unsaturated monomers.
PAR In addition, the compounds of this invention are useful in the various
other applications where a compound comprising a pendant oxazine or
oxazoline ring is known to have useful activity. Of particular interest is
the preparation of highly branched and/or crosslinked polyoxazoline and
polyoxazine networks. For example, the compounds of this invention can be
contacted with an alkyloxazine or alkyloxazoline under conditions suitable
to bring about oxazoline or oxazine polymerization. It is understood that
the greater the number of pendant oxazoline or oxazine rings per molecule
of compound of this invention, the greater the branching or network form
of the polymer so prepared.
DETD
PAR The following example is presented to illustrate the invention and should
not be construed as limiting its scope.
PAC EXAMPLE 1
PAR To a solution comprising 5 g of pentaerythritol tetra-3-mercaptopropionate
in 15 ml of toluene is added 4.30 g of isopropenyl oxazoline and 0.15 ml
of triethylamine. The solution is heated to reflux for 2 hours and allowed
to cool to room temperature. The solvent is removed under reduced pressure
to yield the tetraoxazoline which is a clear viscous product. The product
has the structure:
##STR2##
PAC EXAMPLE 2
PAR To a solution comprising 50 g of dipentaerythritol
hexa-3-mercaptopropionate in 200 ml of acetonitrile is added 42.5 g of
isopropenyl oxazoline and 2 ml of triethylamine. The solution is heated to
reflux at 80.degree. C. for 5 hours and allowed to cool to room
temperature. The solution is stirred for 2 days. The solvent is removed
under reduced pressure to yield the hexaoxazoline which is a viscous oil.
The product has the structure:
##STR3##
CLMS
STM What is claimed is:
NUM 1.
PAR 1. A compound having the structure:
NUM 2.
PAR 2. A compound having the structure:
##STR4##
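
The fixture above uses the tagged plain-text layout of older USPTO grants: a short uppercase tag (`PATN`, `WKU`, `TTL`, `PAR`, ...) opens each field, and long values wrap onto following lines. The diff for `docling/backend/patent_uspto_backend.py` is not rendered on this page, so the following is only a minimal sketch of how such a file could be tokenized, not the backend's actual code. The names `iter_aps_fields`, `TAG_LINE`, and `SAMPLE` are hypothetical, and the sketch assumes that continuation lines start with whitespace in the raw file (the rendered diff collapses that whitespace).

```python
import re
from typing import Iterator, List, Optional, Tuple

# Assumption: a field starts with a 3-4 character uppercase tag at column 0,
# optionally followed by whitespace and a value. Tag-only lines (PATN, ABST,
# CLMS, ...) open a section and carry an empty value.
TAG_LINE = re.compile(r"^([A-Z][A-Z0-9]{2,3})(?:\s+(.*))?$")


def iter_aps_fields(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (tag, value) pairs, joining wrapped continuation lines."""
    tag: Optional[str] = None
    parts: List[str] = []
    for raw in text.splitlines():
        if not raw.strip():
            continue  # skip blank lines
        # Assumption: indented lines continue the previous field.
        match = None if raw[0].isspace() else TAG_LINE.match(raw)
        if match:
            if tag is not None:
                yield tag, " ".join(parts)
            tag = match.group(1)
            value = match.group(2)
            parts = [value.strip()] if value else []
        elif tag is not None:
            # Untagged lines such as "##STR1##" also fall through to here.
            parts.append(raw.strip())
    if tag is not None:
        yield tag, " ".join(parts)


# A shortened excerpt of the fixture, with assumed original indentation.
SAMPLE = """PATN
WKU  045931038
TTL  Polyoxazoline compounds
ABST
PAL  Polyoxazoline compounds are prepared by reacting a polymercaptan with a
      2-alkenyloxazoline or a 2-alkenyloxazine.
"""

if __name__ == "__main__":
    for tag, value in iter_aps_fields(SAMPLE):
        print(f"{tag}: {value}")
```

A generator keeps the tokenizer independent of whatever document model consumes the fields; grouping fields under section tags such as `INVT` or `UREF` would be a separate step on top of this stream.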