The purpose of this API Wrapper is to extend the functionality of the PatentsView API. The wrapper can take in a list of values (such as patent numbers), retrieve multiple data points, and then convert and merge the results into a CSV file.
- Clone or download this repository
git clone https://github.com/CSSIP-AIR/PatentsView-APIWrapper.git
- Install dependencies
cd PatentsView-APIWrapper
pip install -r requirements.txt
- Modify the sample config file sample_config.cfg or create a copy with your own configuration settings
- Run the API Wrapper using Python 3:
python api_wrapper.py sample_config.cfg
The PatentsView API Wrapper reads in query specifications from the configuration file you point it to. The configuration file should define at least one query. Below is a description of each parameter that defines each query.
The name of the query, and the name given to the resulting file (for example, [QUERY1] produces QUERY1.csv). If your configuration file contains multiple queries, each query should have a distinct name. Query parameters must directly follow the query name.
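As an illustration of how query names map to output files, here is a minimal sketch that assumes the configuration file follows the INI-style layout readable by Python's standard configparser module. The wrapper's own parsing code is not shown in this document, so treat this as an approximation; the directory value is a placeholder.

```python
import configparser

# Hypothetical config text with two queries; the directory value is a placeholder.
config_text = """
[QUERY1]
directory = "/tmp/results"

[QUERY2]
directory = "/tmp/results"
"""

parser = configparser.ConfigParser()
parser.read_string(config_text)

# Each section name is used as the base name of the resulting CSV file.
output_files = [f"{name}.csv" for name in parser.sections()]
print(output_files)  # ['QUERY1.csv', 'QUERY2.csv']
```

In its default strict mode, configparser raises DuplicateSectionError when two sections share a name, which is one reason to give each query a distinct name.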
The type of object you want to return. This must be one of the PatentsView API endpoints:
"patents"
"inventors"
"assignees"
"locations"
"cpc_subsections"
"uspc_mainclasses"
"nber_subcategories"
The name or relative path of the input file containing the values you want to query. For example, sample_config.cfg points to sample_patents.txt, which contains a list of patent numbers; the API wrapper will query for each of these patents.
The absolute path of the directory that contains your input file and where the results will be written. Use forward slashes (/) instead of backward slashes (\). For Windows, this may look like:
directory = "C:/Users/jsennett/Code/PatentsView-APIWrapper"
For OSX/Unix systems:
directory = "/Users/jsennett/Code/PatentsView-APIWrapper"
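If you copy a path from Windows Explorer, it will contain backward slashes. The following sketch shows one way to convert such a path to the forward-slash form before pasting it into the config file. The helper name to_forward_slashes is hypothetical and is not part of the wrapper.

```python
from pathlib import PureWindowsPath

def to_forward_slashes(path: str) -> str:
    """Return a Windows-style path rewritten with forward slashes."""
    return PureWindowsPath(path).as_posix()

# A raw string keeps the backslashes from being treated as escape sequences.
print(to_forward_slashes(r"C:\Users\jsennett\Code\PatentsView-APIWrapper"))
# C:/Users/jsennett/Code/PatentsView-APIWrapper
```

PureWindowsPath only manipulates the string and never touches the file system, so this runs on any operating system.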
The type of object represented in the input file. The full list of input types can be found in the PatentsView API Documentation. Common input types include:
"patent_number"
"inventor_id"
"assignee_id"
"cpc_subsection_id"
"location_id"
"uspc_mainclass_id"
The fields that will be returned in the results. Valid fields for each endpoint can be found in the PatentsView API Documentation. Fields should be specified as an array of strings, such as:
fields = ["patent_number", "patent_title", "patent_date"]
Additional rules, written in the PatentsView API syntax, to be applied to each query. Each criterion can specify multiple rules combined with OR or AND operators. If multiple criteria are listed, they will be combined with the AND operator. Multiple criteria should be named criteria1, criteria2, criteria3, etc.
For example, the following criteria will limit results to patents from Jan 1 to Dec 31, 2015 with a patent abstract containing either "cell" or "mobile".
criteria1 = {"_gte":{"patent_date":"2015-1-1"}}
criteria2 = {"_lte":{"patent_date":"2015-12-31"}}
criteria3 = {"_or":[{"_contains":{"patent_abstract":"cell"}}, {"_contains":{"patent_abstract":"mobile"}}]}
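To make the AND combination concrete, here is a rough sketch of how one input value and the three criteria could be merged into a single query object using the PatentsView "_and" operator. The build_query helper and the patent number are placeholders for illustration; the wrapper's actual implementation may differ.

```python
import json

# Criteria as JSON strings, in the order they would be listed in the config file.
criteria = [
    '{"_gte":{"patent_date":"2015-1-1"}}',
    '{"_lte":{"patent_date":"2015-12-31"}}',
    '{"_or":[{"_contains":{"patent_abstract":"cell"}},'
    ' {"_contains":{"patent_abstract":"mobile"}}]}',
]

def build_query(input_type, value, criteria):
    """Combine one input value with every criterion under a single _and."""
    clauses = [{input_type: value}]
    clauses += [json.loads(text) for text in criteria]
    return {"_and": clauses}

# "7861317" is a placeholder patent number, not taken from the sample files.
query = build_query("patent_number", "7861317", criteria)
print(json.dumps(query, indent=2))
```

Passing each criterion through json.loads also catches syntax problems such as an unbalanced brace before any request is sent.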
The fields and directions over which the output file will be sorted. This should be specified as an array of JSON objects, pairing the field with its direction. The sort order will follow the array order. To sort just by patent number (ascending):
sort = [{"patent_number": "asc"}]
To sort first by patent_date (descending), and then by patent title (ascending):
sort = [{"patent_date": "desc"}, {"patent_title": "asc"}]
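The effect of array order on sorting can be illustrated with a small local example. This sketch applies a sort specification to a few made-up rows; it only demonstrates the intended ordering and is not the wrapper's own sorting code.

```python
import json

sort_spec = json.loads('[{"patent_date": "desc"}, {"patent_title": "asc"}]')

# Made-up rows, for illustration only.
rows = [
    {"patent_date": "2015-03-01", "patent_title": "Beta"},
    {"patent_date": "2015-06-15", "patent_title": "Alpha"},
    {"patent_date": "2015-03-01", "patent_title": "Alpha"},
]

# Apply the rules from last to first. Python's sort is stable, so the
# first rule in the array ends up with the highest priority.
for rule in reversed(sort_spec):
    (field, direction), = rule.items()
    rows.sort(key=lambda row: row[field], reverse=(direction == "desc"))

for row in rows:
    print(row["patent_date"], row["patent_title"])
# 2015-06-15 Alpha
# 2015-03-01 Alpha
# 2015-03-01 Beta
```

The two rows sharing a date are ordered by title, which shows the second rule acting only as a tie-breaker.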
The API wrapper is currently compatible with Python 3.
Users are free to use, share, or adapt the material for any purpose, subject to the standards of the Creative Commons Attribution 4.0 International License.
Attribution should be given to PatentsView for use, distribution, or derivative works.