Create a dataset to investigate the prevalence of Open Science practices in European patent literature by
- extracting scholarly publications cited in patents (NPL) issued by European Patent Offices
- identifying the OA status of NPLs
- linking to other sources with rich scholarly data
This repository keeps track of the work using dynamic notebooks (R Markdown, Jupyter Notebook).
Source code supplement for
Jahn, Najko, Klebel, Thomas, Pride, David, & Ross-Hellauer, Tony. (2021). ON-MERRIT D4.3 Quantifying the influence of Open Access on innovation and patents (1.0). Zenodo.
- d4.3.Rmd contains code used for presenting results in the deliverable. Resulting plots can be found in figure/.
- Done via Google BigQuery Patent datasets. See sql/ for corresponding SQL code. SQL statements were run in d4.3.Rmd.
- Script; the resulting datasets are attached to the GitHub release because of their file size. We used a local snapshot from Unpaywall. See sql/enrich_oa_with_patent_md.sql as well as the corresponding code in d4.3.Rmd.
- [analysis/extract_ids.R] demonstrates how we extracted IDs from the repositories PMC and arXiv as a further source of OA evidence
- [analysis/extract_ids.R] demonstrates how we extracted IDs from arXiv, including the DOI of the published version, if available
- cr_prepints.R shows how preprint information was gathered from Crossref.
- Created a random sample of 10,000 NPLs
- Extraction NPL
- Pre-Evaluation
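
The ID extraction step listed above (pulling PMC and arXiv identifiers out of NPL citation strings) can be illustrated with a small Python sketch. This is not the code in analysis/extract_ids.R: the regular expressions, the function name `extract_ids`, and the example strings are assumptions made for illustration, and the actual rules used in the repository may differ.

```python
import re

# Hypothetical patterns, not taken from the repository.
# Prefix covers "arXiv:1234.5678", "arXiv 1234.5678" and arxiv.org/abs|pdf URLs.
_ARXIV_PREFIX = r"arxiv(?:\.org/(?:abs|pdf)/|\s*:\s*|\s+)"

# New-style arXiv IDs (since April 2007), e.g. 1706.03762; version suffix is ignored.
ARXIV_NEW = re.compile(_ARXIV_PREFIX + r"(\d{4}\.\d{4,5})", re.IGNORECASE)

# Old-style arXiv IDs, e.g. hep-th/9901001 or math.GT/0309136.
ARXIV_OLD = re.compile(
    _ARXIV_PREFIX + r"([a-z\-]+(?:\.[a-z]{2})?/\d{7})", re.IGNORECASE
)

# PubMed Central IDs, e.g. PMC1234567 (case-sensitive to limit false positives).
PMCID = re.compile(r"\bPMC(\d+)\b")


def extract_ids(citation):
    """Return the first arXiv ID and PMCID found in a citation string, or None."""
    arxiv = ARXIV_NEW.search(citation) or ARXIV_OLD.search(citation)
    pmc = PMCID.search(citation)
    return {
        "arxiv": arxiv.group(1) if arxiv else None,
        "pmcid": f"PMC{pmc.group(1)}" if pmc else None,
    }


if __name__ == "__main__":
    # Made-up citation strings for demonstration only.
    examples = [
        "Doe J., Some title, arXiv:1706.03762v5 (2017)",
        "See https://arxiv.org/abs/hep-th/9901001",
        "Doe J., J Example 2015; PMC1234567",
    ]
    for text in examples:
        print(extract_ids(text))
```

Regex matching like this trades recall for simplicity: it only finds identifiers that are written out explicitly in the citation string, so it complements rather than replaces DOI-based matching against a source such as Unpaywall.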
This work is licensed under CC0. Using CC0, we waive all copyrights and related or neighboring rights that we may have in all jurisdictions worldwide.