Improve encoding non-ASCII characters to LaTeX in toBiblatex (#113)
* Remove appveyor CI

Signed-off-by: Mathew W. McLean <[email protected]>

* Avoid msg about page numbers in ReadPDFs for 1-page PDFs

* Always use pdfinfo to determine number of pages in each PDF in ReadPDFs
even when use.metadata is FALSE
* This avoids a message from pdftotext about incorrect pages
* Refactor test-readPDF.R to make it clearer when tests are skipped
if poppler is not installed

Signed-off-by: Mathew W. McLean <[email protected]>

* Fix pkg documentation for latest roxygen2

Signed-off-by: Mathew W. McLean <[email protected]>

* Fixes for non-ASCII name list fields in toBiblatex and toBibtex

* Add latexify() from dplR instead of tools::encoded_text_to_latex
to improve conversion of non-ASCII characters to valid LaTeX
* Fixes #102, #105, #106

Signed-off-by: Mathew W. McLean <[email protected]>

* Update documentation for latexify author

Signed-off-by: Mathew W. McLean <[email protected]>

* Add website to DESCRIPTION

Signed-off-by: Mathew W. McLean <[email protected]>

* Use default value for rettype in ReadPubMed

* cf. https://www.ncbi.nlm.nih.gov/books/NBK25499/table/chapter4.T._valid_values_of__retmode_and/
* Add mock response for one GetPubMedByID test

Signed-off-by: Mathew W. McLean <[email protected]>

* Add more sleep for PubMed/Entrez test

Signed-off-by: Mathew W. McLean <[email protected]>

---------

Signed-off-by: Mathew W. McLean <[email protected]>
mwmclean authored Nov 19, 2024
1 parent 2619809 commit d616da5
Showing 16 changed files with 449 additions and 144 deletions.
63 changes: 0 additions & 63 deletions .appveyor.yml

This file was deleted.

14 changes: 9 additions & 5 deletions DESCRIPTION
@@ -1,9 +1,11 @@
 Package: RefManageR
-Version: 1.4.0
+Version: 1.4.3
 Title: Straightforward 'BibTeX' and 'BibLaTeX' Bibliography Management
-Authors@R: person(c("Mathew", "W."), "McLean", role = c("aut", "cre"),
+Authors@R: c(person(c("Mathew", "W."), "McLean", role = c("aut", "cre"),
 email = "[email protected]",
-comment = c(ORCID = "0000-0002-7891-9645"))
+comment = c(ORCID = "0000-0002-7891-9645")),
+person("Andy", "Bunn", role = "ctb",
+email = "[email protected]", comment = "function latexify used by toBiblatex"))
 Maintainer: Mathew W. McLean <[email protected]>
 Description: Provides tools for importing and working with bibliographic
 references. It greatly enhances the 'bibentry' class by providing a class
@@ -27,6 +29,8 @@ Imports:
 httr,
 lubridate (>= 1.5.0),
 stringr,
+stringi,
+R.utils,
 methods,
 bibtex (>= 0.4.1)
 Suggests:
@@ -38,5 +42,5 @@ Depends:
 R (>= 3.0)
 VignetteBuilder: knitr
 BugReports: https://github.com/ropensci/RefManageR/issues
-URL: https://github.com/ropensci/RefManageR/
-RoxygenNote: 7.2.1
+URL: https://github.com/ropensci/RefManageR/, https://docs.ropensci.org/RefManageR/
+RoxygenNote: 7.3.2
5 changes: 4 additions & 1 deletion NAMESPACE
@@ -49,6 +49,7 @@ export(as.BibEntry)
 export(fields)
 export(is.BibEntry)
 export(toBiblatex)
+importFrom(R.utils,captureOutput)
 importFrom(bibtex,do_read_bib)
 importFrom(httr,GET)
 importFrom(httr,POST)
@@ -73,6 +74,9 @@ importFrom(methods,hasArg)
 importFrom(plyr,llply)
 importFrom(plyr,progress_text)
 importFrom(stats,setNames)
+importFrom(stringi,stri_trans_nfc)
+importFrom(stringi,stri_trans_nfd)
+importFrom(stringi,stri_unescape_unicode)
 importFrom(stringr,str_length)
 importFrom(stringr,str_sub)
 importFrom(stringr,str_trim)
@@ -82,7 +86,6 @@ importFrom(tools,Rd2txt)
 importFrom(tools,Rd2txt_options)
 importFrom(tools,bibstyle)
 importFrom(tools,deparseLatex)
-importFrom(tools,encoded_text_to_latex)
 importFrom(tools,getBibstyle)
 importFrom(tools,latexToUtf8)
 importFrom(tools,loadPkgRdMacros)
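
The NAMESPACE changes above drop tools::encoded_text_to_latex and add stringi imports, presumably in support of the latexify()-based conversion named in the commit message. As a rough illustration of the general idea only, the sketch below maps a handful of non-ASCII characters to LaTeX macros with a small lookup table. It is not the dplR or RefManageR implementation: the function name `to_latex_sketch` and the table contents are assumptions made for this example, and the real latexify() covers far more characters.

```r
# Hypothetical lookup table for illustration; NOT the table used by latexify().
non_ascii <- c("\u00e9", "\u00fc", "\u00f1", "\u00df")
latex     <- c("\\'{e}", "\\\"{u}", "\\~{n}", "{\\ss}")

# Replace each known non-ASCII character with its LaTeX macro,
# leaving every other character untouched.
to_latex_sketch <- function(x) {
  vapply(x, function(s) {
    chars <- strsplit(s, "")[[1]]      # split into single characters
    idx <- match(chars, non_ascii)     # position in the table, NA if absent
    hit <- !is.na(idx)
    chars[hit] <- latex[idx[hit]]
    paste(chars, collapse = "")
  }, character(1), USE.NAMES = FALSE)
}

to_latex_sketch(c("Caf\u00e9", "M\u00fcller"))
```

A table like this only handles precomposed characters it already knows about; a decomposition step (such as the NFD normalization that stringi provides) is one way to generalize it, at the cost of an extra dependency.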
22 changes: 21 additions & 1 deletion R/ReadPDFs.R
@@ -89,8 +89,28 @@ ReadPDFs <- function (path, .enc = 'UTF-8', recursive = TRUE,
   out, pages.idx))
 }else
 {
+  .findPages <- function(files)
+  {
+    n.files <- length(files)
+    page.bounding.boxes <- lapply(files, function(x)
+      system2("pdfinfo", paste(shQuote('-enc'),
+                               shQuote(.enc), shQuote("-box"),
+                               shQuote(normalizePath(x))),
+              stdout = TRUE, stderr = TRUE))
+    pages.idx <- lapply(page.bounding.boxes, grep, pattern = "^Pages:")
+    pages <- rep(Inf, n.files)
+    for (i in seq_along(pages.idx))
+    {
+      if (length(pages.idx[[i]]))
+        pages[i] <- as.integer(sub("^Pages:\\s+(\\d+)", "\\1",
+                                   page.bounding.boxes[[i]][pages.idx[[i]]],
+                                   perl = TRUE))
+    }
+    return(pages)
+  }
+
   doi.meta.ind <- logical(n.files)
-  pages <- rep(Inf, n.files)
+  pages <- .findPages(files)
 }

 ########################################
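
The parsing step inside .findPages can be tried without Poppler installed by feeding it text directly. The sketch below is a minimal stand-alone version of that step, assuming the input is the character vector that `system2("pdfinfo", ..., stdout = TRUE)` would return. The helper name `parse_page_count` and the sample lines are made up for this example and are not part of the package.

```r
# Extract the page count from the lines printed by `pdfinfo`.
# Returns Inf when no "Pages:" line is present, matching the fallback
# value used in the hunk above.
parse_page_count <- function(info_lines) {
  hit <- grep("^Pages:", info_lines, value = TRUE)
  if (length(hit) == 0L) {
    return(Inf)
  }
  as.numeric(sub("^Pages:\\s+(\\d+).*$", "\\1", hit[1L], perl = TRUE))
}

# Made-up sample resembling pdfinfo output; real output has more fields.
sample_info <- c("Title:          Example",
                 "Pages:          1",
                 "Page size:      612 x 792 pts (letter)")
parse_page_count(sample_info)
```

Note that the pattern is anchored on `Pages:` with the colon, so the `Page size:` line is not picked up by mistake.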
4 changes: 1 addition & 3 deletions R/ReadPubMed.R
@@ -114,12 +114,10 @@ GetPubMedByID <- function(id, db = "pubmed", ...){
 parms$id <- paste0(id, collapse=",")

 parms$retmode <- "xml"
-parms$rettype <- "medline"

 base.url <- "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?"
-## temp <- postForm(base.url, .params = parms)
-## tdoc <- xmlParse(temp)
 temp <- POST(base.url, query = parms)
+stop_for_status(temp)
 tdoc <- read_xml(temp)

 ## Note: directly using xpathApply on tdoc won't work if some results are
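
For readers unfamiliar with the request being built here, the following sketch shows the same pattern in isolation: an efetch query with `retmode = "xml"` and no `rettype`, so that the server-side default applies, followed by an explicit status check. The helper names `build_efetch_query` and `fetch_pubmed_xml` and the placeholder IDs are assumptions for this example, not functions from RefManageR. The wrapper needs the httr package and network access, so it is defined but not called.

```r
# Build the query for an efetch request. `rettype` is deliberately left out
# so that the default for the chosen database and retmode is used.
build_efetch_query <- function(id, db = "pubmed") {
  list(db = db, id = paste0(id, collapse = ","), retmode = "xml")
}

# Hypothetical wrapper (requires httr and network access); defined only.
fetch_pubmed_xml <- function(id, db = "pubmed") {
  base.url <- "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?"
  resp <- httr::POST(base.url, query = build_efetch_query(id, db))
  httr::stop_for_status(resp)  # turn HTTP error codes into R errors
  httr::content(resp, as = "text", encoding = "UTF-8")
}

# Placeholder IDs, used only to show how the id field is collapsed.
build_efetch_query(c("123", "456"))
```

Checking the status before parsing means a failed request surfaces as an HTTP error rather than as a confusing XML parsing failure further down.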
20 changes: 4 additions & 16 deletions R/RefManageR-package.R
@@ -1,20 +1,8 @@
#' Import and Manage BibTeX and BibLaTeX references with RefManageR
#'
#' RefManageR provides tools for importing and working with
#' bibliographic references. It greatly enhances the bibentry class by
#' providing a class BibEntry which stores BibTeX and BibLaTeX references,
#' supports UTF-8 encoding, and can be easily searched by any field, by date
#' ranges, and by various formats for name lists (author by last names,
#' translator by full names, etc.). Entries can be updated, combined, sorted, printed
#' in a number of styles, and exported. BibTeX and BibLaTeX .bib files can be
#' read into R and converted to BibEntry objects. Interfaces to NCBI's
#' Entrez, CrossRef, and Zotero are provided for importing references and
#' references can be created from locally stored PDFs using Poppler. Includes
#' functions for citing and generating a bibliography with hyperlinks for
#' documents prepared with RMarkdown or RHTML.
#' @keywords internal
"_PACKAGE"

#' @name RefManageR-package
#' @aliases RefManageR refmanager
#' @docType package
#' @author McLean, M. W. \email{mathew.w.mclean@@gmail.com}
#' @details
#' \bold{Importing and Creating References}
@@ -56,7 +44,7 @@
 #' \code{\link{Cite}}. Its interface is similar to \code{\link{options}}.
 #' @keywords package
 #' @references McLean, M. W. (2014). Straightforward Bibliography Management in R Using the RefManageR Package.
-#' \href{https://arxiv.org/abs/1403.2036}{arXiv: 1403.2036 [cs.DL]}. Submitted.
+#' \href{https://arxiv.org/abs/1403.2036}{arXiv: 1403.2036 [cs.DL]}.
 #' @references Kime, P., M. Wemheuer, and P. Lehman (2022). The biblatex Package.
 #' \url{http://mirrors.ibiblio.org/CTAN/macros/latex/contrib/biblatex/doc/biblatex.pdf}.
 #' @references Hornik, K., D. Murdoch, and A. Zeileis (2012).
1 change: 0 additions & 1 deletion R/WriteBib.R
@@ -21,7 +21,6 @@
 #' @note To write the contents of \code{bib} \dQuote{as is}, the argument
 #' \code{biblatex} should be \code{TRUE}, otherwise
 #' conversion is done as in \code{\link{toBibtex.BibEntry}}.
-#' @importFrom tools encoded_text_to_latex
 #' @author McLean, M. W. based on \code{write.bib} by Gaujoux, R.
 #' in package \code{bibtex}.
 #' @export