Presubmission inquiry - openalexR: interacts with OpenAlex API #557

Closed
1 of 20 tasks
trangdata opened this issue Oct 7, 2022 · 8 comments

@trangdata

Submitting Author Name: Trang Le
Submitting Author Github Handle: @trangdata
Other Package Authors Github handles: (comma separated, delete if none) @massimoaria
Repository: https://github.com/massimoaria/openalexR
Submission type: Pre-submission
Language: en


  • Paste the full DESCRIPTION file inside a code block below:
Type: Package
Package: openalexR
Title: Getting Bibliographic Records from 'OpenAlex' Database Using 'DSL'
    API
Version: 1.0.1
Authors@R: c(
    person(given = "Massimo",
           family = "Aria",
           role = c("aut", "cre", "cph"),
           email = "[email protected]",
           comment = c(ORCID = "0000-0002-8517-9411")),
    person(given = "Corrado",
           family = "Cuccurullo",
           role = c("ctb"),
           email = "[email protected]",
           comment = c(ORCID = "0000-0002-7401-8575")),       
    person(given = "Trang",
           family = "Le",
           role = "aut",
           email = "[email protected]",
           comment = c(ORCID = "0000-0003-3737-6565"))
    )
Description: A set of tools to extract bibliographic content from
    'OpenAlex' database using API <https://docs.openalex.org/api/>.
License: MIT + file LICENSE
URL: https://github.com/massimoaria/openalexR,
    https://massimoaria.github.io/openalexR/
BugReports: https://github.com/massimoaria/openalexR/issues
Imports: 
    curl,
    httr,
    jsonlite,
    progress,
    tibble
Suggests: 
    testthat (>= 3.0.0),
    dplyr,
    knitr,
    rmarkdown,
    tidyr,
    purrr,
    ggplot2,
    covr
VignetteBuilder: 
    knitr
Encoding: UTF-8
LazyData: true
RoxygenNote: 7.2.1
Config/testthat/edition: 3
Depends: 
    R (>= 2.10)

Scope

  • Please indicate which category or categories from our package fit policies or statistical package categories this package falls under. (Please check an appropriate box below):

    Data Lifecycle Packages

    • data retrieval
    • data extraction
    • data munging
    • data deposition
    • data validation and testing
    • workflow automation
    • version control
    • citation management and bibliometrics
    • scientific software wrappers
    • field and lab reproducibility tools
    • database software bindings
    • geospatial data
    • text analysis

    Statistical Packages

    • Bayesian and Monte Carlo Routines
    • Dimensionality Reduction, Clustering, and Unsupervised Learning
    • Machine Learning
    • Regression and Supervised Learning
    • Exploratory Data Analysis (EDA) and Summary Statistics
    • Spatial Analyses
    • Time Series Analyses
  • Explain how and why the package falls under these categories (briefly, 1-2 sentences). Please note any areas you are unsure of:
    The package interacts with the OpenAlex API. Similar packages in the same category are rcrossref and rotl.

  • If submitting a statistical package, have you already incorporated documentation of standards into your code via the srr package?
    N/A

  • Who is the target audience and what are scientific applications of this package?
    Anyone who wants to work in R to interact with the OpenAlex API to acquire information on publications, authors, etc., including researchers in fields such as bibliometrics and text mining. We include some example analyses in our README.

  • Are there other R packages that accomplish the same thing? If so, how does yours differ or meet our criteria for best-in-category?
    There are a couple of packages that we're aware of: https://github.com/KTH-Library/openalex and https://github.com/ekmaloney/openalexR. We think our package strikes a balance between complexity and flexibility, follows best practices for API packages, and offers additional useful functionality such as snowball search.

  • (If applicable) Does your package comply with our guidance around Ethics, Data Privacy and Human Subjects Research?
    Yes.

  • Any other questions or issues we should be aware of?:
    None.

@adamhsparks
Member

Hi @trangdata, could you give us a more detailed comparison of the packages' overlap?

Here's an example of a previous inquiry for another package that had some overlap. #199 (comment)

@trangdata
Author

Hi @adamhsparks thank you so much for your time! 🙏🏽 The example comparison is very helpful!

Quick question: do you mean overlap with existing ropensci packages, as in the case of scrubr and CoordinateCleaner (I'm not aware of a similar ropensci package), or overlap between our package and KTH-Library/openalex and ekmaloney/openalexR?

@adamhsparks
Member

I mean the overlap between the packages that you cited as being similar.

@trangdata
Author

trangdata commented Oct 28, 2022

Thank you @adamhsparks for clarifying! 🌻

I'm showing a detailed comparison table below. In short, our package shares some similarity with KTH-Library/openalex, but we offer additional functionality such as snowball search. We also expose functions that are crucial for many use cases, such as filtering for a particular set of works, authors, etc., and generally offer a simpler, more "canonical" way to construct the query (see example). Our package is actively maintained and well tested. We also have far fewer dependencies. Lastly, these other packages currently live only on GitHub and do not belong to any repository (RO, CRAN, or BioC).

We're confident we have satisfied three out of four criteria listed in the guideline's package overlap section. Let me know if there is anything you would like me to elaborate on. 🌈

Table of comparison

| Type | Property | massimoaria/openalexR | KTH-Library/openalex | ekmaloney/openalexR |
|---|---|---|---|---|
| Metadata | last commit | 11 days ago | 8 months ago | 9 months ago |
| Metadata | # vignettes | 4 | 0 | 0 |
| Metadata | # readme examples | 10 | 1 | 4 |
| Metadata | test coverage | 89% | N/A | N/A |
| Metadata | # commits | 198 | 7 | 20 |
| Metadata | # pull requests | 17 | 0 | 0 |
| Metadata | # stars | 26 | 8 | 7 |
| Metadata | license | MIT | MIT | MIT |
| Metadata | Imports | curl, httr, jsonlite, progress, tibble | httr, magrittr, utils, dplyr, purrr, progress, jsonlite, tibble | magrittr, dplyr, httr, jsonlite, purrr, tidyr, stringr, tibble |
| Function | request data from query | oa_request | openalex_crawl | openalex_api |
| Function | enter polite pool | openalexR.mailto | openalex_polite | |
| Function | generate a valid query | oa_query | openalex_query | construct_links |
| Function | convert the JSON object into a classical bibliographic tibble | oa2df | openalex_flatten_long | clean_author_info, clean_venue_info, clean_works_info |
| Function | compose functions to ease data fetching | oa_fetch | N/A | find_work, find_author, find_institution, find_concept, find_venue |
| Function | display openalex attribution | | openalex_attribution | |
| Function | snowballing | oa_snowball | | |
| Function | convert works to bibliometrix object | oa2bibliometrix | | |
| Function | get random entity | oa_random | | |
| Function | simplify works result | show_works | | |
| Function | simplify authors result | show_authors | | |
| Function | available entities in the OpenAlex database | oa_entities | | |
| Function | get papers for a particular author | oa_fetch (see example) | openalex_flatten_long (see example) | get_authors_papers |
| Function | get coauthors for a particular author | | | get_coauthors |
| Function | other functions, unknown use case | | | get_links_for_each_page, get_number_of_pages, get_all_data_for_query |
| Data | list of countries and their alpha-2 and alpha-3 codes | countrycode | | |
| Data | 0-level concepts and corresponding abbreviations | concept_abbrev | | |

Example

Getting a dataframe of papers published in 2022 by a specific author.

Our way:

oa_fetch(
  "works",
  authors.orcid = "0000-0002-8517-9411",
  publication_year = 2022
)

KTH-Library/openalex's way:

openalex_flatten_long(openalex_crawl(
  "works", 
  query = openalex:::openalex_query(
    filter = "authors.orcid:0000-0002-8517-9411,publication_year:2022"
  )
))

ekmaloney/openalexR's way:

Not available. The package doesn't allow for filtering by publication_year.
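
For illustration only, the relationship between the two call styles can be sketched in base R. The helper `build_filter()` below is hypothetical and is not part of any of the three packages; it simply shows how named arguments could be collapsed into the comma-separated `key:value` filter string that appears in the KTH-Library/openalex call above. It makes no claim about how `oa_fetch` is implemented internally.

```r
# Hypothetical helper (an assumption for illustration, not an openalexR function):
# collapse named arguments into an OpenAlex-style filter string.
build_filter <- function(...) {
  args <- list(...)
  # Every filter needs a name, e.g. publication_year = 2022.
  stopifnot(
    length(args) > 0,
    !is.null(names(args)),
    all(names(args) != "")
  )
  values <- vapply(args, as.character, character(1))
  # Join each pair as "key:value", then join the pairs with commas.
  paste0(names(args), ":", values, collapse = ",")
}

build_filter(
  authors.orcid = "0000-0002-8517-9411",
  publication_year = 2022
)
# "authors.orcid:0000-0002-8517-9411,publication_year:2022"
```

Named arguments keep each filter visible at the call site, whereas the string form requires the caller to know the separator conventions.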

@trangdata
Author

Hi @adamhsparks — just wanted to float this up on your list. I'm excited to get the package to ropensci and start the review process. Let me know if there is anything else that you need from me, or if there is anything from the presubmission guide that I should highlight. Thank you again for your time! 🪴

@annakrystalli
Contributor

Hello @trangdata ! 👋🙂

I can confirm that the editorial team feel the package is eligible for submission 🎉

Feel free to open a full submission issue. Let me know if you have any further questions.

@trangdata
Author

That's great news! 🥳 Thank you so much @annakrystalli, @adamhsparks and team! Please feel free to close this issue when ready.

@adamhsparks
Member

adamhsparks commented Nov 8, 2022

Thanks, @trangdata and @annakrystalli. I'll hand everything over to @annakrystalli as the incoming EIC.
