Skip to content

Commit

Permalink
chores: 80 character line [skipci]
Browse files Browse the repository at this point in the history
  • Loading branch information
c1au6i0 committed Dec 31, 2024
1 parent 050c765 commit 7a91be5
Show file tree
Hide file tree
Showing 19 changed files with 220 additions and 153 deletions.
24 changes: 24 additions & 0 deletions R/cache.R
Original file line number Diff line number Diff line change
Expand Up @@ -63,3 +63,27 @@ read_from_cache <- function(file_name, verbose = FALSE) {
}
out
}


#' Run Code in a Temporary Sandbox Environment
#'
#' This function creates a temporary directory and sets it as `R_USER_CACHE_DIR`
#' before executing the provided code block. It is used for testing or running
#' code without affecting the user's default cache directory as required by CRAN for the examples .
#' This function is not designed to be used by package users. Shamelessly "inspired" by
#' some @luciorq code.
#' @param code The code to be executed inside the sandbox. Should be an expression.
#' @param temp_dir A temporary directory created using `temdir()`.
#' @return The result of the executed code.
#' @export
#' @examples
#' with_extr_sandbox(Sys.getenv("R_USER_CACHE_DIR"))
#' with_extr_sandbox(tools::R_user_dir("extractox", "cache"))
with_extr_sandbox <- function(code, temp_dir = tempdir()) {
withr::with_envvar(
new = c("R_USER_CACHE_DIR" = temp_dir),
code = {
eval(substitute(code), envir = parent.frame())
}
)
}
4 changes: 2 additions & 2 deletions R/checks.R
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
" Check for NA values in a specific column of a dataframe
#"
#' Check for NA values in a specific column of a dataframe
#'
#' Checks for NA values in a specified column of a dataframe and optionally warns if any are found.
#'
#' @param dat A dataframe that contains the data.
Expand Down
159 changes: 103 additions & 56 deletions R/extr_comptox.R
Original file line number Diff line number Diff line change
@@ -1,89 +1,138 @@
#' Download and Extract Data from CompTox Chemistry Dashboard
#'
#' This function interacts with the CompTox Chemistry Dashboard to download and extract a wide range of chemical data based on user-defined search criteria. It allows for flexible input types and supports downloading various chemical properties, identifiers, and predictive data.
#' It was inspired by the `ECOTOXr::websearch_comptox` function.
#' This function interacts with the CompTox Chemistry Dashboard to download and
#' extract a wide range of chemical data based on user-defined search criteria.
#' It allows for flexible input types and supports downloading various chemical
#' properties, identifiers, and predictive data. It was inspired by the
#' `ECOTOXr::websearch_comptox` function.
#'
#' @param ids A character vector containing the items to be searched within the CompTox Chemistry Dashboard. These can be chemical names, CAS Registry Numbers (CASRN), InChIKeys, or DSSTox substance identifiers (DTXSID).
#' @param download_items A character vector of items to be downloaded. This includes a comprehensive set of chemical properties, identifiers, predictive data, and other relevant information. By Default, it download all the info
#' @param ids A character vector containing the items to be searched within the
#' CompTox Chemistry Dashboard. These can be chemical names, CAS Registry Numbers
#' (CASRN), InChIKeys, or DSSTox substance identifiers (DTXSID).
#' @param download_items A character vector of items to be downloaded.
#' This includes a comprehensive set of chemical properties, identifiers, predictive
#' data, and other relevant information. By Default, it download all the info
#' \describe{
#' \item{CASRN}{The Chemical Abstracts Service Registry Number, a unique numerical identifier for chemical substances.}
#' \item{INCHIKEY}{The hashed version of the full International Chemical Identifier (InChI) string.}
#' \item{IUPAC_NAME}{The International Union of Pure and Applied Chemistry (IUPAC) name of the chemical.}
#' \item{SMILES}{The Simplified Molecular Input Line Entry System (SMILES) representation of the chemical structure.}
#' \item{CASRN}{The Chemical Abstracts Service Registry Number, a unique numerical
#' identifier for chemical substances.}
#' \item{INCHIKEY}{The hashed version of the full International Chemical
#' Identifier (InChI) string.}
#' \item{IUPAC_NAME}{The International Union of Pure and Applied Chemistry
#' (IUPAC) name of the chemical.}
#' \item{SMILES}{The Simplified Molecular Input Line Entry System (SMILES)
#' representation of the chemical structure.}
#' \item{INCHI_STRING}{The full International Chemical Identifier (InChI) string.}
#' \item{MS_READY_SMILES}{The SMILES representation of the chemical structure, prepared for mass spectrometry analysis.}
#' \item{QSAR_READY_SMILES}{The SMILES representation of the chemical structure, prepared for quantitative structure-activity relationship (QSAR) modeling.}
#' \item{MOLECULAR_FORMULA}{The chemical formula representing the number and type of atoms in a molecule.}
#' \item{AVERAGE_MASS}{The average mass of the molecule, calculated based on the isotopic distribution of the elements.}
#' \item{MONOISOTOPIC_MASS}{The mass of the molecule calculated using the most abundant isotope of each element.}
#' \item{MS_READY_SMILES}{The SMILES representation of the chemical structure,
#' prepared for mass spectrometry analysis.}
#' \item{QSAR_READY_SMILES}{The SMILES representation of the chemical structure,
#' prepared for quantitative structure-activity relationship (QSAR) modeling.}
#' \item{MOLECULAR_FORMULA}{The chemical formula representing the number and type
#' of atoms in a molecule.}
#' \item{AVERAGE_MASS}{The average mass of the molecule, calculated based on the
#' isotopic distribution of the elements.}
#' \item{MONOISOTOPIC_MASS}{The mass of the molecule calculated using the most
#' abundant isotope of each element.}
#' \item{QC_LEVEL}{The quality control level of the data.}
#' \item{SAFETY_DATA}{Safety information related to the chemical.}
#' \item{EXPOCAST}{Exposure predictions from the EPA's ExpoCast program.}
#' \item{DATA_SOURCES}{Sources of the data provided.}
#' \item{TOXVAL_DATA}{Toxicological values related to the chemical.}
#' \item{NUMBER_OF_PUBMED_ARTICLES}{The number of articles related to the chemical in PubMed.}
#' \item{NUMBER_OF_PUBMED_ARTICLES}{The number of articles related to the chemical
#' in PubMed.}
#' \item{PUBCHEM_DATA_SOURCES}{Sources of data from PubChem.}
#' \item{CPDAT_COUNT}{The number of entries in the Chemical and Product Categories Database (CPDat).}
#' \item{IRIS_LINK}{Link to the EPA's Integrated Risk Information System (IRIS) entry for the chemical.}
#' \item{PPRTV_LINK}{Link to the EPA's Provisional Peer-Reviewed Toxicity Values (PPRTV) entry for the chemical.}
#' \item{CPDAT_COUNT}{The number of entries in the Chemical and Product Categories
#' Database (CPDat).}
#' \item{IRIS_LINK}{Link to the EPA's Integrated Risk Information System (IRIS)
#' entry for the chemical.}
#' \item{PPRTV_LINK}{Link to the EPA's Provisional Peer-Reviewed Toxicity Values
#' (PPRTV) entry for the chemical.}
#' \item{WIKIPEDIA_ARTICLE}{Link to the Wikipedia article for the chemical.}
#' \item{QC_NOTES}{Notes related to the quality control of the data.}
#' \item{ABSTRACT_SHIFTER}{Information related to the abstract shifter.}
#' \item{TOXPRINT_FINGERPRINT}{The ToxPrint chemoinformatics fingerprint of the chemical.}
#' \item{ACTOR_REPORT}{The Aggregated Computational Toxicology Resource (ACTOR) report for the chemical.}
#' \item{ACTOR_REPORT}{The Aggregated Computational Toxicology Resource (ACTOR)
#' report for the chemical.}
#' \item{SYNONYM_IDENTIFIER}{Identifiers for synonyms of the chemical.}
#' \item{RELATED_RELATIONSHIP}{Information on related chemicals.}
#' \item{ASSOCIATED_TOXCAST_ASSAYS}{Assays associated with the chemical in the ToxCast database.}
#' \item{ASSOCIATED_TOXCAST_ASSAYS}{Assays associated with the chemical
#' in the ToxCast database.}
#' \item{TOXVAL_DETAILS}{Details of toxicological values.}
#' \item{CHEMICAL_PROPERTIES_DETAILS}{Details of the chemical properties.}
#' \item{BIOCONCENTRATION_FACTOR_TEST_PRED}{Predicted bioconcentration factor from tests.}
#' \item{BOILING_POINT_DEGC_TEST_PRED}{Predicted boiling point in degrees Celsius from tests.}
#' \item{48HR_DAPHNIA_LC50_MOL/L_TEST_PRED}{Predicted 48-hour LC50 for Daphnia in mol/L from tests.}
#' \item{BIOCONCENTRATION_FACTOR_TEST_PRED}{Predicted bioconcentration factor from
#' tests.}
#' \item{BOILING_POINT_DEGC_TEST_PRED}{Predicted boiling point in degrees Celsius
#' from tests.}
#' \item{48HR_DAPHNIA_LC50_MOL/L_TEST_PRED}{Predicted 48-hour LC50 for Daphnia
#' in mol/L from tests.}
#' \item{DENSITY_G/CM^3_TEST_PRED}{Predicted density in g/cm³ from tests.}
#' \item{DEVTOX_TEST_PRED}{Predicted developmental toxicity from tests.}
#' \item{96HR_FATHEAD_MINNOW_MOL/L_TEST_PRED}{Predicted 96-hour LC50 for fathead minnow in mol/L from tests.}
#' \item{FLASH_POINT_DEGC_TEST_PRED}{Predicted flash point in degrees Celsius from tests.}
#' \item{MELTING_POINT_DEGC_TEST_PRED}{Predicted melting point in degrees Celsius from tests.}
#' \item{96HR_FATHEAD_MINNOW_MOL/L_TEST_PRED}{Predicted 96-hour LC50 for fathead
#' minnow in mol/L from tests.}
#' \item{FLASH_POINT_DEGC_TEST_PRED}{Predicted flash point in degrees Celsius
#' from tests.}
#' \item{MELTING_POINT_DEGC_TEST_PRED}{Predicted melting point in degrees Celsius
#' from tests.}
#' \item{AMES_MUTAGENICITY_TEST_PRED}{Predicted Ames mutagenicity from tests.}
#' \item{ORAL_RAT_LD50_MOL/KG_TEST_PRED}{Predicted oral LD50 for rats in mol/kg from tests.}
#' \item{SURFACE_TENSION_DYN/CM_TEST_PRED}{Predicted surface tension in dyn/cm from tests.}
#' \item{THERMAL_CONDUCTIVITY_MW_M×K_TEST_PRED}{Predicted thermal conductivity in mW/m×K from tests.}
#' \item{TETRAHYMENA_PYRIFORMIS_IGC50_MOL/L_TEST_PRED}{Predicted IGC50 for Tetrahymena pyriformis in mol/L from tests.}
#' \item{ORAL_RAT_LD50_MOL/KG_TEST_PRED}{Predicted oral LD50 for rats in mol/kg
#' from tests.}
#' \item{SURFACE_TENSION_DYN/CM_TEST_PRED}{Predicted surface tension in dyn/cm
#' from tests.}
#' \item{THERMAL_CONDUCTIVITY_MW_M×K_TEST_PRED}{Predicted thermal conductivity
#' in mW/m×K from tests.}
#' \item{TETRAHYMENA_PYRIFORMIS_IGC50_MOL/L_TEST_PRED}{Predicted IGC50 for Tetrahymena
#' pyriformis in mol/L from tests.}
#' \item{VISCOSITY_CP_CP_TEST_PRED}{Predicted viscosity in cP from tests.}
#' \item{VAPOR_PRESSURE_MMHG_TEST_PRED}{Predicted vapor pressure in mmHg from tests.}
#' \item{WATER_SOLUBILITY_MOL/L_TEST_PRED}{Predicted water solubility in mol/L from tests.}
#' \item{ATMOSPHERIC_HYDROXYLATION_RATE_\(AOH\)_CM3/MOLECULE\*SEC_OPERA_PRED}{Predicted atmospheric hydroxylation rate in cm³/molecule\*sec from OPERA.}
#' \item{VAPOR_PRESSURE_MMHG_TEST_PRED}{Predicted vapor pressure in mmHg from
#' tests.}
#' \item{WATER_SOLUBILITY_MOL/L_TEST_PRED}{Predicted water solubility in mol/L
#' from tests.}
#' \item{ATMOSPHERIC_HYDROXYLATION_RATE_\(AOH\)_CM3/MOLECULE\*SEC_OPERA_PRED}{Predicted
#' atmospheric hydroxylation rate in cm³/molecule\*sec from OPERA.}
#' \item{BIOCONCENTRATION_FACTOR_OPERA_PRED}{Predicted bioconcentration factor from OPERA.}
#' \item{BIODEGRADATION_HALF_LIFE_DAYS_DAYS_OPERA_PRED}{Predicted biodegradation half-life in days from OPERA.}
#' \item{BOILING_POINT_DEGC_OPERA_PRED}{Predicted boiling point in degrees Celsius from OPERA.}
#' \item{HENRYS_LAW_ATM-M3/MOLE_OPERA_PRED}{Predicted Henry's law constant in atm-m³/mole from OPERA.}
#' \item{BIODEGRADATION_HALF_LIFE_DAYS_DAYS_OPERA_PRED}{Predicted biodegradation
#' half-life in days from OPERA.}
#' \item{BOILING_POINT_DEGC_OPERA_PRED}{Predicted boiling point in degrees
#' Celsius from OPERA.}
#' \item{HENRYS_LAW_ATM-M3/MOLE_OPERA_PRED}{Predicted Henry's law constant
#' in atm-m³/mole from OPERA.}
#' \item{OPERA_KM_DAYS_OPERA_PRED}{Predicted Km in days from OPERA.}
#' \item{OCTANOL_AIR_PARTITION_COEFF_LOGKOA_OPERA_PRED}{Predicted octanol-air partition coefficient (log Koa) from OPERA.}
#' \item{SOIL_ADSORPTION_COEFFICIENT_KOC_L/KG_OPERA_PRED}{Predicted soil adsorption coefficient (Koc) in L/kg from OPERA.}
#' \item{OCTANOL_WATER_PARTITION_LOGP_OPERA_PRED}{Predicted octanol-water partition coefficient (log P) from OPERA.}
#' \item{MELTING_POINT_DEGC_OPERA_PRED}{Predicted melting point in degrees Celsius from OPERA.}
#' \item{OCTANOL_AIR_PARTITION_COEFF_LOGKOA_OPERA_PRED}{Predicted octanol-air
#' partition coefficient (log Koa) from OPERA.}
#' \item{SOIL_ADSORPTION_COEFFICIENT_KOC_L/KG_OPERA_PRED}{Predicted soil adsorption
#' coefficient (Koc) in L/kg from OPERA.}
#' \item{OCTANOL_WATER_PARTITION_LOGP_OPERA_PRED}{Predicted octanol-water partition
#' coefficient (log P) from OPERA.}
#' \item{MELTING_POINT_DEGC_OPERA_PRED}{Predicted melting point in degrees Celsius
#' from OPERA.}
#' \item{OPERA_PKAA_OPERA_PRED}{Predicted pKa (acidic) from OPERA.}
#' \item{OPERA_PKAB_OPERA_PRED}{Predicted pKa (basic) from OPERA.}
#' \item{VAPOR_PRESSURE_MMHG_OPERA_PRED}{Predicted vapor pressure in mmHg from OPERA.}
#' \item{WATER_SOLUBILITY_MOL/L_OPERA_PRED}{Predicted water solubility in mol/L from OPERA.}
#' \item{EXPOCAST_MEDIAN_EXPOSURE_PREDICTION_MG/KG-BW/DAY}{Predicted median exposure from ExpoCast in mg/kg-bw/day.}
#' \item{VAPOR_PRESSURE_MMHG_OPERA_PRED}{Predicted vapor pressure in mmHg from
#' OPERA.}
#' \item{WATER_SOLUBILITY_MOL/L_OPERA_PRED}{Predicted water solubility in mol/L
#' from OPERA.}
#' \item{EXPOCAST_MEDIAN_EXPOSURE_PREDICTION_MG/KG-BW/DAY}{Predicted median
#' exposure from ExpoCast in mg/kg-bw/day.}
#' \item{NHANES}{National Health and Nutrition Examination Survey data.}
#' \item{TOXCAST_NUMBER_OF_ASSAYS/TOTAL}{Number of assays in ToxCast.}
#' \item{TOXCAST_PERCENT_ACTIVE}{Percentage of active assays in ToxCast.}
#' }
#' @param mass_error Numeric value indicating the mass error tolerance for searches involving mass data. Default is `0`.
#' @param verify_ssl Logical value indicating whether SSL certificates should be verified. Default is `FALSE`. Note that this argument is not used
#' on linux OS.
#' @param verbose A logical value indicating whether to print detailed messages. Default is TRUE.
#' @param ... Additional arguments passed to `httr2::req_options()`. Note that this argument is not used
#' on linux OS.
#' @param mass_error Numeric value indicating the mass error tolerance for searches
#' involving mass data. Default is `0`.
#' @param verify_ssl Logical value indicating whether SSL certificates should be
#' verified. Default is `FALSE`. Note that this argument is not used on linux OS.
#' @param verbose A logical value indicating whether to print detailed messages.
#' Default is TRUE.
#' @param ... Additional arguments passed to `httr2::req_options()`.
#' @details
#' Please note that this function, which pulls data from EPA servers, may encounter issues on some Linux systems.
#' This is because those servers do not accept secure legacy renegotiation. On Linux systems, the current function depends
#' on \code{curl} and \code{OpenSSL}, which have known problems with unsafe legacy renegotiation in newer versions.
#' One workaround is to downgrade to \code{curl v7.78.0} and \code{OpenSSL v1.1.1}.
#' However, please be aware that using these older versions might introduce potential security vulnerabilities.
#' Refer to \href{https://gist.github.com/c1au6i0/5cc2d87966340a31032ffebf1cfb657c}{this gist} for instructions on how to downgrade \code{curl} and \code{OpenSSL} on Ubuntu.
#' Please note that this function, which pulls data from EPA servers, may encounter
#' issues on some Linux systems. This is because those servers do not accept secure
#' legacy renegotiation. On Linux systems, the current function depends on \code{curl}
#' and \code{OpenSSL}, which have known problems with unsafe legacy renegotiation
#' in newer versions. One workaround is to downgrade to \code{curl v7.78.0} and
#' \code{OpenSSL v1.1.1}. However, please be aware that using these older versions
#' might introduce potential security vulnerabilities. Refer to
#' \href{https://gist.github.com/c1au6i0/5cc2d87966340a31032ffebf1cfb657c}{this gist}
#' for instructions on how to downgrade \code{curl} and \code{OpenSSL} on Ubuntu.
#' @seealso \href{https://www.epa.gov/comptox-tools/comptox-chemicals-dashboard-resource-hub}{CompTox Chemicals Dashboard Resource Hub}
#' @return A cleaned data frame containing the requested data from CompTox.
#' @export
Expand Down Expand Up @@ -226,7 +275,6 @@ extr_comptox_ <- function(ids,
identifier_types <- c("chemical_name", "CASRN", "INCHIKEY", "dtxsid")
input_type <- "IDENTIFIER"


download_items <- c("DTXCID", download_items)

params <- list(
Expand Down Expand Up @@ -259,7 +307,6 @@ extr_comptox_ <- function(ids,
}
)


msg <- "Failed to perform the request: {conditionMessage(error_result)}"

if (!is.null(error_result)) {
Expand Down
Loading

0 comments on commit 7a91be5

Please sign in to comment.