lanl/Library-AI-Toolset

A collection of tools for parsing documents, such as PDFs, and extracting structured elements including URLs, citation contexts, tables, formulas, and figures. The toolset uses AI-based text extraction and classification methods to support a range of document processing tasks.

The toolset includes PDF-to-text converters and parsers that use AI-based text extraction and classification methods. For example, it can classify URI citation contexts by resource type, such as software or datasets, and determine the intent behind software mentions. The toolset also uses structured information from the PDF, such as abstracts and authors, to automate common library tasks, including metadata extraction, topic detection, citation analysis, language detection, and the description of images and media.
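
To make the idea of a URI citation context concrete, here is a minimal sketch of how URLs and their surrounding text might be pulled out of already-extracted plain text. It is not this toolset's API: the function name `extract_url_contexts`, the `window` parameter, and the regular expression are all assumptions made for illustration, and the AI-based classification step is not shown.

```python
import re

# Simplified, hypothetical URL pattern; production-grade matching needs more care.
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+")


def extract_url_contexts(text, window=60):
    """Return one dict per URL found in `text`, with surrounding context.

    `window` is the number of characters kept on each side of the URL.
    """
    results = []
    for match in URL_PATTERN.finditer(text):
        # Drop trailing sentence punctuation that the pattern picked up.
        url = match.group().rstrip(".,;:")
        start = max(0, match.start() - window)
        end = min(len(text), match.end() + window)
        # Collapse line breaks and repeated spaces inside the context.
        context = " ".join(text[start:end].split())
        results.append({"url": url, "context": context})
    return results


sample = (
    "The code is available at https://github.com/example/tool. "
    "We used it to parse all documents."
)
for item in extract_url_contexts(sample):
    print(item["url"], "|", item["context"])
```

Each returned context could then be passed to a classifier that assigns a resource type such as software or dataset.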

License

#LANL O number O4805 © 2024. Triad National Security, LLC. All rights reserved. This program was produced under U.S. Government contract 89233218CNA000001 for Los Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC for the U.S. Department of Energy/National Nuclear Security Administration. All rights in the program are reserved by Triad National Security, LLC, and the U.S. Department of Energy/National Nuclear Security Administration. The Government is granted for itself and others acting on its behalf a nonexclusive, paid-up, irrevocable worldwide license in this material to reproduce, prepare derivative works, distribute copies to the public, perform publicly and display publicly, and to permit others to do so.
