updated readmes
b08x committed Mar 28, 2024
1 parent d80a481 commit 8c283f3
Showing 10 changed files with 387 additions and 94 deletions.
4 changes: 3 additions & 1 deletion README.md
@@ -13,7 +13,9 @@ This repository is based on [rubydata/docker-stacks](https://github.com/RubyData

### Minimal Image

### Data Science Image
### LLM Image

### LlamaStuff Image

### NLP Image

20 changes: 7 additions & 13 deletions llamaindex/Dockerfile
@@ -7,21 +7,15 @@ USER $NB_UID

WORKDIR /home/$NB_USER

# List of txtai components to install
ARG COMPONENTS=[all]

RUN \
pip install --no-cache-dir -U pip wheel setuptools && \
pip install --no-cache-dir llama-index-llms-ollama \
llama-index-readers-obsidian llama-index-llms-langchain \
llama-index-graph-stores-nebula llama-index-multi-modal-llms-ollama \
llama-index-readers-file unstructured llama-index-embeddings-huggingface \
llama-index-vector-stores-chroma && \
pip install --no-cache-dir llama_index pyvis IPython && \
pip install --no-cache-dir prompttools && \
pip install --no-cache-dir txtai${COMPONENTS} && \
COPY nlp/requirements.txt .

RUN pip install --no-cache-dir -U pip wheel setuptools && \
pip install -r requirements.txt && \
python3 -m spacy download en_core_web_sm && \
python3 -m spacy download en_core_web_lg && \
python -c "import sys, importlib.util as util; 1 if util.find_spec('nltk') else sys.exit(); import nltk; nltk.download('punkt')"


# NOTE: DO NOT CHANGE the version in the path of gem's bin directory
ENV PATH $HOME/.local/share/gem/ruby/3.1.0/bin:$PATH
ENV BUNDLE_PATH $HOME/.local/share/gem
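
The `python -c` one-liner in the Dockerfile hunk above only downloads NLTK's `punkt` data when `nltk` is importable. Because `sys.exit()` with no argument exits with status 0, a missing `nltk` does not fail the build. A more readable sketch of the same guard might look like the following; the function names `module_available` and `ensure_punkt` are hypothetical and not part of the commit.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if `name` can be imported in the current environment."""
    return importlib.util.find_spec(name) is not None


def ensure_punkt() -> bool:
    """Download NLTK's 'punkt' data if nltk is installed.

    Returns True when a download was attempted, False when nltk is absent.
    """
    if not module_available("nltk"):
        # Same behavior as the one-liner: skip quietly instead of failing.
        return False
    import nltk  # imported lazily so this file loads without nltk installed

    nltk.download("punkt")  # requires network access
    return True
```

Calling `ensure_punkt()` from a small script during the image build would reproduce the one-liner's behavior while keeping the intent visible.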
26 changes: 12 additions & 14 deletions llamaindex/Gemfile
@@ -62,33 +62,31 @@ gem 'tty-prompt'
gem 'tty-screen'
gem 'yaml'


# Install basic gems
gem 'charty', '>= 0.2.12'
gem 'matplotlib', '>= 1.2.0'
gem 'numpy', '>= 0.4.0'
gem 'pandas', '>= 0.3.8'
# gem 'red_amber', '0.4.2'
# gem 'red-arrow', '11.0.0'
# gem 'red-datasets', '>= 0.1.4'
# gem 'red-gandiva', '11.0.0'
# gem 'red-parquet', '11.0.0'
gem 'unicode_plot', '>= 0.0.5'

# Additional gems
gem 'daru'
gem 'daru-view'
gem 'enumerable-statistics'
gem 'ffi-rzmq'
gem 'matplotlib', '>= 1.2.0'
gem 'nmatrix'
gem 'nmatrix-lapacke'
gem 'numo-linalg'
gem 'numo-narray'
gem 'numpy', '>= 0.4.0'
gem 'pandas', '>= 0.3.8'
gem 'rbplotly'
gem 'rumale'
gem 'unicode_plot', '>= 0.0.5'

# Additional gems
# gem 'red-arrow', '11.0.0'
# gem 'red-arrow-numo-narray'
# gem 'red-chainer'
# gem 'red-datasets', '>= 0.1.4'
# gem 'red-datasets-arrow'
# gem 'red-datasets-daru'
# gem 'red-datasets-pandas'
# gem 'red-gandiva', '11.0.0'
# gem 'red-parquet', '11.0.0'
# gem 'red-plasma'
gem 'rumale'
# gem 'red_amber', '0.4.2'
67 changes: 62 additions & 5 deletions llamaindex/README.md
@@ -1,16 +1,73 @@
# RubyData Data Science
# Docker Stacks: llamaindex

```bash

```bash
+--------+ +------------+
| ruby +------>|data science|
+--------+ +------------+
```

## Project Title:
Unchanged.

## Description:
Unchanged.

## Docker Image:
This Docker image provides an environment for machine learning, natural language processing, and data science workflows that combines Python and Ruby tools. Its contents are:

- **Base Image:** A customized Jupyter/scipy-notebook image (Tag: f3079808ca8c or a designated version) that provides the foundational tools.
- **Customization:**
  - **System Updates & Prerequisites:** Installs common development tools and libraries (e.g., git, gcc, build essentials, database clients).
  - **LLVM:** Includes the LLVM compiler infrastructure (version 11).
  - **Python 3.10.7:** Copied directly from the official python:3.10-slim image.
  - **Ruby 3.1.3:** Copied directly from the official rubylang/ruby image.
  - **User & Permissions Setup:** Gives the container's primary user the permissions it needs, including sudo access.
- **Languages:** Python, Ruby
- **Frameworks/Libraries:**
  - **Python:**
    - spaCy: Natural language processing
    - txtai: Semantic search built on embeddings
    - LangChain: Language model chaining
    - llama-index: Indexing and querying data with large language models
    - IPython: Interactive Python shell
    - nbconvert: Convert notebooks to various formats
    - NumPy: Scientific computing
    - Pandas: Data manipulation and analysis
    - Matplotlib: Data visualization
    - ...Include other items from requirements.txt
  - **Ruby:**
    - iruby: Interactive Ruby shell
    - pycall: Call Python code from Ruby
    - dotenv: Environment variable management
    - LangChainRB: Language model chaining for Ruby
    - ruby-openai: OpenAI API client
    - ...Include other items from Gemfile

## Installation:
### Prerequisites:
- Docker installed and operational

### Pull the Image:
```bash
docker pull <image_name>:<image_tag>
```
(Replace `<image_name>` and `<image_tag>` with the appropriate values.)

## Usage:
### Start the container:
```bash
docker run -it -p 8888:8888 <image_name>:<image_tag>
```
### Access Jupyter Notebook:
Navigate to [http://localhost:8888](http://localhost:8888) in your web browser. You'll require the token provided in the container's output to log in.
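
To find the token, look for the login URL in the container's output. As a minimal sketch, assuming the log prints a URL with a `token` query parameter (the exact log format depends on the Jupyter version, and the function name `extract_token` and the sample token are made up for illustration), the token can be pulled out like this:

```python
from urllib.parse import parse_qs, urlparse


def extract_token(log_line: str):
    """Return the `token` query parameter from the first URL in a log line.

    Returns None when the line has no URL carrying a token.
    """
    for word in log_line.split():
        if word.startswith(("http://", "https://")):
            tokens = parse_qs(urlparse(word).query).get("token")
            if tokens:
                return tokens[0]
    return None


# Hypothetical log line; a real token is a long random string.
sample = "    http://127.0.0.1:8888/?token=abc123"
print(extract_token(sample))
```

The same output can be captured with `docker logs <container>` if the container was started in the background.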

https://github.com/red-data-tools/packages.red-data-tools.org
### Example:
Provide a fundamental code example or a concise explanation illustrating a core use case in either Python or Ruby to demonstrate the image's capabilities.

## Additional Notes:
- **Key Changes:** This version of the image supports both Python and Ruby within the Jupyter environment.
- **Ruby Integration:** Use the Ruby gems listed in the Gemfile, or add your own to extend the environment.
- **Customization:** The image can be extended with project-specific dependencies.

- Ruby stack
- pry, iruby, pycall, numpy, pandas, matplotlib, numo-narray, numo-linalg, nmatrix, nmatrix-lapacke, red-arrow, red-arrow-numo-narray, red-arrow-nmatrix, daru, rbplotly, charty
We hope this revised version of the Docker image empowers you to unlock the full potential of your data science and machine learning endeavors.
19 changes: 19 additions & 0 deletions llamaindex/requirements.txt
@@ -0,0 +1,19 @@
llama-index-embeddings-huggingface
llama-index-graph-stores-nebula
llama-index-llms-langchain
llama-index-llms-ollama
llama-index-multi-modal-llms-ollama
llama-index-readers-file
llama-index-readers-obsidian
llama-index-vector-stores-chroma
llama_index
pyvis
IPython
nbconvert[all]
prompttools
pydantic==1.9
spacy>=3.3.0,<3.5.0
txtai[all]
typing_extensions<4.6.0
unstructured
llama-parse