created a pip package #37

Open · wants to merge 1 commit into base: master
10 changes: 3 additions & 7 deletions README.md
@@ -15,7 +15,7 @@ This project aims to address the third using LLaMa.cpp and GGML.

 - Inference Speed! Focus on inference, not training.
 - Precompressed models.
-- Minimal setup required - soon `pip install cformers` should be good to get started.
+- Minimal setup required - `pip install cformers` should be good to get started.
 - Easily switch between models and quantization types.
 - Support variety of prompts.

@@ -26,14 +26,12 @@ And most importantly:

 Setup
 ```bash
-pip install transformers wget
-git clone https://github.com/nolanoOrg/cformers.git
-cd cformers/cformers/cpp && make && cd ..
+pip install cformers
 ```

 Usage:
 ```python
-from interface import AutoInference as AI
+from cformers import AutoInference as AI
 ai = AI('EleutherAI/gpt-j-6B')
 x = ai.generate('def parse_html(html_doc):', num_tokens_to_generate=500)
 print(x['token_str'])
 ```
@@ -58,8 +56,6 @@ chat.py accepts the following parameters:
 - ```-p Tell me a joke``` for a single prompt interaction
 - ```-m pythia``` to load one of the available models (bloom, pythia or gptj)
 
-We are working on adding support for `pip install cformers.`
 
 Following Architectures are supported:
 - GPT-J
 - BLOOM
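Since the README now advertises `pip install cformers` as the whole setup, a quick post-install smoke check can be useful. The sketch below is a hypothetical addition, not part of this diff; it only assumes the package name `cformers` introduced by the PR:

```python
import importlib.util

# Look the package up on sys.path without importing it; a full import would
# pull in the heavy transformers/torch dependencies just to check presence.
spec = importlib.util.find_spec("cformers")
print("installed" if spec is not None else "not installed")
```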
2 changes: 1 addition & 1 deletion cformers/__init__.py
@@ -1,2 +1,2 @@
 """Cformers: SoTA Transformer inference on your CPU."""
-from .interface import AutoModel, AutoTokenizer
+from .interface import AutoInference
7 changes: 5 additions & 2 deletions cformers/interface.py
@@ -7,6 +7,7 @@
 import select
 import wget
 import requests
+import pathlib
 
 import transformers as tf # RIP TensorFlow

@@ -193,11 +194,13 @@ def generate(self,
             f"Prompt should be a list of integers {prompt}"
         # Convert to a string of space separated integers
         prompt = " ".join([str(x) for x in prompt])
 
+        main_file = str(pathlib.Path(__file__).parent.resolve())
+
         if os.name == 'nt':
-            main_file = "./cpp/main.exe"
+            main_file += "/cpp/main.exe"
         else:
-            main_file = "./cpp/main"
+            main_file += "/cpp/main"
 
         command = [main_file, self.cpp_model_name,
                    "-m", self.model_save_path,
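The core of this interface.py change is resolving the `cpp/main` binary relative to the installed module rather than the current working directory, so the binary is still found after `pip install`. A minimal standalone sketch of the same pattern; the `cpp/` subdirectory and the `nt` check come from the diff, while the variable names and the REPL fallback are illustrative:

```python
import os
import pathlib

# Fallback for interactive sessions where __file__ is not defined.
module_file = globals().get("__file__", "interface.py")

# Resolve paths relative to this file, not os.getcwd(), so they keep
# working when the package is run from an arbitrary directory.
base = str(pathlib.Path(module_file).parent.resolve())
exe_name = "main.exe" if os.name == "nt" else "main"
main_file = base + "/cpp/" + exe_name
print(main_file)
```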
40 changes: 40 additions & 0 deletions setup.py
@@ -0,0 +1,40 @@
from setuptools import setup, find_packages
import codecs
import os
import subprocess

packages = ['cformers', 'cformers/cpp']
package_data = {'cformers': ['*'], 'cformers/cpp': ['*']}
build_main = subprocess.run(["make"], stdout=subprocess.PIPE, cwd="cformers/cpp")

here = os.path.abspath(os.path.dirname(__file__))

with codecs.open(os.path.join(here, "README.md"), encoding="utf-8") as fh:
    long_description = "\n" + fh.read()

VERSION = '0.0.4'
DESCRIPTION = 'SoTA Transformers with C-backend for fast inference on your CPU.'
LONG_DESCRIPTION = ('We identify three pillars to enable fast inference of SoTA AI models on your CPU:\n'
                    '1. Fast C/C++ LLM inference kernels for CPU.\n'
                    '2. Machine Learning Research & Exploration front - Compression through quantization, '
                    'sparsification, training on more data, collecting data and training instruction & chat models.\n'
                    '3. Easy to use API for fast AI inference in dynamically typed language like Python.\n\n'
                    'This project aims to address the third using LLaMa.cpp and GGML.')

# Setting up
setup(
    name="cformers",
    version=VERSION,
    author="Ayush Kaushal (Ayushk4)",
    author_email="[email protected]",
    description=DESCRIPTION,
    long_description_content_type="text/markdown",
    long_description=LONG_DESCRIPTION,
    packages=packages,
    package_data=package_data,
    install_requires=['transformers', 'torch', 'wget'],
    keywords=['python', 'local inference', 'c++ inference', 'language models',
              'cpu inference', 'quantization'],
    classifiers=[
        "Development Status :: 2 - Pre-Alpha",
        "Intended Audience :: Developers",
        "Programming Language :: Python :: 3",
        "Operating System :: Unix",
        "Operating System :: MacOS :: MacOS X",
        "Operating System :: Microsoft :: Windows",
    ]
)
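One thing this setup.py does not do is check whether the `make` invocation actually succeeded; if the toolchain is missing, the wheel would be built without the `cpp/main` binary. A hedged sketch of a fail-fast check, using a stand-in command that exists on every system (the real call would use `["make"]` with `cwd="cformers/cpp"` as in the diff):

```python
import subprocess
import sys

# Stand-in for the build step: run a command and fail fast on a
# non-zero exit code instead of silently continuing.
result = subprocess.run([sys.executable, "--version"], stdout=subprocess.PIPE)
if result.returncode != 0:
    raise RuntimeError(f"build step failed with exit code {result.returncode}")
print("build step ok")
```

Equivalently, passing `check=True` to `subprocess.run` raises `CalledProcessError` automatically on failure.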