# Text Summarizer Application

This model service is intended to be used for text summarization tasks. It can ingest an arbitrarily long text input. If the input is shorter than the model's maximum context window, it is summarized directly. If the input is longer than the maximum context window, it is divided into appropriately sized chunks; each chunk is summarized, and a final "summary of summaries" becomes the service's final output.
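The chunk-and-recombine flow looks roughly like the sketch below. It is a minimal illustration, not the service's actual code: `summarize_chunk` is a stand-in for a real model call, and the context size and characters-per-token ratio are assumptions.

```python
# Minimal sketch of the chunking strategy described above; illustrative
# only. summarize_chunk stands in for a real model call, and the context
# size / chars-per-token ratio are assumptions, not the service's values.

MAX_CONTEXT_TOKENS = 2048   # assumed model context window
CHARS_PER_TOKEN = 4         # rough heuristic for estimating token count

def summarize_chunk(text: str) -> str:
    # Placeholder: the real service calls the model here.
    return text[:200]

def summarize(text: str) -> str:
    max_chars = MAX_CONTEXT_TOKENS * CHARS_PER_TOKEN
    if len(text) <= max_chars:
        # Input fits within the context window: summarize directly.
        return summarize_chunk(text)
    # Otherwise, split into context-sized chunks and summarize each one.
    chunks = [text[i:i + max_chars] for i in range(0, len(text), max_chars)]
    partials = [summarize_chunk(c) for c in chunks]
    # Recombine into a final "summary of summaries".
    return summarize_chunk("\n".join(partials))
```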

To use this model service, please follow the steps below:

### Download model(s)

This example assumes the developer already has a copy of the model they would like to use downloaded onto their host machine and placed in the /models directory of this repo.

The two models that we have tested and recommend for this example are Llama2 and Mistral. Please download any of the GGUF variants you'd like to use.

For a full list of supported model variants, please see the "Supported models" section of the llama.cpp repository.

```bash
cd models
wget https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/resolve/main/llama-2-7b-chat.Q5_K_S.gguf
```
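If you prefer to script the download, the same file can be fetched with the huggingface_hub Python client; the `repo_id` and `filename` below simply mirror the wget command above.

```python
# Equivalent download via huggingface_hub (pip install huggingface_hub).
from huggingface_hub import hf_hub_download

hf_hub_download(
    repo_id="TheBloke/Llama-2-7B-Chat-GGUF",
    filename="llama-2-7b-chat.Q5_K_S.gguf",
    local_dir="models",
)
```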

### Build the image

To build the image, we use a build.sh script that temporarily copies the desired model and shared code into the build directory. This prevents large, unused model files in the repo from being loaded into the podman environment during the build, which can cause a significant slowdown.

```bash
cd summarizer/model_services/builds
sh build.sh llama-2-7b-chat.Q5_K_S.gguf arm summarizer
```

The user should provide the model name, the architecture, and the image name they want to use for the build.

### Run the image

Once the model service image is built, it can be run with the following:

```bash
podman run -it -p 7860:7860 summarizer
```
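Before wiring up a client, you can sanity-check that the service is listening; a quick probe, assuming the default port mapping above:

```python
# Quick check that the service is up (expects HTTP 200).
import urllib.request

print(urllib.request.urlopen("http://0.0.0.0:7860").status)
```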

### Interact with the app

Once the service is running, it can be queried with the Python code below.

```python
from gradio_client import Client

# Connect to the running model service.
client = Client("http://0.0.0.0:7860")

# Send text to the summarization endpoint.
result = client.predict("""
It's Hackathon day.
All the developers are excited to work on interesting problems.
There are six teams total, but only one can take home the grand prize.
The first team to solve Artificial General Intelligence wins!""",
    api_name="/chat")
print(result)
```

Example output:

```
Sure, here is a summary of the input in bullet points:
• Hackathon day
• Developers excited to work on interesting problems
• Six teams participating
• Grand prize for the first team to solve Artificial General Intelligence
• Excitement and competition among the teams
```

You can also use the summarize.py script under /ai_applications to run the summary application against a local file. If the --file argument is left blank, it will run against the demo file data/fake_meeting.text

```bash
cd summarizer/ai_applications
python summarize.py --file <YOUR-FILE>
```
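Internally, a script like this only needs to read the file and forward its contents to the running service. A minimal sketch, assuming the same /chat endpoint used above; the argument handling here is illustrative, not necessarily that of the real summarize.py.

```python
# Minimal sketch of a CLI like summarize.py; assumes the /chat endpoint
# shown above. Argument handling here is illustrative only.
import argparse
from gradio_client import Client

parser = argparse.ArgumentParser()
parser.add_argument("--file", default="data/fake_meeting.text",
                    help="text file to summarize")
args = parser.parse_args()

with open(args.file) as f:
    text = f.read()

client = Client("http://0.0.0.0:7860")
print(client.predict(text, api_name="/chat"))
```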