Generative AI using RAG on Xeon #32

Status: Open. Merges 5 commits into base: main.
136 changes: 136 additions & 0 deletions examples/gen-ai-rag-demo/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,136 @@
<p align="center">
<img src="https://github.com/intel/terraform-intel-aws-vm/blob/main/images/logo-classicblue-800px.png?raw=true" alt="Intel Logo" width="250"/>
</p>

# Intel® Optimized Cloud Modules for Terraform

© Copyright 2024, Intel Corporation

## AWS M7i EC2 Instance with 4th Generation Intel® Xeon® Scalable Processor (Sapphire Rapids) & Intel® Cloud Optimized Recipe for Retrieval-Augmented Generation (RAG) GenAI

This demo showcases Large Language Model (LLM) inference on CPU using 4th Gen Intel® Xeon® Scalable processors on AWS, with RAG-based generative AI.

## Usage

### variables.tf

Modify the default region to target a specific AWS Region.

```hcl
variable "region" {
  description = "Target AWS region to deploy EC2 in."
  type        = string
  default     = "us-east-1"
}
```
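
Instead of editing the default in place, the region can also be overridden from a variable definitions file, which Terraform loads automatically when it is named `terraform.tfvars`. A minimal sketch, assuming such a file is placed next to `main.tf` (it is not part of this PR). Note that `main.tf` hard-codes `availability_zone`, so that value would also need to point at a zone in the chosen region.

```hcl
# terraform.tfvars (hypothetical file, not part of this PR)
# Overrides the default declared in variables.tf.
region = "us-west-2"
```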

### main.tf

Modify the settings in this file to choose the AMI, the instance size, and other details of the instance that will be created.

```hcl
## Get latest Ubuntu 22.04 AMI in AWS for x86
data "aws_ami" "ubuntu-linux-2204" {
  most_recent = true
  owners      = ["099720109477"] # Canonical
  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*"]
  }
  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }
}

module "ec2-vm" {
  source            = "intel/aws-vm/intel"
  key_name          = aws_key_pair.TF_key.key_name
  instance_type     = "m7i.16xlarge"
  availability_zone = "us-east-1a"
  ami               = data.aws_ami.ubuntu-linux-2204.id
  user_data         = data.cloudinit_config.ansible.rendered

  root_block_device = [{
    volume_size = "100"
  }]

  tags = {
    Name     = "my-test-vm-${random_id.rid.dec}"
    Owner    = "OwnerName-${random_id.rid.dec}"
    Duration = "2"
  }
}
```
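
The full `main.tf` in this example also reads `var.vm_count` and `var.ingress_rules`, whose declarations live in `variables.tf` and are not shown in this README. The sketch below shows what those declarations might look like, inferred only from how `main.tf` uses them. The attribute names match `main.tf`; the descriptions, defaults, and CIDR are placeholders.

```hcl
# Hypothetical declarations; the actual variables.tf may differ.
variable "vm_count" {
  description = "Number of EC2 instances to create."
  type        = number
  default     = 1
}

variable "ingress_rules" {
  description = "Ingress rules attached to the demo security group."
  type = list(object({
    from_port   = number
    to_port     = number
    protocol    = string
    cidr_blocks = string # main.tf wraps this single CIDR in a list
  }))
  default = [
    {
      from_port   = 22
      to_port     = 22
      protocol    = "tcp"
      cidr_blocks = "203.0.113.0/24" # placeholder range; replace with your own
    }
  ]
}
```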

Run the Terraform Commands below to deploy the demos.

```Shell
terraform init
terraform plan
terraform apply
```

## Running the Demo using AWS CloudShell

Open your AWS account and click the CloudShell icon.
At the command prompt, enter the following commands to install Terraform in AWS CloudShell:

```Shell
git clone https://github.com/tfutils/tfenv.git ~/.tfenv
mkdir ~/bin
ln -s ~/.tfenv/bin/* ~/bin/
tfenv install 1.3.0
tfenv use 1.3.0
```

Download the [Gen-AI-RAG-Demo](https://github.com/intel/terraform-intel-aws-vm/tree/main/examples/gen-ai-rag-demo) Terraform module by cloning the repository:

```Shell
git clone https://github.com/intel/terraform-intel-aws-vm.git
```

Change into the `examples/gen-ai-rag-demo` example folder

```Shell
cd terraform-intel-aws-vm/examples/gen-ai-rag-demo
```

Run the Terraform Commands below to deploy the demos.

```Shell
terraform init
terraform plan
terraform apply
```

After the Terraform module successfully creates the EC2 instance, wait 15 minutes for the recipe to download and install the dependencies before continuing.

## Run the demo

To run the demo, after waiting for the dependencies to be downloaded and installed:

```shell
cd /tmp/optimized-cloud-recipes/recipes/ai-rag-ubuntu/
sudo python amx_gradio_rag.py
```

## Accessing the Demo

You can access the demos using the following:

- RAG Demo: `http://yourpublicip:8080`

- Note: Whenever you choose the Intel Neural chat model, it will take around 2 minutes to initialize and produce the first output.

- Note: This module uses the m7i.16xlarge instance size. You can change the instance type by modifying **instance_type = "m7i.16xlarge"** in main.tf, under the **ec2-vm module** section of the code. If you change it (for example, to an 8xlarge) and then run **terraform apply**, the module will destroy the old instance and rebuild it with the new instance size.
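
Reaching port 8080 from a browser requires an ingress rule for that port in addition to SSH. A sketch of a `terraform.tfvars` override, assuming `ingress_rules` is a list of objects with the attributes that `main.tf` reads (`from_port`, `to_port`, `protocol`, `cidr_blocks`). The file and the CIDR below are placeholders, not part of this PR.

```hcl
# terraform.tfvars (hypothetical); attribute names follow main.tf's use of var.ingress_rules
ingress_rules = [
  # SSH access for administration
  { from_port = 22, to_port = 22, protocol = "tcp", cidr_blocks = "203.0.113.0/24" },
  # Web UI for the RAG demo
  { from_port = 8080, to_port = 8080, protocol = "tcp", cidr_blocks = "203.0.113.0/24" },
]
```

Restricting both rules to your own address range keeps the demo UI from being exposed to the public internet.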

## Deleting the Demo

To delete the demo, run `terraform destroy` to delete all resources created.

## Considerations

- The AWS region where this example is run should have a default VPC.
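
One way to surface a missing default VPC at plan time, rather than partway through `terraform apply`, is a data source lookup. A minimal sketch, assuming the AWS provider is already configured for the target region; the data source label `default` and the output name are arbitrary and not part of this PR.

```hcl
# The lookup errors during `terraform plan` if the region has no default VPC.
data "aws_vpc" "default" {
  default = true
}

# Optional: print the ID of the VPC the example would deploy into.
output "default_vpc_id" {
  value = data.aws_vpc.default.id
}
```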
12 changes: 12 additions & 0 deletions examples/gen-ai-rag-demo/cloud_init.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
#cloud-config
package_update: true
package_upgrade: true

packages:
- git

runcmd:
- apt install ansible -y
- git clone https://github.com/intel/optimized-cloud-recipes.git /tmp/optimized-cloud-recipes
- cd /tmp/optimized-cloud-recipes
- ansible-playbook recipes/ai-rag-amx-ubuntu/recipe.yml &
35 changes: 35 additions & 0 deletions examples/gen-ai-rag-demo/logs
Original file line number Diff line number Diff line change
@@ -0,0 +1,35 @@
9d1dfcc69b7'}>], 'root_node': 'Query', 'params': {'Retriever': {'top_k': 10}, 'Reranker': {'top_k': 1}, 'generation_kwargs': {'max_length': 10, 'do_sample': False}}, 'query': 'Who is the sister of Sansa?', 'node_id': 'Reranker'}
2024-03-01 20:20:26,294 - haystack.nodes.ranker.base - DEBUG - Retrieved documents with IDs: ['a9245aa3e99dd4cd413b8439ca530ef9']
2024-03-01 20:20:26,294 - haystack.pipelines.base - DEBUG - Running node 'Prompter` with input: {'documents': [<Document: {'content': "==Character and appearances==\nSansa Stark is the second child and elder daughter of Eddard Stark and Catelyn Stark. She was born and raised in Winterfell, until leaving with her father and sister at the beginning of the series. She was raised with a younger sister Arya Stark, two younger brothers Rickon Stark and Bran Stark, as well as an older brother Robb Stark, and an older illegitimate half-brother, Jon Snow.\n\nRaised as a lady, Sansa is traditionally feminine. Sansa's interests are music, poetry, and singing. She strives to become like the heroines of romantic tales by attempting to find a prince, knight, or gentleman to fall in love with. For a companion animal, she owned a direwolf named Lady. However, Lady was killed in place of Arya's direwolf, Nymeria, after Nymeria attacked the Crown Prince, Joffrey Baratheon, and later fled.\n\nSansa has been described as tall, slim, womanly, and beautiful, destined to be a lady or a queen. She has blue eyes and thick auburn hair that she inherits from her mother, who came from House Tully in the Riverlands region prior to her marriage to Eddard Stark. ", 'content_type': 'text', 'score': 0.9990221261978149, 'meta': {'_split_id': 1}, 'id_hash_keys': ['content'], 'embedding': None, 'id': 'a9245aa3e99dd4cd413b8439ca530ef9'}>], 'root_node': 'Query', 'params': {'Retriever': {'top_k': 10}, 'Reranker': {'top_k': 1}, 'generation_kwargs': {'max_length': 10, 'do_sample': False}}, 'query': 'Who is the sister of Sansa?', 'node_id': 'Prompter'}
2024-03-01 20:20:26,294 - haystack.pipelines.base - DEBUG - Exception while running node 'Prompter' with input {'documents': [<Document: {'content': "==Character and appearances==\nSansa Stark is the second child and elder daughter of Eddard Stark and Catelyn Stark. She was born and raised in Winterfell, until leaving with her father and sister at the beginning of the series. She was raised with a younger sister Arya Stark, two younger brothers Rickon Stark and Bran Stark, as well as an older brother Robb Stark, and an older illegitimate half-brother, Jon Snow.\n\nRaised as a lady, Sansa is traditionally feminine. Sansa's interests are music, poetry, and singing. She strives to become like the heroines of romantic tales by attempting to find a prince, knight, or gentleman to fall in love with. For a companion animal, she owned a direwolf named Lady. However, Lady was killed in place of Arya's direwolf, Nymeria, after Nymeria attacked the Crown Prince, Joffrey Baratheon, and later fled.\n\nSansa has been described as tall, slim, womanly, and beautiful, destined to be a lady or a queen. She has blue eyes and thick auburn hair that she inherits from her mother, who came from House Tully in the Riverlands region prior to her marriage to Eddard Stark. ", 'content_type': 'text', 'score': 0.9990221261978149, 'meta': {'_split_id': 1}, 'id_hash_keys': ['content'], 'embedding': None, 'id': 'a9245aa3e99dd4cd413b8439ca530ef9'}>], 'root_node': 'Query', 'params': {'Retriever': {'top_k': 10}, 'Reranker': {'top_k': 1}, 'generation_kwargs': {'max_length': 10, 'do_sample': False}}, 'query': 'Who is the sister of Sansa?', 'node_id': 'Prompter'}
Traceback (most recent call last):
File "/home/ubuntu/.local/lib/python3.10/site-packages/haystack/pipelines/base.py", line 567, in run
node_output, stream_id = self._run_node(node_id, node_input)
File "/home/ubuntu/.local/lib/python3.10/site-packages/haystack/pipelines/base.py", line 469, in _run_node
return self.graph.nodes[node_id]["component"]._dispatch_run(**node_input)
File "/home/ubuntu/.local/lib/python3.10/site-packages/haystack/nodes/base.py", line 201, in _dispatch_run
return self._dispatch_run_general(self.run, **kwargs)
File "/home/ubuntu/.local/lib/python3.10/site-packages/haystack/nodes/base.py", line 245, in _dispatch_run_general
output, stream = run_method(**run_inputs, **run_params)
File "/home/ubuntu/.local/lib/python3.10/site-packages/haystack/nodes/prompt/prompt_node.py", line 312, in run
results = self(**invocation_context, prompt_collector=prompt_collector)
File "/home/ubuntu/.local/lib/python3.10/site-packages/haystack/nodes/prompt/prompt_node.py", line 140, in __call__
return self.prompt(prompt_template, *args, **kwargs)
File "/home/ubuntu/.local/lib/python3.10/site-packages/haystack/nodes/prompt/prompt_node.py", line 163, in prompt
for prompt in template_to_fill.fill(*args, **kwargs):
File "/home/ubuntu/.local/lib/python3.10/site-packages/haystack/nodes/prompt/prompt_template.py", line 564, in fill
template_dict = self.prepare(*args, **kwargs)
File "/home/ubuntu/.local/lib/python3.10/site-packages/haystack/nodes/prompt/prompt_template.py", line 519, in prepare
raise ValueError(
ValueError: Expected prompt parameters ['document_store', 'query'] to be provided but got only ['query']. Make sure to provide all template parameters.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/home/ubuntu/fastrag/fastrag.py", line 104, in <module>
answer_result = pipe.run(query="Who is the sister of Sansa?", params={
File "/home/ubuntu/fastrag/fastrag/utils.py", line 34, in wrapper
ret = fn(*args, **kwargs)
File "/home/ubuntu/.local/lib/python3.10/site-packages/haystack/pipelines/base.py", line 574, in run
raise Exception(
Exception: Exception while running node 'Prompter': Expected prompt parameters ['document_store', 'query'] to be provided but got only ['query']. Make sure to provide all template parameters.
97 changes: 97 additions & 0 deletions examples/gen-ai-rag-demo/main.tf
Original file line number Diff line number Diff line change
@@ -0,0 +1,97 @@
# Provision an EC2 instance on 4th Gen Intel Xeon (Sapphire Rapids, m7i) running Ubuntu 22.04 in the default VPC.
# It is configured to create the EC2 instance in the us-east-1 region. The region is provided in variables.tf in this example folder.

# This example also creates an EC2 key pair and associates the public key with the EC2 instance. The private key is
# written to the local system where terraform apply is run. A new security group is created to open the SSH port
# 22 to a specific IP CIDR block.

######### PLEASE NOTE TO CHANGE THE IP CIDR BLOCK TO ALLOW SSH FROM YOUR OWN ALLOWED IP ADDRESS FOR SSH #########

data "cloudinit_config" "ansible" {
  gzip          = true
  base64_encode = true

  part {
    filename     = "cloud_init"
    content_type = "text/cloud-config"
    content = templatefile(
      "cloud_init.yml",
      {}
    )
  }
}

data "aws_ami" "ubuntu-linux-2204" {
  most_recent = true
  owners      = ["099720109477"] # Canonical
  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*"]
  }
  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }
}

resource "random_id" "rid" {
  byte_length = 5
}

# RSA key of size 4096 bits
resource "tls_private_key" "rsa" {
  algorithm = "RSA"
  rsa_bits  = 4096
}

resource "aws_key_pair" "TF_key" {
  key_name   = "TF_key-${random_id.rid.dec}"
  public_key = tls_private_key.rsa.public_key_openssh
}

resource "local_file" "TF_private_key" {
  content  = tls_private_key.rsa.private_key_pem
  filename = "tfkey.private"
}

resource "aws_security_group" "ssh_security_group" {
  description = "security group to configure ports for ssh"
  name_prefix = "ssh_security_group"
}

# Modify the `ingress_rules` variable in the variables.tf file to allow the required ports for your CIDR ranges
resource "aws_security_group_rule" "ingress_rules" {
  count             = length(var.ingress_rules)
  type              = "ingress"
  security_group_id = aws_security_group.ssh_security_group.id
  from_port         = var.ingress_rules[count.index].from_port
  to_port           = var.ingress_rules[count.index].to_port
  protocol          = var.ingress_rules[count.index].protocol
  cidr_blocks       = [var.ingress_rules[count.index].cidr_blocks]
}

resource "aws_network_interface_sg_attachment" "sg_attachment" {
  count                = length(module.ec2-vm)
  security_group_id    = aws_security_group.ssh_security_group.id
  network_interface_id = module.ec2-vm[count.index].primary_network_interface_id
}

# Modify the `vm_count` variable in the variables.tf file to create the required number of EC2 instances
module "ec2-vm" {
  count             = var.vm_count
  source            = "intel/aws-vm/intel"
  key_name          = aws_key_pair.TF_key.key_name
  instance_type     = "m7i.16xlarge"
  availability_zone = "us-east-1c"
  ami               = data.aws_ami.ubuntu-linux-2204.id
  user_data         = data.cloudinit_config.ansible.rendered

  root_block_device = [{
    volume_size = "400"
  }]

  tags = {
    Name     = "my-test-vm-${count.index}-${random_id.rid.dec}"
    Owner    = "Owner-${random_id.rid.dec}"
    Duration = "2"
  }
}