feat: [LLM] Added support for Large Language Models
# Features:
* Generate text
* Chat
* Get text embedding vector for a list of texts
* Tune a language model using a training dataset

The following models are available:
* `text-bison@001`
* `chat-bison@001`
* `textembedding-gecko@001`

# Example usage

## Text Generation
```python
from vertexai.preview.language_models import TextGenerationModel

model = TextGenerationModel.from_pretrained("text-bison@001")

print(model.predict(
    "What is the best recipe for banana bread? Recipe:",
    # Optional:
    # max_output_tokens=128,
    # temperature=0,
    # top_p=1,
    # top_k=5,
))
```

## Chat
```python
from vertexai.preview.language_models import ChatModel, InputOutputTextPair

chat_model = ChatModel.from_pretrained("chat-bison@001")

chat = chat_model.start_chat(
    # Optional:
    context="My name is Ned. You are my personal assistant. My favorite movies are Lord of the Rings and Hobbit.",
    examples=[
        InputOutputTextPair(
            input_text="Who do you work for?",
            output_text="I work for Ned.",
        ),
        InputOutputTextPair(
            input_text="What do I like?",
            output_text="Ned likes watching movies.",
        ),
    ],
)

print(chat.send_message("Are my favorite movies based on a book series?"))
print(chat.send_message("When were these books published?"))
```

## Text embedding
```python
from vertexai.preview.language_models import TextEmbeddingModel

model = TextEmbeddingModel.from_pretrained("textembedding-gecko@001")
embeddings = model.get_embeddings(["What is life?"])
for embedding in embeddings:
    # Each embedding exposes its vector as a list of floats.
    vector = embedding.values
    print(len(vector))
```

# Tuning
```python
import pandas

from vertexai.preview.language_models import TextGenerationModel

model = TextGenerationModel.from_pretrained("text-bison@001")

# training_data can take any one of the following forms.

# Dataset URI
training_data = "gs://<bucket>/<path>.jsonl"

# Pandas dataset
training_data = pandas.DataFrame(data=[
    {"input_text": "Input 1", "output_text": "Output 1"},
    {"input_text": "Input 2", "output_text": "Output 2"},
])

# Prompt dataset resource name
training_data = "projects/.../locations/.../datasets/..."

model.tune_model(
    training_data=training_data,
    # Optional:
    train_steps=10,
    tuning_job_location="europe-west4",
    model_deployment_location="us-central1",
)

model.predict("What is life?")
```

PiperOrigin-RevId: 529799173
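
The tuning example refers to a `.jsonl` dataset URI without showing what such a file might contain. The sketch below is one way to serialize training pairs into JSON Lines text, assuming the file uses the same `input_text` / `output_text` keys as the pandas example; that key layout is an assumption carried over from the DataFrame columns, not something this commit states. The helper names `to_jsonl` and `from_jsonl` are hypothetical and are not part of the SDK.

```python
import json

# Hypothetical training pairs; the keys mirror the DataFrame columns
# in the tuning example (assumed to be the JSONL schema as well).
records = [
    {"input_text": "Input 1", "output_text": "Output 1"},
    {"input_text": "Input 2", "output_text": "Output 2"},
]


def to_jsonl(rows):
    """Serialize a list of dicts as JSON Lines: one JSON object per line."""
    return "".join(json.dumps(row, ensure_ascii=False) + "\n" for row in rows)


def from_jsonl(text):
    """Parse JSON Lines text back into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


jsonl_text = to_jsonl(records)
```

The resulting `jsonl_text` can be written to a local file and uploaded to a bucket, after which its `gs://` URI would be passed as `training_data`.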