added text summarization notebook #317

Marjan-emd · 2023-10-20T00:20:22Z

Added this text generation notebook to the blueprints, since the only available one (Taylor Swift Lyrics) does not create the text SQS and still uses the old mpt-7b model.

This notebook will be referenced in the showcase text metrics blog and has been added to the blog folder, following John's suggestion here.

vercel · 2023-10-20T00:20:30Z

The latest updates on your projects.

Name	Status	Updated (UTC)
gretel-blueprints	✅ Ready	Oct 26, 2023 0:14am

kboyd

Some minor suggestions and questions. Please also have Johnny review and approve before merging, as he's working on tidying up the notebooks directory. I want to make sure this follows the planned style and structure instead of creating more to clean up later.

docs/notebooks/content/Text-Summerization-gpt.ipynb

johnnygreco

Hey @Marjan-emd,

I'll probably move the file location as I reorganize the notebooks in this repo, but we can go ahead and get this merged for your blog. I'll make sure the link gets switched (there will be a bunch of broken links, so no worries here).

One request before we merge. Can we update this to the new SDK interface?

Here's what that would look like:

from gretel_client import Gretel

PROJECT = 'data-summarization'
DATASET_PATH = 'https://gretel-datasets.s3.us-west-2.amazonaws.com/Text-dataset/Samsum-text-summerization-sample-1000.csv'

gretel = Gretel(project_name=f"{PROJECT}-llama-2-7b", api_key="prompt", validate=True)

trained = gretel.submit_train(
    "natural-language",
    data_source=DATASET_PATH,
    params={"steps": 1000},
)

trained.report.display_in_notebook()

You can pass the DataFrame as the data_source if you prefer.

johnnygreco · 2023-10-25T22:05:20Z

Maybe explicitly add the pretrained model parameter:

from gretel_client import Gretel

PROJECT = 'data-summarization'
DATASET_PATH = 'https://gretel-datasets.s3.us-west-2.amazonaws.com/Text-dataset/Samsum-text-summerization-sample-1000.csv'
LLM = "meta-llama/Llama-2-7b-chat-hf"

gretel = Gretel(project_name=f"{PROJECT}-llama-2-7b", api_key="prompt", validate=True)

trained = gretel.submit_train(
    "natural-language",
    data_source=DATASET_PATH,
    pretrained_model=LLM,
    params={"steps": 1000},
)

trained.report.display_in_notebook()

Marjan-emd · 2023-10-25T23:51:28Z

Thanks for reviewing this. I totally forgot about adding the new SDK interface!
I just changed the model to the regular Llama-2 instead of the chat version, since I ran my experiments on that one, though the results should not differ much.
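
For reference, a rough sketch of how the base-versus-chat choice could be kept in one place. The helper `build_train_kwargs` is hypothetical (not part of the Gretel SDK), and `meta-llama/Llama-2-7b-hf` is assumed to be the Hugging Face ID meant by "regular Llama-2"; the thread does not state the exact identifier.

```python
# Assumed Hugging Face ID for the base (non-chat) Llama-2 7B model.
BASE_LLM = "meta-llama/Llama-2-7b-hf"
# Chat variant, as suggested in the review comment.
CHAT_LLM = "meta-llama/Llama-2-7b-chat-hf"

DATASET_PATH = (
    "https://gretel-datasets.s3.us-west-2.amazonaws.com/"
    "Text-dataset/Samsum-text-summerization-sample-1000.csv"
)


def build_train_kwargs(data_source, use_chat_model=False, steps=1000):
    """Hypothetical helper: keyword arguments for gretel.submit_train."""
    return {
        "data_source": data_source,
        "pretrained_model": CHAT_LLM if use_chat_model else BASE_LLM,
        "params": {"steps": steps},
    }


kwargs = build_train_kwargs(DATASET_PATH)

# With a Gretel session configured as in the snippet above (requires an API key),
# the training call would then be:
# trained = gretel.submit_train("natural-language", **kwargs)
```

Switching back to the chat model would only need `use_chat_model=True`, which makes it easy to compare the two runs.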

johnnygreco

Nice – thanks, @Marjan-emd!

One last thing before you merge: will you put this at the top of the first markdown cell?

Marjan-emd

All feedback comments were addressed.

added text summarization notebook

a7c884f

Marjan-emd requested a review from kboyd October 20, 2023 00:20

vercel bot deployed to Preview October 20, 2023 00:31

changed end point to prod

0596495

vercel bot deployed to Preview October 20, 2023 00:34

added a seperate folder for the notebooks called in blogs

d0ea36b

vercel bot deployed to Preview October 20, 2023 18:37

update notebook name

e0468bc

vercel bot deployed to Preview October 20, 2023 18:39

changed folder name to content

58244a8

vercel bot temporarily deployed to Preview October 24, 2023 18:40 Inactive

Merge branch 'main' into me/RDS-736

eba5420

merge main to the branch

vercel bot deployed to Preview October 25, 2023 16:11

kboyd approved these changes Oct 25, 2023


Marjan-emd requested a review from johnnygreco October 25, 2023 16:42

marjan_emd added 4 commits October 25, 2023 16:50

change notebook name to the snake case

e3bec54

update the link to the API key

3b14ac1

update steps for slightly better results

62b9cb7

addressed feedback comments

1eb0df3

vercel bot deployed to Preview October 25, 2023 18:01

johnnygreco reviewed Oct 25, 2023


updated the notebook to the new SDK interface

edd040a

vercel bot deployed to Preview October 25, 2023 23:49

Marjan-emd requested a review from johnnygreco October 25, 2023 23:51

johnnygreco approved these changes Oct 26, 2023


added the Open in Colab button.

f474dba

vercel bot deployed to Preview October 26, 2023 00:14

Marjan-emd commented Oct 26, 2023


Marjan-emd merged commit 0f4f3c1 into main Oct 26, 2023
4 checks passed

Marjan-emd deleted the me/RDS-736 branch October 26, 2023 00:44
