The Storyteller project is an experimental Python application that converts images into short audio stories. It leverages OpenAI's GPT-3 and Hugging Face's transformers library to perform tasks like image-to-text conversion, text-based story generation, and text-to-speech conversion.
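Under the hood, the flow is a three-stage chain: caption the image, expand the caption into a short story, and synthesize the story as speech. The sketch below is only an illustration of that chain under assumed choices (the `nlpconnect/vit-gpt2-image-captioning` and `espnet/kan-bayashi_ljspeech_vits` models, the legacy `openai.Completion` API, and a `HUGGINGFACE_API_TOKEN` environment variable are all assumptions), not the project's actual implementation.

```python
# Illustrative sketch of the three-stage pipeline; model choices and API calls
# are assumptions, not necessarily what this project uses.
import os

import openai       # assumes OPENAI_API_KEY is set in the environment (openai < 1.0)
import requests
from transformers import pipeline

def image_to_text(image_path: str) -> str:
    """Caption the image with a Hugging Face image-to-text pipeline."""
    captioner = pipeline("image-to-text", model="nlpconnect/vit-gpt2-image-captioning")
    return captioner(image_path)[0]["generated_text"]

def generate_story(scenario: str) -> str:
    """Expand the caption into a short story with a GPT-3 completion model."""
    response = openai.Completion.create(
        model="text-davinci-003",
        prompt=f"Write a short, vivid story (under 100 words) about: {scenario}",
        max_tokens=200,
    )
    return response.choices[0].text.strip()

def text_to_speech(text: str, out_path: str = "story.flac") -> str:
    """Synthesize speech via the Hugging Face Inference API (model choice is an assumption)."""
    api_url = "https://api-inference.huggingface.co/models/espnet/kan-bayashi_ljspeech_vits"
    headers = {"Authorization": f"Bearer {os.environ['HUGGINGFACE_API_TOKEN']}"}
    audio = requests.post(api_url, headers=headers, json={"inputs": text})
    with open(out_path, "wb") as f:
        f.write(audio.content)
    return out_path
```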
To run the app you will need:

- Python 3.x
- Conda (optional but recommended for environment isolation)
1. **Clone the Repository**

   ```bash
   git clone https://github.com/your-username/storyteller.git
   cd storyteller
   ```
2. **Set up the Environment**

   Using Conda:

   ```bash
   conda env create -f environment.yaml
   conda activate <env-name>
   ```

   Or using pipenv:

   ```bash
   pipenv --python 3.8
   pipenv shell
   pipenv install -r requirements.txt
   ```
3. **Set up Environment Variables**

   Copy `.env.example` to `.env`:

   ```bash
   cp .env.example .env
   ```

   Edit the `.env` file to add your API keys and tokens (the sketch after these steps shows one way the app might read them).
4. **Run the App**

   ```bash
   python app.py
   ```
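As referenced in step 3, the app needs API credentials at runtime. Below is a minimal sketch of how `app.py` might read them with `python-dotenv`; the variable names `OPENAI_API_KEY` and `HUGGINGFACE_API_TOKEN` are assumptions, so check `.env.example` for the keys the project actually expects.

```python
# Hypothetical example of loading credentials from .env with python-dotenv.
# The variable names are assumptions; see .env.example for the real ones.
import os

from dotenv import load_dotenv

load_dotenv()  # reads key=value pairs from .env into the process environment

openai_key = os.environ["OPENAI_API_KEY"]
hf_token = os.environ["HUGGINGFACE_API_TOKEN"]
```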
After running `app.py`, a Streamlit application will start, and you should see a web interface with an option to upload an image. Upon uploading an image, the application generates a short story based on the image's content and converts the story to audio. You can view the generated scenario, read the story, and listen to the audio story directly in the web interface.
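For orientation, the sketch below shows roughly how such a Streamlit flow could be wired together. It is a simplified illustration rather than the contents of `app.py`; `image_to_text`, `generate_story`, and `text_to_speech` refer to the hypothetical helpers from the pipeline sketch at the top of this section.

```python
# Simplified illustration of the Streamlit flow described above; not the actual app.py.
import streamlit as st

def main() -> None:
    st.title("Storyteller")
    uploaded = st.file_uploader("Upload an image", type=["jpg", "jpeg", "png"])
    if uploaded is not None:
        # Save the upload to disk so the captioning pipeline can read it.
        with open(uploaded.name, "wb") as f:
            f.write(uploaded.getvalue())

        # Hypothetical helpers from the pipeline sketch earlier in this README.
        scenario = image_to_text(uploaded.name)
        story = generate_story(scenario)
        audio_path = text_to_speech(story)

        st.image(uploaded, caption="Uploaded image")
        with st.expander("Scenario"):
            st.write(scenario)
        with st.expander("Story"):
            st.write(story)
        st.audio(audio_path)

if __name__ == "__main__":
    main()
```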