This project aims to make the vibrant world of comics accessible to everyone, including the visually impaired, by transforming them into audio narratives. Developed as a capstone project for a postgraduate degree in Data Storytelling, it explores the use of technology to enhance narrative accessibility through an old Daredevil comic now in the public domain.
The "Spell Bubble" web application features a responsive design tailored for accessibility and user engagement. The interface is built with HTML, CSS, and JavaScript, ensuring compatibility across various devices and screen sizes. Key features include:
- Responsive Layout: Adapts seamlessly to different devices, providing an optimal viewing experience on both desktops and mobile phones.
- Interactive Controls: Users can navigate through comic panels using clearly labeled buttons, facilitating ease of use for all, including those relying on screen readers.
- Audio Integration: Each comic panel comes with an audio description, playable via a custom-built audio player. This feature is designed to bring the comic's story to life through sound, enhancing the narrative for visually impaired users.
- Dynamic Content Loading: The application dynamically loads comic panels and audio files, ensuring smooth transitions and minimal load times.
- Accessible Design: High-contrast colors and large text are used to assist users with visual impairments.
Techniques used include image refinement and conversion to base64, computer vision for panel separation, and interactive character identification.
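The base64 step can be sketched as follows. This is a minimal illustration using only the Python standard library, not the project's actual notebook code. The function name `to_data_uri` and the default MIME type are assumptions made for the example.

```python
import base64


def to_data_uri(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Encode raw image bytes as a base64 data URI (hypothetical helper)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# The 8-byte PNG file signature stands in for a real panel image here.
png_signature = b"\x89PNG\r\n\x1a\n"
uri = to_data_uri(png_signature)
print(uri)
```

In practice the bytes would come from reading a panel image file in binary mode. A data URI in this form can be embedded in a web page or passed to services that accept inline images.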
Used OCR to convert dialogue into accessible text, with a focus on preserving the essence and integrity of the original comic.
Tools used: spaCy and OpenAI models to refine the extracted text and resolve ambiguities, so that the transcription stays faithful and vivid.
Ensured that each speech is correctly associated with its corresponding character for narrative fidelity.
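One way to keep track of which character says what is to store each line of dialogue with its speaker and flag anything unattributed for manual review. The sketch below is an assumption about how such a check could look, not the project's implementation. `SpeechLine`, `find_unattributed`, the cast list, and the placeholder dialogue are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SpeechLine:
    panel: int    # panel number within the page
    speaker: str  # character name assigned during review ("" if unknown)
    text: str     # dialogue text from OCR, after cleanup


def find_unattributed(lines: list[SpeechLine], cast: set[str]) -> list[SpeechLine]:
    """Return the lines whose speaker is not in the known cast."""
    return [line for line in lines if line.speaker not in cast]


cast = {"Daredevil", "Narrator"}  # hypothetical cast list
lines = [
    SpeechLine(1, "Daredevil", "Placeholder line one."),
    SpeechLine(1, "", "Placeholder line two."),
]
for line in find_unattributed(lines, cast):
    print(f"Panel {line.panel}: needs a speaker -> {line.text!r}")
```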
Implemented detailed AI-driven prompts to capture action, emotion, and scenery in each panel.
Adjusted narratives to maintain story fluidity, ensuring smooth transitions between panels.
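A prompt for describing a panel might be assembled along these lines. The wording of the prompt and the function `build_panel_prompt` are illustrative assumptions. The project's real prompts are not reproduced here. Passing the previous panel's description is one possible way to encourage smooth transitions.

```python
def build_panel_prompt(
    panel_number: int,
    dialogue: list[str],
    previous_description: str = "",
) -> str:
    """Assemble a text prompt asking a model to describe one comic panel."""
    parts = [
        f"Describe panel {panel_number} of a comic for a listener who cannot see it.",
        "Cover the action, the characters' emotions, and the scenery.",
    ]
    if previous_description:
        # Give the model context so the new description continues the story.
        parts.append(f"The previous panel was described as: {previous_description}")
        parts.append("Continue from it without repeating details.")
    if dialogue:
        parts.append("Dialogue in this panel: " + " | ".join(dialogue))
    return "\n".join(parts)


prompt = build_panel_prompt(
    panel_number=2,
    dialogue=["Placeholder dialogue line."],
    previous_description="Placeholder description of panel 1.",
)
print(prompt)
```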
Employed OpenAI's Text-to-Speech to voice characters, customizing timbres and tones for a rich auditory experience.
Each character was given a unique voice, while maintaining a neutral, engaging tone for narration.
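The voice assignment could be expressed as a simple lookup with the narrator's voice as the fallback. This sketch only builds the request parameters and does not call any API. The character-to-voice mapping is hypothetical. `tts-1`, `alloy`, and `onyx` are model and voice names OpenAI has offered for text-to-speech, but the choices the project actually made are not stated in this README.

```python
NARRATOR_VOICE = "alloy"  # assumed neutral narration voice

# Hypothetical mapping of characters to voices.
CHARACTER_VOICES = {
    "Daredevil": "onyx",
}


def build_tts_request(speaker: str, text: str) -> dict:
    """Return keyword arguments for a text-to-speech request for one line."""
    voice = CHARACTER_VOICES.get(speaker, NARRATOR_VOICE)
    return {"model": "tts-1", "voice": voice, "input": text}


print(build_tts_request("Daredevil", "Placeholder line."))
print(build_tts_request("Narrator", "Placeholder narration."))
```

The keys `model`, `voice`, and `input` mirror the parameters of OpenAI's speech endpoint, so the dictionary could be unpacked into a client call once credentials are configured.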
Combined narration and dialogue effectively, using strategic pauses and synchronization to enhance the auditory experience.
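Joining narration and dialogue clips with pauses can be sketched with Python's standard `wave` module. This is a simplified stand-in for whatever audio tooling the project used. It assumes every clip is an uncompressed 16-bit PCM WAV with the same sample rate and channel count, and the helper names are hypothetical.

```python
import io
import wave


def silence(seconds: float, framerate: int, sample_width: int = 2, channels: int = 1) -> bytes:
    """Return raw PCM silence; zero bytes are silent for 16-bit samples."""
    return b"\x00" * (int(seconds * framerate) * sample_width * channels)


def join_wav_clips(clips: list[bytes], pause_seconds: float = 0.5) -> bytes:
    """Concatenate same-format WAV clips with a pause between each pair."""
    if not clips:
        raise ValueError("at least one clip is required")
    output = io.BytesIO()
    params = None
    with wave.open(output, "wb") as out:
        for index, clip in enumerate(clips):
            with wave.open(io.BytesIO(clip), "rb") as src:
                if params is None:
                    # Copy the audio format from the first clip.
                    params = src.getparams()
                    out.setnchannels(params.nchannels)
                    out.setsampwidth(params.sampwidth)
                    out.setframerate(params.framerate)
                frames = src.readframes(src.getnframes())
            if index > 0:
                out.writeframes(
                    silence(pause_seconds, params.framerate, params.sampwidth, params.nchannels)
                )
            out.writeframes(frames)
    return output.getvalue()


def make_clip(seconds: float, framerate: int = 8000) -> bytes:
    """Build a silent mono 16-bit WAV clip, used here in place of real speech."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as clip:
        clip.setnchannels(1)
        clip.setsampwidth(2)
        clip.setframerate(framerate)
        clip.writeframes(silence(seconds, framerate))
    return buffer.getvalue()


combined = join_wav_clips([make_clip(1.0), make_clip(1.0)], pause_seconds=0.5)
with wave.open(io.BytesIO(combined), "rb") as result:
    total_seconds = result.getnframes() / result.getframerate()
print(total_seconds)
```

Two one-second clips joined with a half-second pause give a 2.5-second result. Varying `pause_seconds` per transition would allow longer pauses between panels than between lines within a panel.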
Planned next steps are to improve the website's visual design, responsiveness, and accessibility, and to test the tool with real users so that "Spell Bubble" can be refined based on their feedback and adapted to the needs of its audience.
This project's notebooks can be accessed and run either through Google Colab for convenience or locally for those who prefer it. Additionally, the interactive web application can be viewed and interacted with directly through CodeSandbox. Below are the instructions for accessing the project through these platforms:
To run the notebooks in Google Colab:
- Visit the Google Colab website.
- Sign in with your Google account if not already signed in.
- Click on 'File' > 'Open notebook' > 'GitHub' tab.
- Enter the URL of this repository and select the notebook you want to run.
To run the notebooks locally:
- Ensure you have Jupyter Notebook installed. If not, you can install it with pip by running the following command in your terminal:
pip install notebook
- Clone this repository by running the following command in your terminal:
git clone
- Navigate to the cloned repository directory.
- Launch Jupyter Notebook by running:
jupyter notebook
- Navigate to the DATA folder to access the necessary datasets.
- Open the .ipynb files within the Jupyter environment to start running them step by step.
To interact with the web application:
- Visit the Spell Bubble project on CodeSandbox.
- Explore the functionalities directly from your browser without any installations.
Contributions to "Spell Bubble" are welcome! Please read CONTRIBUTING.md for details on our code of conduct and the process for submitting pull requests.
This project is licensed under the Creative Commons Attribution 4.0 International License - see the LICENSE.md file for details.
Thanks to Leah Brochu and Rachel Osolen for their guidelines on Comic Book / Graphic Novel Description, which greatly informed the descriptive processes in this project.