This project is a web application that allows users to upload an audio file, which is then transcribed using the Rev.ai Speech-to-Text API. The transcribed text is saved to a .txt
file on the server. The application includes a frontend that enables users to download the transcription once it's ready.
- Frontend: React application for uploading audio files with polling to check transcription status and enabling download of the transcription file.
- Backend: Node.js with Express for handling file uploads and interacting with Rev.ai.
- Transcription: Uses Rev.ai API for converting audio to text.
- Webhook: Configured to receive transcription results.
- Docker: Containerizes the application for easy deployment.
- Node.js (v16 or later)
- Docker
- Rev.ai API Key
- ngrok Auth Token (handled internally by the application)
- Clone the repository:
  git clone https://github.com/nantes/audioTranscriber.git
  cd audioTranscriber/backend
- Install dependencies:
  npm install
- Create a .env file in the backend directory with the following content:
  REV_AI_API_KEY=your_rev_ai_api_key
  NGROK_AUTHTOKEN=your_ngrok_auth_token
  Replace your_rev_ai_api_key with the API key provided by Rev.ai, and your_ngrok_auth_token with your ngrok auth token.
- Run the backend locally:
  npm start
  The backend will be available at http://localhost:3000.
- Navigate to the frontend directory:
  cd ../frontend
- Install dependencies:
  npm install
- Run the frontend locally:
  npm start
  The frontend will be available at http://localhost:8080 by default.
- Create a .env file in the root of your project directory (if not already created) with the following content:
  REV_AI_API_KEY=your_rev_ai_api_key
  NGROK_AUTHTOKEN=your_ngrok_auth_token
  Replace your_rev_ai_api_key with the API key provided by Rev.ai, and your_ngrok_auth_token with your ngrok auth token.
- Build Docker images. Navigate to the root of your project directory and run:
  docker-compose build
- Start the application:
  docker-compose up
  This will start both the backend and frontend services. The application will handle ngrok integration internally.
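
Both the local and the Docker setup depend on the same two environment variables, so it can help to fail fast when one is missing. The following is a minimal sketch, not code from this repository: loadConfig and REQUIRED_VARS are hypothetical names, and the sketch assumes the variables have already been loaded into process.env (for example by Docker Compose or by a loader such as dotenv).

```javascript
// Hypothetical helper: verifies that the variables named in the .env file are set.
const REQUIRED_VARS = ["REV_AI_API_KEY", "NGROK_AUTHTOKEN"];

function loadConfig(env = process.env) {
  // Treat unset, empty, and whitespace-only values as missing.
  const missing = REQUIRED_VARS.filter(
    (name) => typeof env[name] !== "string" || env[name].trim() === ""
  );
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return {
    revAiApiKey: env.REV_AI_API_KEY,
    ngrokAuthToken: env.NGROK_AUTHTOKEN,
  };
}
```

Calling loadConfig() once at backend startup would surface a misconfigured .env file immediately, instead of at the first transcription request.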
- File Upload: Users upload an audio file through the React frontend.
- Transcription Request: The backend sends the file to Rev.ai for transcription.
- Polling: Once the upload is complete and Rev.ai responds, the frontend begins polling every 5 seconds to check if the transcription is complete.
- Webhook: Rev.ai sends the transcription result to the configured webhook URL.
- Download: When the transcription is ready, the frontend enables the user to download the .txt file containing the transcribed text from the transcriptions folder.
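
The polling step in the workflow above can be sketched as follows. This is an illustration under stated assumptions, not the frontend's actual code: pollUntilDone is a hypothetical name, the status values "pending", "completed", and "failed" are assumed, and the status check is passed in as a function because this README does not specify the backend's status endpoint.

```javascript
// Hypothetical polling helper. checkStatus is any async function that resolves
// to an object with a status field; the status values used here are assumptions.
async function pollUntilDone(checkStatus, { intervalMs = 5000, maxAttempts = 120 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = await checkStatus();
    if (result.status === "completed" || result.status === "failed") {
      return result;
    }
    // Wait before the next check; no wait is needed after the final attempt.
    if (attempt < maxAttempts) {
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`Transcription not ready after ${maxAttempts} checks`);
}
```

In the frontend, checkStatus would typically wrap a fetch call to the backend. Capping the number of attempts keeps the page from polling forever if the webhook never arrives.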
- The transcriptions folder is where the transcribed text files are saved.
- The frontend performs polling to check for transcription status updates every 5 seconds.
- Ensure that Docker is properly installed and running on your system for containerized deployments.
- The backend listens on port 3000 and the frontend listens on port 8080 by default.
- The .env file is required both for local development and Docker deployment to provide necessary environment variables.
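
To illustrate how a webhook notification could be mapped to a file in the transcriptions folder, here is a small sketch. It rests on several assumptions that this README does not confirm: that the webhook body carries a job object with id and status fields, that a status of "transcribed" means the text is ready, and that files are named after the job id. The function name describeWebhookJob is hypothetical.

```javascript
// Assumed payload shape: { job: { id: "...", status: "transcribed" | "failed" } }.
// Naming the output file after the job id is an assumption, not the project's documented scheme.
function describeWebhookJob(payload, transcriptionsDir = "transcriptions") {
  const job = payload && payload.job;
  if (!job || typeof job.id !== "string") {
    throw new Error("Webhook payload has no job id");
  }
  // Keep only letters, digits, underscores, and hyphens so the id cannot point outside the folder.
  const safeId = job.id.replace(/[^A-Za-z0-9_-]/g, "");
  if (safeId === "") {
    throw new Error("Webhook job id contains no usable characters");
  }
  return {
    jobId: job.id,
    ready: job.status === "transcribed",
    filePath: `${transcriptionsDir}/${safeId}.txt`,
  };
}
```

An Express route handling the webhook could call this with the parsed request body, then write the transcript to filePath when ready is true.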
- Rev.ai for the Speech-to-Text API
- Docker for containerizing the application
- React and Express for the frontend and backend frameworks