https://trello.com/c/9jNbKNkz Simple FREE application to convert speech into text using Wit.ai
Nilesh Kumar committed Aug 15, 2024
1 parent 19b891d commit 20d4b4f
Showing 5 changed files with 111 additions and 12 deletions.
49 changes: 41 additions & 8 deletions README.md
@@ -5,7 +5,46 @@ Simple Python application to convert speech into text using Wit.ai
* `brew install portaudio`
* `brew install flac`

## Features
- Continuous speech recognition
- Improved accuracy using Wit.ai
- Ambient noise adjustment
- Error handling and user-friendly feedback

## Setup

Before using bhashan2pathtak, you need to set up a Wit.ai account and obtain a token:

1. Go to https://wit.ai/ and create an account if you haven't already.
2. Create a new Wit.ai app and copy your Client Access Token.
3. Set your token using one of these methods:
   a. Set an environment variable:
      ```
      export WIT_AI_TOKEN=your_token_here
      ```
   b. Create a `config.json` file in the directory where you'll run the application, with the following content:
      ```json
      {
        "WIT_AI_TOKEN": "your_token_here"
      }
      ```
## Installation
To use bhashan2pathtak as a package:
1. Install the package:
   ```
   pip install bhashan2pathtak
   ```
2. Run the application:
   ```
   bhashan2pathtak
   ```
Note: You still need to configure your Wit.ai token as described in the Setup section.
## Development Setup
1. Clone this repository
2. Install the required packages:
```
@@ -14,7 +53,7 @@ Simple Python application to convert speech into text using Wit.ai
Note: This project uses PyAudio 0.2.14. If you encounter issues with installation, try upgrading to this version.
3. Sign up for a Wit.ai account and create a new app to get an access token
-## Configuration
+## Development Configuration
To run this application, you need to provide your Wit.ai token. You have two options:
1. Environment Variable:
@@ -38,16 +77,10 @@ After setting up the configuration:
python3 speech_to_text.py
```
-## Features
-- Continuous speech recognition
-- Improved accuracy using Wit.ai
-- Ambient noise adjustment
-- Error handling and user-friendly feedback

## Troubleshooting
If you encounter any issues with PyAudio, make sure you have version 0.2.14 installed:
```
pip install PyAudio==0.2.14
```
For any other issues, please check the Wit.ai documentation or open an issue in this repository.
5 changes: 5 additions & 0 deletions pyproject.toml
@@ -0,0 +1,5 @@
[build-system]
requires = ["setuptools>=45", "wheel", "setuptools_scm[toml]>=6.2"]
build-backend = "setuptools.build_meta"

[tool.setuptools_scm]
39 changes: 39 additions & 0 deletions setup.py
@@ -0,0 +1,39 @@
from setuptools import setup, find_packages

with open("README.md", "r", encoding="utf-8") as fh:
    long_description = fh.read()

setup(
    name="bhashan2pathtak",
    version="0.1.0",
    author="Nilesh Kumar",
    author_email="[email protected]",
    description="A simple speech-to-text application using Wit.ai",
    long_description=long_description,
    long_description_content_type="text/markdown",
    url="https://github.com/nilukush/bhashan2pathtak",
    package_dir={"": "src"},
    packages=find_packages(where="src"),
    classifiers=[
        "Development Status :: 3 - Alpha",
        "Intended Audience :: Developers",
        "License :: OSI Approved :: MIT License",
        "Operating System :: OS Independent",
        "Programming Language :: Python :: 3",
        "Programming Language :: Python :: 3.7",
        "Programming Language :: Python :: 3.8",
        "Programming Language :: Python :: 3.9",
        "Programming Language :: Python :: 3.10",
    ],
    python_requires=">=3.7",
    install_requires=[
        "SpeechRecognition>=3.8.1",
        "PyAudio>=0.2.11",
        "wit>=6.0.0",
    ],
    entry_points={
        "console_scripts": [
            "bhashan2pathtak=bhashan2pathtak.speech_to_text:main",
        ],
    },
)
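The `console_scripts` entry above follows the `name=module:attr` form. As a rough illustration of how that string breaks down, the sketch below parses it with the standard library's `importlib.metadata.EntryPoint`. It assumes Python 3.9 or newer (for the `module` and `attr` properties); `parse_console_script` and `SPEC` are hypothetical names, not part of this commit.

```python
from importlib.metadata import EntryPoint

# The console-script declaration copied from setup.py.
SPEC = "bhashan2pathtak=bhashan2pathtak.speech_to_text:main"


def parse_console_script(spec: str) -> EntryPoint:
    """Turn a 'name=module:attr' string into an EntryPoint object."""
    name, _, value = spec.partition("=")
    return EntryPoint(name=name.strip(), value=value.strip(), group="console_scripts")


entry = parse_console_script(SPEC)
print(entry.name)    # the command name placed on PATH at install time
print(entry.module)  # the module the generated wrapper imports
print(entry.attr)    # the callable the wrapper invokes
```

Because nothing is imported from the package itself, this can be run before installation to check that the declaration points at the intended module and function.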
4 changes: 4 additions & 0 deletions src/bhashan2pathtak/__init__.py
@@ -0,0 +1,4 @@
from .speech_to_text import main

__version__ = "0.1.0"
__all__ = ["main"]
26 changes: 22 additions & 4 deletions speech_to_text.py → src/bhashan2pathtak/speech_to_text.py
@@ -51,7 +51,7 @@ def __ge__(self, other):
 def load_config():
     # Try to load from config file first
     try:
-        with open('config.json', 'r') as config_file:
+        with open('../../config.json', 'r') as config_file:
             return json.load(config_file)
     except FileNotFoundError:
         return {}
@@ -64,9 +64,26 @@ def get_wit_token():
         return token
 
     # If not in environment, try to get from config file
-    config = load_config()
-    return config.get('WIT_AI_TOKEN')
-
+    try:
+        with open('config.json', 'r') as config_file:
+            config = json.load(config_file)
+            token = config.get('WIT_AI_TOKEN')
+            if token:
+                return token
+    except FileNotFoundError:
+        pass
+
+    # If still no token, guide the user
+    print("Wit.ai token not found. Please follow these steps to set up your token:")
+    print("1. Go to https://wit.ai/ and create an account if you haven't already.")
+    print("2. Create a new Wit.ai app and copy your Client Access Token.")
+    print("3. Set your token using one of these methods:")
+    print(" a. Set an environment variable:")
+    print(" export WIT_AI_TOKEN=your_token_here")
+    print(" b. Create a config.json file in the current directory with the following content:")
+    print(" {\"WIT_AI_TOKEN\": \"your_token_here\"}")
+    print("\nAfter setting up your token, run this program again.")
+    sys.exit(1)
 
 def transcribe_wit(audio_data, wit_client):
     try:
@@ -83,6 +100,7 @@ def main():
     wit_token = get_wit_token()
     if not wit_token:
         print("Error: WIT_AI_TOKEN not found in environment variables or config file")
+        print("Please set the WIT_AI_TOKEN environment variable or create a config.json file")
         return
 
     recognizer = sr.Recognizer()
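In the hunks above, `load_config()` now opens `'../../config.json'` while `get_wit_token()` opens `'config.json'`. Both are relative paths, so each resolves against the current working directory rather than the package location. The lookup order (environment variable first, then `config.json`) could be gathered into one helper, sketched below. `find_wit_token` and its `config_dir` parameter are hypothetical names used for illustration only; the sketch also treats malformed JSON as "no token", which the committed code does not do.

```python
import json
import os
from pathlib import Path
from typing import Mapping, Optional


def find_wit_token(
    env: Mapping[str, str] = os.environ,
    config_dir: Optional[Path] = None,
) -> Optional[str]:
    """Return the token from the environment, else from config.json, else None."""
    token = env.get("WIT_AI_TOKEN")
    if token:
        return token

    # Assumption: config.json is looked up in one explicit directory,
    # defaulting to the current working directory.
    config_path = (config_dir or Path.cwd()) / "config.json"
    try:
        with config_path.open("r", encoding="utf-8") as config_file:
            config = json.load(config_file)
    except (FileNotFoundError, json.JSONDecodeError):
        return None
    if not isinstance(config, dict):
        return None
    return config.get("WIT_AI_TOKEN") or None
```

Returning `None` instead of calling `sys.exit(1)` leaves the decision about printing setup instructions to the caller, which would keep the `if not wit_token:` branch in `main()` reachable.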