readme updates #338

Merged: 4 commits, Nov 30, 2023
4 changes: 2 additions & 2 deletions .github/workflows/c-demos.yml
@@ -5,9 +5,9 @@ on:
push:
branches: [ master ]
paths:
- '!demo/c/README.md'
- '.github/workflows/c-demos.yml'
- 'demo/c/**'
- '!demo/c/README.md'
- 'include/**'
- 'lib/common/**'
- 'lib/jetson/**'
@@ -20,9 +20,9 @@ on:
pull_request:
branches: [ master, 'v[0-9]+.[0-9]+' ]
paths:
- '!demo/c/README.md'
- '.github/workflows/c-demos.yml'
- 'demo/c/**'
- '!demo/c/README.md'
- 'include/**'
- 'lib/common/**'
- 'lib/jetson/**'
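
The reordering above matters because GitHub Actions evaluates `paths` patterns in order: a matching negative pattern (prefixed with `!`) placed after a positive match excludes the path, while a positive match placed after a negative one includes it again. The sketch below illustrates that last-match-wins rule. It assumes Python's `fnmatch` is a close enough stand-in for GitHub's glob matching on these particular patterns, and `path_is_included` is a hypothetical helper name, not part of any GitHub tooling.

```python
from fnmatch import fnmatchcase


def path_is_included(path, patterns):
    """Apply the patterns in order; the last pattern that matches decides."""
    included = False
    for pattern in patterns:
        negated = pattern.startswith("!")
        glob = pattern[1:] if negated else pattern
        # fnmatchcase is only an approximation of GitHub's glob rules
        if fnmatchcase(path, glob):
            included = not negated
    return included


# Pattern order before and after this change (subset of the c-demos.yml list)
old_order = ["!demo/c/README.md", ".github/workflows/c-demos.yml", "demo/c/**"]
new_order = [".github/workflows/c-demos.yml", "demo/c/**", "!demo/c/README.md"]

for name, patterns in (("old", old_order), ("new", new_order)):
    print(name, path_is_included("demo/c/README.md", patterns))
# prints "old True" then "new False"
```

With the old order, the exclusion is overridden by the later `demo/c/**` match, so a README-only change would still trigger the workflow. With the new order, the exclusion comes last and takes effect.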
6 changes: 4 additions & 2 deletions .github/workflows/react-native-tests.yml
@@ -4,17 +4,19 @@ on:
push:
branches: [ master ]
paths:
- '.github/workflows/react-native-tests.yml'
- 'binding/react-native/**'
- '!binding/react-native/README.md'
- 'lib/common/**'
- '.github/workflows/react-native-tests.yml'
- 'resources/audio_samples/**'
- 'resources/.test/**'
pull_request:
branches: [ master, 'v[0-9]+.[0-9]+' ]
paths:
- '.github/workflows/react-native-tests.yml'
- 'binding/react-native/**'
- '!binding/react-native/README.md'
- 'lib/common/**'
- '.github/workflows/react-native-tests.yml'
- 'resources/audio_samples/**'
- 'resources/.test/**'

121 changes: 68 additions & 53 deletions README.md
@@ -79,10 +79,10 @@ pip3 install pvleoparddemo
Run the following in the terminal:

```bash
leopard_demo_file --access_key ${ACCESS_KEY} --audio_paths ${AUDIO_PATH}
leopard_demo_file --access_key ${ACCESS_KEY} --audio_paths ${AUDIO_FILE_PATH}
```

Replace `${ACCESS_KEY}` with yours obtained from Picovoice Console and `${AUDIO_PATH}` with a path to an audio file you
Replace `${ACCESS_KEY}` with yours obtained from Picovoice Console and `${AUDIO_FILE_PATH}` with a path to an audio file you
wish to transcribe.

### C Demo
@@ -96,12 +96,12 @@ cmake -S demo/c/ -B demo/c/build && cmake --build demo/c/build
Run the demo:

```console
./demo/c/build/leopard_demo -a ${ACCESS_KEY} -l ${LIBRARY_PATH} -m ${MODEL_PATH} ${AUDIO_PATH}
./demo/c/build/leopard_demo -a ${ACCESS_KEY} -l ${LIBRARY_PATH} -m ${MODEL_FILE_PATH} ${AUDIO_FILE_PATH}
```

Replace `${ACCESS_KEY}` with yours obtained from Picovoice Console, `${LIBRARY_PATH}` with the path to the appropriate
library under [lib](/lib), `${MODEL_PATH}` with the path to the [default model file](./lib/common/leopard_params.pv)
(or your custom one), and `${AUDIO_PATH}` with a path to an audio file you wish to transcribe.
library under [lib](/lib), `${MODEL_FILE_PATH}` with the path to the [default model file](./lib/common/leopard_params.pv)
(or your custom one), and `${AUDIO_FILE_PATH}` with a path to an audio file you wish to transcribe.

### iOS Demo

@@ -132,10 +132,10 @@ yarn global add @picovoice/leopard-node-demo
Run the following in the terminal:

```console
leopard-file-demo --access_key ${ACCESS_KEY} --input_audio_file_path ${AUDIO_PATH}
leopard-file-demo --access_key ${ACCESS_KEY} --input_audio_file_path ${AUDIO_FILE_PATH}
```

Replace `${ACCESS_KEY}` with yours obtained from Picovoice Console and `${AUDIO_PATH}` with a path to an audio file you
Replace `${ACCESS_KEY}` with yours obtained from Picovoice Console and `${AUDIO_FILE_PATH}` with a path to an audio file you
wish to transcribe.

For more information about Node.js demos go to [demo/nodejs](./demo/nodejs).
@@ -165,10 +165,10 @@ The demo requires `cgo`, which on Windows may mean that you need to install a gc
From [demo/go](./demo/go) run the following command from the terminal to build and run the file demo:

```console
go run filedemo/leopard_file_demo.go -access_key "${ACCESS_KEY}" -input_audio_path "${AUDIO_PATH}"
go run filedemo/leopard_file_demo.go -access_key "${ACCESS_KEY}" -input_audio_path "${AUDIO_FILE_PATH}"
```

Replace `${ACCESS_KEY}` with yours obtained from Picovoice Console and `${AUDIO_PATH}` with a path to an audio file you wish to transcribe.
Replace `${ACCESS_KEY}` with yours obtained from Picovoice Console and `${AUDIO_FILE_PATH}` with a path to an audio file you wish to transcribe.

For more information about Go demos go to [demo/go](./demo/go).

@@ -202,10 +202,10 @@ From [demo/java](./demo/java) run the following commands from the terminal to bu
cd demo/java
./gradlew build
cd build/libs
java -jar leopard-file-demo.jar -a ${ACCESS_KEY} -i ${AUDIO_PATH}
java -jar leopard-file-demo.jar -a ${ACCESS_KEY} -i ${AUDIO_FILE_PATH}
```

Replace `${ACCESS_KEY}` with yours obtained from Picovoice Console and `${AUDIO_PATH}` with a path to an audio file you wish to transcribe.
Replace `${ACCESS_KEY}` with yours obtained from Picovoice Console and `${AUDIO_FILE_PATH}` with a path to an audio file you wish to transcribe.

For more information about Java demos go to [demo/java](./demo/java).

@@ -217,10 +217,10 @@ file or on real-time microphone input.
From [demo/dotnet/LeopardDemo](./demo/dotnet/LeopardDemo) run the following in the terminal:

```console
dotnet run -c FileDemo.Release -- --access_key ${ACCESS_KEY} --input_audio_path ${AUDIO_PATH}
dotnet run -c FileDemo.Release -- --access_key ${ACCESS_KEY} --input_audio_path ${AUDIO_FILE_PATH}
```

Replace `${ACCESS_KEY}` with yours obtained from Picovoice Console and `${AUDIO_PATH}` with a path to an audio file you
Replace `${ACCESS_KEY}` with yours obtained from Picovoice Console and `${AUDIO_FILE_PATH}` with a path to an audio file you
wish to transcribe.

For more information about .NET demos, go to [demo/dotnet](./demo/dotnet).
@@ -233,10 +233,10 @@ file or on real-time microphone input.
From [demo/rust/filedemo](./demo/rust/filedemo) run the following in the terminal:

```console
cargo run --release -- --access_key ${ACCESS_KEY} --input_audio_path ${AUDIO_PATH}
cargo run --release -- --access_key ${ACCESS_KEY} --input_audio_path ${AUDIO_FILE_PATH}
```

Replace `${ACCESS_KEY}` with yours obtained from Picovoice Console and `${AUDIO_PATH}` with a path to an audio file you
Replace `${ACCESS_KEY}` with yours obtained from Picovoice Console and `${AUDIO_FILE_PATH}` with a path to an audio file you
wish to transcribe.

For more information about Rust demos, go to [demo/rust](./demo/rust).
@@ -294,14 +294,14 @@ Create an instance of the engine and transcribe an audio file:
```python
import pvleopard

handle = pvleopard.create(access_key='${ACCESS_KEY}')
leopard = pvleopard.create(access_key='${ACCESS_KEY}')

print(handle.process_file('${AUDIO_PATH}'))
print(leopard.process_file('${AUDIO_FILE_PATH}'))
```

Replace `${ACCESS_KEY}` with yours obtained from [Picovoice Console](https://console.picovoice.ai/) and
`${AUDIO_PATH}` with the path to an audio file. Finally, when done, be sure to explicitly release the resources using
`handle.delete()`.
`${AUDIO_FILE_PATH}` with the path to an audio file. Finally, when done, be sure to explicitly release the resources using
`leopard.delete()`.
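
One way to make sure `delete()` is always called, even when transcription raises, is to wrap the engine in a small context manager. This is a minimal sketch rather than part of the binding: `released` is a hypothetical helper name, and `FakeEngine` is a stand-in used only so the snippet runs without `pvleopard` or an AccessKey.

```python
from contextlib import contextmanager


@contextmanager
def released(engine):
    """Yield `engine` and call its `delete()` on exit, even after an error."""
    try:
        yield engine
    finally:
        engine.delete()


# With the real binding this would look like (requires `pvleopard`):
#
#     import pvleopard
#     with released(pvleopard.create(access_key='${ACCESS_KEY}')) as leopard:
#         print(leopard.process_file('${AUDIO_FILE_PATH}'))


class FakeEngine:
    """Hypothetical stand-in for the engine, for illustration only."""

    def __init__(self):
        self.deleted = False

    def process_file(self, path):
        return "transcript of " + path

    def delete(self):
        self.deleted = True


engine = FakeEngine()
with released(engine) as leopard:
    print(leopard.process_file("example.wav"))
print(engine.deleted)
```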

### C

@@ -314,17 +314,29 @@ Create an instance of the engine and transcribe an audio file:

#include "pv_leopard.h"

pv_leopard_t *handle = NULL;
bool automatic_punctuation = false;
pv_status_t status = pv_leopard_init("${ACCESS_KEY}", "${MODEL_PATH}", automatic_punctuation, &handle);
pv_leopard_t *leopard = NULL;
bool enable_automatic_punctuation = false;
bool enable_speaker_diarization = false;

pv_status_t status = pv_leopard_init(
"${ACCESS_KEY}",
"${MODEL_FILE_PATH}",
enable_automatic_punctuation,
enable_speaker_diarization,
&leopard);
if (status != PV_STATUS_SUCCESS) {
// error handling logic
}

char *transcript = NULL;
int32_t num_words = 0;
pv_word_t *words = NULL;
status = pv_leopard_process_file(handle, "${AUDIO_PATH}", &transcript, &num_words, &words);
status = pv_leopard_process_file(
leopard,
"${AUDIO_FILE_PATH}",
&transcript,
&num_words,
&words);
if (status != PV_STATUS_SUCCESS) {
// error handling logic
}
@@ -333,20 +345,21 @@ fprintf(stdout, "%s\n", transcript);
for (int32_t i = 0; i < num_words; i++) {
fprintf(
stdout,
"[%s]\t.start_sec = %.1f .end_sec = %.1f .confidence = %.2f\n",
"[%s]\t.start_sec = %.1f .end_sec = %.1f .confidence = %.2f .speaker_tag = %d\n",
words[i].word,
words[i].start_sec,
words[i].end_sec,
words[i].confidence);
words[i].confidence,
words[i].speaker_tag);
}

pv_leopard_transcript_delete(transcript);
pv_leopard_words_delete(words);
```

Replace `${ACCESS_KEY}` with yours obtained from Picovoice Console, `${MODEL_PATH}` with the path to the
[default model file](./lib/common/leopard_params.pv) (or your custom one), and `${AUDIO_PATH}` with the path to an audio file.
Finally, when done, be sure to release the acquired resources using `pv_leopard_delete(handle)`.
Replace `${ACCESS_KEY}` with yours obtained from Picovoice Console, `${MODEL_FILE_PATH}` with the path to the
[default model file](./lib/common/leopard_params.pv) (or your custom one), and `${AUDIO_FILE_PATH}` with the path to an audio file.
Finally, when done, be sure to release the acquired resources using `pv_leopard_delete(leopard)`.

### iOS

@@ -393,20 +406,20 @@ Create an instance of the engine and transcribe an audio file:
import ai.picovoice.leopard.*;

final String accessKey = "${ACCESS_KEY}"; // AccessKey obtained from Picovoice Console (https://console.picovoice.ai/)
final String modelPath = "${MODEL_FILE}";
final String modelPath = "${MODEL_FILE_PATH}";
try {
Leopard handle = new Leopard.Builder()
Leopard leopard = new Leopard.Builder()
.setAccessKey(accessKey)
.setModelPath(modelPath)
.build(appContext);

File audioFile = new File("${AUDIO_FILE_PATH}");
LeopardTranscript transcript = handle.processFile(audioFile.getAbsolutePath());
LeopardTranscript transcript = leopard.processFile(audioFile.getAbsolutePath());

} catch (LeopardException ex) { }
```

Replace `${ACCESS_KEY}` with yours obtained from Picovoice Console, `${MODEL_FILE}` with a custom trained model from [console](https://console.picovoice.ai/) or the [default model](./lib/common/leopard_params.pv), and `${AUDIO_FILE_PATH}` with the path to the audio file.
Replace `${ACCESS_KEY}` with yours obtained from Picovoice Console, `${MODEL_FILE_PATH}` with a custom trained model from [console](https://console.picovoice.ai/) or the [default model](./lib/common/leopard_params.pv), and `${AUDIO_FILE_PATH}` with the path to the audio file.

### Node.js

@@ -421,19 +434,19 @@ Create instances of the Leopard class:
```javascript
const Leopard = require("@picovoice/leopard-node");
const accessKey = "${ACCESS_KEY}" // Obtained from the Picovoice Console (https://console.picovoice.ai/)
let handle = new Leopard(accessKey);
let leopard = new Leopard(accessKey);

const result = handle.processFile('${AUDIO_PATH}');
const result = leopard.processFile('${AUDIO_FILE_PATH}');
console.log(result.transcript);
```

Replace `${ACCESS_KEY}` with yours obtained from [Picovoice Console](https://console.picovoice.ai/) and
`${AUDIO_PATH}` with the path to an audio file.
`${AUDIO_FILE_PATH}` with the path to an audio file.

When done, be sure to release resources using `release()`:

```javascript
handle.release();
leopard.release();
```

### Flutter
@@ -450,29 +463,29 @@ Create an instance of the engine and transcribe an audio file:
```dart
import 'package:leopard/leopard.dart';

const accessKey = "{ACCESS_KEY}" // AccessKey obtained from Picovoice Console (https://console.picovoice.ai/)
final String accessKey = '{ACCESS_KEY}' // AccessKey obtained from Picovoice Console (https://console.picovoice.ai/)

try {
Leopard _leopard = await Leopard.create(accessKey, '{LEOPARD_MODEL_PATH}');
Leopard _leopard = await Leopard.create(accessKey, '{MODEL_FILE_PATH}');
LeopardTranscript result = await _leopard.processFile("${AUDIO_FILE_PATH}");
print(result.transcript);
} on LeopardException catch (err) { }
```

Replace `${ACCESS_KEY}` with your `AccessKey` obtained from [Picovoice Console](https://console.picovoice.ai/), `${MODEL_FILE}` with a custom trained model from [Picovoice Console](https://console.picovoice.ai/) or the [default model](./lib/common/leopard_params.pv), and `${AUDIO_FILE_PATH}` with the path to the audio file.
Replace `${ACCESS_KEY}` with your `AccessKey` obtained from [Picovoice Console](https://console.picovoice.ai/), `${MODEL_FILE_PATH}` with a custom trained model from [Picovoice Console](https://console.picovoice.ai/) or the [default model](./lib/common/leopard_params.pv), and `${AUDIO_FILE_PATH}` with the path to the audio file.

### Go

Install the Go binding:

```console
go get github.com/Picovoice/leopard/binding/go
go get github.com/Picovoice/leopard/binding/go/v2
```

Create an instance of the engine and transcribe an audio file:

```go
import . "github.com/Picovoice/leopard/binding/go"
import . "github.com/Picovoice/leopard/binding/go/v2"

leopard = Leopard{AccessKey: "${ACCESS_KEY}"}
err := leopard.Init()
@@ -481,7 +494,7 @@ if err != nil {
}
defer leopard.Delete()

transcript, words, err := leopard.ProcessFile("${AUDIO_PATH}")
transcript, words, err := leopard.ProcessFile("${AUDIO_FILE_PATH}")
if err != nil {
// handle process error
}
@@ -490,7 +503,7 @@ log.Println(transcript)
```

Replace `${ACCESS_KEY}` with yours obtained from [Picovoice Console](https://console.picovoice.ai/) and
`${AUDIO_PATH}` with the path to an audio file. Finally, when done, be sure to explicitly release the resources using
`${AUDIO_FILE_PATH}` with the path to an audio file. Finally, when done, be sure to explicitly release the resources using
`leopard.Delete()`.

### React Native
@@ -511,7 +524,7 @@ const getAudioFrame = () => {
}

try {
const leopard = await Leopard.create("${ACCESS_KEY}", "${MODEL_FILE}")
const leopard = await Leopard.create("${ACCESS_KEY}", "${MODEL_FILE_PATH}")
const { transcript, words } = await leopard.processFile("${AUDIO_FILE_PATH}")
console.log(transcript)
} catch (err: any) {
@@ -521,7 +534,7 @@ try {
}
```

Replace `${ACCESS_KEY}` with your `AccessKey` obtained from Picovoice Console, `${MODEL_FILE}` with a custom trained model from [Picovoice Console](https://console.picovoice.ai/) or the [default model](./lib/common/leopard_params.pv) and `${AUDIO_FILE_PATH}` with the absolute path of the audio file.
Replace `${ACCESS_KEY}` with your `AccessKey` obtained from Picovoice Console, `${MODEL_FILE_PATH}` with a custom trained model from [Picovoice Console](https://console.picovoice.ai/) or the [default model](./lib/common/leopard_params.pv) and `${AUDIO_FILE_PATH}` with the absolute path of the audio file.
When done be sure to explicitly release the resources using `leopard.delete()`.

### Java
@@ -541,14 +554,14 @@ final String accessKey = "${ACCESS_KEY}";

try {
Leopard leopard = new Leopard.Builder().setAccessKey(accessKey).build();
LeopardTranscript result = leopard.processFile("${AUDIO_PATH}");
LeopardTranscript result = leopard.processFile("${AUDIO_FILE_PATH}");
leopard.delete();
} catch (LeopardException ex) { }

System.out.println(transcript);
System.out.println(result.getTranscriptString());
```

Replace `${ACCESS_KEY}` with yours obtained from [Picovoice Console](https://console.picovoice.ai/) and `${AUDIO_PATH}` with the path to an audio file. Finally, when done, be sure to explicitly release the resources using `leopard.delete()`.
Replace `${ACCESS_KEY}` with yours obtained from [Picovoice Console](https://console.picovoice.ai/) and `${AUDIO_FILE_PATH}` with the path to an audio file. Finally, when done, be sure to explicitly release the resources using `leopard.delete()`.

### .NET

@@ -564,14 +577,14 @@ Create an instance of the engine and transcribe an audio file:
using Pv;

const string accessKey = "${ACCESS_KEY}";
const string audioPath = "/absolute/path/to/audio_file";
const string audioPath = "${AUDIO_FILE_PATH}";

Leopard handle = Leopard.Create(accessKey);
Leopard leopard = Leopard.Create(accessKey);

Console.Write(handle.ProcessFile(audioPath));
Console.Write(leopard.ProcessFile(audioPath));
```

Replace `${ACCESS_KEY}` with yours obtained from [Picovoice Console](https://console.picovoice.ai/). Finally, when done release the resources using `handle.Dispose()`.
Replace `${ACCESS_KEY}` with yours obtained from [Picovoice Console](https://console.picovoice.ai/). Finally, when done release the resources using `leopard.Dispose()`.

### Rust

@@ -698,8 +711,10 @@ function App(props) {

## Releases

### v2.0.0 - November 27th, 2023
### v2.0.0 - November 30th, 2023

- Added speaker diarization feature
- Added React SDK
- Improvements to error reporting
- Upgrades to authorization and authentication system
- Improved engine accuracy