Deepgram Go SDK

Official Go SDK for Deepgram. Start building with our powerful transcription & speech understanding API.

SDK Documentation

This SDK implements the Deepgram API found at https://developers.deepgram.com.

Documentation for specifics about the structs, interfaces, and functions of this SDK can be found here: Go SDK Documentation

For documentation relating to Speech-to-Text from Live/Streaming Audio:

For documentation relating to Speech-to-Text (and Intelligence) from PreRecorded Audio:

For documentation relating to Text-to-Speech:

For documentation relating to Text Intelligence:

For documentation relating to Manage API:

Getting an API Key

🔑 To access the Deepgram API you will need a free Deepgram API Key.

Installation

To incorporate this SDK into your project's go.mod file, run the following command from your repo:

go get github.com/deepgram/deepgram-go-sdk

Requirements

Go (version ^1.19)

Quickstarts

This SDK aims to reduce complexity and abstract away internal Deepgram details that clients shouldn't need to know about. However, you can still tweak options and settings if you need to.

Speech-to-Text from Live/Streaming Audio Quickstart

You can find a walkthrough on our documentation site. Transcribing Live Audio can be done using the following sample code:

// options
transcriptOptions := &interfaces.LiveTranscriptionOptions{
    Language:    "en-US",
    Punctuate:   true,
    Encoding:    "linear16",
    Channels:    1,
    Sample_rate: 16000,
}

// create the client
dgClient, err := client.NewWebSocketWithDefaults(ctx, transcriptOptions, callback)
if err != nil {
    log.Println("ERROR creating LiveTranscription connection:", err)
    os.Exit(1)
}

// call connect!
wsconn := dgClient.Connect()
if wsconn == nil {
    log.Println("Client.Connect failed")
    os.Exit(1)
}

Speech-to-Text from PreRecorded Audio Quickstart

You can find a walkthrough on our documentation site. Transcribing Pre-Recorded Audio can be done using the following sample code:

// context
ctx := context.Background()

//client
c := client.NewRESTWithDefaults()
dg := prerecorded.New(c)

// transcription options
options := &interfaces.PreRecordedTranscriptionOptions{
    Punctuate:  true,
    Diarize:    true,
    Language:   "en-US",
}

// send URL
URL := "https://my-domain.com/files/my-conversation.mp3"
res, err := dg.FromURL(ctx, URL, options)
if err != nil {
    // log.Fatalf logs the message and then exits with status 1
    log.Fatalf("FromURL failed. Err: %v", err)
}

Text-to-Speech WebSocket Quickstart

You can find a walkthrough on our documentation site. Converting text to speech over a WebSocket can be done using the following sample code:

// set the TTS options
ttsOptions := &interfaces.SpeakOptions{
    Model: "aura-asteria-en",
}

// create the callback
callback := MyCallback{}

// create a new WebSocket client using the default settings
dgClient, err := speak.NewWebSocketWithDefaults(ctx, ttsOptions, callback)
if err != nil {
    fmt.Println("ERROR creating TTS connection:", err)
    os.Exit(1)
}

// connect the websocket to Deepgram
bConnected := dgClient.Connect()
if !bConnected {
    fmt.Println("Client.Connect failed")
    os.Exit(1)
}

Text-to-Speech REST Quickstart

You can find a walkthrough on our documentation site. Converting text to speech using the REST API can be done using the following sample code:

// set the Text-to-Speech options
options := &interfaces.SpeakOptions{
    Model: "aura-asteria-en",
}

// create a Deepgram client
c := client.NewRESTWithDefaults()
dg := api.New(c)

// send/process file to Deepgram
res, err := dg.ToSave(ctx, "Hello, World!", textToSpeech, options)
if err != nil {
    fmt.Printf("ToSave failed. Err: %v\n", err)
    os.Exit(1)
}

Examples

There are examples for every API call in this SDK. You can find all of these examples in the examples folder at the root of this repo.

These examples provide:

Speech-to-Text - Live Audio / WebSocket:

Speech-to-Text - PreRecorded / REST:

Speech-to-Text - Live Audio:

Text-to-Speech - WebSocket

Text-to-Speech - REST

Management API examples exercise the full CRUD operations for:

To run an example, set the DEEPGRAM_API_KEY environment variable, then cd into that example's folder and run: go run main.go.
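If an example fails at startup, a missing API key is a common cause. The sketch below shows one way to check for the `DEEPGRAM_API_KEY` variable before doing any work. `apiKeyFromEnv` is a hypothetical helper, not an SDK function; it takes the lookup function as a parameter only so it can be exercised without touching the real environment.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// apiKeyEnv is the environment variable the examples read the key from.
const apiKeyEnv = "DEEPGRAM_API_KEY"

// apiKeyFromEnv returns the API key, or an error if it is unset or empty.
// Hypothetical helper for illustration; not part of the SDK.
func apiKeyFromEnv(lookup func(string) (string, bool)) (string, error) {
	key, ok := lookup(apiKeyEnv)
	if !ok || key == "" {
		return "", errors.New(apiKeyEnv + " is not set")
	}
	return key, nil
}

func main() {
	if _, err := apiKeyFromEnv(os.LookupEnv); err != nil {
		fmt.Println("cannot run example:", err)
		return
	}
	fmt.Println("API key found; ready to run the example")
}
```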

Logging

This SDK provides logging as a means to troubleshoot and debug issues. By default, this SDK enables Information level messages and higher (i.e., Warning, Error, etc.) when you initialize the library as follows:

client.InitWithDefault()

To increase the logging verbosity for debugging or troubleshooting purposes, you can set the TRACE level by using this code:

// init library
client.Init(client.InitLib{
    LogLevel: client.LogLevelTrace,
})

Testing

TBD

Backwards Compatibility

Older SDK versions will receive Priority 1 (P1) bug support only. Security issues, both in our code and dependencies, are promptly addressed. Significant bugs without clear workarounds are also given priority attention.

Development and Contributing

Interested in contributing? We ❤️ pull requests!

To make sure our community is safe for all, be sure to review and agree to our Code of Conduct. Then see the Contribution guidelines for more information.

Getting Help

We love to hear from you, so if you have questions or comments, or find a bug in the project, let us know! You can either: