Integrating Speech-to-Text Functionality in Django Applications

Lawrence Jengar
Sep 28, 2024 03:26

Learn how to integrate Speech-to-Text into Django apps using the AssemblyAI API. Build an app that transcribes uploaded audio files and displays the transcriptions.

Adding Speech-to-Text to a Django application lets users transcribe audio directly within the app. According to AssemblyAI, developers can implement this feature with its API and Python SDK.

Setting Up the Project

To get started, create a new project folder and establish a virtual environment:

# Mac/Linux
python3 -m venv venv
. venv/bin/activate

# Windows
python -m venv venv
.\venv\Scripts\activate.bat

Next, install the necessary packages: Django, the AssemblyAI Python SDK, and python-dotenv:

pip install Django assemblyai python-dotenv

Creating the Django Project

Create a new Django project named ‘stt_project’ and a new app within it called ‘transcriptions’:

django-admin startproject stt_project
cd stt_project
python manage.py startapp transcriptions

Building the View

In the ‘transcriptions’ app, create a view to handle file uploads and transcriptions. Open transcriptions/views.py and add the following code:

from django.shortcuts import render
from django import forms
import assemblyai as aai

class UploadFileForm(forms.Form):
    audio_file = forms.FileField()

def index(request):
    context = None
    if request.method == 'POST':
        form = UploadFileForm(request.POST, request.FILES)
        if form.is_valid():
            file = request.FILES['audio_file']
            transcriber = aai.Transcriber()
            # transcribe() blocks until the transcript is ready
            transcript = transcriber.transcribe(file.file)
            file.close()
            # Pass either the text or the error message to the template
            context = {'transcript': transcript.text} if not transcript.error else {'error': transcript.error}
    return render(request, 'transcriptions/index.html', context)
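
The view above sends whatever file was uploaded straight to the API. As a minimal sketch, a check like the following could reject obviously unsuitable uploads first. The helper name check_audio_upload, the extension list, and the size cap are assumptions made for illustration; none of them come from Django or the AssemblyAI SDK.

```python
from pathlib import Path
from typing import Optional

# Hypothetical limits for illustration; adjust to what your app should accept.
ALLOWED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac", ".ogg"}
MAX_UPLOAD_BYTES = 50 * 1024 * 1024  # 50 MB, an arbitrary example cap


def check_audio_upload(filename: str, size: int) -> Optional[str]:
    """Return an error message for a rejected upload, or None if it looks acceptable."""
    if Path(filename).suffix.lower() not in ALLOWED_EXTENSIONS:
        return "Unsupported file type."
    if size <= 0:
        return "The file is empty."
    if size > MAX_UPLOAD_BYTES:
        return "The file is too large."
    return None
```

In the view, this could be called as check_audio_upload(file.name, file.size) before creating the transcriber, with any returned message placed in the context under 'error'.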

Defining URL Configuration

Map the view to a URL by creating transcriptions/urls.py:

from django.urls import path
from . import views

urlpatterns = [
    path('', views.index, name="index"),
]

Include this app URL pattern in the global project URL configuration in stt_project/urls.py:

from django.contrib import admin
from django.urls import include, path

urlpatterns = [
    path('', include('transcriptions.urls')),
    path('admin/', admin.site.urls),
]

Creating the HTML Template

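Inside the app, create the file transcriptions/templates/transcriptions/index.html, which matches the 'transcriptions/index.html' path the view renders. For Django to find templates in the app, 'transcriptions' must also be listed in INSTALLED_APPS in stt_project/settings.py. Give the file the following content:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>AssemblyAI Django App</title>
</head>
<body>
    <form method="post" enctype="multipart/form-data">
        {% csrf_token %}
        <input type="file" name="audio_file" accept="audio/*">
        <button type="submit">Upload</button>
    </form>
    <h2>Transcript:</h2>
    {% if error %}
        <p>{{ error }}</p>
    {% endif %}
    <p>{{ transcript }}</p>
</body>
</html>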

Setting the API Key

Store the AssemblyAI API key in a .env file in the root directory:

ASSEMBLYAI_API_KEY=your_api_key_here

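Load this environment variable in stt_project/settings.py. The AssemblyAI SDK reads ASSEMBLYAI_API_KEY from the environment, so the view does not need to set the key itself: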

from dotenv import load_dotenv
load_dotenv()
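
load_dotenv() does not complain when the .env file or the variable is missing, so a misconfigured key only surfaces later as a failed transcription. A minimal sketch of a fail-fast check follows; require_env is a hypothetical helper name, not part of python-dotenv or Django.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or blank."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; add it to your .env file.")
    return value
```

Calling require_env("ASSEMBLYAI_API_KEY") in settings.py right after load_dotenv() would stop the server at startup with a clear message instead of failing on the first upload.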

Running the Django App

Start the server using the following command:

python manage.py runserver

Visit the app in your browser (by default at http://127.0.0.1:8000/), upload an audio file, and see the transcribed text appear.

Non-blocking Implementations

The view above blocks until the transcription finishes, which ties up the request for the whole processing time. To avoid this, consider using webhooks or async functions. Webhooks notify you when the transcription is ready, while async calls allow the app to continue running during the transcription process.

Using Webhooks

Set a webhook URL in the transcription config and handle the webhook delivery in a separate view function:

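# Absolute URL with scheme and host; it must be reachable from the public internet
webhook_url = request.build_absolute_uri('/webhook/')
config = aai.TranscriptionConfig().set_webhook(webhook_url)
# submit() returns without waiting for the transcript
transcriber.submit(file.file, config)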

Define the webhook receiver:

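import json
from django.http import HttpResponse
from django.views.decorators.csrf import csrf_exempt

@csrf_exempt  # AssemblyAI's POST carries no CSRF token
def webhook(request):
    if request.method == 'POST':
        data = json.loads(request.body)
        transcript = aai.Transcript.get_by_id(data['transcript_id'])
    return HttpResponse(status=200)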
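
The receiver above trusts the request body, so a malformed delivery would raise an exception. The following is a minimal sketch of a more defensive parsing step. parse_webhook_body is a hypothetical helper, and the "status" field is an assumption about the payload; only "transcript_id" is taken from the code above.

```python
import json
from typing import Optional, Tuple


def parse_webhook_body(body: bytes) -> Optional[Tuple[str, str]]:
    """Return (transcript_id, status) from a webhook body, or None if it is malformed."""
    try:
        data = json.loads(body)
    except ValueError:  # covers invalid JSON and undecodable bytes
        return None
    if not isinstance(data, dict):
        return None
    transcript_id = data.get("transcript_id")
    status = data.get("status", "")  # assumed field; treated as optional here
    if not isinstance(transcript_id, str) or not transcript_id:
        return None
    return transcript_id, str(status)
```

A receiver could call parse_webhook_body(request.body) and return a 400 response when the result is None, fetching the transcript only for well-formed deliveries.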

Map this view to a URL:

urlpatterns = [
    path('', views.index, name="index"),
    path('webhook/', views.webhook, name="webhook"),
]

Using Async Functions

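Alternatively, call transcribe_async, which returns a future right away instead of waiting for the transcript. Check done() to see whether the result is ready; calling result() before that blocks until it is: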

transcript_future = transcriber.transcribe_async(file.file)
if transcript_future.done():
    transcript = transcript_future.result()
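
Inside an async Django view, polling done() is not necessary if the future can be awaited. The sketch below shows the general pattern using only the standard library, assuming the returned object is a concurrent.futures.Future. fake_transcribe and submit_transcription are stand-ins invented for this example and do not call the AssemblyAI SDK.

```python
import asyncio
import time
from concurrent.futures import Future, ThreadPoolExecutor


def fake_transcribe(path: str) -> str:
    """Stand-in for a slow, blocking transcription call (not the AssemblyAI SDK)."""
    time.sleep(0.1)
    return f"transcript of {path}"


def submit_transcription(executor: ThreadPoolExecutor, path: str) -> Future:
    """Start the work in a background thread and return a concurrent.futures.Future."""
    return executor.submit(fake_transcribe, path)


async def handle_upload(path: str) -> str:
    """Await the future so the event loop stays free while the work runs."""
    with ThreadPoolExecutor(max_workers=1) as executor:
        future = submit_transcription(executor, path)
        # wrap_future adapts a concurrent.futures.Future into an awaitable
        return await asyncio.wrap_future(future)
```

Under that assumption, the same asyncio.wrap_future call could wrap the object returned by transcribe_async in an async def view.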

Speech-to-Text Options for Django Apps

When implementing Speech-to-Text, consider cloud-based APIs like AssemblyAI or Google Cloud Speech-to-Text for high accuracy and scalability, or open-source libraries like SpeechRecognition and Whisper for greater control and privacy.

Conclusion

This guide shows how to integrate Speech-to-Text into Django apps using the AssemblyAI API. Developers can choose between blocking and non-blocking implementations and select the best Speech-to-Text solution based on their needs.

For more details, visit the AssemblyAI blog.
