Implementing a Local Audio Transcription Service with OpenAI Whisper and Docker

Posted on 29th Jul 2024

As an IT professional, finding efficient and innovative solutions to unique project requirements is part of the job. Recently, I was tasked with setting up a service that could transcribe audio into text locally, without relying on an internet connection. This service needed to run on a Linux system, ensuring reliability and security in a contained environment.

Solution Exploration

After some research, I identified OpenAI Whisper as a potential solution. I tested Whisper on my laptop to ensure its effectiveness, and it performed exceptionally well. However, for deployment on the final system, I decided to use Docker to streamline the setup and ensure ease of use and portability.

Leveraging Docker for Seamless Deployment

Docker is a powerful tool for creating, deploying, and managing containerized applications. By containerizing OpenAI Whisper, we can encapsulate the application and its dependencies, making it easier to deploy and maintain. During my search for existing Docker solutions, I discovered Ahmet Öner's ready-to-use Docker container that perfectly matched my requirements.

Docker Setup Documentation

Below is the detailed Docker documentation I created to set up the audio transcription service on a Linux server:

Credits

Docker Webservice by Ahmet Öner

Author website - https://ahmetoner.com
Documentation - https://ahmetoner.com/whisper-asr-webservice/run/

Setup

Docker

Check if you have Docker installed: docker -v
Install Docker if not installed: sudo snap install docker

Pull the Docker image

Check available Docker images: https://hub.docker.com/r/onerahmet/openai-whisper-asr-webservice
Get the image: sudo docker pull onerahmet/openai-whisper-asr-webservice:v1.5.0

Start the service

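There are GPU and CPU versions of the image.

See the available model list and replace base in ASR_MODEL=base in the docker run command with the model you prefer to use. Note that these commands reference the latest and latest-gpu tags rather than the v1.5.0 tag pulled above, so Docker downloads that image on first run if it is not already present locally. On a machine without internet access, run the tag you actually pulled.
CPU:
sudo docker run -d -p 9000:9000 -e ASR_MODEL=base -e ASR_ENGINE=openai_whisper onerahmet/openai-whisper-asr-webservice:latest
GPU:
sudo docker run -d --gpus all -p 9000:9000 -e ASR_MODEL=base -e ASR_ENGINE=openai_whisper onerahmet/openai-whisper-asr-webservice:latest-gpu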
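
If you start the container from a script, the two commands above differ only in the --gpus flag and the image tag. The following is a minimal sketch of that difference, not part of Ahmet Öner's project: the function name whisper_run_command and its parameters are hypothetical, and the flags are copied from the commands in this post.

```python
import shlex


def whisper_run_command(model="base", gpu=False, host_port=9000):
    """Assemble the docker run arguments from this post for the CPU or GPU image.

    Hypothetical helper: only the flags and tags shown in the post are used.
    """
    image_tag = "latest-gpu" if gpu else "latest"
    args = ["sudo", "docker", "run", "-d"]
    if gpu:
        # The GPU variant adds --gpus all and uses the -gpu image tag.
        args += ["--gpus", "all"]
    args += [
        "-p", f"{host_port}:9000",  # host port -> container port 9000
        "-e", f"ASR_MODEL={model}",
        "-e", "ASR_ENGINE=openai_whisper",
        f"onerahmet/openai-whisper-asr-webservice:{image_tag}",
    ]
    return args


# Print the CPU command as a single shell line for copy and paste.
print(shlex.join(whisper_run_command(model="base")))
```

The list form can be passed directly to subprocess.run, which avoids quoting problems if a value ever contains spaces.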

Use it

Open http://localhost:9000/
Expand the ASR drop-down
Click the "Try it out" button in the top right corner
Pick the language
Select an audio file
Click "Execute"
- Usually, it takes 1-2 minutes to transcribe a conversation that is a couple of minutes long.
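
The same request can be sent without the browser. Below is a rough sketch using only the Python standard library. It assumes the web page posts to an /asr endpoint with the file in a form field named audio_file and with task, language, and output query parameters; those names come from my reading of the webservice documentation linked in the Credits section, not from this post, so check them against the version you run. The function names are hypothetical.

```python
import os
import urllib.parse
import urllib.request
import uuid


def build_asr_url(base_url="http://localhost:9000", language="en", output="txt"):
    """Build the request URL. The /asr path and parameter names are assumptions."""
    query = urllib.parse.urlencode(
        {"task": "transcribe", "language": language, "output": output}
    )
    return f"{base_url.rstrip('/')}/asr?{query}"


def encode_audio_form(filename, audio_bytes, boundary=None):
    """Encode one file as multipart/form-data under the assumed field 'audio_file'."""
    boundary = boundary or uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="audio_file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + audio_bytes + tail, f"multipart/form-data; boundary={boundary}"


def transcribe(path, **url_options):
    """Send a local audio file to the running container and return the response text."""
    with open(path, "rb") as audio:
        body, content_type = encode_audio_form(os.path.basename(path), audio.read())
    request = urllib.request.Request(
        build_asr_url(**url_options),
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    # Generous timeout: transcription can take a minute or two per recording.
    with urllib.request.urlopen(request, timeout=600) as response:
        return response.read().decode("utf-8")
```

With the container running, transcribe("meeting.wav", language="en") would return the transcript as plain text; the file name here is only a placeholder.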

Docker commands

sudo docker ps - List running containers
sudo docker images - List available images
sudo docker pause container_name/ID - Pause all processes within one or more containers
sudo docker unpause container_name/ID - Unpause all processes within one or more containers
sudo docker start container_name/ID - Start one or more stopped containers
sudo docker stop container_name/ID - Stop one or more running containers
sudo docker restart container_name/ID - Restart one or more containers
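
Most of these commands need a container name or ID. As a small sketch of how to look it up from a script, the code below runs docker ps -a with a tab-separated --format template and keeps the rows whose image matches the Whisper webservice. The helper names are hypothetical, and the -a flag is my addition so that stopped containers (the ones docker start needs) are listed too.

```python
import subprocess

IMAGE_PREFIX = "onerahmet/openai-whisper-asr-webservice"


def parse_ps_output(text, image_prefix=IMAGE_PREFIX):
    """Return (container_id, status) pairs for rows whose image starts with the prefix."""
    matches = []
    for line in text.splitlines():
        parts = line.split("\t")
        if len(parts) == 3 and parts[1].startswith(image_prefix):
            matches.append((parts[0], parts[2]))
    return matches


def find_whisper_containers():
    """List all containers (running or stopped) created from the Whisper image."""
    result = subprocess.run(
        ["sudo", "docker", "ps", "-a",
         "--format", "{{.ID}}\t{{.Image}}\t{{.Status}}"],
        capture_output=True,
        text=True,
        check=True,  # raise if docker itself reports an error
    )
    return parse_ps_output(result.stdout)
```

Each returned ID can then be used in place of container_name/ID in the commands above.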
