As an IT professional, I consider finding efficient solutions to unusual project requirements part of the job. Recently, I was tasked with setting up a service that could transcribe audio into text locally, without relying on an internet connection. The service needed to run on a Linux system, for reliability and security in a contained environment.
Solution Exploration
After some research, I identified OpenAI Whisper as a potential solution. I tested Whisper on my laptop to ensure its effectiveness, and it performed exceptionally well. However, for deployment on the final system, I decided to use Docker to streamline the setup and ensure ease of use and portability.
Leveraging Docker for Seamless Deployment
Docker is a powerful tool for creating, deploying, and managing containerized applications. By containerizing OpenAI Whisper, we can encapsulate the application and its dependencies, making it easier to deploy and maintain. During my search for existing Docker solutions, I discovered Ahmet Öner's ready-to-use Docker container that perfectly matched my requirements.
Docker Setup Documentation
Below is the detailed Docker documentation I created to set up the audio transcription service on a Linux server:
Credits
Docker Webservice by Ahmet Öner
- Author website - https://ahmetoner.com
- Documentation - https://ahmetoner.com/whisper-asr-webservice/run/
Setup
Docker
- Check if you have Docker installed
docker -v
- Install Docker if not installed
sudo snap install docker
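The check-then-install step above can be folded into a small script. This is only a minimal sketch: the helper name have_cmd is my own, and the script prints the install command from above rather than running it, so nothing is installed without review.

```shell
#!/bin/sh
# Return success if the named command is on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd docker; then
    # Docker is present: show its version, as in the manual check.
    docker -v
else
    echo "Docker not found; install it with: sudo snap install docker"
fi
```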
Pull the Docker image
- Check the available Docker image tags: https://hub.docker.com/r/onerahmet/openai-whisper-asr-webservice
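- Get the image (this pins v1.5.0; the run commands below use the latest and latest-gpu tags, which Docker pulls on first run if they are not already present)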
sudo docker pull onerahmet/openai-whisper-asr-webservice:v1.5.0
Start the service
There are GPU and CPU versions.
- See the available model list and replace base in the docker run command with the model you prefer to use.
- CPU
sudo docker run -d -p 9000:9000 -e ASR_MODEL=base -e ASR_ENGINE=openai_whisper onerahmet/openai-whisper-asr-webservice:latest
- GPU
sudo docker run -d --gpus all -p 9000:9000 -e ASR_MODEL=base -e ASR_ENGINE=openai_whisper onerahmet/openai-whisper-asr-webservice:latest-gpu
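Since the CPU and GPU commands differ only in the --gpus flag, the image tag, and the model name, they can be generated from one helper. The sketch below is my own convenience wrapper, not part of the upstream project: whisper_run_cmd is a hypothetical name, and it only prints the command so you can review it before running it.

```shell
#!/bin/sh
IMAGE="onerahmet/openai-whisper-asr-webservice"

# Print the docker run command for the given device (cpu or gpu) and model.
# The model defaults to "base", matching the commands above.
whisper_run_cmd() {
    device="$1"
    model="${2:-base}"
    case "$device" in
        cpu) echo "sudo docker run -d -p 9000:9000 -e ASR_MODEL=$model -e ASR_ENGINE=openai_whisper $IMAGE:latest" ;;
        gpu) echo "sudo docker run -d --gpus all -p 9000:9000 -e ASR_MODEL=$model -e ASR_ENGINE=openai_whisper $IMAGE:latest-gpu" ;;
        *)   echo "usage: whisper_run_cmd cpu|gpu [model]" >&2; return 1 ;;
    esac
}

# Print the CPU command with the "small" model for review.
whisper_run_cmd cpu small
```

Once the printed command looks right, copy it into the terminal to start the container.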
Use it
- Open
http://localhost:9000/
- Expand the ASR drop-down
- Click the "Try it out" button in the top right corner
- Pick the language
- Select audio file
- Click "Execute"
- Transcribing a conversation a couple of minutes long usually takes 1-2 minutes.
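The same request can be sent from the command line instead of the browser page. The sketch below assumes the service exposes a POST /asr endpoint that takes the file in an audio_file form field and accepts task, language, and output query parameters, which is how I read the upstream documentation linked in the Credits; check the page at http://localhost:9000/ for the exact parameters of your version. The function names asr_url and transcribe are my own.

```shell
#!/bin/sh
# Base URL of the local service (port 9000 from the docker run command).
ASR_BASE="${ASR_BASE:-http://localhost:9000}"

# Build the request URL for the /asr endpoint (endpoint and parameter
# names are assumptions taken from the upstream documentation).
asr_url() {
    language="${1:-en}"
    output="${2:-txt}"
    echo "$ASR_BASE/asr?task=transcribe&language=$language&output=$output"
}

# Upload an audio file as multipart form data and print the response.
transcribe() {
    curl -sS -X POST -F "audio_file=@$1" "$(asr_url "$2" "$3")"
}

# Usage (needs the container running and a real audio file):
#   transcribe meeting.wav en txt > meeting.txt
```

Keeping the URL construction in its own function makes it easy to inspect the request before any audio is uploaded.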
Docker commands
sudo docker ps
List running containers.
sudo docker images
List available images.
sudo docker pause container_name/ID
Pause all processes within one or more containers
sudo docker unpause container_name/ID
Unpause all processes within one or more containers
sudo docker start container_name/ID
Start one or more stopped containers
sudo docker stop container_name/ID
Stop one or more running containers
sudo docker restart container_name/ID
Restart one or more containers
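The commands above need a container name or ID. As a rough sketch, the ID can be looked up from the image name with docker ps and its ancestor filter, then passed to docker stop. The function names whisper_ids and whisper_stop are hypothetical, and the DOCKER variable is my own addition so that sudo can be dropped where it is not needed. The script relies on sh or bash word splitting.

```shell
#!/bin/sh
# Command used to invoke Docker; override it (e.g. DOCKER=docker) if sudo is not needed.
DOCKER="${DOCKER:-sudo docker}"

# Print the IDs of running containers created from the given image.
whisper_ids() {
    $DOCKER ps -q --filter "ancestor=$1"
}

# Stop every running container created from the given image.
whisper_stop() {
    ids="$(whisper_ids "$1")"
    if [ -z "$ids" ]; then
        echo "no running container for $1"
        return 0
    fi
    # $ids is left unquoted on purpose so each ID becomes its own argument.
    $DOCKER stop $ids
}

# Usage:
#   whisper_stop onerahmet/openai-whisper-asr-webservice:latest
```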