AI Project: Speech-to-Text with Hugging Face (OpenAI Whisper)

By Ahmed Nabil | April 10, 2026 | March 21, 2026

3D isometric illustration of sound waves entering a machine and exiting as 3D text, representing AI speech-to-text conversion.

You’ve used the Hugging Face pipeline to understand text and images. Now, let’s teach it to understand audio with speech-to-text.

We will use OpenAI’s Whisper, a state-of-the-art model for speech recognition, to build a script that can transcribe an audio file into text.

Step 1: Installation

You’ll need transformers and torch to run the model, plus librosa and soundfile to load audio files.

pip install transformers torch
pip install librosa soundfile
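
Before moving on, it can help to confirm that the packages installed correctly. The following is a minimal sketch that uses only the standard library; the helper name missing_packages is hypothetical and not part of any of these libraries.

```python
from importlib.util import find_spec

# Import names of the packages installed in Step 1
REQUIRED = ["transformers", "torch", "librosa", "soundfile"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```

If anything is reported as missing, rerun the matching pip install command in the same environment you use to run the script.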

Step 2: The Code

The pipeline handles feature extraction, model inference, and decoding for you. You only need to load an audio file and pass its samples to the pipeline.

import sys
from pathlib import Path

import librosa  # We'll use this to load and resample the audio
from transformers import pipeline

# 1. Load the pipeline
# The first run downloads the 'whisper-base' model; later runs use the local cache
transcriber = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-base"
)

# 2. Load your audio file
# (You'll need your own .wav or .mp3 file for this)
audio_file = "my_speech.wav"
if not Path(audio_file).is_file():
    print(f"Error: '{audio_file}' not found. Please provide an audio file.")
    sys.exit(1)

# Load the audio as mono and resample it to 16 kHz, the rate Whisper expects
speech_data, sample_rate = librosa.load(audio_file, sr=16000)

# 3. Transcribe!
# Pass the raw samples together with their sample rate
result = transcriber({"raw": speech_data, "sampling_rate": sample_rate})

# 4. Print the result
print("--- Transcription ---")
print(result["text"])
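
If you plan to transcribe more than one file, the steps above can be wrapped in a small function. This is a sketch under stated assumptions rather than part of the original script: the names transcribe_file, load_audio, and transcriber are hypothetical, and the loader and transcriber are passed in as arguments so the function can be tried with stand-ins before the real model is downloaded.

```python
from pathlib import Path


def transcribe_file(path, load_audio, transcriber, sample_rate=16000):
    """Load the audio at `path`, run it through `transcriber`, and return the text.

    load_audio:  a callable like librosa.load, returning (samples, sample_rate)
    transcriber: a callable like the speech recognition pipeline,
                 returning a dict with a "text" key
    """
    if not Path(path).is_file():
        raise FileNotFoundError(f"Audio file not found: {path}")

    # Resample to the requested rate while loading
    samples, rate = load_audio(path, sr=sample_rate)

    # Hand over the samples together with their sample rate
    result = transcriber({"raw": samples, "sampling_rate": rate})
    return result["text"]
```

With the real libraries, this would be called as transcribe_file("my_speech.wav", librosa.load, transcriber), reusing the transcriber pipeline created in the script.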

Step 3: The Result

If your my_speech.wav file contained someone saying “Python is the future,” the output will be:

--- Transcription ---
 Python is the future.
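
Note the leading space in the output above; the sample transcription starts with one. If you want clean text for further processing, a small helper along these lines could tidy it up. The name clean_transcript is hypothetical and not part of transformers.

```python
def clean_transcript(text):
    """Trim both ends and collapse runs of whitespace into single spaces."""
    return " ".join(text.split())


print(clean_transcript(" Python is the future."))  # prints: Python is the future.
```

In the script, this would be applied as clean_transcript(result['text']) before printing or saving the transcription.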

You’ve just built a working transcription script in a few lines of Python!

Key Takeaways

The Hugging Face pipeline can transcribe audio files into text with OpenAI’s Whisper model.
Install transformers and torch to run the model, plus librosa and soundfile to load audio.
The pipeline simplifies the task: load the audio at 16 kHz, pass it in, and read the text from the result.
For example, a ‘my_speech.wav’ recording of someone saying “Python is the future” produces the text ‘Python is the future.’
The whole script takes only a few lines of Python.

Ahmed Nabil

Python Engineer and the founder of Python Pro Hub. With a focus on modern data science (Polars), backend architecture (FastAPI/Django), and automation, Ahmed builds production-grade tutorials designed to take developers from absolute beginners to advanced software engineers.
