
This is one of the most powerful concepts in modern AI, and the Hugging Face ecosystem makes it easy to put into practice. An “Embedding” is a way to turn a sentence into a list of numbers (a “vector”) that represents its meaning.
Why is this useful? You can compare two vectors to see how semantically similar two sentences are. This is the magic behind “semantic search,” RAG (retrieval-augmented generation), and finding “related documents.”
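To make “comparing vectors” concrete, here is a tiny sketch of cosine similarity, the comparison we will use later, on two made-up 3-dimensional vectors (real embeddings have hundreds of dimensions):

import math
# Two toy "embedding" vectors, invented for illustration only
a = [0.9, 0.1, 0.0]
b = [0.8, 0.2, 0.1]
dot = sum(x * y for x, y in zip(a, b))
norm_a = math.sqrt(sum(x * x for x in a))
norm_b = math.sqrt(sum(x * x for x in b))
# Cosine similarity: close to 1.0 = same direction (similar meaning), near 0 = unrelated
print(dot / (norm_a * norm_b))  # ~0.98 for these two toy vectors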
Step 1: Install
We need a special library, sentence-transformers, which is built on top of Hugging Face Transformers.
pip install sentence-transformers
Step 2: The Code
We will load a pre-trained model and use it to “encode” sentences.
from sentence_transformers import SentenceTransformer, util
import torch # We'll use torch to calculate similarity
# 1. Load a pre-trained model
# 'all-MiniLM-L6-v2' is a popular, fast, and good model
model = SentenceTransformer('all-MiniLM-L6-v2')
# 2. Sentences to encode
sentences = [
    "A man is eating a piece of bread.",
    "A person is consuming food.",
    "The cat is playing with a ball.",
    "A programmer is writing Python code."
]
# 3. Generate the embeddings!
embeddings = model.encode(sentences)
print(f"Shape of one embedding: {embeddings[0].shape}")
# Output: (384,) -> Each sentence is now a 384-dimensional vector
Step 3: Compare Similarity
Now, let’s see which sentences are “closest” in meaning. We’ll compare our first sentence to all the others.
# Convert embeddings to PyTorch tensors for similarity calculation
emb_tensors = torch.tensor(embeddings)
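# (Tip: model.encode(sentences, convert_to_tensor=True) returns tensors directly and lets you skip this step.)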
# 4. Calculate "Cosine Similarity" between sentence 0 and all others
# This returns a matrix of scores (ranging from -1.0 to 1.0; higher means more similar)
cosine_scores = util.cos_sim(emb_tensors[0], emb_tensors)
print("\nSimilarity of 'A man is eating a piece of bread.' to:")
print(f"- '{sentences[1]}': {cosine_scores[0][1]:.4f}")
print(f"- '{sentences[2]}': {cosine_scores[0][2]:.4f}")
print(f"- '{sentences[3]}': {cosine_scores[0][3]:.4f}")Output:
Similarity of 'A man is eating a piece of bread.' to:
- 'A person is consuming food.': 0.7554
- 'The cat is playing with a ball.': 0.0763
- 'A programmer is writing Python code.': -0.0121
The model correctly recognizes that “eating bread” is very similar to “consuming food,” while “a cat playing” and “writing code” score near zero, meaning they are essentially unrelated.
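These pairwise scores are the building block of the “semantic search” mentioned earlier. As a short sketch (reusing the model and sentences from above, with an invented example query), the library’s util.semantic_search helper embeds a query and returns the closest matches:

# A minimal semantic-search sketch, reusing `model`, `util`, and `sentences` from above
query = "What is the man having for lunch?"  # hypothetical query
query_embedding = model.encode(query, convert_to_tensor=True)
corpus_embeddings = model.encode(sentences, convert_to_tensor=True)
# Retrieve the top 2 most similar sentences to the query
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)
for hit in hits[0]:  # hits[0] holds the results for the first (and only) query
    print(f"{sentences[hit['corpus_id']]} (score: {hit['score']:.4f})")

This loop of embedding the query, embedding the documents, and returning the closest matches is essentially what the retrieval step of a RAG pipeline does.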
Key Takeaways
- An ‘Embedding’ converts a sentence into a vector that represents its meaning.
- Comparing these vectors (here, with cosine similarity) tells you how semantically similar two sentences are.
- To use this technique, install the ‘sentence-transformers’ library, which is built on top of Hugging Face.
- Then load a pre-trained model to encode sentences and compare their meanings.
- The model identifies that ‘eating bread’ is similar to ‘consuming food’, but not to ‘a cat playing’ or ‘writing code’.