
Recommender systems drive much of the modern internet, from Netflix to Amazon to Spotify. In this post, we’ll build a simple recommender system with Hugging Face tools and explore how it works. The simplest type is “content-based filtering,” which asks: “If you like this item, what other items are most similar to it?”
We can do this by:
- Loading a dataset (e.g., of movies with descriptions).
- Converting all descriptions into text embeddings (numeric vectors that capture the meaning of the text).
- Using a special library called FAISS to find the “nearest neighbors” (closest vectors) to a movie you like.
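To make “nearest neighbors” concrete, here is a minimal sketch of the kind of search FAISS performs at scale, written as a plain brute-force loop. The titles and 3-dimensional vectors are made up for illustration, and `l2_distance` and `nearest_neighbors` are hypothetical helper names, not part of any library.

```python
import math

# Toy 3-dimensional "embeddings", invented for this sketch.
# Real models such as all-MiniLM-L6-v2 produce 384-dimensional vectors.
toy_embeddings = {
    "Heist Thriller": [0.9, 0.1, 0.0],
    "Gangster Drama": [0.8, 0.2, 0.1],
    "Space Adventure": [0.1, 0.9, 0.2],
    "Romantic Comedy": [0.0, 0.2, 0.9],
}


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbors(query_title, embeddings, k=2):
    """Return the k (title, distance) pairs closest to the query, excluding the query itself."""
    query = embeddings[query_title]
    distances = [
        (title, l2_distance(query, vector))
        for title, vector in embeddings.items()
        if title != query_title
    ]
    distances.sort(key=lambda pair: pair[1])  # smallest distance first
    return distances[:k]


for title, distance in nearest_neighbors("Heist Thriller", toy_embeddings):
    print(f"{title}: {distance:.3f}")
```

Comparing the query against every vector like this is fine for four items but slows down as the catalog grows, which is the problem a dedicated similarity search library is meant to solve.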
Step 1: Installation
pip install datasets transformers sentence-transformers faiss-cpu # 'faiss-cpu' is Facebook's fast similarity search library
Step 2: Prepare the Data
We’ll load a movie dataset, get the embeddings for all descriptions, and add them to a FAISS “index.”
from datasets import load_dataset
from sentence_transformers import SentenceTransformer
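# 1. Load a small movie dataset (first 1,000 rows only)
# NOTE: if this name does not resolve, use the dataset's full Hub ID, which usually has the form "user/name"
ds = load_dataset("all-movies-from-1990s-TMDb", split='train').select(range(1000))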
# 2. Load an embedding model
model = SentenceTransformer('all-MiniLM-L6-v2')
# 3. Create embeddings for all movie overviews (This takes time!)
print("Creating embeddings...")
ds = ds.map(lambda x: {
    "embedding": model.encode(x["overview"])
})
# 4. Add embeddings to a FAISS index for fast search
ds.add_faiss_index(column="embedding")
print("FAISS index created!")
Step 3: Make a Recommendation!
Now, let’s pick a movie and find the 5 most similar movies.
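import numpy as np
# Let's find movies similar to 'Pulp Fiction' (ID=680)
query_movie = ds[20] # (Index 20 is Pulp Fiction in this dataset slice)
# The stored embedding comes back as a plain Python list; the search expects a float32 NumPy array
query_embedding = np.array(query_movie["embedding"], dtype=np.float32)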
# 5. Search the index
# It finds the 5 closest embeddings to our query
scores, similar_movies = ds.get_nearest_examples("embedding", query_embedding, k=5)
# 6. Print results
print(f"--- Movies similar to: {query_movie['title']} ---")
for movie in similar_movies['title']:
    print(movie)
Output:
--- Movies similar to: Pulp Fiction ---
Pulp Fiction
Reservoir Dogs
Four Rooms
Natural Born Killers
From Dusk Till Dawn
The model has “understood” the vibe of Pulp Fiction: working only from the overview text, it found other 90s crime films associated with Quentin Tarantino, whether as director or as writer.
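Note that the first result in the output is Pulp Fiction itself, since every movie is its own nearest neighbor. One possible way to handle this is to request one extra result (`k=6`) and filter out the query afterwards. The sketch below assumes the returned scores are distances where smaller means closer, which is my understanding of the default index settings. `drop_query` is a hypothetical helper, and the score values are invented for illustration rather than taken from a real run.

```python
def drop_query(titles, scores, query_title):
    """Return (title, score) pairs with the query movie itself removed."""
    return [(t, s) for t, s in zip(titles, scores) if t != query_title]


# Hypothetical values shaped like the search results; the scores are made up.
titles = [
    "Pulp Fiction",
    "Reservoir Dogs",
    "Four Rooms",
    "Natural Born Killers",
    "From Dusk Till Dawn",
]
scores = [0.0, 0.81, 0.93, 1.02, 1.10]

for title, score in drop_query(titles, scores, "Pulp Fiction"):
    print(f"{title} (distance {score:.2f})")
```

In the tutorial code, the same helper could be called with `similar_movies['title']`, `scores`, and `query_movie['title']`.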
Key Takeaways
- Recommender systems power platforms like Netflix, Amazon, and Spotify by suggesting similar items.
- Content-based filtering identifies items that are most similar to those users already like.
- To build a recommender system with Hugging Face tools, load a dataset, convert descriptions into text embeddings, and use FAISS to find the nearest neighbors.
- The process involves installation, data preparation, and then recommending movies similar to one the user already likes.





