
AI Project: Build an Image Classifier with Hugging Face (Vision)


We’ve used Hugging Face to understand text and to generate text. Now, let’s use it to understand images through image classification.

Image classification is an AI task in which a model looks at a picture and predicts a label for what it shows. The transformers library makes this just as easy as our text projects.

Step 1: Installation

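You’ll need <a href="https://pypi.org/project/pillow/" type="link" id="https://pypi.org/project/pillow/">Pillow</a> to handle the images, and timm, a library that many vision models use as a backend. The ViT model used in this project is implemented in transformers itself, so timm is optional here, but installing it now saves trouble if you later try a model that depends on it.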

pip install transformers torch pillow timm
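
If you want to confirm the install worked before moving on, here is a minimal sketch that checks whether each package can be found. The helper name missing_packages and the REQUIRED mapping are illustrative assumptions, not part of any library. It only uses Python's standard library, so it runs even when something is missing.

```python
from importlib.util import find_spec

# Map each pip package name to the module name you import.
# Note the one mismatch: the pillow package is imported as PIL.
REQUIRED = {
    "transformers": "transformers",
    "torch": "torch",
    "pillow": "PIL",
    "timm": "timm",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose module cannot be found."""
    return [
        pip_name
        for pip_name, module_name in required.items()
        if find_spec(module_name) is None
    ]


missing = missing_packages()
if missing:
    print("Missing packages: pip install " + " ".join(missing))
else:
    print("All packages found.")
```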

Step 2: The Code

We will use the same pipeline as before, but tell it to do “image-classification”. We’ll use a model from Google called vit-base-patch16-224.

from transformers import pipeline
from PIL import Image

# 1. Load the image classification pipeline
# This will download the ViT (Vision Transformer) model
classifier = pipeline("image-classification", model="google/vit-base-patch16-224")

# 2. Open a local image
try:
    # (You'll need to download an image of a cat and save it as 'cat.jpg')
    img = Image.open('cat.jpg')
except FileNotFoundError:
    # Stop with a non-zero exit status and a readable message
    raise SystemExit("Error: 'cat.jpg' not found. Please download an image.")

# 3. Classify the image!
results = classifier(img)

# 4. Print the results
print("--- AI Classification Results ---")
for result in results:
    print(f"Label: {result['label']}")
    print(f"Confidence: {result['score']:.4f}")
    print("-----")
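
The model name used above, vit-base-patch16-224, hints at how the Vision Transformer reads a picture: the image is resized to 224 x 224 pixels and cut into square patches of 16 x 16 pixels, and the model treats those patches a bit like words in a sentence. The short sketch below just works through that arithmetic. The variable names are my own, chosen for illustration.

```python
# Values read off the model name "vit-base-patch16-224"
image_size = 224  # the input image is resized to 224 x 224 pixels
patch_size = 16   # each patch covers 16 x 16 pixels

# How many patches fit along one side, and in the whole image
patches_per_side = image_size // patch_size
num_patches = patches_per_side ** 2

print(patches_per_side)  # 14
print(num_patches)       # 196
```

So each image becomes a sequence of 196 patches for the model to process.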

Step 3: The Result

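By default, the pipeline returns the top 5 predictions, sorted from most to least confident. The exact labels and scores depend on your image, but the output will look something like this: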

--- AI Classification Results ---
Label: Egyptian cat
Confidence: 0.8654
-----
Label: tabby, tabby cat
Confidence: 0.0862
-----
...
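
In a real project you usually want one answer rather than a list. Since the results are a list of dictionaries with 'label' and 'score' keys, a small helper can pick the best one and refuse to answer when the model isn't confident. This is only a sketch: the name best_label and the 0.5 threshold are assumptions for illustration, and the sample data simply reuses the output shown above.

```python
def best_label(results, min_score=0.5):
    """Return the highest-scoring label, or None if it is below min_score."""
    if not results:
        return None
    top = max(results, key=lambda r: r["score"])
    return top["label"] if top["score"] >= min_score else None


# Same shape as the list returned by the classifier
sample = [
    {"label": "Egyptian cat", "score": 0.8654},
    {"label": "tabby, tabby cat", "score": 0.0862},
]

print(best_label(sample))                 # Egyptian cat
print(best_label(sample, min_score=0.9))  # None
```

With your own image, you would pass in the results variable from the script instead of sample.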

You just ran a powerful, pre-trained computer vision model in a few lines of code!
