
AI Project: Object Detection with Hugging Face (DETR)

Figure: 3D isometric illustration of a street scene with glowing bounding boxes around objects, representing AI object detection with DETR.

We’ve taught our AI to classify an image (e.g., “This is a cat”). Now let’s teach it to find the cat.

Object Detection is a computer vision task that identifies what is in an image and where it is by drawing a “bounding box” around it.

Step 1: Installation

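You’ll need Pillow to load images and timm, which the DETR model relies on for its backbone network. The command below also installs transformers and torch.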

pip install transformers torch pillow timm
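To confirm the install worked before moving on, a quick import check can help. This is a minimal sketch using only the standard library; `missing_packages` and `REQUIRED` are hypothetical names chosen for illustration, not part of any of these libraries. Note that Pillow is imported under the name `PIL`.

```python
from importlib.util import find_spec

# Import names for the packages installed above.
# Pillow installs under the import name "PIL".
REQUIRED = ["transformers", "torch", "PIL", "timm"]


def missing_packages(names):
    """Return the import names from `names` that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```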

Step 2: The Code

We’ll use the object-detection pipeline with DETR (DEtection TRansformer), an object detection model from Facebook AI.

from transformers import pipeline
from PIL import Image
import requests # To get an image from the web

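# 1. Load the pipeline
# The first run downloads the DETR weights. Naming the checkpoint explicitly
# (the pipeline's default for this task) avoids the "no model supplied" warning.
detector = pipeline("object-detection", model="facebook/detr-resnet-50")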

# 2. Get an image
# Let's use a sample image URL
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
img = Image.open(requests.get(url, stream=True).raw)

# 3. Run the detector!
results = detector(img)

# 4. Print the results
print("--- Objects Found ---")
for obj in results:
    print(f"Label: {obj['label']}")
    print(f"Confidence: {obj['score']:.4f}")
    print(f"Location: {obj['box']}")
    print("-----")
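The loop above prints every detection the pipeline returns. If you only care about the most confident ones, you can filter the list first. The sketch below assumes each result is a dictionary with `label`, `score`, and `box` keys, as used in the loop above; `filter_detections` is a hypothetical helper, and the sample data is hand-written for illustration rather than real model output.

```python
def filter_detections(detections, min_score=0.9):
    """Keep detections whose confidence is at least min_score, highest first."""
    kept = [d for d in detections if d["score"] >= min_score]
    return sorted(kept, key=lambda d: d["score"], reverse=True)


# Hand-written sample data in the same shape as the pipeline results
# (illustrative values, not real model output).
sample = [
    {"label": "cat", "score": 0.97,
     "box": {"xmin": 10, "ymin": 20, "xmax": 110, "ymax": 220}},
    {"label": "dog", "score": 0.42,
     "box": {"xmin": 5, "ymin": 5, "xmax": 50, "ymax": 60}},
    {"label": "remote", "score": 0.99,
     "box": {"xmin": 30, "ymin": 40, "xmax": 80, "ymax": 70}},
]

for det in filter_detections(sample, min_score=0.9):
    print(f"{det['label']}: {det['score']:.2f}")
```

With the real pipeline you would pass `results` instead of `sample`. The pipeline call should also accept a `threshold` argument that does a similar cut before results are returned, so check the documentation for your installed transformers version.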

Step 3: The Result

The output is a list with one entry per detected object: its label, a confidence score between 0 and 1, and the pixel coordinates of its bounding box.

--- Objects Found ---
Label: remote
Confidence: 0.9982
Location: {'ymin': 74, 'xmin': 42, 'ymax': 118, 'xmax': 176}
-----
Label: cat
Confidence: 0.9960
Location: {'ymin': 19, 'xmin': 30, 'ymax': 375, 'xmax': 289}
-----
Label: cat
Confidence: 0.9952
Location: {'ymin': 12, 'xmin': 255, 'ymax': 375, 'xmax': 640}
-----
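The Location values are pixel coordinates, so you can draw them straight onto the image with Pillow. Here is a minimal sketch, assuming the `xmin`/`ymin`/`xmax`/`ymax` keys shown above; `draw_boxes` is a hypothetical helper, and the blank canvas and single detection are stand-ins so the example runs without downloading a model.

```python
from PIL import Image, ImageDraw


def draw_boxes(image, detections, color="red", width=3):
    """Return a copy of `image` with one labeled rectangle per detection."""
    annotated = image.copy()
    draw = ImageDraw.Draw(annotated)
    for det in detections:
        box = det["box"]
        corners = (box["xmin"], box["ymin"], box["xmax"], box["ymax"])
        draw.rectangle(corners, outline=color, width=width)
        # Put the label just inside the top-left corner of the box
        caption = f"{det['label']} {det['score']:.2f}"
        text_position = (box["xmin"] + width + 2, box["ymin"] + width + 2)
        draw.text(text_position, caption, fill=color)
    return annotated


# Stand-in image and detection (not real model output)
canvas = Image.new("RGB", (640, 480), "white")
fake_results = [
    {"label": "cat", "score": 0.99,
     "box": {"xmin": 30, "ymin": 19, "xmax": 289, "ymax": 375}},
]
annotated = draw_boxes(canvas, fake_results)
```

In the script from Step 2, you would call `draw_boxes(img, results)` and then save or display the returned image, for example with `annotated.save("detections.png")`.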

It found the remote and both cats! You can use this to count items, track objects in videos, and more.
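Counting is the simplest of these uses. The sketch below tallies detections per label with the standard library's `Counter`; `count_labels` is a hypothetical helper, and the list reuses the labels and scores from the output above with the boxes left out for brevity.

```python
from collections import Counter


def count_labels(detections):
    """Count how many detections were reported for each label."""
    return Counter(det["label"] for det in detections)


# Labels and scores copied from the output above; boxes omitted for brevity
results = [
    {"label": "remote", "score": 0.9982},
    {"label": "cat", "score": 0.9960},
    {"label": "cat", "score": 0.9952},
]

counts = count_labels(results)
for label, n in counts.most_common():
    print(f"{label}: {n}")
```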

Key Takeaways

  • This article shows how to use the Hugging Face object-detection pipeline to locate objects in images.
  • Object Detection identifies what is in an image and where it is by using bounding boxes.
  • Installation adds Pillow for image handling and timm, which the DETR model depends on.
  • Use the object-detection pipeline with the DETR model from Facebook AI to find objects.
  • The output provides a list of detected items, which can be used for counting and tracking in videos.
