
Machine Learning Project: Your First Classifier (Iris Dataset)


In our House Price project, we did Regression (predicting a number). Today, we’ll do Classification (predicting a category), walking through a complete project from loading the data to making a prediction.

Our goal: Build a model that can guess the species of an Iris flower (“setosa”, “versicolor”, or “virginica”) just by looking at its petal and sepal measurements.

Step 1: Load the Data

Scikit-learn ships with this classic dataset built in, so there is nothing to download.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

# Load data
iris = load_iris()
X = iris.data
y = iris.target
  • X is the feature data (150 rows, 4 columns: sepal length, sepal width, petal length, petal width, all in cm).
  • y is the target (0, 1, or 2, representing setosa, versicolor, and virginica respectively).
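
Before going further, it can help to peek at what was actually loaded. The snippet below is a small sketch that only uses attributes of the object returned by load_iris (feature_names, target_names) plus NumPy's bincount to count samples per class.

```python
import numpy as np
from sklearn.datasets import load_iris

iris = load_iris()
X = iris.data
y = iris.target

# Shape of the feature matrix: (rows, columns)
print(X.shape)  # (150, 4)

# Human-readable names for the 4 columns and the 3 classes
print(iris.feature_names)
print(iris.target_names)

# How many flowers belong to each species (index = class label)
class_counts = np.bincount(y)
print(class_counts)  # [50 50 50]
```

The classes are perfectly balanced, which is one reason plain accuracy is a reasonable metric for this dataset.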

Step 2: Train-Test Split

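We split the data so we can train on 80% and test on the unseen 20%. Setting random_state=42 makes the shuffle reproducible, so you get the same split every time you run the code.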

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
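
A quick sanity check on the split is to look at the sizes of the two parts. The sketch below also passes stratify=y, which is an optional addition not used in the code above: it asks train_test_split to keep the class proportions the same in both parts, which may matter more on smaller or imbalanced datasets.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# stratify=y is an assumption added for illustration; the tutorial's split omits it
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(X_train.shape, X_test.shape)  # (120, 4) (30, 4)

# With stratification, each species appears equally often in the test set
test_counts = np.bincount(y_test)
print(test_counts)  # [10 10 10]
```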

Step 3: Train the Model

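We’ll use a simple “K-Nearest Neighbors” classifier. For each new flower, it finds the 3 “closest” flowers in the training set (by Euclidean distance across the four measurements, which is scikit-learn’s default) and picks the most common species among them.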

model = KNeighborsClassifier(n_neighbors=3)
model.fit(X_train, y_train)
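
To make the idea concrete, here is a hand-rolled sketch of the same voting rule using NumPy. The function knn_predict_one and the toy arrays are hypothetical names made up for this illustration; this is not how scikit-learn implements it internally, and tie-breaking may differ from the library's.

```python
from collections import Counter

import numpy as np


def knn_predict_one(x, X_train, y_train, k=3):
    """Predict the label of one sample by majority vote of its k nearest neighbors."""
    # Euclidean distance from x to every training sample
    distances = np.linalg.norm(X_train - x, axis=1)
    # Indices of the k smallest distances
    nearest = np.argsort(distances)[:k]
    # Count the labels of those neighbors and return the most common one
    votes = Counter(y_train[nearest].tolist())
    return votes.most_common(1)[0][0]


# Toy data: two points near the origin (class 0), two far away (class 1)
X_toy = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
y_toy = np.array([0, 0, 1, 1])

toy_pred = knn_predict_one(np.array([0.2, 0.2]), X_toy, y_toy, k=3)
print(toy_pred)  # 0 (two of the three nearest neighbors are class 0)
```

Note that "training" a KNN model mostly means storing the data; the real work happens at prediction time.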

Step 4: Test the Model

y_pred = model.predict(X_test)

# See how accurate it was
acc = accuracy_score(y_test, y_pred)
print(f"Model Accuracy: {acc * 100:.2f}%")
# Output: Model Accuracy: 100.00% (It's a very easy dataset!)
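
A perfect score on a single 30-flower test set can be partly luck of the split. One common way to get a steadier estimate is cross-validation; the sketch below uses scikit-learn's cross_val_score with 5 folds (the choice of cv=5 is an assumption for illustration) on the same kind of model.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
model = KNeighborsClassifier(n_neighbors=3)

# Train and evaluate 5 times, each time holding out a different fifth of the data
scores = cross_val_score(model, X, y, cv=5)

print(scores)
print(f"Mean accuracy: {scores.mean() * 100:.2f}%")
```

If the mean comes out a little below 100%, that is normal: versicolor and virginica overlap slightly, so a few flowers are hard to tell apart.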

Step 5: Make a New Prediction

Let’s predict a new flower we just found.

new_flower = [[5.1, 3.5, 1.4, 0.2]] # [sepal_L, sepal_W, petal_L, petal_W]
prediction = model.predict(new_flower)

species_name = iris.target_names[prediction[0]]
print(f"Prediction: This is a {species_name}!")
# Output: Prediction: This is a setosa!
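
If you want to know how sure the model is, KNeighborsClassifier also offers predict_proba, which returns the share of neighbor votes each species received. The sketch below rebuilds the model from the earlier steps so it runs on its own.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)
model = KNeighborsClassifier(n_neighbors=3)
model.fit(X_train, y_train)

new_flower = [[5.1, 3.5, 1.4, 0.2]]  # [sepal_L, sepal_W, petal_L, petal_W]

# One row per input flower, one column per species
proba = model.predict_proba(new_flower)[0]
for name, p in zip(iris.target_names, proba):
    print(f"{name}: {p:.2f}")
```

With n_neighbors=3, each probability can only be 0, 1/3, 2/3, or 1, since it is just a fraction of three votes.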
