AI Project: Deploy Your Fine-Tuned Model with Flask

3D isometric illustration of a fine-tuned robot serving predictions from inside a Flask-shaped server booth.

This is the capstone project. In Deploy Hugging Face API, we deployed a pre-trained pipeline. In Fine-Tuning: Part 3, you saved your own custom model. Now you'll deploy that fine-tuned model with Flask in your own application.

Let's combine them. We'll load your own fine-tuned model into a Flask server to create an API specialized for your task.

Step 1: Install Libraries

pip install flask transformers torch
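
Before moving on, it can help to confirm that all three packages are visible to the Python interpreter you will run the server with. The snippet below is a minimal sketch that uses only the standard library; the helper name `find_missing` is made up for illustration and is not part of Flask or Transformers.

```python
import importlib.util

# Packages installed in Step 1 (for these three, the import name matches the pip name)
REQUIRED = ("flask", "transformers", "torch")

def find_missing(packages=REQUIRED):
    """Return the packages that the current interpreter cannot locate."""
    return [name for name in packages if importlib.util.find_spec(name) is None]

if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

If anything is reported missing, the likely cause is that pip installed into a different environment than the one running the script.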

Step 2: The Flask Server (app.py)

This script will:

  1. Load your local, custom model and tokenizer (not from the hub).
  2. Wrap them in a pipeline for easy use.
  3. Create a /predict route to serve predictions.

from flask import Flask, request, jsonify
from transformers import pipeline, AutoModelForSequenceClassification, AutoTokenizer

# 1. Define your app and model path
app = Flask(__name__)
MODEL_PATH = "./my-awesome-model" # The folder you saved in Week 68

# 2. Load your *local* fine-tuned model and tokenizer
print("Loading custom model...")
try:
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_PATH)
    tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
    
    # 3. Create a pipeline with your custom model
    classifier = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer)
    print("Custom model loaded successfully!")

except OSError:  # EnvironmentError is a legacy alias of OSError in Python 3
    print(f"Error: Could not load model from {MODEL_PATH}")
    print("Please complete the fine-tuning articles first!")
    classifier = None

# 4. Define the API endpoint
@app.route("/predict", methods=['POST'])
def predict():
    if classifier is None:
        return jsonify({"error": "Model is not loaded"}), 500

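    # silent=True returns None instead of raising when the body is missing or not valid JSON
    data = request.get_json(silent=True)
    if not isinstance(data, dict) or 'text' not in data:
        return jsonify({"error": "Missing 'text' key"}), 400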
    
    # 5. Run prediction
    result = classifier(data['text'])
    return jsonify(result)

# 6. Run the app with Flask's built-in server (debug=True is for local development only)
if __name__ == "__main__":
    app.run(debug=True, port=5000)
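
If the server prints the "Could not load model" message at startup, the message alone does not say what is wrong with the folder. The sketch below is one way to narrow it down. It assumes the model and tokenizer were both saved with `save_pretrained()`, which writes `config.json` and `tokenizer_config.json`; the helper name `missing_model_files` is hypothetical, and the file list may need adjusting for your setup.

```python
from pathlib import Path

# Assumption: the folder was written by save_pretrained(), which produces
# config.json (model) and tokenizer_config.json (tokenizer).
EXPECTED_FILES = ("config.json", "tokenizer_config.json")

def missing_model_files(model_path, expected=EXPECTED_FILES):
    """Return the expected files that are absent from model_path."""
    folder = Path(model_path)
    return [name for name in expected if not (folder / name).is_file()]

if __name__ == "__main__":
    missing = missing_model_files("./my-awesome-model")
    if missing:
        print("Missing from model folder:", ", ".join(missing))
    else:
        print("Expected files were found.")
```

This only checks that the files exist, not that the weights are valid, so treat it as a first diagnostic rather than a guarantee that loading will succeed.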

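You now have a working API that serves your custom-trained model, ready to be called from any website or application. Note that app.run with debug=True starts Flask's development server. For real production traffic, turn debug off and run the app behind a production WSGI server such as Gunicorn or Waitress.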
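
To try the endpoint, start the server with `python app.py` and call it from a second terminal. Here is a minimal client sketch using only the standard library. The URL assumes the server is running locally on port 5000 as configured in app.py, and the helper names `build_request` and `predict` are illustrative, not part of any library.

```python
import json
import urllib.request

# Assumption: the server from app.py is running locally on port 5000
API_URL = "http://127.0.0.1:5000/predict"

def build_request(text, url=API_URL):
    """Build a POST request carrying {"text": ...} as a JSON body."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def predict(text, url=API_URL):
    """Send the text to the API and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(text, url), timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `predict("I really enjoyed this course!")` should return a list containing one dictionary with `label` and `score` keys. The label names depend on how your model was fine-tuned.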


Key Takeaways

  • This project involves deploying a fine-tuned model in a Flask server.
  • You load your own custom model and tokenizer from a local folder, not from the Hugging Face Hub.
  • The Flask server will create an API that serves predictions through a /predict route.
  • This setup allows you to use your trained AI from any website or application.
