**Bridging Scikit-LLM And Open-Source LLMs: A Practical Guide**

This guide walks you through using locally hosted language models via Ollama to handle text classification tasks—completely free, with no API costs involved.

Here’s what we’ll explore:

Setting up Ollama and downloading open-source models such as Llama 3, Mistral, and Gemma to run directly on your own machine.
Configuring the Scikit-LLM library so it sends requests to your local Ollama endpoint rather than a paid cloud-based API.
Building a zero-shot text classifier powered by a local large language model, using Scikit-LLM within a scikit-learn-style workflow you’re already familiar with.

Using Scikit-LLM with Open-Source LLMs

Introduction

In this guide, you’ll discover how to tackle language tasks like text classification by leveraging locally hosted large language models (LLMs) of reasonable size—such as Mistral, Gemma, and Llama 3—at zero cost. This is made possible through Ollama, a free platform for running LLMs locally, combined with the Scikit-LLM Python library.

Pre-requisite: Installing Ollama

We recommend using an IDE for this tutorial, since you’ll need to communicate with your locally installed Ollama instance from within it. If Ollama is new to you, we suggest reading through this introductory article first. That said, here’s a quick rundown of the commands you’ll run in your local terminal to download a local LLM after getting Ollama set up on your machine.

# Pulling Llama 3 (one of Ollama’s most popular downloadable models) ollama run llama3 # Or alternatively, try pulling Mistral ollama run mistral # Or, if you feel picky today, just pull Google’s Gemma ollama run gemma

# Pulling Llama 3 (one of Ollama’s most popular downloadable models)

ollama run llama3

# Or alternatively, try pulling Mistral

ollama run mistral

# Or, if you feel picky today, just pull Google’s Gemma

ollama run gemma

Once the model interaction window appears in your terminal, type “/bye” to leave it running in the background, ready to accept API calls. Then, in a fresh project within your Python IDE, make sure you have the following libraries installed:

pip install scikit-learn pandas scikit-llm

pip install scikit–learn pandas scikit–llm

If you run into a “Module not found” error when running the Python code, try installing each of the above dependencies individually.

Great! Now it’s time to start building out your Python code file (name it whatever you like!), step by step. As always, we begin with the imports. One of them is the ZeroShotGPTClassifier class. Much like traditional scikit-learn, this is a specialized class designed for training and using a model for zero-shot classification—specifically, an LLM served through Ollama.

import pandas as pd from sklearn.model_selection import train_test_split from skllm.config import SKLLMConfig from skllm.models.gpt.classification.zero_shot import ZeroShotGPTClassifier

import pandas as pd

from sklearn.model_selection import train_test_split

from skllm.config import SKLLMConfig

from skllm

.models.gpt.classification.zero_shot import ZeroShotGPTClassifier

Next, we need to apply a couple of specific configurations to be able to communicate with Ollama.

# Use this to tell Scikit-LLM to route cloud requests towards your default local Ollama port SKLLMConfig.set_gpt_url(” # Scikit-LLM needs, by default, a key to pass internal validation checks. # But because Ollama is local and free, this string will be ignored in practice. SKLLMConfig.set_openai_key(“local-ollama-is-free”)

# Use this to tell Scikit-LLM to route cloud requests towards your default local Ollama port

SKLLMConfig.set_gpt_url(“http://localhost:11434/v1”)

# Scikit-LLM needs, by default, a key to pass internal validation checks.

# But because Ollama is local and free, this string will be ignored in practice.

SKLLMConfig.set_openai_key(“local-ollama-is-free”)

After that, we create a small dataset and prepare it for classification. Since we are not going to evaluate the model’s classification performance in this tutorial — our main goal is to learn how to use Scikit-LLM locally with open-source models like those available through Ollama — we do not need a large number of data examples.

data = { “review”: [ “The new macOS update is fantastic and runs smoothly.”, “My battery is draining incredibly fast after the patch.”, “I need help resetting my account password.”, “The display on this monitor is breathtakingly crisp.”, “Customer support hung up on me, very disappointing.” ], “category”: [ “Positive Feedback”, “Technical Issue”, “Support Request”, “Positive Feedback”, “Negative Feedback” ] } df = pd.DataFrame(data) X = df[“review”] y = df[“category”] # Splitting data into train/test sets X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=42)

data = {

“review”: [

“The new macOS update is fantastic and runs smoothly.”,

“My battery is draining incredibly fast after the patch.”,

“I need help resetting my account password.”,

“The display on this monitor is breathtakingly crisp.”,

“Customer support hung up on me, very disappointing.”

“category”: [

“Positive Feedback”,

“Technical Issue”,

“Support Request”,

“Positive Feedback”,

“Negative Feedback”

]

}

df = pd.DataFrame(data)

X = df

[“review”]

y = df[“category”]

# Splitting data into train/test sets

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=42)

The dataset includes user reviews along with their assigned categories, such as different types of customer inquiries or feedback. As is standard practice in machine learning workflows, we divided the data into training and testing subsets.

In the following section of the code, we set up the necessary instructions to initialize and run our classifier. At its core, this will be a task-adapted instance of one of our locally installed Ollama models, such as Llama 3:

print(“Initializing ZeroShotGPTClassifier with local Llama 3…”) # Using the ‘custom_url::’ prefix to tell the system to use your “set_gpt_url” endpoint (see above) clf = ZeroShotGPTClassifier(model=”custom_url::llama3″) # Fitting the model clf.fit(X_train, y_train) print(“Sending data to Ollama for local inference…n”) predictions = clf.predict(X_test)

print(“Initializing ZeroShotGPTClassifier with local Llama 3…”)

# Using the ‘custom_url::’ prefix to tell the system to use your “set_gpt_url” endpoint (see above)

clf = ZeroShotGPTClassifier(model=“custom_url::llama3”)

# Fitting the model

clf.fit(X_train, y_train)

print(“Sending data to Ollama for local inference…n”)

predictions = clf.predict(X_test)

To wrap up, we display a few outputs showing the model’s inference results (classification predictions) for the two examples in the test set. Although this is a very small dataset, the goal here is to demonstrate how we successfully connected Scikit-LLM with a local, free Ollama model to efficiently use an LLM for a specific task at no cost!

for review, prediction in zip(X_test, predictions): print(f”Review Text: ‘{review}'”) print(f”Predicted Tag: {prediction}”) print(“-” * 50)

for review, prediction in zip(X_test, predictions):

print(f“Review Text: ‘{review}'”)

print(f“Predicted Tag: {prediction}”)

print(“-“ * 50)

The

Here’s the paraphrased version:

Sending data to Ollama for local inference… 100%|███████████████████████████████████████████████████████████| 2/2 [00:12<00:00, 6.36s/it] Review Text: ‘My battery is draining incredibly fast after the patch.’ Predicted Tag: Support Request ————————————————– Review Text: ‘Customer support hung up on me, very disappointing.’ Predicted Tag: Support Request ————————————————–

Sending data to Ollama for local inference...

100%|███████████████████████████████████████████████████████████| 2/2 [00:12<00:00, 6.36s/it]

Review Text: ‘My battery is draining incredibly fast after the patch.’

Predicted Tag: Support Request

—————————————————————————

Review Text: ‘Customer support hung up on me, very disappointing.’

Predicted Tag: Support Request

—————————————————————————

Another option is to execute your Python script directly from the terminal. For instance, if you saved it as local_classification.py, simply run this command:

python local_classification.py

python local_classification.py

Whichever method you choose, as long as you’ve completed all the steps correctly, everything should be up and running. Great job!

Wrapping Up

This guide demonstrated how to replace paid models with free, locally hosted alternatives powered by Ollama — including Llama, Mistral, or Gemma — all at no cost and with minimal setup. This is made possible by Python’s Scikit-LLM library, which lets you integrate state-of-the-art large language models into a traditional machine learning pipeline you’re already familiar with.

Top Posts

When the World Cup Collided with the Cloud: 2026’s Digital Traffic Surge

Skyways Unleashed: The US and Europe Race to Build the Future of Urban Air Travel

5 No-Cost Courses to Transform from AI Newbie to Pro

Bridging Scikit-LLM and Open-Source LLMs: A Practical Guide

Beyond Guesswork: A Slurm-Powered Battle Plan for Benchmarking Distributed LLM Servers

Beyond Prompt Engineering: How 4 Context Bricks Silence RAG Hallucinations

Run Mythos Enhanced Coding Model Locally with llama.cpp on Raspberry Pi

Astryx: Meta’s Open-Source React Toolkit—150+ Accessible Components, 7 Themes, and a CLI Agent-Ready Design System

Endless Code: Mastering the Art of the 24-Hour Claude Agent

Unlock Peak Performance: Your Blueprint for Lightning-Fast Agentic Coding with Claude

When the World Cup Collided with the Cloud: 2026’s Digital Traffic Surge

Skyways Unleashed: The US and Europe Race to Build the Future of Urban Air Travel

5 No-Cost Courses to Transform from AI Newbie to Pro

Beyond Guesswork: A Slurm-Powered Battle Plan for Benchmarking Distributed LLM Servers

The Magic of Friction: Engineering Smarter Robot World Models

Trump Mobilizes Defense Industry to Chart Software and Supplier Networks Nationwide

KuCoin Pay: Weaving Crypto Seamlessly Into Everyday Payments

The End of an Era: US Civil Rights Agency Dismantles 60-Year Data Archive

Trending

When the World Cup Collided with the Cloud: 2026’s Digital Traffic Surge

Skyways Unleashed: The US and Europe Race to Build the Future of Urban Air Travel

Latest Posts

Not More Data, but Better World Models – Unite.AI

OpenAI Is Hiring Head of Preparedness, Amid AI Cyberattack Fears

Subscribe to Updates

Top Posts

**Bridging Scikit-LLM and Open-Source LLMs: A Practical Guide**

Introduction

Pre-requisite: Installing Ollama

Wrapping Up

Related Posts

Bridging Scikit-LLM and Open-Source LLMs: A Practical Guide