sktime in Python: A Practical Guide to Building Time-Series Machine Learning Models

# Getting Started

Whether you’re working with sensor data, system performance logs, or any other information collected over time, you’ve probably noticed that standard scikit-learn workflows aren’t a great match. Time series data carries unique characteristics that typical tabular models overlook — things like seasonality, trends, chronological order, and the reliance of future values on earlier observations.

sktime is a purpose-built Python library for handling exactly these kinds of challenges. It offers a familiar scikit-learn-style interface — fit, predict, transform — but it was designed from scratch with time series in mind. Whether you need forecasting, classification, regression, or clustering for time-based data, you get a unified approach throughout.

In this guide, you’ll tackle a practical task: predicting temperature readings from an industrial HVAC sensor. Along the way, you’ll discover how sktime manages time series datasets, construct preprocessing pipelines, train forecasting models, and assess their performance.

The full code is available on GitHub.

# Before You Begin

Make sure you’re running Python 3.10 or later and have a basic understanding of pandas. Grab the needed packages using:

pip install sktime pmdarima statsmodels

Prefer to pull in every optional dependency at once? Run pip install "sktime[all_extras]" instead. The quotes keep shells such as zsh from interpreting the square brackets.

# Why sktime Stands Out

To appreciate what sktime brings to the table, let’s look at the core issue it addresses. In scikit-learn, data is organized as a flat 2D grid — rows represent individual samples, and columns represent features. Time series data violates this convention because each “row” is actually an ordered sequence of observations, where the sequence order is critical.

Here are the primary data structures you’ll encounter:

| Data Type | Representation | Description |
| --- | --- | --- |
| Series | `pd.Series` or `pd.DataFrame` | A single time series, typically used for straightforward forecasting tasks. |
| Panel | `pd.DataFrame` with a 2-level `MultiIndex` | A group of multiple independent time series bundled together. |
| Hierarchical | `pd.DataFrame` with a 3+ level `MultiIndex` | A collection of time series organized with aggregation layers spanning several dimensions. |
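To make the three layouts concrete, here is a minimal pandas-only sketch. The sensor and site names are made up for illustration, and the convention that the time index sits in the last `MultiIndex` level is the one sktime uses for these formats.

```python
import numpy as np
import pandas as pd

time_index = pd.date_range("2026-01-01", periods=3, freq="h")

# Series: a single sensor indexed by time
series = pd.Series([20.1, 20.4, 20.9], index=time_index, name="temp_celsius")

# Panel: two independent sensors, index levels = (instance, time)
panel_index = pd.MultiIndex.from_product(
    [["sensor_a", "sensor_b"], time_index], names=["sensor", "timestamp"]
)
panel = pd.DataFrame(
    {"temp_celsius": [20.1, 20.4, 20.9, 18.7, 18.9, 19.2]}, index=panel_index
)

# Hierarchical: sensors grouped by site, index levels = (site, sensor, time)
hier_index = pd.MultiIndex.from_product(
    [["plant_1", "plant_2"], ["sensor_a", "sensor_b"], time_index],
    names=["site", "sensor", "timestamp"],
)
hierarchical = pd.DataFrame(
    {"temp_celsius": np.linspace(18.0, 23.5, len(hier_index))}, index=hier_index
)

# Number of index levels for each layout
print(series.index.nlevels, panel.index.nlevels, hierarchical.index.nlevels)
```

The rest of this guide works only with the first layout, a single `pd.Series`.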

For the time index, sktime is compatible with several pandas index types: DatetimeIndex, PeriodIndex, RangeIndex, and an integer-typed Index (called Int64Index in pandas versions before 2.0). The index must be sorted in increasing order. If you’re working with a DatetimeIndex, make sure its freq attribute is set.
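Both index requirements can be checked and repaired with plain pandas before the data reaches sktime. The sketch below uses a small hypothetical series whose readings arrive out of order and without a declared frequency.

```python
import pandas as pd

# Hypothetical raw readings: out of order, no frequency attached to the index
raw = pd.Series(
    [20.9, 20.1, 20.4, 21.3],
    index=pd.to_datetime(
        ["2026-01-01 02:00", "2026-01-01 00:00", "2026-01-01 01:00", "2026-01-01 03:00"]
    ),
    name="temp_celsius",
)
print(raw.index.is_monotonic_increasing, raw.index.freq)  # False None

# Sort chronologically, then attach the frequency pandas infers from the spacing
y_clean = raw.sort_index()
y_clean = y_clean.asfreq(pd.infer_freq(y_clean.index))
print(y_clean.index.is_monotonic_increasing, y_clean.index.freq)
```

Note that `asfreq` inserts NaN rows for any timestamps missing from a regular grid, so gaps become explicit missing values rather than silent jumps in the index.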

# Preparing the Dataset

Let’s put together a realistic dataset. Picture an HVAC sensor on a factory floor that logs temperature readings every hour. The data shows a daily cycle (warmer during operating hours), a gradual upward trend as summer approaches, and some random fluctuation.

import numpy as np
import pandas as pd

np.random.seed(42)

# 90 days of hourly readings starting Jan 1, 2026
n_hours = 90 * 24
timestamps = pd.date_range(start="2026-01-01", periods=n_hours, freq="h")

# Trend: gradual 5-degree rise over 90 days
trend = np.linspace(0, 5, n_hours)

# Daily seasonality: sine cycle that peaks at 10:00 and dips at 22:00
hour_of_day = np.arange(n_hours) % 24
daily_cycle = 4 * np.sin(2 * np.pi * (hour_of_day - 4) / 24)

# Noise
noise = np.random.normal(0, 0.8, n_hours)

# Base temperature around 20°C
temperature = 20 + trend + daily_cycle + noise

# Introduce a few missing values (sensor dropout)
dropout_indices = [300, 301, 302, 1440, 1441]
temperature[dropout_indices] = np.nan

y = pd.Series(temperature, index=timestamps, name="temp_celsius")
y.index.freq = pd.tseries.frequencies.to_offset("h")

print(y.head())
print(f"\nShape: {y.shape}")
print(f"Missing values: {y.isna().sum()}")
print(f"Index type: {type(y.index)}")

Output:

2026-01-01 00:00:00    16.933270
2026-01-01 01:00:00    17.063277
2026-01-01 02:00:00    18.522783
2026-01-01 03:00:00    20.190095
2026-01-01 04:00:00    19.821941
Freq: h, Name: temp_celsius, dtype: float64

Shape: (2160,)
Missing values: 5
Index type: <class 'pandas.core.indexes.datetimes.DatetimeIndex'>
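The phase shift in the daily cycle term is easy to misread, so it can be worth confirming where the shifted sine actually peaks. A minimal check that reuses the same formula over a single day:

```python
import numpy as np

hours = np.arange(24)
# Same expression as the generator: amplitude 4, shifted by 4 hours
daily_cycle = 4 * np.sin(2 * np.pi * (hours - 4) / 24)

peak_hour = int(hours[np.argmax(daily_cycle)])
trough_hour = int(hours[np.argmin(daily_cycle)])
print(f"Peak at hour {peak_hour}, trough at hour {trough_hour}")
# Peak at hour 10, trough at hour 22
```

The cycle crosses zero on the way up at hour 4, reaches its maximum six hours later, and bottoms out twelve hours after that.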

# Dividing Time Series Data into Training and Testing Sets

Splitting time series data works differently than with tabular data — you can’t randomly shuffle the rows. The split must follow chronological order: earlier data for training, more recent data for testing.

sktime includes temporal_train_test_split to handle this:

from sktime.split import temporal_train_test_split

# Hold out the last 7 days (168 hours) as the test set
y_train, y_test = temporal_train_test_split(y, test_size=168)

print(f"Train: {y_train.index[0]} → {y_train.index[-1]}")
print(f"Test:  {y_test.index[0]} → {y_test.index[-1]}")
print(f"Train size: {len(y_train)}, Test size: {len(y_test)}")

Output:

Train: 2026-01-01 00:00:00 → 2026-03-24 23:00:00
Test:  2026-03-25 00:00:00 → 2026-03-31 23:00:00
Train size: 1992, Test size: 168

This function guarantees a clean, chronological split — no future data leaks into the training portion.
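Conceptually, a chronological hold-out is just a positional slice. The pandas-only sketch below reproduces the same train and test sizes on a stand-in series. It illustrates the idea and is not sktime's implementation.

```python
import numpy as np
import pandas as pd

# Stand-in series with the same hourly index as the HVAC data
timestamps = pd.date_range("2026-01-01", periods=90 * 24, freq="h")
y_demo = pd.Series(np.arange(len(timestamps), dtype=float), index=timestamps)

test_size = 168
y_train_demo = y_demo.iloc[:-test_size]  # everything except the last 7 days
y_test_demo = y_demo.iloc[-test_size:]   # the last 7 days

# Every training timestamp precedes every test timestamp
assert y_train_demo.index.max() < y_test_demo.index.min()
print(len(y_train_demo), len(y_test_demo))  # 1992 168
```

A random split such as scikit-learn's `train_test_split` with shuffling would fail the ordering check, which is exactly the leakage the chronological split avoids.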

# Specifying the Forecasting Horizon

Before training any model, you need to define exactly which future time steps you’d like to forecast. This is done through the ForecastingHorizon.

from sktime.forecasting.base import ForecastingHorizon

# Predict 168 steps ahead (7 days of hourly data)
# is_relative=False means we're using absolute timestamps
fh = ForecastingHorizon(y_test.index, is_relative=False)

print(f"Horizon length: {len(fh)}")
print(f"First forecast point: {fh[0]}")
print(f"Last forecast point:  {fh[-1]}")

This produces:

Horizon length: 168
First forecast point: 2026-03-25 00:00:00
Last forecast point:  2026-03-31 23:00:00

You can also specify relative horizons like fh = [1, 2, 3, ..., 168], meaning “1 step ahead, 2 steps ahead, …”. However, absolute horizons are more intuitive when you need predictions for specific real-world timestamps.
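The two styles describe the same points in time: a relative step k corresponds to the last training timestamp plus k sampling intervals. The sketch below shows that arithmetic in plain pandas, assuming hourly data and the training cutoff from the split above.

```python
import numpy as np
import pandas as pd

# Last timestamp in the training set (the forecasting "cutoff")
cutoff = pd.Timestamp("2026-03-24 23:00")

# Relative horizon: 1, 2, ..., 168 steps ahead
relative_steps = np.arange(1, 169)

# Absolute horizon: cutoff plus k hours for each step k
absolute_times = cutoff + pd.to_timedelta(relative_steps, unit="h")

print(absolute_times[0], absolute_times[-1])
# 2026-03-25 00:00:00 2026-03-31 23:00:00
```

These are the same first and last forecast points reported for the absolute horizon built from `y_test.index`.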

# Building a Preprocessing and Forecasting Pipeline

Raw sensor data often contains missing values, seasonal patterns, and trends — all of which must be addressed before or during forecasting. sktime’s TransformedTargetForecaster allows you to chain data transformations with a forecasting model into a unified estimator. The transformations are automatically applied to the target series y during training and seamlessly reversed when generating predictions.

from sktime.forecasting.exp_smoothing import ExponentialSmoothing
from sktime.forecasting.compose import TransformedTargetForecaster
from sktime.transformations.series.impute import Imputer
from sktime.transformations.series.detrend import Deseasonalizer, Detrender

pipeline = TransformedTargetForecaster(
    steps=[
        # Step 1: Fill missing sensor readings using linear interpolation
        ("imputer", Imputer(method="linear")),
        # Step 2: Remove the linear trend to make the series stationary
        ("detrender", Detrender()),
        # Step 3: Eliminate daily seasonality (sp=24 for hourly data)
        ("deseasonalizer", Deseasonalizer(model="additive", sp=24)),
        # Step 4: Forecast the cleaned, stationary residuals
        ("forecaster", ExponentialSmoothing(trend=None, seasonal=None)),
    ]
)

pipeline.fit(y_train, fh=fh)
y_pred = pipeline.predict()

print(y_pred.head())

Output:

2026-03-25 00:00:00    21.210066
2026-03-25 01:00:00    21.788986
2026-03-25 02:00:00    22.615184
2026-03-25 03:00:00    23.688449
2026-03-25 04:00:00    24.621127
Freq: h, Name: temp_celsius, dtype: float64

Here’s what each step does:

- Imputer(method="linear") fills gaps in the data by interpolating between adjacent readings, which suits sensor measurements.
- Detrender() fits and subtracts a linear trend from the training data, then adds it back during prediction.
- Deseasonalizer(sp=24) strips out the 24-hour cyclical pattern; the sp parameter defines the seasonal period.
- ExponentialSmoothing forecasts the detrended, deseasonalized residuals.

When predict() is called, all inverse transformations are applied in reverse order automatically, giving you forecasts on the original temperature scale.
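To see the transform-then-invert idea without any library machinery, here is a hand-rolled approximation in NumPy and pandas. It is a sketch, not sktime's implementation. The synthetic data, the hour-of-day means used as the seasonal component, and the fixed smoothing factor `alpha` are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 30 * 24
idx = pd.date_range("2026-01-01", periods=n, freq="h")
t = np.arange(n)
values = 20 + 0.01 * t + 4 * np.sin(2 * np.pi * (t % 24 - 4) / 24) + rng.normal(0, 0.5, n)
values[[100, 101]] = np.nan  # simulated sensor dropout
y_raw = pd.Series(values, index=idx)

# Step 1: impute gaps by linear interpolation
y_filled = y_raw.interpolate(method="linear")

# Step 2: detrend by fitting and subtracting a straight line
slope, intercept = np.polyfit(t, y_filled.to_numpy(), deg=1)
detrended = y_filled - (intercept + slope * t)

# Step 3: deseasonalize using the mean of each hour of day (assumed estimator)
seasonal_by_hour = detrended.groupby(detrended.index.hour).mean().to_numpy()
residual = detrended - seasonal_by_hour[idx.hour.to_numpy()]

# Step 4: simple exponential smoothing gives a flat forecast of the residual
alpha = 0.2  # hypothetical smoothing factor; a fitted model would estimate it
level = residual.iloc[0]
for x in residual.iloc[1:]:
    level = alpha * x + (1 - alpha) * level

# Inverse transforms in reverse order: add seasonality back, then the trend
horizon = 48
t_future = np.arange(n, n + horizon)
idx_future = pd.date_range(idx[-1] + pd.Timedelta(hours=1), periods=horizon, freq="h")
forecast = pd.Series(
    level + seasonal_by_hour[idx_future.hour.to_numpy()] + intercept + slope * t_future,
    index=idx_future,
)
print(forecast.head())
```

The pipeline object does this bookkeeping for you, which is why swapping or reordering steps does not require rewriting the inversion logic.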

# Evaluating the Forecast

sktime works seamlessly with common evaluation metrics. For forecasting tasks, mean absolute error (MAE) and mean absolute percentage error (MAPE) are widely used.

from sktime.performance_metrics.forecasting import (
    mean_absolute_error,
    mean_absolute_percentage_error,
)

mae = mean_absolute_error(y_test, y_pred)
mape = mean_absolute_percentage_error(y_test, y_pred)

print(f"MAE:  {mae:.3f} °C")
print(f"MAPE: {mape*100:.2f}%")

Output:

MAE:  0.584 °C
MAPE: 2.40%
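Both metrics are simple averages, and computing them by hand on a toy example makes the units clear. The three values below are made up. Note that MAPE comes out as a fraction, which is why the code above multiplies it by 100 before printing.

```python
import numpy as np

# Toy actuals and forecasts in °C (illustrative values only)
y_true = np.array([20.0, 22.0, 25.0])
y_hat = np.array([21.0, 21.0, 24.0])

abs_err = np.abs(y_true - y_hat)

# MAE: average absolute error, in the units of the data
mae_manual = abs_err.mean()

# MAPE: average absolute error relative to the actual value, as a fraction
mape_manual = (abs_err / np.abs(y_true)).mean()

print(f"MAE:  {mae_manual:.3f}")          # MAE:  1.000
print(f"MAPE: {mape_manual * 100:.2f}%")  # MAPE: 4.52%
```

Because MAPE divides by the actual value, it becomes unstable when readings are close to zero. That is not a concern for temperatures around 20 °C but matters for other signals.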

# Swapping in a Different Forecaster

One of sktime’s greatest strengths is how easily you can swap the underlying forecasting algorithm — it typically requires changing only a single line. Let’s switch from exponential smoothing to an ARIMA model and compare results.

from sktime.forecasting.arima import ARIMA

pipeline_arima = TransformedTargetForecaster(
    steps=[
        ("imputer", Imputer(method="linear")),
        ("detrender", Detrender()),
        ("deseasonalizer", Deseasonalizer(model="additive", sp=24)),
        # ARIMA(1,1,1) on the cleaned residuals
        ("forecaster", ARIMA(order=(1, 1, 1), suppress_warnings=True)),
    ]
)

pipeline_arima.fit(y_train, fh=fh)
y_pred_arima = pipeline_arima.predict()

mae_arima = mean_absolute_error(y_test, y_pred_arima)
mape_arima = mean_absolute_percentage_error(y_test, y_pred_arima)

print(f"ARIMA MAE:  {mae_arima:.3f} °C")
print(f"ARIMA MAPE: {mape_arima*100:.2f}%")

Output:

ARIMA MAE:  0.586 °C
ARIMA MAPE: 2.41%

The takeaway here is that the preprocessing steps — imputation, detrending, and deseasonalization — remained exactly the same. Only the final forecaster was swapped out, and everything else composed together smoothly.
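When two models score almost identically, a trivial baseline helps put the numbers in context. The sketch below is a hand-written seasonal-naive forecast that repeats the last observed 24-hour cycle. The helper name `seasonal_naive` and the toy series are hypothetical and are not part of the original pipeline.

```python
import numpy as np
import pandas as pd


def seasonal_naive(y_train, horizon, sp=24, freq="h"):
    """Repeat the last `sp` observations until `horizon` steps are filled."""
    last_cycle = y_train.to_numpy()[-sp:]
    repeats = int(np.ceil(horizon / sp))
    values = np.tile(last_cycle, repeats)[:horizon]
    # Future index starts one step after the last training timestamp
    future_index = pd.date_range(y_train.index[-1], periods=horizon + 1, freq=freq)[1:]
    return pd.Series(values, index=future_index, name=y_train.name)


# Toy history: 48 hourly values 0.0 .. 47.0
y_hist = pd.Series(
    np.arange(48.0),
    index=pd.date_range("2026-01-01", periods=48, freq="h"),
    name="temp_celsius",
)
baseline = seasonal_naive(y_hist, horizon=30)
print(baseline.iloc[[0, 23, 24]].tolist())  # [24.0, 47.0, 24.0]
```

Scoring such a baseline with the same MAE function shows whether the exponential smoothing and ARIMA pipelines add value beyond repeating yesterday's readings.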

# Cross-Validating Across Time

Relying on a single train/test split can give an unreliable picture of model performance. sktime offers time-series-specific cross-validation splitters that respect the chronological ordering of data.

SlidingWindowSplitter advances a fixed-size rolling window through time. ExpandingWindowSplitter gradually grows the training set as new data becomes available, which is preferable when you want to leverage all historical observations.

from sktime.split import ExpandingWindowSplitter
from sktime.forecasting.model_evaluation import evaluate

# Expanding window: start with 1800-hour train set, evaluate on 168-hour windows
cv = ExpandingWindowSplitter(
    initial_window=1800,
    fh=list(range(1, 169)),
    step_length=168,
)

results = evaluate(
    forecaster=pipeline,
    y=y,
    cv=cv,
    scoring=mean_absolute_error,
    return_data=False,
)

print(results[["test__DynamicForecastingErrorMetric", "fit_time"]].round(3))
print(f"\nMean CV MAE: {results['test__DynamicForecastingErrorMetric'].mean():.3f} °C")

Output:

   test__DynamicForecastingErrorMetric  fit_time
0                                0.627     0.274
1                                0.585     0.100

Mean CV MAE: 0.606 °C

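The evaluate function returns a DataFrame with one row per fold, containing the test metric and timing information such as fit_time. Here the two folds score 0.627 °C and 0.585 °C, close to the 0.584 °C from the single hold-out split, which suggests the pipeline’s accuracy is reasonably stable across time windows. Two folds is a small sample, though, so a smaller step_length or initial_window would give a more robust estimate.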
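To see why the run above yields exactly two folds, and how the two window strategies differ, the fold boundaries can be worked out with a few lines of plain Python. The helper `window_folds` is a hypothetical approximation of the splitting logic, not sktime's code.

```python
def window_folds(n_obs, initial_window, horizon, step, expanding=True):
    """Return (train_range, test_range) pairs of positional indices."""
    folds = []
    train_end = initial_window
    # Keep adding folds while a full test window still fits in the data
    while train_end + horizon <= n_obs:
        # Expanding: always start at 0. Sliding: keep the window size fixed.
        train_start = 0 if expanding else train_end - initial_window
        folds.append((range(train_start, train_end), range(train_end, train_end + horizon)))
        train_end += step
    return folds


expanding = window_folds(2160, initial_window=1800, horizon=168, step=168)
sliding = window_folds(2160, initial_window=1800, horizon=168, step=168, expanding=False)

for name, folds in [("expanding", expanding), ("sliding", sliding)]:
    for train, test in folds:
        print(f"{name}: train [{train.start}, {train.stop}) test [{test.start}, {test.stop})")
```

With 2,160 observations, a third fold would need test data up to position 2,304, so only two folds fit. In the sliding variant, the second fold drops the oldest 168 hours instead of keeping them.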

# Next Steps

This tutorial walked through the essential forecasting pipeline in sktime, though the library offers capabilities that go well beyond simple prediction.

It also supports time-series classification, probabilistic forecasting with uncertainty quantification, training shared models across multiple related time series, adapting classical ML algorithms for sequence-based forecasting, and automating model selection and hyperparameter tuning.

One of sktime’s standout advantages is its unified API and seamless integration with the broader Python machine-learning ecosystem, streamlining experimentation for practitioners at all levels. The sktime documentation and accompanying example notebooks are particularly well-crafted and make excellent reference materials for anyone who frequently deals with forecasting or time-dependent data challenges.

Bala Priya C is a developer and technical writer based in India. She thrives at the crossroads of mathematics, programming, data science, and content creation. Her areas of interest and expertise include DevOps, data science, and natural language processing. She is passionate about reading, writing, coding, and enjoying good coffee! Presently, she is focused on expanding her knowledge and sharing it with the developer community through tutorials, guides, opinion articles, and other content. She also creates comprehensive resource guides and instructional coding walkthroughs.
