In this tutorial, we build an end-to-end cognitive complexity analysis workflow using complexipy. We begin by measuring complexity directly from raw code strings, then scale the same analysis to individual files and an entire project directory. Along the way, we generate machine-readable reports, normalize them into structured DataFrames, and visualize complexity distributions to understand how decision depth accumulates across functions. By treating cognitive complexity as a measurable engineering signal, we show how it can be integrated naturally into everyday Python development and quality checks. Check out the FULL CODES here.
!pip -q install complexipy pandas matplotlib
import os
import json
import textwrap
import subprocess
from pathlib import Path
import pandas as pd
import matplotlib.pyplot as plt
from complexipy import code_complexity, file_complexity
print("✅ Installed complexipy and dependencies")

We set up the environment by installing the required libraries and importing all dependencies needed for analysis and visualization. We ensure the notebook is fully self-contained and ready to run in Google Colab without external setup. This forms the backbone of execution for everything that follows.
snippet = """
def score_orders(orders):
    total = 0
    for o in orders:
        if o.get("valid"):
            if o.get("priority"):
                if o.get("amount", 0) > 100:
                    total += 3
                else:
                    total += 2
            else:
                if o.get("amount", 0) > 100:
                    total += 2
                else:
                    total += 1
        else:
            total -= 1
    return total
"""
res = code_complexity(snippet)
print("=== Code string complexity ===")
print("Overall complexity:", res.complexity)
print("Functions:")
for f in res.functions:
    print(f" - {f.name}: {f.complexity} (lines {f.line_start}-{f.line_end})")

We begin by analyzing a raw Python code string to understand cognitive complexity at the function level. We directly inspect how nested conditionals and control flow contribute to complexity. This helps us validate the core behavior of complexipy before scaling to real files.
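To see how structure drives the score, here is a sketch of the same scoring rules flattened with guard clauses. `score_orders_flat` is a hypothetical refactor (not part of the tutorial's project): it returns the same totals as the nested version while removing two levels of nesting, so complexipy should report a lower score for it.

```python
def score_orders_flat(orders):
    # Same rules as the nested version: an invalid order costs 1 point;
    # a valid order earns 1, plus 1 if priority, plus 1 if amount > 100.
    total = 0
    for o in orders:
        if not o.get("valid"):
            total -= 1
            continue  # guard clause replaces the outer else branch
        total += 1 + int(bool(o.get("priority"))) + int(o.get("amount", 0) > 100)
    return total

orders = [
    {"valid": True, "priority": True, "amount": 150},  # contributes +3
    {"valid": True, "priority": False, "amount": 50},  # contributes +1
    {"valid": False},                                  # contributes -1
]
print(score_orders_flat(orders))  # 3
```

Running `code_complexity` on both versions makes the difference concrete: each extra nesting level adds an increment, so the flat version scores lower even though the behavior is identical.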
root = Path("toy_project")
src = root / "src"
tests = root / "tests"
src.mkdir(parents=True, exist_ok=True)
tests.mkdir(parents=True, exist_ok=True)
(src / "__init__.py").write_text("")
(tests / "__init__.py").write_text("")
(src / "simple.py").write_text(textwrap.dedent("""
def add(a, b):
    return a + b

def safe_div(a, b):
    if b == 0:
        return None
    return a / b
""").strip() + "\n")
(src / "legacy_adapter.py").write_text(textwrap.dedent("""
def legacy_adapter(x, y):
    if x and y:
        if x > 0:
            if y > 0:
                return x + y
            else:
                return x - y
        else:
            if y > 0:
                return y - x
            else:
                return -(x + y)
    return 0
""").strip() + "\n")
(src / "engine.py").write_text(textwrap.dedent("""
def route_event(event):
    kind = event.get("type")
    payload = event.get("payload", {})
    if kind == "A":
        if payload.get("x") and payload.get("y"):
            return _handle_a(payload)
        return None
    elif kind == "B":
        if payload.get("flags"):
            return _handle_b(payload)
        else:
            return None
    elif kind == "C":
        for item in payload.get("items", []):
            if item.get("enabled"):
                if item.get("mode") == "fast":
                    _do_fast(item)
                else:
                    _do_safe(item)
        return True
    else:
        return None

def _handle_a(p):
    total = 0
    for v in p.get("vals", []):
        if v > 10:
            total += 2
        else:
            total += 1
    return total

def _handle_b(p):
    score = 0
    for f in p.get("flags", []):
        if f == "x":
            score += 1
        elif f == "y":
            score += 2
        else:
            score -= 1
    return score

def _do_fast(item):
    return item.get("id")

def _do_safe(item):
    if item.get("id") is None:
        return None
    return item.get("id")
""").strip() + "\n")
(tests / "test_engine.py").write_text(textwrap.dedent("""
from src.engine import route_event

def test_route_event_smoke():
    assert route_event({"type": "A", "payload": {"x": 1, "y": 2, "vals": [1, 20]}}) == 3
""").strip() + "\n")
print(f"✅ Created project at: {root.resolve()}")

We programmatically assemble a small but realistic Python project with multiple modules and test files. We deliberately include varied control-flow patterns to create meaningful differences in complexity.
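Before analyzing the project, it is worth confirming the generated tree matches expectations. This is a stdlib-only sketch, independent of complexipy; `verify_layout` is a hypothetical helper demonstrated on a throwaway directory rather than the tutorial's `toy_project`.

```python
import tempfile
from pathlib import Path

def verify_layout(root: Path, expected: set) -> set:
    # Return the expected relative paths that are absent under root.
    found = {str(p.relative_to(root)) for p in root.rglob("*.py")}
    return expected - found

# Demo on a throwaway tree mirroring the tutorial's layout.
with tempfile.TemporaryDirectory() as d:
    demo = Path(d)
    (demo / "src").mkdir()
    (demo / "src" / "engine.py").write_text("def f():\n    return 1\n")
    missing = verify_layout(demo, {"src/engine.py", "src/simple.py"})
    print(missing)  # {'src/simple.py'}
```

In the notebook, the same check against `toy_project` with the six files written above should report nothing missing.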
engine_path = src / "engine.py"
file_res = file_complexity(str(engine_path))
print("\n=== File complexity (Python API) ===")
print("Path:", file_res.path)
print("File complexity:", file_res.complexity)
for f in file_res.functions:
    print(f" - {f.name}: {f.complexity} (lines {f.line_start}-{f.line_end})")

MAX_ALLOWED = 8

def run_complexipy_cli(project_dir: Path, max_allowed: int = 8):
    cmd = [
        "complexipy",
        ".",
        "--max-complexity-allowed", str(max_allowed),
        "--output-json",
        "--output-csv",
    ]
    proc = subprocess.run(cmd, cwd=str(project_dir), capture_output=True, text=True)
    preferred_csv = project_dir / "complexipy.csv"
    preferred_json = project_dir / "complexipy.json"
    csv_candidates = []
    json_candidates = []
    if preferred_csv.exists():
        csv_candidates.append(preferred_csv)
    if preferred_json.exists():
        json_candidates.append(preferred_json)
    csv_candidates += list(project_dir.glob("*.csv")) + list(project_dir.glob("**/*.csv"))
    json_candidates += list(project_dir.glob("*.json")) + list(project_dir.glob("**/*.json"))

    def uniq(paths):
        seen = set()
        out = []
        for p in paths:
            p = p.resolve()
            if p not in seen and p.is_file():
                seen.add(p)
                out.append(p)
        return out

    csv_candidates = uniq(csv_candidates)
    json_candidates = uniq(json_candidates)

    def pick_best(paths):
        if not paths:
            return None
        paths = sorted(paths, key=lambda p: p.stat().st_mtime, reverse=True)
        return paths[0]

    return proc.returncode, pick_best(csv_candidates), pick_best(json_candidates)

rc, csv_report, json_report = run_complexipy_cli(root, MAX_ALLOWED)

We analyze a real source file using the Python API, then run the complexipy CLI on the entire project. We run the CLI from the correct working directory to reliably generate reports. This step bridges local API usage with production-style static analysis workflows.
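The CLI's exit status can also serve as a CI quality gate. This is a minimal sketch under the assumption (suggested by the `--max-complexity-allowed` flag, and typical of linters, but not verified here) that complexipy exits non-zero when a function exceeds the budget; `complexity_gate` is a hypothetical helper, not part of complexipy.

```python
def complexity_gate(returncode, report_path):
    # Returns True when the build may proceed. A missing report is treated
    # as a hard failure so a misconfigured run never passes silently.
    if report_path is None:
        return False
    return returncode == 0

# In CI, after run_complexipy_cli:
#   ok = complexity_gate(rc, csv_report or json_report)
print(complexity_gate(0, "complexipy.csv"))  # True
print(complexity_gate(1, "complexipy.csv"))  # False
print(complexity_gate(0, None))              # False
```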
df = None
if csv_report and csv_report.exists():
    df = pd.read_csv(csv_report)
elif json_report and json_report.exists():
    data = json.loads(json_report.read_text())
    if isinstance(data, list):
        df = pd.DataFrame(data)
    elif isinstance(data, dict):
        if "files" in data and isinstance(data["files"], list):
            df = pd.DataFrame(data["files"])
        elif "results" in data and isinstance(data["results"], list):
            df = pd.DataFrame(data["results"])
        else:
            df = pd.json_normalize(data)
if df is None:
    raise RuntimeError("No report produced")

def explode_functions_table(df_in):
    if "functions" in df_in.columns:
        tmp = df_in.explode("functions", ignore_index=True)
        if tmp["functions"].notna().any() and isinstance(tmp["functions"].dropna().iloc[0], dict):
            fn = pd.json_normalize(tmp["functions"])
            base = tmp.drop(columns=["functions"])
            return pd.concat([base.reset_index(drop=True), fn.reset_index(drop=True)], axis=1)
        return tmp
    return df_in

fn_df = explode_functions_table(df)
col_map = {}
for c in fn_df.columns:
    lc = c.lower()
    if lc in ("path", "file", "filename", "module"):
        col_map[c] = "path"
    if ("function" in lc and "name" in lc) or lc in ("function", "func", "function_name"):
        col_map[c] = "function"
    if lc == "name" and "function" not in fn_df.columns:
        col_map[c] = "function"
    if "complexity" in lc and "allowed" not in lc and "max" not in lc:
        col_map[c] = "complexity"
    if lc in ("line_start", "linestart", "start_line", "startline"):
        col_map[c] = "line_start"
    if lc in ("line_end", "lineend", "end_line", "endline"):
        col_map[c] = "line_end"
fn_df = fn_df.rename(columns=col_map)

We load the generated complexity reports into pandas and normalize them into a function-level table. We handle multiple possible report schemas to keep the workflow robust. This structured representation lets us reason about complexity using standard data analysis tools.
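Once the table is normalized, ordinary pandas operations apply. Here is a sketch with hypothetical scores (the real values come from the report), aggregating the function-level rows into a per-file hotspot summary:

```python
import pandas as pd

# Hypothetical rows in the normalized path/function/complexity schema.
fn_df = pd.DataFrame({
    "path": ["src/engine.py", "src/engine.py", "src/engine.py", "src/simple.py"],
    "function": ["route_event", "_handle_a", "_handle_b", "safe_div"],
    "complexity": [9, 3, 3, 1],
})

# Named aggregation: total burden, worst offender, and function count per file.
per_file = (
    fn_df.groupby("path")["complexity"]
    .agg(total="sum", worst="max", functions="count")
    .sort_values("total", ascending=False)
)
print(per_file)
```

Sorting by the total puts the riskiest files first, while `worst` distinguishes a file with one monster function from one with many mildly complex ones.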
if "complexity" in fn_df.columns:
    fn_df["complexity"] = pd.to_numeric(fn_df["complexity"], errors="coerce")
    plt.figure()
    fn_df["complexity"].dropna().plot(kind="hist", bins=20)
    plt.title("Cognitive Complexity Distribution (functions)")
    plt.xlabel("complexity")
    plt.ylabel("count")
    plt.show()

def refactor_hints(complexity):
    if complexity >= 20:
        return [
            "Split into smaller pure functions",
            "Replace deep nesting with guard clauses",
            "Extract complex boolean predicates",
        ]
    if complexity >= 12:
        return [
            "Extract inner logic into helpers",
            "Flatten conditionals",
            "Use dispatch tables",
        ]
    if complexity >= 8:
        return [
            "Reduce nesting",
            "Early returns",
        ]
    return ["Acceptable complexity"]

if "complexity" in fn_df.columns and "function" in fn_df.columns:
    for _, r in fn_df.sort_values("complexity", ascending=False).head(8).iterrows():
        cx = float(r["complexity"]) if pd.notna(r["complexity"]) else None
        if cx is None:
            continue
        print(r["function"], cx, refactor_hints(cx))

print("✅ Tutorial complete.")

We visualize the distribution of cognitive complexity and derive refactoring guidance from numeric thresholds. We translate abstract complexity scores into concrete engineering actions. This closes the loop by connecting measurement directly to maintainability decisions.
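The same thresholds can be applied vectorized rather than row by row. A sketch using `pd.cut` with the tutorial's bands (8, 12, 20) on a column of hypothetical scores; the labels here are illustrative shorthand for the hint lists above:

```python
import pandas as pd

scores = pd.Series([2, 7, 9, 13, 21], name="complexity")

# Right-closed bins: (-inf, 7] ok, (7, 11] i.e. >= 8, (11, 19] i.e. >= 12, (19, inf) i.e. >= 20.
bands = pd.cut(
    scores,
    bins=[float("-inf"), 7, 11, 19, float("inf")],
    labels=["acceptable", "reduce nesting", "extract helpers", "split function"],
)
print(list(bands))
```

Attaching `bands` as a column makes it trivial to count functions per severity band or filter the table down to the "split function" tier for a refactoring sprint.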
In conclusion, we presented a practical, reproducible pipeline for auditing cognitive complexity in Python projects using complexipy. We demonstrated how we can move from ad hoc inspection to data-driven reasoning about code structure, identify high-risk functions, and offer actionable refactoring guidance based on quantified thresholds. The workflow lets us reason about maintainability early, enforce complexity budgets consistently, and evolve codebases with clarity and confidence, rather than relying solely on intuition.