This tutorial walks you through a complete, step-by-step workflow for using Qualcomm AI Hub Models. First, we set up the necessary packages, explore the library of available models, and prepare MobileNet-V2 to run locally with PyTorch. A key part of the process involves resolving a common input-format mismatch—specifically, transforming image data from NHWC arrangement into the NCHW layout the model requires. After that, we perform inference on both a built-in test input and an actual photograph, examine the highest-confidence predictions, run the official Qualcomm AI Hub command-line demo, and expand the example to include object detection with YOLOv7. We also provide an optional section for deploying the model on a real Qualcomm-powered device in the cloud, covering compilation, profiling, and execution when you have an API token.
import subprocess, sys, os, glob, textwrap, traceback
import numpy as np, torch
from PIL import Image
import matplotlib.pyplot as plt

def pip_install(*pkgs):
    # Install into the interpreter that is running the notebook.
    subprocess.run([sys.executable, "-m", "pip", "install", "-q", *pkgs], check=True)

pip_install("qai_hub_models")
OUT_DIR = "/content/qaihm_out"; os.makedirs(OUT_DIR, exist_ok=True)
torch.set_grad_enabled(False)  # inference only, so no gradients are needed

def to_nchw(value):
    # Sample inputs may arrive wrapped in a list or tuple; take the first entry.
    arr = value[0] if isinstance(value, (list, tuple)) else value
    t = torch.from_numpy(np.asarray(arr, dtype=np.float32))
    if t.ndim == 3:
        t = t.unsqueeze(0)  # add the batch dimension
    # Channels last (NHWC) -> channels first (NCHW).
    if t.ndim == 4 and t.shape[1] != 3 and t.shape[-1] == 3:
        t = t.permute(0, 3, 1, 2).contiguous()
    return t

We start by importing all required libraries and writing a small helper routine that installs Python packages right inside the Colab environment. The qai_hub_models package is installed, an output folder is created, and gradient computation is turned off since we are only performing inference. We also introduce the to_nchw() utility, which rearranges an input image tensor into the channel-first format the model expects.
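To make the layout change concrete, here is a minimal sketch that uses NumPy alone on a tiny made-up batch (one 2×2 RGB image). The array contents are illustrative and not taken from the model's sample input.

```python
import numpy as np

# A fake batch with one 2x2 RGB image in NHWC layout: (batch, height, width, channels).
nhwc = np.arange(1 * 2 * 2 * 3, dtype=np.float32).reshape(1, 2, 2, 3)

# Move the channel axis from the last position to the second: NHWC -> NCHW.
nchw = np.ascontiguousarray(nhwc.transpose(0, 3, 1, 2))

print(nhwc.shape, "->", nchw.shape)

# The same value is reachable under both index orders.
pixel_nhwc = nhwc[0, 1, 0, 2]  # row 1, column 0, channel 2
pixel_nchw = nchw[0, 2, 1, 0]  # channel 2, row 1, column 0
print(pixel_nhwc, pixel_nchw)
```

Only the axis order changes; no pixel values are modified. Note that the shape test in to_nchw() is a heuristic: a channels-last image whose height happens to be 3 would be left untouched.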
import pkgutil, qai_hub_models.models as _m

# Treat each public sub-package of qai_hub_models.models as a model ID.
model_ids = sorted(n for _, n, p in pkgutil.iter_modules(_m.__path__)
                   if p and not n.startswith("_"))
print(f">>> {len(model_ids)} models available. First 40:")
print(textwrap.fill(", ".join(model_ids[:40]), 100), "")

from qai_hub_models.models.mobilenet_v2 import Model as MobileNetV2
model = MobileNetV2.from_pretrained().eval()
spec = model.get_input_spec()
input_name = list(spec.keys())[0]
print(">>> Input:", input_name, spec[input_name].shape, spec[input_name].dtype)

from torchvision.models import MobileNet_V2_Weights
IMAGENET_CLASSES = MobileNet_V2_Weights.IMAGENET1K_V1.meta["categories"]

def top5(logits):
    # Accept a flat vector of scores as well as a batch of them.
    if logits.ndim == 1:
        logits = logits.unsqueeze(0)
    probs = torch.softmax(logits, dim=1)[0]
    conf, idx = probs.topk(5)
    return [(IMAGENET_CLASSES[i], float(c)) for c, i in zip(conf, idx)]

Next, we scan the Qualcomm AI Hub model catalog and list the first 40 model identifiers so we can see what is available. We then load a pretrained MobileNet-V2, query its input specification to find the input tensor's name, shape, and dtype, pull the human-readable ImageNet class labels from torchvision, and write a top5() helper that turns raw model output scores into a readable list of the five most likely labels.
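The ranking step inside top5() can be illustrated without PyTorch. The sketch below is a standard-library rendering of the same idea (softmax, then sort by probability); the function names `softmax` and `top_k` and the three-label list are made up for illustration.

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities; subtracting the max avoids overflow."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_k(logits, labels, k=5):
    """Return the k most probable (label, probability) pairs, best first."""
    probs = softmax(logits)
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    return [(labels[i], probs[i]) for i in order]

# Hypothetical three-class example.
labels = ["cat", "dog", "fox"]
ranked = top_k([2.0, 0.5, 1.0], labels, k=2)
for label, prob in ranked:
    print(f"{prob:6.2%}  {label}")
```

Softmax preserves the ordering of the raw scores, so the ranking is the same with or without it. It matters only when you want to report the scores as confidences that sum to one.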
sample = model.sample_inputs()
x = to_nchw(sample[input_name])
print(">>> fed tensor shape:", tuple(x.shape))
print("\n>>> Top-5 for the built-in sample input:")
for label, conf in top5(model(x)):
    print(f" {conf:6.2%} {label}")

from torchvision import transforms
preprocess = transforms.Compose([
    transforms.Resize(256), transforms.CenterCrop(224), transforms.ToTensor(),
])

img = None
try:
    import urllib.request
    p = os.path.join(OUT_DIR, "input.jpg")
    urllib.request.urlretrieve(
        "https://upload.wikimedia.org/wikipedia/commons/thumb/4/4d/Cat_November_2010-1a.jpg/800px-Cat_November_2010-1a.jpg", p)
    img = Image.open(p).convert("RGB")
except Exception as e:
    print(">>> photo download skipped:", e)

if img is not None:
    # ToTensor() already yields a channels-first tensor; only the batch axis is missing.
    preds = top5(model(preprocess(img).unsqueeze(0)))
    print("\n>>> Top-5 for the downloaded photo:")
    for label, conf in preds:
        print(f" {conf:6.2%} {label}")
    plt.figure(figsize=(5,5)); plt.imshow(img); plt.axis("off")
    plt.title(f"{preds[0][0]} ({preds[0][1]:.1%})"); plt.show()

We now feed the model its built-in sample input, first passing it through to_nchw() so the tensor dimensions are correct, and print the top-five predictions. After that, we retrieve a real photograph from the web, resize and center-crop it to 224×224 pixels, convert it to a torch tensor, and run the prediction pipeline again. Finally, we display the image with its top predicted class as the title so you can visually confirm that the model’s output matches what is actually in the picture.
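The resize-then-crop geometry is easy to lose track of, so here is a rough sketch of the arithmetic as I understand torchvision's Resize(256) and CenterCrop(224) to behave. The helper names and the 800×600 photo size are hypothetical, and the real transforms also interpolate pixels, which this sketch leaves out.

```python
def resize_shorter_side(width, height, size=256):
    """Scale so the shorter side equals `size`, keeping the aspect ratio."""
    if width <= height:
        return size, int(size * height / width)
    return int(size * width / height), size

def center_crop_box(width, height, crop=224):
    """Return (left, top, right, bottom) of a centered crop-by-crop window."""
    left = int(round((width - crop) / 2.0))
    top = int(round((height - crop) / 2.0))
    return left, top, left + crop, top + crop

# Hypothetical 800x600 landscape photo (not the size of the tutorial image).
resized = resize_shorter_side(800, 600)
box = center_crop_box(*resized)
print("resized:", resized, "crop box:", box)
```

Because only the shorter side is pinned to 256, a wide photo loses content at its left and right edges during the crop, which is worth keeping in mind if the subject is off-center.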
def run_demo(module, extra=None, timeout=900):
    cmd = [sys.executable, "-m", module, "--eval-mode", "fp",
           "--output-dir", OUT_DIR] + (extra or [])
    print(f"\n>>> {' '.join(cmd)}")
    try:
        r = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
        # Show only the tail of the combined stdout and stderr.
        print("\n".join((r.stdout + r.stderr).strip().splitlines()[-25:]))
    except Exception as e:
        print(">>> demo skipped:", e)

run_demo("qai_hub_models.models.mobilenet_v2.demo")

try:
    pip_install("qai_hub_models[yolov7]")
    run_demo("qai_hub_models.models.yolov7.demo")
    # Pick the most recently modified image in the output folder.
    imgs = sorted(glob.glob(OUT_DIR + "/*.png") + glob.glob(OUT_DIR + "/*.jpg"),
                  key=os.path.getmtime)
    if imgs:
        yolo_img = Image.open(imgs[-1]).convert("RGB")
        plt.figure(figsize=(8, 8)); plt.imshow(yolo_img); plt.axis("off")
        plt.title("YOLOv7 detection result"); plt.show()
except Exception as e:
    print(">>> YOLOv7 demo skipped:", e)

Next, we define a single helper, run_demo(), that launches any QAI Hub demo module as a subprocess and prints the last 25 lines of its combined stdout and stderr. We call it first for the official MobileNet-V2 demo and then for YOLOv7 object detection, installing the optional yolov7 extras along the way. For YOLOv7 we open the most recently modified image in the output folder and display it so you can inspect the bounding boxes drawn by the model. This gives you a reusable pattern for running any other QAI Hub demo from the command line and reviewing its output directly inside the notebook.
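The subprocess-and-tail pattern behind run_demo() can be tried in isolation. The sketch below is a standalone variant in which a short Python one-liner stands in for a demo module. The name `run_and_tail` is hypothetical, and unlike the tutorial's helper it also returns the exit code.

```python
import subprocess
import sys

def run_and_tail(cmd, keep=25, timeout=60):
    """Run `cmd`, then return its exit code and the last `keep` output lines."""
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    lines = (result.stdout + result.stderr).strip().splitlines()
    return result.returncode, lines[-keep:]

# Stand-in for a demo module: a child interpreter that prints 100 lines.
code, tail = run_and_tail(
    [sys.executable, "-c", "for i in range(100): print(i)"], keep=5)
print("exit code:", code)
print("\n".join(tail))
```

Returning the exit code is a small design choice that lets the caller tell a failed demo from a successful one, which the printed tail alone does not always make obvious.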
try:
    import qai_hub as hub
    devices = hub.get_devices()
    print(f"\n>>> Authenticated. {len(devices)} cloud devices available.")
    device = hub.Device("Samsung Galaxy S24 (Family)")

    # Trace the PyTorch model with a channels-first sample input.
    sample = model.sample_inputs()
    nchw = to_nchw(sample[input_name])
    traced = torch.jit.trace(model, [nchw])
    cloud_inputs = {input_name: [nchw.numpy()]}

    # Compile for the TFLite runtime, then profile and run on the device.
    cj = hub.submit_compile_job(model=traced, device=device,
                                input_specs=model.get_input_spec(),
                                options="--target_runtime tflite")
    target = cj.get_target_model(); print(">>> compiled:", cj.url)
    pj = hub.submit_profile_job(model=target, device=device); print(">>> profiling:", pj.url)
    ij = hub.submit_inference_job(model=target, device=device, inputs=cloud_inputs)
    out = ij.download_output_data()
    dev_logits = torch.from_numpy(np.asarray(list(out.values())[0][0]))
    print(">>> Top-5 from the REAL device:")
    for label, conf in top5(dev_logits):
        print(f" {conf:6.2%} {label}")
    target.download(os.path.join(OUT_DIR, "mobilenet_v2.tflite"))
    print(">>> saved compiled .tflite to", OUT_DIR)
except Exception as e:
    print("\n>>> Cloud (on-device) section skipped (no API token configured, or a job failed).")
    print(" Get one at workbench.aihub.qualcomm.com, then:")
    print(" !qai-hub configure --api_token YOUR_TOKEN")
    print(" detail:", (str(e).splitlines() or [type(e).__name__])[0])

print("\n>>> Tutorial complete. Outputs in:", OUT_DIR)
We also include an optional cloud-based workflow through Qualcomm AI Hub, which activates only when a valid API token is set up. This workflow lists accessible cloud devices, converts the PyTorch model into a traced format, compiles it for TFLite runtime, profiles its performance on a Qualcomm-powered device, and runs an inference job. Once complete, we retrieve the results from the actual hardware, display the top predicted classes, store the compiled TFLite model locally, and wrap up by showing where all generated files are saved.
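If you would rather check up front whether the optional cloud section can run, instead of relying on a broad try/except, a pre-flight test along these lines is one option. This is a sketch under assumptions: the helper name `cloud_ready` is hypothetical, and the default config path `~/.qai_hub/client.ini` is where I understand `qai-hub configure` to store the token, so verify it against your installation.

```python
import importlib.util
import os

def cloud_ready(config_path=None):
    """Return True if the qai_hub client is importable and a config file exists."""
    if config_path is None:
        # Assumed default location written by `qai-hub configure`.
        config_path = os.path.expanduser("~/.qai_hub/client.ini")
    has_client = importlib.util.find_spec("qai_hub") is not None
    return has_client and os.path.isfile(config_path)

print("cloud section enabled:", cloud_ready())
```

A check like this only confirms that a client and a config file are present. An expired or invalid token would still surface as an error when the first job is submitted.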
To summarize, this guide offers a full hands-on pipeline for working with Qualcomm AI Hub models in Colab. We covered loading pretrained models, formatting inputs properly, performing local inference, visualizing both classification and detection outputs, and leveraging official demos as reliable benchmarks. Additionally, we explored how the same model can transition from local PyTorch testing to Qualcomm’s cloud-device ecosystem for compilation, performance profiling, and real-hardware inference — giving you a clear path from initial experimentation to deployment-ready optimization using Qualcomm AI Hub.
*The post A Hands-On Coding Tutorial on Qualcomm AI Hub Models for Classification, Object Detection, and Hardware-Aware Deployment appeared first on MarkTechPost.*



