In this tutorial, we walk through an advanced, end-to-end exploration of Polyfactory, focusing on how we can generate rich, realistic mock data directly from Python type hints. We start by setting up the environment and progressively build factories for dataclasses, Pydantic models, and attrs-based classes, while demonstrating customization, overrides, calculated fields, and the generation of nested objects. As we move through each snippet, we show how we can control randomness, enforce constraints, and model real-world structures, making this tutorial directly applicable to testing, prototyping, and data-driven development workflows.
import subprocess
import sys


def install_package(package):
    # Install into the same interpreter that is running this script.
    subprocess.check_call([sys.executable, "-m", "pip", "install", "-q", package])


packages = [
    "polyfactory",
    "pydantic",
    "email-validator",
    "faker",
    "msgspec",
    "attrs",
]
for package in packages:
    try:
        install_package(package)
        print(f"✓ Installed {package}")
    except Exception as e:
        print(f"✗ Failed to install {package}: {e}")
print("\n")
print("=" * 80)
print("SECTION 2: Basic Dataclass Factories")
print("=" * 80)
from dataclasses import dataclass
from typing import List, Optional
from datetime import datetime, date
from uuid import UUID
from polyfactory.factories import DataclassFactory


@dataclass
class Address:
    street: str
    city: str
    country: str
    zip_code: str


@dataclass
class Person:
    id: UUID
    name: str
    email: str
    age: int
    birth_date: date
    is_active: bool
    address: Address
    phone_numbers: List[str]
    bio: Optional[str] = None


# No custom logic: every field is generated from its type hint.
class PersonFactory(DataclassFactory[Person]):
    pass


person = PersonFactory.build()
print("Generated Person:")
print(f" ID: {person.id}")
print(f" Name: {person.name}")
print(f" Email: {person.email}")
print(f" Age: {person.age}")
print(f" Address: {person.address.city}, {person.address.country}")
print(f" Phone Numbers: {person.phone_numbers[:2]}")
print()
# batch(n) builds n independent instances.
people = PersonFactory.batch(5)
print(f"Generated {len(people)} people:")
for i, p in enumerate(people, 1):
    print(f" {i}. {p.name} - {p.email}")
print("\n")

We set up the environment and ensure all required dependencies are installed. We also introduce the core idea of using Polyfactory to generate mock data from type hints. By initializing the basic dataclass factories, we establish the foundation for all subsequent examples.
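Before installing anything, it can help to check which dependencies are already importable. The following is a minimal sketch that uses only the standard library. The `REQUIRED` mapping and the `missing_packages` helper are hypothetical names introduced for illustration, and the module names assume each package's usual import name, which can differ from its pip name.

```python
import importlib.util

# Hypothetical mapping: pip package name -> importable module name.
# The two can differ (for example "email-validator" vs "email_validator").
REQUIRED = {
    "polyfactory": "polyfactory",
    "pydantic": "pydantic",
    "email-validator": "email_validator",
    "faker": "faker",
    "msgspec": "msgspec",
    "attrs": "attrs",
}


def missing_packages(required):
    """Return the pip names whose module cannot be located in this environment."""
    return [
        pip_name
        for pip_name, module_name in required.items()
        if importlib.util.find_spec(module_name) is None
    ]


missing = missing_packages(REQUIRED)
print("Missing packages:", missing if missing else "none")
```

Only the names reported as missing would then need to be passed to `install_package`.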
print("=" * 80)
print("SECTION 3: Customizing Factory Behavior")
print("=" * 80)
from faker import Faker
from polyfactory.fields import Use, Ignore
@dataclass
class Employee:
    employee_id: str
    full_name: str
    department: str
    salary: float
    hire_date: date
    is_manager: bool
    email: str
    internal_notes: Optional[str] = None


class EmployeeFactory(DataclassFactory[Employee]):
    __faker__ = Faker(locale="en_US")
    # Seed the factory's random generator so runs are repeatable.
    __random_seed__ = 42

    # A classmethod named after a field supplies that field's value.
    @classmethod
    def employee_id(cls) -> str:
        return f"EMP-{cls.__random__.randint(10000, 99999)}"

    @classmethod
    def full_name(cls) -> str:
        return cls.__faker__.name()

    @classmethod
    def department(cls) -> str:
        departments = ["Engineering", "Marketing", "Sales", "HR", "Finance"]
        return cls.__random__.choice(departments)

    @classmethod
    def salary(cls) -> float:
        return round(cls.__random__.uniform(50000, 150000), 2)

    @classmethod
    def email(cls) -> str:
        return cls.__faker__.company_email()


employees = EmployeeFactory.batch(3)
print("Generated Employees:")
for emp in employees:
    print(f" {emp.employee_id}: {emp.full_name}")
    print(f" Department: {emp.department}")
    print(f" Salary: ${emp.salary:,.2f}")
    print(f" Email: {emp.email}")
    print()
print()
print("=" * 80)
print("SECTION 4: Field Constraints and Calculated Fields")
print("=" * 80)
@dataclass
class Product:
    product_id: str
    name: str
    description: str
    price: float
    discount_percentage: float
    stock_quantity: int
    final_price: Optional[float] = None
    sku: Optional[str] = None


class ProductFactory(DataclassFactory[Product]):
    @classmethod
    def product_id(cls) -> str:
        return f"PROD-{cls.__random__.randint(1000, 9999)}"

    @classmethod
    def name(cls) -> str:
        adjectives = ["Premium", "Deluxe", "Classic", "Modern", "Eco"]
        nouns = ["Widget", "Gadget", "Device", "Tool", "Appliance"]
        return f"{cls.__random__.choice(adjectives)} {cls.__random__.choice(nouns)}"

    @classmethod
    def price(cls) -> float:
        return round(cls.__random__.uniform(10.0, 1000.0), 2)

    @classmethod
    def discount_percentage(cls) -> float:
        return round(cls.__random__.uniform(0, 30), 2)

    @classmethod
    def stock_quantity(cls) -> int:
        return cls.__random__.randint(0, 500)

    @classmethod
    def build(cls, **kwargs):
        # Derive dependent fields after the instance has been generated.
        instance = super().build(**kwargs)
        if instance.final_price is None:
            instance.final_price = round(
                instance.price * (1 - instance.discount_percentage / 100), 2
            )
        if instance.sku is None:
            name_part = instance.name.replace(" ", "-").upper()[:10]
            instance.sku = f"{instance.product_id}-{name_part}"
        return instance


products = ProductFactory.batch(3)
print("Generated Products:")
for prod in products:
    print(f" {prod.sku}")
    print(f" Name: {prod.name}")
    print(f" Price: ${prod.price:.2f}")
    print(f" Discount: {prod.discount_percentage}%")
    print(f" Final Price: ${prod.final_price:.2f}")
    print(f" Stock: {prod.stock_quantity} units")
    print()
print()

We focus on generating simple but realistic mock data using dataclasses and default Polyfactory behavior. We show how to quickly create single instances and batches without writing any custom logic. This helps us validate how Polyfactory automatically interprets type hints to populate nested structures.
print("=" * 80)
print("SECTION 6: Complex Nested Structures")
print("=" * 80)
from enum import Enum
class OrderStatus(str, Enum):
    PENDING = "pending"
    PROCESSING = "processing"
    SHIPPED = "shipped"
    DELIVERED = "delivered"
    CANCELLED = "cancelled"


@dataclass
class OrderItem:
    product_name: str
    quantity: int
    unit_price: float
    total_price: Optional[float] = None


@dataclass
class ShippingInfo:
    carrier: str
    tracking_number: str
    estimated_delivery: date


@dataclass
class Order:
    order_id: str
    customer_name: str
    customer_email: str
    status: OrderStatus
    items: List[OrderItem]
    order_date: datetime
    shipping_info: Optional[ShippingInfo] = None
    total_amount: Optional[float] = None
    notes: Optional[str] = None
class OrderItemFactory(DataclassFactory[OrderItem]):
    @classmethod
    def product_name(cls) -> str:
        products = ["Laptop", "Mouse", "Keyboard", "Monitor", "Headphones",
                    "Webcam", "USB Cable", "Phone Case", "Charger", "Tablet"]
        return cls.__random__.choice(products)

    @classmethod
    def quantity(cls) -> int:
        return cls.__random__.randint(1, 5)

    @classmethod
    def unit_price(cls) -> float:
        return round(cls.__random__.uniform(5.0, 500.0), 2)

    @classmethod
    def build(cls, **kwargs):
        instance = super().build(**kwargs)
        if instance.total_price is None:
            instance.total_price = round(instance.quantity * instance.unit_price, 2)
        return instance


class ShippingInfoFactory(DataclassFactory[ShippingInfo]):
    @classmethod
    def carrier(cls) -> str:
        carriers = ["FedEx", "UPS", "DHL", "USPS"]
        return cls.__random__.choice(carriers)

    @classmethod
    def tracking_number(cls) -> str:
        return ''.join(cls.__random__.choices('0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ', k=12))


class OrderFactory(DataclassFactory[Order]):
    @classmethod
    def order_id(cls) -> str:
        return f"ORD-{datetime.now().year}-{cls.__random__.randint(100000, 999999)}"

    @classmethod
    def items(cls) -> List[OrderItem]:
        # Delegate to the item factory so each item gets its computed total.
        return OrderItemFactory.batch(cls.__random__.randint(1, 5))

    @classmethod
    def build(cls, **kwargs):
        instance = super().build(**kwargs)
        if instance.total_amount is None:
            instance.total_amount = round(sum(item.total_price for item in instance.items), 2)
        # Only shipped or delivered orders need shipping details.
        if instance.shipping_info is None and instance.status in [OrderStatus.SHIPPED, OrderStatus.DELIVERED]:
            instance.shipping_info = ShippingInfoFactory.build()
        return instance


orders = OrderFactory.batch(2)
print("Generated Orders:")
for order in orders:
    print(f"\n Order {order.order_id}")
    print(f" Customer: {order.customer_name} ({order.customer_email})")
    print(f" Status: {order.status.value}")
    print(f" Items ({len(order.items)}):")
    for item in order.items:
        print(f" - {item.quantity}x {item.product_name} @ ${item.unit_price:.2f} = ${item.total_price:.2f}")
    print(f" Total: ${order.total_amount:.2f}")
    if order.shipping_info:
        print(f" Shipping: {order.shipping_info.carrier} - {order.shipping_info.tracking_number}")
print("\n")

We build more complex domain logic by introducing calculated and dependent fields inside factories. We show how we can derive values such as final prices, totals, and shipping details after object creation. This allows us to model realistic business rules directly inside our test data generators.
print("=" * 80)
print("SECTION 7: Attrs Integration")
print("=" * 80)
import attrs
from polyfactory.factories.attrs_factory import AttrsFactory
@attrs.define
class BlogPost:
    title: str
    author: str
    content: str
    views: int = 0
    likes: int = 0
    published: bool = False
    published_at: Optional[datetime] = None
    tags: List[str] = attrs.field(factory=list)


class BlogPostFactory(AttrsFactory[BlogPost]):
    @classmethod
    def title(cls) -> str:
        templates = [
            "10 Tips for {}",
            "Understanding {}",
            "The Complete Guide to {}",
            "Why {} Matters",
            "Getting Started with {}",
        ]
        topics = ["Python", "Data Science", "Machine Learning", "Web Development", "DevOps"]
        template = cls.__random__.choice(templates)
        topic = cls.__random__.choice(topics)
        return template.format(topic)

    @classmethod
    def content(cls) -> str:
        return " ".join(Faker().sentences(nb=cls.__random__.randint(3, 8)))

    @classmethod
    def views(cls) -> int:
        return cls.__random__.randint(0, 10000)

    @classmethod
    def likes(cls) -> int:
        return cls.__random__.randint(0, 1000)

    @classmethod
    def tags(cls) -> List[str]:
        all_tags = ["python", "tutorial", "beginner", "advanced", "guide",
                    "tips", "best-practices", "2024"]
        # sample() picks distinct tags, so a post never repeats a tag.
        return cls.__random__.sample(all_tags, k=cls.__random__.randint(2, 5))


posts = BlogPostFactory.batch(3)
print("Generated Blog Posts:")
for post in posts:
    print(f"\n '{post.title}'")
    print(f" Author: {post.author}")
    print(f" Views: {post.views:,} | Likes: {post.likes:,}")
    print(f" Published: {post.published}")
    print(f" Tags: {', '.join(post.tags)}")
    print(f" Preview: {post.content[:100]}...")
print("\n")
print("=" * 80)
print("SECTION 8: Building with Specific Overrides")
print("=" * 80)
# Keyword arguments to build() pin specific fields; the rest stay random.
custom_person = PersonFactory.build(
    name="Alice Johnson",
    age=30,
    email="[email protected]"
)
print("Custom Person:")
print(f" Name: {custom_person.name}")
print(f" Age: {custom_person.age}")
print(f" Email: {custom_person.email}")
print(f" ID (auto-generated): {custom_person.id}")
print()
# The same override is applied to every instance in the batch.
vip_customers = PersonFactory.batch(
    3,
    bio="VIP Customer"
)
print("VIP Customers:")
for customer in vip_customers:
    print(f" {customer.name}: {customer.bio}")
print("\n")

We extend Polyfactory usage to validated Pydantic models and attrs-based classes. We demonstrate how we can respect field constraints, validators, and default behaviors while still generating valid data at scale. This ensures our mock data remains compatible with real application schemas.
print("=" * 80)
print("SECTION 9: Field-Level Control with Use and Ignore")
print("=" * 80)
from polyfactory.fields import Use, Ignore
@dataclass
class Configuration:
    app_name: str
    version: str
    debug: bool
    created_at: datetime
    api_key: str
    secret_key: str


class ConfigFactory(DataclassFactory[Configuration]):
    # Use wraps a callable whose return value becomes the field value.
    app_name = Use(lambda: "MyAwesomeApp")
    version = Use(lambda: "1.0.0")
    debug = Use(lambda: False)

    @classmethod
    def api_key(cls) -> str:
        return f"api_key_{''.join(cls.__random__.choices('0123456789abcdef', k=32))}"

    @classmethod
    def secret_key(cls) -> str:
        return f"secret_{''.join(cls.__random__.choices('0123456789abcdef', k=64))}"


configs = ConfigFactory.batch(2)
print("Generated Configurations:")
for config in configs:
    print(f" App: {config.app_name} v{config.version}")
    print(f" Debug: {config.debug}")
    print(f" API Key: {config.api_key[:20]}...")
    print(f" Created: {config.created_at}")
    print()
print()
print("=" * 80)
print("SECTION 10: Model Coverage Testing")
print("=" * 80)
from pydantic import BaseModel, ConfigDict
from typing import Union
# ModelFactory is the Pydantic-aware factory base class.
from polyfactory.factories.pydantic_factory import ModelFactory


class PaymentMethod(BaseModel):
    model_config = ConfigDict(use_enum_values=True)
    type: str
    card_number: Optional[str] = None
    bank_name: Optional[str] = None
    verified: bool = False


class PaymentMethodFactory(ModelFactory[PaymentMethod]):
    __model__ = PaymentMethod


# Build one instance per variant we want the tests to cover.
payment_methods = [
    PaymentMethodFactory.build(type="card", card_number="4111111111111111"),
    PaymentMethodFactory.build(type="bank", bank_name="Chase Bank"),
    PaymentMethodFactory.build(verified=True),
]
print("Payment Method Coverage:")
for i, pm in enumerate(payment_methods, 1):
    print(f" {i}. Type: {pm.type}")
    if pm.card_number:
        print(f" Card: {pm.card_number}")
    if pm.bank_name:
        print(f" Bank: {pm.bank_name}")
    print(f" Verified: {pm.verified}")
print("\n")
print("=" * 80)
print("TUTORIAL SUMMARY")
print("=" * 80)
print("""
This tutorial covered:
1. ✓ Basic Dataclass Factories - Simple mock data generation
2. ✓ Custom Field Generators - Controlling individual field values
3. ✓ Field Constraints - Using PostGenerated for calculated fields
4. ✓ Pydantic Integration - Working with validated models
5. ✓ Complex Nested Structures - Building related objects
6. ✓ Attrs Support - Alternative to dataclasses
7. ✓ Build Overrides - Customizing specific instances
8. ✓ Use and Ignore - Explicit field control
9. ✓ Coverage Testing - Ensuring comprehensive test data
Key Takeaways:
- Polyfactory automatically generates mock data from type hints
- Customize generation with classmethods and decorators
- Supports multiple libraries: dataclasses, Pydantic, attrs, msgspec
- Use PostGenerated for calculated/dependent fields
- Override specific values while keeping others random
- Perfect for testing, development, and prototyping
For more information:
- Documentation:
- GitHub:
""")
print("=" * 80)

We cover advanced usage patterns such as explicit overrides, constant field values, and coverage testing scenarios. We show how we can intentionally construct edge cases and variant instances for robust testing. This final step ties everything together by demonstrating how Polyfactory supports comprehensive and production-grade test data strategies.
In conclusion, we demonstrated how Polyfactory enables us to create comprehensive, flexible test data with minimal boilerplate while still retaining fine-grained control over every field. We showed how to handle simple entities, complex nested structures, and Pydantic model validation, as well as explicit field overrides, within a single, consistent factory-based approach. Overall, we found that Polyfactory enables us to move faster and test more confidently, as it reliably generates realistic datasets that closely mirror production-like scenarios without sacrificing readability or maintainability.