Probe Packs

Work with owned and imported probe packs — versioned security test suites for LLM assessment.

Introduction

Probe packs are curated collections of security tests (probes) designed to evaluate specific vulnerabilities or attack vectors in language models. They provide structured, versioned test suites for comprehensive security assessments.

Types of Probe Packs

ModelRed supports two types of probe packs:

Owned Probe Packs

Probe packs created by your organization. These are private by default but can be made public.

owned_packs = client.list_owned_probes(page_size=20)

for pack in owned_packs["data"]:
    print(f"{pack['name']}: {pack['probeCount']} probes")

Imported Probe Packs

Public probe packs from ModelRed or other organizations that you've imported into your workspace.

imported_packs = client.list_imported_probes(page_size=20)

for pack in imported_packs["data"]:
    print(f"{pack['name']} (imported)")

Important: Public probe packs must be imported before use in assessments. Import them via the web UI.

Listing Owned Probe Packs

Basic Listing

response = client.list_owned_probes(
    page=1,
    page_size=20,
)

for pack in response["data"]:
    print(f"{pack['id']}: {pack['name']}")
    print(f"  Category: {pack['category']}")
    print(f"  Probes: {pack['probeCount']}")
    print(f"  Public: {pack['isPublic']}")

Filtering by Category

# Get injection-related probe packs
injection_packs = client.list_owned_probes(
    category="injection",
)

# Get jailbreak probe packs
jailbreak_packs = client.list_owned_probes(
    category="jailbreak",
)

Search by Name

results = client.list_owned_probes(
    search="sql injection",
)

Filter by Visibility

# Only public packs
public_packs = client.list_owned_probes(
    is_public=True,
)

# Only private packs
private_packs = client.list_owned_probes(
    is_public=False,
)

Owned Probe Pack Parameters

All parameters are optional.

Parameter   Type              Default
page        int               1
page_size   int               20
category    string | None     -
search      string | None     -
is_public   bool | None       -
sort_by     string            "createdAt"
sort_dir    "asc" | "desc"    "desc"

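As a rough illustration of how the parameters above fit together, the sketch below assembles them into keyword arguments. build_owned_query is a hypothetical helper, not part of the SDK, and it assumes the client accepts exactly the keyword names listed in the table.

```python
from typing import Any, Dict, Optional


def build_owned_query(
    page: int = 1,
    page_size: int = 20,
    category: Optional[str] = None,
    search: Optional[str] = None,
    is_public: Optional[bool] = None,
    sort_by: str = "createdAt",
    sort_dir: str = "desc",
) -> Dict[str, Any]:
    """Hypothetical helper: assemble keyword arguments for list_owned_probes."""
    if sort_dir not in ("asc", "desc"):
        raise ValueError(f"sort_dir must be 'asc' or 'desc', got {sort_dir!r}")
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    query: Dict[str, Any] = {
        "page": page,
        "page_size": page_size,
        "sort_by": sort_by,
        "sort_dir": sort_dir,
    }
    optional = {"category": category, "search": search, "is_public": is_public}
    # Only None means "no filter"; is_public=False must be kept.
    query.update({key: value for key, value in optional.items() if value is not None})
    return query


query = build_owned_query(category="injection", is_public=False)
print(query)
```

The result can then be passed along as client.list_owned_probes(**query).
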
Listing Imported Probe Packs

Basic Listing

response = client.list_imported_probes(
    page=1,
    page_size=20,
)

for pack in response["data"]:
    print(f"{pack['name']} (imported on {pack['importedAt']})")

Sorting Options

# Sort by import date (default)
recent = client.list_imported_probes(
    sort_by="importedAt",
    sort_dir="desc",
)

# Sort by name
alphabetical = client.list_imported_probes(
    sort_by="name",
    sort_dir="asc",
)

# Sort by probe count
by_size = client.list_imported_probes(
    sort_by="probeCount",
    sort_dir="desc",
)

Imported Probe Pack Parameters

All parameters are optional.

Parameter   Type                                                               Default
page        int                                                                1
page_size   int                                                                20
category    string | None                                                      -
search      string | None                                                      -
sort_by     "importedAt" | "name" | "category" | "probeCount" | "promptCount"  "importedAt"
sort_dir    "asc" | "desc"                                                     "desc"

Iterating All Probe Packs

Use iterator helpers for automatic pagination:

Owned Packs

# Iterate all owned packs
for pack in client.iter_owned_probes(page_size=50):
    print(f"Owned: {pack['name']}")

# With filters
for pack in client.iter_owned_probes(
    page_size=50,
    category="injection",
    is_public=True,
):
    print(f"Public injection pack: {pack['name']}")

Imported Packs

# Iterate all imported packs
for pack in client.iter_imported_probes(page_size=50):
    print(f"Imported: {pack['name']}")

# With filters
for pack in client.iter_imported_probes(
    page_size=50,
    category="jailbreak",
):
    print(f"Imported jailbreak pack: {pack['name']}")

See the Pagination guide for more details.
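
To show roughly what an iterator helper does, here is a minimal sketch of page-by-page iteration. It is not the SDK's implementation: iter_pages and fake_list_owned_probes are hypothetical names, and the sketch assumes that a page shorter than page_size marks the end of the results, whereas the real API may expose explicit page metadata.

```python
from typing import Any, Callable, Dict, Iterator, List


def iter_pages(
    fetch_page: Callable[..., Dict[str, Any]],
    page_size: int = 50,
    **filters: Any,
) -> Iterator[Dict[str, Any]]:
    """Yield items from successive pages until a short or empty page is returned."""
    page = 1
    while True:
        response = fetch_page(page=page, page_size=page_size, **filters)
        items = response.get("data") or []
        yield from items
        if len(items) < page_size:
            break  # Assumed end-of-results signal.
        page += 1


# Stand-in for client.list_owned_probes, backed by an in-memory list.
_PACKS: List[Dict[str, Any]] = [
    {"id": f"pack_{i}", "name": f"Pack {i}"} for i in range(5)
]


def fake_list_owned_probes(page: int = 1, page_size: int = 20, **_: Any) -> Dict[str, Any]:
    start = (page - 1) * page_size
    return {"data": _PACKS[start:start + page_size]}


names = [pack["name"] for pack in iter_pages(fake_list_owned_probes, page_size=2)]
print(names)
```
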

Getting Probe Pack Details

Retrieve comprehensive information about a specific probe pack:

pack = client.get_probe_pack("pack_abc123")

print(f"Name: {pack['name']}")
print(f"Description: {pack['description']}")
print(f"Category: {pack['category']}")
print(f"Version: {pack['version']}")
print(f"Probe Count: {pack['probeCount']}")
print(f"Created: {pack['createdAt']}")

Getting Probe Pack Data

Access the actual probe content and prompts:

data = client.get_probe_pack_data("pack_abc123")

print(f"Probes: {len(data['probes'])}")

for probe in data["probes"]:
    print(f"\nProbe: {probe['name']}")
    print(f"Type: {probe['type']}")
    print(f"Prompts: {len(probe['prompts'])}")

    # Access individual prompts
    for prompt in probe["prompts"]:
        print(f"  - {prompt['text'][:50]}...")

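Note: Probe pack data can be large. Where possible, narrow the set of packs with the pagination and filter options of the list methods, and fetch full data only for the packs you need.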
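
Once the data is loaded, a quick tally can help gauge its size. The sketch below counts prompts per probe type. count_prompts_by_type is a hypothetical helper, and the sample payload is made up, following the shape shown under Response Structure.

```python
from collections import Counter
from typing import Any, Dict


def count_prompts_by_type(data: Dict[str, Any]) -> Counter:
    """Tally prompts per probe type in a probe pack data payload."""
    counts: Counter = Counter()
    for probe in data.get("probes", []):
        counts[probe["type"]] += len(probe.get("prompts", []))
    return counts


# Made-up payload in the documented shape.
sample_data = {
    "id": "pack_abc123",
    "probes": [
        {
            "name": "Basic SQL Injection",
            "type": "injection",
            "prompts": [{"text": "prompt one"}, {"text": "prompt two"}],
        },
        {
            "name": "Role-play bypass",
            "type": "jailbreak",
            "prompts": [{"text": "prompt three"}],
        },
    ],
}

for probe_type, total in sorted(count_prompts_by_type(sample_data).items()):
    print(f"{probe_type}: {total} prompts")
```
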

Common Categories

Probe packs are typically organized into these categories:

Injection

Prompt injection, SQL injection, and command injection attacks

Jailbreak

Attempts to bypass model safety guardrails and policies

Data Exfiltration

Tests for unauthorized data access or leakage

Prompt Leaking

Attempts to extract system prompts or instructions

Additional categories include:

  • Policy Violation — Tests for adherence to content policies and guidelines
  • Adversarial — General adversarial inputs and edge cases

Using Probe Packs in Assessments

Combine multiple probe packs for comprehensive testing:

# Get relevant packs
owned = client.list_owned_probes(category="injection", page_size=5)
imported = client.list_imported_probes(category="jailbreak", page_size=5)

# Collect IDs
pack_ids = []
if owned.get("data"):
    pack_ids.extend([p["id"] for p in owned["data"][:2]])
if imported.get("data"):
    pack_ids.extend([i["id"] for i in imported["data"][:2]])

# Create assessment with multiple packs
assessment = client.create_assessment_by_id(
    model_id="model_123",
    probe_pack_ids=pack_ids,
    detector_provider="openai",
    detector_api_key="sk-...",
    detector_model="gpt-4o-mini",
)
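
The ID-collection step can be factored into a small function. The sketch below is one way to do it, with the added assumption that an ID might appear in more than one list, so duplicates are skipped. collect_pack_ids is a hypothetical helper and the responses are made-up stand-ins for the list calls.

```python
from typing import Any, Dict, List


def collect_pack_ids(*responses: Dict[str, Any], per_list: int = 2) -> List[str]:
    """Take up to per_list IDs from each list response, skipping duplicates."""
    seen = set()
    pack_ids: List[str] = []
    for response in responses:
        for pack in (response.get("data") or [])[:per_list]:
            if pack["id"] not in seen:
                seen.add(pack["id"])
                pack_ids.append(pack["id"])
    return pack_ids


# Made-up responses standing in for list_owned_probes / list_imported_probes.
owned = {"data": [{"id": "pack_a"}, {"id": "pack_b"}, {"id": "pack_c"}]}
imported = {"data": [{"id": "pack_b"}, {"id": "pack_d"}]}

pack_ids = collect_pack_ids(owned, imported)
print(pack_ids)  # ['pack_a', 'pack_b', 'pack_d']
```
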

Versioning

Probe packs are versioned to ensure reproducible assessments:

pack = client.get_probe_pack("pack_abc123")
print(f"Version: {pack['version']}")  # e.g., "1.2.0"

# When creating assessments, the current version is automatically captured
assessment = client.create_assessment_by_id(
    model_id="model_123",
    probe_pack_ids=["pack_abc123"],  # Uses current version
    detector_provider="openai",
    detector_api_key="sk-...",
    detector_model="gpt-4o-mini",
)
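
If you record the versions used in an earlier assessment, you can check later whether any pack has moved on. The sketch below assumes versions are dotted integer strings such as "1.2.0", as in the example above. parse_version and find_version_drift are hypothetical helpers, and pack_def456 is a made-up ID.

```python
from typing import Dict, List, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Convert a dotted version such as "1.2.0" into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def find_version_drift(recorded: Dict[str, str], current: Dict[str, str]) -> List[str]:
    """Return IDs of packs whose current version is newer than the recorded one."""
    return [
        pack_id
        for pack_id, old in recorded.items()
        if pack_id in current and parse_version(current[pack_id]) > parse_version(old)
    ]


recorded = {"pack_abc123": "1.2.0", "pack_def456": "2.0.0"}
current = {"pack_abc123": "1.10.0", "pack_def456": "2.0.0"}
print(find_version_drift(recorded, current))  # ['pack_abc123']
```

Comparing tuples rather than raw strings matters here: as strings, "1.10.0" sorts before "1.2.0".
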

Response Structure

Probe Pack Summary

{
    "id": "pack_abc123",
    "name": "SQL Injection Test Suite",
    "description": "Comprehensive SQL injection probes",
    "category": "injection",
    "version": "1.0.0",
    "probeCount": 150,
    "promptCount": 450,
    "isPublic": False,
    "createdAt": "2024-12-01T10:00:00Z",
    "updatedAt": "2024-12-15T14:30:00Z"
}
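
When consuming summaries in your own tooling, a defensive key check can catch shape mismatches early. The sketch below is a hypothetical helper built from the keys in the example above. Whether every key is always present in real responses is an assumption, not something this page confirms.

```python
from typing import Any, Dict, List

# Keys taken from the summary example above.
SUMMARY_KEYS = (
    "id", "name", "description", "category", "version",
    "probeCount", "promptCount", "isPublic", "createdAt", "updatedAt",
)


def missing_summary_keys(pack: Dict[str, Any]) -> List[str]:
    """List the summary keys that are absent from a pack dictionary."""
    return [key for key in SUMMARY_KEYS if key not in pack]


partial = {"id": "pack_abc123", "name": "SQL Injection Test Suite"}
print(missing_summary_keys(partial))
```
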

Probe Pack Data

{
    "id": "pack_abc123",
    "name": "SQL Injection Test Suite",
    "version": "1.0.0",
    "probes": [
        {
            "id": "probe_1",
            "name": "Basic SQL Injection",
            "type": "injection",
            "prompts": [
                {
                    "id": "prompt_1",
                    "text": "'; DROP TABLE users; --",
                    "metadata": {...}
                }
            ]
        }
    ]
}

Async Examples

All operations have async equivalents:

import asyncio
from modelred import AsyncModelRed

async def main():
    async with AsyncModelRed(api_key="mr_...") as client:
        # List owned
        owned = await client.list_owned_probes(page_size=20)

        # List imported
        imported = await client.list_imported_probes(page_size=20)

        # Get details
        pack = await client.get_probe_pack("pack_abc123")

        # Get data
        data = await client.get_probe_pack_data("pack_abc123")

asyncio.run(main())
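
One benefit of the async client is that independent requests can run concurrently. The sketch below shows the pattern with asyncio.gather. FakeAsyncClient is a stand-in so the example runs without network access, and it mimics only the get_probe_pack call used above.

```python
import asyncio
from typing import Any, Dict, List


class FakeAsyncClient:
    """Stand-in for AsyncModelRed; returns made-up pack details."""

    async def get_probe_pack(self, pack_id: str) -> Dict[str, Any]:
        await asyncio.sleep(0)  # Yield control, as a real network call would.
        return {"id": pack_id, "name": f"Pack {pack_id}", "version": "1.0.0"}


async def fetch_packs(client: Any, pack_ids: List[str]) -> List[Dict[str, Any]]:
    """Fetch several probe packs concurrently; results keep the input order."""
    return await asyncio.gather(*(client.get_probe_pack(pid) for pid in pack_ids))


packs = asyncio.run(fetch_packs(FakeAsyncClient(), ["pack_a", "pack_b"]))
for pack in packs:
    print(f"{pack['id']}: v{pack['version']}")
```
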

Best Practices

Importing Public Packs

Before using a public probe pack, import it via the web UI. The SDK will raise NotFound if you reference an unimported pack.

  1. Log in to the ModelRed web application.
  2. Navigate to the Probe Packs section and find public/community packs.
  3. Click "Import" on the desired packs to add them to your workspace.
  4. Imported packs now appear in list_imported_probes() and can be used in assessments.
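
To avoid a NotFound at assessment time, you could check up front that every pack ID is in your workspace. The sketch below is a hypothetical pre-flight helper. In real use, the owned and imported arguments would come from iter_owned_probes() and iter_imported_probes(); here they are made-up lists.

```python
from typing import Any, Dict, Iterable, List


def find_unavailable(
    pack_ids: Iterable[str],
    owned: Iterable[Dict[str, Any]],
    imported: Iterable[Dict[str, Any]],
) -> List[str]:
    """Return the pack IDs that are neither owned nor imported."""
    available = {pack["id"] for pack in owned} | {pack["id"] for pack in imported}
    return [pack_id for pack_id in pack_ids if pack_id not in available]


# Made-up workspace contents for illustration.
owned_packs = [{"id": "pack_a"}]
imported_packs = [{"id": "pack_b"}]

to_import = find_unavailable(["pack_a", "pack_b", "pack_c"], owned_packs, imported_packs)
print(f"Import these via the web UI first: {to_import}")
```
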

Combining Categories

Test models comprehensively by combining probe packs from different categories:

pack_ids = []

# Add injection tests
injection = client.list_owned_probes(category="injection", page_size=1)
if injection["data"]:
    pack_ids.append(injection["data"][0]["id"])

# Add jailbreak tests
jailbreak = client.list_imported_probes(category="jailbreak", page_size=1)
if jailbreak["data"]:
    pack_ids.append(jailbreak["data"][0]["id"])

# Add policy tests
policy = client.list_owned_probes(category="policy", page_size=1)
if policy["data"]:
    pack_ids.append(policy["data"][0]["id"])

# Run comprehensive assessment
assessment = client.create_assessment_by_id(
    model_id="model_123",
    probe_pack_ids=pack_ids,
    detector_provider="openai",
    detector_api_key="sk-...",
    detector_model="gpt-4o-mini",
)

Version Pinning

For reproducible security audits, document the probe pack versions used:

# IDs of the probe packs passed to the assessment
probe_pack_ids = ["pack_abc123"]

packs_used = []

for pack_id in probe_pack_ids:
    pack = client.get_probe_pack(pack_id)
    packs_used.append({
        "id": pack["id"],
        "name": pack["name"],
        "version": pack["version"],
    })

# Store packs_used with assessment results for audit trail
print(f"Assessment used probe packs: {packs_used}")
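
One way to keep that audit trail is to serialize it as JSON next to the assessment results. The sketch below is an assumption about how you might structure such a record. The field names assessmentId, recordedAt, and probePacks are hypothetical and not defined by ModelRed.

```python
import json
from datetime import datetime, timezone
from typing import Any, Dict, List


def build_audit_record(assessment_id: str, packs_used: List[Dict[str, str]]) -> Dict[str, Any]:
    """Bundle the assessment ID, a UTC timestamp, and the pinned pack versions."""
    return {
        "assessmentId": assessment_id,
        "recordedAt": datetime.now(timezone.utc).isoformat(),
        # Sort by ID so records are easy to diff between runs.
        "probePacks": sorted(packs_used, key=lambda pack: pack["id"]),
    }


record = build_audit_record(
    "assessment_123",  # Made-up ID for illustration.
    [{"id": "pack_abc123", "name": "SQL Injection Test Suite", "version": "1.0.0"}],
)
print(json.dumps(record, indent=2))
```
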

Complete Example

# comprehensive_testing.py
from modelred import ModelRed
import os

client = ModelRed(api_key=os.environ["MODELRED_API_KEY"])

# Step 1: Discover available probe packs
print("=== Owned Probe Packs ===")
owned = client.list_owned_probes(page_size=10)
for pack in owned["data"]:
    print(f"- {pack['name']} ({pack['category']}) - {pack['probeCount']} probes")

print("\n=== Imported Probe Packs ===")
imported = client.list_imported_probes(page_size=10)
for pack in imported["data"]:
    print(f"- {pack['name']} ({pack['category']}) - {pack['probeCount']} probes")

# Step 2: Select packs for comprehensive testing
pack_ids = []

# Get one pack from each major category
categories = ["injection", "jailbreak", "exfiltration", "policy"]

for category in categories:
    # Try owned first
    packs = client.list_owned_probes(category=category, page_size=1)
    if packs.get("data"):
        pack_ids.append(packs["data"][0]["id"])
        continue

    # Fall back to imported
    packs = client.list_imported_probes(category=category, page_size=1)
    if packs.get("data"):
        pack_ids.append(packs["data"][0]["id"])

print(f"\n=== Selected {len(pack_ids)} probe packs for testing ===")

# Step 3: Get pack details for documentation
for pack_id in pack_ids:
    pack = client.get_probe_pack(pack_id)
    print(f"- {pack['name']} v{pack['version']} ({pack['probeCount']} probes)")

# Step 4: Create comprehensive assessment
models = client.list_models(status="active", page_size=1)
if models["data"] and pack_ids:
    assessment = client.create_assessment_by_id(
        model_id=models["data"][0]["id"],
        probe_pack_ids=pack_ids,
        detector_provider="openai",
        detector_api_key=os.environ["OPENAI_API_KEY"],
        detector_model="gpt-4o-mini",
        priority="high",
    )
    print(f"\n✓ Comprehensive assessment created: {assessment['id']}")
else:
    print("\n✗ No models or probe packs available")

Error Handling

Common errors when working with probe packs:

from modelred import NotFound, ValidationFailed

try:
    pack = client.get_probe_pack("nonexistent")
except NotFound as e:
    print(f"Probe pack not found: {e.message}")

try:
    assessment = client.create_assessment_by_id(
        model_id="model_123",
        probe_pack_ids=["unimported_public_pack"],
        detector_provider="openai",
        detector_api_key="sk-...",
        detector_model="gpt-4o-mini",
    )
except NotFound:
    print("Import the probe pack via web UI first")

See the Error Handling guide for comprehensive error management.

Next Steps