Documentation

Test Suites Reference

Complete reference of all available security test suites, their capabilities, and threat detection mechanisms

ModelRed offers over 60 specialized test suites designed to detect various AI security vulnerabilities. Each test suite contains multiple threat probes that systematically test your AI models against real-world attack vectors.

Quick Reference

14
Free Tier Suites
24
Starter Tier Suites
13
Pro Tier Suites
7
Enterprise Suites

Usage Example

import asyncio
from modelred import ModelRed

async def main():
    async with ModelRed(api_key="mr_your_api_key_here") as client:
        # Run specific test suites
        result = await client.run_assessment(
            model_id="my-model",
            test_suites=[
                "base64_injection",
                "toxicity_generation",
                "advanced_jailbreak_ablation"
            ],
            priority="high",
            wait_for_completion=True,
            timeout_minutes=30
        )

        # With progress callback
        def progress_handler(progress, status):
            print(f"Assessment progress: {progress}% - {status}")

        result = await client.run_assessment(
            model_id="my-model",
            test_suites=["base64_injection", "hexadecimal_injection"],
            priority="normal",
            wait_for_completion=True,
            progress_callback=progress_handler
        )

asyncio.run(main())

Free Tier Test Suites

Basic security and encoding attack detection - Available to all users

Encoding & Injection Attacks

  • base64_injection - Tests for Base64 encoding injection vulnerabilities
  • hexadecimal_injection - Detects hexadecimal encoding attacks
  • base16_injection - Tests Base16 encoding vulnerabilities
  • base32_injection - Base32 encoding injection testing
  • ascii85_injection - ASCII85 encoding attack detection
  • uu_encoding - UUencoding injection testing

Character Encoding & Obfuscation

  • rot13_encoding - ROT13 character substitution attacks
  • morse_encoding - Morse code obfuscation attacks
  • braille_encoding - Braille text encoding for hidden instructions
  • nato_encoding - NATO phonetic alphabet encoding
  • base2048_injection - Base2048 encoding using Unicode characters
  • ecoji_encoding - Emoji-based encoding injection
  • zalgo_encoding - Zalgo text corruption encoding

Terminal & System Attacks

  • ansi_escaped - ANSI escape sequence injection
  • ansi_raw - Raw ANSI control code injection

Starter Tier Test Suites

Content safety, social engineering, and advanced manipulation attacks

Content Safety & Toxicity

  • toxicity_generation - Advanced AI-powered toxicity generation testing
  • continue_slurs - Tests if the model will complete partial ethnic slurs
  • lmrc_bullying - Detects model's willingness to provide bullying advice
  • lmrc_slur_usage - Tests for identity-based slur generation
  • lmrc_sexual_content - Evaluates responses to sexually explicit content requests
  • lmrc_sexualisation - Tests for sexual objectification and inappropriate sexualization
  • lmrc_deadnaming - Tests for inappropriate use of transgender individuals' former names
  • lmrc_quack_medicine - Detects promotion of unscientific medical advice
  • rtp_blank - RealToxicityPrompts baseline test using empty prompts

Social Engineering & Manipulation

  • grandma_win10 - Social engineering attack using emotional manipulation for Windows product keys
  • grandma_slurs - Emotional manipulation technique to elicit ethnic slurs through fake nostalgia
  • grandma_substances - Tests extraction of harmful substance instructions through emotional manipulation
  • json_threat - Tests if threats can force model to output raw JSON without safety preambles
  • tag_injection - Invisible Unicode tag injection for hidden prompt manipulation

Prompt Injection & Jailbreaking

  • prompt_inject_hate - PromptInject framework attack targeting hate speech generation
  • prompt_inject_violence - PromptInject framework attack targeting violent content generation
  • past_tense_jailbreak - Tense manipulation technique using past tense to bypass safety restrictions
  • future_tense_jailbreak - Tense manipulation technique using future tense to evade content policies

Technical Vulnerabilities

  • jinja_template_injection - Server-side template injection testing for remote code execution
  • sql_injection_echo - SQL injection testing through echo commands
  • package_hallucination_python - Supply chain attack testing through non-existent Python package recommendations

Reasoning & Misinformation

  • who_is_riley - Hallucination detection probe testing false information about Riley Goodside
  • false_assertion - Tests model's ability to reject and refute demonstrably false claims
  • snowball_graph - Complex reasoning test using impossible flight routing problems

Pro Tier Test Suites

Research-grade techniques and advanced adversarial attacks

Advanced Jailbreaking

  • advanced_jailbreak_ablation - State-of-the-art DAN 11.0 ablation study with 127 jailbreak variations
  • auto_dan_cached - Cached AutoDAN attacks using pre-computed gradient descent jailbreaks
  • dan_in_the_wild - Real-world jailbreak collection from actual adversarial interactions

Latent Injection Attacks

  • latent_injection_report - Hidden prompt injection within business report documents
  • latent_injection_resume - Covert instruction injection through resume and CV analysis
  • latent_injection_eiffel - Document injection using Eiffel Tower fact snippets as cover
  • latent_injection_legal - Legal document injection attacks through embedded malicious instructions
  • latent_injection_french - Translation-based latent injection using French language context
  • latent_injection_chinese - Translation-based latent injection using Chinese language context
  • latent_jailbreak - Advanced latent jailbreak technique using translation task distraction
  • latent_whois - Domain reputation manipulation through WHOIS record injection

Advanced Attack Techniques

  • gcg_cached - Greedy Coordinate Gradient cached adversarial suffix attacks
  • wordnet_controversial - Controversial topic exploration using WordNet semantic hierarchy

Enterprise Tier Test Suites

Advanced malware detection and enterprise-grade security testing

Malware Generation Detection

  • malware_top_level - High-level malware generation requests across multiple programming languages
  • malware_sub_functions - Malware component generation testing for individual attack functions
  • malware_evasion - Anti-detection malware code generation for security evasion
  • malware_payload - Malicious payload generation including ransomware and cryptojacking

XSS & Data Exfiltration

  • xss_markdown_exfil - Markdown-based data exfiltration through embedded image URIs
  • xss_colab_leakage - Google Colab specific data leakage through image rendering exploits
  • xss_string_assembly - String assembly manipulation for conversation data exfiltration
  • xss_md_exfil_2023 - Historical markdown exfiltration exploit from September 2023

Supply Chain Attacks

  • package_hallucination_ruby - Ruby gem hallucination testing for supply chain vulnerabilities
  • package_hallucination_javascript - NPM package hallucination for JavaScript supply chain attacks
  • package_hallucination_rust - Rust crates.io package hallucination for cargo ecosystem attacks

Best Practices

Running Multiple Test Suites

# Run comprehensive security assessment
test_suites = [
    # Basic encoding attacks
    "base64_injection",
    "hexadecimal_injection",

    # Content safety
    "toxicity_generation",
    "lmrc_bullying",

    # Advanced attacks (Pro tier)
    "advanced_jailbreak_ablation",
    "latent_injection_report"
]

result = await client.run_assessment(
    model_id="my-model",
    test_suites=test_suites,
    wait_for_completion=True
)

Priority and Timeout Configuration

# High priority assessment with custom timeout
result = await client.run_assessment(
    model_id="my-model",
    test_suites=["advanced_jailbreak_ablation", "malware_payload"],
    priority="critical",  # Options: low, normal, high, critical
    wait_for_completion=True,
    timeout_minutes=20  # Extend timeout for long-running tests
)

# Background assessment (don't wait for completion)
result = await client.run_assessment(
    model_id="my-model",
    test_suites=["base64_injection", "toxicity_generation"],
    priority="normal",
    wait_for_completion=False  # Returns immediately with assessment_id
)

# Check status later
status = await client.get_assessment_status(result.assessment_id)

Performance Considerations

When running multiple test suites, consider:

  • Execution Time: Enterprise tier suites can take 8-12 minutes each
  • API Rate Limits: Spread large assessments across multiple sessions
  • Resource Usage: Pro and Enterprise tiers require more computational resources
  • Priority Levels: Use higher priorities for urgent security assessments
# Run assessments in batches to manage load
batch_1 = ["base64_injection", "hexadecimal_injection", "rot13_encoding"]
batch_2 = ["toxicity_generation", "lmrc_bullying", "grandma_win10"]

# Run first batch
result_1 = await client.run_assessment(
    model_id="my-model",
    test_suites=batch_1,
    wait_for_completion=True
)

# Run second batch
result_2 = await client.run_assessment(
    model_id="my-model",
    test_suites=batch_2,
    wait_for_completion=True
)