
Hugging Face

Integrate Hugging Face models with ModelRed

🤗 Hugging Face Models

Test thousands of open-source models from the Hugging Face Hub using the Inference API or your own dedicated endpoints.

Quick Setup

Get Started in 3 Steps

Connect any Hugging Face model for security testing.

1. Get Your API Token

Visit Hugging Face Tokens and create an API token (a token is optional for public models). Tokens look like:

hf_your_token_here
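To confirm the token works before wiring it into ModelRed, you can query the Hub with the official huggingface_hub client. A quick sketch (assumes pip install huggingface_hub; the token value is a placeholder):

PYTHON
from huggingface_hub import whoami

# Raises an error if the token is invalid or expired
info = whoami(token="hf_your_token_here")
print(f"Authenticated as: {info['name']}")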
2. Set Environment Variable

BASH
export HUGGINGFACE_API_TOKEN="hf_your_token_here"
3. Register Your Model

PYTHON
from modelred import ModelRed

async with ModelRed() as client:
    await client.register_huggingface_model(
        model_id="my-llama-model",                    # identifier used inside ModelRed
        model_name="meta-llama/Llama-2-7b-chat-hf",   # repository name on the Hub
        api_key="hf_your_token",  # or use env var
        use_inference_api=True
    )
Popular Models

🦙 Llama 2 7B Chat

Popular open-source chat model

Most Popular · Meta
💻 CodeLlama 7B

Code generation and instruction following

Code · Instruct

Mistral 7B

Fast and efficient instruction model

Fast · Efficient

Configuration Options

Advanced Setup

Different ways to connect your Hugging Face models.

🌐 Inference API

PYTHON
# Quick testing with hosted models
await client.register_huggingface_model(
    model_id="quick-test",
    model_name="microsoft/DialoGPT-medium",
    use_inference_api=True,
    task="text-generation"
)
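If you are not sure which task a model supports, check its pipeline tag on the Hub. A quick sketch using the huggingface_hub client (not part of the ModelRed SDK):

PYTHON
from huggingface_hub import model_info

# The pipeline tag tells you which task the model was published for
info = model_info("microsoft/DialoGPT-medium")
print(info.pipeline_tag)  # e.g. "text-generation"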
🏢 Custom Endpoint

PYTHON
# Dedicated endpoint for production
await client.register_huggingface_model(
    model_id="production-model",
    model_name="your-org/custom-model",
    use_inference_api=False,
    endpoint_url="https://xyz.endpoints.huggingface.cloud"
)
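Before registering a dedicated endpoint, it can help to confirm it is up and accepting requests. A minimal smoke test, assuming a standard text-generation Inference Endpoint that accepts the usual {"inputs": ...} payload:

PYTHON
import os
import requests

endpoint_url = "https://xyz.endpoints.huggingface.cloud"
headers = {"Authorization": f"Bearer {os.environ['HUGGINGFACE_API_TOKEN']}"}

# Text-generation endpoints respond to a simple {"inputs": ...} POST
response = requests.post(endpoint_url, headers=headers, json={"inputs": "Hello"})
response.raise_for_status()
print(response.json())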

Common Issues

⚠️ Troubleshooting

Model Not Found
RepositoryNotFoundError: Model not found
Solutions:
  • Check the exact model name on the Hugging Face Hub
  • Verify the model supports text generation
  • Ensure you have access to private models
Rate Limit Exceeded
RateLimitError: Too many requests
Solutions:
  • Get Hugging Face Pro for higher limits
  • Use dedicated endpoints for production
  • Add retry logic with backoff (see the sketch below)
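A minimal retry-with-backoff sketch around run_assessment. The broad except Exception stands in for the SDK's rate-limit error, whose import path is not shown here; narrow it to the real exception type in practice:

PYTHON
import asyncio
from modelred import ModelRed

async def run_with_backoff(client: ModelRed, model_id: str, retries: int = 5):
    delay = 1.0
    for attempt in range(retries):
        try:
            return await client.run_assessment(
                model_id=model_id,
                test_suites=["basic_security"],
            )
        except Exception as exc:  # substitute the SDK's RateLimitError here
            if attempt == retries - 1:
                raise
            print(f"Attempt {attempt + 1} failed ({exc}); retrying in {delay:.0f}s")
            await asyncio.sleep(delay)
            delay *= 2  # double the wait after each failure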
Model Loading Failed
ModelError: Model failed to load
Solutions:
  • The model may be too large for the Inference API (the sketch below shows how to estimate repository size)
  • Try smaller model variants (7B instead of 13B/70B)
  • Check model compatibility with the task type
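To gauge whether a model is too large for the hosted Inference API, you can sum the repository's file sizes. A sketch using huggingface_hub (the practical size limit varies, so treat the total as a rough guide):

PYTHON
from huggingface_hub import model_info

# files_metadata=True populates per-file sizes on info.siblings
info = model_info("bigscience/bloom-7b1", files_metadata=True)
total_gb = sum(f.size or 0 for f in info.siblings) / 1e9
print(f"Repository size: {total_gb:.1f} GB")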

Quick Test

Verify Your Setup

Run this test to confirm your Hugging Face integration is working:

PYTHON
import asyncio
from modelred import ModelRed

async def test_huggingface():
    async with ModelRed() as client:
        # Register Hugging Face model
        await client.register_huggingface_model(
            model_id="test-hf-model",
            model_name="microsoft/DialoGPT-medium",
            use_inference_api=True
        )
        print("✅ Hugging Face model registered!")

        # Run security test
        result = await client.run_assessment(
            model_id="test-hf-model",
            test_suites=["basic_security", "content_safety"]
        )
        print(f"🔍 Assessment started: {result.assessment_id}")

asyncio.run(test_huggingface())
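If both messages print, the integration is working end to end: the model registered successfully and an assessment was queued. Keep the printed assessment_id; it identifies the run when you retrieve the results later.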

Model Categories

📚 Popular Model Types

Chat & Instruction Models

meta-llama/Llama-2-7b-chat-hf

mistralai/Mistral-7B-Instruct-v0.1

microsoft/DialoGPT-medium

HuggingFaceH4/zephyr-7b-beta

Code & Specialized Models

codellama/CodeLlama-7b-Instruct-hf

bigcode/starcoder

bigscience/bloom-7b1

google/flan-t5-large

Next Steps