## Prerequisites

Before using the OpenAI provider, you need:

- An OpenAI API key (get one at https://platform.openai.com/api-keys)
- The `OPENAI_API_KEY` environment variable set:
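For example, in a POSIX shell (the key value is a placeholder — substitute your own):

```shell
# Make the key available to the provider for this shell session.
export OPENAI_API_KEY="sk-your-key-here"
```

Add the line to your shell profile to persist it across sessions.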
## Basic Usage
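A minimal invocation looks like the following sketch. `eval-tool` is a placeholder for the actual CLI binary name, and `--model` is an assumed flag name based on the required options below:

```sh
# 'eval-tool' is a placeholder binary name; --model is assumed
# from the Required Options section.
eval-tool --model gpt-4o
```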
## Configuration Options

### Required Options
- Model: OpenAI model name to use for evaluations. Examples: `gpt-4o`, `gpt-4-turbo`, `gpt-3.5-turbo`, `gpt-4o-mini`
- API key: OpenAI API key for authentication. It can be supplied via the `OPENAI_API_KEY` environment variable instead of being passed as a flag.

### Optional Options
- Base URL: OpenAI API base URL for compatible endpoints. Use this to connect to OpenAI-compatible services or custom deployments. Environment variable: `OPENAI_BASE_URL`
- Organization ID: OpenAI organization ID for API requests. Environment variable: `OPENAI_ORG_ID`
- Temperature: Sampling temperature between 0 and 2. Higher values make output more random; lower values make it more deterministic. Range: 0.0 to 2.0
- Top-p: Nucleus sampling parameter, an alternative to sampling with temperature. Range: 0.0 to 1.0
- Max tokens: Upper bound on the number of tokens that can be generated for a completion.
- N: Number of chat completion choices to generate for each input message.
- Frequency penalty: Number between -2.0 and 2.0 that penalizes new tokens based on their existing frequency in the text. Range: -2.0 to 2.0
- Presence penalty: Number between -2.0 and 2.0 that penalizes new tokens based on whether they appear in the text so far. Range: -2.0 to 2.0
- Log probabilities: Whether to return log probabilities of the output tokens.
- Top log probabilities: Number of most likely tokens to return at each token position, each with an associated log probability. Range: 0 to 20
- Stop sequences: Up to 4 sequences where the API will stop generating further tokens. Use comma-separated values for multiple sequences. Example: `--stop "\n,END,STOP"`
- Logit bias: Modifies the likelihood of specified tokens appearing in the completion. Format: `token_id:bias_value,token_id:bias_value`, where bias values must be between -100 and 100. Example: `--logit-bias "1234:50,5678:-30"`
- Store: Whether to store the output of this chat completion request for model distillation or evaluation purposes.
- Service tier: Specifies the processing type used for serving the request. Options: `auto`, `default`, `flex`, `scale`, `priority`
- Reasoning effort: Constrains effort on reasoning for reasoning models like o1. Options: `none`, `minimal`, `low`, `medium`, `high`, `xhigh`

## Examples
### Basic Single-Turn Evaluation
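A sketch of a basic run with all sampling options left at their defaults. `eval-tool` is a placeholder for the actual CLI binary, and `--model` is an assumed flag name:

```sh
# Placeholder binary; only the required model option is set.
eval-tool --model gpt-4o-mini
```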
### Multi-Turn Evaluation with Custom Temperature
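A sketch lowering the temperature for more deterministic multi-turn runs. `eval-tool` is a placeholder binary name and `--temperature` is an assumed flag name; 0.2 falls within the documented 0.0–2.0 range:

```sh
# Placeholder binary; lower temperature makes output more deterministic
# across turns. The flag name mirrors the Temperature option above.
eval-tool --model gpt-4o --temperature 0.2
```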
### Using a Custom Fine-Tuned Model
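Fine-tuned models are selected by passing their full model ID as the model name. `eval-tool` is a placeholder binary, and this specific ID is illustrative — it follows OpenAI's fine-tuned model naming scheme (`ft:<base-model>:<org>::<job-id>`):

```sh
# Placeholder binary; the ft: model ID below is illustrative only.
eval-tool --model "ft:gpt-4o-mini:my-org::example123"
```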
### Using an OpenAI-Compatible Endpoint
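The documented `OPENAI_BASE_URL` variable redirects requests to any OpenAI-compatible server. In this sketch, `eval-tool` is a placeholder binary and the localhost URL is illustrative (e.g. a locally hosted inference server):

```sh
# Point the provider at an OpenAI-compatible endpoint via the
# documented OPENAI_BASE_URL environment variable.
export OPENAI_BASE_URL="http://localhost:8000/v1"
eval-tool --model gpt-4o   # placeholder binary name
```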
### Advanced Configuration with Multiple Parameters
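Several options can be combined in one invocation. In this sketch, `eval-tool` is a placeholder binary; of the flags shown, only `--stop` and `--logit-bias` are confirmed by this page — the rest are assumptions that mirror the option names above:

```sh
# Placeholder binary; --stop and --logit-bias appear in this page's
# option reference, the other flag names are assumed.
eval-tool \
  --model gpt-4o \
  --temperature 0.7 \
  --max-tokens 1024 \
  --frequency-penalty 0.5 \
  --stop "\n,END" \
  --logit-bias "1234:50,5678:-30"
```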
## Supported Models
The OpenAI provider supports all OpenAI chat completion models, including:

- GPT-4o: Latest multimodal flagship model
- GPT-4o-mini: Smaller, faster GPT-4o variant
- GPT-4 Turbo: High-performance GPT-4 variant
- GPT-4: Original GPT-4 model
- GPT-3.5 Turbo: Fast and cost-effective model
- o1: Reasoning model series (use with `--reasoning-effort`)
- Custom fine-tuned models: Any fine-tuned model based on supported base models
For the most up-to-date list of available models and their capabilities, see the OpenAI Models documentation.
## Environment Variables
The following environment variables are supported:

| Variable | Description | Required |
|---|---|---|
| `OPENAI_API_KEY` | Your OpenAI API key | Yes |
| `OPENAI_BASE_URL` | Custom API endpoint URL | No |
| `OPENAI_ORG_ID` | Your OpenAI organization ID | No |