API Reference

Core

ImageEval
evalmedia.eval.ImageEval

CheckResult
evalmedia.core.CheckResult
Bases: BaseModel
Result of a single check evaluation.

EvalResult
evalmedia.core.EvalResult

CompareResult
evalmedia.core.CompareResult
Bases: BaseModel
Result of comparing multiple images.

best()
Return the top-ranked (label, result) pair.
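To illustrate the contract of best(), here is a minimal self-contained sketch. The field names (ranking, score) and the best-first ordering are assumptions for illustration; the real classes live in evalmedia.core.

```python
from dataclasses import dataclass

# Hypothetical stand-ins for the documented classes, for illustration only.
@dataclass
class EvalResult:
    score: float

@dataclass
class CompareResult:
    # Assumed: (label, result) pairs already sorted best-first by score.
    ranking: list[tuple[str, EvalResult]]

    def best(self) -> tuple[str, EvalResult]:
        """Return the top-ranked (label, result) pair."""
        return self.ranking[0]

cmp = CompareResult(ranking=[("v2", EvalResult(0.91)), ("v1", EvalResult(0.74))])
label, result = cmp.best()
print(label, result.score)  # → v2 0.91
```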
CheckStatus
evalmedia.core.CheckStatus
Bases: str, Enum
Status of a check evaluation.
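Because CheckStatus subclasses both str and Enum, its members compare equal to plain strings, which makes serialized results easy to check. A minimal sketch follows; the actual member names in evalmedia.core are not documented here and may differ.

```python
from enum import Enum

# Illustrative member names only; the real enum may use different values.
class CheckStatus(str, Enum):
    PASSED = "passed"
    FAILED = "failed"
    ERROR = "error"

# str subclassing means members compare equal to their string values:
print(CheckStatus.PASSED == "passed")  # → True
print(isinstance(CheckStatus.FAILED, str))  # → True
```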
Checks

BaseCheck
evalmedia.checks.base.BaseCheck
Bases: ABC
Abstract base class for all checks.
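A custom check subclasses this ABC and implements an async evaluate method. The sketch below mirrors the evaluate(image, prompt, judge=None) signature shown for the concrete checks in this reference; everything else (return shape, class name AlwaysPass) is an assumption for illustration.

```python
import asyncio
from abc import ABC, abstractmethod

# Minimal sketch of the abstract interface; the signature follows the
# evaluate() methods documented for CLIPSimilarity and ResolutionAdequacy.
class BaseCheck(ABC):
    @abstractmethod
    async def evaluate(self, image, prompt, judge=None):
        ...

class AlwaysPass(BaseCheck):
    """Toy subclass that passes every image, to show the override pattern."""
    async def evaluate(self, image, prompt, judge=None):
        return {"status": "passed", "score": 1.0}

result = asyncio.run(AlwaysPass().evaluate(image=b"", prompt="a cat"))
print(result["score"])  # → 1.0
```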
VLMCheck
evalmedia.checks.base.VLMCheck

ClassicalCheck
evalmedia.checks.base.ClassicalCheck

Image Checks

PromptAdherence
evalmedia.checks.image.prompt_adherence.PromptAdherence

FaceArtifacts
evalmedia.checks.image.face_artifacts.FaceArtifacts

HandArtifacts
evalmedia.checks.image.hand_artifacts.HandArtifacts

TextLegibility
evalmedia.checks.image.text_legibility.TextLegibility

AestheticQuality
evalmedia.checks.image.aesthetic_quality.AestheticQuality

StyleConsistency
evalmedia.checks.image.style_consistency.StyleConsistency
CLIPSimilarity
evalmedia.checks.image.clip_similarity.CLIPSimilarity
Bases: ClassicalCheck
Computes CLIP cosine similarity between prompt text and image.

async evaluate(image, prompt, judge=None)
Compute CLIP cosine similarity between the image and prompt.
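The scoring step of this check reduces to cosine similarity between two embedding vectors (here the CLIP model itself is out of scope, so toy vectors stand in for the real embeddings):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

text_emb = [0.6, 0.8]   # toy stand-ins for CLIP text/image embeddings
image_emb = [0.8, 0.6]
sim = cosine_similarity(text_emb, image_emb)
print(round(sim, 2))  # → 0.96
```

Scores near 1.0 mean the prompt and image embeddings point in nearly the same direction; scores near 0 mean they are unrelated.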
ResolutionAdequacy
evalmedia.checks.image.resolution_adequacy.ResolutionAdequacy
Bases: ClassicalCheck
Checks whether the image resolution meets minimum requirements.

async evaluate(image, prompt, judge=None)
Check image dimensions against minimum requirements.
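The core of a resolution check is a simple dimension comparison. The minimums below (512×512) and the parameter names are assumptions for illustration; the check's actual thresholds and configuration are not documented here.

```python
# Illustrative core of a resolution-adequacy check: compare image
# dimensions against assumed minimums.
def resolution_adequate(width: int, height: int,
                        min_width: int = 512, min_height: int = 512) -> bool:
    return width >= min_width and height >= min_height

print(resolution_adequate(1024, 768))  # → True
print(resolution_adequate(320, 240))   # → False
```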
Judges

Judge Protocol
evalmedia.judges.base.Judge
Bases: Protocol
Protocol that all judge backends must implement.

JudgeResponse
evalmedia.judges.base.JudgeResponse
Bases: BaseModel
Structured response from a VLM judge.
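Because Judge is a Protocol, any backend that structurally matches it can be used without inheriting from a base class. The method name (judge) and the JudgeResponse fields below are assumptions for illustration; only the protocol's existence and the response class are documented.

```python
from dataclasses import dataclass
from typing import Protocol, runtime_checkable

# Hypothetical shape of the judge contract, for illustration only.
@dataclass
class JudgeResponse:
    score: float
    reasoning: str

@runtime_checkable
class Judge(Protocol):
    async def judge(self, image, prompt: str) -> JudgeResponse: ...

class StubJudge:
    """A backend satisfies the protocol structurally; no subclassing needed."""
    async def judge(self, image, prompt: str) -> JudgeResponse:
        return JudgeResponse(score=0.8, reasoning="stub")

print(isinstance(StubJudge(), Judge))  # → True
```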
Rubrics

Rubric
evalmedia.rubrics.base.Rubric
Bases: BaseModel
A named collection of weighted checks with a pass/fail threshold.

WeightedCheck
evalmedia.rubrics.base.WeightedCheck
Bases: BaseModel
A check with an associated weight for rubric scoring.
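The descriptions above imply a weighted average of per-check scores compared against a threshold. The sketch below shows that arithmetic; the function and field names are assumptions, not the library's actual API.

```python
# Illustrative rubric scoring: weighted average of check scores,
# then a pass/fail decision against a threshold.
def rubric_score(check_scores: dict[str, float],
                 weights: dict[str, float]) -> float:
    total_weight = sum(weights.values())
    return sum(check_scores[name] * w for name, w in weights.items()) / total_weight

scores = {"prompt_adherence": 0.9, "aesthetic_quality": 0.6}
weights = {"prompt_adherence": 2.0, "aesthetic_quality": 1.0}
threshold = 0.7

score = rubric_score(scores, weights)   # (0.9*2 + 0.6*1) / 3 = 0.8
print(round(score, 2), score >= threshold)  # → 0.8 True
```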
Configuration

set_judge
evalmedia.config.set_judge(name, **kwargs)
Set the default judge backend.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Judge name (e.g. "claude", "openai"). | required |
| **kwargs | object | Additional config overrides (e.g. api_key). | {} |
compare
async evalmedia.eval.compare(images, prompt, checks=None, rubric=None, judge=None, labels=None)
Evaluate multiple images and rank them by score.
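The contract here is: score every image concurrently, then rank best-first. The runnable sketch below substitutes a toy scoring stub for the real checks/rubric/judge machinery so it is self-contained; only the evaluate-then-rank shape reflects the documented behavior.

```python
import asyncio

async def score_image(image: bytes, prompt: str) -> float:
    """Toy stand-in for running the real checks against one image."""
    return len(image) / 10.0  # placeholder heuristic

async def compare(images: list[bytes], prompt: str, labels: list[str]):
    """Score all images concurrently, return (label, score) pairs best-first."""
    scores = await asyncio.gather(*(score_image(img, prompt) for img in images))
    return sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)

ranking = asyncio.run(compare([b"abc", b"abcdef"], "a cat", ["v1", "v2"]))
print(ranking[0][0])  # → v2
```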
Integrations

openai_tool_schema
evalmedia.integrations.openai_tools.openai_tool_schema()
Return a tool definition compatible with OpenAI's function calling format.

Usage:

    tools = [openai_tool_schema()]
    response = client.chat.completions.create(..., tools=tools)
anthropic_tool_schema
evalmedia.integrations.anthropic_tools.anthropic_tool_schema()
Return a tool definition compatible with Anthropic's tool_use format.

Usage:

    tools = [anthropic_tool_schema()]
    response = client.messages.create(..., tools=tools)