AI Evaluation Scenario Architect

the Homebase

Remote

GBP 40,000 - GBP 60,000

Part time

Today

Job summary

A leading tech opportunities provider in the UK is seeking software engineers for part-time, project-based roles focused on testing AI systems. The role requires a strong Python background and familiarity with Git. Contributors create test cases, define evaluation standards, and analyze agent performance logs. Pay is up to $50/hour, and working hours are flexible and vary with project complexity.

Qualifications

  • 3+ years of software development experience with a strong Python focus.
  • Experience working with Git and code repositories.
  • Comfort with structured formats such as JSON and YAML for describing scenarios.

Responsibilities

  • Create structured test cases that simulate complex human workflows.
  • Define gold-standard behavior and scoring logic to evaluate agent actions.
  • Analyze agent logs and decision paths.
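
To give candidates a feel for the work, the responsibilities above can be sketched in Python. This is only an illustrative sketch, not the employer's actual tooling: the scenario fields (`id`, `task`, `gold_actions`), the `score_run` helper, and the in-order matching rule are all hypothetical assumptions made for the example.

```python
import json

# Hypothetical scenario description; the field names are assumptions,
# not a format defined in this posting.
SCENARIO_JSON = """
{
  "id": "refund-request-001",
  "task": "Process a customer refund request",
  "gold_actions": ["lookup_order", "check_policy", "issue_refund"]
}
"""


def score_run(gold_actions, agent_actions):
    """Return the fraction of gold actions found in the agent log, in order.

    Matching is greedy: each gold action is searched for only after the
    position where the previous gold action was matched.
    """
    if not gold_actions:
        return 0.0
    position = 0
    matched = 0
    for step in gold_actions:
        try:
            position = agent_actions.index(step, position) + 1
        except ValueError:
            # The agent never performed this step (or did it out of order).
            continue
        matched += 1
    return matched / len(gold_actions)


scenario = json.loads(SCENARIO_JSON)

# A made-up agent log in which the policy check was skipped.
agent_log = ["lookup_order", "send_email", "issue_refund"]
score = score_run(scenario["gold_actions"], agent_log)
print(f"{scenario['id']}: {score:.2f}")
```

Real scoring logic would likely weigh steps differently or penalize unwanted actions. The simple in-order ratio here only shows how a gold standard and an agent log might be compared.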

Skills

Software development experience
Strong Python focus
Experience with Git
Structured formats like JSON/YAML
Understanding of core LLM limitations
Familiarity with Docker
English proficiency (B2)
