AI Evaluation Specialist | $20-$35/hr Remote
Overview
This role focuses on training next-generation AI systems by designing self-contained evaluation tasks, grading rubrics, and prompts. The specialist will meticulously observe and document AI agent behaviors, producing clear reports to refine benchmarking methodologies. The position involves cross-functional collaboration and iterative improvement of evaluation frameworks. It's a remote contractor position paying $20-$35 per hour.
What You'll Do
- Design and implement self-contained evaluation tasks, including prompts, supporting files, and detailed grading rubrics to assess AI performance on practical computer-based workflows.
- Define clear, unambiguous written criteria for successful and unsuccessful task completion across diverse administrative and workflow scenarios.
- Meticulously observe and document AI agent behaviors, producing crisp, precise summaries and reports in high-quality English.
- Iterate and refine evaluation tasks and rubrics based on feedback and team collaboration to ensure robust benchmarking methodologies.
- Work cross-functionally across a wide range of domains, adapting evaluation frameworks as project requirements evolve.
- Collaborate with the customer's team to share insights and help drive continuous improvement in AI evaluation techniques.
Requirements
- Minimum 3 years of experience in roles emphasizing written precision and structured thinking (e.g., paralegal, executive assistant, junior analyst, librarian, technical writer, QA analyst).
- Native or fluent English writing ability with demonstrated skill in producing succinct, specific, and unambiguous observations.
- Proven skill in designing or applying rubric-based evaluation, grading against set criteria, or building structured scoring frameworks.
- High attention to detail and ability to notice subtle patterns or inconsistencies others might miss.
- Exceptional written and verbal communication skills for documenting nuanced observations and feedback.
- Fluency in navigating computers, common SaaS tools, web browsers, file management, and document editing platforms.
- Strong self-direction and ability to independently take ownership of ambiguous or loosely defined projects.
Who Should Apply
The ideal candidate has a background in roles that require written precision and structured thinking, such as paralegal, technical writer, or QA analyst. They should be skilled in rubric-based evaluation, have high attention to detail, and be comfortable working independently on ambiguous projects. Strong English writing and computer literacy are essential.
Salary Insight
$20-$35 per hour (contractor position).
Application Tip
Highlight specific examples of rubric-based evaluation or structured observation from past roles, and provide a sample of concise, precise written reporting in your application.