Academic Research Collaborator for AI Model Evaluation Job at SaidGig, United States

  • SaidGig
  • United States

Job Description

Join a leading AI lab's cutting-edge GenAI team to be at the core of the AI revolution, where your expertise fuels the development of the most advanced Large Language Models.

Overview

Professors and PhD students across academic disciplines, including STEM (ML, Coding, Data Science, CS, Physics, Mathematics, Engineering, Statistics) and professional and quantitative domains (Finance, Accounting, Economics, Law, Business), are invited to contribute to a frontier-model evaluation effort focused on coding and agentic workflows. You will design and validate challenging benchmark tasks that surface and diagnose reasoning and problem-solving gaps in a target model. The work centers on building robust, real-world tasks with executable Python tests and analyzing model and agent behavior. All applicants are expected to have working proficiency in Python.

This is a W-2 employment position with Cincinnatus LLC, with the opportunity to be placed at a leading AI lab as part of its extended workforce. You will join a team of domain experts, and together you will guide the next generation of frontier AI tools.

Key Responsibilities
  • Task Design and Development: Design challenging, real-world domain-specific problems drawn from your area of expertise (e.g., financial modeling, legal reasoning, econometrics, ML, coding, scientific computation) that serve as the foundation for agentic tasks. Problems should be constructed to target specific core capability loss failures identified in a frontier AI model.
  • Spec & Golden Solution Generation: Integrate the problems into an agentic development environment, preparing all necessary components using Python.
  • Evaluation and Analysis: Evaluate the target model's performance on the tasks.
  • Headroom Identification: Identify tasks where the target model fails to pass all tests, specifically classifying the failure as a logical reasoning failure.

Core Qualifications
  • Current or retired professor, OR PhD student, in any of the following areas:
    • STEM: ML, Coding, Data Science, CS, Physics, Mathematics, Engineering, Statistics, Biology, Chemistry
    • Professional / Quantitative: Finance, Accounting, Economics, Law, Business
  • Degree (or PhD in progress) from a top university in your field.
  • Working proficiency in Python, applied in research, industry, GitHub, or coursework (not theoretical familiarity).
  • Ability to engage reliably for at least 30 hours/week on weekdays (i.e., at least 6 hours/day).
  • Past experience in AI training, model evaluation, and data annotation is preferred.
  • Basic ability to work independently and manage one's time.
  • Verbal and written communication skills, problem-solving skills, and interpersonal skills.

About Cincinnatus LLC

Cincinnatus LLC is an enterprise staffing company that partners with leading technology companies to source and employ highly skilled professionals for contingent and contract-based opportunities. Cincinnatus serves as the employer of record for these engagements, providing W-2 employment, payroll, benefits, and compliance, while placing employees directly within client teams to work on high-impact initiatives.

Roles hired through Cincinnatus are not project-based or freelance engagements. They are structured, role-based positions that typically involve part-time or full-time commitments, close collaboration with a client's internal teams, and integration into standard enterprise workflows.

Cincinnatus is a legal entity separate from any platform. While opportunities may be discovered through various channels, employment, onboarding, payroll, and benefits for these roles are administered by Cincinnatus LLC.

Equal Employment Opportunity

Cincinnatus is proud to be an Equal Employment Opportunity employer. We do not discriminate based upon race, religion, color, national origin, sex (including pregnancy), age, disability, sexual orientation, gender identity, or any other characteristic protected by applicable law.

Job Tags

Full time, Contract work, Part time, Freelance, Work at office, Weekday work
