Code Data Quality Specialist
Original Advert
We're seeking highly motivated Data Quality Specialists with strong analytical skills and a keen eye for detail to join our Human Data Annotation team within the Science organisation.
Key Responsibilities
- Generate and validate high-quality data annotations, based on guidelines and continuous feedback, for the development and evaluation of AI models
- Surface systemic issues, edge cases, and gaps in guidelines back to annotation operations and technical stakeholders
- Produce annotations yourself when needed, modeling the quality bar expected of the team
- Build and maintain internal tools and automation that streamline annotator workflows, such as visualization dashboards, batch-configuration scripts, and output-management utilities
- Troubleshoot environment, tooling, and CLI/git issues for annotators on their local machines, liaising with IT and engineering as needed
About you
- A degree in computer science, engineering, or a related field, or 2 to 5 years of professional experience in software engineering, technical support, or tool development
- Hands-on experience using code agents (e.g. Mistral's vibe) in your own development workflow, and genuine interest in how they're evolving
- Proficiency in at least one programming language (e.g. Python, JavaScript, or similar), with enough breadth to read and reason about code across a few core languages
- Ability to apply consistent judgment against a rubric and to surface edge cases, ambiguities, or gaps in guidelines
- Sustained focus and accuracy on detail-oriented, high-volume review work
- Comfort working in a Unix-like terminal: shell basics, package managers, environment setup, and git workflows (branches, merges, resolving conflicts)
- Ability to troubleshoot local development-environment issues (dependencies, virtual environments, paths, permissions) across common operating systems
- Professional proficiency in English, with strong writing and comprehension skills
Nice to have
- Prior experience in data annotation for AI/ML, especially LLM training (SFT, RLHF, preference data), evals/benchmarks, or agentic data
- Experience building an annotation team through interviews and training
- Experience supporting technical users or troubleshooting developer environments (internal tools support, DevRel, teaching assistant for coding courses, etc.)
- Fluency across multiple programming languages, or domain depth in one of: frontend, backend, DevOps, MLOps, data engineering
- Familiarity with rubric-based evaluation concepts, inter-annotator agreement, or quality measurement for human-labeled data
- Experience developing, deploying, and managing internal tooling or automation scripts
Application managed by Mistral AI