Posted: 27 November
Intermediate Machine Learning Engineer - AI Framework
Company
GitLab
GitLab is a comprehensive DevSecOps platform that helps teams deliver software faster and more efficiently while strengthening security and compliance.
Remote Hiring Policy:
GitLab has a flexible remote work policy, allowing employees to work from anywhere. GitLab hires remotely from all over the world and has been all-remote since its inception.
Job Type
Full-time
Allowed Applicant Locations
Serbia, Worldwide
Job Description
An overview of this role
Are you passionate about building robust frameworks to evaluate and ensure the reliability of AI models? As a Machine Learning Engineer on GitLab’s AI Framework (AIF) team, you’ll play a critical role in shaping the future of AI-powered features at GitLab. This is an exciting opportunity to work on impactful projects that directly influence the quality of GitLab’s AI capabilities.
You’ll help consolidate cutting-edge evaluation tools, optimize dataset management, and scale our validation infrastructure. Working closely with other AI feature teams, you’ll ensure that every AI feature we deliver is robust, reliable, and meets the highest quality standards.
Some challenges in this role include designing scalable solutions for LLM evaluation, consolidating disparate validation tools, and contributing to GitLab’s innovative AI roadmap.
Some examples of our projects:
- Consolidating Evaluation Tooling (The GitLab Handbook)
- GitLab.org / AI Powered / ELI5
What You’ll Do
- Design and implement technical evaluators for LLM assessment (a minimal sketch follows this list).
- Contribute to evaluation infrastructure consolidation efforts.
- Build scalable evaluation pipelines and frameworks.
- Develop and manage datasets and evaluation metrics.
- Collaborate with feature teams to integrate validation solutions.
- Optimize performance across ML evaluation systems.
- Support improvements to GitLab’s AI-powered tools through validation.
- Ensure all solutions align with GitLab’s infrastructure and security protocols.
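To give a concrete flavor of the first responsibility above, here is a minimal sketch of a technical evaluator: a scoring metric applied to model output across a small dataset. Every name in it (EvalCase, exact_match, evaluate) is a hypothetical illustration, not GitLab’s actual framework.

    # Minimal sketch of a technical evaluator for LLM assessment.
    # All names here are hypothetical, not GitLab's actual framework.
    from dataclasses import dataclass
    from typing import Callable

    @dataclass
    class EvalCase:
        prompt: str
        expected: str

    def exact_match(output: str, expected: str) -> float:
        # Score 1.0 when the model output matches the reference exactly.
        return 1.0 if output.strip() == expected.strip() else 0.0

    def evaluate(model: Callable[[str], str],
                 dataset: list[EvalCase],
                 metric: Callable[[str, str], float]) -> float:
        # Run every case through the model and average the metric scores.
        scores = [metric(model(case.prompt), case.expected) for case in dataset]
        return sum(scores) / len(scores) if scores else 0.0

    # Usage with a stub "model":
    cases = [EvalCase(prompt="2 + 2 =", expected="4")]
    print(evaluate(lambda p: "4", cases, exact_match))  # 1.0

In practice the metric and the model client would be pluggable, which is what lets one pipeline serve many AI feature teams.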
What You’ll Bring
- Proven experience designing and implementing LLM evaluation systems.
- Strong understanding of ML model architectures, including public vs. private implementations.
- Expertise in ML evaluation metrics and dataset management (see the sketch after this list).
- Demonstrated ability to build production-grade ML infrastructure.
- Practical experience with Python-based ML frameworks and evaluation tools (e.g., LangSmith, ELI5).
- Excellent problem-solving skills with an engineering mindset.
- Ability to collaborate in an asynchronous, remote-first environment.
- Familiarity with open-source development and contribution is a plus.
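As a hedged illustration of the metrics-and-dataset point above, the sketch below stores evaluation cases as JSONL and computes a simple token-overlap metric. The file layout, field names, and metric are assumptions for illustration only.

    # Hypothetical sketch of dataset management for evaluation: cases
    # live one JSON object per line (JSONL), and a metric is aggregated
    # over the whole set. Field names are illustrative.
    import json
    from pathlib import Path

    def load_cases(path: str) -> list[dict]:
        # Each line looks like {"prompt": "...", "expected": "..."}.
        lines = Path(path).read_text().splitlines()
        return [json.loads(line) for line in lines if line.strip()]

    def token_overlap(output: str, expected: str) -> float:
        # Fraction of reference tokens that also appear in the output;
        # a crude stand-in for more careful similarity metrics.
        ref, got = set(expected.split()), set(output.split())
        return len(ref & got) / len(ref) if ref else 0.0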
About the team
The AIF team ensures that AI models across GitLab are reliable and well-validated. We focus on building robust evaluation frameworks, consolidating tools, and streamlining processes to scale validation efforts across GitLab’s AI infrastructure. Working on high-impact projects, the team partners with AI feature teams to deliver quality-focused solutions that enhance user trust and product performance.
How GitLab will support you
- All-remote, asynchronous work environment
- Home office support
Please note that we welcome interest from candidates with varying levels of experience; many successful candidates do not meet every single requirement. Additionally, studies have shown that people from underrepresented groups are less likely to apply to a job unless they meet every single qualification. If you're excited about this role, please apply and allow our recruiters to assess your application.
Remote-Global