Bachelor's degree or equivalent practical experience
8 years of experience working with Large Language Models (LLMs) and building agents
Nice to haves:
Understanding of core machine learning (ML) concepts, training algorithms, and best practices for evaluation
Understanding of data and familiarity with basic statistical analysis concepts (e.g., variance, p-value, bias) and data science classics
Understanding of essential concepts such as tokens, context, Retrieval-Augmented Generation (RAG) and function calling
Familiarity with running quality iterations, which involves understanding business goals, defining aligned technical metrics, implementing evaluation frameworks, designing experiments, analyzing results, performing root cause analyses (RCAs), formulating hypotheses for improvements, and conducting ablation or live experiments
What you'll be doing:
Create high-quality datasets for evaluation and training to improve model performance for cloud customers
Design new agentic systems and context engineering algorithms, utilizing existing models and identifying the need for new training/evaluation objectives where necessary
Define and implement metrics that correspond to business problems
Prototype and iterate on solutions, working closely with customers, product management, and business development
Design and implement evaluation frameworks and tooling for dataset creation and curation