Join a cutting-edge research team working to deliver on the transformative promise of modern AI. We are seeking Machine Learning Research Engineers with the skills and drive to build and experiment with advanced AI systems in an academic environment rich with high-quality data from real-world problems.

Foundational Research is the dedicated core Machine Learning research division of Thomson Reuters. We concentrate on research and development, with a particular focus on advanced algorithms and training techniques for Large Language Models (LLMs). We are expanding our strong foundation of research capabilities across different areas and are looking for engineers who will participate in designing, coding, running experiments, and translating findings into concrete deliverables. Our focus areas are:

* LLM training (continued pretraining, instruction tuning, reinforcement learning, distributed training, efficient ML techniques)
* Post-training techniques for planning, reasoning, and complex workflows (e.g., reasoning models, LLMs + knowledge graphs, test-time compute, CoT pipelines, tool use and API calling)
* Data-centric machine learning (synthetic data, curriculum learning, learned data mixtures, etc.)
* Evaluation (benchmarking best practices, humans/LLMs as judges, red teaming/adversarial testing, hallucination detection, etc.)

We work collaboratively with TR Labs (TR's applied research division), academic partners at world-leading research institutions, and subject matter experts with decades of experience. We experiment, prototype, test, and deliver ideas in the pursuit of smarter and more valuable models, trained on an unprecedented wealth of data and powered by state-of-the-art technical infrastructure. Through our unique institutional experience, we have access to an exceptional number of subject matter experts involved in data collection and in the testing and evaluation of trained models.

As an ML Research Engineer, you will play a key part in a diverse global team of experts. We hire world-leading specialists in ML/NLP/GenAI, as well as Engineering, to drive the company's internal AI model development. You will have the opportunity to contribute to our proprietary AI model research and development through rapid prototyping, scalable infrastructure, and production-quality implementations, and to research papers at top-tier academic conferences and journals.

About the role

In this opportunity, as an ML Research Engineer, you will:

* Build: You will design and implement robust, scalable systems for training and evaluating large language models. You'll build data pipelines for data-centric research; training infrastructure for instruction fine-tuning (IFT), Direct Preference Optimization (DPO), and reinforcement learning workflows; evaluation frameworks for comprehensive model assessment; and infrastructure for agentic workflows that lets researchers iterate quickly and effectively.
* Innovate: You will work at the cutting edge of AI research at an institution with some of the richest data sources in the world. You will rapidly implement novel research ideas in LLM training, evaluation, agentic systems, and data processing, transforming them into production-ready systems and research publications. You will contribute to advancing the state of the art in data-centric ML by building tools that help us make the best use of our unprecedented data resources.
* Experiment and Develop: You will be involved in the entire research and model development lifecycle: brainstorming, coding, testing, and delivering high-quality implementations that support cutting-edge research. You'll build pipelines for synthetic data generation, automated evaluation systems, training workflows for a range of fine-tuning approaches, and agent-based workflows that push the boundaries of what's possible with LLMs.
* Collaborate: You will work on a collaborative global team of research engineers and scientists, both within Thomson Reuters and with our academic partners at world-leading universities. You'll work closely with researchers to understand their needs and to translate cutting-edge research papers into practical, scalable implementations.
* Communicate: You will actively share technical implementations and best practices with the wider team through code reviews, documentation, technical presentations, and knowledge-sharing sessions. You will contribute to internal research discussions and stay current with the latest developments in LLM training, evaluation, agentic AI, and data-centric machine learning.

About you

You're a fit for the role if your background includes:

Required qualifications:

* Bachelor's or Master's degree in Computer Science, Engineering, or a relevant discipline (or equivalent practical experience)
* 3+ years of hands-on experience building ML/NLP/AI systems with strong software engineering practices
* Demonstrated expertise in building production-quality code and data pipelines for ML systems
* Proficiency in modern AI development frameworks, including PyTorch, JAX, Hugging Face Transformers, LLM APIs (e.g., LiteLLM), and vLLM, for building and deploying large-scale AI applications
* Understanding of LLM training methodologies, including instruction fine-tuning, preference optimization, and reinforcement learning approaches
* Strong software engineering skills, including version control, testing, CI/CD, and code quality practices
* Hands-on experience with experiment tracking and orchestration tools such as ClearML, Weights & Biases, and MLflow
* Experience with distributed computing frameworks and large-scale data processing (e.g., Ray, Spark, Dask)
* Excellent communication skills to collaborate with researchers and translate research ideas into robust implementations
* Self-driven attitude with genuine curiosity about ML research developments
* Comfort working in fast-paced, agile environments and managing the uncertainty and ambiguity of genuinely novel research

Helpful qualifications:

* Track record of ML impact in the form of releases, publications, or contributions to open-source ML libraries or frameworks (especially in training, evaluation, data processing, or agent systems)
* Experience building and maintaining ML training infrastructure and data pipelines at scale
* Extensive experience with LLM training techniques such as instruction fine-tuning (IFT), Direct Preference Optimization (DPO), Proximal Policy Optimization (PPO), or other RLHF methods
* Hands-on experience implementing and scaling supervised fine-tuning, preference learning, and reinforcement learning pipelines for LLMs
* Experience building LLM evaluation frameworks, benchmarking systems, or automated testing pipelines
* Hands-on experience with agentic workflows, tool-using AI systems, or multi-agent coordination (examples include LangGraph, AutoGPT, LlamaIndex)
* Experience with data-centric ML approaches, including synthetic data generation, data curation, or curriculum learning pipelines
* Experience training large-scale models across distributed nodes with cloud platforms such as AWS, Microsoft Azure, or Google Cloud
* Hands-on experience with MLOps, experiment tracking, and model deployment systems
* Strong interest in staying current with the ML research literature and the ability to quickly implement novel techniques from academic papers
* Familiarity with training optimization techniques such as mixed-precision training, gradient checkpointing, and efficient attention mechanisms
* Knowledge of modern ML engineering practices (containerization, orchestration, monitoring)

You will enjoy:

* Learning and development: On-the-job coaching and learning, as well as the opportunity to work with cutting-edge methods and technologies.
* Plenty of data, compute, and high-impact problems: Our scientists and engineers get to explore large datasets and discover new capabilities and insights. Thomson Reuters is best known