Platform Engineer, MLOps

DevOps projects
City of London
22 hours ago

About this role

As a Platform Engineer, MLOps, you will deploy and manage the infrastructure that underpins our AI/ML operations. You will collaborate with AI/ML engineers and researchers to develop a robust CI/CD pipeline that supports safe and reproducible experiments. You will also set up and maintain monitoring, logging, and alerting systems for extensive training runs and client-facing APIs. You will ensure that training environments are available and efficiently managed across multiple clusters, and you will improve our containerization and orchestration systems using tools such as Docker and Kubernetes.


This role demands a proactive approach to maintaining large Kubernetes clusters, optimizing system performance, and providing operational support for our suite of software solutions. If you are driven by challenges and motivated by the continuous pursuit of innovation, this role offers the opportunity to make a significant impact in a dynamic, fast-paced environment.


Responsibilities

  • Work closely with AI/ML engineers and researchers to design and deploy a CI/CD pipeline that ensures safe and reproducible experiments.
  • Set up and manage monitoring, logging, and alerting systems for extensive training runs and client-facing APIs.
  • Ensure training environments are consistently available and prepared across multiple clusters.
  • Develop and manage containerization and orchestration systems utilizing tools such as Docker and Kubernetes.
  • Operate and oversee large Kubernetes clusters with GPU workloads.
  • Improve reliability, quality, and time-to-market of our suite of software solutions.
  • Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating for continual improvement.
  • Provide primary operational support and engineering for multiple large-scale distributed software applications.

Is this you?

You have professional experience with:



  • Hugging Face Transformers
  • PyTorch
  • vLLM
  • TensorRT
  • Infrastructure as code tools like Terraform
  • Scripting languages such as Python or Bash
  • Cloud platforms such as Google Cloud, AWS or Azure
  • Git and GitHub workflows
  • Tracing and monitoring
  • High-performance, large-scale ML systems


  • You have a knack for troubleshooting complex systems and enjoy solving challenging problems.
  • You are proactive in identifying problems, performance bottlenecks, and areas for improvement.
  • You take pride in building and operating scalable, reliable, secure systems.
  • You are comfortable with ambiguity and rapid change.

Preferred skills and experience

  • Familiarity with monitoring tools such as Prometheus, Grafana, or similar.
  • 5+ years building core infrastructure.
  • Experience running inference clusters at scale.
  • Experience operating orchestration systems such as Kubernetes at scale.

Benefits

  • Comprehensive medical and dental insurance.
  • Paid parental leave for all parents (12 weeks).
  • Competitive pension scheme and company contribution.
  • Home office setup, cell phone, internet.
  • Wellness stipend for gym, massage/chiropractor, personal training, etc.
  • Learning and development stipend.
  • Company-wide off-sites and team off-sites.
  • Competitive compensation and company stock options.

