Senior Site Reliability Engineer

Thought Machine
London, United Kingdom

Salary

£70,000 – £120,000 pa

Job Type
Permanent
Work Pattern
Full-time
Work Location
Hybrid
Seniority
Senior
Education
Degree
Posted
6 Feb 2026 (3 months ago)

Benefits

Employee share package
Fantastic workplace culture
High Glassdoor rating

Thought Machine’s mission is bold – to properly and permanently rid the world’s banks of legacy technology. To achieve this, we have developed the foundations of modern banking through core and payments technology which run natively in the cloud. What we are attempting is hard and means we need great people working together to build great technology.

We have grown rapidly in the past few years – growing our team to more than 550 individuals across offices in London, New York, Singapore and Sydney. We have raised more than $500m in funding and are now valued at $2.7bn. Our investors include Molten Ventures, Eurazeo, Intesa Sanpaolo, Temasek, Nyca Partners, JPMorgan Chase Strategic Investments, Standard Chartered Ventures, and more.

We have created a culture that enables our team to produce the best work in the industry while ensuring we have fun along the way. We're regularly cited as having a fantastic workplace culture and have been recognised by Sifted magazine as having one of the highest Glassdoor ratings for a UK fintech company and the industry's most generous employee share package. Named one of the world’s most innovative fintechs by Global Finance Magazine, we were also recognised by the Financial Times as one of Europe’s fastest-growing companies for two consecutive years, and as a UK Best Employer for 2026.

Thought Machine’s Site Reliability Engineers are the guardians of mission-critical systems for the world's most influential financial institutions. As a member of our elite, globally distributed team, you'll be entrusted with running and maintaining the robust production infrastructure that powers our customers' cutting-edge Core Banking and Payments platforms. This is an opportunity to make a tangible impact on the global financial landscape while collaborating with brilliant minds to solve complex engineering challenges.

This role will be part of the Site Reliability Engineering team at Thought Machine HQ in London, tackling the challenges of automating complex fleet management operations, mentoring team members, promoting communities of best practice within engineering as well as designing operational processes that provide effective interfaces between Thought Machine and our SaaS customers.

The SRE team is deeply involved in tackling the technical challenges of executing Thought Machine’s growth ambitions - expect to be working with senior stakeholders in the organisation and with our customers, and working on programmes and initiatives that are critical to the success of the company.

Duties:

  • Supporting the product engineering teams in building highly fault-tolerant, scalable applications by participating in design discussions, engaging in RFCs and code reviews.

  • Executing various department strategies - contributing to the design and scoping work for team members around disaster recovery, backup, redundancy and capacity planning activities.

  • Being part of a global on-call rotation responsible for identifying and fixing bottlenecks in SaaS customer environments.

  • Regular maintenance of production systems that host Vault products.

  • Driving the evolution of our SaaS products by defining and designing features that foster exceptional reliability and an unparalleled user experience.

  • Implementing and regularly testing DR strategies to ensure the highest level of resilience and fault tolerance of the platform.

  • Maintaining and promoting high-quality written documentation of assets, processes and runbooks that are used by the team in their day-to-day operations.

  • Working with your manager to grow team members' technical skills as well as their understanding of Vault products.

Requirements:

  • You have a track record of delivering high-impact projects with focus on long-term scalability, ensuring that human intervention scales sub-linearly with usage growth.

  • You possess an up-to-date understanding of design patterns relevant to hosting and networking architectures.

  • You proactively champion product development, driven by a desire to build truly exceptional products, not just solve immediate challenges.

  • You’re a high-agency individual who can independently drive projects to completion by effectively scaling your individual output with the appropriate delegation of work to team members.

  • You have a strong background working in Python, Golang or Java, having used one of these programming languages to execute a significantly sized project or initiative.

  • You have experience working with Kubernetes or other container orchestration systems.

  • You have experience with automation/configuration management, e.g. Terraform, Puppet, Chef, Ansible.

  • You have expertise in one or more of the following areas: Database Administration, Networking, Observability Tools (such as Prometheus, Jaeger) or automation infrastructure.

  • You have extensive experience working with either GCP or AWS.

Benefits:

  • Highly competitive salary

  • Pension plan (match up to 5%)

  • Life insurance - three times annual salary

  • Competitive maternity (six months fully paid) and paternity leave (four weeks fully paid)

  • Shared parental leave (matched to our maternity leave for the same point in time)

  • 25 days holiday and bank holidays

  • Flexible working hours

  • Cycle-to-work scheme

  • Electric car scheme

  • Season ticket loan

  • Access to outstanding learning materials and courses

  • Sports and hobby clubs, subsidised by Thought Machine

  • All the latest tech you need

  • Start the day properly with fresh fruit and cereals

  • Huge range of healthy (and not-so-healthy) snacks, smoothies and drinks

  • A talented and experienced team as your colleagues

  • An environment where we encourage learning and progress

  • Two charity days a year

  • Weekly food pop-up

We actively hire candidates who demonstrate technical excellence in their field and welcome people of all ages and backgrounds, providing everyone with equal access to professional development. You are encouraged to apply even if your experience doesn't accurately match the job description. We also encourage applications from those with different abilities, including candidates with ADHD, autism, dyslexia or dyspraxia.
