Research Engineer/Research Scientist – Model Transparency

London, United Kingdom
Job Type: Permanent
Work Pattern: Full-time
Work Location: On-site
Seniority: Mid
Education: Degree
Posted: 24 Apr 2026

Benefits

  • 25 days holiday
  • Pension
  • Private healthcare

About the AI Security Institute

The AI Security Institute is the world's largest and best-funded team dedicated to understanding advanced AI risks and translating that knowledge into action. We’re in the heart of the UK government with direct lines to No. 10 (the Prime Minister's office), and we work with frontier developers and governments globally.

We’re here because governments are critical for advanced AI going well, and UK AISI is uniquely positioned to mobilise them. With our resources, unique agility and international influence, this is the best place to shape both AI development and government action.

The deadline for applying to this role is Sunday 24th May 2026, end of day, anywhere on Earth.

Team Description

The ability to effectively evaluate and monitor AI systems will grow in importance as models become more capable, autonomous, and integrated into society. If models can detect and game evaluations, obscure their reasoning, or behave differently under observation, the safety claims that governments and developers rely on become unreliable. Understanding and addressing these risks is essential to ensuring that oversight of advanced AI systems keeps pace with their capabilities.

The Model Transparency team is a research team within AISI focused on ensuring that evaluations, assessments, and monitoring of frontier AI systems remain reliable as models become less transparent. We research how and why oversight is declining – through phenomena such as evaluation awareness, unfaithful chain-of-thought reasoning, and changes in model architectures – and develop methods (both white-box and black-box) to detect, measure, and mitigate potential issues. We share our findings with frontier AI companies (including Anthropic, OpenAI, and DeepMind), UK government officials, and allied governments, as well as publicly, to inform their deployment, research, and policy decisions. We also work directly with safety teams at frontier labs, contributing to safety case reviews and helping improve their alignment evaluation methodology.

Our recent work includes auditing games for sandbagging, reproducing natural emergent misalignment from reward hacking, and identifying open-weight language models that game propensity evaluations.

Role description

We're looking for Research Scientists and Research Engineers for the Model Transparency team with expertise in technical AI safety – such as interpretability, capability or alignment evaluations, or model transparency – or with broader experience in frontier LLM research and development. An ideal candidate would have a strong track record of high-quality research in technical AI safety or adjacent fields.

  • Research Scientists drive the technical substance of our work – staying abreast of the literature, proposing and designing experiments, conducting rigorous analyses, and owning the evidence stack from experiment through to written output. They write, critique, and strengthen the team's reports and publications.
  • Research Engineers build the systems and tooling that make our research possible and fast – scaling experimental workflows, automating processes, solving infrastructure challenges, and creating systems that accelerate the entire team's output.

We're interested in candidates along the spectrum between Research Engineers and Research Scientists. The application form will ask you to indicate which role you lean towards.

The team is led by Joseph Bloom and advised by Geoffrey Irving. You'll work with talented, mission-driven technical staff across AISI, including alumni from Anthropic, OpenAI, DeepMind, and top universities. You may also collaborate with external research teams, including those at frontier AI labs, METR, and FAR.

We are open to hires across a range of experience levels.

Representative Projects You Might Work On

  • Developing a chain-of-thought monitorability benchmark and comparing monitorability properties across frontier AI systems, leveraging AISI’s unique access to reasoning traces from multiple labs.
  • Designing and running experiments on open-weight models to study alignment and oversight-relevant phenomena – such as reproducing emergent misalignment from reward hacking, or red-teaming techniques like inoculation prompting and character training.
  • Using white-box and interpretability methods – such as activation oracles, sparse auto-encoders or probes – to detect misalignment that isn’t visible through behavioural evaluation alone.
  • Building tooling and infrastructure for our research – including agent orchestration, large-scale RL pipelines, mechanistic interpretability methodologies, and auditing agents.

The work could also involve:

  • Reviewing frontier lab risk assessments and safety cases, providing independent analysis of alignment claims before deployment decisions.
  • Conducting literature reviews and expert interviews to map the state of model transparency risks and inform AISI’s strategic priorities.
  • Translating technical findings into actionable insights for AISI evaluation teams, UK government officials, and international partners.

What we’re looking for

If you’re unsure whether you meet the criteria below, we’d encourage you to apply anyway – we’d rather you erred on the side of applying than not.

Requirements for both roles:
