Machine Learning GPU Performance Engineer


Job details
  • Jobleads
  • London
  • 1 month ago
Applications closed

Google, London, UK
Mid: Experience driving progress, solving problems, and mentoring more junior team members; deeper expertise and applied knowledge within relevant area.

Minimum qualifications:

  • Bachelor's degree or equivalent practical experience.
  • 5 years of experience with software development in one or more programming languages, and with data structures/algorithms.
  • 3 years of experience testing, maintaining, or launching software products, and 1 year of experience with software design and architecture.
  • 3 years of experience with performance, systems data analysis, visualization tools, or debugging.

Preferred qualifications:

  • Master's degree or PhD in Computer Science or related technical field.
  • 1 year of experience in a technical leadership role.

About the job

Google's software engineers develop the next-generation technologies that change how billions of users connect, explore, and interact with information and one another. Our products need to handle information at massive scale, and extend well beyond web search. We're looking for engineers who bring fresh ideas from all areas, including information retrieval, distributed computing, large-scale system design, networking and data storage, security, artificial intelligence, natural language processing, UI design and mobile; the list goes on and is growing every day. As a software engineer, you will work on a specific project critical to Google’s needs with opportunities to switch teams and projects as you and our fast-paced business grow and evolve. We need our engineers to be versatile, display leadership qualities and be enthusiastic to take on new problems across the full-stack as we continue to push technology forward.

The Core team builds the technical foundation behind Google’s flagship products. We are owners and advocates for the underlying design elements, developer platforms, product components, and infrastructure at Google. These are the essential building blocks for excellent, safe, and coherent experiences for our users and drive the pace of innovation for every developer. We look across Google’s products to build central solutions, break down technical barriers and strengthen existing systems. As the Core team, we have a mandate and a unique opportunity to impact important technical decisions across the company.

Responsibilities

  • Identify and maintain LLM training and serving benchmarks that are representative of Google production, industry, and the ML community; use them to identify performance opportunities, drive XLA:GPU/Triton performance toward the state of the art, and guide XLA releases.
  • Engage with Google product teams such as DeepMind to solve their ML model performance problems, onboard new LLMs and products on GPU hardware, and enable LLMs to train and serve efficiently at very large scale (i.e., thousands of GPUs).
  • Run architecture-level simulations on GPU designs and perform roofline analysis to guide internal teams.
  • Run performance benchmarks on GPU hardware using internal and external tools.
  • Analyze performance and efficiency metrics to identify bottlenecks, then design and implement solutions at Google fleet-wide scale.

Google is proud to be an equal opportunity and affirmative action employer. We are committed to building a workforce that is representative of the users we serve, creating a culture of belonging, and providing an equal employment opportunity regardless of race, creed, color, religion, gender, sexual orientation, gender identity/expression, national origin, disability, age, genetic information, veteran status, marital status, pregnancy or related condition (including breastfeeding), expecting or parents-to-be, criminal histories consistent with legal requirements, or any other basis protected by law. See also Google's EEO Policy, Know your rights: workplace discrimination is illegal, Belonging at Google, and How we hire.

Google is a global company and, in order to facilitate efficient collaboration and communication globally, English proficiency is a requirement for all roles unless stated otherwise in the job posting.

To all recruitment agencies: Google does not accept agency resumes. Please do not forward resumes to our jobs alias, Google employees, or any other organization location. Google is not responsible for any fees related to unsolicited resumes.
