Hurdle.bio is revolutionising the biotechnology landscape by democratising access to novel diagnostics through cutting-edge bio-infrastructure. Since 2017, we've processed over 3 million tests and partnered with global healthcare leaders to provide comprehensive diagnostic solutions and multi-omics biomarker services.
We're on a mission to transform how companies and consumers access biological insights. With our end-to-end platform spanning sample collection, lab testing, logistics, and digital interfaces, we're making advanced diagnostics more accessible than ever.
The role:
As a Machine Learning (ML) Engineer at Hurdle.bio, you'll be instrumental in developing and implementing machine learning solutions that power our biomarker discovery pipeline and enhance our diagnostic capabilities. You'll work at the intersection of artificial intelligence and biology, creating algorithms that process multi-modal datasets, with a particular focus on integrating electronic health records (EHRs) for biomarker discovery and precision medicine applications.
What you'll be doing:
- Lead ML Development: Train, test and implement ML models for biomarker discovery and validation using multi-modal data across large global datasets (millions of patients), incorporating structured and unstructured EHR data.
- EHR Data Integration: Develop data extraction, normalization, and harmonization strategies for diverse EHR formats (FHIR, HL7, OMOP, etc.). Ensure seamless integration of clinical embeddings, medical codes (ICD, SNOMED, CPT), and unstructured clinical notes (NLP).
- Multi-Modal Data Processing: Build robust data processing pipelines that handle and unify clinical, omics, imaging, and wearable device data for ML-driven insights.
- Cross-functional Collaboration: Work closely with bioinformaticians, scientists, clinicians and laboratory teams to refine and integrate ML solutions into our diagnostic workflows.
- Research & Innovation: Stay current with the latest developments in AI/ML and computational biology, implementing new approaches as appropriate.
What you'll bring:
- Strong ML expertise: Deep understanding of statistics, machine learning and deep learning, particularly as applied to EHRs and healthcare data.
- Programming proficiency: Advanced Python skills with experience in ML tooling (PyTorch/TensorFlow, Hugging Face, Scikit-Learn).
- EHR-Specific Experience: Strong hands-on experience working with structured and unstructured EHR data, including longitudinal patient records, clinical text mining, and feature engineering for predictive modeling.
- Data science and engineering skills: Experience with data preprocessing, feature engineering, and statistical analysis in healthcare settings.
- Software engineering practices: Knowledge of cloud platforms (AWS, GCP), containerization (Docker, Kubernetes), version control (Git), testing, and deployment of ML models.
- Analytical mindset: Ability to solve complex problems and translate biological and clinical questions into scalable computational solutions.
What would make you stand out:
- Experience with knowledge graphs, transformers, LLMs, CNNs, RNNs, and clinical embeddings (e.g., BioBERT, ClinicalBERT, Med-BERT).
- Experience handling MRI, CT, or pathology imaging data, as well as multi-omics datasets (genomics, proteomics, transcriptomics, metabolomics).
- Knowledge of privacy-aware ML techniques for sensitive healthcare data applications (e.g., federated learning).
Benefits:
- Take-what-you-need annual leave
- Fully remote working (UK)
- Private health insurance - Vitality
- Enhanced paternity leave
- Pension provided through Nest
Inclusion & Diversity Statement:
We believe diverse perspectives drive innovation. We're committed to building an inclusive environment where all employees can thrive. We welcome applications from candidates of all backgrounds, and ensure our hiring process is equitable and accessible to everyone.