This is a remote position.
Job Title: MLOps Engineer
Job Overview:
We are seeking an experienced MLOps Engineer with a strong background in building and deploying machine learning (ML) models and large language models (LLMs). The ideal candidate will have mastery of data science platforms, significant software engineering experience, and expertise in LLMOps best practices. This role involves working on high-scale AI deployments, optimising LLM pipelines, and implementing responsible AI techniques.
About the Company:
We are a dynamic and innovative company committed to delivering exceptional solutions that empower our clients to succeed. With headquarters in the UK and a global footprint spanning the US and India (Noida and Pune), we bring a decade of expertise to every endeavour, driving real results. We take a holistic approach to project delivery, providing end-to-end services that encompass everything from initial discovery and design to implementation, change management, and ongoing support. Our goal is to help clients leverage the full potential of the Salesforce platform to achieve their business objectives.
What Makes VE3 The Best For You:
We think of your family as our family, no matter the shape or size. We offer maternity leave, PF contributions, and a five-day working week, along with a generous paid time off program that helps you balance your work and personal life.
Requirements
Key Responsibilities:
- ML & LLM Model Development:
  - Build, deploy, and manage ML and LLM models on cloud-native platforms such as Microsoft Azure, AWS, or Google Cloud Platform (GCP).
  - Utilize LLM-specific frameworks like LangChain, LangGraph, and LlamaIndex to develop sophisticated solutions, incorporating prompt engineering and dynamic response handling.
  - Apply LLMOps best practices for model development, training, fine-tuning, and monitoring, ensuring scalability and high availability.
- Software Engineering & Production Optimization:
  - Drive software engineering efforts focused on scaling LLM and ML models in low-latency, high-throughput production environments.
  - Implement responsible AI practices, integrating LLM guardrails to maintain model reliability and ethical standards in production.
  - Streamline response generation with data/LLM response streaming and parallelized workloads to enhance model performance.
- Cloud-Native Computing & Deployment:
  - Design and maintain CI/CD pipelines for LLM deployment, leveraging DevOps principles and cloud-native technologies like Docker and Kubernetes.
  - Establish and manage scalable cloud infrastructure, optimising resource utilization for parallelized ML and LLM workloads.
- Advanced LLM Operations & Data Engineering:
  - Implement advanced LLM operations, including Retrieval-Augmented Generation, multi-agent deployments (CrewAI, AutoGen), and vector databases for efficient context retrieval.
  - Develop and maintain data engineering pipelines using technologies like Apache Spark and manage message queues (RabbitMQ, Kafka) to support real-time model integrations.
- Database Management & Optimization:
  - Optimize and manage databases like Postgres, MongoDB, SQL Server, Redis, and vector databases to support ML and LLM model workloads and enhance retrieval efficiency.
  - Ensure data integrity and model performance by fine-tuning database configurations for high-volume LLM deployments.
Required Qualifications:
- Education:
  - Bachelor’s or Master’s degree in Computer Science, Engineering, Data Science, or a related field.
- Experience:
  - 8+ years of relevant experience, including mastery of data science platforms such as Microsoft Azure, AWS, or GCP for building and deploying ML and LLM models.
  - Significant software engineering experience with a focus on ML/LLM model production, scalability in low-latency environments, and responsible AI implementation.
  - Proven expertise in LLMOps practices and frameworks.
- Technical Skills:
  - Programming & Frameworks: Proficiency in object-oriented programming languages and LLM-specific frameworks with expertise in prompt engineering.
  - Cloud & DevOps: Advanced understanding of Docker, Kubernetes, cloud-native computing, and DevOps practices for ML and LLM deployments.
  - Data & Vector Management: In-depth knowledge of vector databases, LLM fine-tuning techniques, and the implementation of LLM guardrails.
  - Multi-Agent Systems: Experience with chatbot and multi-agent system deployments, including CrewAI, AutoGen, and LangGraph.
  - Data Engineering & Messaging: Experience with data engineering, message queues (RabbitMQ, Kafka), and programming languages like Python, SQL, C++, and R.
  - Database Optimization: Proficiency in databases such as Postgres, MongoDB, SQL Server, Redis, and their optimization for ML/LLM workloads.
Key Competencies:
- Strong problem-solving skills with a proactive attitude towards continuous learning.
- Visionary mindset for shaping the future direction of LLMs and AI/ML at scale.
- Effective communication skills for translating technical concepts to non-technical stakeholders.
- A collaborative mindset and an inclination to mentor and uplift the technical capabilities of the team.
Benefits
- Competitive salary and comprehensive benefits package.
- Opportunity to work in a dynamic and challenging environment on critical migration projects.
- Professional growth opportunities in a supportive and forward-thinking organization.
- Health Insurance
- Employee Assistance Program
- Engagement with cutting-edge SAP technologies and methodologies in data migration