Principal AI/ML Engineer
- R9692
- Pune, India
Rapid7 is seeking a Principal AI Engineer to lead the architectural evolution of our AI Center of Excellence. In this role, you will design and own the end-to-end distributed systems that make advanced ML, LLMs, and agentic AI reliable, scalable, and secure at an enterprise level.
About the Team
The AI Center of Excellence leverages advanced ML and agentic AI systems to protect our customers’ attack surfaces by embedding intelligence into real-world security workflows. We operate in ambiguous problem spaces, valuing technical rigor and principled decision-making to deliver production-grade AI at scale.
About the Role
As a Principal AI Engineer – Data Science, your primary responsibility will be to define and own the system architecture for AI platforms and services across the organisation. Specifically, your focus will be to:
Own the end-to-end system architecture for AI, ML, and Agentic platforms, ensuring they are reliable, scalable, and secure.
Design complex data ingestion pipelines, feature stores, and inference microservices that bridge the gap between research and production.
Establish architectural standards and reference patterns for LLM orchestration, RAG systems, and multi-agent workflows.
Lead architectural reviews as the final technical authority, making critical trade-offs across accuracy, latency, cost, and reliability.
The skills and qualities you’ll bring include:
Exceptional ability to reason at the system and architecture level to make long-term technical decisions.
Courageous and principled decision-making when navigating high-stakes, ambiguous problem spaces.
Proven mentorship of Staff and Senior engineers, fostering growth in architectural thinking and technical rigor.
Accountability for the long-term technical health and security of AI systems across multiple teams.
13+ years of experience in Data Science, ML Engineering, or Applied AI with a focus on large-scale systems.
Hands-on mastery of LLM orchestration frameworks such as LangChain and LangGraph for agentic workflows.
Deep expertise in designing RAG pipelines and managing vector database retrieval at scale.
Advanced proficiency in AWS ecosystems, specifically Bedrock, SageMaker, EKS, and Lambda.
Expertise in MLOps standards, including model registries, drift detection, and automated retraining frameworks.
Strong background in deep learning for NLP and sequence-based problems like malware behaviour modelling.
Proficiency in Infrastructure as Code (Terraform) and CI/CD for ML workloads.
Experience implementing robust guardrails and evaluation frameworks (e.g., Promptfoo, HELM) for autonomous systems.
Important: This is an Architecture-First role. If you haven’t defended production AI system designs at scale, you aren’t a fit, regardless of modelling strength.
We know that the best ideas and solutions come from multi-dimensional teams. That’s because these teams reflect a variety of backgrounds and professional experiences. If you are excited about this role and feel your experience can make an impact, please don’t be shy: apply today.
About Rapid7
At Rapid7, our vision is to create a secure digital world for our customers, our industry, and our communities. We do this by harnessing our collective expertise and passion to challenge what’s possible and drive extraordinary impact. We’re building a dynamic and collaborative workplace where new ideas are welcome.
Protecting 11,000+ customers against bad actors and threats means we’re continuing to push the envelope, just like we’ve been doing for the past 20 years. If you’re ready to solve some of the toughest challenges in cybersecurity, we’re ready to help you take command of your career. Join us.
Security and Compliance
Rapid7 is committed to keeping customers secure. As a first line of defense, all employees are expected to uphold the highest standards of security and privacy, ensuring the protection of sensitive information and compliance with relevant regulations.