Principal AI/ML Engineer
- R9692
- Pune, India
Principal AI Engineer – Agentic AI, System Architecture & Data Science
Experience: 13+ years
About the Team
The AI Center of Excellence team includes Data Scientists and AI Engineers who work together to conduct research, build prototypes, design features, and deliver production-grade AI systems at scale. Our mission is to leverage the best available technology—including advanced ML, LLMs, and agentic AI systems—to protect our customers’ attack surfaces.
We partner deeply with Detection and Response teams, including our MDR service, to embed AI into real-world security workflows. Our work builds on more than 20 years of threat intelligence, deep domain expertise, and a growing patent portfolio. We operate in ambiguous problem spaces and value technical rigor, strong ownership, and principled decision-making.
As a Principal Engineer, you will define and own the system architecture for AI platforms and services across the organization.
The technologies we use include
Python for large-scale data science, modeling, and experimentation
Jupyter notebooks (local & remote)
Classical ML using scikit-learn
Deep learning for NLP and sequence-based security problems
Anomaly detection and behavioral modeling
LLM / GenAI toolchains: Hugging Face Transformers, LangChain, LangGraph
Agentic AI systems: multi-agent orchestration, tool-calling, reasoning, memory, evaluation
RAG pipelines using vector databases
AWS cloud ecosystem: Bedrock, SageMaker, Lambda, EKS, S3, DynamoDB, Redshift, Kinesis
CI/CD for ML & LLM systems (GitHub Actions, Jenkins)
Model registry, versioning, drift detection, and retraining frameworks
Observability & operations: CloudWatch, Prometheus, Grafana, PagerDuty
Infrastructure as Code using Terraform
About the Role
Rapid7 is seeking a Principal AI Engineer – Data Science who brings deep system design and architectural leadership to our AI organization.
This role sits at the intersection of data science, large-scale distributed systems, agentic AI, and cloud-native architecture. You will be responsible not just for building models, but for designing the end-to-end systems that make AI reliable, scalable, secure, and operable in production.
This role is ideal for someone who:
has designed complex, distributed AI systems end to end,
understands how data, models, infrastructure, and services interact, and
can make long-term architectural decisions under real-world constraints.
In this role, you will
Own the system architecture for AI, ML, LLM, and agentic AI platforms across multiple teams
Design end-to-end AI system architectures, including:
data ingestion and streaming pipelines
feature stores and offline/online data paths
model training, fine-tuning, and evaluation
inference services, APIs, and microservices
monitoring, alerting, and incident response workflows
Define reference architectures and design patterns for:
LLM orchestration and agentic workflows
RAG systems and vector retrieval
secure and scalable inference
Lead architectural reviews and act as the final technical authority on AI system design decisions
Make trade-offs across accuracy, latency, cost, scalability, security, and reliability
Establish architectural standards for:
model registry and lifecycle management
drift detection and retraining
LLM evaluation, guardrails, and governance
Ensure AI systems comply with cloud security best practices (IAM, KMS, VPC, secrets)
Serve as the escalation point for complex production incidents
Mentor Staff and Senior engineers on system design and architectural thinking
Influence product roadmaps and long-term AI platform investments
The skills you’ll bring include
Core (Required)
13+ years of experience in Data Science, ML Engineering, or Applied AI
Proven experience designing and architecting large-scale AI systems
Strong background in:
data acquisition, cleaning, enrichment, and transformation
feature engineering for structured and unstructured data
supervised and unsupervised ML
deep learning (NLP, CNNs, RNNs, sequence models)
Experience with model explainability (SHAP, LIME)
Hands-on experience with security-focused ML models:
malware detection
behavior-based malware models
user behavioral analytics
Exceptional ability to reason at the system and architecture level
Agentic AI & LLM Systems (Strongly Required)
Deep hands-on experience with:
LLM orchestration (LangChain, LangGraph)
agentic and multi-agent architectures
RAG pipelines and vector databases
prompt engineering at scale
LLM evaluation frameworks (Promptfoo, HELM)
fine-tuning approaches (LoRA, PEFT)
Experience designing robust guardrails, governance, and evaluation frameworks for LLM systems
Understanding of failure modes and risks in autonomous and agentic AI systems
System Design, Cloud & MLOps (Strongly Required)
Strong experience designing distributed systems and microservice architectures
Expertise in:
model registry and versioning (MLflow, SageMaker)
drift detection and automated retraining
monitoring and observability (CloudWatch, Prometheus, Grafana)
incident management and on-call leadership (PagerDuty)
Deep AWS experience:
Bedrock, SageMaker, Lambda, EKS
data storage and streaming systems (S3, DynamoDB, Redshift, Kinesis)
cloud security (IAM, KMS, Secrets Manager, VPCs)
Working knowledge of:
Docker and Kubernetes
CI/CD pipelines for ML/LLM workloads
Infrastructure as Code using Terraform
Experience with the following would be advantageous
Architecting internal AI platforms used by multiple product teams
Defining AI governance, risk, and compliance frameworks
Operating AI systems in high-scale or regulated environments
Important clarification
A Principal AI Engineer at Rapid7:
owns architecture, not just code
sets standards others follow
is accountable for the long-term technical health of AI systems across teams
If a candidate has not designed and defended system architectures for production AI platforms, they are not a fit for this role, regardless of individual modeling strength.
About Rapid7
At Rapid7, we are on a mission to create a secure digital world for our customers, our industry, and our communities. We do this by embracing tenacity, passion, and collaboration to challenge what’s possible and drive extraordinary impact.
Here, we’re building a dynamic workplace where everyone can have the career experience of a lifetime. We challenge ourselves to grow to our full potential. We learn from our missteps and celebrate our victories. We come to work every day to push boundaries in cybersecurity and keep our 10,000 global customers ahead of whatever’s next.
Join us and bring your unique experiences and perspectives to tackle some of the world’s biggest security challenges.
About Rapid7
At Rapid7, our vision is to create a secure digital world for our customers, our industry, and our communities. We do this by harnessing our collective expertise and passion to challenge what’s possible and drive extraordinary impact. We’re building a dynamic and collaborative workplace where new ideas are welcome.
Protecting 11,000+ customers against bad actors and threats means we’re continuing to push the envelope just like we’ve been doing for the past 20 years. If you’re ready to solve some of the toughest challenges in cybersecurity, we’re ready to help you take command of your career. Join us.
Security and Compliance
Rapid7 is committed to keeping customers secure. As a first line of defense, all employees are expected to uphold the highest standards of security and privacy, ensuring the protection of sensitive information and compliance with relevant regulations.