Senior CloudOps Engineer

  • R9696
  • Pune, India

Job Description:

We are seeking an experienced and highly specialized Senior CloudOps Engineer to manage, automate, and secure our production cloud infrastructure and Machine Learning (ML)/Large Language Model (LLM) operational pipelines. This role is strictly focused on the operations and infrastructure that support our data science and engineering teams; it is not a data science or core LLM development position.

Key Responsibilities and Required Expertise

The successful candidate will be an expert in all of the following areas, driving high availability, scalability, and security.
 

I. Cloud Infrastructure & Automation

  • Infrastructure as Code (IaC): Deep expertise in managing and provisioning infrastructure using Terraform.

  • Containerization & Orchestration: Advanced deployment, scaling, and management of services using Docker/Kubernetes.

  • Networking & Services: Architecting and maintaining high-performance API Layers & Microservices.

  • AWS CloudOps: Expert proficiency in AWS operational services, including EventBridge and Step Functions, for building robust automation flows.

  • Data Storage: Managing and optimizing critical AWS data services, including S3, DynamoDB, Redshift, and Kinesis.
     

II. MLOps Tooling & Monitoring

  • ML/LLM Tooling Support: Providing and maintaining the operational infrastructure for ML/LLM systems, including Model Registry/Versioning tools like MLflow/SageMaker.

  • Pipeline Automation (CI/CD): Designing and implementing robust CI/CD pipelines for ML/LLM deployments using tools like GitHub Actions/Jenkins.

  • Model Operations: Building the infrastructure to support Drift Detection & Retraining capabilities.

  • Monitoring & Alerting: Implementing comprehensive observability stacks using Prometheus/Grafana/CloudWatch.

  • Incident Management: Leading resolution efforts for production issues, including PagerDuty expertise and on-call responsibilities.
     

III. Security, Compliance & Cost Optimization (FinOps)

  • Cloud Security: Establishing and enforcing strong security policies and best practices across the cloud environment (IAM, VPC, Secrets).

  • AWS Security Services: Expert knowledge and application of specific AWS security tools like IAM, KMS, and Secrets Manager.

  • Cost Optimization: Leading initiatives for Cost Optimization (FinOps), balancing performance and efficiency across all cloud resources.

Security and Compliance
Rapid7 is committed to keeping customers secure. As a first line of defense, all employees are expected to uphold the highest standards of security and privacy, ensuring the protection of sensitive information and compliance with relevant regulations.
