We are looking for a skilled DevOps Engineer with hands-on experience in Kubernetes, CI/CD pipelines, cloud infrastructure (AWS/GCP), and observability tooling. You will be responsible for automating deployments, maintaining infrastructure as code, and optimizing system reliability, performance, and scalability across environments.
In this role, you will...
- Develop and maintain CI/CD pipelines to automate testing, deployments, and rollbacks across multiple environments.
- Manage and troubleshoot Kubernetes clusters (EKS, AKS, GKE) including networking, autoscaling, and application deployments.
- Collaborate with development and QA teams to streamline code integration, testing, and deployment workflows.
- Automate infrastructure provisioning using tools like Terraform and Helm.
- Monitor and improve system performance using tools like Prometheus, Grafana, and the ELK stack.
- Set up and maintain Kibana dashboards, and ensure high availability of logging and monitoring systems.
- Manage cloud infrastructure on AWS and GCP, optimizing for performance, reliability, and cost.
- Build unified observability pipelines by integrating metrics, logs, and traces.
- Participate in on-call rotations, handle incident response and root cause analysis, and continuously improve automation and observability.
- Write scripts and tools in Bash, Python, or Go to automate routine tasks and improve deployment efficiency.
You've got what it takes if you have...
- 3+ years of experience in a DevOps, SRE, or Infrastructure Engineering role.
- Bachelor's degree in Computer Science, IT, or a related field.
- Strong understanding of Linux systems, cloud platforms (AWS/GCP), and containerized microservices.
- Proficiency with Kubernetes, CI/CD systems, and infrastructure automation.
- Experience with monitoring/logging tools:
  - Prometheus, Grafana, InfluxDB
  - ELK stack (Elasticsearch, Logstash, Kibana)
- Familiarity with incident management tools (e.g., PagerDuty) and root cause analysis processes.
- Basic working knowledge of:
  - Kafka: monitoring topics and consumer health
  - ElastiCache/Redis: caching patterns and diagnostics
  - InfluxDB: time-series data and metrics collection