SRE-Production Support Engineering Manager

🔍 Pune, Maharashtra, India

📁 Architect (Level: Manager)

📅 Requisition #: CREQ221485

📅 Post Date: Jun 20, 2025

Thanks for your interest in the SRE-Production Support Engineering Manager position. Unfortunately, this position has been closed.

Please see below:

We need a strong profile with good experience in stakeholder and SRE team management.
A good understanding of production engineering / production support projects is a must, including handling teams working in a 24/7 model.
Incident, change, and service request management is a daily routine, so the candidate should know how to manage the workload and rotate FTEs as and when required.
Awareness of managing ad hoc activities such as vulnerability fixes and patching is required.
Should be able to lead BAU governance activities on a daily, weekly, and monthly cadence with the necessary reporting data.
Knowledge of GCP cloud infrastructure management, basic Postgres DB knowledge, and banking domain experience are a big advantage for the role.

==================================================================================================

Job Description:

Experience in SRE (not traditional production support) covering integration platforms on cloud-based deployments is mandatory.
Knowledge of applying SRE practices to daily operations is key.
Ability to manage teams working in shifts from the office is mandatory; this is a 24x7 on-desk operation.
Computer Science and/or Engineering degrees are preferred.
Domain experience in banking will be a great advantage.

Working Experience/ Awareness:

24x7 operations support model for mission critical applications and infrastructure using ServiceNow as the ITSM ticketing tool.
GCP and private-cloud operational support / administration activities such as provisioning, capacity management, reliability management, monitoring, restoration, etc.
Working knowledge of AppDynamics and Splunk for monitoring and setting up observability is key.
CI/CD toolchains: setting up and running deployment pipelines and propagating changes across environments.
Maintaining middleware such as Kafka (open source) and MQ, as well as application servers (Tomcat).
Maintaining Hazelcast data storage platform clusters and Control-M job schedulers.
Kubernetes cluster management, monitoring, and remediation. Knowledge of Docker is important.
Automating deployments and scripting self-healing workflows based on telemetry.
Work closely with the team to define SLIs and configure SLOs, respond to threshold alerts and optimize monitoring capability.
Work closely with the team to understand the code as well as configuration artifacts to debug and fix issues that may arise.
Must be inclined to work on proof-of-concept solutions to optimize reliability, such as those incorporating AI models for event correlation and assisted triaging.
Able to lead and drive the SRE team to work in parallel on service or change requests, the defect management board, and backlog management in an agile manner.

Good to have:

SRE Foundation certification by DevOps Institute or any other equivalent certification on SRE by a recognized body is mandatory.
CKA certification.
GCP Cloud Digital Leader certification at a minimum is mandatory; Cloud Engineer level is a bonus.
Hazelcast Platform Operations certification badge.
