Architect (Level: Manager)
Requisition #: CREQ232897
We need a strong profile with solid experience in stakeholder management and SRE team management.
Experience in production engineering or production support projects is a must, including handling teams that work in a 24/7 model.
Incident, change, and service request management is a daily routine, so the candidate needs a good understanding of these processes and should know how to manage the workload and rotate FTEs as and when required.
Awareness of how to manage ad hoc activities such as vulnerability fixes and patching is required.
Should be able to lead BAU governance activities on a daily, weekly, and monthly cadence, with the necessary reporting data.
Knowledge of applying SRE practices to daily operations is key.
Computer Science and/or Engineering degrees are preferred.
Domain experience in domestic banking application areas (IMPS/UPI) will be a great advantage.
Working Experience/Awareness:
24x7 operations support model for mission-critical applications and infrastructure, using ServiceNow as the ITSM ticketing tool.
Hybrid and private-cloud operational support and administration activities such as provisioning, capacity management, reliability management, monitoring, restoration, etc.
Working knowledge of AppDynamics and Splunk for monitoring and setting up observability is key.
CI/CD toolchains: setting up and running deployment pipelines and propagating changes across environments.
Maintaining middleware such as MQ as well as application servers (Tomcat).
Maintaining Hazelcast data storage platform clusters and Control-M job schedulers.
Kubernetes cluster management, monitoring, and remediation. Knowledge of Docker is important.
Automating deployments and scripting self-healing workflows based on telemetry.
Work closely with the team to define SLIs and configure SLOs, respond to threshold alerts and optimize monitoring capability.
Work closely with the team to understand the code as well as configuration artifacts to debug and fix issues that may arise.
Must be inclined to work on proof-of-concept solutions that optimize reliability, such as those incorporating AI models for event correlation and assisted triaging.
Able to lead and drive the SRE team to work in parallel on Service or Change Requests, the defect management board, and backlog management in an agile manner.
Good to have:
SRE Foundation certification by DevOps Institute, or an equivalent SRE certification from a recognized body, is mandatory.
ITIL/ITSM certification
