Lead Database Reliability Engineer

🔍 Colombo, Western Province, Sri Lanka

📁: Lead Software Engineer

📅 Requisition #: CREQ258047

📅 Post Date: 3 hours ago

About the Role:

The Database Reliability Engineer (DBRE) role is an extension, or subset, of the Site Reliability Engineering (SRE) model: it specializes in database technologies but follows the same underlying DevOps principles. As a DBRE, you will be a lead strategic partner in building and maintaining a Database as a Service platform that helps software engineers build, deploy, and monitor applications, with an emphasis on automation. This is an engineering discipline that combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems.

The DBRE is responsible for the availability and reliability of our most critical database platform services and ensures they meet the requirements of our internal and external users. The hosting platforms include on-premises servers as well as public clouds such as AWS and Azure.

How you will make an impact:

Drive technology initiatives by taking the lead and providing guidance to team members.
Design, build, and maintain enterprise-scale production relational backends using Microsoft SQL Server, MySQL, or Oracle, both on-premises and in the cloud, with a particular emphasis on Amazon Relational Database Service (RDS).
Help design, build, maintain, and monitor CI/CD pipelines and all deployments up to production.
Handle performance tuning, backup, and recovery tasks.
Create automated processes for recurring database tasks and deployments (such as migrations, replication, restoring backups, and spinning up new clusters).
Develop and automate best practices and repeatable procedures for deploying and scaling databases.
Provide production and lower-environment support for the back-end databases of assigned applications.
Build and maintain High Availability (HA) and Disaster Recovery (DR) design/implementation for complex mission-critical environments.
Assist with the design and implementation of infrastructure assets using cloud services.
Identify improvement opportunities on existing systems, build plans, and execute improvements.
Research automation-related technologies.
Diagnose and troubleshoot database errors, including participating in an on-call rotation and being available for on-call support as needed (including weekends when required).

We are looking for people who:

Have 5+ years of experience in either PowerShell/Windows command-line scripting or Linux scripting such as Bash, especially in troubleshooting production systems.
Have 5+ years of experience in building, configuring, and managing database environments.
Have experience with at least two relational and non-relational databases, such as Microsoft SQL Server, MySQL, Oracle, PostgreSQL, MongoDB, and CouchDB.
Have experience in analyzing requirements and proposing database solutions.
Have hands-on experience in building, managing, and troubleshooting high-availability features such as clustering, log shipping, and mirroring.
Have 2-4 years of experience using cloud database services such as Amazon RDS.
Have experience in DevOps configuration management and system automation using tools such as Terraform, Ansible, CloudFormation, and Chef.
Have hands-on experience with Continuous Integration/Continuous Delivery & Deployment techniques and tools such as Jenkins and GitHub.
Have exposure to containerization (Docker) and a container orchestration system (ECS/Kubernetes).
Have a good understanding of disciplines related to database reliability engineering, such as systems management, security, and release management.
Have experience in managing projects and initiatives with minimal supervision.
Have effective communication skills, both verbal and written.
Can document the processes and procedures involved.