Business Function
Group Technology and Operations (T&O) enables and empowers the bank with an efficient, nimble and resilient infrastructure through a strategic focus on productivity, quality & control, technology, people capability and innovation. In Group T&O, we manage the majority of the Bank's operational processes and aspire to delight our business partners through our multiple banking delivery channels.
Group Infrastructure & Cloud (GIC) provides a platform product that lets all applications use the public cloud. It enables applications to consume public cloud services through predefined templates that are highly secure and architected against best practices and standards.
Responsibilities
• Partner with DBS development teams to help reproduce and resolve public cloud platform issues.
• Take ownership of reported incidents and coordinate with L3 and engineering teams for resolution
• Continually learn and use cutting-edge cloud technologies
• Leverage your extensive customer support experience to provide feedback to the cloud team on how to improve the public cloud platform service.
• Drive customer communication during critical events.
• Write tutorials and other technical articles for the developer community.
• Typically, the SRE is primarily responsible for resolving customers' cases, using advanced troubleshooting techniques to provide tailored solutions and working thoughtfully with customers to dive deep into the root cause of each issue.
• Apart from working on a broad spectrum of technical issues, the SRE may coach or mentor new hires, develop and present training, partner with development teams on complex issues, participate in hiring, write tools and scripts to help the team, or work with leadership on process improvement and strategic initiatives.
• Respond to and investigate system-generated security incidents
• As an SRE role, this position includes rotation in a 24/7 on-call roster that supports the cloud delivery platform pipeline
Requirements
• Production SRE experience in supporting enterprise public cloud environments (AWS and Google Cloud)
• Knowledge of incident management and change management processes
• Well organized, adaptable, and able to make clear and effective decisions
• Knowledge of and experience with observability and SRE tooling and philosophy, such as ELK Stack or Graylog
• General knowledge of infrastructure components: firewalls, TCP/IP, DNS, ICMP, networking, switching, PKI, TLS, load balancing
• General knowledge of web technology fundamentals: HTTP, WebSockets, content distribution, WAF, REST, JSON, YAML, CORS
• Strong experience with any flavour of Linux
• General knowledge of Java, Python, JavaScript.
• General knowledge of enterprise security and public cloud security
• Awareness of continuous integration and continuous delivery development environments, such as git
• General experience with at least one of the following: K8S, Ansible, Terraform, CloudFormation
Apply now
We offer a competitive salary and benefits package and the professional advantages of a dynamic environment that supports your development and recognises your achievements.