Senior Site Reliability Engineer I

Booking.com
Bangalore, India | On-site | Competitive | Added 1 month ago

Original Advert

Booking Holdings (NASDAQ: BKNG) is the world leader in online travel and related services, provided to customers and partners in over 220 countries and territories through six primary consumer-facing brands - Booking.com, KAYAK, Priceline, Agoda.com, Rentalcars.com, and OpenTable. The mission of Booking Holdings is to make it easier for everyone to experience the world. During 2019, the Company had consolidated revenues and net income of $15.1 billion and $4.9 billion, respectively, and a current market value of approximately $90 billion.

Booking Holdings Bangalore is a Center of Excellence based in Bangalore, India and a legal entity of Booking Holdings Inc. The Center was created to support the increasing business demands of the Booking Holdings Brands. The Center of Excellence provides access to specialized and highly skilled talent, leading industry best practices, and collaboration opportunities across all of the Booking Holdings brands and business units.

At Booking.com, data drives our decisions. Technology is at our core. And innovation is everywhere. But our company is more than datasets, lines of code or A/B tests. We're the thrill of the first night in a new place. The excitement of the next morning. The friends you make. The journeys you take. The sights you see. And the food you sample. Through our products, partners and people, we can empower everyone to experience the world.

We're a truly global e-commerce company, with business operations in nearly every country and city on the planet. And we want to make it easy for everyone, anywhere in the world, to pay for their travel or do business with our platform - whenever and however it's convenient for them.

The Role

A Senior Site Reliability Engineer I (aka Senior SRE I) is an expert in treating operations as a software problem. They focus on the reliability of systems and services, addressing availability, performance, scalability, latency, observability and efficiency. They maintain key components and develop systems that minimize human labor through automation and increase system reliability, with the end goal of breaking the relationship between system size, operational toil and complexity.

A Senior SRE I is responsible for the design, prioritization and implementation of complex technical solutions. They can accurately estimate or forecast the effort and impact of the items they work on, and show a high quality of craft in what they deliver. They are expected to lead incident response for issues affecting their team. A Senior SRE I is expected to coach and mentor less experienced engineers and to be a thought leader in their team, ensuring best practices are implemented. The primary differences between a Senior SRE I and a Senior SRE II are the technical skills (enabling the Senior SRE II to only seek support in unique scenarios), the scope and the engagement in challenging best practices. Because the required technical skills and commercial knowledge can vary from one area to another, a Senior SRE I can wear several hats: part of a business service owner team, owner of a piece of infrastructure, and/or consultant to product development teams on matters within Site Reliability Engineering scope.

Key Responsibilities

Building software applications

  • Is responsible to build software applications by using relevant development languages and applying knowledge of systems, services and tools appropriate for the business area and guide more junior members of the team in this topic.

  • Is responsible to refactor and simplify code by introducing design patterns when necessary and guide more junior members of the team in this topic.

  • Is responsible to ensure the quality of the application by following standard testing techniques and methods that adhere to the test strategy

  • Is responsible to write readable and reusable code by applying standard patterns and using standard libraries

  • Is responsible to maintain data security, integrity and quality by effectively following company standards and best practices

Software Systems Design

  • Is responsible to evaluate possible architecture solutions by taking into account cost, business requirements, technology requirements and emerging technologies

  • Is responsible to describe the implications of changing an existing system or adding a new system to a specific area, by having a broad, high-level understanding of the infrastructure and architecture of our systems

  • Is responsible to help grow the business and/or accelerate software development by applying engineering techniques (e.g. prototyping, spiking and vendor evaluation) and standards

  • Is responsible to meet business needs by designing solutions that meet current requirements and are adaptable for future enhancements

End to End System Ownership

  • Is responsible to reduce business continuity risks and bus factor by applying state-of-the-art practices and tools, and writing the appropriate documentation such as runbooks and OpDocs

  • Is responsible to reduce risk and obtain customer feedback by using continuous delivery and experimentation frameworks

  • Is responsible to independently manage an application or service by working through deployment and operations in production and guide more junior members of the team in this topic.

  • Is responsible to maintain data security, integrity and quality by effectively following company standards and best practices

Technical Incident Management

  • Is responsible to address and resolve live production issues by mitigating the customer impact within SLA

  • Is responsible to improve the overall reliability of systems by producing long term solutions through root cause analysis

  • Is responsible to keep track of incidents by contributing to postmortem processes and logging live issues

Automation and toil reduction

  • Is responsible to ensure that infrastructure stays current by reducing technical debt, searching for bottlenecks and preparing for scaling

  • Is responsible to reduce cost of operations and maintenance by leveraging new technologies and automation, and partnering with vendors to ensure we stay current

  • Is responsible to reduce human labour by writing small software features that address availability, scalability, latency and efficiency

Monitoring and Alerting improvements

  • Is responsible to review and verify performance of production systems and network infrastructure by continuously monitoring appropriate observability metrics and business KPIs, and by planning capacity

  • Is responsible to improve application reliability by partnering with development teams to advise on setting appropriate observability metrics

Critical Thinking

  • Is responsible to systematically identify patterns and underlying issues in complex situations, and to find solutions by applying logical and analytical thinking.

  • Is responsible to constructively evaluate and develop ideas, plans and solutions by reviewing them, objectively taking into account external knowledge, initiating 'SMART' improvements and articulating their rationale.

Continuous Quality and Process Improvement

  • Is responsible to identify opportunities for process, system and structural improvements (i.e. performance gains) by examining and evaluating current process flows, methods and standards.

  • Is responsible to design and implement relevant improvements by defining adapted/new process flows, standards, and practices that enable business performance.

Effective Communication

  • Has sufficient knowledge to deliver clear, well-structured, and meaningful information to a target audience by using suitable communication mediums and language tailored to the audience

  • Has sufficient knowledge to achieve mutually agreeable solutions by staying adaptable, communicating ideas in clear coherent language and practising active listening

  • Has sufficient knowledge to ask relevant (follow-up) questions to properly engage with the speaker and really understand what they are saying, by applying listening and reflection techniques

Architectural Guidance

  • Is responsible to advise product teams towards a technical solution that meets the functional, nonfunctional & architectural requirements by challenging the rationale for an application design and providing context in the wider architectural landscape

  • Has sufficient knowledge to set a clear direction for a technical capability by evaluating and aligning the target architecture improvements, reframing architectural designs and decisions for varied stakeholders

Coaching/Mentoring

  • Has basic knowledge to coach, guide and improve the overall performance of stakeholders and colleagues at all levels, when appropriate, by sharing experience, knowledge and approaches to work


Pre-Employment Screening

If your application is successful, your personal data may be used for a pre-employment screening check by a third party as permitted by applicable law. Depending on the vacancy and applicable law, a pre-employment screening may include employment history, education and other information (such as media information) that may be necessary for determining your qualifications and suitability for the position.

Application managed by Booking.com