ENGINEERINGUK

Senior Site Reliability Engineer II

Job Description

Do you have programming, cloud infrastructure, and containerization expertise? Would you like to join our great reliability engineering team?

About the Business

LexisNexis Risk Solutions is the essential partner in the assessment of risk. Within our insurance vertical, we provide customers with solutions and decision tools that combine public and industry-specific content with advanced technology and analytics to assist them in evaluating and predicting risk and enhancing operational efficiency.

About the Role

The Sr. SRE is a professional-level role responsible for challenging reliability and toil-reduction projects. This SRE should have a good understanding of how to observe distributed systems and their dependencies, and how to automate recovery to protect service levels. SREs are on call and assist others during incidents, contributing to process improvements through experience and knowledge.

Responsibilities

Deliver resilient application stacks via "Infrastructure as Code" and other DevOps practices.
Automate recurring toil tasks to improve efficiency.
Monitor and provide ongoing support for critical, high-revenue business applications.
Diagnose and resolve complex system and application issues.
Work with diverse technical and non-technical teams, including Development, QA, IT Operations, Product SRE and Project Management teams.
Write and maintain systems and application documentation for technical and non-technical audiences.
Improve, maintain and support in-house automation tools such as GitHub IaC CI/CD pipelines, AWS and Azure automation, and front-end UI change management tools.

Requirements

Hands-on experience with configuration management tools, e.g. Ansible, Puppet, Chef or equivalents.
At least 5 years of professional experience working in the public cloud (Azure and AWS, or GCP).
Hands-on experience with Linux and Windows Server, including support and troubleshooting.
System and application monitoring, e.g. Prometheus, Grafana, etc.
At least 3 years of professional experience with Infrastructure as Code tools, e.g. Terraform.
Experience working with containerized and serverless workloads and related tooling, such as Docker, Kubernetes, AWS CloudFormation and AWS Lambda.
At least 5 years of professional experience with common source control tools, e.g. Git, SVN, GitHub.
Cloud architecture and system design skills to solve key business problems and facilitate team goals.
Strong and enthusiastic technologist, able to demonstrate broad technical knowledge.
Excellent oral and written communication skills.
Ability to act as a point of expertise, advise others in the team on best practice and impart knowledge.
Experience with any high-level programming language, preferably Python.
BSc in Engineering or Computer Science, or relevant experience.

Desirable Skills

Experience migrating applications from on-premises to public cloud.
Experience with blue-green deployment methodologies.
Experience with log management tools, e.g. ELK Stack, Graylog, Azure or Datadog.
Experience working with an enterprise RDBMS such as MySQL and/or Microsoft SQL Server.
Use of secret management services, e.g. HashiCorp Vault, Azure Key Vault or AWS KMS.
AWS or Azure certifications are a plus.

Culture and Benefits

Learn more about the LexisNexis Risk team and how we work here.