This job is no longer accepting applications

Senior Site Reliability Engineer - Amman, Jordan - Quadcode

Quadcode, Amman, Jordan

Posted 3 weeks ago

Examples of first tasks in the role:

Responsibilities in the role:

Identification of bottlenecks and preparation of recommendations to improve the reliability of services;
Responding to platform emergencies, isolating and resolving the causes of failures, and compiling postmortem reports;
Development of monitoring and alerting tools that ensure high availability and quick detection of potential issues (Grafana, Grafana OnCall, Prometheus Alertmanager, etc.);
Active participation in change management processes, including assessment and coordination of changes to the infrastructure within Change Advisory Board (CAB) sessions;
Implementation and support of ITSM processes to optimize team workflow and enhance service quality;
Development of documentation and keeping it up to date.

Requirements:

3+ years of experience in SRE/DevOps;
Understanding of SRE principles, practical experience in implementing SRE practices;
Understanding of the principles of resilient systems and practical experience in building them;
Experience with monitoring and logging systems (Prometheus, Graylog, Grafana);
Experience with automation tools for software build and deployment (CI/CD): GitLab, Jenkins;
Understanding of virtualization and containerization principles;
Understanding of, and experience with, Infrastructure as Code (IaC) approaches;
Proficiency in a programming language for automation script development (Python, Node.js, Golang, etc.), ability to understand service code;
Understanding of network protocols, topologies, and network models;
Experience with configuration management tools: Ansible, Chef;
Basic experience with relational databases, such as PostgreSQL;
Experience in administering Linux operating systems;
Fluency in English and Russian (B2 minimum).

As an advantage:
