Site Reliability Engineer (Core backend team)

Technology Stack:

AWS, Kubernetes (EKS), Aurora RDS (PostgreSQL/MySQL), Kafka, Argo CD, Prometheus, Jenkins, GitLab CI, Terraform, Ansible, Python, Java, Ruby.

As a Site Reliability Engineer, you will work closely with engineering teams to improve the product's observability, reliability, and scalability.

Key Responsibilities:

  • Work closely with the engineering and architecture teams to identify resilience gaps, then build and execute a roadmap for resolving them

  • Onboard K8s microservices to the GitOps-based CI/CD process and drive their migration from semi-manual GitOps releases to a fully automated, zero-downtime CI/CD process

  • Implement SLIs/SLOs for K8s microservices and build the process for tracking them. Identify observability gaps and execute a roadmap for closing them

  • Respond to production issues as an on-call engineer, participate in the RCA process, and write runbooks & automation to prevent similar issues in the future

  • Develop, test, execute & support disaster recovery plans for mission-critical services and sub-systems

  • Capacity planning & cloud infrastructure cost optimization

  • Implement security & compliance requirements

Requirements:

  • 3+ years of technical experience in the same or a similar role supporting large-scale, high-load production systems

  • Experience in the development and support of public cloud infrastructure

  • Hands-on experience running HA applications and developing CI/CD processes in Kubernetes

  • Proven programming skills in Python, Go or similar

  • Good knowledge of the Linux environment, TCP/IP, network routing, and DNS

  • Familiarity with SRE principles, DevOps practices, and the modern cloud-native landscape

  • Accuracy, attention to detail, and the ability to follow processes

  • Good communication skills, with English at intermediate level or above

Pluses:

  • Experience working with contact centers and VoIP solutions

  • Ability to read and troubleshoot Java code if needed

  • Experience with SQL/NoSQL databases, or willingness to develop skills in this field

We offer:

  • A well-coordinated, professional team

  • Cutting-edge technologies, interesting and challenging tasks, a dynamic project, and strong opportunities for self-realization and for professional and career growth

  • Additional Health and Life Insurance Package

  • Employee Assistance Program

  • 25 vacation days

  • ReBenefit Platform Account

Contact us

Write to us at jobs@jettycloud.com or send a message to our recruiters.