Site Reliability Team Leader



Job Category: R&D

About CyberArk:

CyberArk is the global leader in privileged access security, a critical layer of IT security to protect data, infrastructure and assets across the enterprise, in the cloud and throughout the DevOps pipeline. CyberArk delivers the industry’s most complete solution to reduce risk created by privileged credentials and secrets. The company is trusted by the world’s leading organizations, including more than 50 percent of the Fortune 100, to protect against external attackers and malicious insiders.

Job Description:

CyberArk SREs are coders who enjoy a challenge and own the availability of CyberArk SaaS, from infrastructure to application. They measure failures and availability against SLIs and SLOs, and take a proactive approach that favors prevention over mitigation and mitigation over fixing. SREs collaborate with Dev and work with PM to continuously improve the availability and quality of the services. They share ownership with the Dev team: the SRE owns the availability of the service, the proactive prevention of issues, and the deliberate, structured troubleshooting needed to mitigate them.

CyberArk Cloud Engineering is looking for a team leader with an "automation first" mindset to lead the Site Reliability Engineering team. The team leader will demonstrate strong professional skills, coach their team members and strive for excellence, drawing on comprehensive business and technical knowledge.

Responsibilities:


  • Lead a team of Site Reliability Engineers that is responsible for CyberArk SaaS uptime and shares responsibility for the reliability of the services with the Dev teams
  • Own the overall health, availability, performance and capacity of the SaaS, and improve service reliability by writing code and building automation to prevent problem recurrence
  • Lead projects and set priorities to meet the Site Reliability Engineering goals
  • Lead by example, mentor the team, define improvement goals and guide the team to achieve them
  • Work hand-in-hand with the Dev teams, Product Management and Security Services to improve the reliability of the service
  • Work with multiple stakeholders, manage different agendas and resolve issues guided by what's best for CyberArk
  • Proactively test the flexibility and resiliency of the services
  • Recruit and train new team members
  • Design, write and deliver pipelines and code to improve the availability, scalability, latency and efficiency of CyberArk Services
  • Enhance the incident response process
  • Conduct performance reviews with team members
  • Maintain our culture

Requirements:


  • 3-5 years of experience focused on Site Reliability, DevOps Engineering, system administration or application development for production services
  • 4 years of experience in a production services operations role
  • 2 years in a leadership position, with at least one year managing engineering or operations teams
  • Strong hands-on experience in:
    • Linux and/or Windows OS
    • Network architecture and security configurations
  • Hands-on experience with the following scripting and automation technologies:
    • Automation/configuration management using Ansible, Puppet, Chef or an equivalent
    • Python, Ruby, Bash, PowerShell
  • Think like an attacker
  • Experience with running production services on AWS – an advantage
  • Excellent communication skills
  • Strong attention to detail
  • Strong hands-on technical abilities
  • Strong problem-solving skills
  • Strong understanding of Information Security in various environments
  • Ability to keep track of numerous detail-intensive, interdependent tasks and ensure their accurate completion
  • Proactive, highly motivated individual with a high work ethic and goal-oriented approach
  • Excellent communication and presentation skills
