Site Reliability Engineer (Mid) – CPT

Engineering/Technical
Cape Town – Western Cape – South Africa

ENVIRONMENT:
A globally recognized brand with a strong strategic vision, dedicated to enhancing lives through cutting-edge technology, is looking for a Mid-Level Site Reliability Engineer. They seek a highly skilled and adaptable professional with a solid background in system automation and configuration management tools, including but not limited to Ansible, Puppet, and Terraform. The role offers the opportunity to contribute to a dynamic environment, ensuring reliability, scalability, and efficiency through automation and infrastructure management. You will need Matric/Grade 12, suitable certifications such as Oracle, Cloud, or DevOps, and 5-10 years of Software Development experience, preferably including 3-5 years in SRE, DevOps, or System Engineering.
 
DUTIES:
  • Use monitoring and logging tools to enhance system observability and optimize troubleshooting processes.
  • Develop and maintain tools to automate operational workflows.
  • Actively participate in on-call rotations, promptly respond to incidents, and drive thorough root cause analysis to ensure effective resolution.
  • Work closely with Development teams to enhance system reliability through in-depth code reviews, performance analysis, and infrastructure improvements.
  • Drive the adoption of reliability best practices by contributing to the development, implementation, and continuous improvement of standards that enhance system stability and performance.
  • Promote a culture of knowledge-sharing within the team, encouraging collaboration and enabling continuous learning through open discussions, documentation, and technical insights.
 
REQUIREMENTS:
Minimum Requirements:
  • Matric/Grade 12.
  • 5-10 years of Software Development experience, preferably including 3-5 years in SRE, DevOps, or System Engineering.
  • Proficiency in scripting languages.
  • Relevant certification, such as Oracle, Cloud, or DevOps.
 
Technical Skills:
  • Continuous delivery
  • Cloud skills & best practices
  • Observability (System and Application Performance Monitoring)
  • Infrastructure as code
  • Configuration Management (Infrastructure as a Service)
  • Containers
  • Automation
  • Collaboration and Communication
  • Coding and Scripting
  • Azure DevOps
  • General system uptime
  • SLOs (Service-Level Objectives)
  • Latency
  • Incident and Outage Management
  • Change Management
  • Capacity Planning
 
ATTRIBUTES:
  • A proactive approach to spotting problems, areas for improvement, and performance bottlenecks.
  • Strong troubleshooting skills.
  • Self-disciplined and self-motivated.
  • Ability to learn quickly and share knowledge with others.
  • Able to work well both in a team and independently.
  • Accountable and responsible.
  • Detail-oriented, accurate, and analytical.
  • Good reporting and documentation skills.
  • Excellent communication skills.

+ 27 (0) 21 741 0400