[Remote] Site Reliability Engineer

Worldwide Salaried Open

Note: This is a remote job open to candidates in the USA. Qlik is a Gartner Magic Quadrant Leader in data integration and analytics, serving over 40,000 global customers. The company is seeking a Site Reliability Engineer to ensure the security, stability, and scalability of its Qlik and Talend Cloud services while tackling complex challenges and driving improvements in performance and reliability.

Responsibilities

  • Take on the responsibility of maintaining the reliability and availability of our cloud platforms, tackling complex problems and driving improvements to enhance performance and scalability
  • Work closely with our Engineering organization, collaborating with Architecture, Platforms, and Domains teams to design and develop new infrastructure features and optimize cloud-related practices
  • Design and develop effective tooling, alerts, and responses to identify and address reliability risks, utilizing your expertise in cloud technology and backend systems
  • Act as a resource for fellow engineers, sharing your knowledge and expertise in cloud engineering, production service operations, incident management, and troubleshooting
  • Stay updated on the latest industry trends and technologies, contributing to the adoption of best practices and driving continuous improvement within our cloud environment
  • Ensure high reliability and availability of our cloud platforms, collaborating with cross-functional teams to implement new infrastructure features and optimize performance
  • Define and evangelize cloud-related optimizations and best practices, driving improvements in reliability, scalability, and performance
  • Analyze complex issues at the infrastructure, systems, network, and application levels, making recommendations and decisions to resolve them effectively
  • Share your expertise with fellow engineers, providing guidance on cloud technologies, automation, security, and best practices
  • Participate in on-call duties to maintain the availability and performance of our cloud infrastructure, providing regular updates on project status and activities

Skills

  • Bachelor's or Master's degree in Computer Science or a relevant field
  • Self-motivated with the ability to work autonomously and multitask effectively
  • Strong analytical skills for solving complex problems and driving innovative solutions
  • 10+ years of experience in software engineering and Site Reliability Engineering, focused on large-scale distributed systems, cloud infrastructure, and production operations
  • 5+ years' experience with Infrastructure as Code (IaC) tools such as Terraform, Crossplane, Ansible, or similar
  • 5+ years' experience working with production systems running on Kubernetes
  • 5+ years of professional experience in cloud engineering, preferably with AWS and Azure
  • 5+ years of professional experience operating and/or building microservices
  • Proficiency in scripting and automation (e.g., Bash, Python, Go, C#) and software engineering concepts
  • Proficiency with CI/CD and with cloud and microservice autoscaling
  • Proficiency with observability stack tooling such as Prometheus, OpenTelemetry, and distributed tracing, and with SIEM tools such as Splunk
  • Proficiency with Helm, including but not limited to managing Helm charts, customizing existing charts, and building new ones
  • Ability to provide technical leadership during troubleshooting efforts and effectively communicate issues, impact, and resolution plans to senior leadership
  • Proficiency with cloud security best practices across infrastructure and platform services, including identity and access management, encryption, network segmentation, secrets management, and least-privilege access controls
  • Proficiency with incident management best practices and the ability to confidently drive an incident in a critical production environment
  • Knowledge of infrastructure security review and compliance frameworks
  • Experience working with database concepts and tooling such as MongoDB, Redis, OpenSearch and RDS
  • Demonstrated ability to collaborate with development teams and provide expert guidance on implementing reliability best practices, ensuring systems are robust, scalable, and highly available
  • Knowledge of event-driven architecture (e.g., Pub/Sub)
  • Where applicable, experience with or interest in learning other tools such as ClickHouse, FireHydrant, Solace, Gloo, Istio, and other cloud-native tools
  • Ability to obtain sufficient clearance status to work on IL5 systems with Qlik support
  • Due to this requirement, must be a US citizen or be in the process of becoming one by January 2027
  • Excellent English communication skills, both oral and written
  • Curiosity and desire to learn
  • Ability to take a rotating on-call shift (24/7)
  • Certifications such as CKD, CKS, AWS Certified Solutions Architect Associate/Professional, AWS Certified Advanced Networking Specialty, AWS Certified Security Specialty
  • Experience supporting FedRAMP or DoD IL4 certification initiatives by implementing security controls, driving audit readiness, and operationalizing compliant cloud infrastructure
  • Experience with self-hosted Temporal workflow infrastructure, including deployment, upgrades, scaling, monitoring, troubleshooting, and performance optimization across Kubernetes environments

Benefits

  • Medical, dental, and vision coverage
  • Life and AD&D
  • Short- and long-term disability coverage
  • Paid time off
  • Paid parental / maternity leave
  • Participation in a 401(k) program that includes company match
  • Many additional voluntary benefits
  • Genuine career progression pathways and mentoring programs
  • Culture of innovation, technology, collaboration, and openness
  • Flexible, diverse, and international work environment
  • Extra “change the world” day plus another for personal development
  • Participation in our Corporate Responsibility Employee Programs

Company Overview

  • Qlik is a data analytics platform that helps businesses make better decisions through advanced data integration, quality, and analytics. It was founded in 1993 and is headquartered in King of Prussia, Pennsylvania, USA, with a workforce of 1001-5000 employees. Its website is https://www.qlik.com/.