Why are we recruiting?
As Site Reliability Engineers, our mission is to empower our team to architect, deploy, and operate infrastructure that is truly cloud-agnostic. We aim to enable seamless deployment of our services across any cloud provider or region, so that we can always run as close as possible to where our customers store their data. Through robust automation, resilient system design, and a deep commitment to reliability and scalability, we strive to provide our customers with consistent, low-latency experiences, no matter where their data resides.
The candidate will work closely with our international development teams in the USA, Italy, Hungary, and France.
As an SRE, you will play a pivotal role in ensuring our services meet the highest standards for mission-critical operations, with a strong emphasis on security, resiliency, scalability, and performance. The following objectives will serve as the foundation for achieving these goals:
Collaborate with teams to identify tooling needs and design effective solutions.
Coach engineers on best practices for reliable, secure, and performant production environments.
Architect, implement, and improve mission-critical services for operational excellence, security, resiliency, scalability, and performance.
Document and communicate service characteristics (scale, security, performance) to stakeholders.
Establish instrumentation and metrics for service health; manage and test disaster recovery processes.
Engineer automation and orchestration for SaaS platforms to boost reliability and efficiency.
Lead incident response, root-cause analysis, and remediation for continuous improvement.
Define, track, and enforce SLOs for production services, aligning with business expectations.
Monitor and optimize cloud resource utilization and spending for efficiency and cost savings.
Partner with DevOps to optimize pipelines, ensuring reliability and observability.