Senior Software Engineer, Site Reliability

Toronto, Ontario - Permanent

Job Description

Founded in 2008 and headquartered in San Francisco, the client is backed by over $400 million in investment from Sequoia Capital, CapitalG, Tiger Global Management, Javelin Investment Partners, and Baillie Gifford.

Our Site Reliability Engineers are a hybrid of software and systems engineers. Our current mission is to design the next version of the core infrastructure. We code our way out of operational problems. We are responsible for reliability, scalability, and automation while keeping an eye on latency, performance, and capacity.

Must Have Skills:

Design, write, and maintain software to improve the availability, scalability, and efficiency of services, incorporating third-party open-source tools when available
Set the architectural direction of infrastructure and platform teams and support the entire organization
Design and implement the tools and processes used for deployment, change management, and incident response
Own, maintain, and continuously improve all systems provided as a service, such as monitoring and datastores
Engage in service capacity planning and demand forecasting, anticipating performance bottlenecks

Minimum of 9 years of industry experience in engineering
Fluent in one or more of: C, Python, Go, Scala
Familiarity with algorithms, data structures, and complexity analysis
In-depth knowledge of operating systems (processes, threads, IPC, concurrency, locks, mutexes, semaphores, etc.)
Experience working with Unix/Linux systems from kernel to shell and beyond, with experience working with system libraries, file systems, and client-server protocols
Experience with one or more of: Puppet, Chef, Ansible, or similar configuration management tools
Strong sense of ownership, drive, and leadership

Nice to Have Skills:

Expertise in designing, analyzing, and troubleshooting large-scale distributed systems
Experience architecting cloud-native applications with Amazon Web Services
Experience with PostgreSQL tuning and performance
Experience managing a container orchestration system, such as ECS or Kubernetes
Experience with one or more of: Kafka, Elasticsearch, Envoy Proxy


Starting: ASAP
