Site Reliability Engineering Manager

Remote/Telecommute Job
Remote / Toronto, Ontario, Canada - Permanent
This job allows you to work remotely.

Job Description

Our client's machine learning platforms leverage automotive assembly and vehicle data to detect the earliest indicators of future product failures. They help automakers optimize quality, safety, and reliability throughout the entire product life cycle, from the assembly line to the finish line.

As an integral part of their engineering, site reliability, and operations team, you will lead a growing team of SREs and Cloud Admins to deploy, maintain, and monitor all of the cloud-based infrastructure that the systems run on. This includes designing and scaling that infrastructure for future growth in an exciting, fast-paced environment.

The two core products enable automakers to identify anomalies in production data for enhanced testing, accelerated root cause analysis, and improved manufacturing output. Their other product is a SaaS platform which enables predictive maintenance of connected and autonomous vehicles based on production, maintenance, and on-road data.

Must Have Skills:

● 5 years of DevOps/SRE experience, with ~2 years in a 'lead' or 'manager' capacity - bridging Development, Operations and everything in between
● expert-level knowledge of cloud platforms, specifically Microsoft Azure, ideally with exposure to AWS and/or GCP
● 5+ years of software development experience
● strong experience with Terraform and containers (Kubernetes/Docker)
● experience using Jenkins, Jira, and Ansible, and exposure to scripting languages (Python, Ruby, etc.)
● ability to work in a fast-paced agile environment
● flexibility to adjust to changing priorities, requirements, and schedules
● familiarity with working on remote Linux instances


Starting: ASAP
Travel: 0%
Dress Code: Casual
