Site Reliability Engineer
Who We Are:
KeepTruckin is on a mission to modernize the trucking industry. With the leading fleet management platform, we are bringing trucks online and fundamentally changing the way freight is moved on our roads.
We see our hard work rewarded in tangible ways every day and we believe that intelligence is most powerful when paired with humility. We’re motivated by the opportunity to impact and improve every facet of a trillion-dollar industry that touches everyone’s lives. KeepTruckin is proud to be a Forbes Cloud 100 company and to have been named a 2020 Career-Launching Company by Wealthfront.
About the Role:
As a Site Reliability Engineer on the Platform team, you will play a crucial role in helping us design, scale, and manage our growing AWS-backed infrastructure. Working with an enthusiastic team, you will apply your expertise to scaling our architecture and building a highly available system. We are looking for candidates with production experience on AWS-based platforms; expertise in automating distributed systems, scaling a fast-growing platform, and maintaining high availability; and a forward-thinking mindset ready to take on tomorrow's challenges.
What You’ll Do:
- Automate the provisioning, scaling, and management of our infrastructure using configuration as code and configuration management
- Work with other engineering and product teams to design and build the infrastructure required to deliver new features to customers
- Identify and remove bottlenecks from systems in production
- Ensure 99.9% customer-facing uptime
- Continuously improve the monitoring and alerting capabilities of our platform, enabling us to be proactive instead of reactive
- Create deployment pipelines that take code from Git to production
What We’re Looking For:
- 4+ years of professional SRE/DevOps experience, and a demonstrated ability to work on high-volume production systems
- Working knowledge of AWS services and technologies (Redshift, DynamoDB, Kinesis, RDS, ELB, Auto Scaling, Lambda, etc.)
- Experience with infrastructure as code and configuration management (Terraform, Nix, Ansible, CloudFormation, Chef, etc.), and with build systems such as Bazel, Pants, or Buck
- Knowledge of Python, Ruby, or Go, and an understanding of relational and NoSQL databases (PostgreSQL a plus)
- Experience with container orchestration frameworks such as Kubernetes or Docker Swarm
Creating a diverse and inclusive workplace is one of KeepTruckin's core values. We are an equal opportunity employer and welcome people of different backgrounds, experiences, abilities and perspectives.
Please review our Candidate Privacy Notice.