Senior Site Reliability Engineer
- Boston, MA
How You'll Make a Difference
- Ship foundational services to enable Klaviyo engineering to move faster with confidence
- Design and develop systems and processes that enable highly available, scalable services
- Design, build and deliver software to dramatically improve the availability, scalability, latency, and efficiency of Klaviyo’s services
- Achieve breakthroughs in system throughput by identifying and eliminating bottlenecks
- Leverage technologies such as Python, AWS, Django, Kubernetes, Bash, Terraform, MySQL, RabbitMQ, Redis, Cassandra, and PostgreSQL to advance Klaviyo’s platform
- Champion best practices by actively collaborating with other teams in a culture that values whiteboarding and technical design review
- Contribute to the company as a subject matter expert in multiple areas, constantly pushing yourself to be a better engineer and to level up your peers on your team and across Klaviyo
- Mentor and pair with other Klaviyo engineers to build better software by focusing on performance, self-healing systems, configuration as code, defensive programming, application security, etc.
- Participate in periodic on-call duties with a focus on solving issues when they are discovered, preventing recurrences, and minimizing alert fatigue
- Prototype and advocate for architectural improvements to achieve breakthrough results in Klaviyo systems’ operational scalability and reliability
- Work hand-in-hand with product-facing engineers to ship impactful code
- Perform quantitative analysis to understand and scale Klaviyo systems and manage the cross-functional effort to resolve scalability issues
- Produce and advocate for preventative, upstream solutions with internal stakeholders and external vendors and dependencies
- Confidently make informed, data-driven decisions in a fast-paced environment with competing priorities
- Evangelize Site Reliability best practices across the engineering organization and community
Who You Are
- BA or BS Degree in Computer Science, related field, or equivalent experience
- 5+ years of responsibility for operating and scaling complex distributed systems
- Ability to stay composed and manage complex systems during outages, and to drive failures through root cause analysis to the prevention of future issues
- Fundamental understanding of Linux (we run Ubuntu) and all layers of the networking stack. You should be confident administering and debugging production Linux systems
- Experience working on an engineering team building software
- Experience developing applications in Python, Ruby, Go, etc.
Get to know Klaviyo
Klaviyo is a world-leading marketing automation platform dedicated to accelerating revenue and customer connection for online businesses. Klaviyo makes it easy to store, access, analyze, and use transactional and behavioral data to power highly targeted customer and prospect communications. The company's hybrid customer-data and marketing-platform model allows companies to grow by fostering direct relationships with customers, without giving up their valuable data to popular big-tech ad platforms. Over 265,000 innovative companies like Unilever, Custom Ink, Living Proof, and Huckberry sell more with Klaviyo. Learn more at www.klaviyo.com.
Klaviyo does not tolerate and prohibits discrimination, harassment or retaliation of or against job applicants, contractors, interns, volunteers or employees by another employee, supervisor, vendor, customer or any third party.