Senior Site Reliability Engineer
Leanplum is the most complete mobile marketing platform, designed for intelligent action. Our integrated solution delivers meaningful engagement across messaging and the in-app experience. Leanplum offers Messaging, Automation, App Editing, Personalization, A/B Testing, and Analytics.
Top brands such as Expedia, Tesco, and Lyft trust us to create impactful relationships with their users. We were founded in 2012 by former Google engineers with years of experience in optimization and have received over $17MM in funding from top-tier VCs like Kleiner Perkins and Shasta Ventures.
Inside Leanplum, you’ll meet employees from 16 countries and counting. We house a world champion air guitarist, three medalists from programming competitions, and six loyal office dogs who greet you at the door with tails wagging. Past perks have included company vacations to Mexico and Tahoe, Alfred Hitchcock movie nights, and costume parties. But most of all, we believe in gratitude, collaboration, and karma.
We're looking for a Senior Site Reliability Engineer to join our growing team in San Francisco. You’ll have the opportunity to solve challenges creatively. With the resources of experienced leadership and a world-class Engineering team, you can bring your ideas to our enterprise customers.
About This Role
Our Site Reliability Engineers are a hybrid of software and systems engineers. We code our way out of operational problems and into chocolate chip cookies.
Our current mission is to design Leanplum’s next version of the core infrastructure. We are responsible for reliability, scalability, and automation, while keeping an eye on latency, performance, and capacity.
We are seeking extraordinary talent to help fuel our distributed applications, which serve over 1 billion mobile devices and track over 6 billion analytical events/day, equating to over 17,000 requests/second and generating over 1.5 TB/day of data.
What We Need Your Help With
- Monitor and alert on various components across our infrastructure.
- Automate the server provisioning process across API, Cassandra, and Spark servers totaling over 400 nodes.
- Influence and create new designs and architectures for a growing number of distributed systems (multi-region cloud environment).
- Plan and execute configuration management and monitoring of our platform as it grows.
- Design the system and processes that engineers use to deploy their software into production.
- Design, write, and maintain software to improve the availability, scalability, latency, and efficiency of Leanplum’s services, incorporating cloud and open source tools when available and writing software of your own when nothing else fits the bill.
- Engage in service capacity planning and demand forecasting, anticipating performance bottlenecks and provisioning new hardware as necessary.
- Run software performance analysis and system tuning.
- Participate in rotating on-call duties.
You’re Good At
- Fluency in one or more of Java, Python, or Scala
- Familiarity with algorithms, data structures, and complexity analysis
- Experience working with Unix/Linux systems from kernel to shell and beyond, with experience working with system libraries, file systems, and client-server protocols
- Nice to have: experience with network protocols and theory (TCP/IP, UDP, ICMP, MAC addresses, IP packets, DNS, OSI layers, load balancing, etc.)
- A systematic approach to problem solving
You Might Also Be Good At
- Expertise in designing, analyzing, and troubleshooting large-scale distributed systems
- In-depth knowledge of operating systems (processes, threads, IPC, concurrency, locks, mutexes, semaphores, etc.)
- Strong sense of ownership and drive
- Experience with AWS, GCP, or Microsoft Azure
- Experience with tuning and performance (Spark, Cassandra, Google App Engine apps)
Perks
- Competitive salaries
- Health, vision, and dental insurance
- Unlimited vacation
- Peer bonuses
- Delicious lunch catered daily
- Themed happy hours every Friday!
- Ping pong, darts, and foosball
- Puppies galore
Build more than a Career. Create Meaning.