Quantcast operates some of the largest custom-developed data processing infrastructure in the world, storing and processing tens of petabytes of data daily. This includes a fault-tolerant, space-efficient distributed file system (QFS), along with a custom map/reduce implementation that is four times faster than open source alternatives.
Low-probability failures are common occurrences in any system that contains thousands of nodes. The Cluster Services team owns the development and operation of software systems that not only identify problems across the stack (network, hardware, OS, services), but also auto-correct and compensate for them. The team also owns a custom resource allocation system that has been used to achieve massive service co-tenancy across the compute infrastructure: scaling the distributed processing and storage platforms to become a highly available service that supports analytics and modeling across the company.
The ideal candidate has hands-on experience with large-scale distributed systems (HDFS, Hadoop, Cassandra, etc.), configuration management (Puppet, Chef), databases (MySQL, PostgreSQL), and Amazon Web Services. They should be comfortable working in an event-driven environment while also developing code that scales the management of our distributed storage and compute platform.
- Mentor and grow the more junior engineers on the team
- Drive operational excellence through automation, monitoring, and incident analysis
- Provide technical input into product roadmaps for the team
- Develop tools that scale the management of the distributed storage and compute platform
- Maintain and enhance the services that support the distributed storage and compute platform
- Work to make our platform more elastic and fault tolerant
- Guide the development of systems that integrate our data centers with Amazon Web Services
- BS in computer science or equivalent experience
- Experience with large-scale distributed systems
- Proficiency in one or more programming languages
- Linux system administration/automation experience
- Track record of driving operational excellence
- Excellent communication and interpersonal skills
- Strong written communication and documentation skills
- Organized, detail-oriented personality
Quantcast is a fast-growing, late-stage, pre-IPO startup headquartered in San Francisco with offices around the world. We are committed to creating an inclusive and diverse environment where everyone can confidently be their authentic self.