Staff Software Engineer, Distributed Storage

San Francisco, CA (posted 3+ months ago)

Airbnb is a mission-driven company dedicated to helping create a world where anyone can belong anywhere. It takes a unified team committed to our core values to achieve this goal. Airbnb's various functions embody the company's innovative spirit, and our fast-moving team is committed to leading as a 21st-century company.

 The Storage Foundation team in Airbnb’s Infrastructure organization owns the following core services under Data Platform:

  1. A highly available, low-latency, distributed, multi-tenant K/V storage layer for Airbnb's user profiling, Search, ML, and pricing teams, supporting million-level read QPS and 99.9+% availability across thousands of customer tables.
  2. A distributed coordination service across global regions that is scalable, reliable, and performant enough to support many mission-critical systems (MySQL, Redis, Kafka, Flink, Druid), built on top of ZooKeeper and etcd.
  3. A set of horizontal management services for hundreds of Airbnb production OpenSearch and ElastiCache clusters, at million+ IOPS and million+ indexing QPS.

We’re looking to add Senior Engineers who can grow into Technical Leaders and solve the broad technical challenges the team has to tackle. As a senior technical contributor, you will bring a unique skill set, experience, thought leadership, and technical expertise to our organization. You will start by (1) joining the engineering team building V2 of our K/V store on top of open-source TiDB and TiKV, adding features such as multi-tenant cost attribution and large-scale bulk loading, and (2) owning and managing OpenSearch/ElastiCache clusters for Airbnb in order to generalize Airbnb’s search and caching scenarios, and helping drive the roadmap for a horizontal service layer that effectively manages hundreds of OpenSearch/ElastiCache customers.

What we are looking for:

  • 5+ years of relevant industry experience in a fast-paced, high-growth tech environment.
  • Knack for writing clean, readable, testable, maintainable code.
  • Hands-on experience and expertise in building and operating distributed storage and services.
  • Ability to understand and decompose large-scale distributed systems, identify monitoring metrics and failure scenarios, and debug them efficiently.
  • Strong collaboration and communication skills with a customer-first mindset, and the ability to abstract various customer requests into a platform vision and solution.
  • Experience designing long-lived, evolvable architecture.
  • Knowledge of public cloud platforms and services (AWS, Elasticsearch, ElastiCache, Google Cloud Platform, etc.) and open-source software is preferred but not strictly required.

Some examples of our current work:

  • Built a fault-tolerant ingestion layer on top of TiDB to ingest TBs of data with minimal impact on online traffic.
  • A management layer for ZooKeeper that manages and upgrades multiple ZooKeeper clusters within Airbnb.
  • Work with AWS to devise a Redis cache upgrade and multi-tenancy roadmap, and to manage hundreds of ElastiCache customers inside Airbnb.

The starting base pay for this role is between $190,000 and $245,000. The actual base pay depends on many factors, such as education, experience, and skills. The base pay range is subject to change and may be modified in the future. This role may also be eligible for bonus, equity, benefits, and Employee Travel Credits.

Job ID: 3029584