Data Infrastructure Engineer

The Analytics Engineering Services team operates Box's data infrastructure across public and private clouds. The team is responsible for architecting and scaling the analytics data warehouse, the Hadoop clusters that power backend products, Kafka pub-sub pipelines, Storm infrastructure for monitoring, Redshift for data warehousing, and Splunk and ELK for logging.
 
We are in the process of rearchitecting our data stack to continue scaling with growing business demands. The new architecture will surface faster insights from product and go-to-market (GTM) data via self-serve ETL capabilities that enable the business.
 
We are looking for big thinkers and innovators to take on this problem space and deliver world class solutions. We are a small, passionate team that thinks big and is not afraid of huge, gnarly problems. If these challenges excite you, come join us.
 
Why the team needs you 
Moving to a cloud-based data stack brings a new set of challenges for us. We need someone passionate, with the domain expertise to architect a data platform for such an environment: one that can scale to millions of events per day while optimizing the tradeoffs between quality, consistency, and latency. We hope to learn from you the best practices for building large-scale logging infrastructure across multiple clouds.
 
Why Box needs you 
Box is growing fast. Real fast. Every business in the world is looking to modernize the way that they work. As the leader in cloud content management, Box is the only company that can help enterprises transform how people work together. We are undergoing a massive change in how we run our services and need to build an analytics platform to help guide our way. That's where you come in.
 
Why you need Box 
You're going to have the unique opportunity to architect, design, and build our private and multi-public-cloud solution, with security requirements among the most stringent any company has to deal with. As you drive and scale our infrastructure migration to the cloud, you will work with cutting-edge technologies including Hadoop, Kafka, Elasticsearch, Spark, Storm, and the AWS big data stack. You will have visibility across all of Engineering and a direct impact on the entire business.
 
Who you are 
  • You have DevOps experience implementing large-scale data stacks using AWS managed and serverless architectures.
  • You love building distributed client/server systems in Python and/or Scala.
  • You have experience writing ETL on Hive and Spark.
  • You are proficient with infrastructure as code, using Ansible and/or Puppet.
  • You act like an owner and strive to do work you're proud of, both technically and in your team interactions.
  • You are able to inspire other people to work with you, and you enjoy mentoring and coaching more junior engineers.
  • You have 5+ years of relevant experience
  • AWS certifications are a nice-to-have.