Sr. Site Reliability Engineer
Box’s infrastructure is going through a massive transformation to enable our rapidly scaling business. At the heart of this transformation is our hybrid, multi-vendor cloud strategy. The SRE Application Services Team is responsible for driving the onboarding of Box's application services onto platforms like OpenStack, Kubernetes, AWS, GCP and Azure.
In this role, you will develop the automation, systems and processes that enable rapid prototyping by development teams, while ensuring tight alignment with compliance and availability goals. As we evolve the infrastructure and move towards a service-oriented architecture, it is imperative that we redefine how we think about operationalizing services in the future. You’ll collaborate across Engineering and Product to ensure this is a core part of our SDLC.
Why the team needs you
The SRE Application Services Team is packed with motivated engineers who understand the impact of our work on Box’s success. That breeds an amazing sense of ownership and pride within the team - values we believe will be key to adapting how we work in a world of microservices and hybrid clouds. We would love to add someone who has seen this movie before (or parts thereof!) and can help shape the team’s vision with their domain expertise in one or more of public clouds, containers and cloud-native apps.
Why Box needs you
Box is growing fast. Real fast. Every business in the world is looking to modernize the way it works. As the leader in cloud content management, Box is the only company that can help enterprises transform how people work together. Come help us define a robust way to build, operate and scale the web services that power this industry-leading mission!
Why you need Box
You're going to have the unique opportunity to solve interesting technical challenges by defining, designing and deploying Box services across our private cloud and multiple public cloud platforms and infrastructure, always thinking about reliability, scalability, resilience, security, and performance. As you drive and scale our infrastructure and service migration to the cloud, you will get to work with modern technologies including Kubernetes, SmartStack, and all major cloud providers. You will have visibility across all of Engineering and a direct impact on the entire business.
Who you are
- You have 3+ years of large-scale production operations experience and enjoy talking about the stability, scalability and performance limits of web services
- You have used tools like Puppet, Ansible, Chef, Spinnaker, SmartStack and similar to manage and scale multi-tier web services infrastructure
- You act like an owner and strive to do work you're proud of. You believe in spreading (and acquiring) knowledge through mentorship
- You have experience managing an AWS / Azure / GCP cloud infrastructure and its foundational services, including EC2, S3, GCS and other storage options, VPCs, IAM, etc.
- You like to build robots that automate your job, using one or more of Python, Perl, Ruby, Java/Scala, C or the like
- You enjoy troubleshooting in a distributed Linux systems environment and are comfortable tracing problems through applications, systems and networks
- You have excellent oral and written communication skills and the ability to effectively communicate complex subjects to both technical and non-technical audiences at all levels