Sr. Site Reliability Engineer
If you are passionate about technologies and design approaches that disrupt incumbent enterprise players, like to leverage open source, tooling, automation, and agile concepts, and are interested in working at massive scale, we may have just the job for you.
Site Reliability Engineering (SRE) is not a new discipline, but at Box we like to think that we approach it in a way that makes it one of the most important roles within our Technology organization. As an SRE at Box, you will be responsible for running our production services and will work very closely with developers to ensure the reliability, scalability, and performance of the next generation of systems. As such, you will be embedded within one of the engineering teams and will be a core driver of operational excellence throughout the development cycle and go-live.
We're looking for highly motivated and talented engineers for the Application Operations Engineering team. Come join Box, a proven leader in the enterprise collaboration space, and be part of a winning team that is responsible for running our production services.
Responsibilities
- Develop software to help drive opportunities to improve automation for deployment, management, tooling and visibility within the engineering team
- Work cross-functionally amongst a variety of teams and be a core contributor in every significant engineering solution that we deliver to our stakeholders
- Develop a deep understanding of the various services and applications that come together to deliver Box's services
- Augment existing instrumentation to build a cohesive picture of the characteristics of our systems with special attention to points of failure
- Design new tools and smart alerts that help discover failures and issues in a timely fashion, and work with engineers to identify root causes and fix issues
- Perform code reviews, evaluate implementations, and provide feedback about potential tool improvements
- Define and evangelize cloud-related optimizations and best practices to improve reliability and performance
Requirements
- Demonstrable knowledge of TCP/IP, HTTP, and distributed systems, and experience supporting multi-tier web application architectures in a web-scale environment
- Solid understanding of application design, including the operational trade-offs of various designs
- Solid understanding of Unix/Linux internals
- Demonstrated coding skills, preferably in Python
- Minimum 5 years of experience in production service troubleshooting that spans applications, systems, and networks
- BS/MS in CS or Engineering
About Box: Founded in 2005, Box (NYSE:BOX) is transforming the way people and organizations work so they can achieve their greatest ambitions. As the world's leading enterprise software platform for secure content collaboration, Box helps businesses of all sizes in every industry securely access and manage their critical information in the cloud. Box is headquartered in Redwood City, with offices across the United States, Europe and Asia. To learn more about Box, visit www.box.com.