Systems Engineer

3+ months ago · Herndon, VA


Amazon Simple Storage Service (S3) is storage for the Internet. Using pioneering techniques in distributed computing, developers can durably store their data on Amazon's proven computing infrastructure and achieve virtually limitless storage capacity at minimal cost. Amazon S3 provides a simple web services interface that can be used to store and retrieve any amount of data, at any time, from anywhere on the web. It powers businesses across the globe that make consumers' lives better every day, whether through electronic content delivered to your home, technology that improves your remote working experience, tools that let you plan travel to exotic places, or simply getting things delivered to your door.

To meet and exceed our customers' expectations, we are constantly innovating in S3 fleet management software, automating processes, and improving cloud storage infrastructure maintenance. As a Systems Engineer, you will join our S3 Fleet Development team to run maintenance activities for S3's largest fleets, such as patching, firmware updates, and software deployments, and to work toward fully automating these activities. We are growing, and you will help us scale rapidly in the face of explosive customer growth.

Some of the key job functions:

• Investigate new ways to improve operations and define new solutions to maintain the world's largest public-cloud-based object storage service fleets
• Dive deep into service interdependencies and region-specific configurations to find the root of a problem
• Work on operations- and maintenance-driven coding projects, primarily in Java and Python
• Support incoming tickets, including extensive troubleshooting tasks, with responsibilities covering multiple products, features, and services
• Develop tools to aid patching and firmware update operations and maintenance
• Work with the development team to automate features and enhance fleet maintenance activities

Work-life Balance
Our team works together to provide work/life balance for all team members. We recognize that the circumstances of our team members vary, and we balance work across the team so we're all able to maintain standards on behalf of our customers, while at the same time allowing for rich and happy personal lives.

On-Call Responsibility
S3 services are highly available, but we occasionally stray from normal operations. To minimize the impact of such excursions, we have on-call rotations. We set these up with clearly defined periods when you are on call and when you are not, so you can focus on your day job when you are off rotation.

Mentorship & Career Growth
Our team is dedicated to supporting new members. We're building an environment that celebrates knowledge sharing and mentorship. We care about your career growth and strive to assign projects based on what will help each team member develop into a better-rounded engineer and enable them to take on more complex tasks in the future.

Inclusive Team Culture
We have a diverse team and drive towards an inclusive culture and work environment. Our team is intentional about attracting, developing, and retaining amazing talent from diverse backgrounds. Our team members are active in Amazon's 10+ affinity groups, sometimes known as employee resource groups, which bring employees together across businesses and locations around the world. These include groups such as the Black Employee Network, Amazon Women in Engineering, and LGBTQ+.


Basic Qualifications

• Bachelor's Degree in Computer Science or a related field, or relevant work experience
• 5+ years of experience as a Systems Engineer, Site Reliability Engineer, DevOps Engineer, or equivalent
• 5+ years of Linux experience
• 5+ years of systems management/administration automation experience
• 3+ years of experience with Agile methods and processes
• Ability to deal with ambiguity and to drive, design, implement, and maintain large-scale systems


Preferred Qualifications

• Advanced degree in Computer Science or an Engineering discipline
• 3+ years of experience developing systems management and administration automation in Python or Java
• Debugging and systems analysis skills to identify and quickly resolve or mitigate issues
• Exposure to massive-scale distributed systems
• Ability to track the health of our services and to identify and fix problems in complex systems at massive scale

Amazon is committed to a diverse and inclusive workplace. Amazon is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status. For individuals with disabilities who would like to request an accommodation, visit US Disability Accommodations.

Job ID: Amazon-1471119