Network Device Telemetry, Monitoring, and Analytics
What will you achieve as part of our team?
You will influence, design, and implement distributed and scalable solutions for collecting, publishing, and analyzing telemetry data from one of the largest networks in the world. You will help improve the availability of the network by measuring and analyzing its performance, health, and quality metrics. You will contribute to making the telemetry service efficient by developing an adaptive and dynamic service that responds to anomalies in real time while conserving resources during normal operating conditions. You will grow your career and technical knowledge by working on the design and implementation of multiple complex projects using the latest tools and technologies while being mentored by senior engineers on our team. You will interact directly with our customers and partners and learn how they use the telemetry service to operate the network. We take pride in our deliverables, we measure and monitor what we deliver, and we highly value sound operational practices.
What does our team do?
Amazon's network is a key differentiator for Amazon Cloud Computing and Web Services (AWS), enabling global operation of thousands of applications and services running across hundreds of thousands of servers worldwide. The AWS Networking team develops and operates the network platform for all of Amazon, including e-commerce products and cloud computing solutions. This platform is industry-leading for its efficiency, performance, reliability, and scale, and it is critical to the success of all AWS customers. Our team is part of the AWS Networking organization. We are responsible for the telemetry and monitoring infrastructure of hundreds of thousands of networking devices in tens of data centers across the world. Reliable, scalable, and timely telemetry is critical to the operation of the AWS network, one of the largest in the world. We constantly innovate on behalf of our customers to reduce and eliminate customer impact caused by faulty or degraded network devices. We employ massively distributed systems to analyze hundreds of billions of data points every minute to detect anomalies so that we can proactively isolate or fix unhealthy devices. We use highly efficient mechanisms, built on open source as well as custom software, to emit, collect, and analyze this large volume of streaming data and monitor the health, performance, and quality of each networking device.
As part of the AWS Networking organization, we participate in designing all of Amazon's massive-scale data center network platform hardware and software. Our data center networks provide high-bandwidth, low-latency solutions for aggregating server racks, interconnecting data centers in our regional backbone, interconnecting our data centers to our Border network, and managing and monitoring all network devices and critical data center infrastructure. Engineers in this organization define the hardware platforms, physical topologies, routing architectures, and software control systems for these networks.
What is in it for you?
We are seeking a Software Development Engineer (SDE L5) to work on telemetry, monitoring, and analytics for massively scaled and rapidly growing inter- and intra-data-center network infrastructure. These are exciting times in our space: we are growing fast, but we are still at an early stage and working on ambitious new initiatives where an engineer at any level makes day-to-day and strategic decisions that carry a huge amount of responsibility and significant technical and business impact. Our networks are designed to be fully managed and operated via software systems and automation. As a member of our team, you will learn how one of the largest networks in the world is monitored and operated, and which challenges had to be solved to maintain a highly reliable, real-time streaming telemetry service across hundreds of thousands of entities involving trillions of data points published and consumed every minute. You will learn about the tools and AWS services we use to process, analyze, and visualize these massive data sets, and how we are innovating to reduce the time to detection and mitigation so that the duration of customer impact is minimized. We have senior engineers on our team with deep industry knowledge who mentor new members of our team. We also work with senior engineers from other AWS teams who are our customers and partners. No network protocol knowledge is required.
Stack: Embedded Linux/C++, AWS Services / Java / Python
• 2+ years of non-internship professional software development experience
• Programming experience with at least one modern language such as Java, C++, or C# including object-oriented design
• 1+ years of experience contributing to the architecture and design (architecture, design patterns, reliability, and scaling) of new and current systems
• A Bachelor's Degree in Computer Science or Engineering, or equivalent experience, is mandatory.
• Two or more years of experience in large scale data collection, processing, and visualization.
• Strong Unix/Linux skills and the ability to code in a modern object-oriented language such as Python, Ruby, C++, or Java.
• Highly autonomous and very detail-oriented, with strong written and verbal communication skills.
• Background with large-scale data pipeline design and architecture is highly preferred
• Good understanding of Computer Science fundamentals, data structures, algorithms, and distributed systems
• Excellent written and verbal communication skills and an ability to interact efficiently with peers and customers are required, as well as experience initiating, driving, and managing in-event conference calls.
• Ability to take a project from scoping requirements through actual launch of the project.
• Meets/exceeds Amazon's leadership principles requirements for this role. Meets/exceeds Amazon's functional/technical depth and complexity for this role.
Amazon is committed to a diverse and inclusive workplace. Amazon is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status. For individuals with disabilities who would like to request an accommodation, please visit https://www.amazon.jobs/en/disability/us.