Senior Software Engineer (Senior Site Reliability Engineer)
- Sunnyvale, CA
What you'll do...
Senior Site Reliability Engineer
Site Reliability Engineers (SREs) are responsible for keeping all production services and systems running smoothly. SREs are a blend of pragmatic operators and software craftspeople who apply sound engineering principles, operational discipline, and mature automation to environments and the codebase. They specialize in systems, be it Linux/Unix, networking, or storage, with a specific interest in improving the reliability, performance, scalability, monitoring, and measurement of distributed systems and services. Senior SREs meet the following criteria:
- Deep knowledge and expertise in 2-3 of the following areas and general awareness of the larger area:
- CI/CD automation - Building automated pipelines
- Automation - Use Ansible to efficiently manage infrastructure
- Networking - DNS, load balancing (L4/L7), TCP/IP, HTTP, SSL
- Optimize Performance and Cost across the tech stack (Network, Compute and Storage)
- Build and use Kubernetes and containerized systems
- Administer and manage MySQL, MongoDB, Cassandra, Elasticsearch, Apache Storm and Apache Kafka clusters
- Build monitoring and metrics platforms using Sensu/Prometheus and Grafana, integrating with Slack, xMatters and custom tools
- Administer the logging platform (Scribe + ELK)
- Capacity Management
- Disaster Recovery and High Availability strategy
- Able to execute on the technical roadmap
- Identifies areas where significant cost savings can be achieved
- Identifies architectural opportunities to improve reliability, availability, and performance
- Proactively works on capacity and productivity planning to improve overall efficiency
- Identifies, defines and builds SLIs that help meet the defined SLAs
Collaboration and Communication:
- Drives incident management and reviews
- Collaborates on RCAs and acts on the gaps identified to prevent future occurrences
- Contributes to runbooks and project documentation
Influence and Maturity:
- Sets an example for the junior members of the team with inclusive leadership and mentoring
- Able to de-escalate conflicts inside the team
About Global Tech
Imagine working in an environment where one line of code can make life easier for hundreds of millions of people and put a smile on their face. That's what we do at Walmart Global Tech. We're a team of 15,000+ software engineers, data scientists and service professionals within Walmart, the world's largest retailer, delivering innovations that improve how our customers shop and empower our 2.2 million associates. To others, innovation looks like an app, service or some code, but Walmart has always been about people. People are why we innovate, and people power our innovations. Being human-led is our true disruption.
Working virtually this year has helped us make quicker decisions, remove location barriers across our global team, be more flexible in our personal lives and spend less time commuting. Today, we are reimagining the tech workplace of the future by making a permanent transition to virtual work for most of our team. Of course, being together in person is an important part of our culture and shared success. We'll collaborate in person at a regular cadence and with purpose.
Outlined below are the required minimum qualifications for this position. If none are listed, there are no minimum qualifications.
Minimum Qualifications: Bachelor of Science and 5 years' experience in software engineering OR Master of Science and 2 years' experience in software engineering
Outlined below are the optional preferred qualifications for this position. If none are listed, there are no preferred qualifications.
Master's degree in Computer Science or related field and 2 years' experience in software engineering or related field
Information Technology - CISCO Certification - Certification
Primary Location...840 W CALIFORNIA AVE, SUNNYVALE, CA 94086-4828, United States of America