Sr. Site Reliability Engineer

Company Description

Oath, a subsidiary of Verizon, is a values-led company committed to building brands people love. We are a leader in digital and mobile media with a global house of 50+ brands. Oath is shaping the digital future.

Job Description

Verizon Digital Media Services (VDMS) is looking for a highly talented, technical engineer to help us build the next generation of software powering the internet. This is an excellent opportunity to join a world-class technical team, working with some of the best and brightest engineers while developing your skills and furthering your career within one of the most innovative and progressive technology companies.


Sr. Site Reliability Engineering (Sr. SRE) is what you get when you treat operations as if it’s a software problem. The purpose of this role is to watch over the availability, latency, performance, and reliability of the software and systems behind EdgeCast Networks.


This is an unusual job, unlike others in the industry. Like traditional operations groups, we keep important, revenue-critical systems up and running despite hurricanes, bandwidth outages, and configuration problems. Unlike traditional operations groups, we also have full access to and authority to fix, extend, and scale the code to keep it working and harden it against all the vagaries of the Internet. We hire people from both systems and software backgrounds. Strong candidates will have experience with both.


In SRE, we flip between the fine-grained detail of disk driver I/O scheduling and the big picture of continental-level service capacity, across a range of systems and a user population measured in billions. We own those products in production. We drive reliability and performance at massive scale by mastering the full depth of the stack. We learn something new every day, usually something surprising, that has the potential to transform the lives of billions of our users around the world.


As a Site Reliability Engineer, you will work on large-scale system design and troubleshooting, and be fluent in systems programming and/or automation. You will have a desire to tackle the complex problems of scale. Familiarity with running production environments at scale is crucial in this job, along with an in-depth understanding of Unix system internals and networking.

The SRE is always on call to keep our networks up and running, ensuring our users have the best and fastest experience possible.


Responsibilities

  • Design, write and deliver software to improve the availability, scalability, latency, and reliability of Edge services.

  • Solve problems relating to mission-critical services and build automation to prevent problem recurrence, with the goal of automating the response to all non-exceptional service conditions.

  • Influence and create new designs, architectures, standards and methods for large-scale distributed systems.

  • Engage in service capacity planning and demand forecasting, software performance analysis and system tuning.

  • Perform periodic on-call duties.

Qualifications

  • Strong knowledge of Linux systems
  • Solid understanding of networking concepts
  • Experience with C/C++
  • Solid scripting and automation skills (Python, Shell, Go, Perl, etc)
  • Configuration Management experience (SaltStack, Ansible, Chef, etc)
  • Experience with Continuous Integration tools (Jenkins, Buildbot, Travis, etc)
  • Experience troubleshooting large scale/high performance systems
  • Strong communication skills and enjoyment of working in a highly collaborative environment

Additional Information

EEO/AA Women, Minorities, Veterans, Individuals with Disabilities Employer: Oath offers a competitive salary and benefits package, including 401(k) match and performance bonus. All qualified applicants will receive consideration for employment without regard to, and will not be discriminated against based on, race, gender, color, religion, national origin, sexual orientation, gender identity, veteran status, disability or other protected category.



