Site Reliability Engineer

Company Description

Illumio, the leader in micro-segmentation, prevents the spread of cyber threats inside data centers and cloud environments. Enterprises such as Morgan Stanley, BNP Paribas, Salesforce, Workday, and Oracle NetSuite use Illumio to reduce cyber risk and achieve regulatory compliance. Illumio’s Adaptive Security Platform™ uniquely protects critical information with real-time application dependency mapping and micro-segmentation that works in any data center, public cloud, or across hybrid deployments on bare-metal, virtualization, and containers. For more information about Illumio, follow us @Illumio.

Job Description

The SRE team is growing and partners closely with Engineering, Support, and IT. We are responsible for the design, deployment, and continuous operation of the Illumio ASP cloud ecosystem. We need your help to take our existing platform to the next level with CI/CD, automated diagnostics, scaling, and healing, and more.

What you'll do:

  • Work on a team responsible for a blend of architecture, automation, development, and application administration
  • Develop and deploy solutions across the infrastructure, network, and application layers on public cloud platforms
  • Exercise new product features before they're delivered to our customers (we dogfood heavily)
  • Ensure our SaaS platform is available and performing well, and that we notice problems before our customers do
  • Build tools that improve the speed, confidence, and visibility of our SaaS deployments
  • Help build security into every step of the software & infrastructure life cycle
  • Collaborate with Support and Engineering on customer issues, as needed
  • Participate in a periodic on-call rotation (Workload sustainability is important - we don't want anyone burning out.)
  • If you're Seattle-based:
    • Travel to our headquarters in Sunnyvale, CA periodically, and as otherwise necessary, to work together face to face
    • Help build our presence in Seattle

You're a good fit if you:

  • Strive to solve traditional operations problems through automation
  • Enjoy learning new tools and languages
  • Enjoy a collaborative environment
  • Have strong attention to detail
  • Have a strong customer focus
  • Are willing to dig deep into infrastructure and code to solve problems


What we're looking for:

  • A bachelor's degree in Computer Science, Engineering, or MIS, or experience in software engineering or a related field
  • A DevOps mentality
  • An enthusiastic self-starter with a commitment to learning, customer empathy, and team communication
  • Experience deploying, tuning, and maintaining Linux-based, highly available, fault-tolerant web platforms in public cloud providers such as AWS, Azure, and GCP
  • Familiarity with a modern programming language. Experience with, or a willingness to learn, Ruby, Python, and Linux shell scripting
  • Familiarity with common monitoring, log aggregation, and metrics gathering platforms (Icinga, Sensu, Splunk, Telegraf/InfluxDB, etc.)
  • Familiarity with common database systems such as MySQL, PostgreSQL, Redis, or similar
  • Familiarity with common configuration management & orchestration tools. Experience with, or a willingness to learn, Chef, Ansible, and AWS services & APIs

Encouraged but not required: 

  • Code samples
  • Experience speaking at industry conferences or meetups (Monitorama, SREcon, VelocityConf, DevOpsDays, etc.)

Additional Information

All your information will be kept confidential according to EEO guidelines.
