Senior DevOps/Site Reliability Engineer - iovation

    • Portland, OR

What We'll Bring:
At TransUnion, we have a welcoming and energetic environment that encourages collaboration and innovation - we're consistently exploring new technologies and tools to be agile. This environment gives our people the opportunity to hone current skills and build new capabilities, while discovering their genius.

Come be a part of our team - you'll work with great people, pioneering products and cutting-edge technology.

What You'll Bring:

  • At least 3 years of development operations and/or application development experience, including demonstrated experience in a SaaS environment
  • Proven experience performing root cause analysis and advanced troubleshooting
  • Experience with systems management tools such as Puppet, Chef, or Ansible and concepts of configuration management in a large-scale environment
  • Strong experience supporting customer-facing applications on a Linux platform
  • Practical knowledge of at least one scripting language (Bash, Ruby, Python, Perl)


We'd Love to See:
  • Experience operating SQL and NoSQL data stores in a high throughput, high availability, low latency environment
  • Experience with capacity planning practices or methodologies
  • Familiarity with Java debugging tools
  • Familiarity with operating software based on Spring Cloud and/or associated components
  • Experience operating Apache Cassandra, Elasticsearch, Lucene, Solr, or Katta in a production environment
  • Experience operating containers (Docker, rkt) in a production setting
  • Experience using Kubernetes, Marathon, Docker Swarm or another container orchestration platform
  • Experience troubleshooting and/or building systems in the JVM


Impact You'll Make:
  • Participate in cross-functional feature teams, bringing an operational perspective to the development cycle in the form of non-functional requirements; develop and evangelize best practices for commonly needed design patterns
  • Drive deployment automation towards repeatable, consistent, and safe outcomes
  • Drive down systemic risk through failure mode analysis, instrumentation, and mitigation
  • Drive testing of reliability and scalability
  • Provide operational support for highly critical real-time APIs to meet SLO and SLA targets
  • Troubleshoot issues with our systems at all levels of the stack by performing deep problem analysis to identify root cause and appropriate resolution
  • Work to reduce manual toil through automation and development of standardized practices for managing systems and services
  • Take part in a 24x7 on-call rotation (current on-call rotation is once every 9 weeks)
  • Perform routine care and feeding tasks of the production environment including deployments, data corrections, and upgrades
  • Tools we use: Java, Groovy, Ruby, Python, Perl, Git, Go, Cassandra, Elasticsearch, Postgres, Redis, ActiveMQ, MySQL, Puppet, Rundeck, Docker, Kubernetes, CentOS, Sensu, Collectd, Graphite, JMX, JIRA, Confluence, BitBucket


We are an equal opportunity employer and all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, age, disability status, veteran status, marital status, citizenship status, sexual orientation, gender identity or any other characteristic protected by law.

TransUnion's Internal Job Title:
Lead Engineer, Production Engineering

