Service Reliability Engineer - Realtime API
Service Reliability Engineer (Realtime API) - iovation
"We are a very client-centric organization. The client comes first in everything we do. That's why we hire smart, talented people with a passion for developing the best tools to stop fraudsters, while saving our clients time and money." - Greg Pierson, Sr. VP iovation
What we'll bring:
- A welcoming and energetic environment that encourages collaboration and innovation. We consistently explore new technologies and tools to be agile.
- Flexible time off, workplace flexibility, and an environment that welcomes continued professional growth through support of tuition reimbursement, conferences and seminars.
- Our culture encourages our people to hone current skills and build new capabilities, while discovering their genius.
- A collaborative environment that values the pursuit of excellence, acting with integrity, being innovative, active participation, building partnerships and taking pride in the work being done.
What you'll bring:
- At least 3 years of development and/or operations experience, including demonstrated experience in a SaaS environment.
- Demonstrated practical problem solving, communication and documentation skills
- Proven experience performing root cause analysis and engaging other stakeholders when appropriate
- Experience operating SQL and NoSQL data stores in a high throughput, high availability, low latency environment.
- Experience with systems management tools such as Puppet or Chef and concepts of configuration management in a large-scale environment
- Practical knowledge of at least one scripting language (Bash, Ruby, Python, Perl)
- Ability to assess tradeoffs and make decisions collaboratively in a cross-functional team
- Comfortable with the event-driven nature of our work
We'd love to see:
- Experience with capacity planning practices or methodologies
- Familiarity with Java debugging tools
- Familiarity with operating software based on Spring Cloud and/or associated components
- Experience operating Apache Cassandra in a production environment
- Experience operating Elasticsearch, Lucene, Solr or Katta in a production environment
- Experience operating containers (Docker, rkt) in a production setting
- Experience using Kubernetes, Marathon, Docker Swarm or another container orchestration platform
- Experience troubleshooting and/or building systems in the JVM
Impact you'll make:
- Participate in cross-functional feature teams to bring an operational perspective during the development cycle, in the form of:
  - Non-functional requirements; developing and evangelizing best practices for commonly needed design patterns
  - Driving deployment automation towards repeatable, consistent, and safe outcomes
  - Driving down systemic risk through failure mode analysis, instrumentation, and mitigation
  - Driving testing of reliability and scalability
- Provide operational support for highly critical real-time APIs to meet SLOs and SLAs
- Troubleshoot issues with our systems at all levels of the stack by performing deep problem analysis to identify root cause and appropriate resolution
- Work to reduce manual toil through automation and development of standardized practices for managing systems and services
- Take part in a 24x7 on-call rotation (current on-call rotation is once every 9 weeks)
- Perform routine care and feeding tasks of the production environment including deployments, data corrections, and upgrades
Tools we use:
- Java, Groovy, Ruby, Python, Perl, Git, Go
- Cassandra, Elasticsearch, Postgres, Redis, ActiveMQ, MySQL
- Puppet, Rundeck, Docker, Kubernetes, CentOS
- Sensu, Collectd, Graphite, JMX
- JIRA, Confluence, BitBucket
We are an equal opportunity employer and all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, age, disability status, veteran status, marital status, citizenship status, sexual orientation, gender identity or any other characteristic protected by law.