Site Reliability Engineer
Zoosk's Technical Operations team is looking for a seasoned Site Reliability Engineer to join our growing team. In this role you'll be a key member of the team that keeps us up and running, serving over 38 million active members in 80 countries. You should have a focus on automation, scaling, and monitoring.
We maintain a balance of self-hosted and cloud-based solutions. As a member of the team, you have a direct impact on design and feature enhancements to keep our platform running smoothly.
At Zoosk we practice DevOps. We push code multiple times per day. We make changes during the day, not in the middle of the night. We measure everything. We celebrate our wins. We partner with the leading vendors for support. We promote a culture of growth. We are proud of our collaborative environment, and knowledge sharing is both encouraged and appreciated.
We want to hear your ideas!
- Solid understanding of Linux system administration, including configuration, troubleshooting, automation, and security.
- Experience working in a high-capacity, fault-tolerant, and horizontally scalable environment.
- Experience with load balancer technologies and best practices, such as F5, NetScaler, and SteelApp.
- Experience with at least one scripting language (Python, Perl, Ruby), as well as shell scripting.
- Experience with common monitoring tools such as Nagios, Ganglia, Cacti, Splunk, New Relic.
- Familiarity with building hosts (kickstart, PXE boot) and configuration management systems (CFEngine, Puppet, Chef).
- Experience in a 24x7 on-call rotation. (Our pagers are not blowing up!)
- Work closely with platform and other engineering teams to level-set expectations for projects.
- Establish best practices for OS tuning and hardware management, which will allow us to scale efficiently and securely with optimal hardware requirements.
- Manage backup and restore, maintain run books, and plan for disaster recovery.
- Analyze and tune the performance of services.
- Write tools to help automate and conduct functional tests of infrastructure.
- Familiarity with MySQL replication, configuration, and backup strategies.
- Familiarity with TCP/IP networking and routing for small and large networks.
- Experience handling critical production systems under structured change management guidelines (publishing tickets and design and implementation documents, followed by run books that the rest of the Ops team can use to triage production issues).
- Experience with MySQL High Availability solutions.
- Experience managing instances in an AWS environment.
- Experience with implementing active-active topologies across data centers.
- Strong understanding of backup solutions.
Meet Some of Zoosk's Employees
Lead Engineer, UI Development
Sue Anna serves as a liaison between the Design and Engineering teams. She takes designs from Zoosk’s creatives and translates them into streamlined presentation markup and elegant code.