TIP: Site Reliability Engineer
- Flexible / Remote
Upwork ($UPWK) is the leading tech solution for companies looking to hire the best talent, maintain flexibility, and get more done. We’re passionate about our mission to create economic opportunities so people have better lives. Every year, more than $2 billion of work is done through Upwork by skilled professionals who want the freedom of working anytime, anywhere. Top companies connecting with extraordinary talent around the globe? Upwork is how.
This position is through Upwork’s Talent Innovation Program (TIP). Our TIP team is a global group of professionals who augment Upwork’s business, with members located all over the world.
Upwork is the largest freelance site in the world, with access to the most qualified freelancers, and our enterprise customers use it to source high-quality talent from all over the world rapidly and effortlessly. As part of our enterprise offering, we also provide compliance and onboarding tools and advanced reporting capabilities. We are seeking an experienced Site Reliability (DevOps) Engineer to support our main website. This position will focus on two areas:
1) Incident Response. You will help us improve our monitoring tools and automation to increase site reliability by identifying weaknesses and working with our development team to address them. You will also help us manage the handling of any type of incident impacting upwork.com, including coordination, communication, debugging, and remediation. This role participates in our on-call rotation during your daytime hours and on some weekends (about once every 3 weeks).
2) Project-oriented work and general support tickets, with a particular focus on assisting our developers. This includes more general sysadmin work (automation scripting, writing Chef code, using AWS services and tools, managing nginx load balancers, managing DNS, configuring our CDN, etc.) and helping debug code in collaboration with developers.
This is an opportunity to work on a major revenue-producing website with millions of users. In addition to making sure everything works, you are also expected to contribute to the continuous improvement of our environment.
This is a full-time position (~40 hours per week, Monday-Friday). Work hours are mainly in AMER time zones (PST).
Must-have requirements:
- Proficient in the following:
  - Linux System Administration
  - Expert in Python or Ruby
  - Knowledgeable in AWS (EC2, S3, ECS, VPC, ElasticSearch, Lambda)
  - Experience in Chef
  - IP networking
- Excellent verbal and written communication skills (English)
- Ability to size-up a situation, assess the effectiveness of various tactics/strategies, and make rapid decisions on appropriate courses of action
- Able to work full time, ~40 hours per week, Monday-Friday
- Flexibility for on-call shifts, once or twice a month (weekdays and some weekends)
Nice-to-have items:
- Grafana/Prometheus/Atlas/Icinga monitoring tools
- Development experience
- Docker and Kubernetes
- CI/CD pipeline experience
Upwork is proudly committed to fostering a diverse and inclusive workforce. We never discriminate based on race, religion, color, national origin, gender (including pregnancy, childbirth, or related medical condition), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics.