Lead Systems Reliability Engineer

Responsible for creating desired functionality for assigned content, products or services in customer-facing systems. Develops, tests and implements software that provides robust technical infrastructure and/or software applications used by business units.

ACCOUNTABILITIES

CREATES DESIRED FUNCTIONALITY FOR ASSIGNED DOMAIN, PRODUCTS OR SERVICES IN APPLICATION DEVELOPMENT

  • Supports multiple features and applications
  • Serves as subject matter expert within assigned domain, product or service
  • Develops applications and the underlying framework
  • Evaluates and reviews application designs
  • Implements, tests and delivers new features for multiple platforms
  • Coordinates with QA team to initiate testing and ensure testing is completed
  • Designs framework and software standards and recommends systems/software improvements


ADDENDUM

LEAD SOFTWARE ENGINEER - SYSTEMS RELIABILITY/MONITORING & OPERATIONS
  • Develop robust logging, monitoring, alerting and SOPs using tools like Splunk, Dynatrace, Sumo Logic and PagerDuty
  • Develop dashboards and tools that provide visualization, anomaly detection and predictive monitoring using an open-source stack such as ELK, Python and Node.js
  • Identify repetitive, manual tasks/issues and build automation to reduce toil and improve productivity and reliability 
  • Proficient in Java, Python, UI/JavaScript/Node.js
  • Proficient in multi-cloud management and in building and deploying systems in a cloud infrastructure (Google preferred)
  • Proficient with Windows/Linux/UNIX operating systems and container technologies such as Docker and Kubernetes
  • Lead production support, root cause analysis and blameless postmortems
  • Develop and implement SRE practices such as setting up SLIs/SLOs and error budgets, and partner with Engineering on improving the reliability and scalability of the application
  • Ability to design, analyze, and troubleshoot large-scale distributed systems and optimize code


QUALIFICATIONS

REQUIRED
  • Bachelor's Degree or equivalent in MIS, Computer Science or a related field
  • 5+ years of experience in software coding and development
  • Proficient in problem solving, databases, application servers, application design and testing
  • Experience with design patterns and performance tuning of large-scale web/enterprise applications
  • Experience working on Google Cloud Platform or an equivalent cloud platform
  • Strong communication skills


PREFERRED
  • Retail experience

