At Fannie Mae, the inspiring work we do helps make a home a possibility for millions of homeowners and renters. Every day offers compelling opportunities to impact the future of the housing industry while being part of a collaborative team thriving in an energizing environment. Here, you will grow your career and help create access to affordable housing finance.
Job Description
THE IMPACT YOU WILL MAKE
As a Lead Cloud Operations Engineer, you will play a pivotal role in enhancing the resilience, efficiency, and performance of our AWS-hosted applications. With our cloud adoption complete, the focus now shifts to optimizing our systems for scalability, observability, and cost-effectiveness. You will lead key initiatives, mentor engineers, and collaborate across teams to ensure our cloud infrastructure is robust and future-ready.
Key Responsibilities
- Cloud Maturity & Optimization: Partner across teams to elevate cloud operations maturity, focusing on improving resiliency, observability, performance, and cost-effectiveness of AWS-based systems.
- Monitoring, Incident Response & Operational Excellence: Utilize observability platforms (e.g., CloudWatch, Splunk, Dynatrace, OpenTelemetry) to lead incident triage, root cause analysis, and continuous improvement efforts.
- Infrastructure & Network Insight: Maintain comprehensive knowledge of application architecture, including firewalls, load balancers, DNS (Route53), WAF, and Layer 3/4 network components to ensure secure and efficient system operations.
- Collaboration & Stakeholder Engagement: Partner with engineering, architecture, and product teams to influence infrastructure roadmaps, support enterprise-wide changes, and ensure alignment with business goals.
- Resilience & Disaster Recovery: Collaborate with cross-functional teams to lead disaster recovery planning and execution, ensuring critical systems remain highly available and resilient.
Lead Expectations
- Operational Leadership: Serve as the escalation point for critical incidents, leading resolution efforts and post-incident reviews. Ensure clear communication with stakeholders and vendors during high-impact events.
- Strategic Influence: Collaborate with engineering and application teams to shape infrastructure roadmaps and align operational goals with organizational priorities.
- Mentorship & Knowledge Sharing: Coach and support engineers through training and hands-on guidance. Foster a culture of continuous learning and shared ownership.
- Change Management: Evaluate and provide guidance on risks associated with enterprise-wide cloud changes, ensuring implementations are resilient, minimally disruptive, and aligned with compliance and governance frameworks.
- Process Improvement & Best Practices: Drive the development and adoption of operational best practices, including automation, monitoring, and incident response frameworks. Lead initiatives to improve efficiency, reduce risk, and enhance system resilience.
Required Experience
- 4 years of experience in hands-on incident management in 24x7 production environments, including on-call responsibilities.
- 4 years of experience in cloud operations with a focus on maturing cloud environments and driving operational excellence.
- Strong experience in 24x7 operational environments, including on-call rotations.
- Proven ability to lead cross-functional initiatives and influence technical roadmaps.
Desired Experience
- Bachelor's degree or equivalent
- AWS certification (Solutions Architect, SysOps Administrator, or DevOps Engineer); Azure certification is a plus.
Technical Skills
- Advanced knowledge of AWS services (EC2, ECS, Lambda, EB, EMR, Glue, RedShift, IAM, CloudTrail, CloudFormation, CloudWatch, VPC, CloudFront, ELB, RDS, SNS/SQS, S3, EFS); working knowledge of Azure services and multi-cloud environments.
- Advanced scripting skills in Python and Bash; experience with AWS SDKs (e.g., boto3) for automation and custom tooling.
- Hands-on experience with CI/CD pipelines using tools such as Jenkins, GitLab CI, AWS CodePipeline, and GitHub Actions; strong understanding of release automation and deployment strategies.
- Proficient in Terraform, AWS CloudFormation, and CDK for infrastructure provisioning and management across multiple environments.
- Familiarity with IAM policies, KMS, Secrets Manager, and AWS Config.
- In-depth knowledge of enterprise networking concepts including VPC design, subnets, NAT gateways, VPNs, Direct Connect, firewalls, WAF, Route53, DNS, and Layer 3/4 appliances; familiarity with Zero Trust and network segmentation principles.
- Experience with Docker, Amazon ECS, and EKS (Kubernetes); understanding of container lifecycle management, service discovery, and scaling strategies.
- Experience with ITSM tools like ServiceNow and Jira; understanding of ITIL practices including incident, change, and problem management.
- Experience with observability platforms such as Splunk/SignalFX, Dynatrace, OpenTelemetry, and Grafana.
Qualifications
Education:
Bachelor's Level Degree (Required)
The future is what you make it to be. Discover compelling opportunities at Fanniemae.com/careers.
For most roles, employees are encouraged to work onsite on a regular basis at their designated office location. In-office work cadence is determined by your manager. Proximity within a reasonable commute to your designated office location is preferred unless the job is noted as open to remote.
Fannie Mae is an equal opportunity employer and considers qualified applicants for employment without regard to race, color, religion, sex, national origin, disability, age, sexual orientation, gender identity/gender expression, marital or parental status, or any other protected factor. Fannie Mae is committed to providing reasonable accommodations to qualified individuals with disabilities who are employees or applicants for employment, unless to do so would cause undue hardship to the company. If you need assistance using our online system and/or you need a reasonable accommodation related to the hiring/application process, please complete this form.
The hiring range for this role is set forth below. Final salaries will generally vary within that range based on factors that include, but are not limited to, skill set, depth of experience, certifications, and other relevant qualifications. This position is eligible to participate in a Fannie Mae incentive program (subject to the terms of the program). As part of our comprehensive benefits package, Fannie Mae offers a broad range of Health, Life, Voluntary Lifestyle, and other benefits and perks that enhance an employee's physical, mental, emotional, and financial well-being. See more here.
Requisition compensation: 121000 to 158000