Network.AI Engineer

    • Menlo Park, CA

Facebook's mission is to give people the power to build community and bring the world closer together. Through our family of apps and services, we're building a different kind of company that connects billions of people around the world, gives them ways to share what matters most to them, and helps bring people closer together. Whether we're creating new products or helping a small business expand its reach, people at Facebook are builders at heart. Our global teams are constantly iterating, solving problems, and working together to empower people around the world to build community and connect in meaningful ways. Together, we can help people build stronger communities - we're just getting started.

The Network.AI group is a new team within Facebook Infrastructure. The group's charter spans the design and operation of the AI networking infrastructure, including the network switches and the host-side systems, as well as forward-looking projects such as transport evolution. Network Engineers at Facebook are hybrid software/network engineers who design, build and operate our worldwide data center network. This team owns the complete lifecycle of the AI network in the data center, from planning, design, product definition and QA to deployment and monitoring. Simple and scalable network design, automation and data analytics are the keys to meeting our demands. In this role, you will be responsible for conceiving, developing and deploying network software, systems and tools that keep the AI data center network operating at maximum reliability, scalability and efficiency.

Do you like developing innovative solutions to some of the most complex scaling and reliability challenges out there? Do you want to build and operate the hyper-scale data center network that powers the world's largest social network? Do you want to ship code in production that positively impacts the experience of billions of users worldwide? Then this is the role for you.

  • (Re)design, deploy, manage and maintain the Facebook data center networks for AI infrastructure worldwide
  • Develop software that improves the reliability, efficiency and velocity of building and operating the AI data center network
  • Participate in the network on-call rotation and be an escalation contact for site events. Analyze data and identify the root cause of network issues. Build monitoring systems and software robots that can debug and remediate network issues at scale
  • Test new network platforms before they are deployed in production
  • Build automation that improves the safety and reliability of our network software CI/CD pipeline
  • Partner with the best engineers in the industry on the coolest stuff around - the code and systems you work on will be in production and used by billions of users all around the world
  • 2+ years of experience in one or more higher-level programming languages (Python, C, C++, Go, etc.)
  • Understanding of TCP/IP
  • 7+ years of experience with RoCE, InfiniBand, RDMA - understanding of typical configurations and performance
  • 7+ years of experience in configuration and maintenance of network devices and network management systems (NMS), or applications such as web servers, load balancers, relational databases, storage systems and messaging systems
  • Experience in developing and understanding network device configuration for at least one vendor (Arista, Juniper, Cisco, Brocade, Ciena, Infinera, Linux, etc.)
  • Experience in understanding and mitigating network hardware and topology failures
  • BS or MS in Computer Science or Computer Engineering or Electrical Engineering
  • Experience in a service provider or hyper-scale network in engineering or design roles
  • Knowledge of TCP/IP congestion control algorithms (DCTCP, CUBIC)
  • Knowledge of Network QoS and Scheduling algorithms (WRR/SP)
  • Understanding of the internals of router/switch hardware, NPUs/data planes and optics
  • Understanding of the design principles and troubleshooting of distributed systems
Facebook is proud to be an Equal Opportunity and Affirmative Action employer. We do not discriminate based upon race, religion, color, national origin, sex (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender, gender identity, gender expression, transgender status, sexual stereotypes, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics. We also consider qualified applicants with criminal histories, consistent with applicable federal, state and local law.

Facebook is committed to providing reasonable accommodations for candidates with disabilities in our recruiting process. If you need any assistance or accommodations due to a disability, please let us know at accommodations-ext@fb.com.

