Principal Networking Engineer - QoS / Networking

Advanced Micro Devices, Inc.
$226,400.00/Yr.-$339,600.00/Yr.
United States, California, Santa Clara
2485 Augustine Drive
Jan 31, 2026


WHAT YOU DO AT AMD CHANGES EVERYTHING

At AMD, our mission is to build great products that accelerate next-generation computing experiences, from AI and data centers to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create something extraordinary. When you join AMD, you'll discover the real differentiator is our culture. We push the limits of innovation to solve the world's most important challenges, striving for execution excellence while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond. Together, we advance your career.

THE ROLE:

We are seeking a hands-on Principal Networking Engineer to own end-to-end QoS strategy and implementation across data center SmartNICs/DPUs. You will define traffic classification, shaping, scheduling, and congestion control policies spanning Top-of-Rack (ToR)/leaf/spine switches and host offload (SmartNIC/DPU), ensuring predictable performance for AI/ML, storage, and latency-sensitive services. The ideal candidate combines deep knowledge of L2/L3/L4 QoS, RDMA/RoCE, PFC/ETS/ECN, and switch silicon schedulers/queues, with practical experience deploying policies at fleet scale.

THE PERSON:

We are seeking an experienced Principal Networking Engineer to drive the continued development of existing and future software systems and products. The successful candidate will be responsible for ensuring the functionality, reliability, and performance of our software products while keeping an eye on future enabling and related technologies. The ideal candidate will have a strong background in software engineering, excellent technical skills, and strong communication skills.

KEY RESPONSIBILITIES:

  • Own QoS architecture across network tiers (host NIC/DPU), including classification, policing, shaping, queue mapping, and scheduling strategies for mixed workloads (AI collectives, storage, RPC, control plane).
  • Design and implement SmartNIC QoS: map DSCP/PCP to NIC traffic classes, configure hardware TX/RX queues, rate limiters, WFQ/DRR schedulers, and offload paths for RDMA/TCP/UDP.
  • Switch QoS policy design: configure PFC, ETS, ECN/RED/WRED, buffer pools, queue thresholds, shared vs. dedicated buffers, and congestion control across multiple ASICs (e.g., Broadcom, NVIDIA/Mellanox, Marvell).
  • RDMA/RoCE tuning end-to-end: lossless/loss-tolerant modes, CNP/ECN parameters, RNR/retry behavior, MTU/jumbo frames, and scalable multi-tenant profiles.
  • Performance engineering: build test plans and run micro/macro benchmarks (e.g., ib_send_lat/ib_write_bw, RCCL/NCCL, iperf, switch counters/telemetry) to validate latency, throughput, tail performance, and fairness.
  • Instrumentation & observability: define SLI/SLOs for QoS (tail latency, drops, PFC events, ECN marks, queue depth, buffer occupancy); integrate with streaming telemetry (gNMI/INT/sFlow) and develop dashboards and alerts.
  • Troubleshoot complex incidents: incast, PFC deadlocks, microbursts, head-of-line blocking, unfair scheduling, and noisy neighbors; lead root-cause analysis and corrective actions.
  • Scale & automation: deliver declarative QoS via intent-based configs and CI/CD (e.g., Ansible/Salt, NAPALM, gNMI/gNOI, NETCONF/YANG), including pre-deployment simulation and automated canary/rollback.
  • Documentation & standards: author design docs, runbooks, and guidance for tenant teams; contribute to internal standards and vendor requirements.

MINIMUM QUALIFICATIONS:

  • Strong experience in data center networking or systems engineering, with direct ownership of QoS on switches and/or SmartNICs/DPUs.
  • Deep knowledge of QoS mechanisms: classification/marking (DSCP/PCP), policing, shaping, queueing (PRIO, WRR/WFQ/DRR), scheduling hierarchies, and buffer management.
  • Hands-on with PFC, ETS, ECN/WRED, explicit buffer tuning, and RDMA/RoCE performance/correctness in production.
  • Experience configuring merchant switch silicon (e.g., Broadcom Trident/Tomahawk, NVIDIA Spectrum, Marvell Teralynx) via NOS CLIs/SDKs (e.g., SONiC, Cumulus, NX-OS, EOS, Onyx).
  • SmartNIC/DPU experience (e.g., NVIDIA BlueField, Intel IPU, AMD Pensando, Netronome/Agilio): queue configuration, rate limiting, hardware offloads, and host-NIC QoS mapping.
  • Proficiency with Linux networking (TC, qdisc, mqprio, XDP/eBPF), ethtool, RDMA tools (perftest, rdma-core utilities), and packet/flow analysis (tcpdump, Wireshark, INT/sFlow).
  • Strong automation skills: Python and/or Go for network automation, telemetry pipelines, and CI/CD integration; Git-based workflows.
  • Demonstrated ability to debug low-level performance issues (NIC queues, IRQ affinity, NUMA, PCIe/xGMI topology, driver/firmware interactions).
  • Excellent written/verbal communication; strong design documentation and cross-team leadership.

PREFERRED QUALIFICATIONS:

  • Large-scale operations experience (10K+ servers or multi-region fabrics) with QoS at fleet scale and multi-tenant isolation.
  • Practical experience with AI/ML workloads (RCCL/NCCL AllReduce, parameter servers, distributed training) and storage (NVMe-oF, NFS, SMB, object) QoS tradeoffs.
  • Experience with traffic engineering and congestion control in Clos fabrics; familiarity with INT, gNMI, in-band telemetry, and P4 concepts.
  • Contributions to SONiC, DPDK, eBPF/XDP, or OpenConfig; experience with YANG/NETCONF, gNOI.
  • Vendor engagement/bring-up: working with ASIC/NIC vendors on buffer models, scheduling algorithms, and firmware roadmaps.
  • Security awareness for multi-tenant environments (DSCP abuse, QoS starvation, control-plane protection, CoPP/ACL integration).

ACADEMIC CREDENTIALS:

Bachelor's degree in Computer Science, Computer Engineering, or related field; Master's preferred.

This role is not eligible for visa sponsorship.

Benefits offered are described in "AMD benefits at a glance."

AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants' needs under the respective laws throughout all stages of the recruitment and selection process.

AMD may use Artificial Intelligence to help screen, assess or select applicants for this position. AMD's "Responsible AI Policy" is available here.

This posting is for an existing vacancy.
