Network Site Reliability Engineer

Requirements:

  • BS degree in Computer Science, Electrical Engineering, or a related technical field, or equivalent experience.
  • Minimum of 8 years of industry experience in network site reliability engineering, network automation, network operations, or related areas, including experience with both campus and data center networks.
  • Familiarity with network management tools such as Prometheus, Grafana, Alertmanager, Nautobot/NetBox, and BigPanda.
  • Expertise in automating networks using frameworks such as Salt, Ansible, or similar.
  • In-depth experience in one or more of the following: Python, Go.
  • Knowledge of network technologies such as TCP/UDP, IPv4/IPv6, wireless, BGP, VPN, L2 switching, firewalls, load balancers, EVPN, VXLAN, and Segment Routing, with a proven track record in network operations.
  • Proficiency with ServiceNow and Jira.
  • Knowledge of Linux system fundamentals is a plus.
  • Systematic problem-solving approach, coupled with excellent communication skills and a sense of ownership and drive.

What you'll be doing:

  • Owning the operational aspect of the network infrastructure, ensuring its high availability and reliability.
  • Partnering with architecture and deployment teams to guarantee that new implementations are supportable and align with production standards.
  • Advocating for and implementing automation to reduce toil and enhance operational efficiency.
  • Monitoring network performance, identifying areas for improvement, and coordinating with relevant teams to execute enhancements.
  • Collaborating with SMEs to resolve production issues swiftly and effectively, maintaining customer satisfaction.
  • Identifying opportunities for operational improvements and partnering with teams to develop solutions that drive excellence and sustainability in network operations.

Nice to haves:

  • Track record of using operational signals from sources such as SNMP, syslog, and streaming telemetry to solve operational challenges.
  • History of debugging and optimizing code, and of automating routine tasks.
  • Experience with Mellanox/Cumulus Linux, Palo Alto firewalls, and NetScaler and F5 load balancers.
  • Previous SRE experience.

Perks and benefits:

  • NVIDIA is widely considered to be one of the technology world’s most desirable employers.
  • We have some of the most forward-thinking and hardworking people in the world working for us.
  • If you're creative and enjoy having fun, then what are you waiting for? Apply today!
Experience: Senior
Posted: December 12, 2025
Tags: Golang, Python, sitereliability