Requirements:
- 8+ years of experience working with Linux (RHEL/CentOS/Rocky preferred) in a large, complex, or niche environment
- Deep knowledge of server architecture: HP, Supermicro, Dell, and various overclocked servers
- Low-latency network interfaces and kernel bypass (configuration and optimization): Solarflare with Onload, Mellanox with VMA
- System hardware/OS tuning and performance troubleshooting, with an understanding of CPU architectures
- Experience with build and configuration management tools, specifically Chef or Ansible
- Experience with observability tools, specifically Grafana and Prometheus
- Highly motivated, with a keen eye for scripting and automation in Python, Ruby, and Bash
- Deep knowledge of, and experience with, server network stack configuration, tuning, and troubleshooting
- Strong communication skills, both verbal and written
- Critical thinking and problem-solving skills
- Well-organized, proactive, resourceful, and accountable, with an ownership mindset
- Good understanding of trading venues such as Nasdaq, LSE, and Euronext
- Degree in Engineering or Computer Science, or related experience
- Well-rounded understanding of network architectures
Nice to Have:
- Kernel development, with the ability to modify stock kernels to add custom features
- Trade flow analysis: understanding of the protocols used by various exchanges
- Experience with configuration management tools, e.g. Ansible, Chef, and Terraform
- Familiarity with different network switch vendors and different switch architectures
- Experience working with FPGA-based applications and L1 network design
What You'll Be Doing:
- Manage systems efficiently at scale through standardization, automation, testing, and in-depth monitoring
- Enforce development standards for source control, testing, and continuous integration for infrastructure, OS, patches, and configuration management
- Manage a distributed compute environment and multiple petabyte-scale storage systems
- Install, manage, and monitor the Linux operating system (RHEL-based)
- Troubleshoot complex hardware and software issues throughout the Squarepoint technology stack
- Create self-healing systems and automated recovery processes
- Respond to system incidents and participate in on-call rotations
- Conduct root cause analysis of incidents and outages
- Reduce operational toil through the development of user-driven automated workflows
- Work with business owners to regularly re-prioritize the book of work while delivering both tactical and long-term objectives
Perks and Benefits:
The minimum base salary for this role is $120,000 if located in New York. This expectation is based on available information at the time of posting. This role may be eligible for discretionary bonuses, which could constitute a significant portion of total compensation. This role may also be eligible for benefits, such as health, dental, and other wellness plans, as well as 401(k) contributions. Successful candidates’ compensation and benefits will be determined in consideration of various factors.