Develop incident alerts and observability automation, conduct analysis, create health metrics, lead investigations, and provide advisory support. Automate processes such as system and network log analysis to reassemble and replay incident event history for root cause analysis and impact cost assessment
Design and conduct tabletop exercises to ensure organizational readiness for the disaster recovery and business continuity program
Establish processes, build a playbook catalog, and implement a strategy for operational responses to incidents that protects our customers and Squarespace
Manage and contribute to efforts to build the next-generation Metrics Platform in the cloud
Build and refine our observability tools that support hundreds of engineers every day
Refine the Incident Commander processes and Incident Management training
Who We're Looking For
BS in Computer Science or Engineering, or equivalent professional experience
8+ years of demonstrated experience as an engineer
Proficiency in at least one general-purpose programming or scripting language (e.g., Golang)
In-depth technical understanding to assess incident risk and significance across the broader tech ecosystem
Availability to participate in a regular on-call rotation
Benefits & Perks
Health insurance with 100% covered premiums for you and your dependent children
Fertility and adoption benefits
Headspace mindfulness app subscription
Global Employee Assistance Program
Pension benefits with employer match
Flexible paid time off
Up to 26 weeks of full pay for birth parent leave and up to 20 weeks of full pay for non-birth parent leave
12 weeks of pay to care for an ill family member
Education reimbursement
Employee donation match to community organizations
6 Global Employee Resource Groups (ERGs)
Free lunch and snacks
Close proximity to cultural landmarks such as Dublin Castle and St. Patrick's Cathedral