Team Lead Site Reliability Engineer (SRE)

Salary: $250k - $500k

Location: Chicago
Job Type: Full Time
Job Category: Infrastructure

Description

We are working with a leading technology-driven trading firm to hire a Team Lead SRE to drive reliability, scalability, and performance across mission-critical systems. This role is ideal for engineers coming from Big Tech environments who are passionate about building resilient infrastructure and leading high-performing teams in a low-latency, high-availability setting.

Key Responsibilities

Lead and mentor a team of Site Reliability Engineers responsible for the uptime, performance, and scalability of production systems.
Define and implement SRE best practices, including SLIs, SLOs, error budgets, and incident management frameworks.
Own production reliability across trading and research platforms, ensuring systems operate with minimal latency and maximum availability.
Partner with software engineering, infrastructure, and trading teams to improve system design, observability, and operational excellence.
Drive automation initiatives to reduce toil, improve deployment pipelines, and enhance system self-healing capabilities.
Lead incident response, postmortems, and continuous improvement efforts across the platform.

Required Skills

Proven experience in a Site Reliability Engineering or Production Engineering role, ideally within a large-scale or Big Tech environment.
Strong programming skills in languages such as Python, Go, or Java.
Deep understanding of distributed systems, networking, and systems architecture.
Experience with observability tooling (e.g., Prometheus, Grafana, OpenTelemetry) and incident management practices.
Hands-on experience with cloud platforms (AWS, GCP, or Azure) and container orchestration (Kubernetes).
Track record of leading teams or mentoring engineers in high-performance environments.
Strong troubleshooting skills and the ability to operate effectively under pressure.

Preferred Qualifications

Background in low-latency systems and high-performance environments.
Experience with CI/CD systems, infrastructure as code, and large-scale production environments.
Familiarity with Linux internals, networking protocols, and performance tuning.
Prior experience managing on-call rotations and improving operational maturity.

