Principal Site Reliability Engineer


United Kingdom
Base Salary
98k-115k GBP

Job Description

At a Glance

Us: Fast-growing startup of 100+ people. Remote team, mainly based in the UK. YC alumni (Summer 2019) with $32m in Series A funding raised in 2023. Our mission is to revolutionise how the world learns about people, so people can revolutionise the world. 🚀

You: An enabling leader passionate about driving platform-first initiatives to ensure the scalability, reliability, and performance of our platform.

Salary: £98,000 - £115,000 + bonus + share options

The Role

We are looking for a Principal Site Reliability Engineer to lead Site Reliability at Prolific, focusing on advancing the resilience and scalability of our GCP and AWS environments. You will play a pivotal role in overseeing and enhancing our Kubernetes clusters in GCP, which support our Django application, and in driving the SRE strategic transition to AWS, particularly towards serverless and event-driven architecture.

What you'll be doing

  • Provide strategic oversight of continuous monitoring, maintenance, and optimisation processes for our Django application, ensuring the highest levels of performance and reliability.
  • Lead the evolution of our cloud Kubernetes estate, focusing on advanced security, reliability, and observability strategies.
  • Spearhead infrastructure optimisations and architectural improvements in collaboration with cross-functional teams, addressing complex challenges and ensuring scalability.
  • Promote knowledge sharing and reduce silos across teams to strengthen resilience and reduce dependency on key individuals, increasing our Bus Factor.
  • Drive hands-on coding and system design improvements, with a focus on Python/Django, to optimise system performance and efficiency.
  • Develop comprehensive documentation and training programs to elevate the operability skills of the engineering team and foster a culture of continuous learning.
  • Support our Service Delivery response strategies by being part of an out-of-hours support rota and collaborating with our Service Delivery Lead to enhance overall service quality.
  • Lead security initiatives, addressing emerging threats, ensuring robust compliance, and setting best practices for the organisation.

What you’ll bring

  • Extensive experience as a Site Reliability Engineer / Platform Engineer, with proven staff-plus leadership in managing a large-scale enterprise Kubernetes platform in GCP.
  • Deep expertise in security, compliance, and cloud architecture best practices.
  • A track record of implementing observability-first approaches and familiarity with tools like Datadog.
  • Experience in leading out-of-hours incident management and on-call rotations.
  • Demonstrated ability to mentor teams, lead strategic initiatives, and drive significant technology transformations.
  • Certification in any of the following would be an advantage but is not required:
    • GCP Professional Cloud Architect
    • GCP Professional Security Engineer
    • GCP Professional Networking Engineer
    • GCP DevOps Engineer
    • CKA (Certified Kubernetes Administrator)

What you’ll get

⚖️ Work-Life Balance: We’re all looking to strike the right work-life balance, and as a remote-first company you’re able to work flexibly from home or from our dog-friendly co-working spaces in Shoreditch and Manchester. We also offer 25 days of holiday, plus bank holidays of course, which you can switch with any day of your choosing.

🏡 Family Life: We offer generous maternity, paternity and shared parental leave. Need to pick your child up from school? No problem. Our flexible working gives you the childcare flexibility you need.

💰 Pension: We offer a salary sacrifice pension with a 3% starting employer contribution.

📈 Share Options: An exciting option to buy shares of Prolific in the future.

🧘 Wellbeing: We care deeply about our employees' wellbeing, which is why we offer comprehensive Bupa Medicash private health insurance that disregards medical history. You also receive a taxable monthly stipend of £150 to improve your wellness and remote experience. We want you to have a happy and healthy environment, so we offer a £1000 home office budget and an Apple MacBook when you start, plus a £200 yearly top-up.

⚰️ Death-in-service: We offer a death-in-service benefit that would pay out 4x your annual salary.

📚 Learn & Grow: Development is important to us, and we want to give all our employees the opportunity to learn. There are many personal growth and career progression opportunities available, as well as mentoring. We also offer a £1000 yearly budget for education, growth and training for you to use at your discretion.

💙 Culture: We’re a friendly bunch here at Prolific: open, transparent and inclusive. Although we’re a remote-first company, we still love to hang out with each other! We run collaborative quarterly company-wide meet-ups and team socials (both virtually and in person), all paid for. Alongside this we offer a £1000 yearly budget for discretionary meet-ups so you can cover travel, food and accommodation. As a business we’re also committed to carbon offsetting; each month we donate money in your name to plant trees 🌳, and being remote we’re doing our bit to offset travel too.

Our Interview Process

Talent Call: You'll meet with one of our Talent team and have an exploratory call about the role requirements, life at Prolific, as well as your background and aspirations.

Hiring Manager Interview: You'll interview with two members of the team, one of whom will be the hiring manager. You'll have the opportunity to ask about the company and the role, and we'll ask you questions about your experiences and goals.

Panel Interview: We'll hold a panel interview that evaluates the skills required for the role. You'll meet with more of our team and may be asked to complete a presentation or task. You'll be compensated with a £50 voucher 💰 for completing the task because we know your time is valuable!

Final Interview: We will deep dive into your past experiences, goals, motivations, and skills, all aligned to our Prolific Principles. You'll speak with two to three members of the team and - as always - have an opportunity to ask questions about the role and company.

Diversity, Equity and Inclusion Monitoring

Prolific is an equal opportunity employer. We celebrate diversity and are committed to fostering diversity, equity and inclusion in the workplace. We welcome all applications, and consider them without regard to race, religion, belief, age, gender, gender expression, gender identity, gender reassignment, disability, marriage or civil partnership status, pregnancy or maternity, sex or sexual orientation.

We are committed to ensuring a fair recruitment process; it's essential to our success. Under the Equality Act (2010) we collect information from individuals at the point of application. We anonymously monitor the profiles of individuals who apply to each vacancy to ensure that no individual is unfairly discriminated against or disadvantaged.

Privacy Statement

By submitting your application, you agree that Prolific may collect your personal data for recruiting and global organisation planning. Prolific's Candidate Privacy Notice explains what personal information Prolific may process, where Prolific may process your personal information, its purposes for processing your personal information, and the rights you can exercise over Prolific’s use of your personal information.

Advice from our career coach

As a Principal Site Reliability Engineer at this fast-growing startup, you'll play a key role in advancing the resilience and scalability of the platform. To stand out as an applicant, showcase your experience in managing large-scale enterprise Kubernetes platforms and implementing observability-first approaches. Here are some tips to help you shine:

  • Highlight your experience in managing Kubernetes clusters in GCP and driving AWS transitions.
  • Showcase your expertise in security, compliance, and cloud architecture best practices.
  • Demonstrate your ability to lead strategic initiatives and mentor teams.
  • If applicable, mention any relevant certifications like GCP Professional Cloud Architect or CKA.
  • Emphasize your experience in incident management and on-call rotations.
