Job Description

Introducing Masabi

At Masabi, we’re driving the fare payment revolution, powering the journeys of millions all over the world. We build fare collection platforms that allow riders to seamlessly buy and present tickets for public transport either on their mobile phones, from a ticket machine, or even by tapping their bank card to travel.

Our Justride platform is used in over 250 locations globally, including some of the largest cities in the world. With our industry-first mobile ticketing SDK, we’ve partnered with large players in the transport space, including Uber, Moovit and Transit.

Your own journey is important to us too. Choosing a role here means joining a network of innovators from all walks of life; a group of passionate individuals who consistently deliver. Here, you’ll find the tools you need to build the career you want. Whether you’re taking the direct route or trying a new path, we’ll support you no matter what.

Role Description

Join our Site Reliability Engineering team at Masabi as we embark on a transformative journey in fare collection technology. As an SRE, you will be at the forefront of ensuring our platform's reliability, performance, and security.

This role is pivotal in shaping the future of our infrastructure, offering an exciting opportunity to collaborate with a dynamic team that bridges the gap between development and operations, delivering resilient service in a rapidly evolving industry.

Responsibilities

Automation and Scalability: Drive automation to reduce operational overhead and human error. Build CI/CD pipelines, develop Infrastructure as Code (IaC) using tools like Terraform and CloudFormation, and design scalable systems to handle high traffic while optimizing resource utilization.
Continuous Improvement: Refine processes, tools, and workflows to enhance system reliability, scalability, and efficiency. Plan capacity to anticipate future needs and support high-performance systems.
Security and Compliance: Ensure infrastructure meets organizational security standards and supports compliance frameworks like SOC 2 and PCI.
Monitoring and Reliability: Maintain real-time monitoring systems aligned with SLIs and SLOs, ensuring uptime and performance meet or exceed SLAs. Set up proactive alerting mechanisms to address issues before they escalate.
Cost Optimization: Monitor and optimize cloud infrastructure costs through autoscaling, rightsizing, and architectural reviews to balance cost-effectiveness with reliability.
Disaster Recovery and Redundancy: Implement failover strategies, disaster recovery plans, and redundancy to ensure system resilience under all conditions.
Incident Management: Respond to production incidents, minimize downtime, and restore availability. Perform root cause analysis, implement preventive measures, and contribute to post-incident reviews to share lessons learned.
Collaboration and Mentorship: Partner with developers to design reliable, maintainable systems. Coach teams on best practices for reliability, scalability, and observability, fostering a culture of ownership.
Documentation and Knowledge Sharing: Maintain detailed documentation for infrastructure, incident response, and workflows. Develop playbooks and runbooks to ensure seamless knowledge transfer.

Our platform is JVM-based and cloud-native, hosted on AWS. We use standard tooling, including GitLab, Terraform, CloudFormation, Puppet, Kibana, Grafana and Confluent Cloud.

Key Tools and Technologies SREs Work With

Monitoring: Grafana, Prometheus, CloudWatch, Pingdom
CI/CD: GitLab CI, Rundeck
IaC: Terraform, CloudFormation
Cloud Platforms: AWS
Logging: Kibana, CloudWatch

About You

Significant experience in SRE or related roles, with a proven track record in building and maintaining reliable systems.
Expertise in AWS Cloud technologies.
Hands-on experience with Terraform and Grafana, along with strong knowledge of security principles and networking components.
Hands-on experience with EKS and ECS is essential.
Experience in building pipelines and robust CI/CD infrastructure.
A collaborative team player who approaches projects with an open mind and prioritizes security.
Passionate about leveraging technology to drive advancements while ensuring reliability and security.
Excellent communication skills, a collaborative mindset, and a willingness to learn and contribute to team success.
Self-sufficient and capable of working independently, while also knowing when to seek support or input.

Desirable

Familiarity with PCI DSS v4 compliance requirements.
AWS Cloud certification.

Careers at Masabi are for people who are going places: people who are moved by our mission to improve accessibility and make fares fair for everyone. We are grateful to be a network of innovators from all walks of life; a group of passionate individuals who consistently deliver. We operate with openness: we celebrate multiple approaches and points of view and strive to create an environment where everyone feels empowered to bring their whole, authentic selves to work.

Whoever you are, just be yourself.

We encourage people from underrepresented backgrounds to apply; we don’t discriminate. Also, please notify our team of your pronouns at any point in your application. We believe in journeys made simple. Excursions made effortless. So, we cancel out confusion and leverage our collective expertise to support transit agencies and make life better for millions of riders — together, we are creating a future.

We’re already powering journeys - are you ready to join us?

Jobs

Site Reliability Engineer

Location

India
