Site Reliability Engineer

Location
Remote
Kbit

Job Description

Department: Development Team

Employment Type: Full Time

Location: Remote


Description

We're seeking a Site Reliability Engineer (SRE) to ensure the reliability, performance, and scalability of our high-frequency cryptocurrency trading systems. You'll focus on system health, performance monitoring, issue resolution, and process automation. We offer a competitive base salary with a quarterly cash profit share.

About You

  • Passionate about maintaining high-performance, mission-critical systems
  • Excited by the challenges of a 24/7 trading environment
  • Detail-oriented with a proactive approach to problem-solving
  • Comfortable with a flexible schedule, including evenings or weekends when necessary

Focus Areas

  • Performance Optimization: Implement advanced monitoring and profiling tools for Python and C++ codebases to identify and eliminate bottlenecks
  • Reliability Engineering:
    • Develop and maintain service level objectives (SLOs) for cryptocurrency trading operations
    • Implement circuit breakers and automated failover mechanisms
    • Design robust error budgets accounting for 24/7 cryptocurrency markets
  • Incident Management: Conduct thorough postmortems after trading anomalies or system issues, focusing on technical and financial impacts
  • Data Integrity: Ensure reliability and consistency of data flows between trading systems, databases, and analytics platforms
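
For candidates less familiar with the error-budget idea mentioned above, here is a minimal sketch of the arithmetic in Python. The 99.9% target and 30-day window are illustrative assumptions, not figures from this posting, and the `ErrorBudget` class is a hypothetical helper rather than part of any Kbit tooling. Because cryptocurrency markets trade around the clock, the sketch counts every minute of the window and excludes no maintenance hours.

```python
from dataclasses import dataclass

MINUTES_PER_DAY = 24 * 60


@dataclass(frozen=True)
class ErrorBudget:
    """Availability error budget over a rolling window (hypothetical helper)."""

    slo_target: float  # assumed availability target, e.g. 0.999
    window_days: int   # assumed rolling window; 24/7 markets mean no excluded hours

    @property
    def total_minutes(self) -> float:
        # Every minute in the window counts, since the market never closes.
        return self.window_days * MINUTES_PER_DAY

    @property
    def allowed_downtime_minutes(self) -> float:
        # The budget is the fraction of the window the SLO permits to be "bad".
        return self.total_minutes * (1.0 - self.slo_target)

    def remaining_minutes(self, downtime_minutes: float) -> float:
        # Negative values mean the budget has been overspent.
        return self.allowed_downtime_minutes - downtime_minutes

    def is_exhausted(self, downtime_minutes: float) -> bool:
        return self.remaining_minutes(downtime_minutes) <= 0


budget = ErrorBudget(slo_target=0.999, window_days=30)
print(round(budget.allowed_downtime_minutes, 1))  # 43.2
print(budget.is_exhausted(downtime_minutes=50))   # True
```

Under these assumed numbers, about 43 minutes of downtime per 30 days would consume the whole budget. That is the kind of threshold a team might use to decide when to pause risky changes.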

Key Responsibilities

  • Monitor and maintain trading system health for optimal performance
  • Identify, triage, and resolve issues in real time
  • Develop automation scripts and tools to streamline operations
  • Participate in follow-the-sun on-call rotations
  • Implement and maintain monitoring and alerting systems
  • Conduct root cause analysis and implement preventive measures
  • Optimize trading system performance through analysis and tuning
  • Maintain documentation for operational procedures and system architecture
  • Assist in planning and scaling trading infrastructure

Qualifications

  • Bachelor's degree in Computer Science, Software Engineering, or related field
  • Experience in financial trading systems or high-frequency environments
  • Strong proficiency in Python for scripting and automation
  • Linux systems administration in production environments
  • Network protocols and troubleshooting techniques
  • Familiarity with SQL/NoSQL databases and real-time data processing
  • Experience with monitoring tools (e.g., Prometheus, Grafana)
  • Analyzing and troubleshooting trading logs
  • Strong problem-solving skills and attention to detail
  • Excellent communication and collaboration abilities
  • Self-motivated and intellectually curious

Preferred Qualifications

  • Understanding of cryptocurrency markets and trading dynamics
  • Experience with centralized cryptocurrency exchange APIs
  • Experience with AWS
  • Containerization technologies (e.g., Docker) in trading environments
  • Database performance tuning
  • Log analysis and automated alert response systems
  • Financial risk management principles applied to trading systems

Benefits

  • Competitive Compensation: Base salary and benefits package, plus a quarterly cash profit share that rewards performance.
  • Innovation-Driven Culture: Be part of a team that embraces cutting-edge technology and continuous improvement.
  • Career Growth: Opportunities for professional development and career advancement.
  • Collaborative Environment: Work with talented professionals in a supportive and inclusive setting.

Our Commitment

  • Work with cutting-edge technologies in cryptocurrency trading
  • Develop expertise in maintaining high-frequency trading systems
  • Collaborate with talented engineers and traders
  • Continuous learning through challenging projects and professional development

Advice from our career coach

A successful applicant for the Site Reliability Engineer position should be passionate about maintaining high-performance, mission-critical systems and excited by the challenges of a 24/7 trading environment. They should be detail-oriented, proactive in problem-solving, and comfortable with a flexible schedule, including evenings or weekends when necessary.

To stand out as an applicant, showcase your experience in financial trading systems or high-frequency environments, strong proficiency in Python for scripting and automation, expertise in Linux systems administration in production environments, and familiarity with SQL/NoSQL databases and real-time data processing. Highlight your ability to analyze and troubleshoot trading logs, along with your experience with monitoring tools like Prometheus and Grafana. Additionally, demonstrate your problem-solving skills, attention to detail, excellent communication abilities, and self-motivation. A good understanding of cryptocurrency markets, centralized exchange APIs, AWS, Docker, database performance tuning, log analysis, automated alert response systems, and financial risk management principles in trading systems will also make you a strong candidate.

Please let Kbit know you found this job with RemoteJobs.org. This helps us grow!

About the job

Oct 11, 2024
