Job Description

The Machine Learning Operations Engineer supports our machine learning infrastructure by ensuring seamless model training, optimization, and deployment. This role suits a tech-savvy individual who enjoys managing machine learning systems and hardware configurations rather than focusing solely on programming, although coding experience is a strong plus. The ideal candidate is a computer enthusiast with a knack for machine learning infrastructure and model optimization and a passion for working in a collaborative, fast-paced environment.

Responsibilities

Maintain and manage the software configuration of on-premises machine learning hardware to support optimal performance for training neural networks.
Set up and maintain cloud-based training environments, primarily on Google Cloud Platform, to facilitate model experimentation and scalability.
Automate training workflows to drive continuous improvement of vision models, reducing manual overhead and enhancing efficiency.
Develop automated accuracy assessments and generate reports to evaluate and compare the performance of newly trained neural networks against existing models.
Ensure predictable and efficient turnaround times for training models with updated datasets to meet project timelines.
Organize and manage model weights and associated documentation in various formats for deployment across on-premises, cloud, and edge environments.
Apply quantization and pruning techniques to models to enhance computational efficiency without sacrificing accuracy.
Design and deploy infrastructure for low-latency inference to enable real-time performance for large-scale models (e.g., vLLMs).

Requirements

Proven experience with Linux server maintenance, including both on-premises and cloud environments.
Proficient in scripting with Bash and Python to streamline system and model management.
Hands-on experience with neural network training, data loaders, and data pre-processing pipelines.
Familiar with data and model parallelism strategies for improving training speed and efficiency.
Knowledgeable in neural network model conversion and optimization for deployment on diverse hardware.

Preferred Qualifications

Familiarity with Google Cloud Platform for machine learning operations.
Experience with NVIDIA deployment platforms and tools such as Jetson, Triton Inference Server, and NIM.
Skilled in OpenVINO and ONNX for model conversion and optimization.
Experience training or fine-tuning large language models (LLMs) would be a significant advantage.
Programming experience in Python and C++ is beneficial but not mandatory.
Strong written and verbal communication skills for documentation and collaboration.
Passion for machine learning technology and an aptitude for problem-solving in fast-paced environments.

At Simbe, you will be at the forefront of retail innovation, working with cutting-edge AI and robotics technologies to transform retail operations. Our culture is dynamic, inclusive, and driven by a passion for improving the way retailers operate and serve their customers. Join us to be a part of a team that is not only reshaping the future of retail but also offering immense value to our clients worldwide.

Simbe Values: R. E. T. A. I. L.

Result Driven - We are customer-centric and results-driven. We strive to create immense value for our team, partners, customers, and investors.

Empathetic - We are sensitive and mindful. We support each other in challenging times, both professionally and personally.

Transparent - We highly value open communication, both internally and with our partners and customers. We are receptive to feedback.

Agile - We are agile and always eager to learn. We quickly adapt to changes and customer needs.

Innovative - We are bold and innovative, with an intense focus on product design and user experience.

Leaders - We strive for excellence. We are accountable, the best at what we do, and leaders in our field.

Machine Learning Operations Engineer

Location: United States

Base Salary: 90k-150k USD

Company: Simbe Robotics
