Job Description

You are intelligent, skilled, and approachable. A genuine tech professional with a passion for business; an enthusiast who seamlessly tackles tough challenges using the most powerful tools. You're the one who can listen to a client’s technical concerns, and then relay back solutions directly, concisely, and precisely; steering discussions directly to efficient results. You make the complex simple.

As part of the Expedite Commerce team, every day is an opportunity to evolve, advance your career, and unlock your potential while working as part of a close-knit global team of technologists. If you thrive on creating high-performance solutions and excel at solving intricate problems with AWS technologies, we invite you to join us on this dynamic journey.

The Role

As an AI NLP Ops Engineer, you will be responsible for managing, deploying, and fine-tuning NLP models and large language model (LLM) agents to solve business challenges, primarily using AWS SageMaker, Bedrock, and related AWS infrastructure. You will play a critical role in ensuring the smooth operation, scalability, and reliability of our AI products, focusing on automation, performance monitoring, and agent lifecycle management.

This role requires hands-on experience with LLM agent-based frameworks and with implementing LLM-based agents in production environments. The position demands strong development skills, advanced proficiency in Python, specialized knowledge of LLM agent design and development, and exceptional debugging capabilities.

What you will do

You will be involved in all aspects of AI NLP Ops engineering, including requirement implementation, data pipeline integration, and metrics integration. This covers developing scalable, optimized solutions for training, retraining, deploying, scheduling, monitoring, and improving NLP models and LLM agents. Key responsibilities include:

Model & Agent Development: Conceptualize and design robust NLP solutions and LLM agents tailored to specific business needs, with a focus on user experience, interactivity, latency, failover and functionality.
Hands-on Coding: Write, test, and maintain clean, efficient, and scalable code for NLP models and LLM agents, with a strong emphasis on Python programming.
Performance Monitoring: Monitor, optimize, and maintain NLP solutions and LLM agents, implementing model explainability, handling model drift, and ensuring robustness.
LLM Agent Ops Monitoring and Logging: Develop and implement comprehensive monitoring and logging solutions for LLM agents to track performance, errors, and usage patterns. Set up alerting mechanisms to promptly address anomalies or issues in LLM operations, ensuring high availability and reliability.
Debugging & Issue Resolution: Proactively identify, diagnose, and resolve issues related to LLMs, including model inaccuracies, performance bottlenecks, and system integration problems. Use debugging tools and techniques to troubleshoot complex problems in model behavior, data inconsistencies, and deployment errors.
Innovation and Research: Stay updated with the latest advancements in NLP and LLM technologies, experimenting with new techniques and tools to enhance agent capabilities and performance.
Continuous Learning: Be ready to unlearn outdated practices, patterns, and technologies, and to quickly learn and apply new technologies and research papers as the ML field evolves. Maintain a proactive approach to staying current with emerging trends and technologies in NLP.

Requirements

LLM Agent Development & Deployment: 1-2 years of experience in fine-tuning LLMs and deploying LLM agents, including practical experience with AWS Bedrock, OpenAI Function Calling, Anthropic Function Calling, CrewAI, the MetaGPT framework, and other relevant platforms.
Strong Python Skillset: Proven track record of developing high-quality, efficient Python code, including experience with advanced Python features and best practices.
Integration Skills: Experience with integrating open-source and commercial NLP models and LLM agents, including developing and evaluating prompt engineering techniques.
Technical Proficiency: Strong skills in developing models and agents on cloud platforms, particularly AWS, and in implementing serverless architectures (using AWS Lambda, Lambda as container, Kinesis, SQS, DynamoDB, Bedrock, OpenAI API, S3, and Step Functions).
Debugging & Troubleshooting: Expertise in debugging and fixing issues related to LLMs, including identifying root causes of errors, resolving discrepancies in model outputs, and optimizing system performance.
LLM Agent Ops Monitoring and Logging: Strong development experience implementing LLM-based agent monitoring in production.
Communication Skills: Excellent written and verbal communication skills in English, with the ability to present technical concepts clearly to the team and to clients. Should be excelling in designing utilizing mi
CI/CD Experience: Working experience building CI/CD pipelines with AWS services (CodeCommit, CodePipeline, CodeBuild, and CodeDeploy) for automated testing and deployment of NLP solutions.

The Tech Stack

LLMs and NLP Models: Development experience with large language models (LLMs) such as GPT, Claude, Gemini, Llama 3, and others, including experience in fine-tuning and deploying these models.
AWS Platform Services: Development proficiency in AWS services (Lambda, Lambda as container, Step Functions, S3, DynamoDB, SQS, SNS, CloudWatch Logs).
Integration Skills: Development experience integrating NLP solutions and LLM agents with platforms like Salesforce and using Atlassian agile tools (Jira & Confluence).
Communication Tools: Proficiency in using Zoom and Gong.io for communication and AI-based analysis.

Benefits

Health Insurance, PTO, and Leave time
Ongoing paid professional training and certifications
Fully remote work opportunity
Strong Onboarding & Training program

Work Timings - 1 pm to 10 pm IST

About Expedite Commerce

At Expedite Commerce, we believe that people achieve their best when technology enables them to build relationships and explore new ideas. So we build systems that free you up to focus on your customers and drive innovations. We have a great commerce platform that changes the way you do business!

See more about us at expeditecommerce.com. You can also read about us at https://www.g2.com/products/expedite-commerce/reviews and on Salesforce AppExchange/ExpediteCommerce.

EEO Statement

All qualified applicants to Expedite Commerce are considered for employment without regard to race, color, religion, age, sex, sexual orientation, gender identity, national origin, disability, veteran's status or any other protected characteristic.

AI NLP Ops Engineer

Location

India

Expedite Commerce
