RemoteJobs.org
© 2026 RemoteJobs.org. All rights reserved.

    Machine Learning Engineer — Multilingual Data

    Featherless AI
    Full-time
    Remote · General · Today

    About this role

    We’re looking for a Machine Learning Engineer to own and scale our multilingual data pipeline—from sourcing and curation to evaluation and continuous improvement. You’ll work closely with researchers and infra engineers to ensure our models perform robustly across languages, scripts, and cultural contexts.

    This role sits at the intersection of data, research, and production ML and is ideal for someone who cares deeply about data quality, linguistic diversity, and model generalization beyond English.

    What You’ll Do

    • Design, build, and maintain large-scale multilingual datasets across high- and low-resource languages

    • Develop data pipelines for collection, cleaning, normalization, deduplication, and labeling

    • Implement quality filters using statistical, heuristic, and model-based methods

    • Work with researchers to define language coverage, benchmarks, and evaluation metrics

    • Analyze dataset bias, coverage gaps, and failure modes across regions and scripts

    • Support training, fine-tuning, and distillation workflows with high-quality multilingual data

    • Continuously iterate on datasets based on model performance and real-world usage

    What We’re Looking For

    • 3+ years of experience as an ML Engineer, Applied Scientist, or in a similar role

    • Strong experience working with multilingual or non-English datasets

    • Solid understanding of NLP fundamentals (tokenization, embeddings, language modeling)

    • Experience building scalable data pipelines (Python, Spark, Ray, or similar)

    • Familiarity with Unicode, scripts, tokenization challenges, and language-specific quirks

    • Comfort collaborating with researchers and translating research needs into production systems

    Nice to Have

    • Experience with low-resource languages or multilingual benchmarks (e.g. FLORES, XTREME)

    • Exposure to LLM training, fine-tuning, or distillation

    • Linguistics background or experience working with native language experts

    • Contributions to open-source datasets or ML tooling

    • Experience with data quality evaluation at scale

    Why Join

    • Real ownership over a core differentiator of the product

    • Work on models used globally, not just in English-speaking markets

    • Small, high-caliber team with deep ML and systems experience

    • Competitive compensation plus meaningful equity at the Series A stage
