RemoteJobs.org

    Founding Cloud Infrastructure Engineer (AI Platform)

    Valerie Group
    Full-time
    Verified Remote
    Remote · DevOps · Today

    About this role

    We’re looking for a founding AWS Cloud Engineer to architect, build, and scale the infrastructure foundation for our AI-native technology platform.

    This is a true 0→1 greenfield opportunity: no legacy systems, no existing infrastructure, and complete ownership over how the platform is designed and scaled.

    You will work at the intersection of cloud infrastructure, DevSecOps, distributed systems, and AI infrastructure, helping build highly scalable and resilient systems powering next-generation AI and MarTech products.

    This role is ideal for someone who thrives in fast-moving startup environments, enjoys solving complex infrastructure challenges, and can independently make strong architecture decisions with high ownership.

    What You’ll Own

    • Architect, build, and manage secure, scalable AWS cloud infrastructure from scratch

    • Design and operationalize cloud-native infrastructure for AI-native and SaaS products

    • Build and maintain production-grade CI/CD pipelines and deployment workflows

    • Develop Infrastructure as Code using Terraform (preferred) or AWS CloudFormation

    • Manage containerized environments using Docker and modern deployment practices

    • Build scalable distributed systems, async processing pipelines, and event-driven architectures

    • Support AI/LLM production workloads, inference systems, embeddings pipelines, semantic retrieval, and RAG-based architectures

    • Work closely with product, engineering, AI, and data teams to translate product requirements into scalable infrastructure systems

    • Design observability, monitoring, logging, alerting, and incident response systems

    • Continuously optimize infrastructure reliability, scalability, security, and cloud costs

    • Take complete ownership of infrastructure decisions, deployment reliability, and platform scalability

    Security & DevSecOps Responsibilities

    • Design and maintain secure AWS environments across development and production systems

    • Implement least-privilege IAM architecture, secrets management, and secure networking practices

    • Integrate security into CI/CD pipelines through IaC scanning, container security, dependency management, and automated checks

    • Build resilient infrastructure with strong monitoring, auditing, and operational visibility

    • Support incident management, root cause analysis, and production debugging

    • Ensure infrastructure follows modern cloud security and operational best practices

    Must-Have Skills & Experience

    • Strong hands-on experience with AWS cloud infrastructure and core AWS services including EC2, VPC, IAM, S3, RDS, Route53, Load Balancers, CloudFront, and CloudWatch

    • Strong experience building and managing scalable cloud-native infrastructure for SaaS or AI-native technology products

    • Expertise in Infrastructure as Code using Terraform (preferred) or AWS CloudFormation

    • Strong experience with Docker, containerized environments, and CI/CD deployment pipelines

    • Strong Linux, networking, debugging, and production incident management skills

    • Strong understanding of cloud security, IAM architecture, secrets management, and infrastructure best practices

    • Experience operationalizing AI/LLM workloads and supporting AI-native production environments

    • Strong understanding of modern AI infrastructure patterns including embeddings, semantic retrieval systems, vector databases, and RAG-based architectures

    • Experience supporting scalable distributed systems, async processing pipelines, and event-driven architectures

    • Strong ownership mentality with the ability to work independently in fast-moving startup environments

    • Strong communication and architecture-thinking skills with the ability to explain scalability tradeoffs, infrastructure risks, and operational decisions clearly

    • Experience building highly observable, resilient, and fault-tolerant production systems

    Good to Have

    • Experience with Kubernetes, Amazon EKS, or large-scale container orchestration systems

    • Exposure to serverless AWS services such as Lambda, EventBridge, and Step Functions

    • Experience with Redis, Kafka, SQS, or high-throughput queue and caching systems

    • Familiarity with AI infrastructure tooling such as LangChain, LangGraph, or LlamaIndex

    • Experience with vector databases such as Pinecone, Weaviate, Qdrant, or pgvector

    • Exposure to GPU workloads, AI inference infrastructure, or MLOps environments

    • Experience with analytics and data warehouse platforms such as Snowflake, Redshift, BigQuery, or dbt

    • Strong cloud cost optimization and infrastructure scaling experience

    • Founder or builder mindset with strong execution capability in zero-to-one environments

    • Prior experience working in early-stage startups or high-growth product companies

    What Success Looks Like

    • Built and scaled production-grade AWS infrastructure from scratch

    • High system reliability, uptime, and operational resilience

    • Fast, stable, and secure deployment pipelines

    • AI infrastructure successfully running and scaling in production

    • Strong observability, monitoring, and incident management practices

    • Infrastructure that scales efficiently with product and usage growth

    • Optimized cloud spend while maintaining performance and reliability

    • Strong security posture across infrastructure, deployments, and access management

    • High ownership and ability to independently drive infrastructure evolution
