We’re looking for a founding AWS Cloud Engineer to architect, build, and scale the infrastructure foundation for our AI-native technology platform.

This is a true 0→1 greenfield opportunity: no legacy systems, no existing infrastructure, and complete ownership over how the platform is designed and scaled.

You will work at the intersection of cloud infrastructure, DevSecOps, distributed systems, and AI infrastructure, helping build highly scalable and resilient systems powering next-generation AI and MarTech products.

This role is ideal for someone who thrives in fast-moving startup environments, enjoys solving complex infrastructure challenges, and can independently make strong architecture decisions with high ownership.

What You’ll Own

Architect, build, and manage secure, scalable AWS cloud infrastructure from scratch
Design and operationalize cloud-native infrastructure for AI-native and SaaS products
Build and maintain production-grade CI/CD pipelines and deployment workflows
Develop Infrastructure as Code using Terraform (preferred) or AWS CloudFormation
Manage containerized environments using Docker and modern deployment practices
Build scalable distributed systems, async processing pipelines, and event-driven architectures
Support AI/LLM production workloads, inference systems, embedding pipelines, semantic retrieval, and RAG-based architectures
Work closely with product, engineering, AI, and data teams to translate product requirements into scalable infrastructure systems
Design observability, monitoring, logging, alerting, and incident response systems
Continuously optimize infrastructure reliability, scalability, security, and cloud costs
Take complete ownership of infrastructure decisions, deployment reliability, and platform scalability

Security & DevSecOps Responsibilities

Design and maintain secure AWS environments across development and production systems
Implement least-privilege IAM architecture, secrets management, and secure networking practices
Integrate security into CI/CD pipelines through IaC scanning, container security, dependency management, and automated checks
Build resilient infrastructure with strong monitoring, auditing, and operational visibility
Support incident management, root cause analysis, and production debugging
Ensure infrastructure follows modern cloud security and operational best practices

Must-Have Skills & Experience

Strong hands-on experience with AWS cloud infrastructure and core AWS services including EC2, VPC, IAM, S3, RDS, Route 53, Load Balancers, CloudFront, and CloudWatch
Strong experience building and managing scalable cloud-native infrastructure for SaaS or AI-native technology products
Expertise in Infrastructure as Code using Terraform (preferred) or AWS CloudFormation
Strong experience with Docker, containerized environments, and CI/CD deployment pipelines
Strong Linux, networking, debugging, and production incident management skills
Strong understanding of cloud security, IAM architecture, secrets management, and infrastructure best practices
Experience operationalizing AI/LLM workloads and supporting AI-native production environments
Strong understanding of modern AI infrastructure patterns including embeddings, semantic retrieval systems, vector databases, and RAG-based architectures
Experience supporting scalable distributed systems, async processing pipelines, and event-driven architectures
Strong ownership mentality with the ability to work independently in fast-moving startup environments
Strong communication and architecture-thinking skills with the ability to explain scalability tradeoffs, infrastructure risks, and operational decisions clearly
Experience building highly observable, resilient, and fault-tolerant production systems

Good to Have

Experience with Kubernetes, Amazon EKS, or large-scale container orchestration systems
Exposure to serverless AWS services such as Lambda, EventBridge, and Step Functions
Experience with Redis, Kafka, SQS, or high-throughput queue and caching systems
Familiarity with AI infrastructure tooling such as LangChain, LangGraph, or LlamaIndex
Experience with vector databases such as Pinecone, Weaviate, Qdrant, or pgvector
Exposure to GPU workloads, AI inference infrastructure, or MLOps environments
Experience with analytics and data warehouse platforms such as Snowflake, Redshift, BigQuery, or dbt
Strong cloud cost optimization and infrastructure scaling experience
Founder or builder mindset with strong execution capability in zero-to-one environments
Prior experience working in early-stage startups or high-growth product companies

What Success Looks Like

Built and scaled production-grade AWS infrastructure from scratch
High system reliability, uptime, and operational resilience
Fast, stable, and secure deployment pipelines
AI infrastructure successfully running and scaling in production
Strong observability, monitoring, and incident management practices
Infrastructure that scales efficiently with product and usage growth
Optimized cloud spend while maintaining performance and reliability
Strong security posture across infrastructure, deployments, and access management
High ownership and ability to independently drive infrastructure evolution

Founding Cloud Infrastructure Engineer (AI Platform)
