Job Description
We are seeking a highly capable AI/ML Engineer to join a new startup as one of its first hires. In this role you will build the technical foundation of an AI-powered product from the ground up: you will design, implement, and optimize the core AI systems, including retrieval-augmented generation (RAG) pipelines, vector search, and backend integrations, while working closely with a small founding team.
This is a hands-on engineering role: you will directly shape the first working demo, with the opportunity to then transition into a full-time leadership position with equity. Because the company is at an early stage, there is also significant long-term potential to grow into a CTO role.
Responsibilities
- Ingest and preprocess text data into structured knowledge bases.
- Design and implement RAG pipelines with large language models.
- Integrate vector databases and similarity-search libraries (Pinecone, Weaviate, FAISS, Milvus).
- Build and maintain backend services/APIs for embeddings and retrieval.
- Engineer and refine prompt templates for tone and style consistency.
- Implement safety and compliance guardrails in generation workflows.
- Deploy to the cloud, monitor latency and costs, and optimize performance.
- Secure sensitive assets such as prompts, datasets, and API keys.
- Collaborate with frontend developers and designers for demo integration.
Required Technical Skills
- Advanced Python programming.
- Strong experience with the OpenAI API (GPT-4/5) or comparable LLM APIs.
- Proficiency with LangChain, LlamaIndex, or related frameworks.
- Vector search expertise (Pinecone, Weaviate, FAISS, Milvus).
- Backend/API development (FastAPI, Flask, or equivalent).
- Databases: Postgres, MongoDB, or equivalent.
- Cloud deployment: AWS, GCP, or Azure; Docker and Kubernetes.
- CI/CD pipelines and Git-based workflows.
- Security practices for APIs, IAM, and data handling.
Preferred Skills
- Fine-tuning or customizing LLMs (LoRA, adapters).
- Experience deploying RAG pipelines at production scale.
- Speech-to-text and text-to-speech integration.
- Familiarity with ElevenLabs, HeyGen, and other voice/video AI tools.
- Real-time chat or streaming API systems.
- Previous exposure to fintech, edtech, or other regulated domains.
Compensation & Growth
- Initial Stage (Demo Build): $135K–$165K/year.
- Future: Opportunity to convert into a full-time role with equity participation.
- As an early hire, potential to advance into CTO-level leadership as the company grows.