Overview
You're joining as one of the first engineers at a 6-person startup building AI data governance infrastructure. The product scans enterprise knowledge bases (documentation, wikis, support content) to find contradictions, outdated info, and data quality issues that break AI systems. You'll be writing the code that enterprises like Disney depend on to safely deploy AI. Small team, big customers, lots of greenfield work.
Role Snapshot
| Aspect | Details |
|---|---|
| Role Type | Founding/early engineer (full-stack or backend-focused) |
| Product Stage | Early production - have paying customers but still building core features |
| Tech Complexity | High - NLP, data parsing, enterprise integrations, AI/ML infrastructure |
| Team Size | You + founder(s) + 1-2 other engineers |
| Customer Impact | Direct - you'll talk to Disney's team, debug their issues, ship features they requested |
| Scope | Broad - database design, API development, data pipelines, customer integrations |
Company Context
Stage: Seed (just raised from Susa Ventures and Wischoff Ventures)
Size: 6 employees - mostly engineering
Growth: Landed Disney + other enterprises, now scaling to handle demand
Market Position: First mover in AI-specific data governance - defining the category
What You'll Actually Build
Core Product Areas
- Data Ingestion: Connect to customer knowledge bases (Confluence, Notion, SharePoint, custom wikis). Parse different formats, handle auth, incremental updates.
- Quality Detection: NLP/ML to identify contradictions, outdated content, missing information, ambiguous language that confuses AI systems.
- Remediation Workflows: Tools for data teams to review issues, approve fixes, track changes over time.
- Governance Layer: Audit trails, version control, approval chains for regulated industries.
- AI Integration Layer: APIs that AI tools query to verify data quality before using information.
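Of the areas above, quality detection is the hardest to pin down. As a purely illustrative sketch (not the company's actual approach), a first-pass contradiction detector might flag sentence pairs that share most of their content but disagree on negation. All names and the heuristic itself are hypothetical; a production system would use semantic models and constant tuning.

```python
import re
from math import sqrt

# Toy negation vocabulary; a real system would need far more coverage.
NEGATIONS = {"not", "no", "never", "cannot", "don't", "doesn't", "won't"}

def tokens(text):
    """Lowercase word tokens for a simple bag-of-words comparison."""
    return re.findall(r"[a-z']+", text.lower())

def cosine(a, b):
    """Cosine similarity between two token lists."""
    ca, cb = {}, {}
    for t in a:
        ca[t] = ca.get(t, 0) + 1
    for t in b:
        cb[t] = cb.get(t, 0) + 1
    dot = sum(ca[t] * cb.get(t, 0) for t in ca)
    na = sqrt(sum(v * v for v in ca.values()))
    nb = sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0

def possible_contradictions(sentences, threshold=0.6):
    """Flag sentence pairs that are lexically similar but differ in negation.

    A crude first pass only -- exactly the kind of detection logic the role
    would iterate on to drive down false positives.
    """
    flagged = []
    for i in range(len(sentences)):
        for j in range(i + 1, len(sentences)):
            a, b = tokens(sentences[i]), tokens(sentences[j])
            neg_a = bool(NEGATIONS & set(a))
            neg_b = bool(NEGATIONS & set(b))
            # Compare content overlap with negation words stripped out.
            core_a = [t for t in a if t not in NEGATIONS]
            core_b = [t for t in b if t not in NEGATIONS]
            if neg_a != neg_b and cosine(core_a, core_b) >= threshold:
                flagged.append((sentences[i], sentences[j]))
    return flagged

docs = [
    "Refunds are available within 30 days of purchase.",
    "Refunds are not available within 30 days of purchase.",
    "Contact support via the help portal.",
]
print(possible_contradictions(docs))
```

The heuristic catches the refund pair above while ignoring the unrelated support sentence, but it would miss contradictions phrased without explicit negation ("refunds are available" vs. "all sales are final"), which is why "quality is subjective" shows up under the hard parts below.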
Day-to-Day Reality
- Feature Development (50%): Ship new capabilities customers are asking for. Disney needs X integration, another customer needs Y report.
- Customer Support (20%): Debug why ingestion failed for a customer's data source. Their Confluence has weird formatting. Figure it out.
- Infrastructure (15%): Scale pipelines to handle larger datasets. Optimize queries that are timing out. Set up monitoring.
- Customer Calls (10%): Join technical discussions with enterprise data teams. Explain how the system works, what's possible, what limitations exist.
- Architecture Decisions (5%): Weekly syncs with founders on product direction, tech choices, what to build next.
The Honest Reality
What's Hard
- Enterprise data is messy: Every customer's knowledge base is organized differently, uses different tools, has different edge cases. You'll spend a lot of time handling one-off scenarios.
- Quality is subjective: "Contradictory information" is easy to say, hard to define algorithmically. You'll iterate on detection logic constantly based on false positives.
- Customer timelines: Disney wants a feature by their deadline. You're balancing shipping fast for customers vs building the right foundation.
- Small team reality: When something breaks in production at 8pm, you're probably the one fixing it. No one else knows that part of the codebase.
- Ambiguous requirements: Customers say "we need better governance" but can't articulate exactly what that means. You're translating vague asks into concrete features.
What Success Looks Like
- Ship features that directly enable customer renewals and expansions
- Build integrations that unlock new enterprise logos ("we'll buy if you connect to our X system")
- Reduce production incidents and customer support load through better reliability
- Make architectural decisions that scale as dataset sizes and customer counts grow
Tech You'll Work With
Likely Stack (inferred from the problem space):
- Backend: Python (ML/NLP libraries), possibly Go for performance-critical parts
- Data: PostgreSQL or similar, vector databases for semantic search, data pipeline tools
- ML/AI: NLP models for text analysis, LLM integration for quality detection
- Integrations: REST APIs for enterprise tools (Confluence, SharePoint, etc.)
- Infrastructure: Cloud (AWS/GCP), Docker/Kubernetes, CI/CD
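The "incremental updates" part of ingestion usually comes down to tracking what you've already processed. A minimal sketch, assuming a connector normalizes pages into records with a version number (the `Page` fields and function names here are illustrative, not any specific tool's API schema):

```python
from dataclasses import dataclass

@dataclass
class Page:
    """Hypothetical normalized page record produced by a connector."""
    id: str
    version: int
    body: str

def incremental_sync(pages, seen_versions):
    """Return only pages that are new or changed since the last sync.

    `seen_versions` maps page id -> last processed version and is updated
    in place, so repeated calls skip unchanged content instead of
    re-parsing the whole knowledge base.
    """
    changed = []
    for page in pages:
        if seen_versions.get(page.id) != page.version:
            changed.append(page)
            seen_versions[page.id] = page.version
    return changed

state = {}
first = incremental_sync([Page("a", 1, "hello"), Page("b", 1, "world")], state)
second = incremental_sync([Page("a", 2, "hello!"), Page("b", 1, "world")], state)
print([p.id for p in first], [p.id for p in second])
```

In practice the version would come from the source system (e.g. Confluence tracks a per-page version number) and `state` would live in the database rather than memory, but the skip-unchanged pattern is the same at any scale.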
What You'll Learn:
- Enterprise data governance patterns
- NLP/ML in production at scale
- Building reliable data pipelines
- Selling/explaining technical products to non-technical buyers (you'll be on sales calls)
Who You'll Work With
Internally:
- Rebecca Wang (Founder, Stanford CS dropout) - product direction, customer relationships
- Akylai Kasymkulova (Co-founder) - likely technical co-founder, architecture decisions
- 1-2 other engineers - you'll split ownership of the codebase
- Eventually GTM hires - you'll explain product capabilities, what's feasible to promise
Customers:
- Data engineers at enterprises debugging ingestion issues
- Compliance/legal teams asking how audit trails work
- AI/ML teams integrating your APIs into their systems
Requirements
- 2-5 years software engineering experience, preferably backend or data engineering
- Strong Python skills (likely primary language for ML/NLP work)
- Experience with data pipelines, APIs, or enterprise integrations
- Comfortable working directly with customers (explaining technical concepts, debugging their issues)
- Okay with ambiguity - you're building something new, not following established patterns
- Interested in AI/ML applications (don't need to be an ML expert, but should want to learn)
- SF-based or willing to relocate (6-person team needs in-person collaboration)