Sarah Carthy

Senior Technical Customer Success Manager

Weights & Biases

Customer Success • Balanced • Enterprise • 📍 Germany
Deal Size: $50K-500K+ expansions
Sales Cycle: 2-4 months for expansion deals
Posted by Sarah Carthy

Overview

You own the technical relationship with enterprise AI/ML teams after they've signed. Your job is to drive adoption (getting them from pilot to production), expand usage across teams, and prevent churn. You're working with sophisticated customers building GenAI applications, LLMs, and production ML systems—they know what they're doing technically, so you need to be a peer, not just a support resource.


Role Snapshot

Role Type: Technical Customer Success (post-sales expansion focus)

Sales Motion: Land-and-expand, usage-based growth

Deal Complexity: Enterprise/Strategic

Sales Cycle: N/A for renewals; 2-4 months for expansion deals

Deal Size: $50K-500K+ expansions, depending on seat count and usage

Quota (est.): $500K-1M net retention/expansion annually

Company Context

Stage: Public (via CoreWeave acquisition, NYSE-listed parent company)

Size: 306 employees

Growth: Expanding EMEA presence, hiring senior technical roles, backed by public company resources

Market Position: Leading ML operations platform competing against MLflow (open source), Neptune.ai, and internal tools teams build themselves. Trusted by top-tier AI companies.


GTM Reality

Pipeline Sources:

  • 70% Existing customer expansion - usage growth, new teams, additional products
  • 20% Renewals with expansion conversations
  • 10% Cross-sell opportunities from CoreWeave relationships

SDR/AE Structure: You inherit accounts from the sales team post-signature. You own the technical relationship but partner with Account Executives on commercial expansion deals.

SE Support: You ARE the technical expert post-sales. You don't get SE support—you're expected to handle technical architecture, integrations, and troubleshooting yourself.


Competitive Landscape

Main Competitors:

  • MLflow (open source—free but requires heavy ops investment)
  • Neptune.ai (similar feature set, smaller market presence)
  • Comet.ml (experiment tracking competitor)
  • Internal/homegrown tools (many teams build their own tracking)

How They Differentiate: Enterprise-grade reliability, comprehensive ML lifecycle coverage (not just experiment tracking), and proven at scale with customers like OpenAI. Better integration ecosystem than open-source alternatives.

Common Objections:

  • "We already built something internally" (technical debt argument)
  • "MLflow is free" (TCO and opportunity cost of maintaining it)
  • "We're only using 10% of the platform" (adoption/training gap)

Win Themes: Production-ready reliability, works at scale, saves ML engineers time vs maintaining internal tools, better collaboration across teams.


What You'll Actually Do

Time Breakdown

Customer QBRs & Strategy (25%) | Technical Support & Troubleshooting (30%) | Expansion Projects (25%) | Internal Coordination (20%)

Key Activities

  • Weekly check-ins with ML platform leads: You review usage metrics, discuss blockers (integration issues, feature gaps, workflow problems), and plan adoption roadmaps. These customers are sophisticated—you're talking about distributed training workflows, experiment versioning at scale, and production monitoring architecture.

  • Technical troubleshooting and escalation: When integrations break, data isn't syncing correctly, or performance issues crop up, you're the first line of defense. You dig into logs, reproduce issues, work with engineering on bugs, and sometimes write code snippets to unblock customers (the first sketch after this list shows the flavor).

  • Expansion discovery and scoping: You track usage patterns to identify teams not yet on the platform or products they're not using (e.g., they use experiment tracking but not model registry); the second sketch after this list shows one way to pull that data. You scope out what it would take to expand, build the technical case, and loop in the AE for commercial discussions.

  • Quarterly Business Reviews: You prepare data-driven presentations showing adoption metrics, ROI (time saved, models deployed faster), and strategic recommendations. You're making the case for renewal and expansion based on actual usage data.
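
A taste of the "snippet to unblock a customer" work: the minimal sketch below logs a training metric with the wandb SDK. The project name, config values, and loss are made up for illustration; the customer would swap in their own.

    import wandb

    # Hypothetical project and config; the customer substitutes real values.
    run = wandb.init(project="demo-finetune", config={"lr": 3e-4, "batch_size": 32})

    for step in range(100):
        loss = 1.0 / (step + 1)  # stand-in for the customer's real training loss
        wandb.log({"train/loss": loss}, step=step)

    run.finish()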
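For the usage-pattern side, W&B's public API (wandb.Api) can pull run activity programmatically. A rough sketch, assuming a hypothetical entity/project path and that run counts and states are enough for a first signal:

    import wandb
    from collections import Counter

    api = wandb.Api()
    # "acme-corp/llm-finetuning" is a made-up entity/project path.
    runs = api.runs("acme-corp/llm-finetuning")

    # Run states give a quick health read; run volume over time signals adoption.
    states = Counter(run.state for run in runs)
    print(f"{len(runs)} runs, states: {dict(states)}")

A team logging hundreds of runs against a product area they haven't licensed, or an account whose run volume has flatlined, is exactly the expansion (or churn-risk) signal you'd bring to the AE.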


The Honest Reality

What's Hard

  • Customers are extremely technical: You're working with ML engineers and AI researchers who built models at FAANG companies. They'll spot bullshit immediately and expect you to understand their workflows at a deep level. If you can't talk fluently about distributed training, hyperparameter optimization, or model versioning strategies, you'll lose credibility fast.

  • Usage doesn't always translate to value perception: A team might use W&B daily but still question the cost at renewal time because their finance team doesn't understand ML tooling ROI. You spend time building business cases and quantifying "time saved" in ways non-technical stakeholders understand.

  • You're always in reactive mode: Between Slack messages about integration issues, urgent questions before demos, and ad-hoc technical consultations, your calendar gets blown up regularly. Proactive expansion work gets squeezed by firefighting.

What Success Looks Like

  • 120%+ net revenue retention on your book: Your customers renew and expand. Teams that started with 10 seats are now at 50. Products they weren't using (like model registry) are now core to their workflow. (The sketch after this list shows the NRR arithmetic.)

  • High product adoption scores: Your accounts aren't just paying—they're actively using the platform. DAU/MAU ratios are healthy, multiple teams are onboarded, and they're integrated into production workflows.

  • Customer references and case studies: Your customers agree to speak at conferences, participate in case studies, and refer other companies because the platform genuinely made their ML workflows better.
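
Both of these targets are simple arithmetic worth being fluent in for QBRs. A worked sketch with made-up numbers, using the standard NRR definition (starting ARR plus expansion, minus contraction and churn, over starting ARR):

    # Made-up book of business, in ARR dollars.
    starting_arr = 1_000_000
    expansion = 300_000    # new seats and products on existing accounts
    contraction = 50_000   # downgrades
    churn = 50_000         # lost accounts

    nrr = (starting_arr + expansion - contraction - churn) / starting_arr
    print(f"NRR: {nrr:.0%}")  # NRR: 120%

    # Adoption health on a single account: daily vs monthly active users.
    dau, mau = 45, 90
    print(f"DAU/MAU: {dau / mau:.0%}")  # DAU/MAU: 50%

What counts as a "healthy" DAU/MAU ratio is a per-account judgment call; the point is to track it over time and flag declines before the renewal conversation.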


Who You're Selling To

Primary Buyers:

  • Head of ML Platform / ML Infrastructure Lead (technical decision-maker)
  • VP Engineering or CTO (budget owner for expansions)
  • Individual ML Engineers and Data Scientists (end users who influence renewals)

What They Care About:

  • Reliability at scale: Does it work when we're tracking thousands of experiments and terabytes of data?
  • Integration friction: How easily does this fit into our existing MLOps stack (Kubernetes, AWS/GCP, CI/CD pipelines)?
  • Team collaboration: Can our distributed team (researchers, engineers, product) actually use this together effectively?
  • ROI vs internal tools: Is this cheaper and better than maintaining our homegrown tracking system?

Requirements

  • Hands-on ML/AI experience—you've trained models, run experiments, dealt with ML infrastructure challenges. You're not learning this on the job.
  • Fluent German and English (you're covering Germany, and some customers will prefer German for QBRs)
  • 5+ years in technical CSM, solutions engineering, or ML engineering roles with customer-facing responsibilities
  • Experience with enterprise ML stacks: PyTorch/TensorFlow, cloud platforms (AWS/GCP/Azure), containerization, distributed training
  • Track record of driving net revenue retention and customer expansion in technical products
  • Comfortable with ambiguity: W&B is moving fast, the EMEA structure is still scaling, and you'll need to figure things out independently