By Robert Koehler, Scale GTM Advisor
Working in the trenches with sales teams, I’ve found that most reps still struggle to conduct meaningful discovery and uncover underlying business context, personal stakes, and internal friction that drive real decisions. Discovery is where good sellers build the most leverage and value. We win or lose deals here. Yet despite the mountain of call recordings we now store in systems such as Gong, Chorus, and Attention, sales managers often spend little time reviewing them with their sellers. The result is that most sellers rarely receive coaching, and when they do, it’s delivered via one-off, unreinforced training sessions.
This state of affairs calls for a new, AI-powered approach.
Building evaluation criteria based on best practices
AI (in this case, ChatGPT) can be an ideal assistant to increase the speed, quantity, quality, and consistency of discovery analysis, scoring, and coaching. When I first set out to create an AI sales coaching tool, my near-term goal was to use it to increase awareness of strengths/gaps and close specific discovery skill gaps for sellers, managers, and leaders. Longer-term, I hoped to improve stage-to-stage conversion, boost win rates, and shorten post-discovery time-to-close for our portfolio companies.
Based on several years of experience reviewing and scoring discovery calls, I partnered with ChatGPT to critique and refine a call evaluation scorecard using documented best practices (e.g., Challenger, SPIN, MEDDPICC; experts like Dave Brock, Keenan, Kyle Norton).
The core prompt included the following elements (access the full sales prompt here); a minimal sketch of wiring them into an LLM call follows the list:
- Context: "You are an expert sales coach…"
- Evaluation model: discovery broken into clear, observable steps aligned to best practices.
- Scorecard: weighted points per step (attached to the prompt), with the 20-point total reported as a percentage.
- Output: a concise two-page report with seller, account, and buyer roles and dates; a headline summary; Top 3 strengths; Top 3 areas to improve; and coaching recommendations.
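If you want to prototype this before buying tooling, a minimal sketch of wiring the prompt to an LLM might look like the following. The OpenAI Python SDK, the gpt-4o model name, and the file names are my assumptions, and the system message and instructions are abbreviated stand-ins for the full prompt linked above.

```python
# A sketch only: the SDK, model name, and file names are assumptions, and the
# rubric/instructions here are abbreviated stand-ins for the full prompt.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SYSTEM_CONTEXT = "You are an expert sales coach evaluating B2B discovery calls."

def score_call(transcript: str, scorecard: str) -> str:
    """Return the concise discovery report described in the prompt outline."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": SYSTEM_CONTEXT},
            {
                "role": "user",
                "content": (
                    "Evaluate the discovery call below against the attached scorecard.\n"
                    "Report: seller/account/buyer roles and dates, a headline summary, "
                    "Top 3 strengths, Top 3 areas to improve, coaching recommendations.\n\n"
                    f"SCORECARD:\n{scorecard}\n\nTRANSCRIPT:\n{transcript}"
                ),
            },
        ],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    with open("transcript.txt") as t, open("scorecard.md") as s:
        print(score_call(t.read(), s.read()))
```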
And here is the 20-point scorecard I put together:
| Category | Sub-category | Description | Points |
| --- | --- | --- | --- |
| Opening & Framing (2 pts) | Rapport & Icebreaker | Brief, natural open | 0.5 |
| | Meeting Intent & Agenda | Clear purpose, agenda, outcome | 1.0 |
| | Prospect's Reason for Interest | Trigger for the meeting | 0.5 |
| Problem Discovery & Context (7 pts) | Business Goals & Outcomes | Target outcomes/metrics | 1.5 |
| | Current Approach | Process/tools/workflow today | 1.0 |
| | Pain Points & Challenges | Friction, inefficiencies, risks | 1.5 |
| | Business Impact / Quantification | Time, cost, errors, risk | 1.5 |
| | Alternatives / Competition | Who/what else they're considering | 0.5 |
| | Evaluation Criteria | What matters, how they'll decide | 0.5 |
| | Priority & Timeline | Urgency, sequencing, timing | 0.5 |
| Stakeholder & Decision Process (3 pts) | Prospect's Role | Role in the decision | 0.5 |
| | Decision Process & Stakeholders | Who/how/governance | 1.5 |
| | Multithreading Opportunity | Pathways to execs/business buyers | 1.0 |
| Positioning & Value Alignment (4 pts) | Tailored Positioning | After discovery, aligned to outcomes | 1.5 |
| | Value vs. Features | Business impact over feature dump | 1.0 |
| | Relevant Insights | Patterns from similar companies/personas | 1.0 |
| | Active Listening & Mid-Call Recap | | 0.5 |
| Closing & Next Steps (4 pts) | Debrief Question | "How does this compare to today?" | 1.0 |
| | Recap & Agreement | Confirm takeaways | 1.0 |
| | Clear Next Step | Aligned to their buying motion | 1.0 |
| | Schedule Next Step Live | | 1.0 |
Of course, you can adjust the scorecard weighting to your organizational preferences. My model gives "Problem Discovery & Context" the heaviest weight because missing the pain, impact, and decision dynamics kills conversion. The "Stakeholder & Decision Process" elements might matter more in enterprise selling, where multiple stakeholders are key.
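If you keep the rubric in code rather than a document, reweighting it is a small change. Here is a minimal sketch; the sub-categories mirror the table above, and the enterprise-style override at the end is purely illustrative, not a recommendation.

```python
# Scorecard weights as plain data so they can be tuned per organization.
# Sub-categories mirror the table above; the override below is illustrative only.
SCORECARD = {
    "Opening & Framing": {
        "Rapport & Icebreaker": 0.5,
        "Meeting Intent & Agenda": 1.0,
        "Prospect's Reason for Interest": 0.5,
    },
    "Problem Discovery & Context": {
        "Business Goals & Outcomes": 1.5,
        "Current Approach": 1.0,
        "Pain Points & Challenges": 1.5,
        "Business Impact / Quantification": 1.5,
        "Alternatives / Competition": 0.5,
        "Evaluation Criteria": 0.5,
        "Priority & Timeline": 0.5,
    },
    "Stakeholder & Decision Process": {
        "Prospect's Role": 0.5,
        "Decision Process & Stakeholders": 1.5,
        "Multithreading Opportunity": 1.0,
    },
    "Positioning & Value Alignment": {
        "Tailored Positioning": 1.5,
        "Value vs. Features": 1.0,
        "Relevant Insights": 1.0,
        "Active Listening & Mid-Call Recap": 0.5,
    },
    "Closing & Next Steps": {
        "Debrief Question": 1.0,
        "Recap & Agreement": 1.0,
        "Clear Next Step": 1.0,
        "Schedule Next Step Live": 1.0,
    },
}

def total_points(card: dict) -> float:
    return sum(sum(subs.values()) for subs in card.values())

def to_percent(earned: float, card: dict) -> float:
    """Convert raw points earned on a call to the percentage used in reports."""
    return round(100 * earned / total_points(card), 1)

# Example: an enterprise team might upweight multithreading; to_percent()
# keeps reports comparable by normalizing against the new total.
SCORECARD["Stakeholder & Decision Process"]["Multithreading Opportunity"] = 2.0
print(total_points(SCORECARD), to_percent(14.5, SCORECARD))
```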
The prompt generates three report types (sample output report here):
- Individual call discovery analysis
- Individual seller analysis across multiple calls
- Team-level discovery report for the CRO
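All three report levels can roll up from the same per-call scores. Here is a minimal aggregation sketch, assuming you log each scored call's percentage alongside the seller's name; pandas and the column names are my choices for illustration, not part of the original workflow.

```python
# Roll per-call scores up into seller-level and team-level views.
# The call log below is illustrative; in practice it would come from your
# scoring pipeline or a CRM export.
import pandas as pd

calls = pd.DataFrame(
    {
        "seller": ["Ana", "Ana", "Ben", "Ben", "Ben"],
        "call_date": pd.to_datetime(
            ["2025-05-01", "2025-05-08", "2025-05-02", "2025-05-09", "2025-05-15"]
        ),
        "score_pct": [62, 71, 55, 58, 64],
    }
)

# 1) Individual call analysis: one row plus the LLM-written report.
# 2) Individual seller analysis across multiple calls:
per_seller = calls.groupby("seller")["score_pct"].agg(["count", "mean", "min", "max"])
# 3) Team-level view for the CRO:
team = calls["score_pct"].agg(["count", "mean"])

print(per_seller.round(1))
print(team.round(1))
```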
Truly scalable coaching has never been easier
I implemented this approach at one Scale portfolio company and immediately saw massive changes:
- Quantitative changes (before → after):
  - Calls evaluated (scored + documented): 12 → 50
  - Cycle time (eval + report): 2–3 weeks → 2 days
  - Follow-on coaching sessions: 1 → 8
  - Time to enablement recommendations: 3.5 weeks → 2 days
- Qualitative changes:
  - Speed, structure, objectivity: A clear rubric and consistent scoring raised trust in the coaching.
  - Seller experience: Live, face-to-face coaching focused on at most three priorities for improvement. Sellers felt a real investment and asked for more.
  - Behavioral shifts: Several sellers cut talk time by 10–15%, asked more and better questions, and listened more.
  - Gamification: Sellers checked their scores after calls and self-assessed.
We also walked away with a couple of learnings. First, humans in the loop remain essential: managers should review summaries and recommendations to match company language and culture. Second, managers should focus on trends and scoring across multiple calls; using relative movement and comparative performance to guide enablement is much more effective than focusing on absolute figures.
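One simple way to keep the conversation on relative movement is to compare each seller's latest scored call against their first. A small sketch, reusing an illustrative call log like the one in the aggregation example:

```python
# Relative movement per seller: latest scored call minus first scored call.
# Scores are illustrative; the point is the delta, not the absolute numbers.
import pandas as pd

calls = pd.DataFrame(
    {
        "seller": ["Ana", "Ana", "Ben", "Ben", "Ben"],
        "score_pct": [62, 71, 55, 58, 64],
    }
)

delta = calls.groupby("seller")["score_pct"].agg(lambda s: s.iloc[-1] - s.iloc[0])
print(delta.rename("change_since_first_scored_call"))  # positive = improving
```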
Next steps to get started
- Automate the workflow. Integrate the master prompt and LLM with your recording platform (Gong/Chorus/Attention) and your CRM so scoring and reporting run daily and automatically, shrinking the gap from call to coaching (a stubbed sketch follows this list).
- Deeper questioning analysis. Use the follow-up prompt we developed to break down question types and quality (single open-ended, expansion prompts, reframes) and coach sellers to ask better questions.
- Evidence-based enablement. Build training from real call examples of what good looks like and what to avoid, aligned to the top three team gaps.
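For the automation step, the daily loop might look something like the sketch below. The two helper functions are hypothetical stubs standing in for your recording platform's export API and your CRM's notes API; neither is a real endpoint, and score_call is the function from the first sketch.

```python
# Hypothetical daily loop: pull yesterday's discovery transcripts, score them,
# and attach the report to the CRM record. Both helpers are stubs, not real
# Gong/Chorus/Attention or CRM endpoints.
from datetime import date, timedelta

from scoring import score_call  # hypothetical module holding the first sketch

def fetch_transcripts(for_date: date) -> list[dict]:
    """Stub: return [{"call_id": ..., "seller": ..., "transcript": ...}, ...]."""
    raise NotImplementedError("Wire this to your recording platform's export API")

def attach_report_to_crm(call_id: str, report: str) -> None:
    """Stub: attach the coaching report to the matching opportunity/contact."""
    raise NotImplementedError("Wire this to your CRM's API")

def run_daily_scoring(scorecard: str) -> None:
    yesterday = date.today() - timedelta(days=1)
    for call in fetch_transcripts(yesterday):
        report = score_call(call["transcript"], scorecard)
        attach_report_to_crm(call["call_id"], report)
        print(f"Scored call {call['call_id']} for {call['seller']} ({yesterday})")
```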
For help standing up AI sales coaching for your team, or access to the deeper questioning analysis, contact Robert Koehler (Scale GTM Advisor).