As AI becomes standard in the workplace, talent leaders have stopped asking “should we use AI?” and have started asking “how do we actually make this useful?”
As general-purpose tools like ChatGPT, Claude, and Copilot become more powerful, it’s tempting to believe we can simply upskill teams into AI experts on top of their core competencies and prompt our way to solutions. But for talent and culture functions—which sit at the high-stakes intersection of employee performance, organizational strategy, and messy internal systems—this “DIY” approach is hitting a wall.
Today, most organizations are evaluating three potential paths for leveraging AI in Talent and Culture work:
- Direct Use: Using general-purpose AI and providing as much manual context as possible.
- Internal Builds: Asking engineering or talent teams to build custom tools on top of general models.
- Purpose-Built Systems: Buying solutions designed specifically for talent decisions and workflows.
Whether you’re prompting ChatGPT directly or building something custom on top of it, the same limitations consistently emerge.
It Doesn’t Know What It Doesn’t Know
Talent leaders don’t operate in a vacuum. Every decision is shaped by shadow policies, unspoken norms, relationship dynamics, and historical context. None of that lives cleanly in a single data source, and much of it isn’t documented anywhere at all.
General-purpose AI can only work with the context you remember to give it. It has no framework for knowing what’s missing. It can’t prompt you to clarify whether your compensation philosophy has shifted or if a recent restructuring created hidden tensions. It relies entirely on your ability to do the diagnostic work yourself. And without very specific prompting, a general-purpose tool has no way to know which context matters most or how to resolve conflicting inputs.
The result? Talent leaders end up in a cycle of “prompt fatigue”: re-explaining context and translating generic outputs into something usable. Partial context always produces partial insight.
Purpose-built systems, by contrast, are designed around the specific context talent decisions require. They know which questions unlock insight and can surface the gaps you didn’t realize mattered.
Trained On Everything, Calibrated To Nothing
Effective talent decisions require a reference class. Imagine you have five minutes to answer a question on a specific topic. Would you rather I opened the door to a huge library or handed you a book open to the right page?
General-purpose AI tools are the library. They are trained on the “lowest common denominator” of the internet: Reddit, public blogs, and generic HR templates. But they fundamentally lack one critical element talent leaders need: visibility into what other companies are actually doing, especially when it comes to non-public talent practices and benefits.
Questions like “Is this pattern meaningful?” or “Are we overreacting?” can’t be answered in isolation. A tool that truly helps talent leaders make better, more strategic decisions needs a reference class: one built from thousands of similar questions, data points, and outcomes. That’s what allows leaders to calibrate decisions, understand tradeoffs, and learn from patterns that don’t exist inside a single organization.
Without benchmarks, AI can tell leaders what’s possible. But it can’t tell them what’s normal, leading, or risky—or model outcomes based on real data.
When your engagement scores drop 8 points in Engineering, is that a crisis or a correction from an unsustainable high? General-purpose AI will tell you “8 points is significant.” A purpose-built tool—trained on thousands of real-world engagement cycles—can tell you, “This is a typical post-restructuring correction; watch for X and Y as leading indicators of a real crisis.” Without benchmarks, you aren’t gaining insight; you’re just getting a summary of the average.
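To make “calibration” concrete, here’s a minimal sketch in Python of how a reference class turns a raw score change into a calibrated signal. Everything here is hypothetical and invented for illustration—the benchmark deltas, the thresholds, the verdicts; a real purpose-built system would draw on thousands of observed engagement cycles, not ten made-up data points.

```python
# Minimal sketch: calibrating an observed engagement-score drop against a
# hypothetical reference class. All numbers below are invented for illustration.
from statistics import mean, stdev

# Hypothetical benchmark: score changes observed at peer companies in the
# two quarters following a restructuring.
post_restructuring_deltas = [-11, -9, -8, -6, -5, -4, -3, -2, 0, 2]

observed_delta = -8  # our Engineering org's drop

mu = mean(post_restructuring_deltas)      # benchmark mean
sigma = stdev(post_restructuring_deltas)  # benchmark spread
z = (observed_delta - mu) / sigma         # how unusual is our drop?

# Illustrative thresholds only; a real system would tune these empirically.
if abs(z) < 1.0:
    verdict = "typical post-restructuring correction"
elif abs(z) < 2.0:
    verdict = "worth monitoring against leading indicators"
else:
    verdict = "an outlier that warrants immediate attention"

print(f"A {observed_delta}-point drop is {z:+.1f} SD from the benchmark "
      f"mean of {mu:.1f}: {verdict}")
```

The arithmetic is trivial; the value lies in the reference distribution itself. Without it, there’s no principled way to label the same 8-point drop “typical” or “alarming.”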
Optimized For Average, Not Excellent
General-purpose AI is optimized to produce answers that are broadly acceptable, not excellent. Without expert judgment layered on top, AI doesn’t strengthen your practices; it regresses them toward a mean of mediocre HR policy. Even when the output is technically correct, the guidance is often overly cautious or slightly outdated. This isn’t a flaw in the technology; it’s a natural consequence of being trained on the entire internet at once.
Real talent work is about making deliberate choices about what “good” looks like now, for your specific culture. That wisdom isn’t found in a manual; it’s earned through pattern recognition that, as of today, can only be gained by doing the work. It’s the ability to know which training approaches actually shift behavior versus those that merely create “check-the-box” fatigue for employees. It’s understanding why a performance management process that works in a high-growth startup will almost certainly fail in a mature, global enterprise.
This is the kind of judgment general-purpose AI can’t deliver because it has never done the work. It can’t distinguish a true best practice from a common mistake. Someone still has to decide when the system should raise a flag, when it should stay quiet, and how guidance must evolve as laws and social norms shift. That level of nuance requires subject-matter expertise built through years of hands-on experience—something a prompt can’t replace.
Built For Clean Inputs, Not Messy Reality
One of the hardest parts of applying AI to talent work isn’t the technology itself. It’s the reality of the data. General-purpose AI assumes clean inputs. Talent teams live with messy reality.
Talent data is fragmented:
- Some of it is structured: HRIS records, compensation data, headcount metrics.
- Some of it is semi-structured: survey results, performance feedback, exit data.
- Much of it is unstructured: policies, handbooks, internal documents, emails, all-hands discussions, and cultural norms that live nowhere except in people’s heads.
Very little of this data is centralized. Almost none of it is clean. And much of the most important context—the why behind decisions, the exceptions, the history—isn’t captured in any system at all.
General-purpose AI can impose order on pieces of data in the moment. But when you close the chat window, that understanding evaporates. It certainly isn’t memorialized for others on the team working in different chat instances. There’s no durable system of record, no shared context, no institutional memory.
So instead of getting leverage, talent leaders spend time stitching together inputs across systems, re-explaining the same information in different prompts, and sanity-checking outputs against what they already know to be true. Then, once they’re confident the output is accurate, they still need to socialize it with their teams and feed it back into other systems. The result isn’t strategic insight. It’s repeated overhead.
The Hidden Tax of DIY AI
We’re early in this shift, but I’m already seeing a consistent pattern: teams start with excitement, then slowly realize they’ve created a second job.
Even if you solve for context, benchmarks, and expertise, there’s something that gets consistently underestimated: what comes after the prototype. AI has lowered the cost of building, but it hasn’t eliminated the cost of maintaining. Building a prototype takes a weekend. Maintaining it in production when your HRIS changes, your policies evolve, and someone gets a biased output takes resources most talent teams don’t have.
Someone has to decide what good looks like. Which outputs are acceptable. When the system should stay quiet and when it should raise a flag. That judgment work never ends, and it almost always lands on people who already have a full-time job leading talent strategy.
Every new team member has to learn the prompts. Every policy change requires updating workflows across multiple people. Every edge case surfaces the fact that no one owns the system’s judgment. Every audit trail is a patchwork of Slack threads, ChatGPT exports, and someone’s recollection of what they asked.
When organizations build these systems internally, they almost always under-invest in that long-term maintenance layer because it isn’t anyone’s core job. Models change. Data drifts. Users adapt. Regulations evolve. Trust can be lost much faster than it’s gained.
Recently, a Chief People Officer said something to me that stuck: “My team is tired of trying to cobble this all together with Claude and Zaps. Who wouldn’t just pay for something?” The frustration is understandable, because the “cheaper” DIY alternative is quietly expensive: time spent re-explaining context, validating outputs, building and maintaining tools that sit outside anyone’s core role, and managing risk when decisions affect real people. For talent leaders, the real question isn’t whether AI can help. It’s where their judgment is best spent.
Stop Building Infrastructure, Start Making Decisions
I expect we’ll see a shift away from general-purpose experimentation and toward purpose-built platforms that scale expert judgment. Not because of hype, but because of pragmatism.
The difference isn’t just more context. It’s carrying forward the judgment of experts who’ve seen thousands of similar situations and updating that judgment as practices, regulations, and expectations change. That’s not something you can prompt-engineer. It’s something you build with purpose and maintain continuously, which is exactly what specialized providers can do at scale and internal teams cannot.
For talent leaders, the real strategic question isn’t “Can we build this?” It’s “What do we need to own?”
You need to own your strategy, your judgment about what’s right for your people, and your organizational context. You don’t need to own the infrastructure for normalizing data, maintaining benchmarks, encoding expert judgment, and managing AI risk. Trying to own all of that typically means you end up owning none of it well.
General-purpose AI can make work move faster. But talent leaders aren’t accountable for speed alone. They’re accountable for trust, quality, and outcomes that endure.
The most strategic talent leaders aren’t the ones building the most AI tools. They’re the ones who know exactly where their judgment matters most and who invest in specialized capabilities to handle everything else.
Because the question isn’t whether AI can help talent teams work faster. It’s whether you’re building the capability to make better decisions, or just moving faster toward the same limitations.