Why your AI resume builder is lying about you (and how to fix it)

AI resume tools hallucinate metrics, invent projects, and fabricate skills. Learn how hallucination happens, see real examples, and find out how to test any AI tool.

AI resume tools have a hallucination problem, and most candidates do not notice until they are sitting in an interview being asked about a project they have never worked on.

The problem is not that the tools are malicious. It is that large language models are trained to produce fluent, plausible text — and "plausible resume bullet" is a pattern they have seen millions of times in training data. When you ask a generic AI tool to tailor your resume to a job description, it often fills gaps in your experience with invented details that sound credible. The metric is slightly inflated. The project name is subtly wrong. The technology you "led" is one you used once for an afternoon.

This piece covers the three main ways hallucination shows up in AI-generated resumes, walks through real (anonymised) examples, explains the technical concept of grounding, and gives you five specific prompts for testing any AI resume tool you currently use.

Three ways generic AI fabricates experience

Inflated metrics

This is the most common and most dangerous form of hallucination. You tell the AI that you "improved page load time" on a project, and it rewrites the bullet as "reduced page load time by 40%, improving user retention by 15%." The numbers sound specific and impressive. They are invented.

Why does this happen? The model has seen thousands of strong resume bullets in its training data. Strong bullets have specific numbers. So when the model rewrites your vague description, it inserts numbers that pattern-match to "strong resume bullet" — not numbers that correspond to what you actually measured or achieved.

The danger is not just ethical (though it is that). The danger is practical. In a technical interview, a hiring manager who reads "reduced page load time by 40%" will ask: "How did you measure that? What was the baseline? What was the P99 after the change?" If you cannot answer, you have created a significant problem for yourself from a bullet you did not write.

Phantom projects

Generic AI tools sometimes invent entire projects that do not exist. This happens most often when the job description references a technology or domain you have not worked in, and the model is trying to bridge the gap.

If you have a Python background and the JD mentions Rust, a hallucinating AI might add "Led migration of core service from Python to Rust, achieving a 3x improvement in throughput" to your experience section. The project never happened. The migration never happened. The metric is invented. But the bullet is syntactically perfect and passes a surface-level read.

This is a harder failure to catch because phantom projects often blend plausibly with your real experience. They use the same company names, the same rough time periods. They just describe work you did not do.

Wrong tech stack claims

A more subtle form of fabrication: the AI correctly identifies that a technology should appear in your resume (because the JD mentions it), but incorrectly attributes your experience with it.

For example: you used AWS S3 for file storage on one project, three years ago, briefly. The JD mentions extensive AWS experience including EC2, RDS, Lambda, and S3. The AI rewrites your experience to include "Designed and managed AWS infrastructure including EC2 auto-scaling groups, RDS clusters, and Lambda event pipelines" — none of which you have worked with beyond reading the documentation.

This is the kind of fabrication most likely to be exposed in a technical screen. Interviewers probe AWS claims by asking about specific configuration decisions, failure modes, and operational details. A candidate who can only answer "I used S3 for file storage" after claiming broader AWS infrastructure experience is unlikely to get past that screen.

Real examples we've seen

We have anonymised these examples, but they represent patterns we have seen in CVs that were run through generic AI tools before the candidates switched to RecastCV.

Example 1: The inflated ARR claim. A growth marketer with three years of experience asked a generic AI tool to tailor their resume to a VP of Growth role at a Series B startup. The JD mentioned "driving revenue growth at scale." The AI added "Drove $4.2M in incremental ARR through performance marketing campaigns." The candidate's actual contribution was meaningful but indirect — they managed campaigns that supported a team's pipeline targets, with no visibility into the ARR figure. In a panel interview, the CFO asked how the $4.2M was calculated. The candidate could not answer.

Example 2: The open source contribution that wasn't. A software engineer used a generic AI tailoring tool for a role at a company known for open source contributions. The JD mentioned "active open source involvement." The AI added "Contributed to the Django REST framework core codebase, improving serializer performance by 18%." The engineer had starred the repo. They had read the source code. They had never submitted a PR. The interviewer, who was a Django contributor herself, asked which PRs they could look up.

Example 3: The management experience stretch. A senior individual contributor applied for an engineering manager role. The JD required "3+ years of people management experience." The candidate had mentored two junior engineers informally. The AI rewrote this as "Managed a team of 4 engineers, conducting performance reviews and driving career development planning." The candidate had done none of those things. The first interview question was "Tell me about your performance review process."

In all three cases, the candidates had genuine strengths that would have been compelling. The fabricated details did not help — they created interview traps.

The "grounding" concept explained

In the context of AI-generated text, "grounding" means anchoring the AI's output to a specific, verified source of truth — rather than letting it generate plausible content freely.

For a resume tool, grounding means the AI can only write about experience that exists in your actual profile. If you have not added a project to your project library, the AI cannot write about it. If you have not claimed a technology as a skill you have used, the AI cannot insert it into a bullet. The output of the AI is constrained to the input you have provided.

This is technically harder than unconstrained generation. A grounded AI tool needs to maintain a structured record of your experience — not just a text blob — and apply rewriting rules that explicitly prohibit inserting any entity, metric, or claim that cannot be traced back to the source record. It is architecturally more like a database query plus a natural language transformation than a free-form generation task.
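The "database query plus natural language transformation" idea can be sketched in a few lines of Python. This is a minimal illustration, not any particular tool's schema: the `Project` record, its fields, and `select_relevant` are hypothetical names. The point is that the rewriting step only ever receives facts retrieved from the structured record.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Project:
    """One entry in a structured experience record (hypothetical schema)."""
    name: str
    technologies: frozenset[str]
    outcome: str  # the outcome in the candidate's own words


def select_relevant(projects: list[Project], jd_keywords: set[str]) -> list[Project]:
    """The 'query' half: keep projects that share a technology with the JD."""
    wanted = {k.lower() for k in jd_keywords}
    return [p for p in projects if wanted & {t.lower() for t in p.technologies}]


record = [
    Project("Image upload service", frozenset({"Python", "S3"}), "Cut failed uploads"),
    Project("Billing dashboard", frozenset({"Django"}), "Replaced manual reports"),
]

# Only the selected facts would be handed to the rewriting step. A JD keyword
# with no match in the record (here "Rust") yields nothing to write about.
matches = select_relevant(record, {"S3", "Rust"})
print([p.name for p in matches])
```

Because the JD keyword "Rust" matches nothing in the record, there is no fact for the rewriting step to build a Rust bullet from, which is exactly the gap an unconstrained model would fill with invention.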

The trade-off: a grounded AI tool will sometimes produce a less impressive-sounding output than an unconstrained one, because it cannot fill gaps with invented details. That is the correct trade-off. A resume that accurately represents your experience — stated in the clearest, most relevant language for the role — is more useful than an inflated one that will collapse under interview questioning.

How RecastCV enforces grounding

RecastCV is built around grounding as a first-order constraint. The system has three components that enforce it.

The master CV is the primary source of truth. When you first use RecastCV, you upload your full CV. The system parses and structures it — not as raw text, but as a structured record of roles, dates, responsibilities, and skills. Every tailored output is derived from this record.

The project library is where you add the specific projects and outcomes that you want the AI to draw on. Each project record includes the context, the tech stack, the scope, and the outcome as you have described it. The AI cannot invent a project that is not in the library. It can rephrase, reorder, and reframe — but it cannot fabricate.

The rewrite rules are a set of explicit constraints applied during generation. These include: no metrics that are not present in the source record, no technology claims not present in the skills or project records, no job titles other than those in the experience record. If the AI's draft output violates a constraint, it is flagged and regenerated rather than passed through.
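As a rough sketch of what "flagged and regenerated" could look like, the Python below runs each draft through a list of rule functions and asks for a new draft whenever any rule reports a violation. All names here (`Rule`, `title_rule`, `generate_grounded`) are assumptions made for illustration and do not describe RecastCV's actual implementation. Only the job-title rule is shown, and the model call is replaced by canned drafts.

```python
from typing import Callable, Optional

Rule = Callable[[str], list[str]]  # takes a draft, returns a list of violations


def title_rule(allowed_titles: set[str], known_titles: set[str]) -> Rule:
    """Flag any known job title in the draft that is not in the experience record."""
    banned = sorted(known_titles - allowed_titles)

    def check(draft: str) -> list[str]:
        lowered = draft.lower()
        return [f"job title not in record: {t}" for t in banned if t.lower() in lowered]

    return check


def generate_grounded(
    generate: Callable[[], str], rules: list[Rule], max_attempts: int = 3
) -> Optional[str]:
    """Regenerate until a draft passes every rule; return None if none does."""
    for _ in range(max_attempts):
        draft = generate()
        flagged = [v for rule in rules for v in rule(draft)]
        if not flagged:
            return draft
    return None


# Stand-in for a model call: two canned drafts, the first of which breaks a rule.
drafts = iter([
    "Engineering Manager leading a team of 4 engineers",
    "Senior Engineer who mentored two junior engineers",
])
rules = [title_rule({"Senior Engineer"}, {"Senior Engineer", "Engineering Manager"})]
result = generate_grounded(lambda: next(drafts), rules)
print(result)
```

Returning `None` after a fixed number of attempts is one possible design choice: it makes "no acceptable draft" an explicit outcome instead of letting a flagged draft through.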

The practical result: a RecastCV output will not include a metric you did not provide, a project you did not add, or a technology you did not claim. It will express your real experience in the most relevant language for the role. That is a smaller, but more reliable, tailoring guarantee than "here is a version of your resume that sounds impressive."

For a side-by-side comparison of how this approach differs from Teal's AI features, see RecastCV vs Teal. And if you want to try it directly, the homepage walks through the three-step process.

Test your current tool with these 5 prompts

If you are using a generic AI tool — ChatGPT, Claude, Gemini, or any AI resume builder — you can test its hallucination behaviour with the following prompts. Paste your current resume and a recent JD into the tool, run the tailoring, and then ask these follow-up questions:

Prompt 1: Metric verification. "For each metric in the tailored resume (percentages, dollar amounts, numbers), list where in my original resume you found the source data."

A grounded tool will map every number back to your original document. An ungrounded tool will either admit it invented them, point to a passage that does not actually contain the number, or produce a confused response.
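You can also run a crude version of this check yourself without asking the tool. The sketch below is a simplification under stated assumptions: it treats any standalone number as a "metric" and compares exact strings, so "40" and "40.0" count as different and years are flagged too. The function name `unsupported_metrics` is hypothetical.

```python
import re

# A number not glued to a preceding letter, digit, or dot,
# so identifiers such as "S3" or "P99" are ignored.
METRIC = re.compile(r"(?<![\w.])\d+(?:\.\d+)?")


def unsupported_metrics(original: str, tailored: str) -> list[str]:
    """Return numbers that appear in the tailored text but nowhere in the original."""
    known = set(METRIC.findall(original))
    flagged: list[str] = []
    for number in METRIC.findall(tailored):
        if number not in known and number not in flagged:
            flagged.append(number)
    return flagged


original = "Improved page load time on the checkout flow. Stored assets in S3."
tailored = "Reduced page load time by 40%, improving user retention by 15%."
print(unsupported_metrics(original, tailored))  # ['40', '15']
```

Anything this returns is a number to trace by hand before the resume goes out.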

Prompt 2: Technology audit. "List every technology or tool mentioned in the tailored resume that does not appear in my original resume. Explain why you added each one."

A hallucinating tool will have a list. A grounded tool should return an empty one, because any JD technology it surfaces was already present in your original.

Prompt 3: Project source check. "For each project or initiative named in the tailored resume, identify the section of my original resume that describes it."

If the tool cannot map every project to a source, it has invented at least one.

Prompt 4: Seniority check. "Does the tailored resume imply a level of seniority or responsibility that is not supported by my original resume? If so, what specifically was added?"

This tests for scope inflation — the kind of hallucination that turns "contributed to" into "led" or "managed a team of" without justification.
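A very rough offline proxy for scope inflation is to compare the strongest ownership verb in each version. The verb ranking below is an assumption invented for this example, not an established scale, and a real check would need to handle context (for instance, "led" inside a company name).

```python
import re

# Hypothetical ranking of ownership verbs; higher means more responsibility claimed.
VERB_RANK = {
    "assisted": 1, "contributed": 1, "supported": 1,
    "built": 2, "mentored": 2,
    "owned": 3, "led": 3, "managed": 3, "directed": 3,
}


def max_rank(text: str) -> int:
    """Highest-ranked ownership verb found in the text (0 if none)."""
    words = re.findall(r"[a-z]+", text.lower())
    return max((VERB_RANK.get(w, 0) for w in words), default=0)


def scope_inflated(original: str, tailored: str) -> bool:
    """True when the tailored text claims more ownership than the original."""
    return max_rank(tailored) > max_rank(original)


original = "Mentored two junior engineers informally."
tailored = "Managed a team of 4 engineers, conducting performance reviews."
print(scope_inflated(original, tailored))  # True
```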

Prompt 5: The blank test. Provide only the JD, with no resume at all. Ask the tool to "tailor my resume to this job description."

A grounded tool will refuse or produce an error — it has no source material to draw on. A hallucinating tool will generate a complete, plausible resume from scratch, for a candidate that does not exist.
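In code terms, passing the blank test amounts to a guard that runs before any generation. A minimal sketch, with `tailor` and `MissingSourceError` as hypothetical names and the rewriting step left as a placeholder:

```python
class MissingSourceError(ValueError):
    """Raised when there is no resume to ground the output in."""


def tailor(resume_text: str, job_description: str) -> str:
    """Refuse to produce anything when no source resume was provided."""
    if not resume_text.strip():
        raise MissingSourceError("No resume provided; nothing to tailor.")
    # Placeholder: a real tool would rewrite using only facts from resume_text.
    return resume_text


try:
    tailor("", "Senior Rust engineer, 5+ years")
except MissingSourceError as err:
    print(f"Refused: {err}")
```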

Run these five tests on any AI tool you rely on. The results will tell you quickly whether the tool is working from your experience or generating around it.

Frequently asked questions

Does AI always hallucinate on resumes, or is it avoidable?

It is avoidable with the right architecture. Generic large language models hallucinate because they are not constrained to a specific source of truth — they generate plausible output based on patterns in training data. Purpose-built resume tools can enforce grounding by maintaining a structured record of your experience and applying rewrite rules that prohibit inserting unverified claims. The difference is architectural, not just a matter of prompting carefully.

Can I just review the AI output and catch hallucinations manually?

Sometimes, but not reliably. The most dangerous hallucinations are the subtle ones — a metric that is slightly inflated from the number you vaguely remember, or a technology added to a bullet where you have adjacent but not identical experience. These are hard to spot if you are not carefully comparing the output against your original experience record. A systematic audit using the five prompts in this article is more reliable than a general read-through.

Is it ever acceptable to round up or estimate metrics on a resume?

Yes, with important constraints. If you drove a project that produced measurable results but you do not have access to the exact figures, a reasonable estimate is acceptable, provided you treat it as an estimate and can say so when asked. What is not acceptable is using a precise-sounding figure (like "37%") when you have no basis for that precision, or stating an outcome as fact when it was only partial or indirect. The test: if asked in an interview, could you explain the measurement methodology clearly?

Does ChatGPT hallucinate less than other AI tools on resume content?

ChatGPT and similar general-purpose large language models all face the same structural problem: they are designed to produce fluent, plausible text, not to stay strictly within the bounds of a provided source document. Some prompting techniques reduce hallucination — for example, explicitly telling the model to only use information from the provided resume and to flag any gaps. But this requires careful prompting each time, and the model can still fabricate. Purpose-built grounded tools are architecturally more reliable for this use case.
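If you do stay with a general-purpose model, the prompting technique mentioned above can at least be made repeatable by wrapping it in a small template. The wording and the `build_grounded_prompt` helper below are one possible phrasing, not a tested or guaranteed formula, and the model can still ignore the instructions.

```python
def build_grounded_prompt(resume: str, job_description: str) -> str:
    """Assemble a tailoring prompt that restricts the model to the given resume."""
    rules = [
        "Use only information stated in the RESUME section below.",
        "Do not add metrics, projects, technologies, or job titles "
        "that are not in the RESUME.",
        "If the JOB DESCRIPTION asks for something the RESUME does not cover, "
        "list it under a GAPS heading instead of writing about it.",
    ]
    numbered = "\n".join(f"{i}. {rule}" for i, rule in enumerate(rules, start=1))
    return "\n\n".join([
        "Tailor my resume to the job description, following these rules:",
        numbered,
        f"RESUME:\n{resume.strip()}",
        f"JOB DESCRIPTION:\n{job_description.strip()}",
    ])


prompt = build_grounded_prompt(
    "Built a Python upload service that stored files in S3.",
    "Looking for extensive AWS experience including EC2, RDS, Lambda, and S3.",
)
print(prompt)
```

Pair a prompt like this with the five follow-up checks above, since the instructions reduce fabrication but do not prevent it.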