ARTICLE #32
Evaluating creative work with AI: What are the possible risks?


Written by: Samson Mosilily, Senior Regional Manager
African Market, Regional Management, Growth
7 min read · Updated on: May 27, 2024
Artificial intelligence tools like ChatGPT and other large language models can be downright fun to play with, and they often sound convincingly human. But that slick veneer hides a thorny reality: AI isn’t exactly your best buddy when it comes to evaluating creative work. Sure, these models can help draft legal briefs, suggest marketing taglines, or even churn out books. Yet the moment you start leaning on them for deep insights or nuanced judgments in art, design, or conceptual thinking, you are asking your AI assistant to perform a job it wasn’t built for.
Below are the reasons why relying on LLMs as your go-to “creative critic” might be a recipe for bland feedback and skewed results. From fundamental issues like data bias to the inherent lack of genuine creativity, we’ll explore the potential pitfalls you risk if you treat AI’s perspective as the ultimate expert opinion. After all, if real creativity were just about stringing words together, we’d all be Van Goghs and Shakespeares by now, but that’s not how the human spark works.
1. Susceptibility to user bias
AI chatbots, including popular Large Language Models (LLMs), respond heavily to prompts. If you instruct them to tear a piece of work apart, they’ll find flaws, whether genuine or merely invented. Prompt them to praise the same piece, and suddenly they’ll gush about its brilliance.
This swing is particularly problematic for creativity, where nuance and context matter. Researchers studying GPT-3 and its successors have shown how easily these models can be nudged into biased or inaccurate answers just by adjusting the prompt. It’s a bit like asking someone who has never visited an art gallery to be your chief curator: handy in a pinch, but hardly a reliable authority.
Adding to that, there’s also the risk of echo chambers. If an AI’s training data leans heavily positive or negative, the entire analysis can tilt in one direction. The best creative work thrives on balanced feedback, not a barrage of praise or faultfinding that overlooks the finer points.
2. Inaccurate and unreliable algorithms
AI limitations often surface in the form of outdated data or oversimplified metrics. An AI judging a text or design concept looks for patterns from its training data but can overlook distinct cultural references or unusual style choices that don’t fit its set of rules.
In practice, imagine a brand that deliberately breaks grammar norms for impact. An AI tool might flag those stylistic decisions as errors, which means innovative ideas risk being labelled “incorrect.” And if you’re using AI for design, you might see bold or unconventional visuals marked down because the algorithm “thinks” they’re mistakes.
And let’s not ignore that many AI systems rely on training sets that are far from current. A trend that peaked last year could be treated as if it’s brand new, resulting in misguided critiques of what you’re trying to achieve in the present day.
3. Absence of real-world context and perspective
Though AI combs through mountains of data, it hasn’t lived a single day in the real world. It can’t attend an art exhibition, interpret local humour, or experience global events firsthand. This gap in genuine understanding becomes especially clear when evaluating witty references or subtle brand cues that rely on personal or cultural knowledge.
Take humour as an example. Sarcasm, irony, and wordplay all require more than just dictionary definitions to interpret. Without that deeper grasp, the AI’s verdict may be shallow or off the mark.
Consider the difference between passively reading about a festival and dancing in the streets. No matter how comprehensive an AI’s data sets are, there’s no lived experience to guide its judgment. This explains why certain jokes or cultural nods go unrecognised: the model processes words but can’t absorb the human energy behind them.
4. Deficiency in creativity and original thought
Studies on creativity often define it through two lenses: “novelty” (offering something new) and “usefulness” (making that new idea beneficial). AI can churn out text that seems fresh, but beneath the surface, it’s typically recycling existing patterns.
When faced with boundary-pushing concepts like a brand identity that defies category norms, an AI might view that approach as an error. Such “caution” from the AI could hamper bold moves that genuinely excite audiences.

What’s more, real innovation often involves a bit of chaos: the willingness to break patterns and attempt what no one else has tried. AI is a pattern-seeker at heart, built to learn from established norms rather than leap fearlessly into the unknown. If your brief demands a never-before-seen concept, the algorithm’s tendency to preserve the familiar might stifle the very creativity you need.
5. Lack of emotional insight and empathy
Yes, AI can scan text for sentiment, but it doesn’t truly feel anything. That means it can’t empathise with personal struggles or pick up on a brand’s emotional narrative. Where a creative strategist might sense the heartbreak or triumph behind a story and adjust accordingly, an AI simply scores or categorises it.
Crafting effective campaigns often hinges on intangible elements like gut feelings, cultural memory, and brand character. AI can’t replicate these human subtleties, making it ill-suited for any evaluation that hinges on emotional pull.
Think about the early stages of brand development, where a project team might share personal stories to fuel new ideas. An AI can scan the words, but it doesn’t sense that spark of excitement when everyone in the room suddenly realises something is clicking. Without that intangible human response, you end up with suggestions that might look tidy on paper yet lack the warmth that actually draws people in.
6. Undermining human expertise and intuition
Too much reliance on AI chatbot opinions might sideline professionals who’ve spent years honing their craft. Creative directors, copywriters, and brand strategists draw on experience shaped by real-world results, which is far more nuanced than any dataset.
Ditching that hard-earned human expertise in favour of an algorithm’s quick take could not only flatten variety but also lessen the chances of a one-of-a-kind final product. Standardised evaluations might feel “safe,” but safe doesn’t always spark audience interest.
And there’s more at stake than just style. Humans bring a unique ability to pivot on real-time feedback, tapping into intangible factors no dataset can match. Perhaps a veteran designer recalls how a similar approach resonated in a past campaign, or a strategist notices subtle changes in consumer attitudes. Those spur-of-the-moment breakthroughs can turn an average campaign into something memorable. If you rely solely on AI, that alchemy of lived experience falls by the wayside.
7. Ethical bias and considerations
Large AI models train on massive datasets that can contain ingrained biases about gender, race, culture, and more. If an AI offers creative feedback based on skewed data, it might inadvertently reinforce stereotypes. That’s hardly the path to fresh, inclusive brand concepts.
Moreover, an incorrectly trained AI might push you away from a campaign that resonates with a particular group, simply because the data it learned from wasn’t diverse or broad enough. We all know brand work can shape public attitudes, so these blind spots can become real hazards.
Plus, there’s the question of who actually gathered the data in the first place. If the sources are narrow, the AI’s recommendations may steer you toward a single viewpoint. For brands aiming to reach wide-ranging audiences, that’s a surefire way to miss opportunities for growth or alienate key segments.
8. Failure to grasp strategic objectives
Brand expression usually follows a well-defined plan. Whether your project aims to attract a new audience or reposition your existing identity, there’s a strategic story behind each decision. AI often lacks an innate understanding of that high-level plan.
Without context for the brand’s overall aims, an AI’s remarks may feel disconnected, making solid creative choices look like missteps.
And let’s be honest: no matter how comprehensive your brief, there are always nuances that only surface through human dialogue and incremental feedback. Perhaps your main target is a niche market with specific cultural cues, or you want to shift public perception in a subtle way that’s hard to quantify. AI isn’t built to interpret those layers. As a result, it may fixate on surface-level details and overlook the bigger reasons behind your creative direction.
9. Challenges in capturing brand voice
Brands invest countless hours refining their unique brand voice: how they speak, the tone they adopt, and the personality that comes through in writing and visuals. An AI can mimic many styles, but that’s often where it ends. It may unknowingly revert to a default style that aligns with its training data, missing the distinctive patterns or word choices that set your brand apart.
In practice, an AI might misunderstand the difference between playful irreverence and unprofessional slang, leading to a critique that suggests “fixes” to elements that are, in fact, part of the brand’s signature.
This becomes even trickier for smaller or emerging brands still shaping their personality. An AI might decide that a casual yet daring style is “unprofessional,” nudging you toward something bland. Over time, these automated recommendations can sand down the quirks that help you stand out.
Frequently Asked Questions
Is AI biased when evaluating creative content?
Yes, AI can show bias when judging creative content. If the information used to train the AI isn't balanced or diverse, this can influence how it rates things. For example, it might prefer designs or ideas that are already common, which could limit the variety of creative expressions. This could mean that existing trends just keep getting repeated instead of new and different ideas being encouraged.
How does AI affect the creative decision-making process?
AI can certainly speed up decision-making by providing data-backed information. However, it can also make human judgment feel less important. If reliance on AI becomes excessive, the creative work produced might end up too predictable or uninspiring, because algorithms tend to favour established patterns over risk-taking. For a healthy creative process, AI should assist, not replace, individual intuition and expertise.
What happens if companies rely too heavily on AI for creative evaluations?
Relying too much on AI can result in creative work that feels very similar, where designs and concepts become too predictable and follow a formula. AI is built to identify patterns, but often, true creativity comes from disrupting those patterns and thinking in new ways. If businesses depend too much on AI, they might miss out on innovation and not produce designs that really grab attention. It's essential to have human involvement to make sure creative work stays original, innovative, and connects with people on an emotional level.
Final Thoughts
Where creative judgment is involved, AI can certainly provide quick ideas and interesting suggestions, but it cannot serve as the final gatekeeper. ChatGPT and similar tools can churn out clever wordplay or flashy one-liners, but ethical considerations, brand tone, and cultural trends are not things an AI can review with true understanding or empathy. Creativity needs a human hand: a person who can spot nuanced emotional beats, keep a brand narrative consistent, and strike the right balance of inspiration and strategy.
So, while AI is tempting (and undeniably useful in some stages of content creation), relying on it to green-light your boldest ideas or refine your brand’s style is a gamble. Ultimately, human expertise is what keeps creative work feeling fresh, relevant, and authentically connected to culture. Think of AI as a handy sidekick, not the creative director. By letting people handle the final call, agencies and brands can preserve the magic that makes their content truly resonate and stay ahead of the curve as AI technology continues to evolve.
Work with us
Click to copy
work@for.co
- FOR® Brand. FOR® Future.
We’re remote-first — with strategic global hubs
Click to copy
Helsinki, FIN
info@for.fi
Click to copy
New York, NY
ny@for.co
Click to copy
Miami, FL
mia@for.co
Click to copy
Dubai, UAE
uae@for.co
Click to copy
Kyiv, UA
kyiv@for.co
Click to copy
Lagos, NG
lagos@for.ng
Copyright © 2024 FOR®