The volume is staggering.
Millions of pieces of content are posted every minute. Comments, videos, images, listings, reviews. Every platform is under pressure to review, assess, and act in real time. So it's no surprise that many are turning to AI, especially large language models, to help scale Trust & Safety.
And to be clear, AI is useful. It flags patterns. It filters noise. It catches repeat behavior faster than any human team ever could.
But it doesn't understand context.
And in Trust & Safety, context is everything.
What the model sees isn’t always what’s happening
A keyword detection system sees a flagged phrase.
A human moderator sees sarcasm.
A model identifies violence in a video frame.
A reviewer recognizes a self-defense tutorial.
An image classifier reports nudity.
A person sees a medical diagram in a health post.
Without context, AI can’t tell the difference between harm and satire. Between a threat and a quote. Between hate speech and a report calling it out.
AI can be trained on datasets. But it can't draw from shared cultural understanding, tone shifts, or evolving slang in the way humans can. And it doesn't carry the institutional memory of your platform: what you've allowed, what you've removed, and why.
The real risk isn’t false positives. It’s lost trust.
When platforms rely too heavily on automation for moderation, they introduce two forms of risk:
Over-removal
Safe content gets taken down. Users get frustrated. Creators feel censored. Communities become anxious about what they can say and share.
Under-response
Harmful content slips through because it wasn't phrased in a way the model was trained to detect. Coordinated behavior evades detection. Contextual dog whistles go unnoticed.
In both cases, users lose faith in the system. They stop reporting violations. They stop engaging. They stop believing that safety is a real priority.
The solution isn’t more AI. It’s smarter human-AI teams.
At Nectar, we don’t position GenAI as a replacement. We position it as a teammate. AI does the heavy lifting: surfacing potential issues, prioritizing queues, recognizing repeat patterns. But final judgment belongs to people.
Our approach combines:
Machine speed with human sense-making
Flagged items move through review faster, but enforcement happens only after a trained moderator has made the call (see the sketch after this list).
Cultural fluency in moderation teams
We build geo-specific teams that understand the language, humor, and social dynamics of the communities they serve.
Feedback loops that train both ways
Human decisions help retrain models. Model patterns help retrain people. Neither is static.
Escalation paths that don’t collapse under pressure
Our moderators don’t work alone. They escalate edge cases to senior reviewers, legal, or brand teams with clear decision frameworks and documentation.
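To make that handoff concrete, here is a minimal sketch in Python. Every name in it (`triage`, `enforce`, `risk_score`, the 0.5 threshold, the escalation labels) is hypothetical and chosen for illustration; it is not Nectar's production system. The shape it shows is the one described above: the model can only place an item in the review queue and rank it, while every outcome, including escalation, comes from a human decision that is logged so it can feed back into retraining.

```python
from dataclasses import dataclass, field
import heapq

@dataclass(order=True)
class QueueEntry:
    # Only priority is compared, so the heap orders items by model risk.
    priority: float
    item_id: str = field(compare=False)
    reason: str = field(compare=False)

def triage(item_id: str, risk_score: float, queue: list[QueueEntry]) -> None:
    """AI side: surface and prioritize. Nothing is enforced here."""
    if risk_score >= 0.5:  # illustrative threshold, not a real policy line
        # Higher-risk items are seen sooner, but still seen by a person.
        heapq.heappush(queue, QueueEntry(-risk_score, item_id, "model_flag"))

def enforce(entry: QueueEntry, human_decision: str) -> str:
    """Human side: only a reviewer's decision triggers an action."""
    if human_decision == "escalate":
        return f"{entry.item_id}: routed to senior review"  # edge-case path
    if human_decision == "violation":
        return f"{entry.item_id}: enforcement action recorded"
    # "No action" is still captured, so human judgment can retrain the model.
    return f"{entry.item_id}: no action; decision logged for retraining"

# Usage: the model queues candidates, the moderator decides.
review_queue: list[QueueEntry] = []
triage("post-123", risk_score=0.92, queue=review_queue)
triage("post-456", risk_score=0.30, queue=review_queue)  # low risk, never queued
next_entry = heapq.heappop(review_queue)
print(enforce(next_entry, human_decision="escalate"))
```

The point of the design is the asymmetry: the model's score changes how soon something is seen, never whether action is taken.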
The AI arms race can’t replace human values
Every platform is trying to move faster. But speed without sense is dangerous.
When a moderation decision is made, whether it’s a takedown, suspension, or warning, it affects real people. Misjudging context doesn't just impact a post. It impacts trust, identity, expression, and reputation.
That’s why GenAI must remain part of the toolkit, not the decider.
Real safety comes from judgment. And judgment still requires a human brain.
And, often, a human heart.