As artificial intelligence tools flood global markets and everyday use, experts are raising urgent concerns about their potential for harm and the lack of rigors standards to catch issues before they reach users. From hate speech to copyright infringement, AI models have increasingly demonstrated behaviour that ranges from problematic to downright dangerous.
But according to researchers, the problem goes far deeper than individual bugs or oversights. It’s about a system that still lacks the basic guardrails necessary to ensure safety, compliance, and trust.
“After almost 15 years of research, we still don’t know how to get AI to behave the way it’s intended,” said Javier Rando, an expert in adversarial machine learning, in an interview with CNBC. “And it doesn’t look like we are getting better.”
Red Teaming Isn’t Enough Yet
A widely used method for stress-testing AI, known as red teaming, has gained traction across the industry. Borrowed from cybersecurity practices, red teaming involves probing systems for weaknesses before malicious actors can exploit them. But according to researchers like Shayne Longpre, lead of the Data Provenance Initiative, the current scale of red teaming is woefully inadequate.
“There are simply not enough people working in red teams,” said Longpre, who co-authored a paper advocating for broader, more inclusive evaluation processes. He argues that the task of identifying AI model flaws is too complex to be left solely to internal company teams or contractors.
“Some of the flaws in the systems that people were finding required lawyers, medical doctors, or actual scientists to figure out if this was a flaw or not,” Longpre explained. “The common person probably couldn’t or wouldn’t have sufficient expertise.”
His research recommends standardised “AI flaw reports,” public incentives for disclosing flaws, and a transparent system to disseminate such information steps modelled after practices already in place in the software security industry.
Project Moonshot: Singapore’s Proactive Push
One standout example of a policy-backed AI evaluation framework is Project Moonshot, launched by Singapore’s Infocomm Media Development Authority in collaboration with companies like IBM and DataRobot.
The initiative offers an open-source toolkit for large language model evaluation, integrating benchmarking, red teaming, and harm-testing baselines into a single platform.
Anup Kumar, head of client engineering for data and AI at IBM Asia Pacific, said the toolkit helps startups ensure their models can be trusted before and after they’re deployed.
“Evaluation is a continuous process,” Kumar told CNBC. While some startups have embraced the toolkit, he noted there’s room for much wider adoption. “A lot of startups took this as a platform… But I think we can do a lot more.”
Project Moonshot now plans to expand its features to include multilingual and multicultural red teaming and customised assessments for specific industries.
Time for an AI Approval Pipeline?
For some, the solution lies in treating AI development more like other high-stakes industries.
Pierre Alquier, a professor of statistics at ESSEC Business School, Asia-Pacific, drew comparisons to pharmaceuticals and aviation.
“When a pharmaceutical company designs a new drug, they need months of tests and very serious proof that it is useful and not harmful before they get approved by the government,” Alquier noted. “We need that in AI now.”
He argued that many AI models today are too general in scope, making them harder to regulate and predict. “LLMs can do too many things, but they are not targeted at tasks that are specific enough,” Alquier said. As a result, “the number of possible misuses is too big for the developers to anticipate all of them.”
Rando echoed that sentiment. Broad language models, he said, create a moving target for defining “safety,” and tech companies must be more realistic about their capabilities.
“They should avoid overclaiming that their defences are better than they are,” Rando warned.
Building Trust Through Transparency
As the global AI race accelerates, the call from researchers is clear: responsible innovation demands more than technical prowess. It requires transparency, cross-sector cooperation, and public accountability.
Without those pillars, experts warn, the promises of AI may continue to be undermined by risks no one is fully equipped to manage.