AI Has A Data Problem – Causal Data May Solve It
Artificial Intelligence (AI) may be powerful, but a recent Forbes piece argues that its biggest limitation is not the models but the type of data we feed them. In that piece, writer and AI expert Gary Drenik explores the idea that modern AI systems rely too heavily on observational, transaction-based data, and that a shift toward causal data could significantly improve results.
The core problem is that most AI today learns from what has already happened: purchases, clicks, searches, and other behavioral signals. This makes models good at spotting patterns, but weaker when conditions change or when they need to explain why something is happening. The article contrasts this with “causal data,” which aims to capture the drivers of behavior before actions occur, such as intent, expectations, sentiment, and constraints. The argument is that this kind of data gives earlier and more meaningful signals about future outcomes than traditional datasets.
A key point is timing. Transaction data reflects behavior after it happens, while causal signals can appear months earlier. By the time a spending slowdown or a revenue drop shows up in the data, the underlying causes may have been developing for a long time.
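The timing point can be made concrete with a small sketch. The Python below is an illustration only, not a method taken from the Forbes article: it builds synthetic monthly data in which a hypothetical "sentiment" driver leads a "spending" outcome by three months, then checks which lead time gives the strongest correlation. The names sentiment, spending, TRUE_LEAD, and best_lead are assumptions made for this example, and the numbers are generated, not real.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def best_lead(driver, outcome, max_lag):
    """Correlate driver[t] with outcome[t + lag] for each lag in 0..max_lag.

    Returns the lag with the highest correlation and the full score table.
    """
    scores = {}
    for lag in range(max_lag + 1):
        shifted_driver = driver[: len(driver) - lag]
        shifted_outcome = outcome[lag:]
        scores[lag] = pearson(shifted_driver, shifted_outcome)
    best = max(scores, key=lambda k: scores[k])
    return best, scores


# Synthetic data (an assumption for illustration, not real measurements):
# spending in month t echoes sentiment from TRUE_LEAD months earlier, plus noise.
rng = random.Random(0)
TRUE_LEAD = 3
MONTHS = 48
sentiment = [rng.gauss(0, 1) for _ in range(MONTHS)]
spending = [
    (sentiment[t - TRUE_LEAD] if t >= TRUE_LEAD else 0.0) + rng.gauss(0, 0.3)
    for t in range(MONTHS)
]

lag, scores = best_lead(sentiment, spending, max_lag=6)
print(f"strongest relationship at a lead of {lag} month(s)")
for k in sorted(scores):
    print(f"  lead {k}: correlation {scores[k]:+.2f}")
```

In this toy setup the same-month correlation is weak, while the correlation at the built-in lead is strong: a model that only looks at current transactions would see the change later than one that also tracks the leading driver. A lagged correlation like this shows only that one series moves ahead of another; it does not by itself establish that the driver causes the outcome.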
Drenik also highlights a broader issue in enterprise AI: organizations are accumulating massive amounts of data, but much of it is noisy, delayed, or shaped by algorithms themselves. This creates scale without clarity — lots of information, but not always meaningful insight. Causal data, in contrast, aims to reduce guesswork by focusing on measurable drivers of decision-making. That can make models more robust, more interpretable, and more stable when conditions shift. The takeaway is that the next leap in AI performance may not come from bigger models or more data, but from better data — specifically data that explains why outcomes happen, not just what happened.



