Human Data versus AI or Synthetic Data

May 12, 2025
2 min read

Gartner expects that by 2030 synthetic data, meaning data created by AI, will make up the majority of all data. The resulting decline in the share of original human content has several causes:

✴️ We have stopped encouraging original content. It needs to be encouraged and nurtured from childhood through adulthood. Remember that our human default setting is convenience.

✴️ Our dependency on AI is increasing, for a combination of personal, social and cultural reasons.

✴️ We use synthetic data for simulations, on the assumption that it models the real world (this is happening across industries such as healthcare). Often the reason is a paucity of real data or regulation (for example, privacy concerns).

✴️ AI is being automated to flood the internet, our social media feeds and our daily lives with AI-manufactured data (bots, influencers, etc.), largely thanks to new tools such as agentic AI and generative AI.

Meanwhile there is a frantic scramble for original data: through brokered deals for scholarly content, through freemium products, or worse, through systems that have captive audiences. 90% of the world's data was generated in the last two years alone (much of it video). Keep in mind that as more people query LLMs for similar things, the answers will statistically tend toward the mediocre (that is the maths behind the algorithms). We still see the benefits today because original content outweighs synthetic content (in lab settings, models collapsed when they were recursively fed AI-generated data).
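To make the "recursively fed AI data" point concrete, here is a minimal toy sketch. It is not one of the lab experiments mentioned above. It assumes the simplest possible "model", a bell curve fitted to a handful of numbers, and the function name, the sample size and the number of generations are all illustrative choices. Each generation is trained only on data produced by the previous generation, and the spread of the data (a rough stand-in for diversity) tends to shrink.

```python
import random
import statistics


def simulate_recursive_training(generations=200, sample_size=10, seed=0):
    """Refit a Gaussian to its own samples, generation after generation.

    Toy illustration only: the Gaussian stands in for a "model" and the
    standard deviation stands in for the diversity of its output.
    """
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0  # generation 0 stands in for original "human" data
    spreads = [sigma]
    for _ in range(generations):
        # The current "model" generates a small synthetic dataset...
        samples = [rng.gauss(mu, sigma) for _ in range(sample_size)]
        # ...and the next "model" is fitted only to that synthetic data.
        mu = statistics.fmean(samples)
        sigma = statistics.pstdev(samples)
        spreads.append(sigma)
    return spreads


spreads = simulate_recursive_training()
print(f"spread at generation 0: {spreads[0]:.3f}")
print(f"spread at generation {len(spreads) - 1}: {spreads[-1]:.3g}")
```

Because every refit is based on a small sample, a little of the original variety is lost each round and never recovered. Mixing fresh "human" samples back in at each generation would be one way to slow that loss in this toy setup.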

💡 We are at an inflection point. We need policies that safeguard the human skills needed to generate original content and that protect this content from exploitation. By generating original content, humans retain their critical thinking skills, value them and, more importantly, help feed the AI machines that operate on a garbage-in, garbage-out model.

While no one knows the exact percentage of synthetic data, it is important to realize that the share of original human data is decreasing and that the AI machine that feeds on human creativity is increasingly running on an alternate "fuel".
