The Inbreeding of Nutrition Data
Garbage in, garbage out. Generative AI is ruining nutritional data.
As artificial intelligence continues to evolve, it becomes increasingly clear that obtaining enough data is the most challenging part of building powerful AI. Tech companies like OpenAI, Microsoft, Google, and Meta are all diving headfirst into this revolution with tools like ChatGPT, Microsoft Copilot, and Google Gemini.
But speed creates problems. As the internet becomes flooded with AI content, new human-generated input grows scarce, and the data these models learn from becomes far less accurate and reliable.
Content Degeneration
“The earliest large language models (LLMs), were trained on massive quantities of text, visual and audio content, typically scraped from the internet. We’re talking about books, articles, artworks, and other online content created by humans,” says Forbes.com.
But since ChatGPT was released, generative AI has been everywhere. Futurism.com says that as these models scrape information from every corner of the internet, they inadvertently incorporate AI-generated content into their training sets. It is a type of inbreeding, where you end up with increasingly synthetic, less reliable, and all-around less creative outputs.
Gartner predicted that by 2024, 60% of all data used in the development of AI would be synthetic rather than real. And with enough AI-spun stuff, it seems, this self-consumption will eventually break the model's digital brain. In tests, some models made it through only five rounds of training on synthetic content before cracks began to show in their outputs.
AI researcher Jathan Sadowski dubbed the phenomenon "Habsburg AI," a reference to Europe's famously inbred royal family.
Nutrition Data
Imagine AI trying to generate a complete picture from another generated picture. That's literal AI inbreeding: you get copies of copies, with mistakes compounding at every step.
But I’m particularly concerned about nutrition information. Without sufficient high-quality, human-generated data, it’s conceivable that we might soon hit a brick wall with “dumbed down” AI-generated content. There's a school of thought that AI models might have already peaked, and are destined to only get dumber.
The implication of this self-sabotage is the widespread stifling of human experience and creativity. For example, a nutritionist might write about overcoming a difficult client issue with updated methods, bringing real value to the article. AI, however, brings no personal inspiration and none of the creative thoughts, feelings, and insights of the human mind to anything it produces. According to a recent study, it copies the same old, non-creative information, whether reliable or mistake-prone.
Here’s how this could happen step by step:
Initial Training: AI models are trained on high-quality, human-curated datasets containing accurate nutritional information.
AI Generation: As these models generate nutrition-related content (e.g., meal plans, nutritional analyses), this AI-generated data becomes part of future training sets.
Error Propagation: Small inaccuracies or biases in the initial AI-generated data can be amplified in subsequent generations. Unlike traditional research, AI-generated data often lacks human validation, so existing errors are reinforced rather than corrected. This creates a self-reinforcing loop of inaccuracies.
Model Collapse: Over time, the model's ability to provide accurate nutritional information deteriorates. This "model collapse" can lead to unreliable predictions and potentially harmful dietary advice.
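For readers who like to see the mechanics, the four steps above can be sketched as a toy simulation in Python. This is only an illustration under loud assumptions, not how any real AI system is trained: the "model" here simply remembers the average and spread of a list of calorie values, the calorie figures are made up for the example, and the names fit, generate, and simulate_collapse are hypothetical.

```python
import random
import statistics

# Made-up "human-curated" calories-per-serving values (illustrative only).
HUMAN_CALORIES = [95, 105, 150, 165, 200, 215, 250, 320, 410, 520]


def fit(values):
    """'Train' a toy model: keep only the mean and spread of the data."""
    return statistics.mean(values), statistics.stdev(values)


def generate(model, n, rng):
    """Have the toy model 'write' n new calorie values from what it learned."""
    mean, spread = model
    return [rng.gauss(mean, spread) for _ in range(n)]


def simulate_collapse(human_data, generations, sample_size, seed=0):
    """Retrain each generation only on the previous generation's output."""
    rng = random.Random(seed)
    model = fit(human_data)          # Step 1: initial training on human data
    history = [model]
    for _ in range(generations):
        synthetic = generate(model, sample_size, rng)  # Step 2: AI generation
        model = fit(synthetic)       # Step 3: errors carried forward, no human check
        history.append(model)
    return history                   # Step 4: inspect how the model drifted


if __name__ == "__main__":
    history = simulate_collapse(HUMAN_CALORIES, generations=500, sample_size=10)
    for gen in (0, 5, 50, 500):
        mean, spread = history[gen]
        print(f"generation {gen}: mean={mean:.1f}, spread={spread:.3g}")
```

In a toy like this, each generation sees only a small sample of the previous one's output, so the spread of values tends to shrink over many generations and the average wanders away from the human starting point. The variety in the original data gets lost, which is the same pattern the steps above describe.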
Conclusion
Nutrition information is not only disappearing from the internet, but it’s also degrading in value. Generative AI giving birth to more AI, like repetitive inbreeding, does not get smarter. It degenerates. The inbred AI becomes more stupid in a self-consuming loop, sort of like a dog not just chasing its tail, but eating it. It’s a sort of artificial idiocy that, in time, could render these AI models pointless, says Discovery.com.
But you don’t have this problem. You’re reading a newsletter curated by me—a nutritionist and personal trainer with 20 years of extensive experience in field practice. Instead of sifting through corrupted data, what you read here you can use and rely on. I’m extremely grateful to all my readers, as you make this labor of love worthwhile.