The Gist
- AI limitations. Generative AI technologies like ChatGPT are essentially advanced phrase predictors, lacking true understanding or learning capabilities.
- Legal quandaries. Generative AI faces lawsuits over copyright infringements and defamation, setting the stage for legal standards in AI use.
- Bias concerns. The rapid development of AI raises questions about training data bias, affecting accuracy and fairness across various applications.
In this fourth and final part of our generative AI history series, we'll look at how generative AI is still very much in its infancy. In this final installment, we'll also explore some of the issues surrounding the use of generative AI technologies and how they can be addressed in the future.
Inside CMSWire's Generative AI History Series
- Part 1 discussed a detailed history of artificial intelligence that spans decades.
- Part 2 discussed the foundations of generative AI, a record-breaking technology that hit the mainstream in November 2022 with OpenAI's release of ChatGPT.
- Part 3 discussed the global AI race that has been raging for years now with countries pouring huge investments into AI to grab hold of leadership.
- Part 4 today discusses the infancy of this transformative technology.
How Intelligent Are Generative AI Technologies Really?
It’s important to keep in mind that while chatbots have made tremendous progress and are quickly becoming useful for certain language tasks such as text summarization and research, they are not inherently intelligent.
Phrase Predictors
At their core, ChatGPT and other generative AI chatbots are essentially phrase predictors. The large language models (LLMs) behind them have ingested and memorized large bodies of text, books and other sources, so they can reasonably predict what response to provide to the questions they are asked. They don't really understand what these word sequences say and don't know what they are talking about. They have limited reasoning capabilities and can still fail to grasp simple language nuances or context. In some ways these chatbots are very sophisticated auto-completion engines. They appear extremely convincing and present as intelligent because they respond in a human-like manner.
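The auto-completion analogy can be made concrete with a toy sketch. The example below (a hypothetical corpus; real LLMs use neural networks over tokens rather than word counts) predicts the next word purely from frequency statistics, with no understanding involved:

```python
from collections import Counter, defaultdict

# Toy corpus standing in for the web-scale text an LLM ingests.
corpus = (
    "the cat sat on the mat . the cat ate the fish . "
    "the dog sat on the rug ."
).split()

# Count which word follows each word (a simple bigram model).
following = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1

def predict_next(word):
    """Return the statistically most likely next word."""
    return following[word].most_common(1)[0][0]

print(predict_next("the"))  # the word that most often follows "the" in the corpus
print(predict_next("sat"))  # likewise for "sat"
```

The model produces fluent-looking continuations of the text it has seen, yet it has no notion of what a cat or a mat is — the same critique the article levels at far larger phrase predictors.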
Chatbots: Not True Generative AI Technologies
So technically, generative AI chatbots are not truly intelligent technologies (or, more appropriately, they are weak AI): they cannot learn or adapt after training, and they can only produce responses derived from statistical patterns in their training data. They also don’t handle complexity or language nuance very well. The technology is still very far from human cognition.
A Long Way From Thinking Machines
Even as LLMs evolve and achieve higher levels of grammatical fluency, we will still be a long way from a machine that can think. So it’s unlikely that the major advances in AI will come from language systems.
There are much more intelligent AI systems operating today in robots, autonomous vehicles and other domains, and these may ultimately be where the big advances in AI occur.
Related Article: Journey Through Time: How Chatbots Have Evolved Over the Decades
You Must Be Hallucinating
While the generative AI chatbot interfaces have advanced tremendously, they are far from perfect.
Only as Powerful as the Source
A model is only as powerful as its source. The LLMs of today are all trained on a variety of input data, but the open internet is a critical source for almost all of them. We all know the internet is filled with misinformation and biased data, so it’s critical that any answers you get from AI chat interfaces be validated and/or interrogated. Additionally, these interfaces provide no references or citations for where the information was obtained, making them difficult to use for research purposes.
Prone to Mistakes
Generative AI chatbot interfaces are prone to making mistakes, which have been coined “hallucinations” when the output seems to veer off in a random direction out of the blue. An AI hallucination is output that sounds plausible but is either factually incorrect or unrelated to the given context. Hallucinations are not restricted to providing wrong answers: they can perpetuate harmful stereotypes, affect critical decisions such as those in healthcare, or create legal exposure by misrepresenting personal facts.
The First Defamation Lawsuit
In June 2023, OpenAI received its first defamation lawsuit over a ChatGPT hallucination. A Georgia radio host claimed that ChatGPT generated a false legal complaint accusing him of embezzling money. The outcome of the case will play a significant role in establishing standards for the emerging field of generative AI.
Reducing Hallucinations
Reducing hallucinations starts with improving the training data to ensure accurate, diverse and unbiased datasets, and with understanding and addressing the inherent biases in that data. AI developers can also test for vulnerability to hallucinations by simulating question-and-answer scenarios that are potentially confrontational. Human review of selected outputs can identify areas where proper context is not being provided. Techniques such as reinforcement learning from human feedback (RLHF) are also critical to leverage.
Related Article: Lost in Translation: AI Hallucinations Wreak Havoc on Big Tech
Stop Using My Data Without Permission
There is a big legal debate looming over whether the data being used to train these numerous generative AI models violates copyright protection. One could argue that “fair use” might apply to a lot of internet content, but as these models get more sophisticated, they are being used to generate code, text, music and art. Almost all of the data these models ingest was created by humans and scraped from the internet or gathered by other means, which creates high exposure to copyright infringement. For copyrighted works, providing in-depth summaries or close approximations of the original starts to flirt dangerously with infringement.
Training Datasets
The training datasets for both GPT-4 and PaLM 2 have been kept relatively quiet. Google’s focus for PaLM 2 was to gain a deeper understanding of mathematics, logic, reasoning and science, so you can assume the training dataset has been adjusted accordingly. OpenAI simply claims that GPT-4 has been trained on publicly available data or data it has licensed. Both companies have become increasingly secretive about their training data.
Training Data Lawsuits
Recently, the first such lawsuit was filed against OpenAI by authors Mona Awad and Paul Tremblay, claiming it breached copyright law by training ChatGPT on their novels without permission. This will be a landmark case in establishing new rules around generative AI and copyright. The case will “likely rest on whether courts view the use of copyright material in this way as 'fair use,'” according to The Guardian.
Even if a work is not copied verbatim, derivative works can still be deeply harmful to creators. Consider Hollie Mengert, an animator based in Los Angeles, who discovered that her entire online art portfolio had been used to train an AI text-to-art model. The AI art being generated looks a lot like Mengert's work.
Ethical AI
This raises the ethics of fine-tuning AI on the work of artists and how that may impact creators if thousands of people can easily generate similarly styled work. Whether it is legal or not, we will have to deal with the ethics and governing principles around such scenarios as AI matures further.
Related Article: Midjourney vs. DALL-E 2 vs. Stable Diffusion. Which AI Image Generator Is Best for Marketers?
Biases Get Amplified as AI Accelerates
The AI race is moving so fast that we may be losing our ability to ensure the data these systems are trained on is accurate and unbiased. AI bias occurs when an AI platform makes decisions that are unfair or prejudicial to particular groups of people. AI bias is not new and has already been shown to exist in many AI applications, including facial recognition, credit reporting and university admissions.
AI Biases
AI bias is not limited to the training data. It can also occur when the data and/or algorithms are produced by teams that are not very diverse, carry inherent biases, have poor cognitive diversity (i.e., they all think alike), lack proper governance or have not instituted proper testing. Data scientists in particular should be educated on what responsible AI looks like so that it can be embedded in the models. Much of the bias stems from the algorithms that compute scores for nebulous concepts such as creditworthiness or candidate suitability. Providing consumers with transparency into these models and algorithms will also help moderate fairness.
Detecting Biases
Detecting bias isn’t always straightforward during the design and analysis phase. The algorithm and data may be prepared correctly, yet certain datasets may unknowingly have built-in bias because of historical trends. One example was an early Amazon AI job-application review tool that filtered out women because the dataset contained a larger number of historical male candidates, which the model wrongly interpreted as a preference. This is why it is critical to test and simulate the model against a reasonably sized data sample to detect these inherent biases.
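One simple screening check used in such testing — sketched here with hypothetical data, since real fairness audits are far richer — is the "four-fifths rule": compare selection rates between groups and flag any ratio below 0.8 for investigation.

```python
# Hypothetical screening outcomes: (group, was_selected) per applicant.
outcomes = [
    ("male", True), ("male", True), ("male", False), ("male", True),
    ("female", True), ("female", False), ("female", False), ("female", False),
]

def selection_rate(group):
    """Fraction of applicants in the group that the model selected."""
    decisions = [selected for g, selected in outcomes if g == group]
    return sum(decisions) / len(decisions)

# Disparate impact ratio: disadvantaged group's rate over the other group's rate.
ratio = selection_rate("female") / selection_rate("male")
print(f"disparate impact ratio: {ratio:.2f}")  # values below 0.8 warrant a closer look
```

A check like this would have flagged the Amazon scenario above long before deployment, which is exactly why simulating the model against a representative sample matters.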
Related Article: Dealing With AI Biases, Part 3: Emergent Biases in Operational Models
AI Has a Powerful Appetite
The technology industry has always been under pressure to reduce its power consumption and corresponding carbon emissions, which currently account for an estimated 2% to 3% of global emissions.
AI & Energy Use
Unfortunately, the energy required to support generative AI is moving this in the wrong direction. AI consumes energy in two main ways: training and inference. Training is the process by which AI models learn by identifying patterns and relationships within large datasets; inference is the work of generating responses once the model is deployed.
To put this in context, training BERT, a model with 110 million parameters, consumed roughly the energy of a round-trip transcontinental flight for one person. More parameters typically means more energy consumption.
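The parameters-to-energy relationship can be roughed out with the commonly cited approximation of ~6 floating-point operations per parameter per training token. The token counts below are illustrative assumptions, not official figures, but they show how compute — and with it energy — scales with model size:

```python
def training_flops(params, tokens):
    """Back-of-envelope rule of thumb: ~6 FLOPs per parameter per training token."""
    return 6 * params * tokens

# BERT-base scale (110M parameters; token count is an assumption for illustration).
bert = training_flops(110e6, 3.3e9)
# GPT-3 scale (175B parameters, ~300B training tokens as publicly reported).
gpt3 = training_flops(175e9, 300e9)

print(f"BERT-scale:  {bert:.2e} FLOPs")
print(f"GPT-3-scale: {gpt3:.2e} FLOPs (~{gpt3 / bert:.0f}x more compute)")
```

Under these assumptions a GPT-3-scale run needs on the order of a hundred thousand times the compute of a BERT-scale run, which is why the flight-per-training-run comparison only gets worse as models grow.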
AI & the Environment
A single large AI model on its own will not have a big impact on the environment, but if thousands of companies develop custom variations of AI models and purpose-built AI bots, and those are used by millions of users, the energy consumption could become an issue.
There are strategies to mitigate the carbon impact of AI, such as leveraging more efficient graphics processing units (GPUs), optimizing data center layouts and sourcing power from renewable energy grids.
Optimizing AI
Data scientists and engineers are also actively working on optimizing AI models to reduce energy consumption using techniques like model compression and quantization.
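Quantization, one of the techniques mentioned above, can be sketched in a few lines. This toy symmetric int8 scheme (illustrative only — production toolkits are far more sophisticated) stores each 32-bit float weight as an 8-bit integer plus one shared scale factor, cutting memory roughly 4x at a small accuracy cost:

```python
# Toy symmetric int8 quantization of a weight vector (illustrative values).
weights = [0.42, -1.37, 0.05, 0.99, -0.61]

scale = max(abs(w) for w in weights) / 127        # map the largest magnitude to 127
quantized = [round(w / scale) for w in weights]   # store as small 8-bit integers
restored = [q * scale for q in quantized]         # approximate reconstruction

max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(quantized)
print(f"max reconstruction error: {max_error:.4f}")
```

Fewer bits per weight means less memory traffic per inference, which is where much of the energy saving comes from.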
There is also a potential upside: as businesses use AI to streamline their operations and business processes, this can directly help optimize energy consumption, reduce waste and improve sustainability.
Related Article: Citizen Activism for More Sustainable Tech
The Next Wave of AI Platforms Will Change Your Business
The success and momentum created by the launch of these LLMs and chatbots over the past nine months has ignited a new global arms race in AI. Every service provider, company and educational institution is now scrambling to figure out what this means for its business.
Embedded Chatbot Assistants
There is great potential for the current wave of chatbots to quickly become embedded as assistive technology in businesses. It’s easy to see how these platforms can boost performance but there needs to be some governance to ensure the accuracy and ownership of the data being produced.
Chatbots & Customer Service
These chatbots have the potential to revolutionize customer service, education, healthcare, personal productivity and content creation. The platforms will continue to evolve with more specialized language models tailored to specific industries or use cases.
AI & Creative Content
Generative AI will also change how creative content is developed, code is written, user experiences are designed and many other disciplines operate.
DALL-E 2 is a second-generation AI system that creates realistic images from a natural-language description. It’s able to combine various concepts, attributes and styles into a single rendered image. Jasper.ai is another AI-based content generation platform that helps content creators, marketers and authors craft a wide variety of content types such as blogs, articles, product descriptions and even long-form content.
Automating User Interface Designs
And automating user interface design may never be the same again. Galileo AI is a platform that creates editable UI designs from a simple text description, allowing you to design faster than ever.
AI & Writing Code
Similarly, writing code no longer needs to be a tedious task. Many tools are emerging such as OpenAI Codex, Tabnine, Salesforce CodeT5 and others to help programmers write higher quality code faster in languages such as Python, Java and HTML.
Other Evolving Tools
Automation-powered tools such as AutoGPT (an AI background agent) will continue to evolve, allowing AI applications to generate their own prompts to execute very complex tasks.
AR & VR
Generative AI may even speed up the development of augmented and virtual reality (AR/VR) environments. It can be leveraged to create more dynamic virtual environments and more lifelike avatars, and it could be a catalyst for realizing some of the components originally envisioned in the metaverse.
Final Thoughts on Generative AI Technologies
So while generative AI, LLMs and chatbots may not reach the level of artificial general intelligence (AGI) in the short term, they will clearly provide competitive advantages to businesses that can quickly adopt, integrate and harness their power.
The AI race is on, so don’t get left behind.