Editorial

'ChatGPT and the Future of AI' Tells the Story of GenAI

By Myles Suer
How did GenAI get here, and where are we headed?

I was excited to get an early copy of “ChatGPT and the Future of AI: The Deep Language Revolution” by Terrence Sejnowski of the Salk Institute and the University of California San Diego. To be honest, many sections of the book are an easy read, fully accessible to a business audience. Other sections are slower going and require some domain knowledge, but they are worth the effort if you want to understand, without heavy mathematics, how deep learning and ChatGPT work and how we got to where we are today. That is something I think most CIOs and CDOs will want to do.

Some Foundations

Deep learning, a field that began in the 1980s when computers were a million times less powerful than today's, has evolved dramatically, especially since 2010. This evolution, says Sejnowski, has enabled breakthroughs in AI. The synergy between deep learning and reinforcement learning first powered modern technologies such as search engines.

However, OpenAI's introduction of ChatGPT brought things to a whole new level. As noted in the press, it reached 100 million users within two months, far surpassing the early growth of tech giants such as Google and Facebook. That said, one thing is clear: ChatGPT is not human. Sejnowski argues that ChatGPT's superhuman ability lies in extracting information from vast data sources. This capability comes from a deep learning architecture called a transformer. Generative pretrained transformer (GPT) models revolutionized language tasks by vastly improving on the performance of simpler deep learning networks. This technology, says Sejnowski, underpins ChatGPT's impressive capabilities.

Self-supervised large language models (LLMs), or foundation models, are surprisingly versatile, capable of performing a wide range of language tasks after being trained simply to predict the next word across a massive corpus of texts. Trained on both online and offline repositories, they generate text word by word, a process that reflects their ability to generalize. You can watch this happening as a model responds to a prompt.
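To make the "predict the next word" idea concrete, here is a deliberately tiny sketch, not anything like a real LLM: a bigram model that counts which word most often follows each word in a toy corpus, then generates text one word at a time. The corpus, function names and greedy decoding are all illustrative assumptions; real models use neural networks, sampling and vastly more context.

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Count, for each word, which words most often follow it."""
    follows = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            follows[prev][nxt] += 1
    return follows

def generate(follows, start, length=5):
    """Greedily emit the most likely next word, one word at a time."""
    out = [start]
    for _ in range(length):
        candidates = follows.get(out[-1])
        if not candidates:
            break
        out.append(candidates.most_common(1)[0][0])
    return " ".join(out)

corpus = [
    "the model predicts the next word",
    "the next word follows the context",
]
model = train_bigram(corpus)
print(generate(model, "the", length=3))  # -> "the next word follows"
```

The point of the toy is the shape of the computation: everything the model "knows" comes from statistics of its training text, and output is produced one token at a time, exactly the word-by-word behavior you can watch in a ChatGPT response.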

While debates rage about whether these models truly understand or possess intelligence and consciousness, the answers remain complex. A chief scientist at a former employer of mine once suggested that applying voltage to enough interconnected wires could create intelligence, but Sejnowski, who has studied biological brains, argues otherwise. Intelligence, it seems, is more than just the sum of connections.

Business Implications

The rise of LLMs has significant implications for multiple industries and professions. While concerns about job loss are valid, Sejnowski suggests that technological change often leads to the creation of new jobs, accompanied by the education and training to help the workforce adapt. He argues that predictions that automation will eliminate jobs are overstated, as digital literacy, problem-solving and adaptability become increasingly critical skills.

Routine tasks will likely be automated, but this shift will make jobs more engaging, with LLMs augmenting rather than replacing human roles. For instance, LLMs can generate text, translate languages, answer questions and summarize information, allowing workers to focus on the creative and strategic aspects of their jobs.

In health care, LLMs can assist doctors by summarizing patient information scattered across medical records, typing notes, guiding bedside manner and even aiding in disease prediction and diagnosis. LLMs also hold potential in drug discovery by predicting new drugs and revolutionizing pharmaceutical development.

Across industries, LLMs can enhance customer service, content creation, data analysis, business intelligence, internal communications, sales, lead generation, training and process automation. As these technologies evolve, they promise to reshape high-value sectors, making them more efficient and innovative.

Intelligence, Thinking and Consciousness

Sejnowski begins by questioning the essence of intelligence and what is required for general intelligence. While ChatGPT does, in its own manner, understand the components and structure of language, it's crucial to recognize that human intelligence encompasses more than just language — a debate that I remember from my cultural anthropology classes in university. LLMs can push us beyond old thinking and concepts, but they share only certain aspects of intelligence with humans, leaving much of the cognitive landscape unique to us.

LLMs, unlike humans, are based on mathematical functions — highly complex ones trained by learning algorithms. These pretrained models can be fine-tuned for specific tasks, making them versatile tools. The concept of priming, where an LLM is given a prompt before interaction, significantly enhances the flexibility of their responses. As these models grow larger and more complex, they exhibit behaviors that intriguingly parallel certain aspects of brain function, suggesting that neural network models are a new class of functions existing in high-dimensional spaces, offering insights into both artificial and human cognition.

Digging Into How Transformers Work

Deep learning approaches language through probabilities and learning, rather than relying on symbols and logic, leading to significant advancements in the 2000s. Unlike older models, LLMs are not given explicit instructions on word meanings or sentence structure. Instead, they discover meaning and syntax through self-supervised learning. Sejnowski traces this evolution back to Frank Rosenblatt’s introduction of the perceptron, a simple model mimicking a single neuron, where the essential components were units and weights.
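Rosenblatt's perceptron, which Sejnowski uses as the starting point of this story, is simple enough to sketch in a few lines. The following toy is my own illustration, not code from the book: a single unit with weights and a bias learns the logical AND function, a linearly separable task that one unit can solve.

```python
def perceptron_train(samples, epochs=10, lr=0.1):
    """Rosenblatt-style perceptron: weighted sum of inputs, thresholded at 0."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            pred = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            err = target - pred  # +1, 0 or -1
            w[0] += lr * err * x1  # nudge each weight toward the answer
            w[1] += lr * err * x2
            b += lr * err
    return w, b

# Learn logical AND: output 1 only when both inputs are 1.
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = perceptron_train(data)
predict = lambda x1, x2: 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
print([predict(x1, x2) for (x1, x2), _ in data])  # -> [0, 0, 0, 1]
```

The "units and weights" Sejnowski mentions are all there is: a weighted sum and a threshold. Everything that follows in deep learning stacks many such units into layers and learns their weights from data rather than from explicit rules.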

As deep learning progressed, the focus shifted to multilayer neural networks with hidden layers. Eventually, a third wave of deep learning emerged with enough computing power to support breakthroughs in image recognition, speech recognition and language translation. These networks are considered deep because their units are organized in multiple layers, processing data through many stages before producing an output.

Interestingly, LLMs grasp the meaning of a word based on the context in which it appears — an association that comes from correlations within vast data sets. In LLMs, words are represented by pretrained embeddings in large vectors, rich with information. While neurons in biological brains are millions of times slower than today’s digital processors, the sheer number of neurons compensates for this speed difference.
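The idea that meaning lives in vectors can be illustrated with a small sketch. The three-dimensional "embeddings" below are invented for the example (real models learn hundreds or thousands of dimensions from co-occurrence statistics), but the mechanism is standard: words that appear in similar contexts end up with similar vectors, measured here by cosine similarity.

```python
import math

# Toy hand-made vectors; real embeddings are learned, not written by hand.
embeddings = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.7, 0.2],
    "apple": [0.1, 0.2, 0.9],
}

def cosine(u, v):
    """Cosine similarity: 1.0 for identical directions, 0.0 for unrelated ones."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# "king" sits closer to "queen" than to "apple" in this toy space.
print(cosine(embeddings["king"], embeddings["queen"]) >
      cosine(embeddings["king"], embeddings["apple"]))  # -> True
```

This is the sense in which an LLM "grasps" a word from context: related words occupy nearby regions of a high-dimensional space, and the geometry of that space encodes the correlations found in the training data.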

Transformers, the foundation of models like ChatGPT, were introduced in 2017. These specialized feed-forward neural networks contain hidden layers trained on enormous amounts of text using back-propagation of errors. Sejnowski compares a transformer to a sophisticated machine in a factory, processing raw materials (words and sentences) into a finished product (meaningful output). During training, the model adjusts weights and biases to minimize the difference between its predictions and the actual outcomes.
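The "adjust weights to minimize the difference" step can be shown at its smallest possible scale. This sketch, an assumption-laden stand-in for real back-propagation, trains a single weight by gradient descent on a squared-error loss; back-propagation does exactly this for billions of weights at once.

```python
def train_step(w, x, target, lr=0.1):
    """One gradient-descent step on one weight, minimizing squared error."""
    pred = w * x                     # forward pass: the model's prediction
    grad = 2 * (pred - target) * x   # d(loss)/d(w) for loss = (pred - target)^2
    return w - lr * grad             # nudge w downhill to reduce the loss

# Learn that input 1.0 should map to output 2.0, i.e. w should approach 2.
w = 0.0
for _ in range(50):
    w = train_step(w, x=1.0, target=2.0)
print(round(w, 3))  # -> 2.0
```

Each pass shrinks the gap between prediction and target a little; repeat across enormous amounts of text and many layers, and you get the factory Sejnowski describes, turning raw sentences into a model that predicts well.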

Deep learning networks are complex mathematical functions that, despite their complexity, remain transparent and open to mathematical analysis. The use of high-dimensional spaces and large data sets has transformed previously intractable problems into solvable ones. Interestingly, the discovery that large network models can reduce training errors when data sets are sufficiently large has upended many of our prior intuitions, revealing new insights into both artificial and human intelligence.

The Risks

The rapid advancement of AI technologies like OpenAI's GPT brings with it significant risks that necessitate robust guardrails to prevent mistakes and misuse. One critical concern is the potential leakage of sensitive company information, making it essential to implement strong protections. The financial cost of powering AI during training is another consideration, with the training of GPT-4 alone costing $100 million over several months.

The debate around superintelligence is heated, Sejnowski says, with varying opinions on its potential. Geoffrey Hinton, for instance, expresses concern that AI's ability to write computer programs could lead to self-enhancement, a step closer to superintelligence. While there have been major advancements in learning, the timeline for achieving superintelligence, general intelligence and even the singularity remains uncertain. Still, many questions about unintended consequences have brought the discussion of regulation to the forefront.

As AI continues to evolve, maintaining privacy and upholding copyrights become increasingly important. There is a clear need to ensure no harm is created, whether it concerns liability, employment or discrimination. The options over how to regulate AI include self-regulation, governmental oversight and stricter copyright enforcement. Whatever the approach, it's crucial the development and deployment of AI prioritize ethical considerations to safeguard society.

Next Steps

LLMs have showcased impressive language competency, but they still have flaws. Studying the structures of the human brain may offer ways to overcome these limitations. The evolution from AI programming to AI learning marks a fundamental shift. Yet learning requires vast computational power, which is now in abundance. Meanwhile, writing AI programs based on rules is labor-intensive and does not generalize, which is why AI learning has prevailed, particularly since 2012.

Looking forward, the long-term direction for AI is to integrate LLMs into larger systems, much like how language became embedded in brain systems over millions of years. Sejnowski emphasizes that LLMs need a longer "childhood" to mature, suggesting that incorporating reinforcement learning during initial training could facilitate this development. Currently, LLMs rely on a constant stream of curated data and programmers who work tirelessly to optimize performance and make improvements.

To truly grow in capability, LLMs will need a longer-term memory. Sejnowski proposes that the next generation of LLMs should include a mechanism akin to a brain’s hippocampus, which would enable continual learning and bring them closer to human-like behavior. The hippocampus aids the brain’s cortex in bridging time and enhancing flexibility, a quality that could be mirrored in future AI by learning from nature and reverse engineering the brain’s functions.

In Conclusion

In his book “ChatGPT and the Future of AI: The Deep Language Revolution,” Sejnowski explores the evolution of AI from early neural networks to modern LLMs, highlighting their unprecedented language capabilities and the shift from AI programming to AI learning. Sejnowski shares the potential opportunities and risks, noting that LLMs have made significant strides in language comprehension. To be fair, they still have limitations and pose challenges, such as data privacy and ethical concerns. Sejnowski argues the need to study the human brain to further refine AI and suggests that future advancements should include mechanisms to enable lifelong learning and more human-like behavior. As AI continues to integrate into various industries, Sejnowski emphasizes the importance of thoughtful development and self-regulation to harness its potential while mitigating risks.


About the Author
Myles Suer

Myles Suer is an industry analyst, tech journalist and top CIO influencer (Leadtail). He is the emeritus leader of #CIOChat and a research director at Dresner Advisory Services.

Main image: By Conny Schneider.