When we look back, 2023 will be remembered as the year generative artificial intelligence (GenAI) well and truly broke out of the lab into public consciousness, earmarked by OpenAI's public release of ChatGPT in November 2022. ChatGPT's performance was a material improvement over past iterations, and the public response took AI researchers, governments and businesses by surprise. ChatGPT can not only write text in multiple languages, it can also write code, and future versions promise to be multimodal, combining text, audio, video and images in a single model.
Some people mistakenly dismiss GPT as a glorified auto-complete engine that arbitrarily regurgitates streams of text based on the statistical likelihood that the text is valid, derived from the reams of text it ingested during training. Yes, these models are statistical, but that is no reason to dismiss the reality that GPT can already do many tasks better and faster than humans.
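To make the "statistical next-word prediction" idea concrete, here is a deliberately simplified toy sketch: a bigram model that predicts the next word purely by counting which word most often follows the current one in a tiny corpus. This is an illustrative caricature, not how GPT actually works internally; GPT-style models do something conceptually related (estimating a probability distribution over the next token) but with a neural network trained on vastly more text.

```python
from collections import Counter, defaultdict

# Tiny made-up training corpus for illustration only.
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count bigram frequencies: how often each word follows each other word.
following = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1

def predict_next(word):
    """Return the most frequent next word after `word`, or None if unseen."""
    counts = following[word]
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("the"))  # prints "cat" ("cat" follows "the" twice, more than any other word)
```

Even this crude counting scheme produces plausible local continuations; scale the statistics up by many orders of magnitude and add deep learning, and the outputs start to look like fluent language.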
The fact that machines can now speak our language is a game changer. Historically, to get machines to do something, humans needed software interfaces and had to learn how to turn our requests into keywords, filters and numbers in a specific, structured format the machine could understand. The cost, speed and availability of software development talent, as well as the subsequent training of staff to use the software, has been a constraining factor in the greater use of tech for productivity. But now this interface is being disrupted, potentially even eliminated, and that completely upends the economics of tech deployment in companies.
Businesses that have historically operated with a substantial force of white-collar employees handling documents, writing texts and generating creative assets are in the crosshairs of this wave, including the legal, finance and entertainment industries. Executives in these sectors often find themselves on the back foot and under pressure to react. On one hand, they have been thrust overnight into a race to embed the efficiencies promised by this new tool to increase growth and profitability. On the other hand, they need to proceed with some caution, given the privacy concerns of putting confidential data into a shared model controlled by a third party, and to manage the potentially damaging weaknesses of the tech as it stands, including hallucinations and missing or made-up sources.
This is the first of a series of articles and videos to help CEOs get onto the front foot as quickly as possible. Many are keen to learn, but it is hard to judge what you do and do not need to know. You should aim to pitch your learning at the right level of depth, so that you are not down in the weeds but still deep enough to develop confidence and intuition. Learning how to do matrix multiplication in four dimensions with TensorFlow is a bad use of your time, and I know most of you cannot afford to waste time going down the wrong rabbit holes, of which there are many here.
In the first session, we will start by reviewing the trajectory of how we got here; that will help you interpret where we are on the arc and develop some of your own intuition about where things are headed next.
In the second session, we will cover some foundational principles of how modern-day AI works. It will help you understand terms such as neurons, neural networks, gradient descent, weights, parameters, training vs. inference, supervised vs. unsupervised learning, backpropagation, fine-tuning and transfer learning.
In the third session, we will discuss domain-specific AI versus foundation models such as GPT, and how the latter are fundamentally changing the AI landscape.
In the fourth session, we will do a deep dive into GPT technology, covering topics such as its strengths and weaknesses, the significance of tokens, and the techniques for customizing foundation models for a specific company, including the differences between fine-tuning, transfer learning and prompt engineering.