How ChatGPT and other LLMs work, and where to go next


AI-powered chatbots such as ChatGPT and Google Bard are definitely having a moment: the next generation of conversational software tools promises to do everything from taking over our web searches to producing an endless supply of creative literature to remembering all the world's knowledge so we don't have to.

ChatGPT, Google Bard, and bots like them are examples of large language models, or LLMs, and it's worth digging into how they work. That way, you can make better use of them and gain an appreciation of what they're good at (and what they really shouldn't be trusted with).

Like many artificial intelligence techniques (such as those designed to recognize your voice or generate pictures of cats), LLMs are trained on huge amounts of data. The companies behind them have been somewhat cautious about revealing exactly where that data comes from, but there are clues we can look at.

For example, the research paper introducing the LaMDA (Language Model for Dialogue Applications) model, which Bard is built on, mentions Wikipedia, "public forums," and "code documents from sites related to programming like Q&A sites, tutorials, etc." Meanwhile, Reddit wants to start charging for access to its 18 years of text conversations, and Stack Overflow has announced that it will start charging as well. The implication here is that LLMs have been making extensive use of both sites up to this point as resources, entirely for free and on the backs of the people who built and used them. It's clear that a lot of what's publicly available on the web has been scraped and analyzed by LLMs.

LLMs use a combination of machine learning and human input.

OpenAI via David Nield

Wherever all this text data comes from, it's processed through a neural network, a commonly used type of AI engine made up of multiple nodes and layers. These networks continually adjust the way they interpret and make sense of data based on a host of factors, including the results of previous trial and error. Most LLMs use a specific neural network architecture called a transformer, which has some tricks particularly suited to language processing. (That GPT stands for Generative Pretrained Transformer.)
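For a rough sense of what "nodes and layers" means in practice, here's a minimal sketch in Python using NumPy of two stacked feed-forward layers. Every name and size here is illustrative, not taken from any real model; in a trained network the weights would be learned rather than random.

import numpy as np

# A toy feed-forward layer: each "node" computes a weighted sum of its
# inputs plus a bias, then applies a nonlinearity. Stacking layers like
# this is what makes a network "deep." Sizes are arbitrary.
rng = np.random.default_rng(0)

def feed_forward_layer(x, weights, bias):
    # x: (input_dim,) vector; weights: (input_dim, output_dim); bias: (output_dim,)
    return np.maximum(0.0, x @ weights + bias)  # ReLU nonlinearity

# Two stacked layers: 8 inputs -> 16 hidden nodes -> 4 outputs.
w1, b1 = rng.normal(size=(8, 16)), np.zeros(16)
w2, b2 = rng.normal(size=(16, 4)), np.zeros(4)

x = rng.normal(size=8)              # a stand-in for some encoded input
hidden = feed_forward_layer(x, w1, b1)
output = hidden @ w2 + b2           # final layer, no nonlinearity
print(output.shape)                 # (4,)

Training is the process of nudging those weight matrices, over many examples, so the outputs get closer to what the data suggests they should be.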

Specifically, transformers can read vast amounts of text, spot patterns in how words and phrases relate to each other, and then predict which words should come next. You've probably heard LLMs compared to supercharged autocomplete engines, and that's actually not too far from the truth: ChatGPT and Bard don't really "know" anything, but they're very good at figuring out which word follows another, which starts to look like real thought and creativity when it's done at a high enough level.
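To make "figuring out which word follows another" concrete, here's a hedged toy example. A language model assigns a score (a logit) to every word in its vocabulary, and a softmax turns those scores into probabilities. The five-word vocabulary and the scores below are invented purely for illustration; a real LLM computes its scores with a transformer over the entire prompt.

import numpy as np

# Toy next-word prediction for the prompt "the cat ___".
vocab = ["cat", "sat", "mat", "ran", "hat"]
logits = np.array([0.2, 2.5, 1.1, 0.7, 0.3])  # made-up scores

# Softmax turns raw scores into a probability distribution.
probs = np.exp(logits - logits.max())
probs /= probs.sum()

for word, p in zip(vocab, probs):
    print(f"{word}: {p:.2f}")

print("greedy prediction:", vocab[int(np.argmax(probs))])  # "sat"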

One of the key innovations of these transformers is the self-attention mechanism. It's difficult to explain in a paragraph, but in essence it means that the words in a sentence aren't considered in isolation; they're also related to each other in a variety of sophisticated ways. It allows for a greater level of comprehension than would otherwise be possible.
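For the curious, here's a minimal NumPy sketch of scaled dot-product self-attention, the core computation inside a transformer. The weight matrices below are random stand-ins; in a real model they're learned during training.

import numpy as np

rng = np.random.default_rng(0)

def self_attention(x):
    # x: (seq_len, d) matrix, one row per word embedding.
    d = x.shape[1]
    Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
    Q, K, V = x @ Wq, x @ Wk, x @ Wv

    # Each word "attends" to every other word: scores measure how
    # relevant word j is when interpreting word i.
    scores = Q @ K.T / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax per row

    # Each word's output is a weighted mix of all the words' values.
    return weights @ V

sentence = rng.normal(size=(5, 8))      # 5 words, 8-dim embeddings
print(self_attention(sentence).shape)   # (5, 8)

The key point is that every word's representation is rebuilt as a blend of all the others, weighted by relevance, which is how context gets woven in.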

There's an element of randomness and variation built into the code, which is why you won't always get the same response from a transformer chatbot. This autocomplete idea also explains how errors can creep in. On a fundamental level, ChatGPT and Google Bard don't know what's accurate and what isn't. They're looking for responses that seem plausible and natural, and that match up with the data they've been trained on.
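That built-in randomness is typically implemented as sampling: rather than always picking the single most likely next word, the model draws from the probability distribution, often reshaped by a "temperature" setting. Here's a hedged sketch reusing the toy vocabulary from above; the numbers are invented for illustration.

import numpy as np

rng = np.random.default_rng()

def sample_next_word(vocab, logits, temperature=1.0):
    # Lower temperature sharpens the distribution (more predictable);
    # higher temperature flattens it (more surprising choices).
    scaled = np.asarray(logits) / temperature
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()
    return rng.choice(vocab, p=probs)

vocab = ["cat", "sat", "mat", "ran", "hat"]
logits = [0.2, 2.5, 1.1, 0.7, 0.3]

# Run it a few times: the answers vary, which is exactly why a chatbot
# won't give you the same response twice.
print([sample_next_word(vocab, logits, temperature=0.8) for _ in range(5)])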


