About Generative AI Systems

I’ve worked in environments where artificial intelligence was part of the equation for eight years. It powered everything from audience segmentation to field extraction, but it was a black box that required building precise models: I fed data in, and out came an answer that still needed some level of human review and nuance.

But then on November 30, 2022, everything changed: ChatGPT was released.

I was blown away because it was clear that it would change how we work and how users interact. In fact, it will shake up the way users experience things so much that we’ll have to rethink many of our processes in the future: a complete transformation of the field, so to speak.

Michio Kaku, the well-known physicist and author, said ChatGPT is just a fancy tape recorder. I say it’s a wonderful starting point.

However you think of AI today, we need to understand and use generative AI tools like ChatGPT. If we don’t, someone else will.

In this book, I’ll share what has served me best in using ChatGPT so that you can make the most of this tool.

Definitions

The very first place we need to start is clearly defining important concepts in this new domain. Some of the terms are a bit technical and aren’t relevant when creating a prompt (retrieval-augmented generation comes to mind), but they’re a good starting point for understanding how generative AI works.

Core AI Concepts

  • Artificial intelligence (AI) — The creation of systems that can think and learn like humans. This term is commonly used when discussing the broader field of AI technologies.
  • Hallucinations — Instances when an AI model generates information or responses that are incorrect, misleading, or entirely fabricated. These hallucinations occur because the model creates outputs based on patterns in its training data, which may not always align with factual accuracy. For example, an AI model might confidently state that “Paris is the capital of Italy,” despite this being incorrect.
  • Machine learning — A subset of AI focused on creating systems that can learn and improve from experience. An example is an email spam filter that improves its accuracy over time. Prompts for machine learning models often include examples of desired inputs and outputs.
  • Neural network — A computer system designed to mimic the human brain and nervous system, often used in machine learning. It forms the foundation of many AI image generation models. Though not directly involved in prompting, understanding neural networks can help structure complex requests.
  • Natural language processing (NLP) — The field of AI that focuses on the interaction between computers and human language. For example, Microsoft Copilot uses NLP to understand and generate code comments. NLP tasks often include summarization, translation, or sentiment analysis. While NLP has been around for a long time, generative AI technology makes it much more accurate.
  • Generative AI (GenAI) — AI systems that create new content, such as text, images, or other data types. For example, DALL-E generates images from text descriptions.
  • Transfer learning — A machine learning technique where a model trained on one task is repurposed for a related task. An example is using a model trained on English to jumpstart training for a French language model. When prompting models that use transfer learning, users can leverage knowledge from related domains.

AI Models and Architectures

  • Generative pre-trained transformer (GPT) — A type of language model architecture used in many modern AI systems. Examples include the GPT-3.5 and GPT-4 models that power ChatGPT. GPT is a specific architecture for implementing large language models (LLMs). When prompting GPT models, users can leverage their broad knowledge base.
  • Large language model (LLM) — An AI model trained on vast amounts of text data to understand and generate human-like text. Examples include Anthropic’s Claude, Google’s Gemini, OpenAI’s ChatGPT, and Microsoft’s Copilot, all based on their own LLMs. GPTs are a type of LLM architecture. When prompting LLMs, users can tap into their broad knowledge and capabilities.
  • Parameter — A variable in an AI model that adjusts during training to optimize performance. For example, GPT-3 has 175 billion parameters. Users don’t directly interact with parameters when prompting, but with carefully crafted instructions, the user can influence model behavior.

AI Applications

  • Agent — An AI system designed to perform specific tasks or functions autonomously. For example, a customer service chatbot that can handle basic inquiries without human intervention. When creating prompts, users might specify agent behaviors, like “As a travel planning agent, suggest an itinerary for a week in Paris.” Most systems in the future will do this behind the scenes.
  • AI assistant — An artificial intelligence system created to interact with users, answer questions, and complete tasks. Examples include Anthropic’s Claude, Google’s Gemini, OpenAI’s ChatGPT, and Microsoft’s Copilot, which can assist with tasks like writing and analysis.
  • Chatbot — A computer program that mimics human conversation through text or voice. For example, a customer service chatbot on a company’s website. When creating prompts for chatbots, users often emphasize a conversational tone and specific dialogue flows. Chatbots have existed for a long time, but they have traditionally been very procedural.
  • Conversational AI — AI systems created to have human-like conversations. For instance, OpenAI’s ChatGPT is an AI agent that uses conversational AI to talk about many topics. Prompts for conversational AI often include instructions like, “Let’s talk about climate change.”
  • Semantic search — A search method that understands the intent and context of a query, rather than just matching keywords. For example, Google’s search engine recognizes synonyms and related concepts. Users might specify, “Use semantic search to find information about…” to encourage a broader interpretation. Where plain keyword matching falls short, semantic search improves relevance.
  • Sentiment analysis — The use of natural language processing (NLP) to determine the emotional tone of a piece of text. An example is analyzing customer reviews to gauge product satisfaction. Prompts for sentiment analysis often include instructions like “Determine the overall sentiment of the following text.”
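To make the semantic search idea concrete, here is a minimal sketch that ranks documents by similarity to a query instead of requiring exact keyword matches. It uses a toy bag-of-words vector as a stand-in for the learned embeddings real systems use, so it won’t recognize synonyms the way a production engine does; the documents and query are hypothetical examples.

```python
from collections import Counter
from math import sqrt

def vectorize(text):
    """Toy 'embedding': a bag-of-words count vector keyed by word."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def semantic_search(query, documents):
    """Rank documents by similarity to the query, most similar first."""
    q = vectorize(query)
    return sorted(documents, key=lambda d: cosine_similarity(q, vectorize(d)), reverse=True)

docs = [
    "How to reset your password",
    "Quarterly sales report for the finance team",
    "Password recovery and account security tips",
]
results = semantic_search("forgot my password", docs)
```

A real semantic search system would replace `vectorize` with embeddings from a trained model, which is what lets it match “forgot my password” to “account recovery” even with no words in common.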

Techniques and Processes

  • Fine-tuning — The process of adjusting a pre-trained model. For example, fine-tuning GPT-3 on legal documents creates a specialized legal assistant. When using a fine-tuned model, users might prompt it based on its specialized knowledge: “As a legal AI, explain the concept of habeas corpus.”
  • Human-in-the-loop — A method that involves human input or oversight in AI processes. For example, in a content moderation system, AI flags potential issues, and a human reviews them. Prompts may include instructions like, “If unsure, indicate that human review is needed.”
  • Inference — The process of using a trained AI model to generate outputs based on new inputs. An example is using Google Gemini to answer questions about a given text, like a supplied document. Prompts for inference often include specific instructions about the desired output format or reasoning process.
  • Prompt engineering — The practice of designing and refining input prompts to get desired outputs from AI models. An example is crafting specific instructions for ChatGPT to generate a marketing plan. This is a key skill for effective use of generative AI.
  • Retrieval-augmented generation (RAG) — A technique that blends information retrieval with text generation to create accurate, relevant outputs. For example, a RAG-enabled AI assistant can be used to pull up-to-date information from a company’s knowledge base. RAG systems often prompt users to reference specific sources or types of information.
  • Tokenization — The process of breaking down text into smaller units (tokens) for processing by an AI model. An example is breaking a sentence into words or subwords for analysis by GPT-3. Understanding tokenization can help users craft more efficient prompts, especially when dealing with length limitations.
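The tokenization concept above can be sketched in a few lines. This toy greedy longest-match tokenizer over a hand-picked vocabulary is only an illustration; real models learn their vocabularies from data (for example, via byte-pair encoding), and the vocabulary here is hypothetical.

```python
def tokenize(text, vocab):
    """Greedy longest-match subword tokenization over a toy vocabulary."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest vocabulary entry that matches at position i.
        for length in range(len(text) - i, 0, -1):
            piece = text[i:i + length]
            if piece in vocab:
                tokens.append(piece)
                i += length
                break
        else:
            tokens.append(text[i])  # Unknown character becomes its own token.
            i += 1
    return tokens

vocab = {"token", "iza", "tion", "un", "believ", "able", " "}
print(tokenize("tokenization", vocab))  # ['token', 'iza', 'tion']
```

Because models count tokens rather than words, splits like this are why a long word can consume several tokens of a prompt’s length limit.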

Strengths of Generative AI

Discovery and Draft Specifications

Need to create user personas in a pinch? ChatGPT’s got you covered. Need multiple problem statements using just a single prompt? ChatGPT is there. Need user stories for a basic feature? ChatGPT can do that, too. Need to draft every written artifact you can think of for a feature? Yep, ChatGPT handles it all, giving you something that’s good enough to get you started and greatly accelerate the process.

I tried it out on an example feature, and because of how much it sped up our work, we now use it daily. Whether it’s creating fake data, writing user stories, or looking for analogous inspiration, it’s the baseline for any feature. ChatGPT speeds up the early stages of discovery by providing quick access to a lot of helpful information. This helps designers gather and understand data quickly, without having to sift through many web pages or consult with many people.

Some professionals I know also enrich user personas with data so that you can converse with an “artificial” user. It’s not a replacement for the real thing, but it’s a good way to ask better questions.

It also boosts brainstorming sessions with fresh perspectives and helps researchers explore new ideas efficiently. Plus, it makes the literature review process a breeze, saving you time and allowing you to focus on experimental design and analysis.

Whether you’re identifying relevant studies, breaking down complex ideas, or developing new hypotheses, ChatGPT is a wonderful tool.

Artifacts

  • User personas
  • Analogous inspiration
  • Competitive analysis
  • User research questions
  • Problem statements
  • Predicted outcomes
  • User stories
  • Usability testing questions

Example prompts

Problem statement: Create multiple problem statements in a "How might we?" format for searching through a repository of documents.
User stories: Write user stories for a document search table where you can search by keyword or filter by certain fields. The actions are searching by keyword (boolean or non-boolean), selecting a filter, adding a filter, clearing a filter, and sorting by field.
User research questions: Write user research questions for a document search table where you can search by keyword or filter by certain fields. The actions are searching by keyword (boolean or non-boolean), selecting a filter, adding a filter, clearing a filter, and sorting by field.

Realistic data

One of the most time-consuming activities I’ve had in user experience is creating fake data. I spent hours building this out in wireframes for usability testing prototypes—something that is both fun and time-consuming.

That process is gone. ChatGPT saves a ton of time doing that now.

Creating fake data with ChatGPT is a breeze because the model quickly generates contextually relevant and diverse information, especially if you know what fields you want to use. By leveraging its vast training data, ChatGPT can simulate realistic user inputs, behaviors, and scenarios that mimic real-world data patterns.

And it’s not just about some data—it can create thousands of lines of realistic data to use in development environments.

This capability allows UX designers to prototype and test interfaces, speeding up the design iteration process. It’s like having an instant, versatile sandbox to explore and refine user experiences before launching into the real world.

Artifacts

  • Wireframe content
  • Example data

Example prompt

Data table: Create a table of realistic data for 25 users with the following fields: First Name, Last Name, Email Address, Role (Administrator, Edit User, or Read-Only User), Active Status (Yes or No), Added Date, and Last Updated Date in YYYY-MM-DD format.
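For comparison, the same kind of table can be sketched programmatically. This is a minimal illustration of generating seeded, reproducible test data, not how ChatGPT works internally; the name pools and example.com email addresses are hypothetical placeholders.

```python
import random
from datetime import date, timedelta

# Hypothetical name pools; ChatGPT (or a library like Faker) would vary these far more.
FIRST_NAMES = ["Ava", "Liam", "Noah", "Mia", "Emma", "Ethan", "Olivia", "Lucas"]
LAST_NAMES = ["Garcia", "Smith", "Chen", "Patel", "Johnson", "Nguyen", "Brown", "Kim"]
ROLES = ["Administrator", "Edit User", "Read-Only User"]

def random_date(start, end):
    """A random date between start and end, formatted YYYY-MM-DD."""
    delta = (end - start).days
    return (start + timedelta(days=random.randint(0, delta))).isoformat()

def generate_users(count=25, seed=42):
    random.seed(seed)  # Seed so the same fake data comes back every run.
    users = []
    for _ in range(count):
        first, last = random.choice(FIRST_NAMES), random.choice(LAST_NAMES)
        added = random_date(date(2022, 1, 1), date(2023, 6, 30))
        users.append({
            "First Name": first,
            "Last Name": last,
            "Email Address": f"{first.lower()}.{last.lower()}@example.com",
            "Role": random.choice(ROLES),
            "Active Status": random.choice(["Yes", "No"]),
            "Added Date": added,
            # Last updated is always on or after the added date.
            "Last Updated Date": random_date(date.fromisoformat(added), date(2024, 1, 1)),
        })
    return users

rows = generate_users()
```

The advantage of the prompt over the script is that ChatGPT invents plausible variety for free; the advantage of the script is that the data is deterministic, which matters for automated tests.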

User assistance copy

Even before the first wireframes exist, ChatGPT does a good job of writing draft user assistance copy for features. This is due to several key advantages.

Its adaptability enables ChatGPT to generate content tailored to meet users where they are. Whether explaining basic functionalities or troubleshooting intricate issues, ChatGPT can adjust its language and depth of detail accordingly.

Because it references well-known contexts, ChatGPT can generate a good first draft from which to start.

ChatGPT’s consistency ensures that user assistance copy maintains a uniform quality and tone across interactions. You can even specify the style and tone, such as business casual or formal, so it’s tailored to the audience. This builds trust and enhances the overall user experience as long as the content is edited and reviewed accordingly.

Artifacts

  • Wireframes
  • In-application content
  • Knowledge base articles

Example prompt

User Assistance: Write user assistance content for a document search table where you can search by keyword or filter by certain fields. The actions are searching by keyword (boolean or non-boolean), selecting a filter, adding a filter, clearing a filter, and sorting by field.

Weaknesses of Generative AI

Wireframes

Until someone can write perfect user stories or product requirements documents, I’m convinced that ChatGPT may assist designers but will never replace them.

Wireframing isn’t just about creating a visual blueprint; it’s about understanding user needs, iterating on ideas, and fostering stakeholder collaboration.

While AI can help generate wireframes faster, it lacks the human intuition, empathy, and domain-specific knowledge required to deeply understand and solve user problems. For example, you can design a search experience, but every use case differs slightly depending on the domain. Searching for merchandising and searching for documents, for example, is a very different experience.

AI-generated wireframes may overlook the nuanced considerations and contextual insights UX designers bring through research and experience.

Therefore, while AI can assist, it won’t replace the critical, human-centered wireframing process in UX design.

Artifacts

  • Wireframes

Respecting Regulatory Guidelines

When I had my first conversations with designers about OpenAI, we discussed several situations where ChatGPT was a bad fit—music authoring, writing, and other copyrighted material were examples—but the field I work in, Legal Tech, is actually one of the best fits for what ChatGPT can provide.

ChatGPT as a first draft without nuance? Sure.

As the final product without a “human in the loop”? Not a chance. There should always be a “human in the loop” to double-check the work, specifically in nuanced situations.

Regulatory environments often require precise and unambiguous communication backed by legal and compliance standards (instructions around banking and finance come to mind). ChatGPT, though proficient in generating human-like text, cannot reliably comprehend complex legal jargon or nuanced regulatory frameworks.

This can lead to inaccuracies, misunderstandings, or even hallucinations in critical communication, potentially resulting in legal liabilities or non-compliance issues for end users.

ChatGPT excels in many conversational and informational tasks. Still, it is limited: fancy tape recorders don’t understand complex regulations, and their potential biases make them unsuitable for regulatory situations where precision and compliance are paramount.

There will always be a human in the loop here.

Authenticity

When was the last time I saw an LLM hallucinate?

Today.

I was walking someone through one of our applications and entered their name. It returned the information of someone completely different at the company, along with other incorrect details. We laughed about it and moved on.

This doesn’t happen a lot, but it happens more often with content the LLMs don’t have much context for, or when there are many matches they can’t line up. For example, an acquaintance of mine has a rather common name, and the model mixes him up with an actor in England. I don’t have that problem, so the information returned for me is most probably about me.

When you consider that not all information on the internet is factual—yes, there isn’t an Easter Bunny or a Santa Claus either, except on some marketing site selling costumes for both—authenticity is closely related to bias. The model makers are trying to fix these problems, but it’s hard because LLMs are designed to predict what words come next, not to know what’s true.

Even Google, and just about every other search engine ever invented, can’t know what’s true without human intervention either.

We’ve been living with this for the last 30 years, and will continue to do so. It doesn’t seem to affect search engine engagement either.

It’s really up to us to decide what’s right and what’s wrong. Search engines return wrong results sometimes, and so do GPTs. Both are learning to get better, and it’ll take time.

I tell everyone who uses any technology to double-check what they’re seeing, like a journalist would: one source is an opinion, but a second reliable source is validation. The confidence with which LLMs return information is akin to that of Google’s search results, and no one seems to be affected by that.

So the resolution is this: Trust and verify.

Parting Thoughts

How we do early ideation and discovery is already changing, especially at the consumer end of user experience. This is reflected in how user experience teams are transforming, with these tools acting as a forcing function.

However, we are very far from completely replacing designers. There may be a need for even more designers going forward, because many applications will be transformed for this new era—making knowledge of these tools more important.

We’ve adapted before, and we can do it again. Act accordingly.