The Form & Function of AI

May 14, 2024

AI is arriving rapidly. Both new and existing companies are racing to figure out what AI means for their products and services.

We're witnessing numerous attempts to brand AI and integrate it into some of our most commonly used apps.

In this post, I aim to reflect on the initial visual language evolving around AI, explore how we arrived at this point, and outline some ideas about where this could lead.

Introduction

Last week, someone shared a video of Steve Jobs demonstrating the 'slide to unlock' interaction at the launch of the first iPhone.

You can hear the audience gasp and clap with excitement. What a moment. Apple had invented a truly new way of interacting with a computer, and everyone understood it immediately.

The hardware innovation of touch screens opened doors to a new world of interface and product design.

And with the new hardware came a new design language. Skeuomorphic design dominated the apps on our phones for years, and that made sense. The theory behind it felt solid, and I have no doubt that mimicking real-life objects made the overall adoption of smartphones easier for people.

Product design has continued to evolve, mostly incrementally. Drop shadows and real-life textures were replaced with 'flat' UI, and thanks to Force Touch and cameras, new interactions appeared. But for the most part, the form and function of the digital products we use have remained stable for the last 10 years.

Experiments with voice (Siri, Alexa, etc.) and early VR (Quest, Meta x Ray Ban, Vision Pro) are underway, but it all feels like it is still in development. There hasn't been another 'touch screen moment'. No significant step change in how we interact with the software and hardware that are now extensions of almost everybody.

Today

Right now, loads of people are using ChatGPT to do... stuff. Meta has put an 'Ask Meta AI' feature in Messenger, WhatsApp, Instagram, and Facebook. Adobe is running ads during the NBA about their new 'AI prompt fill tool' in Photoshop. Google is running ads on YouTube for courses in 'understanding how AI will affect your job'. And half of the city is covered in billboards about new AI products.

In product design circles, the hot conversation around the water cooler is all about how AI could and will likely change how we design products.

The vibe right now is that while we are currently taking old design patterns and best practices and applying them to AI, once this develops a bit more, we will have an opportunity to do some radically new things.

It's not exactly a 'touch screen moment', but it's not far off it.

As this whole space continues to explode, I am fascinated by two parts of what is going on with AI products:

  1. The Form: How should AI be visually presented to us?
  2. The Function: How are we being asked to interact with it?

The Form

Now, what if I ask you to visualize how your brain works when you search for a memory? Or have an idea? What do you see?

I see a web of interconnected wires shooting electricity around a giant sphere. It's an image I was probably shown in year 10 science class and that has since been repeated back to me in movies.

Hollywood has a strong history of presenting non-humanoid AI to us with this circle/sphere element that often pulses and moves when it's processing/'thinking' or communicating.

There is a great Wikipedia article that lists all the movies that include AI (and, importantly, whether they present the AI as a computer program only or as a humanoid robot).

I looked up most of the computer-only ones and was amazed that almost all of them stick to this circle visual. From the 70s right up until the most recent Mission Impossible.

I did have a moment when I was like, "surely someone made AI look different?", but no. They are almost all some kind of 'mind-eye' or brain idea.

And I don't think that's just a case of people copying each other. I actually think this is what we feel intelligence looks like: a bit dark, a bit mysterious, with some moving lights.

Hollywood also has a habit of naming AIs, often using traditionally feminine names. They want us to think about the AI as almost another person, but importantly, not exactly another person.

As we move from fiction to the real world, the anthropomorphising of AI continues.

The organic circle element seems to be where the visual brand language for AI is converging. Each company is trying out its own version of an AI brand, complete with a cute, non-threatening name.

That AIs need to be branded at all is fascinating to me. And that they are all so similar at the moment is also interesting. It really speaks to how these companies expect us to relate to the technology.

If companies continue in this direction, I wonder if there will be a moment when I can name and design the AI to reflect how I feel. Instead of a general 'Siri' product, will I be able to choose an avatar for it (like I can with my Memoji today)? Kind of like how I can change Siri's accent.

There really aren't any limitations to how we can present AI to users. From something totally abstract (like the UI in the movie HER) to avatars. I expect that we will try everything and eventually something will win out.

One other thought I just had is about brand allegiance. I wonder if people will develop preferences for different models like we have today with Apple Maps vs. Google Maps or Spotify vs. Apple Music. Functionally, these products are the same. But people are very tied to their app of choice, often for reasons that have more to do with how something feels than how it functions. Will that be true for AI too?

It doesn't feel like this will play out like search did, with Google winning across all platforms and devices. OpenAI is the early winner, but even Altman has said recently that the answer isn't larger and larger models, and he expects other companies to catch up. So are we entering a world where we will have multiple models/AIs, all branded differently, living in our phones?

There is a lot to come.

(An interesting read: this paper from the University of Copenhagen investigates how presenting AI as human-like influences perceived trust levels.)

The Function

This is an enormous topic, but I guess my main thought right now is… this can't be all chat interfaces. But it mainly is for now.

A database has a finite number of known results, so I know how many words I need to fit on a page or a button. With LLMs, I don't know if I'll get back a short sentence or 1,000 words. So how do I design for that?

For a lot of good reasons, retrofitting a chat interface to LLMs does make sense at the moment. But it also doesn't feel like the best solution.

But it is a solution. The three magic stars are appearing everywhere as a visual indication that you can 'tap here for AI'. (LukeW has a great article on this.)

An open chat window suffers from the same UX issues that Siri and Alexa do. I call that the 'menu problem'.

If you don't know what options you have, it's actually very hard to know what to ask a computer, or how to ask it.

With chat & AI, companies are leading with various suggested prompts around the chat window in an attempt to educate users on how to interact with it. It's a good start.

But once the platforms (e.g. iOS) and apps catch up, we should start to see new interaction models that haven't been possible before.

Recently, Apple released a paper showcasing something called Ferret-UI, which points to a future where the gap between AI and product interfaces is closed and more things are possible.

A world where platforms and apps are able to talk to each other might finally make voice a big deal. If I ask Siri to do anything other than call, message, or tell me the weather, it struggles. But if the rumors this week are true and OpenAI does partner with Apple/Siri, the menu problem might vanish quickly.

A big part of the Function conversation is really about where the technology sits. Jon Lax has a good Thread about 'platform vs. app' here that is worth reading. And you can see how that conversation is going to be a big differentiator. It's also where the hardware part of the discussion comes in. Where is all the computing actually happening? And how does on-device vs. cloud change what we design?

So what?

This should be viewed as a timestamp. Something to come back and look at in a year to see how far things have developed.

Some questions that I'm left with:

  1. To what degree does it matter that we are taking this anthropomorphic approach to AI's identity? What are the implications, and what are the alternatives?
  2. If chat = MS-DOS, then what will come next?
  3. What are the technical blockers stopping us from moving past chat?
  4. How does the UX change in domain-specific applications vs. at the platform level?

As new products continue to ship, I'm going to have all these questions in mind. One thing that I think we all feel very strongly right now is that we are at the beginning of a whole new set of design challenges that are going to require a bunch of new thinking.

New products, services, and new kinds of experiences are coming, so it's a super exciting time to be a product person.

See you out there :)

Nick

Update

10 mins after I published this, OpenAI announced GPT-4o.

The live demos cast some light on both the 'how will AI look?' and 'how will we interact with it?' questions. I sent out an email tonight about it all, which I've copied below.

Plus here are some Twitter hot takes.

Greetings from NYC (more on that next time),
At 12:50 pm today I pressed publish on something that I’d been writing for a while. It’s a piece called ‘The Form and Function of AI’.
At 1:00 pm, OpenAI did a 30-minute live stream announcing a new product called GPT-4o. And as someone put it on Twitter shortly after, all human-computer interaction has changed.
My article is a bit of a brain dump exploring two of the design challenges that come with AI:
  1. What kind of identity should AI have?
  2. How should we interact with it?
I don’t try to provide any answers; I mostly ask questions and share thoughts from others who are wondering the same kinds of things.
It did not occur to me that the questions I was asking would (at least partially) be answered so quickly.
You should watch the video in full or at least look at these short demo videos. The technology is shockingly good.
Have you seen the movie HER? We are at that stage now.
What does this mean for education? Visually impaired people? Travelling to foreign countries? I have so many questions.
The techno optimists are having a field day. The AI doomers will be writing their panic pieces tonight. And I’m not sure what to think.
It is reported that OpenAI has signed a deal with Apple to bring this technology to Siri and that it will be announced next month.
We’ve all either said or heard people say that phones ‘haven’t changed’ in years. The cameras just get better. But we all still remember when each new iPhone came with some (what felt like) breakthrough technology in it. And then that all stopped.
Well, it appears that we could be back, baby!
Let’s wait and see.
I do feel excited though. It really does feel like something is shifting.
I’ve been writing about voice interfaces for years. In a lot of ways, they are the holy grail of product design (“the best interface is no interface”). But the technology simply hasn’t been ready.
What OpenAI showed us today was a view into the future that Hollywood has been writing about for 50+ years.
I am mostly excited for the kids, though. I can see them adopting this stuff quickly, and its education applications seem like a no-brainer. Maybe I’m wrong though?
What do you think?
Nick


References

Scott Belsky on Luxury Software

Ben Evans on Looking for AI Use Cases

Scott Belsky on The Era of Abstraction

Steven Sinofsky on CES 2024

LukeW on AI Models in Software UI

History of the user interface

Podcast: Decoder interview with Adobe CEO

My Twitter thread on this topic

Apple’s Ferret-UI paper