There are several different ways that interfaces can help people think more fluidly

by distributing their cognition into artifacts in the world.

When interfaces help people distribute cognition, it can encourage experimentation;

it can scaffold learning and reduce errors through redundancy;

it can show <i>only</i> the differences that matter;

it can convert slow calculation into fast perception;

it can support chunking, especially by experts;

it can increase efficiency of interactions;

and it can facilitate collaboration.

Let’s go through these one at a time.

Here’s a video game that, I bet, many of you all know: This is the Tetris video game.

And there was a very clever study that David Kirsh and Paul Maglio at UC San Diego ran,

where they looked at Tetris players playing Tetris, and what they found was really interesting.

In Tetris, as you may know, you can use keys to move objects around on the screen.

And at a certain point you could hit the space bar to drop the object and try and get a row of a few blocks.

Moving and rotating pieces on the screen may seem like a waste of time

because you have a limited amount of time before the block hits the bottom.

So, how do you use that most efficiently?

It turns out that people move the block around the screen

more than they — in some purely theoretical sense — need to.

So, in essence, I’m trying out different places that the blocks could go.

<i>Maybe</i> this is just for novices: when you’re learning Tetris you need to feel things out,

but as you become more of an expert that’s no longer the case?

Exactly the opposite!

Kirsh and Maglio found that experts actually relied more heavily on moving objects in the world.

And what they were doing is they were distributing the cognitive effort of “what if” scenarios —

What if I placed it over here? What if I placed it over there?

You absolutely can do that in your mind,

but it turns out that in this case, for most people, it’s cheaper to do it out in the world

to be able to turn the cognitive task of reasoning about all of the what-if scenarios

into a perceptual task of “Oh yeah, that block would work perfectly right there.”

It saves you the effort of having to, in particular, mentally rotate the different pieces

to figure out how they would fit into the screen.
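
The "what if" search can be pictured as a brute-force enumeration. Here is a minimal Python sketch under a simplified model that is an assumption for illustration, not something taken from the Kirsh and Maglio study: a piece is a set of (x, y) cells, the board is reduced to a list of column heights, and a placement "fits" when the piece rests with no gaps beneath it. The names `orientations`, `fits_flush`, `what_if`, and the sample board are all hypothetical.

```python
def normalize(cells):
    """Shift cells so the smallest x and y are both 0."""
    cells = list(cells)
    min_x = min(x for x, _ in cells)
    min_y = min(y for _, y in cells)
    return frozenset((x - min_x, y - min_y) for x, y in cells)


def rotate(cells):
    """Rotate a piece 90 degrees clockwise (y points up): (x, y) -> (y, -x)."""
    return normalize((y, -x) for x, y in cells)


def orientations(piece):
    """Return the distinct orientations of a piece, in rotation order."""
    seen = []
    current = normalize(piece)
    for _ in range(4):
        if current not in seen:
            seen.append(current)
        current = rotate(current)
    return seen


def fits_flush(cells, heights, x0):
    """True if the piece, dropped at column x0, leaves no gap under it."""
    bottoms = {}
    for x, y in cells:
        bottoms[x] = min(y, bottoms.get(x, y))
    # Each column needs the same resting height for the fit to be flush.
    rests = {heights[x0 + x] - bottom for x, bottom in bottoms.items()}
    return len(rests) == 1


def what_if(piece, heights):
    """List every (orientation index, column) where the piece fits flush."""
    options = []
    for turn, cells in enumerate(orientations(piece)):
        width = max(x for x, _ in cells) + 1
        for x0 in range(len(heights) - width + 1):
            if fits_flush(cells, heights, x0):
                options.append((turn, x0))
    return options


# Hypothetical example: an L-shaped piece and a four-column board.
L_PIECE = {(0, 0), (0, 1), (0, 2), (1, 0)}
HEIGHTS = [0, 1, 1, 1]
print(what_if(L_PIECE, HEIGHTS))
```

Doing this loop in your head means mentally rotating the piece for every candidate column. Rotating it on the screen lets your eyes do the check instead.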

Here’s another example from the learning sciences: These are called Montessori blocks.

And what we have here are beads that provide a physical representation for number[s].

Especially for young children, numbers are really abstract concepts, difficult to get your head around.

And these physical instantiations can help teach addition, multiplication and other simple arithmetic operations.

So, for example, if I’m going to take three and multiply it by three to get three squared,

well, I can see that I have three by three — I have a square and I can see that it’s composed of nine beads.

And there are other addition and multiplication operations that you could do with these also.

And by having this redundant information,

by taking an abstract concept and reifying it and making it concrete,

it can help scaffold learning and reduce errors.

The Montessori blocks example and the Tetris example

show us the power of providing a visual or physical instantiation of abstract ideas.

So what makes a good representation of this sort?

Well, you should show the information that you need and nothing else.

And, what these representations should do is

enable the kinds of tasks that users want to do, like comparison, exploration, and problem solving.

And if that seems too abstract or maybe obvious,

let’s take a look at this example from the London Underground.

This subway map was introduced about a century ago

and it was one of the very first maps to introduce a brand-new idea in map design:

of abstracting the layout of the tracks from the underlying physical geography.

Prior to this London subway map, the maps would show what the geography was

and so long things were long and short things were short;

and if the tracks wandered,

then the tracks on the map would wander, because that’s the way things worked.

And what the Underground map designers realized was that

the most common task for subway riders is to figure out how to get from point A to point B,

and all of this additional detail of faithfulness to the underlying topography

was getting in the way of that A-to-B task more than it was helping it.

And so what they did is they stripped out a lot of that unnecessary detail,

turning this into vertical, diagonal, and horizontal lines.

So there is some correspondence between the layout on the map and the layout in the real world:

North is north, and south is south,

and things roughly head in the direction that they do in the real world,

but the detail is stripped out.

And this makes it much easier to be able to figure out how to get between connections.

Another thing that they did is they introduced

what a century later we would call a “focus plus context” representation for the map.

In the center of London, the subway stations are very densely packed.

So that area is expanded out: it consumes more of the map real estate.

As you get out toward the suburbs, the stations are fewer and further between.

As opposed to that taking up 90% of the map because it’s 90% of the space,

those stations are actually scrunched,

because if you know you need to go northeast to a particular station,

then the exact distances involved are most of the time less relevant.
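
To make the focus-plus-context idea concrete, here is a minimal Python sketch. The Underground map was drawn by hand, so this is not how it was made; as a stand-in, the sketch uses the one-dimensional distortion function from the later graphical fisheye views literature, G(x) = (d + 1)x / (dx + 1). The function name `fisheye`, the distortion value, and the station distances are assumptions chosen for illustration.

```python
def fisheye(distance, distortion=3.0):
    """Map a normalized distance from the focus (0..1) to a map distance (0..1).

    Larger distortion values give more of the map to the area near the focus.
    """
    if not 0.0 <= distance <= 1.0:
        raise ValueError("distance must be normalized to the range 0..1")
    return (distortion + 1) * distance / (distortion * distance + 1)


# Hypothetical stations: two near the center, two out in the suburbs.
# Each pair is the same real-world distance apart (0.1).
central_gap = fisheye(0.2) - fisheye(0.1)
suburban_gap = fisheye(0.9) - fisheye(0.8)

print(f"central pair on the map:  {central_gap:.3f}")
print(f"suburban pair on the map: {suburban_gap:.3f}")
```

The central pair ends up farther apart on the map than the suburban pair, even though the real distances are identical. That is the trade the map makes.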

Now, of course, what constitutes a good representation is task-specific.

And so what you can see is that by making some tasks easier —

like getting from A to B when you know A and B,

or being able to navigate the center of London more effectively —

you’ve compromised on other tasks.

One example is somebody who may need to make decisions

about what stop to get off at based on some underlying topography.

Another task that’s compromised is judging distance: because the distances between stations on the map

don’t line up with the distances between stations in the world,

you can make poor judgments about what’s close and what’s far:

In the center you may believe things are far apart when they are really close,

and out in the suburbs, you may believe from the map that things are closer than they actually are.

And so nearly all representation design is about fitness to task.

Here’s a temperature map from the Weather Underground.

It shows the temperature at each location in the Bay Area,

geo-referenced so that the temperature number is placed right on top of that physical location.

What do you think are the benefits and drawbacks of this representation?

What’s it good for and what’s it a problem for?

If you know the physical coordinate that you’d like the temperature for,

say for example, “What temperature is it along the coast?”

and I don’t care or don’t know the exact name of the town, this works incredibly well.

It’s also a reasonable interface, in some sense,

for trying to get a good [sense] of what the temperatures are like in the area overall:

I can see, for example, as I head inland the temperature tends to get warmer.

There are a lot of ways in which we could make this better.

So, right now, every temperature is shown identically no matter what temperature it is

which means it’s hard to scan: I have to read every single temperature one by one.

I could make this better if instead I had the temperature number somehow

(through its color or size) correspond to the weather.

But if we’re going to start mapping the variables of the map to something like color,

we need to be careful to get it right.

This is an example that comes from Edward Tufte.

His books on visual design and the graphical representations of data are fantastic

and I strongly encourage you to read them.

In this map, we see how a computer scientist might make a mapping for Japan.

This is showing the height above or below sea level as color,

and what you can see is the depth below sea level is represented by the rainbow color spectrum, ROY G. BIV.

Now, one of the challenges of hue is that it’s not an additive representation.

So it’s not really a strong ordering that we give to the colors perceptually.

It’s a substitutive representation: red and yellow are qualitatively different,

but we don’t automatically have a more-than or less-than relationship between the two.
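
One rough way to see the ordering problem is to compute how light each rainbow color is. The sketch below is an illustration under assumptions, not Tufte's analysis: it walks the hue wheel from red to violet with Python's standard `colorsys` module and weights the channels with the Rec. 709 coefficients, applied directly to the encoded RGB values as an approximation. The helper names `luma` and `rainbow` are hypothetical.

```python
import colorsys


def luma(rgb):
    """Approximate lightness of an (r, g, b) color with channels in 0..1."""
    r, g, b = rgb
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def rainbow(steps):
    """Fully saturated colors from red (hue 0) to violet (hue 0.75)."""
    return [
        colorsys.hsv_to_rgb(0.75 * i / (steps - 1), 1.0, 1.0)
        for i in range(steps)
    ]


levels = [luma(color) for color in rainbow(7)]
print([round(level, 2) for level in levels])
```

The lightness values rise and then fall along the spectrum, so stepping "further along the rainbow" does not read as steadily more or less of anything.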

Another problem with this representation

is that it’s very difficult at a glance to tell what’s higher and what’s lower in the sea.

Conversely, the individual isosurfaces — the individual chunks of a particular depth — really pop out.

Like, to me, for example, the yellow depth pops out very strongly

and that shape really comes to the attentional fore,

which, if you are making a rock-and-roll poster for the Fillmore, would be awesome,

but if you’re trying to get a sense of the contours of the sea,

what becomes salient to you, the outline of what’s 400 meters below sea level, is probably just not that relevant.

So how could we continue our theme of using color as a representational cue

but have it be more meaningful than you might see in this case

where we’re mapping it to the color spectrum?

And here’s Edward Tufte’s redesign which I think is much better.

There’s a couple of things that I really like about the representation here.

The first one is that all of the things that are above sea level are brown — are kind of an earth tone.

So, we’re leveraging our intuitions about the physical world and using that metaphorically for the map.

So, the land-colored stuff is land.

Similarly, all of the water is blue — the water-colored stuff is water.

And furthermore, we can see how the intensity — or the luminance — of that color blue changes with depth.

And the deeper blues are darker blue, which corresponds to our physical intuitions.

And of course, water doesn’t really get that much darker at the kind of depth that we’re talking about here —

our intuition about darker colors being deeper comes from much shallower depths.

But the idea — you can leverage this thing that we all know that water right by the shore is a paler color

and as you get more of it, it gets darker.

So this is a really wonderful way to see that these darker areas here are deeper than these shallow areas here.
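
A color scheme in this spirit can be sketched in a few lines of Python. Everything specific here is an assumption for illustration: the function name `elevation_color`, the RGB endpoints, and the depth and height limits are made up, not values from Tufte's redesign. The point is only the structure: one hue family for water, one for land, and darkness carrying the ordering.

```python
def elevation_color(meters, max_depth=4000.0, max_height=3000.0):
    """Return an (r, g, b) tuple in 0..255 for an elevation in meters.

    Below sea level: pale blue near the shore, darker blue with depth.
    At or above sea level: light tan near the coast, darker brown with height.
    """
    if meters < 0:
        t = min(-meters / max_depth, 1.0)
        start, end = (200, 230, 255), (0, 30, 90)      # hypothetical blues
    else:
        t = min(meters / max_height, 1.0)
        start, end = (225, 205, 170), (110, 70, 40)    # hypothetical browns
    # Linear interpolation between the two endpoint colors.
    return tuple(round(a + (b - a) * t) for a, b in zip(start, end))


for depth in (-100, -1000, -3000):
    print(depth, elevation_color(depth))
```

Because only the darkness changes within the water, deeper always reads as darker, and no single depth band jumps out the way the yellow band does in the rainbow version.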

With the London Underground map, we saw how the representation of the map —

what makes a good representation — was intrinsically tied up with the task that the user is doing.

Similarly, what makes a good representation is also dependent on what a user’s expertise is.

And a wonderful example of this comes from Herb Simon and [Bill] Chase in the early 1970s.

They were looking at chess as an exemplar domain for trying to understand expertise.

One of the things that they observed was that

expert chess players were much better at being able to remember the configuration of a chess board.

You can imagine a couple hypotheses for this.

So, one of them would be “Experts were born with a better memory for that kind of thing”;

Or, similarly, “Experts by virtue of their ten thousand hours of training

have trained themselves up to build up that muscle and be very good at remembering that kind of thing.”

Neither turns out to be the case.

Experts are much better at remembering the configuration of a board, but only if it’s an actual game!

So if the configuration of the chess board is the configuration of how a chess board could be,

experts do a fantastic job of remembering it.

But if you arranged the pieces on the board in a way that a chess game could not ever reach,

the experts actually do no better than novices at all.

And so what we’re seeing is that the ability of experts to chunk things and have higher memory capacity

is because they are able to leverage their knowledge about the domain.

Game design and user interface design are both concerned

with how easy or hard it is for a user to accomplish a particular task.

The difference is that often game designers want to make it hard, the right hard;

and interface designers want to make things easy.

And so we can learn from this chess example

and we can ask this question as interface designers:

“Can we make interfaces more chunkable?”

Can we make interactions that can be accomplished in one chunk

and therefore place a lower load on our memory and make it easier for users to work with?

A great example of this comes from Bill Buxton

who looked at being able to move text between locations in a document.

And in a common desktop user interface today,

one common operation would be either a keyboard shortcut or a menu command to cut some text,

and then you move the cursor to a new location, and you paste that text.

That’s three different operations and, in between, if you got interrupted,

you might forget what’s in the copy buffer — in fact, I’m sure that’s happened to all of us at some point.

What Buxton realized is what if you could turn all of this, through a gestural interface, into one command?

So, maybe I could grab the text that I want, drag it to a new location, and drop it right there.

That’d be much better: There’s never a time where I could be interrupted and lose state

because all of the state is maintained in this continuous gesture.
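
The difference between the two designs can be sketched as a toy text editor in Python. This is an illustration under assumptions, not Buxton's implementation: the `Editor` class and its methods are hypothetical, and `move` stands in for the continuous gesture. The thing to notice is that cut and paste leave hidden state in a clipboard between steps, while `move` has no in-between moment.

```python
class Editor:
    """A toy editor comparing cut-then-paste with a single move operation."""

    def __init__(self, text):
        self.text = text
        self.clipboard = None  # hidden state that lives between cut and paste

    def cut(self, start, end):
        self.clipboard = self.text[start:end]
        self.text = self.text[:start] + self.text[end:]

    def copy(self, start, end):
        self.clipboard = self.text[start:end]

    def paste(self, position):
        self.text = self.text[:position] + self.clipboard + self.text[position:]

    def move(self, start, end, position):
        """Move text[start:end] in one step.

        position indexes the text as it looks after the span is removed.
        """
        chunk = self.text[start:end]
        rest = self.text[:start] + self.text[end:]
        self.text = rest[:position] + chunk + rest[position:]


# Three-step version, interrupted by an unrelated copy before the paste.
interrupted = Editor("hello world again")
interrupted.cut(6, 12)     # clipboard now holds "world "
interrupted.copy(0, 5)     # the interruption overwrites the clipboard
interrupted.paste(0)
print(interrupted.text)    # the cut text is gone

# One-step version: there is no clipboard to overwrite.
atomic = Editor("hello world again")
atomic.move(6, 12, 0)
print(atomic.text)
```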

We’ve all heard the saying that a picture is worth ten thousand words.

As interface designers, we’re tasked with the project of representing the information to the user

and one task that we commonly have to deal with is:

Should we represent information visually, or should we represent information textually?

The answer of course is it depends.

But one time when representing information visually can be much more effective

is when you can convert slow reasoning tasks into fast perception tasks

by virtue of making them visually salient.

We saw that with the map example: In that case, both the colorings of the map were visual,

but [in] one, the good coloring, it was much easier to understand at a glance what’s going on.

And with the poor coloring, you have to reason about it much more slowly, and the wrong things kept popping out.

If you think about a table of numbers, it can often be difficult to see trends,

whereas if you represent that same information visually,

things can often pop out: the high points, the low points, trends, outliers.

All of that becomes salient and automatically visible to you.
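
Here is a small Python sketch of the same numbers shown both ways. The readings are made-up values for illustration, and the `sparkline` helper is hypothetical; it maps each number onto one of eight Unicode block characters so the row of numbers becomes a tiny bar chart.

```python
BARS = "▁▂▃▄▅▆▇█"


def sparkline(values):
    """Render a sequence of numbers as a one-line bar chart."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # avoid dividing by zero when all values are equal
    top = len(BARS) - 1
    return "".join(BARS[round((v - lo) / span * top)] for v in values)


# Hypothetical monthly readings, January through December.
readings = [12, 13, 15, 18, 21, 24, 26, 25, 22, 18, 14, 12]

print(" ".join(str(r) for r in readings))  # the table: read it one by one
print(sparkline(readings))                 # the picture: the peak pops out
```

With the numbers you have to compare values one by one to find the peak. With the bars, the shape of the year is visible at a glance.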