1
00:00:03,375 --> 00:00:09,641
There are several different ways that interfaces can help people think more fluidly

2
00:00:09,641 --> 00:00:14,706
by distributing their cognition into artifacts in the world.


3
00:00:14,706 --> 00:00:20,403
When interfaces help people distribute cognition, it can encourage experimentation;

4
00:00:20,403 --> 00:00:25,207
it can scaffold learning and reduce errors through redundancy;

5
00:00:25,207 --> 00:00:28,073
it can show <i>only</i> the differences that matter;

6
00:00:28,073 --> 00:00:32,409
it can convert slow calculation into fast perception;

7
00:00:32,409 --> 00:00:35,679
it can support chunking, especially by experts;

8
00:00:35,679 --> 00:00:37,822
it can increase efficiency of interactions;

9
00:00:37,822 --> 00:00:40,566
and it can facilitate collaboration.

10
00:00:40,566 --> 00:00:44,047
Let’s go through these one at a time.

11
00:00:46,031 --> 00:00:52,613
Here’s a video game that, I bet, many of you all know: This is the Tetris video game.


12
00:00:52,613 --> 00:00:59,861
And there was a very clever study that David Kirsh and Paul Maglio at UC San Diego ran,

13
00:00:59,861 --> 00:01:05,704
where they looked at Tetris players playing Tetris, and what they found was really interesting.


14
00:01:05,704 --> 00:01:10,829
In Tetris, as you may know, you can use keys to move objects around on the screen.

15
00:01:10,829 --> 00:01:18,456
And at a certain point you could hit the space bar to drop the object and try and get a row of a few blocks.


16
00:01:18,456 --> 00:01:21,594
Moving and rotating pieces on the screen may seem like a waste of time

17
00:01:21,594 --> 00:01:25,303
because you have a limited amount of time before the block hits the bottom.

18
00:01:25,303 --> 00:01:28,850
So, how do you use that most efficiently?

19
00:01:28,850 --> 00:01:34,004
It turns out that people move the block around the screen

20
00:01:34,004 --> 00:01:39,072
more than they — in some purely theoretical sense — need to.

21
00:01:39,072 --> 00:01:44,437
So, in essence, I’m trying out different places that the blocks could go.

22
00:01:44,437 --> 00:01:50,623
<i>Maybe</i> this is just for novices: when you’re learning Tetris you need to feel things out,

23
00:01:50,623 --> 00:01:54,430
but as you become more of an expert that’s no longer the case?

24
00:01:54,430 --> 00:01:55,919
Exactly the opposite!

25
00:01:55,919 --> 00:02:02,900
Kirsh and Maglio found that experts actually relied more heavily on moving objects in the world.

26
00:02:02,900 --> 00:02:07,906
And what they were doing is they were distributing the cognitive effort of “what if” scenarios —

27
00:02:07,906 --> 00:02:10,666
What if I placed it over here? What if I placed it over there?

28
00:02:10,666 --> 00:02:13,175
You absolutely can do that in your mind,

29
00:02:13,175 --> 00:02:18,615
but it turns out that in this case, for most people, it’s cheaper to do it out in the world


30
00:02:18,615 --> 00:02:24,418
to be able to turn the cognitive task of reasoning about all of the what-if scenarios

31
00:02:24,418 --> 00:02:29,606
into a perceptual task of “Oh yeah, that block would work perfectly right there.”

32
00:02:29,606 --> 00:02:34,628
It saves you the effort of having to, in particular, mentally rotate the different pieces

33
00:02:34,628 --> 00:02:38,318
to figure out how they would fit into the screen.

34
00:02:38,318 --> 00:02:43,955
Here’s another example from the learning sciences: These are called Montessori blocks.

35
00:02:43,955 --> 00:02:51,349
And what we have here are beads that provide a physical representation for numbers.

36
00:02:51,349 --> 00:02:57,291
Especially for young children, numbers are really abstract concepts, difficult to get your head around.

37
00:02:57,291 --> 00:03:04,274
And these physical instantiations can help teach addition, multiplication and other simple arithmetic operations.

38
00:03:04,274 --> 00:03:11,317
So, for example, if I’m going to take three and multiply it by three to get three squared,

39
00:03:11,317 --> 00:03:18,328
well, I can see that I have three by three — I have a square and I can see that it’s composed of nine beads.

40
00:03:18,328 --> 00:03:21,776
And there are other addition and multiplication options that you could do with these also.


41
00:03:21,776 --> 00:03:24,317
And by having this redundant information —

42
00:03:24,317 --> 00:03:28,648
by taking an abstract concept and reifying it, making it concrete —

43
00:03:28,648 --> 00:03:32,344
it can help scaffold learning and reduce errors.

44
00:03:32,344 --> 00:03:36,023
The Montessori blocks example and the Tetris example

45
00:03:36,023 --> 00:03:42,138
show us the power of providing a visual or physical instantiation of abstract ideas.

46
00:03:42,138 --> 00:03:45,615
So what makes a good representation of this sort?

47
00:03:45,615 --> 00:03:50,960
Well, you should show the information that you need and nothing else.

48
00:03:50,960 --> 00:03:54,752
And, what these representations should do is

49
00:03:54,752 --> 00:04:01,056
enable the kinds of tasks that users want to do, like comparison, exploration, and problem solving.

50
00:04:01,056 --> 00:04:03,790
And if that seems too abstract or maybe obvious,

51
00:04:03,790 --> 00:04:07,122
let’s take a look at this example from the London Underground.

52
00:04:07,122 --> 00:04:10,489
This subway map was introduced about a century ago

53
00:04:10,489 --> 00:04:16,560
and it was one of the very first maps to introduce a brand-new idea in map design:

54
00:04:16,560 --> 00:04:24,548
abstracting the layout of the tracks from the underlying physical geography.

55
00:04:24,548 --> 00:04:29,586
Prior to this London subway map, the maps would show what the geography was

56
00:04:29,586 --> 00:04:31,921
and so long things were long and short things were short;

57
00:04:31,921 --> 00:04:34,872
and if the tracks wandered because that’s the way that it works,

58
00:04:34,872 --> 00:04:38,934
then the tracks on the map would wander because that’s the way things worked.

59
00:04:38,934 --> 00:04:42,918
And what the Underground map designers realized was that

60
00:04:42,918 --> 00:04:50,018
the most common task for subway riders is to figure out how to get from point A to point B,

61
00:04:50,018 --> 00:04:54,881
and all of this additional detail of faithfulness to the underlying topography

62
00:04:54,881 --> 00:05:00,521
was getting in the way of that A-to-B task more than it was helping it.

63
00:05:00,521 --> 00:05:04,617
And so what they did is they stripped out a lot of that unnecessary detail,

64
00:05:04,617 --> 00:05:09,604
turning this into vertical, diagonal, and horizontal lines.

65
00:05:09,604 --> 00:05:15,846
So there is some correspondence between the layout on the map and the layout in the real world —

66
00:05:15,846 --> 00:05:18,184
North is north, and south is south,

67
00:05:18,184 --> 00:05:22,192
and things roughly head in the direction that they do in the real world,

68
00:05:22,192 --> 00:05:24,252
but the detail is stripped out.

69
00:05:24,252 --> 00:05:29,646
And this makes it much easier to be able to figure out how to get between connections.

70
00:05:29,646 --> 00:05:32,088
Another thing that they did is they introduced

71
00:05:32,088 --> 00:05:38,082
what a century later we would call a “focus plus context” representation for the map.

72
00:05:38,082 --> 00:05:42,445
In the center of London, the subway stations are very densely packed.

73
00:05:42,445 --> 00:05:46,355
So that area is expanded out: it consumes more of the map real estate.

74
00:05:46,355 --> 00:05:50,869
As you get out toward the suburbs, the stations are fewer and further between.

75
00:05:50,869 --> 00:05:55,355
As opposed to that taking up 90% of the map because it’s 90% of the space,

76
00:05:55,355 --> 00:05:57,991
those stations are actually scrunched,

77
00:05:57,991 --> 00:06:00,829
because if you know you need to go northeast to a particular station,

78
00:06:00,829 --> 00:06:06,504
then the exact distances involved are most of the time less relevant.

79
00:06:06,504 --> 00:06:11,281
Now, of course, what constitutes a good representation is task-specific.

80
00:06:11,281 --> 00:06:15,811
And so what you can see is that by making some tasks easier —

81
00:06:15,811 --> 00:06:19,433
like getting from A to B when you know A and B,

82
00:06:19,433 --> 00:06:23,650
or being able to navigate the center of London more effectively —

83
00:06:23,650 --> 00:06:26,285
you’ve compromised on other tasks.

84
00:06:26,285 --> 00:06:28,871
One example is somebody who may need to make decisions

85
00:06:28,871 --> 00:06:32,748
about what stop to get off at based on some underlying topography.

86
00:06:32,748 --> 00:06:39,739
Another task that’s compromised is judging distance: because the distances between stations on the map

87
00:06:39,739 --> 00:06:43,998
don’t line up with the distances between stations in the world,


88
00:06:43,998 --> 00:06:48,219
you can make poor judgments about what’s close and what’s far:

89
00:06:48,219 --> 00:06:53,177
In the center you may believe things are far apart when they are really close,

90
00:06:53,177 --> 00:06:58,197
and out in the suburbs, you may believe from the map that things are closer than they actually are.

91
00:06:58,197 --> 00:07:03,806
And so nearly all representation design is about fitness to task.

92
00:07:03,806 --> 00:07:06,878
Here’s a temperature map from the Weather Underground.

93
00:07:06,878 --> 00:07:10,177
It shows the temperature at each location in the Bay Area,

94
00:07:10,177 --> 00:07:17,011
geo-referenced so that the temperature number is placed right on top of that physical location.

95
00:07:17,011 --> 00:07:21,302
What do you think are the benefits and drawbacks of this representation?

96
00:07:21,302 --> 00:07:26,039
What’s it good for and what’s it a problem for?

97
00:07:30,670 --> 00:07:35,438
If you know the physical coordinate that you’d like the temperature for,

98
00:07:35,438 --> 00:07:39,376
say for example, “What temperature is it along the coast?”

99
00:07:39,376 --> 00:07:45,707
and I don’t care or don’t know the exact name of the town, this works incredibly well.

100
00:07:45,707 --> 00:07:50,024
It’s also a reasonable interface, in some sense,

101
00:07:50,024 --> 00:07:54,481
for trying to get a good [inaudible] of what the temperatures are like in the area overall:

102
00:07:54,481 --> 00:08:00,425
I can see, for example, as I head inland the temperature tends to get warmer.

103
00:08:00,425 --> 00:08:03,958
There are a lot of ways in which we could make this better.

104
00:08:03,958 --> 00:08:10,147
So, right now, every temperature is shown identically no matter what temperature it is

105
00:08:10,147 --> 00:08:15,138
which means it’s hard to scan: I have to read every single temperature one by one.

106
00:08:15,138 --> 00:08:22,289
I could make this better if instead I had the temperature number somehow —

107
00:08:22,289 --> 00:08:29,088
the color or size of the temperature number — correspond to the weather.


108
00:08:29,088 --> 00:08:36,533
But if we’re going to start mapping the variables of the map to something like color,

109
00:08:36,533 --> 00:08:39,448
we need to be careful to get it right.

110
00:08:39,448 --> 00:08:43,353
This is an example that comes from Edward Tufte.

111
00:08:43,353 --> 00:08:49,315
His books on visual design and the graphical representations of data are fantastic

112
00:08:49,315 --> 00:08:50,856
and I strongly encourage you to read them.

113
00:08:50,856 --> 00:08:58,596
In this map, we see how a computer scientist might make a mapping for Japan.

114
00:08:58,596 --> 00:09:04,235
This is showing the height above or below sea level as color,

115
00:09:04,235 --> 00:09:11,963
and what you can see is that the depth below sea level is represented by the color spectrum, ROY G. BIV.

116
00:09:11,963 --> 00:09:21,007
Now, one of the challenges of hue is that it’s not an additive representation.

117
00:09:21,007 --> 00:09:26,901
So it’s not really a strong ordering that we give to the colors perceptually.

118
00:09:26,901 --> 00:09:32,961
It’s a substitutive representation that red and yellow are qualitatively different,

119
00:09:32,961 --> 00:09:37,113
but we don’t automatically have a more-than or less-than relationship between the two.

120
00:09:37,113 --> 00:09:39,095
Another problem with this representation

121
00:09:39,095 --> 00:09:45,093
is that it’s very difficult at a glance to tell what’s higher and what’s lower in the sea.

122
00:09:45,093 --> 00:09:55,938
Conversely, the individual isosurfaces — the individual chunks of a particular depth — really pop out.

123
00:09:55,938 --> 00:10:01,948
Like, to me, for example, the yellow depth pops out very strongly

124
00:10:01,948 --> 00:10:05,846
and that shape really comes to the attentional fore,

125
00:10:05,846 --> 00:10:10,942
which, if you are making a rock-and-roll poster for the Fillmore, would be awesome,

126
00:10:10,942 --> 00:10:15,266
but if you’re trying to get a sense of the contours of the sea,

127
00:10:15,266 --> 00:10:23,435
what becomes salient to you, the outline of what’s 400 meters below sea level, is probably just not that relevant.

128
00:10:23,435 --> 00:10:28,301
So how could we continue our theme of using color as a representational cue

129
00:10:28,301 --> 00:10:32,805
but have it be more meaningful than you might see in this case

130
00:10:32,805 --> 00:10:35,565
where we’re mapping it to the color spectrum?

131
00:10:35,565 --> 00:10:39,643
And here’s Edward Tufte’s redesign which I think is much better.

132
00:10:39,643 --> 00:10:43,701
There’s a couple of things that I really like about the representation here.

133
00:10:43,701 --> 00:10:51,250
The first one is that all of the things that are above sea level are brown — are kind of an earth tone.

134
00:10:51,250 --> 00:10:56,566
So, we’re leveraging our intuitions about the physical world and using that metaphorically for the map.

135
00:10:56,566 --> 00:11:00,874
So, the land-colored stuff is land.

136
00:11:00,874 --> 00:11:05,577
Similarly, all of the water is blue — the water-colored stuff is water.

137
00:11:05,577 --> 00:11:15,846
And furthermore, we can see how the intensity — or the luminance — of that color blue changes with depth.

138
00:11:15,846 --> 00:11:22,773
And the deeper blues are darker blue, which corresponds to our physical intuitions.

139
00:11:22,773 --> 00:11:29,651
And of course, water doesn’t really get that much darker at the kind of depth that we’re talking about here —

140
00:11:29,651 --> 00:11:34,762
our intuition about darker colors being deeper comes from much shallower depths.

141
00:11:34,762 --> 00:11:41,224
But the idea is that you can leverage this thing that we all know: that water right by the shore is a paler color

142
00:11:41,224 --> 00:11:44,359
and as you get more of it, it gets darker.

143
00:11:44,359 --> 00:11:51,860
So this is a really wonderful way to see that these darker areas here are deeper than these shallow areas here.

144
00:11:51,860 --> 00:11:58,196
With the London Underground map, we saw how the representation of the map —

145
00:11:58,196 --> 00:12:03,869
what makes a good representation — was intrinsically tied up with the task that the user is doing.

146
00:12:03,869 --> 00:12:11,972
Similarly, what makes a good representation is also dependent on what a user’s expertise is.

147
00:12:11,972 --> 00:12:18,696
And a wonderful example of this comes from Herb Simon and Bill Chase in the early 1970s.

148
00:12:18,696 --> 00:12:24,965
They were looking at chess as an exemplar domain for trying to understand expertise.

149
00:12:24,965 --> 00:12:28,229
One of the things that they observed was that

150
00:12:28,229 --> 00:12:34,817
expert chess players were much better at being able to remember the configuration of a chess board.


151
00:12:34,817 --> 00:12:38,580
You can imagine a couple hypotheses for this.

152
00:12:38,580 --> 00:12:46,659
So, one of them would be “Experts were born with a better memory for that kind of thing”;

153
00:12:46,659 --> 00:12:51,643
Or, similarly, “Experts by virtue of their ten thousand hours of training

154
00:12:51,643 --> 00:12:56,656
have trained themselves up to build up that muscle and be very good at remembering that kind of thing.”

155
00:12:56,656 --> 00:13:00,265
Neither turns out to be the case.

156
00:13:00,265 --> 00:13:12,124
Experts are much better at remembering the configuration of a board, but only if it’s an actual game!


157
00:13:12,124 --> 00:13:18,978
So if the configuration of the chess board is the configuration of how a chess board could be,

158
00:13:18,978 --> 00:13:22,332
experts do a fantastic job of remembering it.

159
00:13:22,332 --> 00:13:28,536
But if you arranged the pieces on the board in a way that a chess game could not ever achieve,

160
00:13:28,536 --> 00:13:32,433
the experts actually do no better than novices at all.

161
00:13:32,433 --> 00:13:41,286
And so what we’re seeing is that the ability of experts to chunk things and have higher memory capacity

162
00:13:41,286 --> 00:13:46,597
is because they are able to leverage their knowledge about the domain.

163
00:13:46,597 --> 00:13:51,547
Game design and user interface design are both concerned

164
00:13:51,547 --> 00:13:56,035
with how easy or hard it is for a user to accomplish a particular task.

165
00:13:56,035 --> 00:14:00,524
The difference is that often game designers want to make it hard, the right kind of hard;


166
00:14:00,524 --> 00:14:04,325
and interface designers want to make things easy.


167
00:14:04,325 --> 00:14:07,024
And so we can learn from this chess example

168
00:14:07,024 --> 00:14:10,047
and we can ask this question as interface designers:

169
00:14:10,047 --> 00:14:13,748
“Can we make interfaces more chunkable?”

170
00:14:13,748 --> 00:14:17,807
Can we make interactions that can be accomplished in one chunk

171
00:14:17,807 --> 00:14:23,665
and therefore place a lower load on our memory and make it easier for users to work with?

172
00:14:23,665 --> 00:14:26,455
A great example of this comes from Bill Buxton

173
00:14:26,455 --> 00:14:32,650
who looked at being able to move text between locations in a document.

174
00:14:32,650 --> 00:14:37,055
And in a common desktop user interface today,

175
00:14:37,055 --> 00:14:43,543
one common operation would be either a keyboard shortcut or a menu command to cut some text,

176
00:14:43,543 --> 00:14:47,588
and then you move the cursor to a new location, and you paste that text.

177
00:14:47,588 --> 00:14:54,479
That’s three different operations and, in between, if you got interrupted,

178
00:14:54,479 --> 00:14:59,351
you might forget what’s in the copy buffer — in fact, I’m sure that’s happened to all of us at some point.

179
00:14:59,351 --> 00:15:08,553
What Buxton realized is what if you could turn all of this, through a gestural interface, into one command?

180
00:15:08,553 --> 00:15:16,572
So, maybe I could grab the text that I want, drag it to a new location, and drop it right there.

181
00:15:16,572 --> 00:15:22,652
That’d be much better: There’s never a time where I could be interrupted and lose state

182
00:15:22,652 --> 00:15:28,092
because all of the state is maintained in this continuous gesture.

183
00:15:28,092 --> 00:15:31,709
We’ve all heard the saying that a picture is worth ten thousand words.

184
00:15:31,709 --> 00:15:39,801
As interface designers, we’re tasked with the project of representing the information to the user

185
00:15:39,801 --> 00:15:44,831
and one task that we commonly have to deal with is:

186
00:15:44,831 --> 00:15:50,518
Should we represent information visually, or should we represent information textually?

187
00:15:50,518 --> 00:15:54,025
The answer of course is it depends.

188
00:15:54,025 --> 00:15:59,909
But one time when representing information visually can be much more effective

189
00:15:59,909 --> 00:16:06,026
is when you can convert slow reasoning tasks into fast perception tasks

190
00:16:06,026 --> 00:16:09,950
by virtue of making them visually salient.

191
00:16:09,950 --> 00:16:15,396
We saw that with the map example: In that case, both the colorings of the map were visual,


192
00:16:15,396 --> 00:16:21,482
but in one, the good coloring, it was much easier to understand at a glance what’s going on.

193
00:16:21,482 --> 00:16:27,095
And with the poor coloring, you have to reason about it much more slowly, and the wrong things keep popping out.

194
00:16:27,095 --> 00:16:31,365
If you think about a table of numbers, it can often be difficult to see trends,

195
00:16:31,365 --> 00:16:34,936
whereas if you represent that same information visually,

196
00:16:34,936 --> 00:16:40,482
what often pops out are the high points, the low points, trends, outliers —

197
00:16:40,482 --> 99:59:59,999
all of that becomes salient and automatically visible to you.