1 00:00:03,375 --> 00:00:09,641 There are several different ways that interfaces can help people think more fluidly 2 00:00:09,641 --> 00:00:14,706 by distributing their cognition into artifacts in the world. 3 00:00:14,706 --> 00:00:20,403 When interfaces help people distribute cognition, it can encourage experimentation; 4 00:00:20,403 --> 00:00:25,207 it can scaffold learning and reduce errors through redundancy; 5 00:00:25,207 --> 00:00:28,073 it can show only the differences that matter; 6 00:00:28,073 --> 00:00:32,409 it can convert slow calculation into fast perception; 7 00:00:32,409 --> 00:00:35,679 it can support chunking, especially by experts; 8 00:00:35,679 --> 00:00:37,822 it can increase efficiency of interactions; 9 00:00:37,822 --> 00:00:40,566 and it can facilitate collaboration. 10 00:00:40,566 --> 00:00:44,047 Let’s go through these one at a time. 11 00:00:46,031 --> 00:00:52,613 Here’s a video game that, I bet, many of you know: This is the Tetris video game. 12 00:00:52,613 --> 00:00:59,861 And there was a very clever study that David Kirsh and Paul Maglio at UC San Diego ran, 13 00:00:59,861 --> 00:01:05,704 where they looked at Tetris players playing Tetris, and what they found was really interesting. 14 00:01:05,704 --> 00:01:10,829 In Tetris, as you may know, you can use keys to move objects around on the screen. 15 00:01:10,829 --> 00:01:18,456 And at a certain point you can hit the space bar to drop the object and try to complete a row of blocks. 16 00:01:18,456 --> 00:01:21,594 Moving and rotating pieces on the screen may seem like a waste of time 17 00:01:21,594 --> 00:01:25,303 because you have a limited amount of time before the block hits the bottom. 18 00:01:25,303 --> 00:01:28,850 So, how do you use that most efficiently? 19 00:01:28,850 --> 00:01:34,004 It turns out that people move the block around the screen 20 00:01:34,004 --> 00:01:39,072 more than they — in some purely theoretical sense — need to.
21 00:01:39,072 --> 00:01:44,437 So, in essence, they’re trying out different places that the blocks could go. 22 00:01:44,437 --> 00:01:50,623 Maybe this is just for novices: when you’re learning Tetris you need to feel things out, 23 00:01:50,623 --> 00:01:54,430 but as you become more of an expert that’s no longer the case? 24 00:01:54,430 --> 00:01:55,919 Exactly the opposite! 25 00:01:55,919 --> 00:02:02,900 Kirsh and Maglio found that experts actually relied more heavily on moving objects in the world. 26 00:02:02,900 --> 00:02:07,906 And what they were doing is they were distributing the cognitive effort of “what if” scenarios — 27 00:02:07,906 --> 00:02:10,666 What if I placed it over here? What if I placed it over there? 28 00:02:10,666 --> 00:02:13,175 You absolutely can do that in your mind, 29 00:02:13,175 --> 00:02:18,615 but it turns out that in this case, for most people, it’s cheaper to do it out in the world, 30 00:02:18,615 --> 00:02:24,418 to be able to turn the cognitive task of reasoning about all of the what-if scenarios 31 00:02:24,418 --> 00:02:29,606 into a perceptual task of “Oh yeah, that block would work perfectly right there.” 32 00:02:29,606 --> 00:02:34,628 It saves you the effort of having to, in particular, mentally rotate the different pieces 33 00:02:34,628 --> 00:02:38,318 to figure out how they would fit into the screen. 34 00:02:38,318 --> 00:02:43,955 Here’s another example from the learning sciences: These are called Montessori blocks. 35 00:02:43,955 --> 00:02:51,349 And what we have here are beads that provide a physical representation for number[s]. 36 00:02:51,349 --> 00:02:57,291 Especially for young children, numbers are really abstract concepts, difficult to get your head around. 37 00:02:57,291 --> 00:03:04,274 And these physical instantiations can help teach addition, multiplication and other simple arithmetic operations.
38 00:03:04,274 --> 00:03:11,317 So, for example, if I’m going to take three and multiply it by three to get three squared, 39 00:03:11,317 --> 00:03:18,328 well, I can see that I have three by three — I have a square and I can see that it’s composed of nine beads. 40 00:03:18,328 --> 00:03:21,776 And there are other addition and multiplication operations that you could do with these also. 41 00:03:21,776 --> 00:03:24,317 And by having this redundant information — 42 00:03:24,317 --> 00:03:28,648 by taking an abstract concept and reifying it and making it concrete — 43 00:03:28,648 --> 00:03:32,344 it can help scaffold learning and reduce errors. 44 00:03:32,344 --> 00:03:36,023 The Montessori blocks example and the Tetris example 45 00:03:36,023 --> 00:03:42,138 show us the power of providing a visual or physical instantiation of abstract ideas. 46 00:03:42,138 --> 00:03:45,615 So what makes a good representation of this sort? 47 00:03:45,615 --> 00:03:50,960 Well, you should show the information that you need and nothing else. 48 00:03:50,960 --> 00:03:54,752 And what these representations should do is 49 00:03:54,752 --> 00:04:01,056 enable the kinds of tasks that users want to do, like comparison and exploration and problem solving. 50 00:04:01,056 --> 00:04:03,790 And if that seems too abstract or maybe obvious, 51 00:04:03,790 --> 00:04:07,122 let’s take a look at this example from the London Underground. 52 00:04:07,122 --> 00:04:10,489 This subway map was introduced about a century ago 53 00:04:10,489 --> 00:04:16,560 and it was one of the very first maps to introduce a brand-new idea in map design: 54 00:04:16,560 --> 00:04:24,548 that of abstracting the layout of the tracks from the underlying physical geography.
55 00:04:24,548 --> 00:04:29,586 Prior to this London subway map, the maps would show what the geography was, 56 00:04:29,586 --> 00:04:31,921 and so long things were long and short things were short; 57 00:04:31,921 --> 00:04:34,872 and if the tracks wandered, because that’s the way that it works, 58 00:04:34,872 --> 00:04:38,934 then the tracks on the map would wander because that’s the way things worked. 59 00:04:38,934 --> 00:04:42,918 And what the Underground map designers realized was that 60 00:04:42,918 --> 00:04:50,018 the most common task for subway riders is to figure out how to get from point A to point B, 61 00:04:50,018 --> 00:04:54,881 and all of this additional detail of faithfulness to the underlying topography 62 00:04:54,881 --> 00:05:00,521 was getting in the way of that A-to-B task more than it was helping it. 63 00:05:00,521 --> 00:05:04,617 And so what they did is they stripped out a lot of that unnecessary detail, 64 00:05:04,617 --> 00:05:09,604 turning this into vertical, diagonal, and horizontal lines. 65 00:05:09,604 --> 00:05:15,846 So there is some correspondence between the layout on the map and the layout in the real world — 66 00:05:15,846 --> 00:05:18,184 North is north, and south is south, 67 00:05:18,184 --> 00:05:22,192 and things roughly head in the direction that they do in the real world, 68 00:05:22,192 --> 00:05:24,252 but the detail is stripped out. 69 00:05:24,252 --> 00:05:29,646 And this makes it much easier to figure out how to get between connections. 70 00:05:29,646 --> 00:05:32,088 Another thing that they did is they introduced 71 00:05:32,088 --> 00:05:38,082 what a century later we would call a “focus plus context” representation for the map. 72 00:05:38,082 --> 00:05:42,445 In the center of London, the subway stations are very densely packed. 73 00:05:42,445 --> 00:05:46,355 So that area is expanded out: it consumes more of the map real estate.
74 00:05:46,355 --> 00:05:50,869 As you get out toward the suburbs, the stations are fewer and further between. 75 00:05:50,869 --> 00:05:55,355 As opposed to that taking up 90% of the map because it’s 90% of the space, 76 00:05:55,355 --> 00:05:57,991 those stations are actually scrunched, 77 00:05:57,991 --> 00:06:00,829 because if you know you need to go northeast to a particular station, 78 00:06:00,829 --> 00:06:06,504 then the exact distances involved are most of the time less relevant. 79 00:06:06,504 --> 00:06:11,281 Now, of course, what constitutes a good representation is task-specific. 80 00:06:11,281 --> 00:06:15,811 And so what you can see is that by making some tasks easier — 81 00:06:15,811 --> 00:06:19,433 like getting from A to B when you know A and B, 82 00:06:19,433 --> 00:06:23,650 or being able to navigate the center of London more effectively — 83 00:06:23,650 --> 00:06:26,285 you’ve compromised on other tasks. 84 00:06:26,285 --> 00:06:28,871 And so, for example, somebody may need to make decisions 85 00:06:28,871 --> 00:06:32,748 about what stop to get off at based on some underlying topography. 86 00:06:32,748 --> 00:06:39,739 Another task that’s compromised: by virtue of the distances between the stations on the map 87 00:06:39,739 --> 00:06:43,998 not lining up with the distances between stations in the world, 88 00:06:43,998 --> 00:06:48,219 you can make poor judgments about what’s close and what’s far: 89 00:06:48,219 --> 00:06:53,177 In the center you may believe things are far apart when they are really close, 90 00:06:53,177 --> 00:06:58,197 and out in the suburbs, you may believe from the map that things are closer than they actually are. 91 00:06:58,197 --> 00:07:03,806 And so nearly all representation design is about fitness to task. 92 00:07:03,806 --> 00:07:06,878 Here’s a temperature map from the Weather Underground.
93 00:07:06,878 --> 00:07:10,177 It shows the temperature at each location in the Bay Area, 94 00:07:10,177 --> 00:07:17,011 geo-referenced so that the temperature number is placed right on top of that physical location. 95 00:07:17,011 --> 00:07:21,302 What do you think are the benefits and drawbacks of this representation? 96 00:07:21,302 --> 00:07:26,039 What’s it good for and what’s it a problem for? 97 00:07:30,670 --> 00:07:35,438 If you know the physical coordinate that you’d like the temperature for, 98 00:07:35,438 --> 00:07:39,376 say for example, “What temperature is it along the coast?” 99 00:07:39,376 --> 00:07:45,707 and you don’t care or don’t know the exact name of the town, this works incredibly well. 100 00:07:45,707 --> 00:07:50,024 It’s also a reasonable interface, in some sense, 101 00:07:50,024 --> 00:07:54,481 for trying to get a good [inaudible] of what the temperatures are like in the area overall: 102 00:07:54,481 --> 00:08:00,425 I can see, for example, that as I head inland the temperature tends to get warmer. 103 00:08:00,425 --> 00:08:03,958 There are a lot of ways in which we could make this better. 104 00:08:03,958 --> 00:08:10,147 So, right now, every temperature is shown identically no matter what temperature it is, 105 00:08:10,147 --> 00:08:15,138 which means it’s hard to scan: I have to read every single temperature one by one. 106 00:08:15,138 --> 00:08:22,289 I could make this better if instead I had the temperature number somehow — 107 00:08:22,289 --> 00:08:29,088 the color or size of the temperature number — correspond to the temperature. 108 00:08:29,088 --> 00:08:36,533 But if we’re going to start mapping the variables of the map to something like color, 109 00:08:36,533 --> 00:08:39,448 we need to be careful to get it right. 110 00:08:39,448 --> 00:08:43,353 This is an example that comes from Edward Tufte.
111 00:08:43,353 --> 00:08:49,315 His books on visual design and the graphical representations of data are fantastic 112 00:08:49,315 --> 00:08:50,856 and I strongly encourage you to read them. 113 00:08:50,856 --> 00:08:58,596 In this map, we see how a computer scientist might make a mapping for Japan. 114 00:08:58,596 --> 00:09:04,235 This is showing the height above or below sea level as color, 115 00:09:04,235 --> 00:09:11,963 and what you can see is the depth below sea level is represented by the color spectrum Roy G. Biv. 116 00:09:11,963 --> 00:09:21,007 Now, one of the challenges of hue is that it’s not an additive representation. 117 00:09:21,007 --> 00:09:26,901 So there isn’t really a strong ordering that we give to the colors perceptually. 118 00:09:26,901 --> 00:09:32,961 It’s a substitutive representation: red and yellow are qualitatively different, 119 00:09:32,961 --> 00:09:37,113 but we don’t automatically have a more-than or less-than relationship between the two. 120 00:09:37,113 --> 00:09:39,095 So one problem with this representation 121 00:09:39,095 --> 00:09:45,093 is that it’s very difficult at a glance to tell what’s higher and what’s lower in the sea. 122 00:09:45,093 --> 00:09:55,938 Conversely, the individual isosurfaces — the individual chunks of a particular depth — really pop out. 123 00:09:55,938 --> 00:10:01,948 Like, to me, for example, the yellow depth pops out very strongly 124 00:10:01,948 --> 00:10:05,846 and that shape really comes to the attentional fore, 125 00:10:05,846 --> 00:10:10,942 which, if you are making a rock-and-roll poster for the Fillmore, would be awesome, 126 00:10:10,942 --> 00:10:15,266 but if you’re trying to get a sense of the contours of the sea, 127 00:10:15,266 --> 00:10:23,435 what becomes salient to you, the outline of what’s 400 meters below sea level, is probably just not that relevant.
128 00:10:23,435 --> 00:10:28,301 So how could we continue our theme of using color as a representational cue 129 00:10:28,301 --> 00:10:32,805 but have it be more meaningful than you might see in this case 130 00:10:32,805 --> 00:10:35,565 where we’re mapping it to the color spectrum? 131 00:10:35,565 --> 00:10:39,643 And here’s Edward Tufte’s redesign which I think is much better. 132 00:10:39,643 --> 00:10:43,701 There are a couple of things that I really like about the representation here. 133 00:10:43,701 --> 00:10:51,250 The first one is that all of the things that are above sea level are brown — are kind of an earth tone. 134 00:10:51,250 --> 00:10:56,566 So, we’re leveraging our intuitions about the physical world and using that metaphorically for the map. 135 00:10:56,566 --> 00:11:00,874 So, the land-colored stuff is land. 136 00:11:00,874 --> 00:11:05,577 Similarly, all of the water is blue — the water-colored stuff is water. 137 00:11:05,577 --> 00:11:15,846 And furthermore, we can see how the intensity — or the luminance — of that color blue changes with depth. 138 00:11:15,846 --> 00:11:22,773 And the deeper areas are a darker blue, which corresponds to our physical intuitions. 139 00:11:22,773 --> 00:11:29,651 And of course, water doesn’t really get that much darker at the kind of depth that we’re talking about here — 140 00:11:29,651 --> 00:11:34,762 our intuition about darker colors being deeper comes from much shallower depths. 141 00:11:34,762 --> 00:11:41,224 But the idea is that you can leverage this thing that we all know: that water right by the shore is a paler color 142 00:11:41,224 --> 00:11:44,359 and as you get more of it, it gets darker. 143 00:11:44,359 --> 00:11:51,860 So this is a really wonderful way to see that these darker areas here are deeper than these shallow areas here.
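[Editor's sketch] The redesign's color mapping described above can be approximated in a few lines of Python. This is only an illustration under stated assumptions, not Tufte's actual palette: the function names, the RGB values, and the 8000-meter depth cap are all hypothetical. Land gets a single earth tone, and water goes from pale blue to dark blue as depth increases, so that depth maps to an ordered change in luminance rather than to a change in hue.

```python
# Hypothetical palette; the specific RGB values are assumptions for illustration.
LAND_BROWN = (181, 146, 101)      # one earth tone for everything above sea level
SHALLOW_BLUE = (200, 225, 255)    # pale blue near the shore
DEEP_BLUE = (10, 40, 110)         # dark blue at the assumed maximum depth


def elevation_to_rgb(elevation_m, max_depth_m=8000.0):
    """Map elevation in meters (negative means below sea level) to an RGB tuple."""
    if elevation_m >= 0:
        return LAND_BROWN
    # Fraction of the way from the surface to the assumed maximum depth, capped at 1.
    t = min(-elevation_m / max_depth_m, 1.0)
    # Linear blend from pale to dark blue: the hue stays blue, only lightness changes.
    return tuple(round(s + t * (d - s)) for s, d in zip(SHALLOW_BLUE, DEEP_BLUE))


def rough_luminance(rgb):
    """Rough brightness estimate using the common 0.2126/0.7152/0.0722 weights."""
    r, g, b = rgb
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


if __name__ == "__main__":
    for depth in (0, -400, -2000, -8000):
        color = elevation_to_rgb(depth)
        print(depth, color, round(rough_luminance(color), 1))
```

Because brightness falls steadily as depth grows, "darker" can be read directly as "deeper", which is the more-than/less-than ordering that the rainbow version lacks.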
144 00:11:51,860 --> 00:11:58,196 With the London Underground map, we saw how the representation of the map — 145 00:11:58,196 --> 00:12:03,869 what makes a good representation — was intrinsically tied up with the task that the user is doing. 146 00:12:03,869 --> 00:12:11,972 Similarly, what makes a good representation is also dependent on what a user’s expertise is. 147 00:12:11,972 --> 00:12:18,696 And a wonderful example of this comes from Herb Simon and [Bill] Chase in the early 1970s. 148 00:12:18,696 --> 00:12:24,965 They were looking at chess as an exemplar domain for trying to understand expertise. 149 00:12:24,965 --> 00:12:28,229 One of the things that they observed was that 150 00:12:28,229 --> 00:12:34,817 expert chess players were much better at being able to remember the configuration of a chess board. 151 00:12:34,817 --> 00:12:38,580 You can imagine a couple of hypotheses for this. 152 00:12:38,580 --> 00:12:46,659 So, one of them would be “Experts were born with a better memory for that kind of thing”; 153 00:12:46,659 --> 00:12:51,643 Or, similarly, “Experts by virtue of their ten thousand hours of training 154 00:12:51,643 --> 00:12:56,656 have trained themselves up to build up that muscle and be very good at remembering that kind of thing.” 155 00:12:56,656 --> 00:13:00,265 Neither turns out to be the case. 156 00:13:00,265 --> 00:13:12,124 Experts are much better at remembering the configuration of a board, but only if it’s an actual game! 157 00:13:12,124 --> 00:13:18,978 So if the configuration of the chess board is a configuration that a chess board could actually be in, 158 00:13:18,978 --> 00:13:22,332 experts do a fantastic job of remembering it. 159 00:13:22,332 --> 00:13:28,536 But if you arrange the pieces on the board in a way that a chess game could not ever achieve, 160 00:13:28,536 --> 00:13:32,433 the experts actually do no better than novices at all.
161 00:13:32,433 --> 00:13:41,286 And so what we’re seeing is that the ability of experts to chunk things and have higher memory capacity 162 00:13:41,286 --> 00:13:46,597 is because they are able to leverage their knowledge about the domain. 163 00:13:46,597 --> 00:13:51,547 Game design and user interface design are both concerned 164 00:13:51,547 --> 00:13:56,035 with how easy or hard it is for a user to accomplish a particular task. 165 00:13:56,035 --> 00:14:00,524 The difference is that often game designers want to make it hard, the right kind of hard; 166 00:14:00,524 --> 00:14:04,325 and interface designers want to make things easy. 167 00:14:04,325 --> 00:14:07,024 And so we can learn from this chess example 168 00:14:07,024 --> 00:14:10,047 and we can ask this question as interface designers: 169 00:14:10,047 --> 00:14:13,748 “Can we make interfaces more chunkable?” 170 00:14:13,748 --> 00:14:17,807 Can we make interactions that can be accomplished in one chunk 171 00:14:17,807 --> 00:14:23,665 and therefore place a lower load on our memory and make it easier for users to work with? 172 00:14:23,665 --> 00:14:26,455 A great example of this comes from Bill Buxton 173 00:14:26,455 --> 00:14:32,650 who looked at being able to move text between locations in a document. 174 00:14:32,650 --> 00:14:37,055 And in a common desktop user interface today, 175 00:14:37,055 --> 00:14:43,543 one common operation would be either a keyboard shortcut or a menu command to cut some text, 176 00:14:43,543 --> 00:14:47,588 and then you move the cursor to a new location, and you paste that text. 177 00:14:47,588 --> 00:14:54,479 That’s three different operations and, in between, if you got interrupted, 178 00:14:54,479 --> 00:14:59,351 you might forget what’s in the copy buffer — in fact, I’m sure that’s happened to all of us at some point. 179 00:14:59,351 --> 00:15:08,553 What Buxton asked is: what if you could turn all of this, through a gestural interface, into one command?
180 00:15:08,553 --> 00:15:16,572 So, maybe I could grab the text that I want, drag it to a new location and drop it right there. 181 00:15:16,572 --> 00:15:22,652 That’d be much better: There’s never a time where I could be interrupted and lose state 182 00:15:22,652 --> 00:15:28,092 because all of the state is maintained in this continuous gesture. 183 00:15:28,092 --> 00:15:31,709 We’ve all heard the saying that a picture is worth ten thousand words. 184 00:15:31,709 --> 00:15:39,801 As interface designers, we’re tasked with the project of representing information to the user 185 00:15:39,801 --> 00:15:44,831 and one question that we commonly have to deal with is: 186 00:15:44,831 --> 00:15:50,518 Should we represent information visually, or should we represent information textually? 187 00:15:50,518 --> 00:15:54,025 The answer of course is it depends. 188 00:15:54,025 --> 00:15:59,909 But one time when representing information visually can be much more effective 189 00:15:59,909 --> 00:16:06,026 is when you can convert slow reasoning tasks into fast perception tasks 190 00:16:06,026 --> 00:16:09,950 by virtue of making them visually salient. 191 00:16:09,950 --> 00:16:15,396 We saw that with the map example: In that case, both the colorings of the map were visual, 192 00:16:15,396 --> 00:16:21,482 but [in] one, it was much easier to understand at a glance what’s going on: the good coloring. 193 00:16:21,482 --> 00:16:27,095 In the poor coloring, you have to reason about it much more slowly and the wrong things kept popping out. 194 00:16:27,095 --> 00:16:31,365 If you think about a table of numbers, it can often be difficult to see trends, 195 00:16:31,365 --> 00:16:34,936 whereas if you represent that same information visually, 196 00:16:34,936 --> 00:16:40,482 things can often pop out: the high points, the low points, trends, outliers — 197 00:16:40,482 --> 99:59:59,999 all of that becomes salient and automatically visible to you.
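[Editor's sketch] The contrast between a table of numbers and a visual encoding can be tried out with a few lines of Python. This is a minimal illustration, not something from the lecture: the `text_bars` helper and the sample readings are hypothetical. Each value is printed next to a bar whose length is proportional to it, so the outlier is visible without reading any digits.

```python
def text_bars(values, width=20):
    """Return one line per value: the number followed by a proportional bar of '#'."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # avoid dividing by zero when all values are equal
    lines = []
    for v in values:
        # Scale each value to a bar length between 1 and `width` characters.
        n = 1 + round((v - lo) / span * (width - 1))
        lines.append(f"{v:>6} " + "#" * n)
    return lines


if __name__ == "__main__":
    readings = [3, 9, 4, 27, 5]  # made-up sample data
    print("\n".join(text_bars(readings)))
```

Scanning the left column means comparing numbers one by one; scanning the bars turns the same comparison into a judgment of length, which is the slow-reasoning-to-fast-perception conversion described above.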