0:00:00.630,0:00:04.030 Welcome to CS 101. I'm Dave Evans. I'll be your guide on this journey.
0:00:04.030,0:00:07.047 This course will introduce you to the fundamental ideas in computing
0:00:07.047,0:00:09.563 and teach you to read and write your own computer programs.
0:00:09.563,0:00:13.063 We're going to do all that in the context of building a Web search engine.
0:00:13.063,0:00:16.363 I'm guessing everyone here has at least used a search engine before.
0:00:16.363,0:00:19.562 The goal of the first three units in this course is to build a Web crawler
0:00:19.562,0:00:22.129 that will collect data from the Web for our search engine,
0:00:22.129,0:00:24.663 and to learn about big ideas in computing by doing that.
0:00:24.663,0:00:29.680 In Unit 1, we'll get started by extracting the first link on a web page.
0:00:29.680,0:00:32.730 A Web crawler finds web pages for our search engine
0:00:32.730,0:00:37.797 by starting from a "seed" page and following links on that page to find other pages.
0:00:37.797,0:00:43.930 Each of those links leads to some new web page, which itself could have links that lead to other pages.
0:00:43.930,0:00:46.507 As we follow those links, we'll find more and more web pages,
0:00:46.507,0:00:50.232 building a collection of data that we'll use for our search engine.
0:00:50.479,0:00:54.712 A web page is really just a chunk of text that comes from the Internet into your Web browser.
0:00:54.712,0:00:56.580 We'll talk more about how that works in Unit 4.
0:00:56.580,0:00:59.563 But for now, the important thing to understand is that
0:00:59.563,0:01:02.497 a link is really just a special kind of text in that web page.
0:01:02.497,0:01:07.347 When you click on a link in your browser, it will direct you to a new page.
0:01:07.347,0:01:09.496 And you can keep following those links (...)
0:01:09.496,0:01:14.213 What we'll do in this unit is write a program to extract that first link from the web page.
0:01:14.213,0:01:18.213 In later units, we'll figure out how to extract all the links and build the collection for our search engine.
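
Since a link is just a special kind of text in the page, extracting the first one can be sketched with plain string searching. This is only an illustration, not the program built in the course: the function name get_first_link, the sample page text, and the example.com URLs are made up here, and it assumes links are written as <a href="..."> with double quotes.

```python
def get_first_link(page):
    """Return the URL of the first <a href="..."> link in page, or None.

    Hypothetical helper for illustration; assumes double-quoted href values.
    """
    # Find where the first link tag begins.
    start_link = page.find('<a href=')
    if start_link == -1:
        return None  # no link on this page

    # The URL sits between the next pair of double quotes.
    start_quote = page.find('"', start_link)
    if start_quote == -1:
        return None
    end_quote = page.find('"', start_quote + 1)
    if end_quote == -1:
        return None

    return page[start_quote + 1:end_quote]


# A made-up page: just a chunk of text containing two links.
page = ('<html><body>Intro text '
        '<a href="http://example.com/first">first</a> and '
        '<a href="http://example.com/second">second</a>'
        '</body></html>')

print(get_first_link(page))  # prints http://example.com/first
```

A crawler would repeat this kind of search to collect every link on a page, then visit those pages in turn, starting from the seed page.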