Module 10 Part 3 - 3rd party modules

  • 0:02 - 0:06
    John DeCrey: Welcome to week 10.
  • 0:06 - 0:10
    We're going to introduce and
    talk a little bit more about
  • 0:10 - 0:13
    third party modules and
  • 0:13 - 0:16
    what that means and
    how to install it.
  • 0:16 - 0:19
    Then we've got
    several code-alongs
  • 0:19 - 0:22
    coming up that I
    think will be fun.
  • 0:22 - 0:27
    I'd like to introduce my guest.
  • 0:27 - 0:29
    This is Jerick.
  • 0:29 - 0:34
    We've worked together
    for several years now.
  • 0:34 - 0:38
    When did we first meet, was it 2014?
  • 0:38 - 0:41
    Jerik: Yeah. That sounds right.
  • 0:42 - 0:46
    John DeCrey: Anyway Jerick lives
  • 0:46 - 0:50
    and breathes in Python
    and he's done a lot
  • 0:50 - 0:54
    of his own personal
    Python scripts
  • 0:54 - 0:58
    that I'll let him talk about
    and some of them are funny.
  • 0:58 - 1:04
    Then he's an avid motorcyclist,
  • 1:04 - 1:07
    and he'll talk a little
    bit about that too.
  • 1:07 - 1:11
    We'll say, how's the saying go,
  • 1:12 - 1:17
    was it bad decisions
    lead to good stories?
  • 1:17 - 1:18
    Jerik: Yes.
  • 1:18 - 1:22
    John DeCrey: Unfortunately
    I have a lot of stories.
  • 1:22 - 1:25
    Anyway go ahead, Jerick.
  • 1:26 - 1:29
    Jerik: Yes. Just to
    introduce myself,
  • 1:29 - 1:32
    I've been working in
    the software industry
  • 1:32 - 1:36
    for almost 12 years.
  • 1:36 - 1:40
    I've mostly been on the QA side
  • 1:40 - 1:44
    of things, that is, testing
    and finding bugs.
  • 1:44 - 1:48
    A lot of my job though has
    been creating automated tests.
  • 1:48 - 1:53
    A lot of that has been
    through Python 2.
  • 1:53 - 1:57
    Our code-along exercises
    are going to go
  • 1:57 - 2:02
    through some of the things
    I've learned doing that.
  • 2:02 - 2:06
    I have some slides
    to go through at
  • 2:06 - 2:10
    the beginning though,
    let me just start this.
  • 2:11 - 2:21
    This is third party Python
    modules. I'll just read this.
  • 2:21 - 2:24
    Modular programming refers
    to the process of breaking
  • 2:24 - 2:27
    a large, unwieldy programming
    task into separate,
  • 2:27 - 2:29
    smaller, more manageable
    subtasks or modules.
  • 2:29 - 2:31
    Individual modules can then be
  • 2:31 - 2:33
    cobbled together like
    building blocks to create
  • 2:33 - 2:38
    a larger application.
    Just as an example.
  • 2:38 - 2:42
    You guys are probably
    familiar with this when
  • 2:42 - 2:46
    you do these import
    statements, like this one for time.
  • 2:46 - 2:50
    This is a module that's
    built into Python.
  • 2:50 - 2:54
    It's a way that you
    can bring this into
  • 2:54 - 2:56
    your program and utilize
  • 2:56 - 3:00
    everything that's been built
    out around this module.
  • 3:00 - 3:02
    It's just a way of not
  • 3:02 - 3:07
    having to redo work that
    someone else has already done.
  • 3:08 - 3:12
    John DeCrey: We have done
    imports throughout the course.
  • 3:12 - 3:16
    The from is probably
  • 3:16 - 3:19
    a little bit different from
    what anyone has seen.
  • 3:19 - 3:21
    Can you talk about
    that a little bit?
  • 3:21 - 3:24
    Jerik: Yeah. Let me
    go back to that.
  • 3:25 - 3:29
    These from statements are
  • 3:29 - 3:32
    similar to the
    import statements,
  • 3:32 - 3:41
    but it's digging more
    into the module itself.
  • 3:42 - 3:46
    John DeCrey: Selecting a
    specific item from the import.
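A minimal sketch of the two import forms being compared here. The slide's actual `from` line is not captured in the transcript, so the names pulled out of `time` below (`sleep`, `strftime`) are chosen purely for illustration.

```python
# Form 1: import the whole module, then reach into it with a dot.
import time

# Form 2: pull specific names out of the module so they can be used
# directly, without the "time." prefix. (Illustrative choice of names.)
from time import sleep, strftime

start = time.time()              # module-qualified call
sleep(0.1)                       # the same function as time.sleep
elapsed = time.time() - start

print(f"Paused for roughly {elapsed:.2f} seconds")
print("Today is", strftime("%Y-%m-%d"))
```

Both forms load the same module; the `from` form only changes which names land in your own file.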
  • 3:46 - 3:55
    Jerik: Yeah. Exactly.
    The advantages
  • 3:55 - 3:58
    of modular programming are,
  • 3:58 - 4:01
    that it's simpler, more
    maintainable, and reusable.
  • 4:01 - 4:05
    When you import
    that time module,
  • 4:05 - 4:08
    you don't have to do any
    maintenance on that.
  • 4:09 - 4:13
    The people who run the
    Python project are going to
  • 4:13 - 4:16
    be adding methods to that
    or fixing something.
  • 4:16 - 4:19
    If we changed our time format,
  • 4:19 - 4:21
    they would be the ones going
  • 4:21 - 4:22
    and making sure
    that still worked
  • 4:22 - 4:28
    with all the existing
    Python programs out there.
  • 4:31 - 4:36
    Python, that time module
    is one that's built in.
  • 4:36 - 4:38
    There's a bunch that
    Python has built in.
  • 4:38 - 4:43
    Then there's also a bunch
    of third party modules.
  • 4:44 - 4:50
    These are examples of ones
    that are built into Python.
  • 4:50 - 4:54
    It's just the Python
    module index.
  • 4:58 - 5:02
    Then these are some common
    third party modules.
  • 5:02 - 5:04
    I guarantee
  • 5:04 - 5:06
    not every one of them
    is listed here
  • 5:06 - 5:12
    because I would guess there's
    hundreds of thousands.
  • 5:12 - 5:16
    The thing I love about Python
  • 5:16 - 5:19
    is any problem that
    you're looking to solve,
  • 5:19 - 5:22
    someone on the
    Internet has probably
  • 5:22 - 5:24
    solved it and created
    a module for it.
  • 5:24 - 5:27
    You can just go and
    import that module,
  • 5:27 - 5:30
    and get up and
  • 5:30 - 5:34
    running a lot faster than
    having to do it yourself.
  • 5:34 - 5:36
    John DeCrey: Chances are high.
  • 5:36 - 5:38
    There's a module
    out there for it.
  • 5:38 - 5:40
    Like the saying, there's an app for
  • 5:40 - 5:43
    that: in Python, there's
    a module for that.
  • 5:43 - 5:50
    Jerik: Exactly. You guys
  • 5:50 - 5:52
    have probably already
    done this too.
  • 5:52 - 5:54
    John DeCrey: We actually
    have not used pip yet.
  • 5:54 - 5:56
    I'm trying to think,
  • 5:56 - 5:59
    but I don't think that we have.
  • 5:59 - 6:02
    As far as actually installing.
  • 6:02 - 6:06
    Jerik: To install a
    third-party module,
  • 6:06 - 6:11
    the syntax is pip install
    and then the package name.
  • 6:11 - 6:14
    I'll show an example of that
  • 6:14 - 6:17
    later on when we do the
    code along exercises.
  • 6:17 - 6:27
    John DeCrey: We type that
    in at the Python console.
  • 6:32 - 6:37
    Jerik: Now I want to just
    show some examples of
  • 6:37 - 6:38
    different modules that I've used
  • 6:38 - 6:42
    like throughout work and
    personal projects and stuff.
  • 6:44 - 6:47
    This is only three
    lines of code,
  • 6:47 - 6:50
    but it's going to do
    something really cool.
  • 6:50 - 6:54
    This is a module
    called pytesseract.
  • 6:58 - 7:02
    What it is, it's an image,
  • 7:02 - 7:05
    it's an OCR engine.
  • 7:05 - 7:08
    You can get a picture.
  • 7:08 - 7:10
    This is just a
    picture of a receipt.
  • 7:10 - 7:15
    You can get all the
    text data from it,
  • 7:15 - 7:17
    which is crazy:
  • 7:17 - 7:20
    three lines of code,
    and it can do that.
  • 7:23 - 7:29
    The reason I needed this
    is I worked at a company
  • 7:29 - 7:34
    where we would scrape
    the web for obituaries.
  • 7:34 - 7:39
    Then we sold
    the obituaries to
  • 7:39 - 7:47
    a well known Utah based
    ancestry website.
  • 7:49 - 7:51
    There's this website.
  • 7:51 - 7:52
    John DeCrey: What company
    are you talking about?
  • 7:52 - 7:57
    Jerik: Yeah. [LAUGHTER] There's
  • 7:57 - 7:59
    this obituary website in Germany
  • 7:59 - 8:02
    that for some reason
    on their web page
  • 8:02 - 8:07
    instead of pasting the
    obituary like as text they had
  • 8:07 - 8:11
    generated like a PNG
    or a picture file
  • 8:11 - 8:15
    of the obituary and posted
    that on the website.
  • 8:15 - 8:20
    We had our scraper go
    and get those images,
  • 8:20 - 8:24
    and it would use pytesseract
    to get the text out of
  • 8:24 - 8:28
    those images, to
    get the obituaries.
  • 8:28 - 8:31
    But I can show this running.
  • 8:31 - 8:34
    We have this picture.
  • 8:34 - 8:37
    John DeCrey: That's a PNG file.
  • 8:37 - 8:41
    Jerik: All this is going
    to do is open that
  • 8:41 - 8:44
    and get the text from
    it and print it out.
  • 8:44 - 8:48
    John DeCrey: We're invoking
    the image_to_string function,
  • 8:48 - 8:52
    and then passing in the
    image to that function.
  • 8:52 - 8:55
    Jerik: Yeah, so this
    image_to_string function is
  • 8:55 - 8:59
    something that's built into
    this pytesseract module.
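The on-screen script is not captured in the transcript, so here is a hedged sketch of what a roughly three-line OCR call can look like. It assumes the `pillow` and `pytesseract` packages plus the Tesseract OCR program are installed; `receipt.png` is a placeholder file name, not the file from the demo.

```python
def extract_text(image_path):
    """Return the text that Tesseract OCR finds in the image at image_path."""
    # Imported inside the function so this file still loads on a machine
    # where the third-party packages have not been installed yet.
    from PIL import Image      # pip install pillow
    import pytesseract         # pip install pytesseract (needs Tesseract too)

    image = Image.open(image_path)
    return pytesseract.image_to_string(image)


# Example call (placeholder file name):
# print(extract_text("receipt.png"))
```

The three working lines are the two imports and the `image_to_string` call; the function wrapper is only there to make the sketch reusable.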
  • 8:59 - 9:00
    John DeCrey: Perfect.
  • 9:01 - 9:03
    Jerik: So running.
  • 9:03 - 9:07
    John DeCrey: Well, look at
    that. Yeah, that's cool.
  • 9:07 - 9:09
    That was a fuzzy picture too.
  • 9:09 - 9:13
    It wasn't like, crisp, clean.
  • 9:13 - 9:15
    Jerik: Yeah, it actually
    does pretty good.
  • 9:15 - 9:19
    I was playing around
    with other images but.
  • 9:19 - 9:21
    John DeCrey: Yeah.
    It did a good job.
  • 9:21 - 9:23
    Jerik: To me, it was just crazy.
  • 9:23 - 9:27
    Like that huge task of
    getting obituary data
  • 9:27 - 9:30
    from picture files
    on the Internet
  • 9:30 - 9:33
    was reduced to basically
    three lines of code.
  • 9:33 - 9:35
    John DeCrey: Yeah, seriously.
  • 9:36 - 9:40
    Jerik: It did not take me
    hours upon hours to do that.
  • 9:40 - 9:43
    [LAUGHTER] [OVERLAPPING]
  • 9:43 - 9:45
    John DeCrey: There's
    already stuff out there,
  • 9:45 - 9:48
    but if you wanted to design
    your own, like for example,
  • 9:48 - 9:52
    you take a picture of receipts
    and have it parse out,
  • 9:52 - 9:55
    saving it to your own data.
  • 9:55 - 9:56
    There's just lots of.
  • 9:56 - 9:59
    Jerik: Yeah, the sky's
    the limit. [LAUGHTER]
  • 9:59 - 10:01
    John DeCrey: Of cool
    stuff you can do with that.
  • 10:01 - 10:03
    Actually, if I can
    talk a little bit
  • 10:03 - 10:05
    about that for a minute.
  • 10:05 - 10:10
    We did a project at
    the U of U Hospital.
  • 10:10 - 10:12
    The hospital uses Epic,
  • 10:12 - 10:18
    which is a huge EMR system,
    Electronic Medical Records.
  • 10:23 - 10:27
    Every now and then,
    like nurses and
  • 10:27 - 10:28
    the medical personnel that use
  • 10:28 - 10:30
    it will get an error message.
  • 10:30 - 10:33
    Sometimes the error, the screen,
  • 10:33 - 10:35
    it's like it's a message box.
  • 10:37 - 10:41
    They may not be sure what
    the recovery is or what
  • 10:41 - 10:43
    the procedure is or process
  • 10:43 - 10:46
    is when you encounter
    certain errors.
  • 10:46 - 10:49
    [NOISE] We developed a thing
  • 10:49 - 10:53
    where the medical personnel
    can take a picture from
  • 10:53 - 10:57
    the screen of that error and
  • 10:57 - 11:00
    it translates it
    to text and does
  • 11:00 - 11:02
    a look-up from a database,
  • 11:02 - 11:04
    finds the error code,
  • 11:04 - 11:06
    the error number, and stuff,
  • 11:06 - 11:08
    and then brings up the
    documentation right
  • 11:08 - 11:10
    there on the handheld.
  • 11:10 - 11:11
    Jerik: That's awesome.
  • 11:11 - 11:14
    John DeCrey: Just give more
    information on the error
  • 11:14 - 11:21
    and the best practices for moving
    forward from that error.
  • 11:21 - 11:24
    Lots of really great
    opportunities.
  • 11:24 - 11:25
    Like you said,
  • 11:25 - 11:27
    the sky is the limit there.
  • 11:31 - 11:35
    Jerik: Here's another example.
  • 11:35 - 11:42
    Where is it? This is another
    thing from that same job.
  • 11:42 - 11:44
    John DeCrey: Other
    funeral homes?
  • 11:44 - 11:50
    Jerik: Yeah. [LAUGHTER] We
    wanted to make sure that we had
  • 11:50 - 11:53
    all the funeral
    home websites that
  • 11:53 - 11:57
    have the obituary data
    in the US and Canada.
  • 11:59 - 12:02
    This isn't the script itself,
  • 12:02 - 12:03
    but the same idea.
  • 12:03 - 12:08
    A script that would
    go on Google Maps,
  • 12:08 - 12:10
    search for a funeral home in
  • 12:10 - 12:13
    every zip code and
  • 12:13 - 12:15
    return a list of all
    the funeral homes.
  • 12:15 - 12:18
    And then I would
    go and add it to
  • 12:18 - 12:19
    a master list
  • 12:19 - 12:22
    that would check for like
    duplicates and stuff.
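The duplicate check for the master list can be sketched as follows. The function name and the sample site names are hypothetical; the original script is not shown.

```python
def add_to_master(master, found):
    """Append items from found to master, skipping anything already present.

    Returns the list of items that were actually added.
    """
    seen = set(master)           # set membership checks are fast
    added = []
    for item in found:
        if item not in seen:
            seen.add(item)
            master.append(item)
            added.append(item)
    return added


master_list = ["https://example-home-a.test"]
new_items = add_to_master(
    master_list,
    ["https://example-home-a.test", "https://example-home-b.test"],
)
print(new_items)
```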
  • 12:22 - 12:24
    John DeCrey: Let's
    talk about that.
  • 12:24 - 12:27
    Just those couple lines of
    code and what they're doing.
  • 12:27 - 12:30
    We've got, on line four,
  • 12:30 - 12:35
    requests.get and
    the Google URL,
  • 12:35 - 12:39
    along with an appended
    search string,
  • 12:39 - 12:43
    which is how Google
    searches work.
  • 12:43 - 12:47
    Line five, however,
    looks very cryptic.
  • 12:47 - 12:52
    I know what that is. I
    do my best to avoid it.
  • 12:52 - 12:54
    [LAUGHTER]
  • 12:54 - 12:55
    Jerik: I probably knew
    what this meant at
  • 12:55 - 12:57
    one point too. But
    I don't anymore.
  • 12:57 - 13:01
    John DeCrey: This is called
    regular expressions.
  • 13:03 - 13:07
    Whoever invented the syntax
  • 13:07 - 13:08
    for regular [LAUGHTER]
    expressions,
  • 13:08 - 13:10
    I don't know, should maybe
  • 13:10 - 13:11
    go find something
    else to do instead.
  • 13:11 - 13:13
    [LAUGHTER]
  • 13:13 - 13:14
    Jerik: I agree.
  • 13:15 - 13:18
    John DeCrey: I
    suppose if you spend
  • 13:18 - 13:21
    enough time in it,
    it becomes memory.
  • 13:21 - 13:24
    But anytime you
    have something that
  • 13:24 - 13:28
    you have to have a guide and
    a cheat sheet and stuff,
  • 13:28 - 13:32
    and ledgers and
    whatever you look up,
  • 13:32 - 13:35
    I would say that it failed.
  • 13:36 - 13:42
    Regular expressions: that's
  • 13:42 - 13:47
    what that stuff on
    Line 5 is. Go ahead.
  • 13:48 - 13:51
    Jerik: This is the
    third-party module
  • 13:51 - 13:52
    that I'm using in this one.
  • 13:52 - 13:57
    This is requests; it
    basically just gets
  • 13:57 - 14:02
    the entire HTML
    from this request,
  • 14:02 - 14:12
    and then this regex pulls
    URLs out of the HTML.
  • 14:12 - 14:15
    John DeCrey: Just
    for clarification,
  • 14:15 - 14:17
    that's just shorthand for
  • 14:17 - 14:19
    regular expressions
    when you hear
  • 14:19 - 14:22
    regex, same thing.
  • 14:23 - 14:26
    Jerik: Then Google
    obviously has some of
  • 14:26 - 14:28
    their own URLs in there,
  • 14:28 - 14:30
    so I filter those out.
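The script on screen is not reproduced in the transcript, so this is only a sketch of the same idea: fetch a page with `requests`, pull URLs out with a regular expression, and drop the ones that belong to Google. The search URL format, the pattern, and the function names are assumptions, not the original code.

```python
import re

# A deliberately simple pattern: "http://" or "https://" followed by
# anything that is not whitespace, a quote, or an angle bracket.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")


def build_search_url(zip_code):
    # An f-string converts the zip code to text, so str() is not needed.
    # (Assumed URL format for illustration.)
    return f"https://www.google.com/search?q=funeral+homes+{zip_code}"


def fetch_html(url):
    import requests                      # third-party: pip install requests
    response = requests.get(url, timeout=10)
    return response.text                 # the whole page as one big string


def extract_urls(html, bad_parts=("google.",)):
    """Return unique URLs found in html, skipping any containing a bad part."""
    urls = []
    for url in URL_PATTERN.findall(html):
        if any(bad in url for bad in bad_parts):
            continue
        if url not in urls:
            urls.append(url)
    return urls


sample = '<a href="https://example-home.test/obits">x</a> <a href="https://www.google.com/maps">m</a>'
print(extract_urls(sample))
```

Only `fetch_html` touches the network; the other two functions can be tried on any string of HTML.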
  • 14:30 - 14:32
    I'll run this, you can see
  • 14:32 - 14:34
    what [OVERLAPPING]
    it prints out.
  • 14:34 - 14:38
    These are all what
  • 14:38 - 14:42
    shows up when I search
    funeral homes in my zip code.
  • 14:42 - 14:45
    [LAUGHTER]
  • 14:45 - 14:47
    John DeCrey: That's right.
    Then you specified a zip code?
  • 14:47 - 14:50
    Jerik: Yeah. When
    I ran this script,
  • 14:50 - 14:51
    like for my job at that company,
  • 14:51 - 14:53
    I think I just passed in
  • 14:53 - 14:56
    all the zip codes and just
    let it run overnight.
  • 14:56 - 15:01
    [LAUGHTER] Then in the morning
    had a list of like 20,000.
  • 15:01 - 15:04
    John DeCrey: In Line 4,
  • 15:04 - 15:05
    I just noticed that you're
  • 15:05 - 15:08
    converting your zip
    code to a string.
  • 15:08 - 15:12
    Why wouldn't you just wrap
    it inside of a string?
  • 15:12 - 15:15
    Is there a particular reason?
  • 15:21 - 15:25
    Jerik: I probably copied
    this from my real script and
  • 15:25 - 15:27
    I was probably passing in
  • 15:27 - 15:30
    a variable of [OVERLAPPING]
    the zip codes there.
  • 15:30 - 15:33
    I was just lazy
  • 15:33 - 15:36
    when I brought it over
    to this example one.
  • 15:36 - 15:40
    [LAUGHTER]
  • 15:40 - 15:41
    John DeCrey: It's pretty cool
    that you can do that too.
  • 15:41 - 15:44
    When that returns, it looks like
  • 15:44 - 15:48
    the URLs on Line 5,
  • 15:48 - 15:52
    all the results are
    coming into that URLs variable,
  • 15:52 - 15:54
    which looks like a
    list. Is that right?
  • 15:54 - 15:55
    Jerik: Yeah.
  • 15:55 - 15:57
    John DeCrey: Okay.
  • 15:57 - 16:00
    Jerik: I can even do this.
  • 16:00 - 16:02
    John DeCrey: Oh, I see.
    Then in your for loop,
  • 16:02 - 16:05
    you're going through that list,
  • 16:05 - 16:06
    and then also making sure
  • 16:06 - 16:09
    that anything that's
    in the bad URLs is
  • 16:09 - 16:14
    not in the URLs
    list. That's cool.
  • 16:16 - 16:19
    Jerik: I have a
    breakpoint right here.
  • 16:19 - 16:22
    Now, this res
    variable will have,
  • 16:22 - 16:28
    this text is going to be
    the entire HTML from that,
  • 16:28 - 16:30
    I don't know if there's a way
    to copy the whole thing.
  • 16:30 - 16:32
    But anyway, it's like
  • 16:32 - 16:39
    a big jumble of HTML
    that gets returned.
  • 16:40 - 16:42
    John DeCrey: Very cool.
  • 16:44 - 16:49
    Jerik: This next one is a
    personal project that is dumb,
  • 16:49 - 16:51
    but awesome too because
  • 16:51 - 16:54
    it's been [OVERLAPPING]
    paying out.
  • 16:54 - 16:58
    [LAUGHTER] I call this
    one the Bing Rewarder.
  • 16:58 - 17:03
    Bing, the search engine,
    has this dumb program
  • 17:03 - 17:07
    called Bing Rewards.
  • 17:07 - 17:09
    It gives you these points
  • 17:09 - 17:12
    every day for just
    searching in Bing.
  • 17:12 - 17:15
    I think you can do
    like 20 searches
  • 17:15 - 17:17
    on desktop and then like 15
  • 17:17 - 17:24
    on mobile Bing, and
    you get points for it.
  • 17:24 - 17:28
    I've made this script that
    just searches Bing every day.
  • 17:28 - 17:32
    John DeCrey: Then you can redeem
    the points for goods, right?
  • 17:32 - 17:35
    Jerik: Yeah. You can get
    gift cards and stuff.
  • 17:35 - 17:41
    I use it for like
    Xbox game pass on PC,
  • 17:41 - 17:44
    which is like a game service.
  • 17:44 - 17:46
    John DeCrey: What does
    that typically cost?
  • 17:46 - 17:52
    Jerik: I have no
    idea. [LAUGHTER] I
  • 17:52 - 17:55
    bet it's like 20 bucks a
    month or 15 bucks a month,
  • 17:55 - 17:57
    probably. Not too expensive.
  • 17:57 - 18:00
    John DeCrey: I'd have said
    more like 50 bucks a year.
  • 18:00 - 18:02
    But if you're saying
    25 [LAUGHTER]
  • 18:02 - 18:04
    a month, that's
    pretty expensive.
  • 18:04 - 18:07
    Jerik: This script
    has been working
  • 18:07 - 18:09
    for like six years now.
  • 18:09 - 18:10
    [OVERLAPPING] I thought it would
  • 18:10 - 18:11
    have been shut down forever.
  • 18:11 - 18:13
    John DeCrey: You've
    never paid for it.
  • 18:13 - 18:14
    Because you get the
    Bing rewards.
  • 18:14 - 18:15
    Jerik: Yeah, exactly.
  • 18:15 - 18:21
    [LAUGHTER] I'll talk through
    the code a little bit.
  • 18:21 - 18:25
    These functions, actually,
  • 18:25 - 18:28
    I'll run through it, and
    then I'll go through it.
  • 18:28 - 18:31
    What this is going to do
    is just open
  • 18:31 - 18:33
    Bing in desktop mode,
  • 18:33 - 18:34
    search three times,
  • 18:34 - 18:37
    and then open it in mobile mode,
  • 18:37 - 18:42
    and search three
    times. We'll run that.
  • 18:52 - 18:55
    These are just random words.
  • 18:55 - 18:58
    John DeGrey: From
    your generator there.
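The generator itself is not shown in the transcript. One way such random search terms could be produced with only the standard library is sketched below; the function names and length limits are made up for illustration.

```python
import random
import string


def random_word(min_len=4, max_len=9):
    """Build a nonsense word from random lowercase letters."""
    length = random.randint(min_len, max_len)
    return "".join(random.choice(string.ascii_lowercase) for _ in range(length))


def random_queries(count):
    """Return count random words to use as search terms."""
    return [random_word() for _ in range(count)]


print(random_queries(3))
```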
  • 19:00 - 19:02
    Jerik: I haven't thought
    about this before.
  • 19:02 - 19:02
    This is actually
  • 19:02 - 19:06
    a good example to bring up.
  • 19:06 - 19:09
    I have these generated words,
  • 19:09 - 19:13
    but there was also another
    third-party module
  • 19:13 - 19:17
    that was for just
    entering in real
  • 19:17 - 19:21
    words from the dictionary.
  • 19:21 - 19:23
    I was using that for a while,
  • 19:23 - 19:29
    but it had to get it from
  • 19:29 - 19:31
    their online database
    or whatever and
  • 19:31 - 19:35
    sometimes it didn't work
    and so I went back to this.
  • 19:35 - 19:36
    John DeGrey: I see.
  • 19:36 - 19:38
    Jerik: That's just
    something you have to
  • 19:38 - 19:41
    deal with with third
    party modules.
  • 19:41 - 19:45
    Sometimes they don't
    work the best, I guess.
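One common way to live with a flaky dependency is to wrap it and fall back to something local when it fails. This is a sketch of that pattern, not the approach from the original script; `primary` and `fallback` are hypothetical callables that each take a count and return a list of words.

```python
def words_with_fallback(primary, fallback, count):
    """Try the primary word source; use the fallback if it fails or is empty."""
    try:
        words = primary(count)
        if words:
            return words
    except Exception as error:      # e.g. the online word list is unreachable
        print(f"Primary source failed ({error!r}); using fallback.")
    return fallback(count)


def flaky_online_words(count):
    # Stand-in for a third-party call that sometimes does not work.
    raise ConnectionError("word service unavailable")


def local_words(count):
    return ["placeholder"] * count


print(words_with_fallback(flaky_online_words, local_words, 3))
```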
  • 19:45 - 19:47
    John DeGrey: Yeah, we saw it run
  • 19:47 - 19:50
    three times for the
    desktop and three times on
  • 19:50 - 19:52
    the mobile and you just
  • 19:52 - 19:55
    run this automatically
    daily, right?
  • 19:55 - 19:56
    Jerik: Yeah.
  • 19:56 - 19:58
    John DeGrey: It's awesome.
  • 20:03 - 20:09
    Jerik: I call this the KSL
    deal finder. It's a script.
  • 20:09 - 20:12
    When you search for
    something you want,
  • 20:12 - 20:15
    like dirt bikes for example,
  • 20:15 - 20:18
    you have your list
    of search results.
  • 20:18 - 20:20
    The top ad is going
    to be the newest.
  • 20:20 - 20:24
    This script gets the top
    ad and then it keeps
  • 20:24 - 20:27
    checking for new ads to show up.
  • 20:27 - 20:32
    If a new ad is
    posted, my script
  • 20:32 - 20:33
    sends an email
  • 20:33 - 20:36
    to me saying that
    there was a new ad.
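The polling idea described here can be sketched as a small loop. The real deal finder is not shown, so `get_top_ad` and `notify` are hypothetical callables standing in for "read the newest listing" and "send myself an email", and `max_checks` exists only so the example can stop.

```python
import time


def watch_top_ad(get_top_ad, notify, interval_seconds=15, max_checks=None):
    """Poll for the newest ad and call notify whenever it changes."""
    last_seen = get_top_ad()
    checks = 0
    while max_checks is None or checks < max_checks:
        time.sleep(interval_seconds)     # wait between requests
        current = get_top_ad()
        if current != last_seen:
            notify(current)              # a new ad reached the top
            last_seen = current
        checks += 1
    return last_seen


# Simulated run: the "site" returns these top ads on successive checks.
fake_ads = iter(["bike A", "bike A", "bike B", "bike B", "bike C"])
alerts = []
watch_top_ad(lambda: next(fake_ads), alerts.append,
             interval_seconds=0, max_checks=4)
print(alerts)
```

In a real script, `notify` could send mail with the standard library's `smtplib`, and the interval should stay generous so the site is not hammered.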
  • 20:36 - 20:42
    It's a way to be the
    first person to message
  • 20:42 - 20:45
    the person who posts
    what they're selling.
  • 20:45 - 20:46
    John DeGrey: Yeah, especially
    if it's a great deal
  • 20:46 - 20:48
    and I love the story.
  • 20:48 - 20:50
    If you wouldn't mind telling us
  • 20:50 - 20:54
    the great deal you got on
    one of your motorcycles.
  • 20:55 - 20:59
    Jerik: I was in the market
    for a new dirt bike
  • 20:59 - 21:04
    and I had this script
    running and I was up in
  • 21:04 - 21:06
    Heber, and it was like
  • 21:06 - 21:08
    eight o'clock at night
    and I was looking at
  • 21:08 - 21:10
    a bike and didn't want it, and
  • 21:10 - 21:14
    then at eight o'clock I get an
    email from the deal finder.
  • 21:14 - 21:15
    John DeGrey: Eight
    o'clock at night?
  • 21:15 - 21:18
    Jerik: Yes. Of this
    incredible deal.
  • 21:18 - 21:20
    Like crazy deal.
  • 21:20 - 21:25
    But it was down in Grand
    Junction, Colorado.
  • 21:26 - 21:29
    John DeGrey: And you're out
    of where? Where are you at?
  • 21:29 - 21:31
    Jerik: Yeah, I'm in Orem.
  • 21:32 - 21:36
    I think it was like a
    three or four hour drive.
  • 21:36 - 21:38
    But anyways, I'm
    like, dang it, now
  • 21:38 - 21:42
    I have to drive to Grand
    Junction, Colorado tonight.
  • 21:42 - 21:44
    Yeah, I called the guy up.
  • 21:44 - 21:45
    I'm like, I can drive down
  • 21:45 - 21:48
    now but I won't be
    there till like
  • 21:48 - 21:50
    midnight or 1:00 AM
    and he said that was
  • 21:50 - 21:53
    fine and so I start driving down
  • 21:53 - 21:55
    there and it was a blizzard
  • 21:55 - 22:00
    in Spanish Fork Canyon
    of all places, too.
  • 22:00 - 22:04
    I hate that road, but
    I almost turned back.
  • 22:04 - 22:05
    John DeGrey: Oh, is that right.
  • 22:05 - 22:07
    Jerik: Probably like
    three or four times?
  • 22:07 - 22:09
    Yeah. I even pulled over once.
  • 22:09 - 22:11
    John DeGrey: Oh, really?
  • 22:11 - 22:15
    Jerik: Yeah, I almost just
    because the deal was so good,
  • 22:15 - 22:18
    I thought it had to
    have been a scam.
  • 22:18 - 22:20
    John DeGrey: Oh, yeah,
    so the whole way
  • 22:20 - 22:23
    down you're thinking,
    please don't be a scam.
  • 22:23 - 22:29
    Jerik: But yeah, I ended
    up picking up the bike.
  • 22:29 - 22:34
    Drove back. Yeah, I still
    have that dirt bike.
  • 22:34 - 22:36
    I could still sell it for
  • 22:36 - 22:38
    1,000 more than I bought it for.
  • 22:38 - 22:39
    John DeGrey: Wow.
  • 22:39 - 22:46
    Jerik: It was when I was at
    AccessData, so over six years ago.
  • 22:46 - 22:50
    John DeGrey: Wow, so
    because of your script,
  • 22:50 - 22:52
    you got an alert to
  • 22:52 - 22:58
    the ad immediately, as
    soon as it came online.
  • 22:58 - 23:01
    Jerik: Yeah, so I was
    definitely the first caller.
  • 23:01 - 23:04
    But I've had like
    buddies use this script
  • 23:04 - 23:08
    too and find deals on stuff too.
  • 23:11 - 23:15
    John DeGrey: Then you still
    went to work that same day?
  • 23:15 - 23:16
    Jerik: Oh, yeah I got back
  • 23:16 - 23:19
    at 4:00 AM and I still went to
  • 23:19 - 23:25
    work and there's absolutely
    no way I could do that now.
  • 23:25 - 23:28
    I think just getting over 30,
  • 23:28 - 23:32
    I'm retired from doing
    stuff like that.
  • 23:32 - 23:34
    John DeGrey: Until the next
    great deal comes along.
  • 23:34 - 23:37
    Jerik: Oh yeah, probably.
  • 23:37 - 23:42
    John DeGrey: By the way,
  • 23:42 - 23:44
    for the listeners here,
  • 23:44 - 23:47
    actually two things for anybody
  • 23:47 - 23:50
    that's out of state
    not familiar with KSL,
  • 23:50 - 23:55
    KSL is a local
    station here in Utah
  • 23:55 - 24:01
    and they have online
    classified listings.
  • 24:01 - 24:03
    It is probably one
    of the most common
  • 24:03 - 24:05
    and popular in the state.
  • 24:05 - 24:12
    In fact, other people in other
    states use it frequently.
  • 24:13 - 24:18
    It's not as famous as eBay.
  • 24:18 - 24:20
    Well, it may be
    as an auction too
  • 24:20 - 24:24
    but online ads and stuff.
  • 24:25 - 24:28
    We're actually
    going to do one of
  • 24:28 - 24:32
    our labs using the
    KSL deal finder.
  • 24:32 - 24:35
    Then I just want to give a caution
  • 24:35 - 24:39
    because any time that
  • 24:39 - 24:42
    this is also referred to
    as like screen scraping,
  • 24:42 - 24:45
    anytime that you're doing that,
  • 24:45 - 24:49
    you want to make sure that
    you don't flag yourself as
  • 24:49 - 24:50
    being a bot that could
  • 24:50 - 24:53
    potentially get banned
    from a service.
  • 24:53 - 24:55
    Jerik, I think
  • 24:55 - 24:58
    didn't you get banned
    at one time from KSL?
  • 24:58 - 25:00
    Jerik: Yeah, they banned my IP.
  • 25:00 - 25:03
    John DeGrey: Yeah, so they
    block your IP address.
  • 25:03 - 25:06
    You can't use it anymore?
  • 25:06 - 25:09
    I think we have the timer.
  • 25:09 - 25:13
    This goes into a timer setting,
  • 25:13 - 25:17
    loops and there's
    a sleep function
  • 25:17 - 25:19
    that we can use to sleep for
  • 25:19 - 25:21
    so many seconds and
    so he's got it set to
  • 25:21 - 25:24
    15, which I'm assuming
    is 15 seconds?
  • 25:24 - 25:25
    Jerik: Yeah.
  • 25:26 - 25:30
    John DeGrey: That's
    probably well adequate.
  • 25:30 - 25:32
    The thing is, you
    don't want to flag
  • 25:32 - 25:35
    yourself and lower that
    down to maybe even
  • 25:35 - 25:41
    like 10 or five seconds
    because chances are,
  • 25:41 - 25:43
    a lot of these systems
    including KSL,
  • 25:43 - 25:46
    they do have bot detectors and
  • 25:46 - 25:49
    you'll definitely get
    yourself put on the radar.
  • 25:49 - 25:54
    That's just a word of caution
    when using these scripts.
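A small sketch of keeping a scraper's pauses from dropping too low. The 15-second floor comes from the discussion above; the random jitter is an added assumption (a common courtesy so requests are not perfectly regular), not something the original script is known to do.

```python
import random

MIN_DELAY_SECONDS = 15   # floor taken from the 15-second setting discussed


def next_delay(base=MIN_DELAY_SECONDS, jitter=5.0):
    """Return how long to sleep before the next request, in seconds."""
    if base < MIN_DELAY_SECONDS:
        base = MIN_DELAY_SECONDS         # refuse to poll faster than the floor
    return base + random.uniform(0, jitter)


# Asking for 5 seconds is quietly raised to at least 15.
print(round(next_delay(base=5), 1))
```

The result would be passed to `time.sleep` inside the polling loop. No delay guarantees a site will tolerate automated access, so checking its terms of use is still worthwhile.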
  • 25:56 - 25:59
    Take it away, Jerik.
  • 25:59 - 26:04
    Jerik: I don't have any
    examples but there's
  • 26:04 - 26:06
    a couple of other projects that
  • 26:06 - 26:09
    I did that I want to talk about.
  • 26:09 - 26:11
    There's one I called
    the info wall.
  • 26:11 - 26:14
    It was me and my buddies.
  • 26:14 - 26:20
    We did like a Bluetooth
    speaker startup thing that
  • 26:20 - 26:23
    we sold on Amazon
  • 26:23 - 26:26
    and we wanted a pretty display
  • 26:26 - 26:30
    that showed how many
    sales we got that day.
  • 26:30 - 26:33
    I made a little
    script that would go
  • 26:33 - 26:37
    and look at our Amazon portal
  • 26:37 - 26:39
    and just get the
    sales for that day.
  • 26:39 - 26:41
    Now that I think about
    it, there's probably
  • 26:41 - 26:44
    an API I could have looked
    into to get that data,
  • 26:44 - 26:47
    but this is what I'm
    used to so I did that.
  • 26:47 - 26:49
    John DeGrey: It's fun
    to do your own anyway
  • 26:49 - 26:51
    then you have more control of
  • 26:51 - 26:53
    getting exactly what you want.
  • 26:54 - 26:59
    Jerik: Unfortunately,
    the info wall
  • 26:59 - 27:01
    didn't have very good numbers.
  • 27:01 - 27:02
    We didn't sell very
    many speakers,
  • 27:02 - 27:05
    so that was retired.
  • 27:05 - 27:08
    Another project,
  • 27:08 - 27:10
    this is probably the
    dumbest project I've done.
  • 27:10 - 27:14
    I called it NBA bet buddy.
  • 27:14 - 27:18
    I used to bet on
    NBA games a lot.
  • 27:19 - 27:24
    I tried an experiment
    where I had made
  • 27:24 - 27:29
    a script that every morning
  • 27:29 - 27:33
    it would go and get the
    Vegas odds of games
  • 27:33 - 27:35
    and then compare it against
  • 27:35 - 27:38
    expert picks for
    games that night.
  • 27:38 - 27:41
    Then anytime there was
    a mismatch between
  • 27:41 - 27:43
    the odds and the expert picks,
  • 27:43 - 27:46
    it would send those to me
  • 27:46 - 27:47
    because those are
    supposed to
  • 27:47 - 27:49
    be like good ones to bet on.
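The comparison step can be sketched with two dictionaries keyed by game. The data shapes, function name, and placeholder team names are all hypothetical; the original script and its data sources are not shown.

```python
def find_mismatches(vegas_favorites, expert_picks):
    """Return (game, favorite, pick) for games where the two sources disagree."""
    mismatches = []
    for game, favorite in vegas_favorites.items():
        pick = expert_picks.get(game)        # None if no expert pick exists
        if pick is not None and pick != favorite:
            mismatches.append((game, favorite, pick))
    return mismatches


# Placeholder data, not real odds or picks.
favorites = {"Game 1": "Team A", "Game 2": "Team C"}
picks = {"Game 1": "Team B", "Game 2": "Team C"}
print(find_mismatches(favorites, picks))
```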
  • 27:49 - 27:52
    End of the story, they were not
  • 27:52 - 27:56
    so that project is retired too.
  • 27:56 - 27:58
    John DeGrey: Still
    a lot of risk?
  • 27:58 - 28:04
    Jerik: Yes. Vegas knows how
    to make their money still.
  • 28:05 - 28:07
    John DeGrey: That's funny.
  • 28:07 - 28:11
    Jerik: All right. We're good
  • 28:11 - 28:14
    to start the code-along
    exercises now?
  • 28:14 - 28:17
    John DeGrey: Yeah,
    let's jump into it.
  • 28:17 - 28:18
    What's our first code-along?
  • 28:18 - 28:22
    Jerik: This one is going
    to be a simple one.
  • 28:23 - 28:26
    I'll show you what
    this is going to do.
  • 28:26 - 28:29
    John DeGrey: We're going
    to do this ski report.
  • 28:29 - 28:32
    Jerik: Yeah. What this
    script is going to do,
  • 28:32 - 28:35
    it's going to go
    to this web page,
  • 28:35 - 28:38
    SkiUtah.com snow report.
  • 28:38 - 28:41
    This is a good day to do it.
  • 28:41 - 28:44
    This is October 23.
  • 28:44 - 28:46
    We just had some snow.
  • 28:46 - 28:48
    John DeGrey: Actually got
    some snow last 24 hours.
  • 28:48 - 28:52
    Jerik: We got some
    data to extract.
  • 28:52 - 28:54
    This is what our
    script, it's just going
  • 28:54 - 28:56
    to get, for Snowbird,
  • 28:56 - 28:58
    the 24 hour snowfall,
  • 28:58 - 29:00
    48 and then the base
  • 29:00 - 29:02
    and since this is
    the first snowfall,
  • 29:02 - 29:04
    these are going to be the same.
  • 29:04 - 29:06
    But that is all right.
  • 29:12 - 29:16
    Let me make a new file.
  • 29:18 - 29:22
    John DeGrey: You're just
    creating a new .py file?
  • 29:22 - 29:30
    Jerik: Yeah. Then let's see.
  • 29:30 - 29:33
    This is where we're going to use
  • 29:33 - 29:35
    the pip install syntax
  • 29:35 - 29:38
    to install the
    modules that we need.
  • 29:38 - 29:43
    The first one we need
    is called selenium,
  • 29:45 - 29:50
    so you just type pip install
    selenium into your terminal.
  • 29:50 - 29:53
    John DeGrey: Most
    people, I don't know
  • 29:53 - 29:59
    if anyone's using
    the IDE you're in,
  • 29:59 - 30:01
    most of us are in PyCharm.
  • 30:01 - 30:05
    If you open up the
    Python console,
  • 30:06 - 30:09
    which is in PyCharm,
  • 30:09 - 30:11
    down at the bottom you'll see
  • 30:11 - 30:15
    a terminal and there
    will be Python console.
  • 30:15 - 30:18
    You can also do it
    from the terminal,
  • 30:18 - 30:21
    but you still have
    to be in Python.
  • 30:21 - 30:26
    But you go and there's
    a pip install.
  • 30:26 - 30:30
    Jerik: Yeah, then this
  • 30:30 - 30:34
    will say that I already
    have it installed.
  • 30:34 - 30:37
    But if you don't,
    it will install it.
  • 30:37 - 30:39
    John DeCrey: See, how
    do you spell that?
  • 30:41 - 30:45
    Jerik: S-E-L-E-N-I-U-M.
  • 30:50 - 30:52
    John DeCrey: Okay.
  • 30:52 - 30:54
    John DeCrey: Mine
    says the same thing.
  • 30:54 - 30:56
    Already satisfied.
  • 30:57 - 31:01
    Jerik: So Selenium.
    That's the module that
  • 31:01 - 31:06
    you're using to hook
    to the web page. You
  • 31:06 - 31:10
    can navigate web pages with
    it, like click on buttons
  • 31:10 - 31:12
    or enter text in search
  • 31:12 - 31:15
    bars like it did for
    the Bing rewarder,
  • 31:15 - 31:18
    or you can also get
    text from it too,
  • 31:18 - 31:21
    so that's what we're going
    to use for this script.
  • 31:21 - 31:25
    We do need another module,
  • 31:25 - 31:27
    this one's called
    webdriver-manager,
  • 31:27 - 31:34
    so you do pip install
    webdriver-manager.
  • 31:36 - 31:40
    John DeCrey: Web driver
    and then hyphen,
  • 31:40 - 31:42
    is there a space?
  • 31:42 - 31:44
    Jerik: No space,
    just dash manager.
  • 31:44 - 31:46
    John DeCrey: Dash manager.
  • 31:51 - 31:55
    Jerik: So when I was
    getting this lesson
  • 31:55 - 32:01
    ready yesterday, I noticed that.
  • 32:01 - 32:04
    When I was running the scripts,
  • 32:04 - 32:06
    I had to add one more package,
  • 32:06 - 32:09
    I don't know why,
    but just in case,
  • 32:09 - 32:12
    I might as well install it,
  • 32:12 - 32:15
    it's called packaging,
  • 32:15 - 32:18
    so you do pip install packaging,
  • 32:27 - 32:31
    and now we should be good to go.
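A quick way to confirm the three installs took is to ask Python whether it can find each module. This is only a sketch: the helper name `missing_modules` is made up for illustration. Two things worth knowing: `pip install` is typed at a terminal prompt rather than inside the Python interpreter, and the `webdriver-manager` package is imported with an underscore, as `webdriver_manager`.

```python
import importlib.util


def missing_modules(names):
    """Return the names from `names` that Python cannot find for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names for the three packages installed with pip in this lesson.
REQUIRED = ["selenium", "webdriver_manager", "packaging"]

if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Still need to pip install:", ", ".join(missing))
    else:
        print("All modules found.")
```

If anything is listed as missing, running the matching `pip install` again in the same environment PyCharm uses is usually the first thing to try.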
  • 32:32 - 32:37
    John DeCrey: Install packaging?
  • 32:37 - 32:41
    Jerik: Yeah.
  • 32:42 - 32:46
    John DeCrey: Requirement
    already satisfied. Cool.
  • 32:47 - 32:52
    Jerik: Then for each of
    these Selenium scripts,
  • 32:52 - 32:54
    I'm just going to copy
  • 32:54 - 32:57
    some stuff that gets
    everything set up.
  • 32:59 - 33:01
    I don't know if I
    should leave it on
  • 33:01 - 33:03
    the screen for a while, John.
  • 33:03 - 33:05
    I'm good to keep going.
  • 33:05 - 33:08
    John DeCrey: Actually,
    if you wouldn't mind
  • 33:08 - 33:11
    just bumping your
    font up a little bit,
  • 33:11 - 33:15
    I've been using 18.
  • 33:16 - 33:18
    Jerik: How is that?
  • 33:18 - 33:20
    John DeCrey: There you
    go. That looks great.
  • 33:22 - 33:24
    What are the instructions?
  • 33:24 - 33:25
    Instructions are to just
  • 33:25 - 33:28
    copy what you got
    there on the screen?
  • 33:28 - 33:31
    Jerik: You can
    pause now and copy.
  • 33:31 - 33:34
    John DeCrey: People
    watching this,
  • 33:34 - 33:36
    you can just pause the video,
  • 33:36 - 33:39
    go ahead and enter
    all that stuff
  • 33:39 - 33:44
    in your script and
    then go ahead, Jerik.
  • 33:45 - 33:47
    Do you want to explain?
  • 33:47 - 33:52
    Jerik: Yes, so I'll just
    explain some of these.
  • 33:52 - 33:57
    This driver object it's
  • 33:57 - 34:00
    basically the browser that
  • 34:00 - 34:01
    we're going to be
    interacting with,
  • 34:01 - 34:05
    so you can pass it in
    a bunch of arguments,
  • 34:05 - 34:07
    one of them is headless,
  • 34:07 - 34:08
    what headless does
    is it makes it
  • 34:08 - 34:11
    so it doesn't bring
    up the browser.
  • 34:11 - 34:13
    Makes the script run faster,
  • 34:13 - 34:17
    because it doesn't have to
    load images and stuff.
  • 34:17 - 34:20
    John DeCrey: You
    are still running,
  • 34:20 - 34:21
    but you don't see it.
  • 34:21 - 34:24
    It's running as a
    back end process,
  • 34:24 - 34:27
    so there's no user interface.
  • 34:28 - 34:30
    Jerik: When I'm
    building scripts,
  • 34:30 - 34:32
    I'll usually comment
    this out so I can see
  • 34:32 - 34:35
    what's going on as it runs.
  • 34:36 - 34:41
    Then this, I just had to add
  • 34:41 - 34:46
    because I don't know if
    it's Chrome or Selenium,
  • 34:46 - 34:51
    just logs, or these two lines I
  • 34:51 - 34:53
    guess make it so it doesn't spit
  • 34:53 - 34:58
    out logs that look scary.
  • 35:03 - 35:08
    This is all setting
    up this driver,
  • 35:08 - 35:10
    which is the web driver
  • 35:10 - 35:13
    or the browser that we're
    going to interact with.
  • 35:13 - 35:15
    John DeCrey: The driver
    is basically
  • 35:15 - 35:19
    what's driving the
    screen scraping.
  • 35:19 - 35:23
    That's what's loading
    the web browser thing.
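The setup block shown on screen is not captured in the transcript, so the following is a rough sketch of what such a block can look like, assuming Selenium 4 and webdriver-manager. The function names `chrome_arguments` and `make_driver` are hypothetical, and the two log-quieting settings are a guess at what "these two lines" do, not a copy of Jerik's code.

```python
def chrome_arguments(headless=True):
    """Command-line switches to hand to Chrome."""
    args = ["--log-level=3"]  # assumption: quiets most Chrome console logging
    if headless:
        args.append("--headless")  # run without opening a browser window
    return args


def make_driver(headless=True):
    """Start Chrome through Selenium; needs selenium and webdriver-manager."""
    # Imported here so this file still loads if the packages are missing.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service
    from webdriver_manager.chrome import ChromeDriverManager

    options = Options()
    for arg in chrome_arguments(headless):
        options.add_argument(arg)
    # Assumption: this switch hides the noisy DevTools log lines.
    options.add_experimental_option("excludeSwitches", ["enable-logging"])
    service = Service(ChromeDriverManager().install())
    return webdriver.Chrome(service=service, options=options)
```

While building a script, calling `make_driver(headless=False)` keeps the browser window visible, which matches the habit of commenting the headless line out.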
  • 35:26 - 35:30
    Jerik: To navigate to a
    web page using Selenium,
  • 35:30 - 35:33
    you do driver.get,
  • 35:36 - 35:40
    and then in this case
  • 35:40 - 35:43
    we're going to go
    to this web page,
  • 35:50 - 35:56
    and then there's better
    ways to do this,
  • 35:56 - 35:59
    but this makes it a lot simpler.
  • 35:59 - 36:02
    I'm just going to
    put a sleep there,
  • 36:02 - 36:03
    so this is going to
    go to that web page
  • 36:03 - 36:05
    and sleep for five seconds,
  • 36:05 - 36:08
    this sleep is just
    going to make sure that
  • 36:08 - 36:12
    everything's loaded before
    we try and do stuff with it.
  • 36:14 - 36:16
    John DeCrey: Then
    five means we're
  • 36:16 - 36:18
    sleeping it for five seconds.
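The "better ways" than a fixed sleep generally mean waiting only until the thing you need has shown up. Selenium ships its own `WebDriverWait` helper for that; the sketch below shows the same idea in plain Python so it can be read without a browser. The name `wait_until` and its defaults are assumptions for illustration.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call `condition` until it returns something truthy or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll)
```

With a driver, the condition could be a small function that looks for the snowfall element and returns it once it exists, so a fast page costs less than five seconds and a slow page gets more.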
  • 36:22 - 36:24
    Jerik: The first thing
    we want to get is
  • 36:24 - 36:28
    this 24 hour snowfall.
  • 36:28 - 36:38
    To do that we can set
    it as a variable,
  • 36:38 - 36:42
    so I'll just do hour_24 equals,
  • 36:42 - 36:46
    and then you do everything
    through the driver.
  • 36:46 - 36:49
    driver.find_element.
  • 36:54 - 36:56
    All these things on the page,
  • 36:56 - 36:58
    they're called elements,
  • 37:00 - 37:05
    and then with Selenium,
  • 37:05 - 37:07
    there's different
    ways you can find
  • 37:07 - 37:09
    or you can hook to the elements,
  • 37:09 - 37:12
    CSS selectors is a
    really popular one,
  • 37:12 - 37:16
    but I've used different
    ones like XPath,
  • 37:16 - 37:18
    I've had to use a few times,
  • 37:18 - 37:23
    it just depends on
    what works or not,
  • 37:24 - 37:27
    so we'll do CSS selector.
  • 37:31 - 37:39
    Then just a quick overview
    on CSS selectors,
  • 37:39 - 37:42
    so this is the page source,
  • 37:42 - 37:45
    over on the right, you can
    click this little button,
  • 37:45 - 37:47
    it's called the element picker.
  • 37:47 - 37:50
    John DeCrey: How did
    you tell [OVERLAPPING]?
  • 37:50 - 37:51
    Jerik: Sorry.
  • 37:51 - 37:52
    John DeCrey: How
    you even got there?
  • 37:52 - 37:54
    Jerik: So when you
    have Chrome open,
  • 37:54 - 37:57
    you can press F12 and
    it'll bring this up,
  • 37:57 - 37:58
    it's called the DevTools.
  • 37:58 - 38:00
    John DeCrey: And for Mac users?
  • 38:00 - 38:04
    Jerik: I think it's F12
    on Mac too, isn't it?
  • 38:04 - 38:06
    John DeCrey: There's
    a menu option too
  • 38:06 - 38:11
    from the Chrome menu.
  • 38:11 - 38:13
    Jerik: Inspect, I think.
  • 38:13 - 38:15
    John DeCrey: Inspect.
  • 38:19 - 38:24
    Jerik: Excuse me. So
    this element picker.
  • 38:24 - 38:26
    John DeCrey: So you
    click that little icon
  • 38:26 - 38:27
    in the far left?
  • 38:27 - 38:28
    Jerik: Yeah.
  • 38:28 - 38:30
    John DeCrey: That's cool.
    Then you can hover over
  • 38:30 - 38:34
    the web page and it goes
    right into the elements?
  • 38:34 - 38:38
    Jerik: Yeah. For this exercise,
  • 38:38 - 38:40
    I'm going to show you
    a really easy way to
  • 38:40 - 38:43
    get the selector for
    whatever element you want.
  • 38:43 - 38:45
    You click on what you
  • 38:45 - 38:49
    want and it highlights
    it here in the source,
  • 38:49 - 38:53
    and then you can right
    click on this and do Copy,
  • 38:53 - 38:57
    Copy selector, and that
    copies the CSS selector.
  • 38:57 - 39:02
    You can just paste
    that in there.
  • 39:02 - 39:04
    John DeCrey: That's really cool.
  • 39:11 - 39:16
    Jerik: So if we left it
    like this and ran it,
  • 39:16 - 39:21
    this hour 24 variable would
    be the Selenium element,
  • 39:21 - 39:24
    but we need to get the
    text from that element,
  • 39:24 - 39:27
    so to do that,
  • 39:29 - 39:33
    we're going to use .get_attribute,
  • 39:33 - 39:39
    and then here you do innerHTML.
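Put together, the lookup and the text extraction can be sketched as one small helper. This assumes the Selenium 4 spelling `find_element(By.CSS_SELECTOR, ...)`; older releases spelled it `find_element_by_css_selector(...)`, which is what is being said aloud here. The helper name `read_inner_html` and the `.strip()` call are additions for illustration, and the selector you pass in is whatever you copied from DevTools.

```python
try:
    from selenium.webdriver.common.by import By
    CSS = By.CSS_SELECTOR
except ImportError:
    # Lets the sketch load without Selenium installed.
    CSS = "css selector"  # the string value behind By.CSS_SELECTOR


def read_inner_html(driver, selector):
    """Find one element by CSS selector and return its inner HTML as text."""
    element = driver.find_element(CSS, selector)
    # Assumption: trimming whitespace is wanted; the lesson does not say.
    return element.get_attribute("innerHTML").strip()
```

A call would then look like `hour_24 = read_inner_html(driver, copied_selector)`, where `copied_selector` stands in for the long string pasted from the browser.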
  • 39:59 - 40:01
    John DeCrey: And by
    the way, for anybody
  • 40:01 - 40:03
    that might be having problems
  • 40:03 - 40:09
    getting that whole
    CSS selector stuff,
  • 40:09 - 40:11
    I do have it posted
    in the assignment
  • 40:11 - 40:14
    in Canvas that you can copy.
  • 40:14 - 40:17
    I encourage you to try to follow
  • 40:17 - 40:20
    Jerik getting it live
    from the website,
  • 40:20 - 40:22
    but just as a backup plan,
  • 40:22 - 40:26
    you can get it from
    the Canvas page too.
  • 40:28 - 40:33
    Jerik: Next we want
    48 hour total.
  • 40:36 - 40:39
    This is going to
    be really similar
  • 40:39 - 40:41
    just a different selector,
  • 40:41 - 40:48
    so driver.find_element
    by CSS selector,
  • 40:48 - 40:53
    and I'm just going to
    add the get_attribute,
  • 40:53 - 40:55
    innerHTML here,
  • 40:58 - 41:02
    then do the same thing to get
  • 41:02 - 41:06
    that selector, click on that,
  • 41:06 - 41:11
    right click, copy selector,
  • 41:15 - 41:18
    and put that in there.
  • 41:26 - 41:29
    John DeCrey: Do it
    once for the 24 hour
  • 41:29 - 41:31
    and then twice for the 48.
  • 41:31 - 41:35
    Jerik: Then, same
    thing for the base.
  • 42:08 - 42:13
    Now this should have those
    values there, the 19.
  • 42:13 - 42:17
    Now we just want to print
    it out so we can make
  • 42:17 - 42:22
    a report variable and
    build out a report.
  • 42:22 - 42:28
    Let's see, 24-hour snowfall.
  • 43:01 - 43:04
    Then the base,
  • 43:13 - 43:18
    and then we can print
    out the report.
  • 43:28 - 43:34
    John DeCrey: Just to
    explain like line 25
  • 43:34 - 43:40
    using concatenation and like
    the slash being there.
  • 43:40 - 43:42
    We have talked about that.
  • 43:42 - 43:46
    Another kind of cool way we
    could do it would be using
  • 43:46 - 43:49
    string interpolation and
  • 43:49 - 43:57
    then the triple quotes
    rather to get the literal.
  • 43:58 - 44:02
    Anyway, that's all that's doing.
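The string-interpolation alternative mentioned here could look roughly like this: an f-string inside triple quotes, so the line breaks are typed literally instead of being glued together with `+`. The function name `build_report` and the label wording are assumptions; only the three values (24-hour, 48-hour, base) come from the lesson.

```python
def build_report(hour_24, hour_48, base):
    """Build the report with an f-string inside triple quotes."""
    return f"""Snowbird snow report
24-hour snowfall: {hour_24}
48-hour snowfall: {hour_48}
Base depth: {base}"""


if __name__ == "__main__":
    # Placeholder values; the real ones come from the page.
    print(build_report('19"', '19"', '19"'))
```

The concatenation version and this one produce the same kind of output; the triple-quoted form is mostly easier to read once the report grows past a couple of lines.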
  • 44:03 - 44:05
    Jerik: I'm going
    to try this now.
  • 44:05 - 44:10
    I am running it, this
  • 44:10 - 44:12
    is going to be a boring
    one because it's
  • 44:12 - 44:13
    just bringing up this web page.
  • 44:13 - 44:17
    I noticed even though
    that box is there,
  • 44:17 - 44:19
    the page in the
    background has still
  • 44:19 - 44:22
    loaded everything.
    It doesn't matter.
  • 44:22 - 44:24
    John DeCrey: That's cool.
  • 44:24 - 44:29
    Jerik: If we did need to click
    on things for our script,
  • 44:29 - 44:32
    I bet we would have had
    to get rid of that.
  • 44:32 - 44:36
    This anticipated
    opening dates thing.
  • 44:36 - 44:38
    That's just things
    that you have to deal
  • 44:38 - 44:42
    with writing scripts like this.
  • 44:42 - 44:45
    That is that script all done.
  • 44:45 - 44:49
    John DeCrey: I was just
  • 44:49 - 44:51
    going to say you
    can change it to
  • 44:51 - 44:54
    headless so we can see
    the difference there.
  • 44:54 - 44:56
    Jerik: Yeah, it's just
    not going to bring up
  • 44:56 - 44:58
    the browser window this time
  • 44:58 - 45:02
    and it should just
    post the data there.
  • 45:04 - 45:07
    John DeCrey: There we go.
  • 45:15 - 45:16
    This is
  • 45:16 - 45:19
    one of the
    code-along assignments.
  • 45:19 - 45:21
    If you get it running
    and you're good with
  • 45:21 - 45:23
    this before we move on,
  • 45:23 - 45:27
    you can submit your
    script in Canvas,
  • 45:27 - 45:30
    and then, we can move
    on to the next one.
  • 45:33 - 45:41
    Jerik: The next one. Where is
  • 45:41 - 45:44
    the deals page? That was weird.
  • 45:46 - 45:49
    This one is going to go
    to the Amazon Deal of
  • 45:49 - 45:53
    the Day page and this
    script is going to
  • 45:53 - 46:02
    get these deal titles along
    with the link to the deal,
  • 46:02 - 46:05
    and it's going to
    just print them out,
  • 46:05 - 46:07
    or we're going to save them
  • 46:07 - 46:10
    as a dictionary and
    then print them out.
  • 46:23 - 46:27
    Same thing with this one.
  • 46:27 - 46:31
    You can copy it over
    from the last script.
  • 46:31 - 46:36
    This is just stuff to get
    everything set up and
  • 46:36 - 46:44
    then we're going to
  • 46:44 - 46:47
    do driver.get to
    go to that page.
  • 46:52 - 46:56
    Instead of using this URL,
  • 46:56 - 46:59
    I'm just going to use this.
  • 46:59 - 47:03
    This seems like it will
    work in the future better.
  • 47:06 - 47:10
    It's Amazon.com/deal.
  • 47:13 - 47:15
    Then same thing.
  • 47:15 - 47:17
    We're going to let it sleep for
  • 47:17 - 47:22
    five seconds to make sure
    everything is loaded.
  • 47:27 - 47:32
    This script is going
    to be a little more
  • 47:32 - 47:37
    complicated and here's
    the thought process
  • 47:37 - 47:40
    to why we need to
    do it this way.
  • 47:41 - 47:44
    For this one or for
    the last script,
  • 47:44 - 47:48
    we just got a
    single element from
  • 47:48 - 47:53
    Selenium using that
    find-by-CSS thing.
  • 47:53 - 47:57
    But this we're going to want
    to get every single one
  • 47:57 - 48:00
    of these titles and links.
  • 48:00 - 48:08
    I kind of see here in the
    source how these are set up.
  • 48:08 - 48:12
    They're each in their
    own little div.
  • 48:12 - 48:17
    We're going to find a selector
    that gets all the divs.
  • 48:17 - 48:24
    If I click here and then
    look in the source code,
  • 48:24 - 48:26
    you can hover over these,
  • 48:26 - 48:35
    and what I'm looking for now
    is something that is unique,
  • 48:35 - 48:39
    that I can hook to
    with the CSS selector.
  • 48:51 - 48:54
    Sorry, there's a lot here.
    I want to make sure I
  • 48:54 - 48:57
    find the right one.
  • 49:04 - 49:07
    I'm going to find a
    selector that finds
  • 49:07 - 49:09
    this div that's highlighted,
  • 49:09 - 49:13
    and then we can
  • 49:13 - 49:17
    drill down and get the
    title and links from that.
  • 49:19 - 49:26
    We'll make a variable
    called deal divs,
  • 49:28 - 49:31
    be similar to the last one.
  • 49:33 - 49:36
    This time though is really
  • 49:36 - 49:38
    important and in
    the last classes,
  • 49:38 - 49:40
    a few people always miss this.
  • 49:40 - 49:42
    You want to make sure
    this one is plural.
  • 49:42 - 49:44
    It's find_elements.
  • 49:44 - 49:51
    John DeCrey: Element with
    an S. Easy to get confused.
  • 49:51 - 49:54
    If at the end you
  • 49:54 - 49:57
    have some issues and it's
    not running as expected,
  • 49:57 - 49:58
    go back to this line,
  • 49:58 - 50:00
    to the line that you
    have, and make sure
  • 50:00 - 50:02
    that it's elements
    with an S, not element.
  • 50:10 - 50:14
    Jerik: Going back to this CSS,
  • 50:14 - 50:16
    I can search for this class.
  • 50:17 - 50:19
    I'll show you
    another cool thing.
  • 50:19 - 50:21
    When you have Chrome
    dev tools open,
  • 50:21 - 50:23
    you can press Control
    F that brings up
  • 50:23 - 50:28
    this search bar that you
    can search for selectors.
  • 50:28 - 50:30
    I use this a lot to
  • 50:30 - 50:35
    test selectors to make sure
    I'm getting the right thing.
  • 50:35 - 50:38
    I'm just going to type
    in the selector I
  • 50:38 - 50:41
    am thinking I want to use here.
  • 50:46 - 50:50
    These selectors will be
    posted too, right, John?
  • 50:50 - 50:53
    John DeCrey: We can post
    it. Your mouse cursor is
  • 50:53 - 50:57
    kind of covering what
    you're typing there.
  • 50:58 - 51:02
    Jerik: This isn't too
    important to the lessons.
  • 51:02 - 51:06
    Just in case you're
    interested in
  • 51:06 - 51:10
    my thought process
    of finding selectors.
  • 51:12 - 51:17
    Then it's that deal
    grid item module.
  • 51:17 - 51:22
    This star equals is
    doing a wildcard search,
  • 51:22 - 51:25
    so it's going to match any
    class containing this.
  • 51:32 - 51:35
    We can see that it's
  • 51:35 - 51:41
    finding these divs,
    which is good.
  • 51:42 - 51:49
    Then this found them all.
  • 51:49 - 51:53
    I'm going to go one
    more down though.
  • 51:53 - 51:57
    To do that, let's do greater than,
  • 51:57 - 52:03
    and then div then it's
    going down one more level.
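The two pieces of selector syntax used here are `[class*="..."]`, which matches any element whose class attribute contains that text, and `>`, which steps down to a direct child. A small sketch of building such a selector as a string follows. The class fragment `DealGridItem-module` is a guess at the spelling of what is read aloud as "deal grid item module", and the helper names are invented for illustration.

```python
def class_contains(tag, fragment):
    """Selector for `tag` elements whose class attribute contains `fragment`."""
    return f'{tag}[class*="{fragment}"]'


def child_of(parent, child):
    """Selector for `child` elements that sit directly inside `parent`."""
    return f"{parent} > {child}"


# Hypothetical fragment; copy the real one from DevTools.
deal_selector = child_of(class_contains("div", "DealGridItem-module"), "div")
print(deal_selector)
```

Pasting the printed selector into the DevTools search bar is a handy way to check how many elements it matches before using it in the script.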
  • 52:03 - 52:07
    The reason I like this
    search thing to test
  • 52:07 - 52:11
    selectors is now I
    can go and make sure.
  • 52:11 - 52:15
    John DeCrey: Narrow
    it down and confirm
  • 52:15 - 52:17
    that you got the right one.
  • 52:19 - 52:21
    Jerik: Now when this runs,
  • 52:21 - 52:28
    instead of finding one of these,
  • 52:28 - 52:33
    this is setting all of
    these to that variable.
  • 52:36 - 52:39
    So that is that.
  • 52:39 - 52:41
    Now we'll make an empty list for
  • 52:41 - 52:52
    the titles and an empty
    list for the links too.
  • 52:54 - 53:02
    Then we can iterate through
  • 53:02 - 53:06
    the list of those
    div elements and get
  • 53:06 - 53:11
    out what we need and
    append it to these lists.
  • 53:11 - 53:19
    So to do that, we'll just do
    for deal_div in deal_divs.
  • 53:22 - 53:24
    John DeCrey: Just pay attention
  • 53:24 - 53:26
    to the one that is without the S,
  • 53:26 - 53:28
    the one right after the for.
  • 53:28 - 53:30
    That's your loop variable
    inside the for
  • 53:30 - 53:34
    loop, just to not
    get confused there.
  • 53:39 - 53:47
    Jerik: We'll go deal titles.
  • 53:47 - 53:55
    I wanted to make
    that [inaudible].
  • 53:55 - 53:58
    deal_titles.append.
  • 54:04 - 54:09
    This deal_div now when it gets
    to this point is going to
  • 54:09 - 54:13
    be a single div.
  • 54:13 - 54:14
    So the first time it
    iterates through it,
  • 54:14 - 54:18
    it's going to be this first one.
  • 54:21 - 54:25
    I'm going to expand
    this because we want
  • 54:25 - 54:30
    to get the title.
  • 54:44 - 54:47
    Let me do this.
  • 54:49 - 54:52
    This text right here.
  • 54:52 - 54:54
    These smart DIY and home tools,
  • 54:54 - 54:56
    that's what we're trying to get.
  • 54:56 - 55:01
    This class looks like
    something we could hook to.
  • 55:01 - 55:04
    We could do another wild card
    to search so we don't have
  • 55:04 - 55:07
    to get this random.
  • 55:08 - 55:15
    We'll do deal_div.find_element.
  • 55:15 - 55:17
    This is going to be singular
  • 55:17 - 55:21
    now because we're finding
    a single element now.
  • 55:21 - 55:23
    John DeCrey: Yes.
    Make sure it's find
  • 55:23 - 55:25
    element, not elements.
  • 55:25 - 55:26
    Jerik: Yes.
  • 55:26 - 55:48
    Then okay,
  • 55:48 - 55:56
    so from what was that first one?
  • 55:56 - 56:00
    We found this one and then
    we drill down to that.
  • 56:00 - 56:07
    So from there we need
    to go down to an a tag,
  • 56:12 - 56:16
    so we need to go to this a tag.
  • 56:21 - 56:29
    So to do that, we'll
    do a.a-text-normal.
  • 56:31 - 56:39
    That will bring us
    down into this class.
  • 56:39 - 56:41
    Then we need to go down one
  • 56:41 - 56:45
    more into this deal
    content module thing.
  • 56:50 - 56:53
    We can do a wildcard
    search on that just
  • 56:53 - 56:58
    to make sure we're
    getting the right thing.
  • 57:19 - 57:26
    Then before that last
    closing parenthesis do .text,
  • 57:26 - 57:29
    this will get the text from it.
  • 57:31 - 57:34
    This .text is actually
    doing the same thing
  • 57:34 - 57:39
    as the get_attribute innerHTML.
  • 57:39 - 57:41
    For some reason, text would
  • 57:41 - 57:43
    not work on this website though.
  • 57:43 - 57:45
    I have no idea why.
  • 57:45 - 57:48
    I would do a breakpoint and
  • 57:48 - 57:51
    this hour_24 would have
    the text in there,
  • 57:51 - 57:53
    but for some reason when I would
  • 57:53 - 57:55
    do hour_24.text,
    it wouldn't work.
  • 57:55 - 57:57
    I have no idea why.
  • 58:04 - 58:06
    Then now
  • 58:06 - 58:10
    we just need to
    get the link:
  • 58:10 - 58:14
    deal_links.append,
  • 58:14 - 58:17
    and then using
  • 58:17 - 58:24
    that same deal_div
    find_element, singular.
  • 58:24 - 58:40
    Then this one is right here.
  • 58:40 - 58:42
    So we just need to
    drill down once
  • 58:42 - 58:51
    from that original div.
  • 58:53 - 59:03
    Then this one we're
  • 59:03 - 59:05
    going to do get attribute.
  • 59:05 - 59:15
    The attribute we want is this
    href that has the link to it.
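The loop built up over the last few minutes can be sketched as a single function. It assumes Selenium 4 syntax, and the three selectors are passed in as parameters because the real ones are long and are not fully captured in the transcript. The function name `collect_deals` is made up for illustration.

```python
try:
    from selenium.webdriver.common.by import By
    CSS = By.CSS_SELECTOR
except ImportError:
    # Lets the sketch load without Selenium installed.
    CSS = "css selector"  # the string value behind By.CSS_SELECTOR


def collect_deals(driver, card_selector, title_selector, link_selector):
    """Return (deal_titles, deal_links) gathered from every deal card."""
    deal_titles = []
    deal_links = []
    # Plural: a list of every matching element (empty if nothing matches).
    deal_divs = driver.find_elements(CSS, card_selector)
    for deal_div in deal_divs:
        # Singular: each search starts from this one card, not the page.
        title = deal_div.find_element(CSS, title_selector)
        link = deal_div.find_element(CSS, link_selector)
        deal_titles.append(title.text)
        deal_links.append(link.get_attribute("href"))
    return deal_titles, deal_links
```

Mixing up the plural and singular calls is the usual source of trouble: the singular call on the first line would hand back one element, and the `for` loop would then fail because a single element is not a list.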
  • 59:22 - 59:31
    That's that. This is
  • 59:31 - 59:34
    going to be a
    dictionary in the end.
  • 59:34 - 59:41
    Title with links equals.
  • 59:41 - 59:49
    This dictionary zip thing is
    just something I found that,
  • 59:49 - 59:51
    where you can take two lists and
  • 59:51 - 59:55
    this merges them
    into a dictionary,
  • 59:56 - 60:06
    passing the deal_titles
    and deal_links,
  • 60:06 - 60:14
    and then we can print the
    title with the links.
  • 60:14 - 60:17
    John DeCrey: So ultimately that
  • 60:17 - 60:20
    becomes a dictionary title?
  • 60:20 - 60:22
    Jerik: Yep. This
    title with links is
  • 60:22 - 60:25
    a dictionary that
    has both of those.
  • 60:40 - 60:42
    There it is.
  • 60:42 - 60:43
    John DeCrey: Hey, look at that.
  • 60:43 - 60:45
    Jerik: Sweet.
  • 60:56 - 60:59
    John DeCrey: See, can
    I get the price too?
  • 60:59 - 61:03
    Was that in there?
  • 61:07 - 61:09
    Jerik: They used to have
    like one item per price,
  • 61:09 - 61:12
    but now they have these
  • 61:12 - 61:14
    are going to be a bunch
    of deets in there.
  • 61:14 - 61:16
    John DeCrey: Fifteen
    percent off.
  • 61:16 - 61:17
    Jerik: Yes.
  • 61:17 - 61:19
    John DeCrey: Interesting.
  • 61:20 - 61:24
    Jerik: Yes, there
    is that script.
  • 61:25 - 61:30
    Should we go ahead and
    move on to the KSL one?
  • 61:30 - 61:33
    John DeCrey: Let me talk
    about this for a minute.
  • 61:36 - 61:39
    There's actually
    business models that
  • 61:39 - 61:42
    are around this thing.
  • 61:42 - 61:45
    There's companies
    that screen
  • 61:45 - 61:49
    scrape Amazon's site and
    show the trends of
  • 61:49 - 61:57
    pricing over time because
    sometimes you can look at
  • 61:57 - 62:01
    certain products that
    might be cheaper
  • 62:01 - 62:07
    at some times of
    the year than others.
  • 62:07 - 62:11
    Where you can maybe make a wise,
  • 62:11 - 62:13
    maybe a smarter decision
    on when to buy something,
  • 62:13 - 62:15
    especially on the cost.
  • 62:15 - 62:21
    But even like business
    opportunities are there as well.
  • 62:21 - 62:26
    I do want to also caution
    a little bit with ethics
  • 62:26 - 62:31
    because knowledge is
  • 62:31 - 62:34
    power, and with it comes
    responsibility.
  • 62:34 - 62:37
    There is the ethics of things,
  • 62:37 - 62:40
    especially screen scraping
  • 62:40 - 62:43
    other people's content
    and stuff like that,
  • 62:43 - 62:45
    so just something
    else to be aware of
  • 62:45 - 62:50
    and consider on
    certain projects.
  • 62:50 - 62:54
    But I think that's all I
    have to say about that.
  • 62:54 - 62:58
    Actually at one time I did have
  • 62:58 - 63:03
    a code fragment to
    append to this to take
  • 63:03 - 63:07
    all the findings and put them
    into a local database like
  • 63:07 - 63:12
    SQLite, but I think
    we'll skip that for now.
  • 63:12 - 63:16
    But we could easily extend
    this script and save
  • 63:16 - 63:22
    it to a database or even to
    a CSV file or something.
  • 63:22 - 63:26
    That goes for pretty much all the
    scripts that we're giving.
  • 63:26 - 63:30
    We could easily save the results off.
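The skipped database step is not shown in the lesson, so this is only a guess at its shape: a minimal sketch that stores the title-to-link dictionary in SQLite using Python's built-in `sqlite3` module. The table name, column names, and the function name `save_deals` are all assumptions.

```python
import sqlite3


def save_deals(connection, title_with_links):
    """Store each title and link; a repeated title overwrites the old link."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS deals (title TEXT PRIMARY KEY, link TEXT)"
    )
    connection.executemany(
        "INSERT OR REPLACE INTO deals (title, link) VALUES (?, ?)",
        title_with_links.items(),
    )
    connection.commit()
```

Passing `sqlite3.connect("deals.db")` keeps the data in a local file between runs, while `sqlite3.connect(":memory:")` is convenient for trying it out without leaving a file behind.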
  • 63:30 - 63:34
    You want to move on
    to the next one?
  • 63:34 - 63:35
    Jerik: Yes.
  • 63:35 - 63:36
    John DeCrey: Go ahead and if you
  • 63:36 - 63:38
    want to submit this one into
  • 63:38 - 63:44
    Canvas, in the Amazon one, I
    believe it's just called,
  • 63:46 - 63:48
    Amazon Web Scrape Lab,
  • 63:48 - 63:51
    so you can submit that into
    that assignment in Canvas and
  • 63:51 - 63:55
    then we'll go to the next one.
  • 63:57 - 64:02
    Jerik: This one is
  • 64:09 - 64:13
    the one where, in my testing, I
  • 64:13 - 64:15
    found that furniture gets
    posted the most often.
  • 64:15 - 64:20
    [NOISE] What this is
  • 64:20 - 64:24
    going to do is it's going
    to go to this page,
  • 64:24 - 64:26
    like we're searching
    for furniture to buy,
  • 64:26 - 64:29
    it's going to get
    this first listing
  • 64:29 - 64:32
    then it's going to save that off
  • 64:32 - 64:35
    and check again
    every 15 seconds.
  • 64:35 - 64:39
    Once this changes, then
    we know a new ad has been
  • 64:39 - 64:41
    posted and it's going to tell us
  • 64:41 - 64:46
    the newest ad that was listed.
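The check-and-compare loop described here can be sketched without any browser code by passing in a function that fetches the current first listing. The names `wait_for_new_listing` and `get_first_listing`, and the `max_checks` safety limit, are assumptions added for illustration.

```python
import time


def wait_for_new_listing(get_first_listing, seconds_between_checks,
                         sleep=time.sleep, max_checks=None):
    """Poll until the first listing changes; return the new one, or None."""
    last_seen = get_first_listing()  # remember what is there right now
    checks = 0
    while max_checks is None or checks < max_checks:
        sleep(seconds_between_checks)
        current = get_first_listing()
        checks += 1
        if current != last_seen:
            return current  # a new ad has taken the top spot
    return None  # gave up after max_checks polls with no change
```

In the real script, `get_first_listing` would load the page with the driver and return the link or title of the first non-sponsored ad.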
  • 64:46 - 64:48
    One thing to note on this is
  • 64:48 - 64:54
    these first three ads are
    like sponsored spots.
  • 64:54 - 64:56
    They were posted days ago,
  • 64:56 - 65:00
    so when we do our CSS selectors,
  • 65:00 - 65:02
    we're going to find
    this fourth one.
  • 65:02 - 65:07
    But anyway, so do
  • 65:07 - 65:16
    the new file and then same
    thing with the other scripts,
  • 65:17 - 65:20
    this stuff is just setup.
  • 65:24 - 65:27
    Then [NOISE] I'm going to
  • 65:27 - 65:30
    put this settings
    section at the top.
  • 65:30 - 65:33
    These are things that you
    might want to change.
  • 65:33 - 65:36
    If you want to actually
    use this as a deal finder,
  • 65:36 - 65:37
    you can put in
  • 65:37 - 65:41
    a different search criteria
  • 65:41 - 65:44
    or you can have it
    check more often.
  • 65:46 - 65:52
    In my experience, 15 seconds
    between checking is a lot.
  • 65:52 - 65:55
    I think you'll get blacklisted
  • 65:55 - 65:58
    pretty quick if you did that.
  • 65:58 - 66:00
    When I was running
  • 66:00 - 66:03
    this deal finder for
  • 66:03 - 66:06
    its intended purpose to
    actually try and find deals,
  • 66:06 - 66:08
    I would run it through
  • 66:08 - 66:13
    a VPN too, just so I didn't
    get my IP banned again.
  • 66:13 - 66:15
    John DeCrey: What's
    the recommendation,
  • 66:15 - 66:17
    maybe every minute?
  • 66:17 - 66:20
    Jerik: I think I was
    doing every two minutes
  • 66:20 - 66:23
    and you could even go
  • 66:23 - 66:26
    bigger than that depending on
    what you're searching for.
  • 66:26 - 66:31
    If you're searching for
    something really niche,
  • 66:31 - 66:35
    you could even do like once
    every 30 minutes probably?
  • 66:35 - 66:35
    John DeCrey: Yes.
  • 66:35 - 66:37
    Jerik: It just
    depends on how much.
  • 66:37 - 66:39
    John DeCrey: I agree
    too, every 15 seconds
  • 66:39 - 66:40
    is pretty excessive.
  • 66:40 - 66:47
    I'd probably avoid anything
    shorter than 30 seconds.
  • 66:48 - 66:50
    Jerik: Yes.
  • 66:51 - 66:55
    John DeCrey: Even every
    minute would be adequate.
  • 67:03 - 67:07
    I was just thinking you could
    use a random generator.
  • 67:07 - 67:08
    Random number generator.
  • 67:08 - 67:14
    Jerik: Yes. That's
    actually a thing.
  • 67:14 - 67:15
    When I was working at
    that job where we were
  • 67:15 - 67:18
    scraping the funeral
    home websites,
  • 67:18 - 67:20
    we did like a bunch of
    other things we were
  • 67:20 - 67:23
    scraping for too but it was
  • 67:23 - 67:25
    always like a cat and
    mouse game between us
  • 67:25 - 67:29
    and the people not wanting
    us to scrape their stuff.
  • 67:29 - 67:31
    But that was one of
    the things we did
  • 67:31 - 67:32
    was add random timeouts,
  • 67:32 - 67:38
    so it seemed more like a human
    was browsing their site.
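The random-timeout idea can be sketched with the standard `random` module: take a base wait and add a random extra amount each time, so the checks do not land on a perfectly regular beat. The function name `jittered_delay` and the example numbers are assumptions; the two-minute base echoes the interval mentioned above.

```python
import random


def jittered_delay(base_seconds, jitter_seconds, rng=random):
    """Return the base wait plus a random extra of up to `jitter_seconds`."""
    return base_seconds + rng.uniform(0, jitter_seconds)


if __name__ == "__main__":
    # Roughly every two minutes, give or take up to 30 extra seconds.
    print(round(jittered_delay(120, 30), 1))
```

Whether this is appropriate depends on the site's terms and the ethics discussed earlier; a longer base interval is the more considerate default.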
  • 67:40 - 67:43
    John DeCrey: Jerik and
    I used to work together
  • 67:43 - 67:51
    in digital forensics and
  • 67:51 - 67:54
    specifically mobile but we were
  • 67:54 - 67:58
    always looking for and
    working with hackers that
  • 67:58 - 68:04
    have workarounds on
    mobile security to
  • 68:04 - 68:09
    get to the data for criminal
    investigations and whatnot.
  • 68:09 - 68:11
    But same thing,
  • 68:11 - 68:14
    it was always a cat and mouse.
  • 68:14 - 68:17
    The manufacturers
    would always close
  • 68:17 - 68:18
    the security holes and then
  • 68:18 - 68:20
    the hackers would find new ones,
  • 68:20 - 68:23
    and the forensic industry
  • 68:23 - 68:26
    would expose and put
    it in their product.
  • 68:26 - 68:31
    [LAUGHTER]
  • 68:31 - 68:32
    Jerik: Just for this exercise,
  • 68:32 - 68:33
    I'm going to put 15 seconds
  • 68:33 - 68:36
    here just so it
    doesn't take too long.
  • 68:36 - 68:39
    This furniture section gets
    posted to all the time.
  • 68:39 - 68:41
    I'll be surprised if it doesn't
  • 68:41 - 68:43
    find one in the
    first 15 seconds.
  • 68:43 - 68:48
    John DeCrey: Yes. [NOISE]
  • 68:48 - 68:50
    Jerik: Now I'm going to make
    a function that's going to
  • 68:50 - 68:54
    get the first listing's info.
  • 68:54 - 69:02
    This listings info, call it,
  • 69:04 - 69:11
    get_first_listing_info
    and this is
  • 69:11 - 69:14
    where we'll have our
    first driver.get.
  • 69:15 - 69:19
    This is going to go to
    that classified link.
  • 69:20 - 69:22
    Then like the other scripts,
  • 69:22 - 69:25
    we'll let it sleep
    for five seconds.
  • 69:35 - 69:39
    Now we want to get the link
  • 69:39 - 69:48
    driver.find_element by CSS.
  • 69:50 - 69:53
    I think on this one,
  • 70:02 - 70:06
    let's see if we want the link.
  • 70:08 - 70:11
    This is what we want.
  • 70:15 - 70:22
    Let me see if I can get it by
    this item info title link.
  • 70:23 - 70:27
    Except we are going
    to want it from
  • 70:27 - 70:31
    this fourth listing so
    we can't do it by that.
  • 70:31 - 70:42
    [NOISE]
  • 70:42 - 70:49
    Let's see, we got
    search results.
  • 70:49 - 70:54
    We'll want to go from
    there to this Div,
  • 70:54 - 71:01
    to that class.
  • 71:01 - 71:03
    I'm looking for when it goes to
  • 71:03 - 71:12
    this first ad and then we
    can drill down from there.
  • 71:14 - 71:18
    These will be posted
    somewhere too, right John?
  • 71:18 - 71:21
    This will be a long one.
  • 71:22 - 71:25
    John DeCrey: If you want
    to post in the chat
  • 71:25 - 71:28
    then I'll put it in the Canvas.
  • 71:28 - 71:30
    Jerik: Okay.
  • 71:43 - 71:45
    I'll send it to you
  • 71:45 - 71:48
    after the recording is done.
  • 71:51 - 72:01
    This can be a div to
    a section to a div,
  • 72:03 - 72:09
    to another div and by the way,
  • 72:09 - 72:12
    this is going from
  • 72:12 - 72:17
    the search result to
    the div section div.
  • 72:20 - 72:30
    Then here I had to do a div
  • 72:30 - 72:34
    and then this nth child,
  • 72:34 - 72:39
    this is saying,
    get the first one.
  • 72:47 - 72:50
    That's just saying, get
    this first div after
  • 72:50 - 72:52
    this div so this is getting
  • 72:52 - 72:55
    that and then we can do
  • 72:55 - 72:58
    another nth child here to
    get the fourth section,
  • 72:58 - 73:01
    which is the third that we want.
  • 73:03 - 73:10
    After that, the section
  • 73:10 - 73:15
    nth-child, the fourth one.
  • 73:16 - 73:22
    Then from there we go
  • 73:22 - 73:31
    to another div,
  • 73:31 - 73:34
    the 2nd div, and
  • 73:34 - 73:43
    then h2, div, and the
    link. It's a long one.
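A rough sketch of how a long selector like this can be assembled step by step. The `#search-results` id, the exact element chain, and the `build_selector` helper are assumptions made for illustration; they only approximate what is described on screen, and the real page markup may differ. `:nth-child(4)` matches an element that is the fourth child of its parent.

```python
def build_selector(steps):
    """Join selector steps with the CSS child combinator ('>')."""
    return " > ".join(steps)


# Hypothetical steps, loosely following the spoken walkthrough.
steps = [
    "#search-results",       # assumed id of the search results container
    "div",
    "section",
    "div",
    "div:nth-child(1)",      # the first div under that div
    "section:nth-child(4)",  # the fourth child, which holds the first ad
    "div",
    "div:nth-child(2)",
    "h2",
    "div",
    "a",                     # the link itself
]

FIRST_AD_LINK_SELECTOR = build_selector(steps)
print(FIRST_AD_LINK_SELECTOR)
```

Keeping the steps in a list makes it easier to spot a typo in one level than in a single long string.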
  • 74:05 - 74:10
    It looks like there's an a tag.
  • 74:10 - 74:13
    Then we'll do the
    same thing we did on
  • 74:13 - 74:17
    the Amazon one we'll get
    the attribute href.
  • 74:17 - 74:19
    John DeCrey: Okay.
  • 74:30 - 74:32
    Jerik: I hope I typed
    that all in right,
  • 74:32 - 74:37
    but we'll see, then the title.
  • 74:48 - 74:51
    Then this is going to be
  • 74:51 - 75:02
    similar from Oh,
  • 75:02 - 75:11
    wait, this is really similar.
    This is the same thing.
  • 75:21 - 75:23
    Because here's the
    title of this brand
  • 75:23 - 75:25
    new velvet ivory couch.
  • 75:25 - 75:26
    It's the same selector.
  • 75:26 - 75:33
    We're just getting the
    text from it this time.
  • 75:36 - 75:40
    John DeCrey: I don't know
    if I want a velvet couch.
  • 75:40 - 75:44
    Jerik: I know. It
    looks pretty though.
  • 75:44 - 75:46
    That's one of those
    couches that it's
  • 75:46 - 75:48
    just a show couch. You
    don't actually use it.
  • 75:48 - 75:50
    John DeCrey: Yeah.
  • 75:51 - 75:53
    Jerik: This function will return
  • 75:53 - 75:55
    the link and the title.
  • 75:55 - 75:59
    John DeCrey: When you do
    two returns like that,
  • 75:59 - 76:03
    it's returning as a
    tuple, is that right?
  • 76:03 - 76:08
    Jerik: Yeah. Now,
  • 76:08 - 76:12
    we can call this function
    and save it to listing info.
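A minimal sketch of a function along these lines, assuming a Selenium-style `driver` object is passed in. The selector below is a hypothetical placeholder, not the one typed in the video, and the `wait_seconds` parameter is an addition for illustration. The string `"css selector"` is the value of Selenium's `By.CSS_SELECTOR`, written out so the sketch does not need Selenium at import time.

```python
import time

# Same value as selenium.webdriver.common.by.By.CSS_SELECTOR.
CSS_SELECTOR = "css selector"

# Hypothetical placeholder; substitute the long selector built for the real page.
FIRST_LISTING_SELECTOR = "section:nth-child(4) h2 a"


def get_first_listing_info(driver, url, selector=FIRST_LISTING_SELECTOR, wait_seconds=5):
    """Open the listings page and return (link, title) for the first ad."""
    driver.get(url)
    time.sleep(wait_seconds)  # give the page time to load, as in the other scripts
    element = driver.find_element(CSS_SELECTOR, selector)
    link = element.get_attribute("href")  # the URL the a tag points to
    title = element.text                  # same element, visible text this time
    return link, title                    # two values come back as one tuple
```

The caller can index the result (`listing_info[0]` is the link, `listing_info[1]` is the title) or unpack it directly with `link, title = get_first_listing_info(driver, url)`.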
  • 76:22 - 76:29
    Then we're going to
    save off the link,
  • 76:29 - 76:34
    so we can check it
    again in 15 seconds.
  • 76:34 - 76:37
    John DeCrey: That's our Delta.
  • 76:37 - 76:39
    Jerik: Yep.
  • 76:39 - 76:40
    John DeCrey: Sort of thing.
  • 76:40 - 76:43
    Jerik: The first listing link,
  • 76:43 - 76:50
    temp equals listing info
  • 76:50 - 76:55
    and it's the first thing
    in there, the link.
  • 76:58 - 77:09
    The title is going to
    be the second thing.
  • 77:13 - 77:16
    With scripts like this too,
    I like to print
  • 77:16 - 77:17
    out like what's happening
  • 77:17 - 77:20
    if it's going to be
    running for a while.
  • 77:20 - 77:22
    John DeCrey: That is for sure.
  • 77:22 - 77:23
    Jerik: We'll print this out
  • 77:23 - 77:25
    just to log what's going on.
  • 77:25 - 77:28
    First, listing title.
  • 77:29 - 77:32
    It's going to be the title,
  • 77:33 - 77:42
    and then we'll print
    out the link too.
  • 77:42 - 77:52
    I'm going to add
  • 77:52 - 77:56
    a counter so we can print out
    how many times this is run.
  • 78:01 - 78:07
    Then now, I'm going to
    make a while loop and this
  • 78:07 - 78:12
    is where it's just going to
    run until it finds a new ad,
  • 78:12 - 78:15
    so it'd be, while true
  • 78:15 - 78:17
    that means it's
    just going to run
  • 78:17 - 78:20
    forever until I
    tell it to break,
  • 78:22 - 78:30
    we'll increase the
    check count and
  • 78:30 - 78:37
    then we'll sleep for our
    time we specified up here.
  • 78:37 - 78:38
    John DeCrey: Okay.
  • 78:38 - 78:43
    Jerik: You print out
    another log message.
  • 78:58 - 79:03
    This will print out
    that it's checking
  • 79:03 - 79:08
    and then we'll get
    a new listing info.
  • 79:10 - 79:14
    That is, we'll get
    it from the get
  • 79:14 - 79:18
    first listing info
    function we made.
  • 79:19 - 79:26
    Then first listing link
  • 79:26 - 79:35
    can be listing info[0].
  • 79:35 - 79:40
    John DeCrey: So info[0] was
    the title or the or no?
  • 79:40 - 79:44
    Jerik: Zero is the link
    and then 1 is the title.
  • 79:44 - 79:46
    John DeCrey: Okay.
  • 79:47 - 79:55
    Jerik: One, and then
  • 79:55 - 79:57
    this is where we'll check to see
  • 79:57 - 79:58
    if it's different or not.
  • 79:58 - 80:02
    If it's different we
    know it's a new ad so if
  • 80:02 - 80:07
    first listing link temp
  • 80:07 - 80:12
    does not equal
    first listing link,
  • 80:12 - 80:15
    this is when we know
    it's different.
  • 80:15 - 80:17
    John DeCrey: That's
    checking our delta?
  • 80:17 - 80:18
    Jerik: Yep.
  • 80:20 - 80:23
    John DeCrey: There's the new ad.
  • 80:38 - 80:43
    Jerik: And this break
    will stop the program.
  • 80:43 - 80:45
    John DeCrey: Okay.
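The loop described here could look roughly like the sketch below. The function name `watch_for_new_listing` and its parameters are assumptions for illustration; `get_info` stands for any callable that returns a `(link, title)` tuple, and `sleep` and `log` are passed in only so the loop can be tried without waiting or printing.

```python
import time


def watch_for_new_listing(get_info, interval_seconds=15, sleep=time.sleep, log=print):
    """Poll get_info() until the first listing's link changes."""
    first_link, first_title = get_info()
    log("First listing title: " + first_title)
    log("First listing link: " + first_link)

    check_count = 0
    while True:  # runs until the break below
        check_count += 1
        sleep(interval_seconds)
        log("Check #%d: looking for a new listing..." % check_count)

        new_link, new_title = get_info()
        if new_link != first_link:  # a different link means a new ad
            log("New listing: %s (%s)" % (new_title, new_link))
            break  # leave the loop; nothing follows it, so the script ends

    return new_link, new_title, check_count
```

Comparing links rather than titles avoids a false match when two different ads share the same title.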
  • 80:49 - 80:53
    Jerik: We'll see if this works.
  • 80:53 - 80:57
    John DeCrey: There's different
    options that you can put in here.
  • 80:57 - 80:58
    If you were to
    actually use this for
  • 80:58 - 81:06
    your own self and
    looking for something,
  • 81:06 - 81:10
    and once it found something
    that matches your criteria,
  • 81:10 - 81:12
    you can have it email,
  • 81:12 - 81:14
    send an email to you and
  • 81:14 - 81:17
    you probably even do a text
    message, couldn't you?
  • 81:17 - 81:19
    Jerik: Probably,
    there's probably
  • 81:19 - 81:21
    a module for that, to be honest.
  • 81:21 - 81:23
    I did email.
  • 81:23 - 81:25
    It was pretty easy
    to set up to email.
  • 81:25 - 81:30
    There's an SMTP module that I used.
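A hedged sketch of what an email alert might look like, assuming the module meant is the standard library's `smtplib` (the transcript does not say which one was used). The function names, addresses, and server details are all placeholders for illustration; a real mail provider may need different settings.

```python
import smtplib
from email.message import EmailMessage


def build_alert_email(sender, recipient, title, link):
    """Build a plain-text message announcing a new listing."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "New listing: " + title
    msg.set_content("A new listing was posted:\n" + link + "\n")
    return msg


def send_alert_email(msg, host, port, username, password):
    """Send the message through an SMTP server that supports STARTTLS."""
    with smtplib.SMTP(host, port) as server:
        server.starttls()                 # upgrade the connection to TLS
        server.login(username, password)
        server.send_message(msg)
```

Building the message separately from sending it means the content can be checked without contacting a mail server.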
  • 81:30 - 81:36
    John DeCrey: What's the error
    that we got there on the bottom?
  • 81:36 - 81:39
    Jerik: That was that stuff.
  • 81:39 - 81:41
    I'm not sure what's doing
    that logging,
  • 81:41 - 81:43
    whether it's Selenium or Chrome.
  • 81:43 - 81:45
    John DeCrey: We can
    ignore that stuff?
  • 81:45 - 81:46
    Jerik: Yeah.
  • 81:46 - 81:47
    John DeCrey: Okay.
  • 81:47 - 81:49
    Jerik: I need to.
  • 81:49 - 81:50
    John DeCrey: Objection.
  • 81:50 - 81:51
    Jerik: A copy of that.
  • 81:51 - 81:53
    John DeCrey: This is
    attempt number one.
  • 81:53 - 81:58
    In 15 seconds we'll check
    to see if there's a new ad.
  • 81:58 - 82:04
    Number two, has there been
    new ads being posted?
  • 82:04 - 82:06
    Jerik: One just got posted.
  • 82:06 - 82:07
    John DeCrey: Sweet.
  • 82:07 - 82:09
    Jerik: You can go
    cop yourself a
  • 82:09 - 82:12
    Freeman Style grandfather clock.
  • 82:12 - 82:14
    John DeCrey: Let's
    look at that URL.
  • 82:14 - 82:18
    Jerik: Nice.
  • 82:23 - 82:28
    John DeCrey: Brand new ad that
    we just found within seconds.
  • 82:28 - 82:29
    Jerik: Yep.
  • 82:29 - 82:32
    John DeCrey: Very cool.
  • 82:34 - 82:36
    Jerik: That is it.
  • 82:36 - 82:39
    John DeCrey: Then we can
    change it to headless,
  • 82:39 - 82:45
    so browsers are not showing
    while your script is running.
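One way headless mode might be switched on, sketched under the assumption that the script drives Chrome through Selenium. The `make_headless_chrome` helper and the window size are illustrative choices, not taken from the video. Selenium is imported inside the function so the rest of the file loads even where it is not installed.

```python
# Command-line flags passed to Chrome; "--headless" hides the browser window.
HEADLESS_ARGUMENTS = ["--headless", "--window-size=1280,1024"]


def make_headless_chrome(arguments=HEADLESS_ARGUMENTS):
    """Start a Chrome driver with no visible window."""
    from selenium import webdriver  # third-party: pip install selenium

    options = webdriver.ChromeOptions()
    for argument in arguments:
        options.add_argument(argument)
    return webdriver.Chrome(options=options)
```

A fixed window size is worth setting in headless mode because some pages lay out differently at small default sizes, which can change what a selector finds.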
  • 82:48 - 82:51
    Jerik: That is all I have.
  • 82:52 - 82:55
    I'll turn it back to you.
    Usually, ask questions,
  • 82:55 - 82:58
    but back to you John.
  • 82:59 - 83:04
    John DeCrey: Many thanks.
    This concludes that lab.
  • 83:04 - 83:06
    If you would submit this into
  • 83:06 - 83:14
    the canvas that's called KSL,
  • 83:14 - 83:16
    it's just called KSL Lab.
  • 83:17 - 83:20
    Many thanks, Jerick.
  • 83:20 - 83:22
    I appreciate all
    the time that you
  • 83:22 - 83:25
    have spent on this and always
  • 83:25 - 83:28
    enjoyable talking
    with you and hearing
  • 83:28 - 83:32
    your stories so I
    really appreciate it.
  • 83:32 - 83:35
    Jerik: No problem.
    Thanks for having me.
Title:
Module 10 Part 3 - 3rd party modules
Video Language:
English
Duration:
01:23:36
