In this segment I'm going to show you that dependency syntax is a very natural representation for relation extraction applications.

One domain in which a lot of work has been done on relation extraction is the biomedical text domain. So here, for example, we have the sentence "The results demonstrated that KaiC interacts rhythmically with SasA, KaiA, and KaiB," and what we'd like to get out of that is a protein interaction event. So here's the "interacts" that indicates the relation, and these are the proteins involved; and there are a bunch of other proteins involved as well.

Well, the point here is that if we have this kind of dependency syntax, then it's very easy, starting from "interacts," to follow along the arcs to the subject and to the argument of the preposition "with," and to easily read off the relation that we'd like to extract. And if we're just a little bit cleverer, we can then also follow along the conjunction relations and see that KaiC is also interacting with these other two proteins.

And that's something that a lot of people have worked on. In particular, one representation that has been widely used for relation extraction applications in biomedicine is the Stanford dependencies representation. The basic form of this representation is a projective dependency tree.
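The arc-following idea can be sketched in a few lines of Python. The edge list below is a hand-built approximation of the parse of the KaiC sentence, not the output of an actual parser, with labels in the Stanford dependencies style; `extract_interactions` and the hard-coded `interacts` trigger are illustrative, not part of any real toolkit.

```python
# Hand-built typed dependency arcs (head, relation, dependent) for
# "KaiC interacts rhythmically with SasA, KaiA and KaiB".
EDGES = [
    ("interacts", "nsubj", "KaiC"),
    ("interacts", "advmod", "rhythmically"),
    ("interacts", "prep_with", "SasA"),
    ("SasA", "conj_and", "KaiA"),
    ("SasA", "conj_and", "KaiB"),
]

def extract_interactions(edges, trigger="interacts"):
    """Pair the subject of the trigger word with each prep_with argument,
    also expanding conjuncts of that argument."""
    deps = {}
    for head, rel, dep in edges:
        deps.setdefault(head, []).append((rel, dep))
    subjects = [d for r, d in deps.get(trigger, []) if r == "nsubj"]
    objects = []
    for r, d in deps.get(trigger, []):
        if r == "prep_with":
            objects.append(d)
            # the "little bit cleverer" step: follow coordination arcs
            objects += [d2 for r2, d2 in deps.get(d, []) if r2.startswith("conj")]
    return [(s, trigger, o) for s in subjects for o in objects]

print(extract_interactions(EDGES))
# -> [('KaiC', 'interacts', 'SasA'), ('KaiC', 'interacts', 'KaiA'),
#     ('KaiC', 'interacts', 'KaiB')]
```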
And it was designed that way so it could be easily generated by postprocessing phrase structure trees. So if you have a notion of headedness in the phrase structure tree, the Stanford dependencies software provides a set of matching pattern rules that will then type the dependency relations and give you a Stanford dependency tree. But Stanford dependencies can also be, and now increasingly are, generated directly by dependency parsers such as the MaltParser that we looked at recently.

Okay, so this is roughly what the representation looks like. It's just as we saw before, with the words connected by typed dependency arcs.

But something that has been explored in the Stanford dependencies framework is, starting from that basic dependencies representation, making some changes to it to facilitate relation extraction applications. And the idea here is to emphasize the relationships between content words, which are the ones useful for relation extraction. Let me give a couple of examples. One example is that commonly you'll have a content word like "based," and where the company is based, Los Angeles, is separated from it by the preposition "in," a function word. And you can think of these function words as really functioning like case markers in a lot of other languages.
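A toy sketch of how dependencies can be read off a headed phrase structure tree: each non-head child of a phrase contributes an arc from the phrase's head word. The tree encoding and the tiny `HEAD_RULES` table below are made-up stand-ins for the real head-finding and pattern-matching machinery in the Stanford dependencies software, and the arcs here come out untyped.

```python
# A node is (label, children); a preterminal's children list holds one word.
TREE = ("S", [
    ("NP", [("NNP", ["Bell"])]),
    ("VP", [("VBZ", ["makes"]),
            ("NP", [("NNS", ["products"])])]),
])

# Simplified head rules: which child label supplies the head of each phrase.
HEAD_RULES = {"S": ["VP"], "VP": ["VBZ"], "NP": ["NNS", "NNP"]}

def lexical_head(node):
    """Percolate heads upward to find a subtree's head word."""
    label, children = node
    if isinstance(children[0], str):          # preterminal, e.g. ("NNP", ["Bell"])
        return children[0]
    for want in HEAD_RULES.get(label, []):
        for child in children:
            if child[0] == want:
                return lexical_head(child)
    return lexical_head(children[0])          # fallback: leftmost child

def dependencies(node, deps=None):
    """Each non-head child contributes a (head word, dependent word) arc."""
    label, children = node
    if isinstance(children[0], str):
        return deps or []
    if deps is None:
        deps = []
    h = lexical_head(node)
    for child in children:
        if lexical_head(child) != h:
            deps.append((h, lexical_head(child)))
        dependencies(child, deps)
    return deps

print(dependencies(TREE))   # arcs from "makes" to "Bell" and "products"
```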
So it would seem more useful if we directly connected "based" and "LA," introducing the relation "prep_in." And so that's what we do, and we simplify the structure.

But there are some other places, too, where we can do a better job of representing the semantics with some modifications to the graph structure. A particular case of that is coordination relationships. So here we very directly get that "Bell makes products." But we'd also like to get out that Bell distributes products, and one way we can do that is by recognizing this "and" relationship and saying, okay, well, that means "Bell" should also be the subject of "distributes," and what they distribute is "products." And similarly, down here, we can recognize that they're computer products as well as electronic products. So we can make those changes to the graph, and get a reduced graph representation.

Now, once you do this, there are some things that are not as simple. In particular, if you look at this structure, it's no longer a dependency tree, because we have multiple arcs pointing at this node, and multiple arcs pointing at this node. But on the other hand, the relations that we'd like to extract are represented much more directly.
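The two graph rewrites just described, folding a preposition into the arc label and copying subject and object arcs onto conjuncts, can be sketched as follows. The edge lists are hand-built for the "based in LA" and "Bell makes and distributes products" examples, and the function names are illustrative rather than the actual Stanford dependencies API.

```python
# Hand-built basic dependencies for the two examples in the text.
BASIC = [
    ("makes", "nsubj", "Bell"),
    ("makes", "dobj", "products"),
    ("makes", "conj_and", "distributes"),
    ("based", "prep", "in"),
    ("in", "pobj", "LA"),
]

def collapse_preps(edges):
    """Rewrite prep + pobj pairs into a single arc like prep_in."""
    out, prep_obj = [], {}
    for h, r, d in edges:
        if r == "pobj":
            prep_obj[h] = d                    # preposition -> its object
    for h, r, d in edges:
        if r == "prep" and d in prep_obj:
            out.append((h, "prep_" + d, prep_obj[d]))   # based -prep_in-> LA
        elif r != "pobj":
            out.append((h, r, d))
    return out

def propagate_conjuncts(edges):
    """Copy nsubj/dobj arcs of a head onto its conj_and conjuncts."""
    out = list(edges)
    for h, r, d in edges:
        if r.startswith("conj"):
            for h2, r2, d2 in edges:
                if h2 == h and r2 in ("nsubj", "dobj"):
                    out.append((d, r2, d2))    # distributes also gets the arc
    return out

graph = propagate_conjuncts(collapse_preps(BASIC))
```

Note that after the second rewrite, "products" has two incoming arcs, which is exactly why the result is a graph rather than a dependency tree.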
And let me just show you one graph that gives an indication of this. So, this was a graph that was originally put together by Jari Björne et al., the team that won the BioNLP 2009 shared task in relation extraction using Stanford dependencies as the representational substrate. And what they wanted to illustrate with this graph is how much more effective dependency structures were at linking up the words that you want to extract in a relation than simply looking for words in the linear context.

So, here what we have along this axis is the distance, which can be measured either by just counting words to the left or right, or by counting the number of dependency arcs that you have to follow; and this axis is the percentage of cases. And so what you see is, if you just look at linear distance, there are lots of cases where the arguments and relations you want to connect are four, five, six, seven, eight words away. In fact, there's even a pretty large residue here, well over ten percent, where the linear distance in words is greater than ten.
If, on the other hand, you try to relate the arguments of relations by looking at dependency distance, then what you discover is that the vast majority of the arguments are very close neighbors in terms of dependency distance. So, about 47 percent of them are direct dependencies, and another 30 percent are at distance two. Take those together, and that's greater than three quarters of the dependencies that you want to find, and the numbers trail away quickly after that. So there are virtually no arguments of relations that aren't fairly close together in dependency distance, and it's precisely because of this that you can get a lot of mileage in doing relation extraction by having a representation like dependency syntax.

Okay, I hope that's given you some idea of why knowing about syntax is useful when you want to do various semantic tasks in natural language processing.
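To make the linear-versus-dependency distance comparison concrete, here is a small sketch on a shortened version of the KaiC sentence. The token positions and (untyped) dependency edges are hand-built for illustration; dependency distance is just the shortest path in the undirected dependency graph.

```python
from collections import deque

WORDS = ["The", "results", "demonstrated", "that", "KaiC",
         "interacts", "rhythmically", "with", "SasA"]
EDGES = [("demonstrated", "results"), ("results", "The"),
         ("demonstrated", "interacts"), ("interacts", "that"),
         ("interacts", "KaiC"), ("interacts", "rhythmically"),
         ("interacts", "SasA"), ("SasA", "with")]

def linear_distance(a, b):
    """Distance counted in surface word positions."""
    return abs(WORDS.index(a) - WORDS.index(b))

def dependency_distance(a, b):
    """Number of dependency arcs on the shortest undirected path (BFS)."""
    adj = {}
    for h, d in EDGES:
        adj.setdefault(h, set()).add(d)
        adj.setdefault(d, set()).add(h)
    seen, queue = {a}, deque([(a, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == b:
            return dist
        for nb in adj.get(node, ()):
            if nb not in seen:
                seen.add(nb)
                queue.append((nb, dist + 1))
    return None   # not connected

print(linear_distance("KaiC", "SasA"))       # 4 words apart on the surface
print(dependency_distance("KaiC", "SasA"))   # but only 2 arcs apart
```

Even in this toy sentence, the two relation arguments are twice as far apart linearly as they are in the dependency graph, which is the pattern the Björne et al. plot shows at scale.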