Raw document
the plane can fly . the typical plane can see the plane . a typical fly can see . who might see ? the large can might see a can . the can can destroy a large can . who might see ? who might fly ? who can fly ? the can might see . the plane can fly a typical fly . who can fly ? the
…
…
Sentence splitting
./../sentsplit.pl example0.train example0.sentences
the plane can fly .
the typical plane can see the plane .
a typical fly can see .
who might see ?
the large can might see a can .
the can can destroy a large can .
who might see ?
…
…
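The splitting step can be sketched in a few lines of Python. This is not the code of sentsplit.pl, whose source is not shown here. It assumes the input is already whitespace-tokenized and that "." and "?" are the only sentence terminators, as in the sample above. The function name split_sentences is hypothetical.

```python
def split_sentences(text, terminators=(".", "?")):
    """Split a whitespace-tokenized string into sentences.

    ASSUMPTION: a sentence ends at a standalone "." or "?" token.
    """
    sentences, current = [], []
    for token in text.split():
        current.append(token)
        if token in terminators:
            sentences.append(" ".join(current))
            current = []
    if current:  # keep a trailing fragment that has no terminator
        sentences.append(" ".join(current))
    return sentences


raw = ("the plane can fly . the typical plane can see the plane . "
       "a typical fly can see . who might see ?")
for sentence in split_sentences(raw):
    print(sentence)
```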
Tagged data
the/at plane/nn can/md fly/vb ./.
the/at typical/jj plane/nn can/md see/vb the/at plane/nn ./.
a/at typical/jj fly/nn can/md see/vb ./.
who/wps might/md see/vb ?/.
the/at large/jj can/nn might/md see/vb a/at can/nn ./.
the/at can/nn can/md destroy/vb a/at large/jj can/nn ./.
who/wps might/md see/vb ?/.
…
…
Data indexing
./../create_key.pl words.key < example0.sentences > example0.seq
Words
1 the
6 typical
3 can
8 a
9 who
13 destroy
7 see
2 plane
11 ?
10 might
5 .
12 large
4 fly
Tags
6 jj
7 wps
2 nn
3 md
1 at
4 vb
5 .
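The indices in both listings are consistent with numbering each word or tag by its first appearance in the data, starting at 1. The sketch below reproduces that numbering under this assumption. It is not the code of create_key.pl, and the names build_key and encode are hypothetical.

```python
def build_key(tokens, key=None):
    """Assign 1-based indices in first-seen order (ASSUMED numbering scheme)."""
    key = {} if key is None else key
    for token in tokens:
        if token not in key:
            key[token] = len(key) + 1
    return key


def encode(tokens, key):
    """Replace each token by its index; raises KeyError for unseen tokens."""
    return [key[token] for token in tokens]


corpus = ("the plane can fly . the typical plane can see the plane . "
          "a typical fly can see . who might see ? "
          "the large can might see a can . the can can destroy a large can .")
words_key = build_key(corpus.split())
print(words_key)
print(encode("the can can destroy the typical fly .".split(), words_key))
```

With the sample sentences shown earlier, the resulting key matches the word listing above, and the test sentence used later encodes to 1 3 3 13 1 6 4 5.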
Frequency counts
./../pretrain.pl example0.all lex ngram
Number of times each (word type, POS tag) pair occurs in the training set
plane nn 34
a at 58
see vb 45
? . 57
typical jj 25
large jj 22
destroy vb 9
can md 58
might md 42
can nn 39
fly nn 20
who wps 57
fly vb 46
. . 43
the at 35
Occurrence counts of POS-tag unigrams and bigrams in the training set
md 100
wps 57
at 93
. 100
nn 93
vb 100
jj 47
vb . 50
wps md 57
at jj 47
nn . 50
nn md 43
vb at 50
at nn 46
md vb 100
jj nn 47
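The three kinds of counts can be gathered from word/tag sentences as sketched below. This is an illustration, not the code of pretrain.pl. It assumes bigrams are counted only inside a sentence, which matches the listing above (no bigram starts with the "." tag). The function name count_tagged is hypothetical, and the four sentences are the first four of the tagged sample, so the numbers differ from the full-corpus counts.

```python
from collections import Counter


def count_tagged(sentences):
    """Return (lex, unigram, bigram) counters for word/tag sentences."""
    lex, unigram, bigram = Counter(), Counter(), Counter()
    for sentence in sentences:
        # rsplit keeps tokens such as "./." and "?/." intact
        pairs = [token.rsplit("/", 1) for token in sentence.split()]
        tags = [tag for _, tag in pairs]
        lex.update((word, tag) for word, tag in pairs)
        unigram.update(tags)
        # ASSUMPTION: tag bigrams do not cross sentence boundaries
        bigram.update(zip(tags, tags[1:]))
    return lex, unigram, bigram


tagged = [
    "the/at plane/nn can/md fly/vb ./.",
    "the/at typical/jj plane/nn can/md see/vb the/at plane/nn ./.",
    "a/at typical/jj fly/nn can/md see/vb ./.",
    "who/wps might/md see/vb ?/.",
]
lex, unigram, bigram = count_tagged(tagged)
print(lex[("plane", "nn")], unigram["md"], bigram[("nn", "md")])
```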
Model training
./../hmmtrain.pl words.key pos.key ngram lex example.hmm
M= 13
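One plausible way to turn the counts into HMM parameters is plain relative-frequency (maximum-likelihood) estimation: a transition probability is a tag-bigram count divided by the count of the first tag, and an emission probability is a word/tag count divided by the tag count. Whether hmmtrain.pl does exactly this, or applies smoothing, is an assumption. The sketch below uses the counts from the listings above, and the function names transition and emission are hypothetical.

```python
# Counts copied from the ngram and lex listings above.
UNIGRAM = {"md": 100, "wps": 57, "at": 93, ".": 100, "nn": 93, "vb": 100, "jj": 47}
BIGRAM = {("vb", "."): 50, ("wps", "md"): 57, ("at", "jj"): 47, ("nn", "."): 50,
          ("nn", "md"): 43, ("vb", "at"): 50, ("at", "nn"): 46, ("md", "vb"): 100,
          ("jj", "nn"): 47}
LEX = {("plane", "nn"): 34, ("a", "at"): 58, ("see", "vb"): 45, ("?", "."): 57,
       ("typical", "jj"): 25, ("large", "jj"): 22, ("destroy", "vb"): 9,
       ("can", "md"): 58, ("might", "md"): 42, ("can", "nn"): 39,
       ("fly", "nn"): 20, ("who", "wps"): 57, ("fly", "vb"): 46,
       (".", "."): 43, ("the", "at"): 35}


def transition(prev_tag, next_tag):
    """P(next_tag | prev_tag) by relative frequency (ASSUMED: no smoothing)."""
    return BIGRAM.get((prev_tag, next_tag), 0) / UNIGRAM[prev_tag]


def emission(tag, word):
    """P(word | tag) by relative frequency (ASSUMED: no smoothing)."""
    return LEX.get((word, tag), 0) / UNIGRAM[tag]


print(transition("nn", "md"))  # 43 / 93
print(emission("md", "can"))   # 58 / 100
print(emission("nn", "can"))   # 39 / 93
```

Under this estimate the word "can" is ambiguous between md and nn, and the transition probabilities are what separate the two readings.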
Tagging
Test data
the can can destroy the typical fly .
Indexed
T= 8
1 3 3 13 1 6 4 5
Prediction
./../testvit example.hmm example0.test
------------------------------------
the/at can/wps can/vb destroy/md the/at typical/nn fly/wps ./jj
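For reference, Viterbi decoding over relative-frequency estimates can be sketched as follows. This is an independent illustration, not the code of testvit, and it does not attempt to reproduce the output above. The counts come from the lex and ngram listings. The sentence-initial counts in START are not listed in the source: they are an assumption derived by treating every wps, and every at that does not follow vb, as the first tag of a sentence. The function names log_p and viterbi are hypothetical.

```python
import math

# Counts copied from the lex and ngram listings.
UNIGRAM = {"md": 100, "wps": 57, "at": 93, ".": 100, "nn": 93, "vb": 100, "jj": 47}
BIGRAM = {("vb", "."): 50, ("wps", "md"): 57, ("at", "jj"): 47, ("nn", "."): 50,
          ("nn", "md"): 43, ("vb", "at"): 50, ("at", "nn"): 46, ("md", "vb"): 100,
          ("jj", "nn"): 47}
LEX = {("plane", "nn"): 34, ("a", "at"): 58, ("see", "vb"): 45, ("?", "."): 57,
       ("typical", "jj"): 25, ("large", "jj"): 22, ("destroy", "vb"): 9,
       ("can", "md"): 58, ("might", "md"): 42, ("can", "nn"): 39,
       ("fly", "nn"): 20, ("who", "wps"): 57, ("fly", "vb"): 46,
       (".", "."): 43, ("the", "at"): 35}
# ASSUMPTION: sentence-initial tag counts, derived rather than given in the source.
START = {"at": 43, "wps": 57}


def log_p(count, total):
    """Log relative frequency; zero counts map to -inf (no smoothing)."""
    return math.log(count / total) if count else float("-inf")


def viterbi(words):
    """Return the highest-scoring tag sequence for a list of words."""
    tags = list(UNIGRAM)
    n_sentences = sum(START.values())
    score = {t: log_p(START.get(t, 0), n_sentences)
                + log_p(LEX.get((words[0], t), 0), UNIGRAM[t]) for t in tags}
    back = []  # back[k][t] = best predecessor of tag t at position k + 1
    for word in words[1:]:
        new_score, pointer = {}, {}
        for t in tags:
            prev = max(tags, key=lambda p: score[p]
                       + log_p(BIGRAM.get((p, t), 0), UNIGRAM[p]))
            pointer[t] = prev
            new_score[t] = (score[prev]
                            + log_p(BIGRAM.get((prev, t), 0), UNIGRAM[prev])
                            + log_p(LEX.get((word, t), 0), UNIGRAM[t]))
        score = new_score
        back.append(pointer)
    path = [max(tags, key=lambda t: score[t])]
    for pointer in reversed(back):  # follow back-pointers from the last word
        path.append(pointer[path[-1]])
    return path[::-1]


sentence = "the can can destroy the typical fly .".split()
print(" ".join(f"{w}/{t}" for w, t in zip(sentence, viterbi(sentence))))
```

Working in log space avoids underflow on long inputs, and mapping zero counts to negative infinity simply rules out unseen transitions and emissions.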
For unlabeled data:
Data indexing
./../create_key.pl words.key < example0.sentences > example0.seq
Words
1 the
6 typical
3 can
8 a
9 who
13 destroy
7 see
2 plane
11 ?
10 might
5 .
12 large
4 fly
Indexed document
T= 590
1 2 3 4 5 1 6 2 3 7 1 2 5 8 6 4 3 7 5 9 10 7 11 1 12 3 10 7 8 3 5 1 3 3 13 8 12 3 5 9 10 7 11 9 10 4 11 9 3 4 11 1 3 10 7 5 1 2 3 4 8 6 4 5 9 3 4 11 1 12 4 3 4 5 9 3 7 11 9 3 7 8 3 11 1 2 3 7 1 6 3 5 9 3 7 11 8 2 3 7 5 9 3 7 8 12 2 11 9 10 13 8 6 3 11 9 3 7 11 9 10 7 11 9 10 4 11 9 10 7 8 4 11 1 2 3 4 1 2 5 1 6 2 10 4 8 2 5 9 10 4 8 12 2 11 9 3 4 8 12 4 11 9 10 4 11 9 3 7 8 4 11 9 3 4 11 1 2 10 4 8 2 5 9 10 7 1 3 11 8 12 3 10 4 5 1 2 3 7 8 12 4 5 9 3 13 8 12 4 11 9 3 7 8 2 11 8 12 2 3 4 5 9 10 7 11 9 3 4 8 3 11 8 12 3 10 7 5 8 6 3 3 7 1 3 5 9 3 13 8 6 3 11 9 3 4 11 8 6 3 3 4 1 12 2 5 8 4 3 4 8 2 5 9 3 4 8 2 11 9 10 13 1 3 11 1 6 2 3 4 8 12 2 5 1 6 4 3 7 1 12 3 5 9 10 4 8 2 11 9 10 4 11 9 3 7 8 12 4 11 1 6 4 3 13 8 12 2 5 9 3 4 8 3 11 8 6 3 3 7 5 8 6 4 10 4 5 9 3 7 1 2 11 9 3 4 1 12 2 11 1 4 10 4 8 6 3 5 9 3 7 8 2 11 9 10 7 8 4 11 8 3 10 4 5 9 3 7 11 9 10 7 11 9 10 7 11 8 2 3 4 5 9 10 4 8 3 11 8 12 3 10 7 5 9 10 7 11 8 12 3 3 13 8 3 5 8 6 3 10 7 1 3 5 9 10 7 11 1 6 3 3 4 5 9 10 7 8 6 4 11 1 6 4 3 4 5 9 3 4 11 8 4 3 7 8 6 3 5 8 2 10 4 5 9 10 7 11 8 6 3 10 4 8 2 5 9 3 7 1 3 11 8 12 3 3 7 5 9 3 4 8 6 3 11 9 10 4 11 9 10 4 11 9 3 4 8 6 3 11 9 10 4 8 12 2 11 9 3 4 11 9 10 4 1 6 3 11 1 3 3 13 1 6 4 5 9 3 7 11 8 2 3 13 1 3 5 9 10 4 11 9 3 7 11 1 12 2 3 7 5 8 4 3 7 5 8 2 10 4 5 8 3 10 7 5 9 3 4 11
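The sequence files shown in these notes consist of a length header followed by space-separated indices. A small reader and writer for that layout might look like the sketch below. The layout is assumed from the samples in these notes (a "T= n" line, then the indices), and the names format_seq and parse_seq are hypothetical.

```python
def format_seq(indices):
    """Render indices with a 'T= n' header (ASSUMED layout, taken from the samples)."""
    return f"T= {len(indices)}\n" + " ".join(str(i) for i in indices) + "\n"


def parse_seq(text):
    """Parse a 'T= n' header plus indices and check that the length matches."""
    header, _, body = text.partition("\n")
    name, _, value = header.partition("=")
    if name.strip() != "T":
        raise ValueError(f"unexpected header: {header!r}")
    indices = [int(field) for field in body.split()]
    if len(indices) != int(value):
        raise ValueError(f"header says {int(value)} symbols, found {len(indices)}")
    return indices


print(parse_seq("T= 8\n1 3 3 13 1 6 4 5\n"))
print(format_seq([1, 3, 3, 13, 1, 6, 4, 5]), end="")
```

Checking the declared length against the actual number of indices catches truncated or concatenated files early.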
Training
./../esthmm -N 7 -M 13 example0.seq > example0.hmm
M= 13
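Without tags, the parameters have to be estimated from the index sequence alone. The sketch below shows one Baum-Welch (EM) re-estimation step with scaling, under the assumption that esthmm performs this kind of training with N hidden states and M observation symbols. It is not the tool's code. The random initialization, the seed, and the names normalize, random_hmm and baum_welch_step are hypothetical, and for brevity only the first 23 indices of the sequence above are used.

```python
import math
import random


def normalize(row):
    total = sum(row)
    return [x / total for x in row]


def random_hmm(n_states, n_symbols, seed=0):
    """Random row-stochastic A (N x N), B (N x M) and pi (N); ASSUMED initialization."""
    rng = random.Random(seed)

    def row(k):
        return normalize([rng.random() + 0.1 for _ in range(k)])

    return ([row(n_states) for _ in range(n_states)],
            [row(n_symbols) for _ in range(n_states)],
            row(n_states))


def baum_welch_step(obs, A, B, pi):
    """One EM step on 0-based symbols; returns (A, B, pi, log-likelihood of old model)."""
    N, M, T = len(A), len(B[0]), len(obs)
    alpha, scale = [], []
    for t in range(T):  # scaled forward pass
        if t == 0:
            raw = [pi[i] * B[i][obs[0]] for i in range(N)]
        else:
            raw = [sum(alpha[t - 1][i] * A[i][j] for i in range(N)) * B[j][obs[t]]
                   for j in range(N)]
        scale.append(sum(raw))
        alpha.append([x / scale[t] for x in raw])
    beta = [[1.0] * N for _ in range(T)]
    for t in range(T - 2, -1, -1):  # scaled backward pass
        beta[t] = [sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j]
                       for j in range(N)) / scale[t + 1] for i in range(N)]
    # gamma[t][i]: posterior probability of being in state i at time t
    gamma = [[alpha[t][i] * beta[t][i] for i in range(N)] for t in range(T)]
    new_A = [[0.0] * N for _ in range(N)]
    for t in range(T - 1):  # accumulate expected transition counts
        for i in range(N):
            for j in range(N):
                new_A[i][j] += (alpha[t][i] * A[i][j] * B[j][obs[t + 1]]
                                * beta[t + 1][j] / scale[t + 1])
    new_B = [[0.0] * M for _ in range(N)]
    for t in range(T):  # accumulate expected emission counts
        for i in range(N):
            new_B[i][obs[t]] += gamma[t][i]
    log_lik = sum(math.log(c) for c in scale)
    return [normalize(r) for r in new_A], [normalize(r) for r in new_B], gamma[0], log_lik


seq = [1, 2, 3, 4, 5, 1, 6, 2, 3, 7, 1, 2, 5, 8, 6, 4, 3, 7, 5, 9, 10, 7, 11]
obs = [s - 1 for s in seq]  # the key is 1-based; the sketch indexes from 0
A, B, pi = random_hmm(7, 13)
history = []
for _ in range(5):
    A, B, pi, log_lik = baum_welch_step(obs, A, B, pi)
    history.append(log_lik)
print(history)
```

Each step should not decrease the log-likelihood of the training sequence. The hidden states learned this way are anonymous, so they need not line up with the seven POS tags.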
Test
the can can destroy the typical fly .
T= 8
./../testvit example0.hmm example0.test
Viterbi using direct probabilities