patgen problems?
Erik Frambach
E.H.M.Frambach@eco.rug.nl
Wed, 9 Sep 1998 10:56:48 +0100
I think I may have found a bug in the Web2c-win32 version of Patgen.
Consider the following input files:
dic:
---cut---
woor-den-boek
rot-zooi
ezels-oren
---cut---
tran:
---cut---
1 1
a A
b B
c C
d D
e E
f F
g G
h H
i I
j J
k K
l L
m M
n N
o O
p P
q Q
r R
s S
t T
u U
v V
w W
x X
y Y
z Z
---cut---
patgen.in:
---cut---
1 5
1 4
1 2 10
1 4
2 1 4
1 5
1 1 1
1 6
3 2 1
1 8
1 1000 1
y
---cut---
Now if I enter `patgen dic nul output tran <patgen.in >log'
then `output' will contain:
---cut---
5198
51100
11551
11651
---cut---
and a file `pattmp' is generated that contains:
---cut---
119111111114421001011104298111101107
11411111642122111111105
10112210110811542111114101110
---cut---
Here's the tail of `log':
---cut---
4 good, 0 bad, 0 missed
100.00 %, 0.00 %, 0.00 %
0 patterns, 256 nodes in count trie, triec_max = 256
0 good and 27 bad patterns added
finding 0 good and 0 bad hyphens
pattern trie has 256 nodes, trie_max = 272, 8 outputs
0 nodes and 6 outputs deleted
total of 0 patterns at hyph_level 5
hyphenate word list? writing pattmp.53
4 good, 0 bad, 0 missed
100.00 %, 0.00 %, 0.00 %
---cut---
But the file `pattmp.53' is *not* written.
By contrast, PATGEN, Version 2.0 (C version 6.1) on Unix generates as `output':
---cut---
3b
3d
s3
t3
---cut---
and a file called `pattmp.5' containing:
---cut---
woor*den*boek
rot*zooi
ezels*oren
---cut---
Here's a diff of the logfiles of both versions; it may indicate where
the trouble comes from:
---cut---
1c1
< This is PATGEN, Version 2.3 (Web2c 7.2)
---
> This is PATGEN, Version 2.0 (C version 6.1)
107c107
< hyphenate word list? writing pattmp.53
---
> hyphenate word list? writing pattmp.5
---cut---
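One possible clue: the Win32 results look like the Unix results with each
character replaced by its decimal character code. For example `3b' would
become 51 followed by 98, i.e. `5198', and the suffix `5' would become `53'.
That would be consistent with characters being written out as integers
somewhere, though this is only a guess at the cause. The small Python sketch
below is not part of patgen, and the helper name as_decimal_codes is made up
for illustration; it simply checks the guess against the data quoted above.

```python
def as_decimal_codes(text: str) -> str:
    """Concatenate the decimal character code of every character in text."""
    return "".join(str(ord(ch)) for ch in text)


# Data copied from the listings in this message.
unix_output = ["3b", "3d", "s3", "t3"]
win32_output = ["5198", "51100", "11551", "11651"]

unix_pattmp = ["woor*den*boek", "rot*zooi", "ezels*oren"]
win32_pattmp = [
    "119111111114421001011104298111101107",
    "11411111642122111111105",
    "10112210110811542111114101110",
]

# Each comparison is True if the Win32 text is the Unix text written
# as decimal character codes.
output_matches = [as_decimal_codes(s) for s in unix_output] == win32_output
pattmp_matches = [as_decimal_codes(s) for s in unix_pattmp] == win32_pattmp
suffix_matches = "pattmp." + as_decimal_codes("5") == "pattmp.53"

print(output_matches, pattmp_matches, suffix_matches)
```

All three comparisons hold for the data shown here, including the
`pattmp.53' name that appears in the log.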
Any ideas?
Greetings,
Erik Frambach