patgen problems?

Erik Frambach E.H.M.Frambach@eco.rug.nl
Wed, 9 Sep 1998 10:56:48 +0100


I think I may have found a bug in the Web2c-win32 version of Patgen.
Consider the following input files:

dic: 
---cut---
woor-den-boek
rot-zooi
ezels-oren
---cut---

tran:
---cut---
 1 1
 a A
 b B
 c C
 d D
 e E
 f F
 g G
 h H
 i I
 j J
 k K
 l L
 m M
 n N
 o O
 p P
 q Q
 r R
 s S
 t T
 u U
 v V
 w W
 x X
 y Y
 z Z
---cut---

patgen.in:
---cut---
1 5
1 4
1 2 10
1 4
2 1 4
1 5
1 1 1
1 6
3 2 1
1 8
1 1000 1
y

---cut---

Now if I enter `patgen dic nul output tran <patgen.in >log'
then `output' will contain:
---cut---
5198
51100
11551
11651
---cut---

and a file `pattmp' is generated that contains:
---cut---
119111111114421001011104298111101107
11411111642122111111105
10112210110811542111114101110
---cut---

Here's the tail of `log':
---cut---
4 good, 0 bad, 0 missed
100.00 %, 0.00 %, 0.00 %
0 patterns, 256 nodes in count trie, triec_max = 256
0 good and 27 bad patterns added
finding 0 good and 0 bad hyphens
pattern trie has 256 nodes, trie_max = 272, 8 outputs
0 nodes and 6 outputs deleted
total of 0 patterns at hyph_level 5
hyphenate word list? writing pattmp.53

4 good, 0 bad, 0 missed
100.00 %, 0.00 %, 0.00 %
---cut---

But the file `pattmp.53' is *not* written.

PATGEN, Version 2.0 (C version 6.1) on Unix generates the following `output':
---cut---
3b
3d
s3
t3
---cut---

and a file called `pattmp.5' containing:
---cut---
woor*den*boek
rot*zooi
ezels*oren
---cut---

Here's a diff of the log files of both versions; it may indicate where
the trouble comes from:
---cut---
1c1
< This is PATGEN, Version 2.3 (Web2c 7.2)
---
> This is PATGEN, Version 2.0 (C version 6.1)
107c107
< hyphenate word list? writing pattmp.53
---
> hyphenate word list? writing pattmp.5
---cut---
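
One observation that may help narrow this down: the digit strings in the
Win32 `output' and `pattmp' look like the decimal character codes of the
Unix results run together (51 98 is `3b', and 53 is `5', which would also
explain the `pattmp.53' name). The Python sketch below checks that reading.
It is only an illustration, not part of Patgen; the function name
decode_decimal_codes is made up here, and it assumes every character is
printable ASCII (codes 32..126), so that a leading `1' always starts a
three-digit code.

```python
def decode_decimal_codes(digits: str) -> str:
    """Split a run of decimal digits into printable-ASCII codes and decode it.

    Assumption: every code lies in 32..126. Codes 32..99 take two digits and
    codes 100..126 take three, so a leading '1' always means three digits.
    """
    chars = []
    i = 0
    while i < len(digits):
        width = 3 if digits[i] == "1" else 2
        code = int(digits[i:i + width])
        if not 32 <= code <= 126:
            raise ValueError(f"not a printable ASCII code: {code}")
        chars.append(chr(code))
        i += width
    return "".join(chars)


# Data copied from the Win32 run reported above.
win32_output = ["5198", "51100", "11551", "11651"]
win32_pattmp = [
    "119111111114421001011104298111101107",
    "11411111642122111111105",
    "10112210110811542111114101110",
]

print([decode_decimal_codes(s) for s in win32_output])
# ['3b', '3d', 's3', 't3']
print([decode_decimal_codes(s) for s in win32_pattmp])
# ['woor*den*boek', 'rot*zooi', 'ezels*oren']
print("pattmp." + decode_decimal_codes("53"))
# pattmp.5
```

If this reading is right, the Win32 binary computes the same patterns as
the Unix one but writes characters out as integers, both in the files and
in the `pattmp' file name suffix.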

Any ideas?

Greetings,
Erik Frambach