[tex-hyphen] Hyphenation algorithm in Lua

Stephan Hennig sh-list at posteo.net
Fri Aug 28 21:35:27 CEST 2020


On 26.08.20 at 18:45, Keno Wehr wrote:

> LuaTeX provides a “hyphenate” callback that allows replacing TeX's 
> hyphenation routine (LuaTeX manual, p. 175).
> This callback is used by the luatexko and luavlna packages, which 
> however do not use it for real hyphenation.
> I would like to get an idea of how this callback can be used for hyphenation.
> Is there any known Lua implementation of TeX's or any other hyphenation 
> algorithm?
> What is the preferred way to access the hyphenation patterns within Lua?

Have a look at <URL:https://github.com/sh2d/padrinoma>.  The repository 
contains some Lua/texlua and LuaLaTeX examples in the `examples/` 
directory.  The `patternize.lua` command-line tool might be something 
to look at first (run it with the argument `--help`).  See the 
`MANIFEST` files for the other examples.  Installation instructions can 
be found in file `examples/README`.

Warning: For the sake of flexibility (I had applications other than 
hyphenation in mind), an OOP approach has been taken to pattern handling. 
Therefore, the code might look more complex than needed.

Best regards,
Stephan Hennig


> $ texlua patternize.lua -p en-us -v <<< hyphenation
> pattern file: /usr/share/texlive/texmf-dist/tex/generic/hyph-utf8/patterns/txt/hyph-en-us.pat.txt (4938 patterns read)
> spot mins, special characters: 2 2 '-=.'
> 
>  . h y p h e n a t i o n .
>    h y3p h
>          h e2n
>             1n a
>          h e n a4
>              n2a t
>          h e n5a t
>                   2i o
>                 1t i o
>                      o2n
>  .0h0y3p0h0e2n5a4t2i0o0n0.
> hy-phen-ation
> $ texlua patternize.lua -p en-gb -v <<< hyphenation
> pattern file: /usr/share/texlive/texmf-dist/tex/generic/hyph-utf8/patterns/txt/hyph-en-gb.pat.txt (8527 patterns read)
> spot mins, special characters: 2 2 '-=.'
> 
>  . h y p h e n a t i o n .
>    h2y
>     2y p
>      y2p h
>    h y3p h
>          h e2
>        p h e4
>      y p h4e
>              n1a
>              n a t i4
>                   2i o2
>                 1t i o
>  .0h0y3p0h4e4n1a1t2i4o0n0.
> hy-phen-a-tion
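
The traces above can be read as follows: every pattern that matches a 
substring of `.hyphenation.` contributes its digits at the matching 
inter-letter positions, the maximum digit wins at each position, and an 
odd result marks a permitted break.  Below is a minimal sketch of that 
combination step.  It is written in Python for illustration, not in Lua, 
and it is not padrinoma's code.  The function names are made up, the 
pattern list contains only the nine en-us patterns shown in the trace 
(a real run loads the whole pattern file), and the defaults of 2 and 2 
are assumed to correspond to the "spot mins" reported by the tool.

```python
# The nine patterns that matched in the en-us trace above.
PATTERNS = ["hy3ph", "he2n", "1na", "hena4", "n2at", "hen5at",
            "2io", "1tio", "o2n"]


def parse_pattern(pattern):
    """Split e.g. 'hy3ph' into ('hyph', [0, 0, 3, 0, 0])."""
    letters = "".join(ch for ch in pattern if not ch.isdigit())
    levels = [0] * (len(letters) + 1)
    pos = 0  # number of letters seen so far
    for ch in pattern:
        if ch.isdigit():
            levels[pos] = int(ch)
        else:
            pos += 1
    return letters, levels


def levels_for(word, patterns=PATTERNS, boundary="."):
    """Return the maximum level at every position of '.word.'.

    levels[i] is the level in front of character i of the padded word.
    """
    padded = boundary + word.lower() + boundary
    levels = [0] * (len(padded) + 1)
    for pattern in patterns:
        letters, pat_levels = parse_pattern(pattern)
        start = padded.find(letters)
        while start != -1:
            for offset, level in enumerate(pat_levels):
                levels[start + offset] = max(levels[start + offset], level)
            start = padded.find(letters, start + 1)
    return levels


def hyphenate(word, patterns=PATTERNS, left_min=2, right_min=2):
    """Insert '-' wherever the combined level is odd, respecting the mins."""
    levels = levels_for(word, patterns)
    pieces, last = [], 0
    for k in range(left_min, len(word) - right_min + 1):
        # A break in front of word[k] sits at padded index k + 1.
        if levels[k + 1] % 2 == 1:
            pieces.append(word[last:k])
            last = k
    pieces.append(word[last:])
    return "-".join(pieces)


print(hyphenate("hyphenation"))
```

With the nine en-us patterns this prints `hy-phen-ation`, matching the 
first trace.  Passing the eleven patterns from the en-gb trace as the 
second argument gives `hy-phen-a-tion`.  The linear scan over all 
patterns is only practical for a handful of patterns; real 
implementations usually store the patterns in a trie or hash table.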

