Dear Mojca,<br><br>I have looked at the material. Currently I am halfway through Liang's thesis. My first impression is that one needs to be able to compose a program in binary in order to solve this.<br><br>The three Turkish/Basque files give some hope. I noticed that they are all in Ruby. Is that necessary? Could they be in Perl? I have some minimal Perl exposure. Also, the files deal with hyphenation, whereas Lao doesn't hyphenate. We need to identify where word breaks can and can't occur. I know the rules, but translating them into Ruby or any other computer language is another story.<br>
<br>I am not sure how to move forward. Also, some of the characters will not display properly on their own; is it better to write the Unicode numbers? <br><br>Brian<br><br><div class="gmail_quote">On Mon, May 3, 2010 at 10:46 PM, Mojca Miklavec <span dir="ltr"><<a href="mailto:mojca.miklavec.lists@gmail.com">mojca.miklavec.lists@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;"><div class="im">On Thu, Apr 29, 2010 at 06:53, Brian Wilson wrote:<br>
> It seems that I may have reinvented the wheel (and created an inferior<br>
> model).<br>
><br>
> For a pdf explanation of Lao syllabification check this link<br>
> <a href="http://www.tcllab.org/events/uploads/valaxay-lao.pdf" target="_blank">http://www.tcllab.org/events/uploads/valaxay-lao.pdf</a><br>
> Thank you,<br>
> Brian Wilson<br>
<br>
</div>Thanks a lot for the link.<br>
<br>
I would not dare to create patterns myself (I would need to study the<br>
letters, their encoding and the rules in more detail and install the<br>
appropriate fonts), but my suggestion would be ...<br>
<br>
1.) Do you know how the hyphenation algorithm works? If you want, I<br>
can send you some links and some material that I have on my computer.<br>
<br>
2.) Your example calls for rule-based patterns.<br>
<br>
Here are some examples of how such patterns are being generated (they<br>
are only of help once you understand what's under point "1"):<br>
<br>
<a href="http://tug.org/svn/texhyphen/trunk/hyph-utf8/source/generic/hyph-utf8/languages/tk/generate_patterns_tk.rb" target="_blank">http://tug.org/svn/texhyphen/trunk/hyph-utf8/source/generic/hyph-utf8/languages/tk/generate_patterns_tk.rb</a><br>
<br>
<a href="http://tug.org/svn/texhyphen/trunk/hyph-utf8/source/generic/hyph-utf8/languages/tr/generate_patterns_tr.rb" target="_blank">http://tug.org/svn/texhyphen/trunk/hyph-utf8/source/generic/hyph-utf8/languages/tr/generate_patterns_tr.rb</a><br>
<br>
<a href="http://tug.org/svn/texhyphen/trunk/hyph-utf8/source/generic/hyph-utf8/languages/eu/generate_patterns_eu.rb" target="_blank">http://tug.org/svn/texhyphen/trunk/hyph-utf8/source/generic/hyph-utf8/languages/eu/generate_patterns_eu.rb</a><br>
<br>
They all start with something like ...<br>
# h is not here.<br>
consonants=%w{b c d f g j k l m n ñ p q r s t v w x y z}<br>
# Open vowels: a e o<br>
vop=%w{a e o}<br>
# Closed vowels: i u<br>
vcl=%w{i u}<br>
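<br>
As a rough sketch of how lists like these might be expanded into patterns: the single rule below (allow a break before every consonant + vowel pair) and the helper name cv_patterns are assumptions made for illustration, not the rules used by the linked scripts.<br>

```ruby
# encoding: utf-8
# Illustrative only: the rule used here (allow a break before every
# consonant + vowel pair) is an assumption, not taken from the linked scripts.
consonants = %w{b c d f g j k l m n ñ p q r s t v w x y z}
vop = %w{a e o}   # open vowels
vcl = %w{i u}     # closed vowels
vowels = vop + vcl

# In TeX pattern notation an odd digit marks a permitted break,
# so "1ba" permits a break before "ba".
# cv_patterns is a hypothetical helper name.
def cv_patterns(consonants, vowels)
  consonants.product(vowels).map { |c, v| "1#{c}#{v}" }
end

patterns = cv_patterns(consonants, vowels)
puts patterns.first(5).join(" ")
```

For a script without hyphenation, the same idea would presumably mark permitted break points between syllables instead; the character lists would have to come from the language's own rules.<br>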
<br>
Maybe Arthur would be interested in exotic scripts, but it's best if<br>
you get a head start with a few simple patterns, and then we<br>
can help you when you reach a step where you don't know how to proceed.<br>
<font color="#888888"><br>
Mojca<br>
</font></blockquote></div><br><br clear="all"><br>-- <br>Brian Wilson, Director<br>Asia-Pacific International University Translation Center<br>_____________<br><br>I have a new blog!! <a href="http://tc4asia.org/wpblog">http://tc4asia.org/wpblog</a><br>
<br>"He hath shewed thee, O man, what is good; and what doth the LORD require of thee , but to do justly, and to love mercy, and to walk humbly with thy God." Micah 6:8<br>