[luatex] own tokenizer

Wolfgang Jeltsch wolfgang at cs.ioc.ee
Thu Aug 1 15:09:37 CEST 2013


Hi,

according to my understanding of the LuaTeX reference, there are two
callbacks that you can use to influence the way input is turned into
token lists:

  • The process_input_buffer callback turns plain input lines into plain
    input lines.

  • The token_filter callback essentially turns token lists into token
    lists.

However, there seems to be no direct way of providing your own
translator from input lines to token lists; the actual tokenization
always seems to be done by LuaTeX itself.

I have thought about a solution to this problem: I would set the
catcodes of the characters \, ^^M, ^^@, %, and ^^? to “other”. This way,
LuaTeX’s tokenizer would generate a token from each single character.
Then I would implement a token_filter callback that recovers the input
characters from these single-character tokens and tokenizes them itself.
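
To make the idea concrete, here is a rough Lua sketch of the token_filter
part. It is untested inside LuaTeX and rests on my reading of the
reference: that a token is a table {command code, character code, cs id},
that token.get_next() fetches the next token, and that the filter is
called again when it returns nil or an empty table. The names make_filter
and my_tokenizer are made up for illustration, and my_tokenizer is only a
placeholder for the real tokenizer.

```lua
-- Sketch only. A token is modelled as {command code, character code, cs id};
-- this layout is an assumption taken from my reading of the reference.
local CR = 13  -- ^^M, which TeX appends to each input line by default

-- Hypothetical custom tokenizer: receives the character codes of one input
-- line and returns a list of tokens. This placeholder turns every character
-- into an "other" token (command code 12), so it changes nothing yet.
local function my_tokenizer(codes)
  local tokens = {}
  for i = 1, #codes do
    tokens[#tokens + 1] = {12, codes[i], 0}
  end
  return tokens
end

-- Build a filter around a token source (token.get_next inside LuaTeX).
-- It buffers one character code per fetched token until it sees ^^M and
-- then hands the complete line to the tokenizer.
local function make_filter(get_next, tokenizer)
  local codes = {}
  return function()
    local tok = get_next()
    if tok[2] ~= CR then
      codes[#codes + 1] = tok[2]
      return nil  -- nothing to deliver yet; the filter should be called again
    end
    local line = codes
    codes = {}
    return tokenizer(line)
  end
end

-- Register the filter only when running inside LuaTeX.
if callback and token then
  callback.register("token_filter", make_filter(token.get_next, my_tokenizer))
end
```

Outside LuaTeX the last block is skipped, so make_filter can be tried in a
plain Lua interpreter with a fake token source. The sketch does not handle
the end of the input.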

Is this approach sensible? Are there better ways of achieving what I
want?

Best wishes,
Wolfgang


