[texworks] Improving Syntax Highlighting

Chris Jefferson chris at bubblescope.net
Tue May 8 16:48:32 CEST 2012


On 08/05/12 11:42, Stefan Löffler wrote:
> Hi,
>
> On 2012-05-06 12:50, Chris Jefferson wrote:
>
>> This implies a number of limitations. The big one is no multi-line
>> user regular expressions, sorry. Specific multi-line things can be
>> custom written in C++ obviously.
>>
>> The things I would most like, in order of preference, are:
>>
>> 1) Matching of maths ( both $ $ and \( \) ).
>> 2) Ability to highlight specific \begin{x} ... \end{x} sections.
>> 3) Highlighting of parts of regular expressions (for example, in
>> \textbf{XYZ}, make the XYZ bold).
> What would be nice here would be some form of delimiter matching. E.g.,
> correctly match something like \section{A {B} C}. This doesn't work with
> reg-exps alone, but I recently found that Gtk-source-view
> (http://projects.gnome.org/gtksourceview/documentation.html) can do it.
> As I understand it, it includes the possibility to give two regular
> expressions: one for the beginning, and one for the end of the
> to-be-matched string. Since I guess something like that will be needed
> for \begin/\end section matching anyway, I thought I'd mention this.
> To that end, I guess we should think about supporting some more
> sophisticated configuration files in the long run (e.g., XML based).

I am thinking about your other comments, but one specific thought about 
this.

Perhaps rather than regular expressions, some kind of latex-aware 
tokeniser might be a better approach.

For example, given something like:

I like \textbf{Lots of $x$ and $y$ and \textit{z} }

This would be tokenised into something like the following (note: I would 
first need to go and look at what proper latex tokenisation looks like!):

'I' 'like' '\textbf' '{' 'Lots' 'of' '$' 'x' '$' 'and' '$' 'y' '$' 'and' 
'\textit' '{' 'z' '}' '}'
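
A rough sketch of such a tokeniser in Python, to make the idea concrete. 
The token pattern and the name `tokenise` are made up for illustration; 
this is a much-simplified grammar and not real TeX tokenisation (it 
ignores comments, catcodes, verbatim and so on).

```python
import re

# Hypothetical, simplified token grammar (an assumption, not real TeX rules).
TOKEN_RE = re.compile(
    r"""
    \\[A-Za-z]+      # control word: a backslash followed by letters
  | \\.              # control symbol: a backslash followed by one character
  | [{}$]            # group delimiters and the math-shift character
  | [^\\{}$\s]+      # a run of ordinary, non-space text
    """,
    re.VERBOSE,
)

def tokenise(line):
    """Split one line of LaTeX source into a flat list of token strings."""
    return TOKEN_RE.findall(line)

tokens = tokenise(r"I like \textbf{Lots of $x$ and $y$ and \textit{z} }")
```

On the example line this gives the same nineteen tokens as listed above, 
with whitespace dropped.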

Then keep a stack describing the current state, and as we scan along we 
'push' and 'pop' things on and off this stack. That would handle nested 
expressions nicely, and would (I believe) make things like not 
highlighting inside a verbatim easier.
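
The push/pop scan could look roughly like this in Python. Everything here 
is an assumption for illustration: the function name `scan`, the context 
names "group" and "math", and the rule that a command followed directly 
by '{' names the group it opens. It only handles well-formed input and 
only '$' for math.

```python
# The token list from the example, written out by hand.
TOKENS = ["I", "like", "\\textbf", "{", "Lots", "of", "$", "x", "$", "and",
          "$", "y", "$", "and", "\\textit", "{", "z", "}", "}"]

def scan(tokens):
    """Pair each token with the innermost construct it sits inside (or None)."""
    stack = []      # names of the constructs that are currently open
    pending = None  # the command seen immediately before, if any
    result = []
    for tok in tokens:
        opened = False
        if tok == "{":
            # A group is named after the command that introduced it.
            stack.append(pending or "group")
            opened = True
        elif tok == "$" and (not stack or stack[-1] != "math"):
            stack.append("math")
            opened = True
        result.append((tok, stack[-1] if stack else None))
        if not opened and stack:
            if tok == "}" and stack[-1] != "math":
                stack.pop()   # close the innermost group
            elif tok == "$":
                stack.pop()   # a second '$' leaves math mode
        pending = tok if tok.startswith("\\") else None
    return result

contexts = scan(TOKENS)
```

Here 'x' comes out tagged as "math", 'z' as inside \textit, and 'Lots' as 
inside \textbf, which is the information a highlighter would need.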

In this mode, rather than giving a regular expression, you would state 
how you wanted text inside (for example) a textbf, math mode, or a 
tabular to be formatted. You could also state how classes of tokens 
(numbers, {}, \commands) were coloured.
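
As a sketch of what such a configuration might look like, here is one 
possible shape in Python. The table names, style keys and colours are all 
invented for illustration, and letting a token-class rule override the 
surrounding context is just one choice among several.

```python
import re

# Hypothetical styles applied to everything inside a given context.
CONTEXT_STYLES = {
    "\\textbf": {"bold": True},
    "\\textit": {"italic": True},
    "math": {"colour": "darkgreen"},
}

# Hypothetical styles for classes of tokens; the first matching rule wins.
CLASS_STYLES = [
    (re.compile(r"\\[A-Za-z]+"), {"colour": "blue"}),  # commands
    (re.compile(r"[{}]"), {"colour": "grey"}),         # braces
    (re.compile(r"\d+"), {"colour": "red"}),           # numbers
]

def style_for(token, contexts):
    """Combine context styles (outermost first) with any token-class style."""
    style = {}
    for ctx in contexts:
        style.update(CONTEXT_STYLES.get(ctx, {}))
    for pattern, class_style in CLASS_STYLES:
        if pattern.fullmatch(token):
            style.update(class_style)  # assumption: token class overrides context
            break
    return style

nested = style_for("z", ["\\textbf", "\\textit"])
```

With these tables, 'z' nested inside both \textbf and \textit comes out 
bold and italic, while a number in math mode takes the number colour 
rather than the math colour.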

The biggest problem with this is that it would be totally different to 
what came before, and would be very latex-dependent. I (for example) 
don't know what is up in the world of luatex, and other tex variants.

I might have a play with this, and see what it looks like and how the 
code looks.

