[texworks] ver 567 Scripts - Carriage/Line returns now coming back as ??

Paul A Norman paul.a.norman at gmail.com
Wed Mar 17 03:03:48 CET 2010

Dear Stefan,

Just had a repeat - so here is the information.

It's speech and quote marks, em dashes and en dashes, when copy/pasting
from Open Office 3.1.1 ver 9420.

Worth mentioning, as Open Office is becoming more common as a draft editor.

Does this relate? When I reopen one of my master documents, it has a call for

\usepackage[utf8]{inputenc} % set input encoding (not needed with XeLaTeX)

As per the document template.

I get a message box when the document opens complaining that UTF--8 is
not supported and that it will be interpreted as UTF-8.

What is the difference between UTF (two dashes) 8 and UTF (one dash) 8?
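
As a possible stopgap for the pasted characters listed above, a script could map them to plain-ASCII TeX input before anything else touches the text. This is only a minimal sketch: the function name texifyPunctuation is hypothetical, it is not part of TeXworks or its scripting API, and how the text reaches the script is deliberately left out. It sticks to older-style JavaScript so it should also be acceptable to QtScript.

```javascript
// Hypothetical helper, not part of TeXworks: converts typographic quotes
// and dashes (as typically pasted from a word processor) to ASCII TeX input.
function texifyPunctuation(txt) {
    return txt
        .replace(/\u201C/g, "``")   // left double quotation mark
        .replace(/\u201D/g, "''")   // right double quotation mark
        .replace(/\u2018/g, "`")    // left single quotation mark
        .replace(/\u2019/g, "'")    // right single quotation mark / apostrophe
        .replace(/\u2014/g, "---")  // em dash
        .replace(/\u2013/g, "--");  // en dash
}
```

Whether this is wanted depends on the engine: with a UTF-8 aware setup the original characters may be perfectly fine in the source, so treat it as a workaround rather than a fix.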


On 17 March 2010 12:15, Paul A Norman <paul.a.norman at gmail.com> wrote:
> Thanks Stefan,
> Leaving other things aside, knowing that they are in the list for fixing...
> The one marker that comes to mind straight away was the big P
> paragraph marker which in LaTeX would be \P. ¶ U+00B6 (182)
> But again I think that this was all from pasting from a draft
> word-processor, as you say different encoding.
> I'll post when I get time to try and reproduce these things -- but
> suspect now all the other stuff has to do with potential encoding
> differences between editing engines.
> Thanks,
> paul
> On 17 March 2010 00:04, Stefan Löffler <st.loeffler at gmail.com> wrote:
>> Hi Paul,
>> Am 2010-03-16 11:55, schrieb Paul A Norman:
>>> So I'm best to convert to QtScript, I don't have enough experience in
>>> Lua for regular expressions but I can do it in JS or QtScript no
>>> problem.
>> This is entirely up to you, of course. Once this is fixed, it shouldn't
>> make any difference, though.
>>> So it looks from Jonathan's post there that I would need to do something like
>>> txt = txt.replace(/\u2029/g, '\n');
>>> When this is "fixed" would I need to remove that line?
>> No, you can keep that line. It simply replaces the odd markers
>> (actually, they are not so odd; they are what Unicode defines as a
>> paragraph mark (as opposed to simply forcing a new line); they are just
>> not particularly widespread). When the fix is released, there shouldn't
>> be any of those odd markers, so the line wouldn't do anything anymore.
>>> I have noticed that other characters are similarly treated. Is there a
>>> table somewhere I could look at please to know how the TeXworks editor
>>> handles things pasted from other word processors that look OK in
>>> TeXworks but perhaps can come back from a script as rectangles or
>>> ? marks?
>> Really? I haven't come across any of these, yet. So there's no list that
>> I know of. But if you could give some examples, this could be helpful.
>> A general speculation would be that there is an encoding problem - if
>> you have special Unicode characters (like an ffi-ligature) in your text,
>> they are displayed correctly by Tw, but when passed to scripts and back
>> to Tw they are not interpreted correctly. But again, without examples to
>> test this hypothesis it's just idle speculation.
>> HTH
>> Stefan
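
To collect the kind of examples asked for in the quoted reply, a script could list which non-ASCII characters a piece of text contains before and after the round trip. The following is a rough sketch only: listNonAscii and normalizeParagraphs are hypothetical names, nothing here is TeXworks API, and the second function just wraps the replace line quoted in the thread.

```javascript
// Hypothetical helper: returns each distinct non-ASCII UTF-16 code unit
// in txt as "U+XXXX", in order of first appearance. Characters outside
// the Basic Multilingual Plane show up as their two surrogate code units.
function listNonAscii(txt) {
    var seen = {};
    var found = [];
    for (var i = 0; i < txt.length; i++) {
        var code = txt.charCodeAt(i);
        if (code > 127) {
            var hex = code.toString(16).toUpperCase();
            while (hex.length < 4) {
                hex = "0" + hex;    // pad to at least four hex digits
            }
            var label = "U+" + hex;
            if (!seen[label]) {
                seen[label] = true;
                found.push(label);
            }
        }
    }
    return found;
}

// The replacement from the thread: paragraph separator (U+2029) to newline.
function normalizeParagraphs(txt) {
    return txt.replace(/\u2029/g, "\n");
}
```

Comparing the list for the original text with the list for what comes back from a script would show which characters, if any, are being altered on the way.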
