[texworks] New script features

Paul A Norman paul.a.norman at gmail.com
Wed Apr 28 11:29:45 CEST 2010


>We could still discuss if there isn't more benefit in providing readFile than there is harm - maybe there is - but to me it looks like a rather twisted (though ingenious) use of the API to do evaluate(readFile()). We should have other ways of doing that.

Don't hesitate to use eval() for such purposes; it is completely
within the scope of what ECMA envisages. Many implementations and
hosts, from browsers to desktop applications and even the OS (Windows
provides controls for the local filesystem for .js scripts to use in
local scope), provide various means for explicitly retrieving strings,
directly and indirectly.

ECMA describes the processing of such strings in these terms:

"10.1.2 ... if the parameter to the built-in eval function is a
string, it is treated as an ECMAScript Program"

and

"13.1.1 Equated Grammar Productions: ... This source text consists of
global code and any contained function codes according to the
definitions in 10.1.2."

So when it is useful, load in variables and function code using eval(), galore!
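
As a minimal sketch of what that looks like in practice: the string below
stands in for text that some host call has retrieved (no particular
file-reading function is assumed here), and the names librarySource,
loadLibrary, greeting and add are made up for illustration. The
definitions are handed back explicitly as an object, so the caller does
not depend on how a given engine binds eval'd declarations.

```javascript
// Hypothetical library text; in a real script this string would come
// from whatever string-retrieval mechanism the host provides.
var librarySource =
    "var greeting = 'hello';\n" +
    "function add(a, b) { return a + b; }\n";

// Evaluate the library text as a Program (ECMA-262, 10.1.2) and return
// the names it defines. The trailing parenthesised object literal is the
// last expression in the eval code, so it becomes eval's result.
function loadLibrary(source) {
    return eval(source + "\n({ greeting: greeting, add: add });");
}

var lib = loadLibrary(librarySource);
var libraryMessage = lib.greeting + " " + lib.add(2, 3);
```

After this runs, lib.add and lib.greeting are ordinary values the calling
script can use like any others it defined itself.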

Paul

Some helpful notes:

http://www.ecma-international.org/publications/files/ECMA-ST-ARCH/ECMA-262,%203rd%20edition,%20December%201999.pdf

10.1.2

" Eval code is the source text supplied to the built-in eval function.
More precisely, if the parameter to the built-in eval function is a
string, it is treated as an ECMAScript Program. The eval code for a
particular invocation of eval is the global code portion of the string
parameter."

10.2.2 Eval Code handles the execution context for the eval code:

• The scope chain is initialised to contain the same objects, in the
same order, as the calling context's scope chain. This includes
objects added to the calling context's scope chain by with statements
and catch clauses.
• Variable instantiation is performed using the calling context's
variable object and using empty property attributes.
• The this value is the same as the this value of the calling context.
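
The scope chain and this rules above can be seen in a small sketch. The
names Counter, evaluate and step are invented for the illustration and
are not part of any TeXworks API.

```javascript
// A tiny object whose method evaluates a caller-supplied string.
function Counter(start) {
    this.count = start;
}

Counter.prototype.evaluate = function (source) {
    var step = 10; // local variable of the calling context
    // Direct call to eval: the eval code runs with this function's scope
    // chain (so it can read `step`) and with the same `this` value.
    return eval(source);
};

var counter = new Counter(5);
var scopeResult = counter.evaluate("this.count + step");
```

Here scopeResult ends up as 15: the eval code reads count through this
and step through the calling function's scope.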

13.1.1 Equated Grammar Productions
Two uses of the FunctionBody grammar production are defined to be
equated when one of the following is true:
• Both uses obtained their FunctionBody from the same location in the
source text of the same ECMAScript program. This source text consists
of global code and any contained function codes according to the
definitions in 10.1.2.
• Both uses obtained their FunctionBody from the same location in the
source text of the same call to eval (15.1.2.1). This source text
consists of eval code and any contained function codes according to
the definitions in 10.1.2.

15.1.2.1 eval (x)
 2.  Parse x as a Program. If the parse fails, throw a SyntaxError
exception (but see also clause 16).

Page 161

"An implementation may extend program and regular expression syntax.
To permit this, all operations (such as calling eval, using a regular
expression literal, or using the Function or RegExp constructor) that
are allowed to throw SyntaxError are permitted to exhibit
implementation-defined behaviour instead of throwing SyntaxError when
they encounter an implementation-defined extension to the program or
regular expression syntax."
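
Given the parse step in 15.1.2.1, a script that evaluates loaded text
may want to guard against malformed input. The following is only a
sketch; tryEval is a made-up helper name, and an implementation with
syntax extensions could, as clause 16 notes, behave differently for
input that uses those extensions.

```javascript
// Evaluate a string, reporting parse failures instead of letting the
// SyntaxError propagate. Other errors are passed on to the caller.
function tryEval(source) {
    try {
        return { ok: true, value: eval(source) };
    } catch (e) {
        if (e instanceof SyntaxError) {
            return { ok: false, message: String(e.message) };
        }
        throw e; // not a parse failure: rethrow unchanged
    }
}

var goodResult = tryEval("1 + 2");   // parses and evaluates normally
var badResult = tryEval("var = ;");  // not a valid Program
```

Only SyntaxError is caught here, so errors raised while the evaluated
code is running still surface in the usual way.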

On 26 April 2010 22:38, Stefan Löffler <st_loeffler at hotmail.com> wrote:
> Hi,
>
> Am 2010-04-26 02:21, schrieb Jonathan Kew:
>
> On Sun, Apr 25, 2010 at 1:51 PM, Stefan Löffler <st.loeffler at gmail.com>
> wrote:
>>
>> Am 2010-04-25 11:24, schrieb Jonathan Kew:
>> > It's pretty neat that this works; however, I think we should expose a
>> > readFile() method to directly read the text of a file, rather than going
>> > through the "open document, get text, close document" sequence. That's
>> > pretty inefficient, and could become a significant overhead if you were
>> > loading a number of script libraries.
>> >
>>
>> I tend to disagree (partly). A general-purpose readFile() is something
>> that may come in handy sometimes, but can also do a lot of harm.
>
> Could you explain a bit more about your concern here?
>
> I just think that there can be a lot of issues involved with coding our own
> (safe) version of readFile. Some scripting languages have native support for
> this - so be it. QtScript is based on the same language as JavaScript, and
> they are specifically designed to not access the file system. If we want to
> give scripts access to other scripts (e.g. for library use) I think we
> should provide a specific function for that, which could e.g. make sure that
> we stay in the Tw/scripts context and don't run arbitrary code on the hard
> disk. I realize that this is weak, because potential attackers could simply
> install their (malicious) scripts in the Tw/scripts folder, but we probably
> can't prevent that.
>
>
>>
>> In
>> particular, I'd prefer scripts to not use absolute paths as much as
>> possible.
>
> True, in general - although if the script is building the path by starting
> from things like its own path, or the path of the document it's working
> with, and modifying the file name/extension, etc, then the fact that it's an
> absolute path is not important.
>
> That's right. As stated above, I'm just a bit uneasy about scripts running
> arbitrary code on the hard disk. I know we can't avoid it (the
> open-copy-close mechanism works in any case - I even used it myself for the
> "inline bibliography" script), but I think we should generally think about
> providing other (possibly safer) mechanisms of loading libraries / executing
> functions from other scripts that should also work transparently across
> scripting languages.
> We could still discuss if there isn't more benefit in providing readFile
> than there is harm - maybe there is - but to me it looks like a rather
> twisted (though ingenious) use of the API to do evaluate(readFile()). We
> should have other ways of doing that.
>
>> In any case, it would probably be nice to have a runScript()
>> function that does what one would expect of it (i.e., as Paul suggested,
>> run a script identified by some name/label, rather than a TWScript
>> object we currently have no way of obtaining a handle to).
>
> Yes, that might be nice. Though it raises questions of context: does the new
> script execute in its own separate context, or can it be run within the
> current script's context (allowing sharing of globals, etc)? I suppose that
> could only work if the scripts are in the same language....
>
> This is indeed a complicated issue. I prefer this to work regardless of
> scripting language. That would mean separate contexts, though we should have
> some way to pass data between them. The problem is just that ATM, lua and
> python both use a global interpreter, which is designed to be created when
> the script starts and to be destroyed when the script ends, i.e. it's not
> designed for re-entry (in the threading sense).
>
>> To make a long story short, here's a list of things I'd like to see in
>> scripting (and for most of them I even have an idea of how to code
>> them ;)):
>> 1) Give scripts a way to obtain a handle to their TWScript obj (the
>> pointer is in TWScriptAPI already, anyway)
>
> That's easy, I guess; what do you expect it to be used for?
>
> All by itself, not much. Except that it feels somewhat logical for a script
> to at least be able to query its own representation in Qt. In the long run,
> we could store data here (see below), use it to connect to signals, etc.
>
>> 2) Provide a per-script list/hash of "globals", i.e. of QVariant
>> objects; this would give us the possibility to pass data between
>> successive invocations of the same script (need to think about data
>> lifetime here; possibly need to find a way to clone objects)
>
> How about also having an application-wide collection of such globals, so
> that it's possible for a group of related scripts to share data?
>
> Sounds interesting. I've commented a bit further on this on GC. The main
> (implementation-specific) problems I see right now are where we store them
> exactly and how we name them.
>
>> 3) Add the possibility to call functions defined in a script from C++
>>
>> 4) Make it possible to connect C++ signals to script slots (it should
>> already be working for QtScript; for the others we need some tricks -
>> possibly create a dummy QtScript object to connect to which passes on
>> the signals to the real script)
>
> I've been wondering if we should in some way allow scripts to be "attached"
> to the windows they create in this situation. But perhaps that's not
> necessary. If the script object is deleted/replaced while the connected
> widgets are still around, the connections will break, but no real harm
> should follow - the user would just have to close and re-instantiate the
> window to get it working again.
>
> If we're able to connect script functions to signals, the script could
> terminate itself when the window is destroyed (and vice versa, if
> necessary). So I don't see the necessity here, provided we implement pts. 3
> & 4.
>
>>
>> 5) Introduce a new script type ("multi", "mixed", ... ideas welcome)
>> that defines only functions. There is one special function (e.g. init())
>> that gets called when the TWScript object is created. Its purpose is to
>> register menu items, toolbar items, hooks, ... (whatever comes to mind).
>> Each of them gets connected to a function provided by the script.
>
> ScriptType: library
> Defines functions for use as hooks, callback slots, whatever....
> No need for a magic init() function: the "main program" of the script is
> executed on load, and can do whatever setup is desired.
> Or maybe we should have both initialize() and terminate() functions; the
> latter provides an opportunity for the script library to do last-minute
> things before the app shuts down, such as saving its globals to a .ini file,
> to be read by initialize() next time it is loaded.
>
> I think we do need initialize() (and possibly, for symmetry, finalize()) if
> we intend to keep the system we have at the moment. In principle, I can
> think of 3 different approaches:
> 1) We have a persistent per-script interpreter
> 2) We have an on-demand per-script interpreter
> 3) We have a global interpreter
>
> Currently we're using nr. 2. An interpreter (e.g. a QScriptEngine object) is
> created when the script is executed and destroyed when it is finished. This
> could easily be generalized to multi/mixed/lib scripts. We'd need to load
> them anew each time a function defined in them is executed, however. Hence
> we can't put one-time initialization outside of a function, as otherwise the
> script would add a new menu item/toolbar item/whatever each time it is
> loaded.
> I still think that 2 is the safest and best approach, though. Nr. 1 could
> cause tremendous overhead (imagine a user who has 100s of scripts, each of
> which gets its own interpreter even though it may never be used throughout
> the session). Nr. 3 on the other hand would introduce the whole problem of
> naming conflicts. Besides, there could be some issues when one script calls
> another script.
>
> So, the control flow I imagine is as follows (pseudo-code)
> // in the constructor
> TWScript::TWScript() {
>     CREATE_NEW_INTERPRETER()
>
>     LOAD_SCRIPT()
>     CALL('initialize')
>
>     DESTROY_INTERPRETER()
> }
>
> TWScript::callScriptFunction(funcName, params) {
>     CREATE_NEW_INTERPRETER()
>
>     LOAD_SCRIPT()
>     CALL(funcName, params)
>
>     DESTROY_INTERPRETER()
> }
>
> BTW: "library" may be a bit misleading. Of course these scripts could be
> used as Paul suggested, i.e. as providing only functions for use from other
> scripts which are never invoked directly. They could also, however, provide
> a single function with appropriate UI items. Such scripts I wouldn't
> necessarily associate with the term "library".
>
> For example, there have been several requests in the past for toolbar icons
> to insert some code - say, Greek letters. This is trivial to code in
> scripts, but we'd need some way to add a toolbar icon (simple) and to
> connect to its triggered() signal (complex - see task 4 above). For this,
> the script would obviously need to add a toolbar icon (from the initialize()
> function), need to provide a function to actually insert the text, and need
> to connect to the signal (again from initialize()).
>
> -Stefan
>


