[texworks] New script features

Mon Apr 26 12:38:07 CEST 2010

Hi,

On 2010-04-26 02:21, Jonathan Kew wrote:
> On Sun, Apr 25, 2010 at 1:51 PM, Stefan Löffler <st.loeffler at gmail.com> wrote:
>
>
>     On 2010-04-25 11:24, Jonathan Kew wrote:
>     > It's pretty neat that this works; however, I think we should
>     > expose a readFile() method to directly read the text of a file,
>     > rather than going through the "open document, get text, close
>     > document" sequence. That's pretty inefficient, and could become a
>     > significant overhead if you were loading a number of script libraries.
>     >
>
>     I tend to disagree (partly). A general purpose readFile() is something
>     that may come in handy sometimes, but can also do a lot of harm.
>
>
> Could you explain a bit more about your concern here?

I just think that there can be a lot of issues involved in coding our
own (safe) version of readFile. Some scripting languages have native
support for this - so be it. QtScript, however, is based on ECMAScript,
the same standard as JavaScript, and that language deliberately provides
no file system access. If we want to give scripts access to other scripts
(e.g. for library use), I think we should provide a specific function for
that, which could e.g. make sure that we stay in the Tw/scripts context
and don't run arbitrary code from the hard disk. I realize that this is
weak, because potential attackers could simply install their (malicious)
scripts in the Tw/scripts folder, but we probably can't prevent that.
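
A minimal sketch of the "stay in the Tw/scripts context" check, in plain
C++17 rather than Qt. The helper name isInsideScriptsDir and the example
paths are hypothetical and not part of TeXworks. The check is purely
lexical, so it does not resolve symlinks, and it does nothing about the
weakness mentioned above (malicious scripts installed inside the folder).

```cpp
#include <cassert>
#include <filesystem>

namespace fs = std::filesystem;

// Hypothetical helper: returns true if `requested`, interpreted relative to
// `scriptsRoot`, still lies inside `scriptsRoot` once "." and ".." have been
// collapsed. Purely lexical: the file system is never touched, so symlinks
// are NOT resolved. The root directory itself also passes the check.
bool isInsideScriptsDir(const fs::path& scriptsRoot, const fs::path& requested) {
    const fs::path root = scriptsRoot.lexically_normal();
    // If `requested` is absolute, operator/ yields `requested` unchanged.
    const fs::path target = (root / requested).lexically_normal();
    const fs::path rel = target.lexically_relative(root);
    if (rel.empty()) {
        return false;  // no relative path exists (e.g. different root names)
    }
    // A path that escapes the root starts with ".." relative to it.
    return *rel.begin() != fs::path("..");
}
```

A dedicated library-loading function could run such a check before reading
anything, and refuse both absolute paths outside the folder and relative
paths that climb out of it with "..".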

>  
>
>     In
>     particular, I'd prefer scripts to not use absolute paths as much as
>     possible.
>
>
> True, in general - although if the script is building the path by
> starting from things like its own path, or the path of the document
> it's working with, and modifying the file name/extension, etc, then
> the fact that it's an absolute path is not important.

That's right. As stated above, I'm just a bit uneasy about scripts
running arbitrary code from the hard disk. I know we can't avoid it (the
open-copy-close mechanism works in any case - I even used it myself for
the "inline bibliography" script), but I think we should generally think
about providing other (possibly safer) mechanisms for loading libraries /
executing functions from other scripts, which should also work
transparently across scripting languages.
We could still discuss whether providing readFile does more good than
harm - maybe it does - but to me, evaluate(readFile()) looks like a
rather twisted (though ingenious) use of the API. We should have other
ways of doing that.

>     In any case, it would probably be nice to have a runScript()
>     function that does what one would expect of it (i.e., as Paul
>     suggested, run a script identified by some name/label, rather than
>     a TWScript object we currently have no way of obtaining a handle to).
>
>
> Yes, that might be nice. Though it raises questions of context: does
> the new script execute in its own separate context, or can it be run
> within the current script's context (allowing sharing of globals,
> etc)? I suppose that could only work if the scripts are in the same
> language....

This is indeed a complicated issue. I would prefer this to work regardless
of scripting language. That would mean separate contexts, though we should
have some way to pass data between them. The problem is just that at the
moment, Lua and Python both use a global interpreter, which is designed to
be created when the script starts and destroyed when the script ends,
i.e. it's not designed for re-entry (in the threading sense).

>     To make a long story short, here's a list of things I'd like to see in
>     scripting (and for most of them I even have an idea of how to code
>     them ;)):
>     1) Give scripts a way to obtain a handle to their TWScript obj (the
>     pointer is in TWScriptAPI already, anyway)
>
>
> That's easy, I guess; what do you expect it to be used for?

All by itself, not much. Except that it feels somewhat logical for a
script to at least be able to query its own representation in Qt. In the
long run, we could store data here (see below), use it to connect to
signals, etc.

>     2) Provide a per-script list/hash of "globals", i.e. of QVariant
>     objects; this would give us the possibility to pass data between
>     successive invocations of the same script (need to think about data
>     lifetime here; possibly need to find a way to clone objects)
>
>
> How about also having an application-wide collection of such globals,
> so that it's possible for a group of related scripts to share data?

Sounds interesting. I've commented a bit further on this on GC. The main
(implementation-specific) problems I see right now are where we store
them exactly and how we name them.
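
One possible answer to the "where do we store them and how do we name
them" question, sketched in plain C++17. Everything here is an assumption
for illustration: the class name GlobalsStore, the use of a script id
(e.g. the script's file path) as the per-script key, and std::variant as a
stand-in for QVariant.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <utility>
#include <variant>

// Stand-in for QVariant in this sketch: a handful of plain value types.
// Construct values with the exact type (1LL, std::string("x")) to avoid
// surprising conversions.
using Value = std::variant<bool, long long, double, std::string>;

// Hypothetical store: one table per script, keyed by a script id, plus one
// application-wide table shared by all scripts.
class GlobalsStore {
public:
    void setScriptGlobal(const std::string& scriptId, const std::string& key, Value v) {
        perScript_[scriptId][key] = std::move(v);
    }
    std::optional<Value> scriptGlobal(const std::string& scriptId,
                                      const std::string& key) const {
        const auto s = perScript_.find(scriptId);
        if (s == perScript_.end()) return std::nullopt;
        return lookup(s->second, key);
    }
    void setAppGlobal(const std::string& key, Value v) { app_[key] = std::move(v); }
    std::optional<Value> appGlobal(const std::string& key) const {
        return lookup(app_, key);
    }

private:
    using Table = std::map<std::string, Value>;

    // Returns a copy of the stored value, or nothing if the key is unknown.
    static std::optional<Value> lookup(const Table& t, const std::string& key) {
        const auto it = t.find(key);
        if (it == t.end()) return std::nullopt;
        return it->second;
    }

    std::map<std::string, Table> perScript_;
    Table app_;
};
```

Values are copied in and out, which sidesteps the lifetime question for
plain data but not for objects that would need cloning. Keeping the two
scopes in separate tables means a script's own names cannot clash with the
application-wide ones.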

>     3) Add the possibility to call functions defined in a script from C++
>
>     4) Make it possible to connect C++ signals to script slots (it should
>     already be working for QtScript; for the others we need some tricks -
>     possibly create a dummy QtScript object to connect to which passes on
>     the signals to the real script)
>
>
> I've been wondering if we should in some way allow scripts to be
> "attached" to the windows they create in this situation. But perhaps
> that's not necessary. If the script object is deleted/replaced while
> the connected widgets are still around, the connections will break,
> but no real harm should follow - the user would just have to close and
> re-instantiate the window to get it working again.

If we're able to connect script functions to signals, the script could
terminate itself when the window is destroyed (and vice versa, if
necessary). So I don't see the necessity here, provided we implement
pts. 3 & 4.

>
>     5) Introduce a new script type ("multi", "mixed", ... ideas welcome)
>     that defines only functions. There is one special function (e.g.
>     init()) that gets called when the TWScript object is created. Its
>     purpose is to register menu items, toolbar items, hooks, ...
>     (whatever comes to mind). Each of them gets connected to a function
>     provided by the script.
>
>
> ScriptType: library
> Defines functions for use as hooks, callback slots, whatever....
> No need for a magic init() function: the "main program" of the script
> is executed on load, and can do whatever setup is desired.
> Or maybe we should have both initialize() and terminate() functions;
> the latter provides an opportunity for the script library to do
> last-minute things before the app shuts down, such as saving its
> globals to a .ini file, to be read by initialize() next time it is loaded.

I think we do need initialize() (and possibly, for symmetry, finalize())
if we intend to keep the system we have at the moment. In principle, I
can think of 3 different approaches:
1) We have a persistent per-script interpreter
2) We have an on-demand per-script interpreter
3) We have a global interpreter

Currently we're using no. 2. An interpreter (e.g. a QScriptEngine
object) is created when the script is executed and destroyed when it has
finished. This could easily be generalized to multi/mixed/lib scripts.
We'd need to load them anew each time a function defined in them is
executed, however. Hence we can't put one-time initialization outside of
a function, as otherwise the script would add a new menu item/toolbar
item/whatever each time it is loaded.
I still think that no. 2 is the safest and best approach, though. No. 1
could cause tremendous overhead (imagine a user who has hundreds of
scripts, each of which gets its own interpreter even though it may never
be used throughout the session). No. 3, on the other hand, would introduce
the whole problem of naming conflicts. Besides, there could be some issues
when one script calls another script.

So, the control flow I imagine is as follows (pseudo-code):

// in the constructor: run the one-time setup, then discard the interpreter
TWScript::TWScript() {
    CREATE_NEW_INTERPRETER()

    LOAD_SCRIPT()
    CALL('initialize')

    DESTROY_INTERPRETER()
}

// on every later call: fresh interpreter, reload the script, run one function
TWScript::callScriptFunction(funcName, params) {
    CREATE_NEW_INTERPRETER()

    LOAD_SCRIPT()
    CALL(funcName, params)

    DESTROY_INTERPRETER()
}
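
The following toy model, in plain C++ without Qt, illustrates why one-time
setup has to live in initialize() under this control flow. All names
(ToyHost, Script, Toolbar, the two example scripts) are made up for the
sketch; a "script" is simply a set of C++ callables, and "loading" it
means running its top-level code.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Host-side state that outlives every interpreter (here: toolbar labels).
using Toolbar = std::vector<std::string>;
using ScriptFn = std::function<void(Toolbar&)>;

// Toy stand-in for a script file: top-level code plus named functions.
struct Script {
    ScriptFn topLevel;                          // runs on every load
    std::map<std::string, ScriptFn> functions;  // callable by name
};

// Toy host for the on-demand approach: every call reloads the script, runs
// one function, and keeps nothing except host-side state.
class ToyHost {
public:
    explicit ToyHost(Script s) : script_(std::move(s)) { call("initialize"); }

    // Returns false if the script defines no function called `name`.
    bool call(const std::string& name) {
        if (script_.topLevel) script_.topLevel(toolbar_);  // LOAD_SCRIPT()
        const auto it = script_.functions.find(name);
        if (it == script_.functions.end()) return false;
        it->second(toolbar_);                               // CALL(name)
        return true;
    }
    const Toolbar& toolbar() const { return toolbar_; }

private:
    Script script_;
    Toolbar toolbar_;
};

// Setup inside initialize(): runs once, from the constructor.
Script setupInInitialize() {
    Script s;
    s.functions["initialize"] = [](Toolbar& t) { t.push_back("alpha"); };
    s.functions["insertAlpha"] = [](Toolbar&) { /* would insert text */ };
    return s;
}

// Setup at top level: runs again on every load.
Script setupAtTopLevel() {
    Script s;
    s.topLevel = [](Toolbar& t) { t.push_back("alpha"); };
    s.functions["insertAlpha"] = [](Toolbar&) { /* would insert text */ };
    return s;
}
```

With these two toy scripts, constructing a host and calling insertAlpha
twice leaves one toolbar entry for the first script and three for the
second: one from the load in the constructor and one more per call.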

BTW: "library" may be a bit misleading. Of course these scripts could be
used as Paul suggested, i.e. as providing only functions for use from
other scripts which are never invoked directly. They could also,
however, provide a single function with appropriate UI items. Such
scripts I wouldn't necessarily associate with the term "library".

For example, there have been several requests in the past for toolbar
icons to insert some code - say, Greek letters. This is trivial to code
in scripts, but we'd need some way to add a toolbar icon (simple) and to
connect to its triggered() signal (complex - see task 4 above). For
this, the script would obviously need to add a toolbar icon (from the
initialize() function), need to provide a function to actually insert
the text, and need to connect to the signal (again from initialize()).

-Stefan