Recommendations for a scripting or plugin language for highly math-dependent user coding?
|June 1, 2011||Posted by forumadmin under TechQns||
I have started a bounty for this question because I really want the community’s input. I can look (and have looked) at several languages / frameworks and think ‘well, this will probably work okay’, but I would really appreciate advice that’s based specifically on the problem I face, especially from anyone with experience integrating / using what you recommend.
I work on scientific analysis software. It provides a lot of tools for mathematical transformation of data. One tool allows the user to enter in their own equation, which is run over the data set (a large 2D or 3D matrix of values) and evaluated.
This tool has a graphical equation editor, which internally builds an object-oriented expression tree with a different object for each operation (there would be an instance of the Logarithm class, for example, which is the node in the tree for calculating a logarithm of a value to a base; it has two children, which are its inputs). A screenshot of part of it:
You can see the tree it’s building on the left, and a few of many (fifty?) potential operations in the menu on the right.
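To make that design concrete, here is a rough sketch of such a tree. It is in Python purely for illustration (the real tool is C++), and apart from Logarithm every name in it (Node, Input, Constant, run_over) is made up for the example rather than taken from our code.

```python
import math


class Node:
    """Base class for one operation in the expression tree."""

    def evaluate(self, value):
        raise NotImplementedError


class Input(Node):
    """Leaf node: stands for the current element of the input data (V1)."""

    def evaluate(self, value):
        return value


class Constant(Node):
    """Leaf node: a fixed number such as 2 or 5.3."""

    def __init__(self, number):
        self.number = number

    def evaluate(self, value):
        return self.number


class Logarithm(Node):
    """Logarithm of one child to the base given by the other child."""

    def __init__(self, operand, base):
        self.operand = operand
        self.base = base

    def evaluate(self, value):
        return math.log(self.operand.evaluate(value),
                        self.base.evaluate(value))


def run_over(tree, data):
    """Evaluate the tree once per element of a 2D list of values."""
    return [[tree.evaluate(v) for v in row] for row in data]


# log base 2 of every element in a small 2 x 2 data set
tree = Logarithm(Input(), Constant(2))
result = run_over(tree, [[1.0, 2.0], [4.0, 8.0]])
```

Every operation is a pure per-element function of its children, which is why this structure cannot express branching or outputs whose size differs from the input.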
This has a few downsides:
- A graphical editor becomes clumsy for complex equations
- There are some operations that are difficult to represent graphically, such as creating large matrices (the kernel for a n x n convolution for example)
- It only allows equations: there is no branching or other logic
It was neat when it was much simpler, but not any more, for the kind of stuff our users want to be able to do with it. If I wrote it now I’d do it quite differently – and this is my chance.
I would like to give users something more powerful, and let them write code – script or compiled – that can perform much more advanced operations. I am seeking SO’s advice on what technology this should use, or the best approach to take towards it.
The rest of this question is quite long – I’m sorry. I’ve tried to describe the problem in detail. Thanks in advance for reading it.
Our math operates on large matrices. In the above equation, V1 represents the input (one of potentially many) and is 2D or 3D, and each dimension can be large: on the order of thousands or hundreds of thousands. (We rarely calculate all of this at once, just slices / segments. But if the answer involves something which requires marshalling the data, be aware size and speed of this is a consideration.)
The operations we provide allow you to write, say, 2 x V, which multiplies every element in V by 2. The result is another matrix the same size. In other words, a scripting or programming language which includes standard math primitives isn’t enough: we need to be able to control what primitives are available, or how they are implemented.
These operations can be complex: the input can be as simple as a number (2, 5.3, pi) or as complex as a 1-, 2- or 3-dimensional matrix containing numerical, boolean or complex (paired values) data. My current thinking is that we need a language powerful enough that we can expose our data types as classes and implement the standard operators for them. A simple evaluator won’t be enough.
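As an illustration of what "expose our data types as classes and implement standard operators" could look like from the user's side, here is a minimal sketch in Python. The Matrix class is a hypothetical stand-in for a host-side type, not our actual API, and it handles only the scalar case described above.

```python
class Matrix:
    """Hypothetical stand-in for a host-side 2D data type."""

    def __init__(self, rows):
        # Copy the rows so the matrix owns its data.
        self.rows = [list(row) for row in rows]

    def _map(self, fn):
        """Return a new matrix with fn applied to every element."""
        return Matrix([[fn(v) for v in row] for row in self.rows])

    def __mul__(self, other):
        # Only scalar multiplication is sketched here; anything else is
        # handed back to Python so it can report an unsupported operand.
        if isinstance(other, (int, float, complex)):
            return self._map(lambda v: v * other)
        return NotImplemented

    # Makes "2 * V" work as well as "V * 2".
    __rmul__ = __mul__


V = Matrix([[1, 2], [3, 4]])
doubled = 2 * V  # a new matrix of the same size; V itself is unchanged
```

The point is that the host, not the language, decides what `*` means for our types, which is the kind of control over primitives we need.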
- Rather than just writing operations that are evaluated iteratively on one or more inputs to produce an output, as currently (which is easily implementable through an expression evaluator), I’d like the user to be able to: produce outputs of a different size from the inputs; call other functions; etc. For the host program, it would be useful to be able to ask the user’s code what part or slice of the inputs will be required to evaluate a slice or part of the output. I think exposing some part of our classes and using an OO language is probably the best way to achieve these points.
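The "ask the user's code which slice it needs" idea could be sketched like this, in Python and on 1D data for brevity. All of the names (Span, MovingAverage, required_input, evaluate, host_evaluate) are hypothetical and only illustrate the shape of the interface; the moving average is just an example of an operation that needs more input than it produces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """Half-open index range [start, stop) along one dimension."""
    start: int
    stop: int


class MovingAverage:
    """Hypothetical user plugin: centred moving average over 1D data."""

    def __init__(self, radius):
        self.radius = radius

    def required_input(self, output, input_length):
        """Tell the host which input span is needed for an output span."""
        return Span(max(0, output.start - self.radius),
                    min(input_length, output.stop + self.radius))

    def evaluate(self, data, offset, output):
        """Compute the requested output span.

        data holds only the span returned by required_input, and offset
        is the index of data[0] within the full input.
        """
        result = []
        for i in range(output.start, output.stop):
            lo = max(offset, i - self.radius)
            hi = min(offset + len(data), i + self.radius + 1)
            window = data[lo - offset:hi - offset]
            result.append(sum(window) / len(window))
        return result


def host_evaluate(plugin, full_input, output):
    """Host side: fetch only the slice the plugin asked for, then run it."""
    needed = plugin.required_input(output, len(full_input))
    chunk = full_input[needed.start:needed.stop]
    return plugin.evaluate(chunk, needed.start, output)


values = host_evaluate(MovingAverage(radius=1),
                       [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                       Span(3, 6))
```

With this shape the host never has to marshal the whole data set to the plugin, only the segment each call actually needs.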
Our audience is primarily research scientists who either are not used to coding, or are probably used to a language like Matlab or R.
We use Embarcadero C++ Builder 2010 for development, with small amounts of Delphi. This may restrict what we can make use of – just because something’s C++, say, doesn’t mean it will work if it’s only been coded against VC++ or GCC. It also has to be suitable for use with commercial software.
Our software currently has a COM interface, and part of the application can be automated with our app being the out-of-process COM server. We could add COM interfaces to some internal objects, or make a second COM framework specifically for this, if required.
The ‘tools’, including this one, are being migrated to a multithreaded framework. The end solution needs to be able to be executed in any thread, and multiple instances of it in many threads at once. This may affect a hosted language runtime – Python 2.x, for example, has a global lock.
It would be great to use a language that comes with libraries for math or scientific use.
Backwards compatibility with the old expression tool is not important. This is version 2: clean slate!
- RemObjects Pascal Script and DWScript are languages easily bindable to TObject-derived classes. I don’t know if it is possible to provide operator overloading.
- Hosting the .Net runtime, and loading C# (say) based DLLs as plugins. I rather like this idea: I’ve seen this done where the host program provides a syntax highlighter, debugging, etc. I gather it was a huge amount of coding, though. It would enable the use of IronPython and F# too.
- RemObjects Hydra looks like an interesting way of achieving this. Unfortunately it advertises itself for Delphi, not C++ Builder; I’m investigating compatibility.
- Hosting something like Python, which is doable from RAD Studio
- Providing a BPL interface, and letting users code directly against our program if they buy a copy of RAD Studio (ie, provide a plugin interface, and expose classes through interfaces; maybe require plugins be compiled with a binary-compatible version of our IDE)
Thanks for your input! I appreciate all answers even if they aren’t quite perfect – I can research; I am just after pointers on where to go, and opinions (please, opinions with reasons included in the answer :p) on how to approach it or what might be suitable. Every answer, no matter how short, will be appreciated. But if you recommend something in detail rather than just “use language X”, I’ll be very interested in reading it.
The following have been recommended so far:
Python: 2.6 has a global lock, that sounds like a game-killer. 3 (apparently) doesn’t yet have wide support from useful libraries. It sounds to me (and I know I’m an outsider to the Python community) like it’s fragmenting a bit – is it really safe to use?
Lua: doesn’t seem to be directly OO, but provides “meta-mechanisms for implementing features, instead of providing a host of features directly in the language”. That sounds very cool from a programmer point of view, but this isn’t targeted at programmers wanting cool stuff to play around with. I’m not sure how well it would work given the target audience – I think a language which provides more basics built in would be better.
MS script / ActiveScript. We already provide an external COM interface which our users use to automate our software, usually in VBScript. However, I would like a more powerful (and, frankly, better designed) language than VBS, and I don’t think JScript is suited either. I am also uncertain of what issues there might be marshalling data over COM – we have a lot of data, often very specifically typed, so speed and keeping those types are important.
Lisp: I hadn’t even thought of that language, but I know it has lots of fans.
Hosting .Net plugins: not mentioned by anyone. Is this not a good idea? You get C#, F#, Python… Does it have the same marshalling issues COM might? (Does hosting the CLR work through COM?)
A couple of clarifications: by “matrix”, I mean matrix in the Matlab variable sense, ie a huge table of values – not, say, a 4×4 transformation matrix as you might use for 3D software. It’s data collected over time, thousands and thousands of values often many times a second. We’re also not after a computer algebra system, but something where users can write full plugins and write their own math – although the system having the ability to handle complex math, like a computer algebra system can, would be useful. I would take ‘full language’ over ‘algebra’ though if the two don’t mix, to allow complex branches / paths in user code, as well as an OO interface.
|Asked By – David M|