luajit
-
tor.helge.skei tor.helge.skei https://www.kvraudio.com/forum/memberlist.php?mode=viewprofile&u=152647
- KVRian
- Topic Starter
- 527 posts since 30 May, 2007
anybody here having any experience with luajit?
i'm playing around with lua (using luajit) in a vst plugin, and generally, things look really promising.. i'm a complete noob when it comes to lua, but i'm learning.. my next step now, is looking into audio processing with lua..
in the various process() callbacks, we receive pointers to buffers with samples to fill with our own data, and/or read from.. so, how can i make these buffers available to the lua side, directly, using ffi, without copying (and converting) the buffers.. ??
i recently read this quote (don't remember where):
"You should definitely look at LuaJIT FFI, which allows you to cast a (light)userdata (a simple pointer opaque to Lua) to anything you defined using ffi.cdef. I use it to create a memory buffer (userdata) in C, call an external C++ module (some GPU processing using CUDA) to fill it with data, and return it back to Lua. Then in Lua I cast the userdata to a structure pointer and process it further in Lua. .. No copies are made using casting, and I would not have memory for such copies, as I am dealing with hundreds of megabytes of data."
a (confusing/meaningless) screenshot:
avasopht
- KVRist
- 168 posts since 19 Apr, 2014 from London
That screenshot demonstrates not needing to copy buffers. Read through the code and output carefully to understand what is happening.
-
tor.helge.skei
- KVRian
- Topic Starter
- 527 posts since 30 May, 2007
did a little more reading and experimentation..
this seems to work quite well:
c/c++:
Code: Select all
virtual void on_processBlock(float** AInputs, float** AOutputs, uint32 ASize) {
  lua_getglobal(MState,"on_processBlock");
  // pass the raw buffer pointers as light userdata: no copying, no conversion
  lua_pushlightuserdata(MState,AInputs);
  lua_pushlightuserdata(MState,AOutputs);
  lua_pushinteger(MState,ASize);
  // 3 arguments, 0 results.. on failure the error message is left on the stack
  if (lua_pcall(MState,3,0,0) != 0) {
    lua_pop(MState,1); // drop the message, so the stack doesn't grow on every block
  }
}
lua:
Code: Select all
local ffi = require("ffi")
-- gain_left / gain_right are assumed to be set elsewhere in the script
function on_processBlock(inputs,outputs,size)
  -- inputs/outputs arrive as light userdata.. casting makes no copy
  local in0 = ffi.cast("float**",inputs)[0]
  local in1 = ffi.cast("float**",inputs)[1]
  local out0 = ffi.cast("float**",outputs)[0]
  local out1 = ffi.cast("float**",outputs)[1]
  for i=0,size-1 do
    out0[i] = in0[i] * gain_left
    out1[i] = in1[i] * gain_right
  end
end
so i can actually start making vst plugins in lua now
the (vst) plugin loads a lua script with the same filename (with the file extension replaced with .lua), and from the same directory as the plugin.. it then compiles that, and the plugin starts calling lua functions when needed..
the plugin itself is made with my own library/framework, and should be fully portable.. so, soon i'll compile both 32 and 64 bit versions, for both windows and linux..
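A minimal sketch for trying the callback without a host, assuming LuaJIT is available. The buffers created with `ffi.new` are stand-ins for the host's sample buffers, the `void*` cast imitates the opaque pointer that `lua_pushlightuserdata` would hand over, and the gain values are made up for the example; none of this is from the original plugin.

```lua
local ffi = require("ffi")

-- example values only, not from the thread
gain_left, gain_right = 0.5, 2.0

-- same shape as the callback in the post
function on_processBlock(inputs, outputs, size)
  local ins  = ffi.cast("float**", inputs)
  local outs = ffi.cast("float**", outputs)
  local in0, in1   = ins[0], ins[1]
  local out0, out1 = outs[0], outs[1]
  for i = 0, size - 1 do
    out0[i] = in0[i] * gain_left
    out1[i] = in1[i] * gain_right
  end
end

-- stand-ins for the host's buffers (ffi.new zero-fills them)
local size = 4
local in_l, in_r   = ffi.new("float[?]", size), ffi.new("float[?]", size)
local out_l, out_r = ffi.new("float[?]", size), ffi.new("float[?]", size)
for i = 0, size - 1 do
  in_l[i] = i + 1
  in_r[i] = i + 1
end

-- float** style pointer tables, like AInputs / AOutputs on the C++ side
local inputs  = ffi.new("float*[2]")
local outputs = ffi.new("float*[2]")
inputs[0],  inputs[1]  = in_l,  in_r
outputs[0], outputs[1] = out_l, out_r

-- hand over opaque pointers, as the plugin would
on_processBlock(ffi.cast("void*", inputs), ffi.cast("void*", outputs), size)

-- the callback wrote straight into out_l / out_r: nothing was copied
assert(out_l[0] == 0.5 and out_l[3] == 2.0)
assert(out_r[0] == 2.0 and out_r[3] == 8.0)
```

The asserts at the end pass only because the loop writes through the cast pointers into the very same memory that was allocated outside the callback.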
-
Mayae
- KVRian
- 573 posts since 1 Jan, 2013 from Denmark
Nice!
That seems a lot like something i worked on, and i definitely wanted to use LuaJit as well, however i never got around to it. How is the speed of Lua using a little bit more complicated rendering examples - usable for dsp?
-
tor.helge.skei
- KVRian
- Topic Starter
- 527 posts since 30 May, 2007
Mayae wrote: How is the speed of Lua using a little bit more complicated rendering examples - usable for dsp?
i'm not sure, since i actually don't know much at all about lua.. i'm learning, though.. but some performance charts and blog posts look really promising:
a post from 2010, that shows results not far off from gcc.. it has probably become a bit better since then (the results are for an early beta).. link
"..On my laptop, the C implementation multiplies two 1000×1000 matrices in 2.0 seconds (BTW, 1.4 sec if I use float; 0.9 if SSE is used; 26.8 sec without matrix transpose), LuaJIT-jit in 2.3 seconds.." link
"..With no optimization or buffering, and loading every pair of samples individually into a table (which seems like it’d be inefficient and dumb), the LuaJIT version is faster than the C version!" link
but for me, the most interesting thing is the rapid-prototyping possibilities, and how easily you can throw together some specific utility plugin if you need it.. the same script would work everywhere, as long as the wrapper plugin is ported to that platform.. you can also call other shared libraries (dll/so) almost directly, and easily integrate lua code with your own c/c++ code.. the possibilities are endless
-
Mayae
- KVRian
- 573 posts since 1 Jan, 2013 from Denmark
Okay, nice.
tor.helge.skei wrote: but for me, the most interesting thing is the rapid-prototyping possibilities, and how easily you can throw together some specific utility plugin if you need it.. the same script would work everywhere, as long as the wrapper plugin is ported to that platform.. you can also call other shared libraries (dll/so) almost directly, and easily integrate lua code with your own c/c++ code.. the possibilities are endless
Yes, i agree - we share the same vision. Check out the project in my sig (audio programming environment), it's virtually the same (its only frontend is currently C, but it can be extended to any language - like Lua)
-
tor.helge.skei
- KVRian
- Topic Starter
- 527 posts since 30 May, 2007
Mayae wrote: check out the project in my sig (audio programming environment), it's virtually the same (its only frontend is currently C, but it can be extended to any language - like Lua)
nice!!
too bad i can't try it out (i'm on linux) :-/
you're using tcc? i tried that too, but it seems like tcc has some issues with 64bit linux shared libraries (the -fPIC part, i think).. a standalone binary (exe) worked great, though.. so i quickly changed focus, and looked at luajit instead.. i need to experiment a bit more with it, i think..
-
tor.helge.skei
- KVRian
- Topic Starter
- 527 posts since 30 May, 2007
i did some performance testing with the lua wrapper, with quite encouraging results!
first i made a lo-fi, mono, simplistic pitch shifter, with a main-loop like this:
Code: Select all
function on_processBlock(inputs,outputs,size)
  local in0 = ffi.cast("float**",inputs)[0]
  local in1 = ffi.cast("float**",inputs)[1]
  local out0 = ffi.cast("float**",outputs)[0]
  local out1 = ffi.cast("float**",outputs)[1]
  for i=0,size-1 do
    -- mix the input down to mono and write it into the ring buffer
    local in_ = (in0[i] + in1[i]) * 0.5
    buffer[ math.floor(in_pos) ] = in_
    in_pos = math.floor(in_pos+1) % math.floor(len_)
    -- gain ramps from 0 to 1 over the first fade_ samples of a grain
    local gain = math.min(out_pos/fade_,1)
    -- crossfade the new grain (base) with the tail of the old one (fade_base)
    local out_ = buffer[ math.floor(base+out_pos) % math.floor(len_) ] * gain
               + buffer[ math.floor(fade_base+out_pos) % math.floor(len_) ] * (1-gain)
    -- the read position advances by inc_ per sample, which sets the pitch
    out_pos = out_pos + inc_
    if out_pos >= (len_-1-fade_) then
      -- start a new grain at the write position, fading out from where the old one stopped
      fade_base = base + len_ - 1 - fade_
      out_pos = 0
      base = in_pos
    end
    out0[i] = out_
    out1[i] = out_
  end
end
i compiled the plugin in debug mode.. with lots of debugging stuff, print statements, etc.. then i inserted this plugin on a track with a small audio loop playing, and duplicated the plugin 10 times.. the cpu meter barely moved, so i started duplicating this track.. 10 times.. so, now i had 100 plugins running.. it took a second or so for the jit-compilation to settle down, but after that, the cpu meter hovered around 25%.. cool!
then i decided to stress the system a bit more.. selected all ten tracks, and duplicated that.. 200 plugins.. but that froze my desktop :-/ don't know if it was the jit-ing that froze it, or if it became too much for the cpu.. i rebooted almost immediately..
anyway, 100 lua scripts running at 25% is not too bad!!
soon i'll try to optimize the lua code, and compile a release-build of the vst plugin, and do some testing again..
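The pitch shifter loop relies on state that the post doesn't show (`buffer`, `in_pos`, `out_pos`, `base`, `fade_base`, `len_`, `fade_`, `inc_`). A possible setup, purely as a sketch: every value below is a guess for illustration, and using an FFI float array for `buffer` is an assumption too (a plain Lua table would have to be pre-filled with zeros, since the loop reads slots before writing them).

```lua
local ffi = require("ffi")

-- all values are hypothetical, the thread doesn't give them
len_      = 4096   -- ring buffer length in samples
fade_     = 256    -- crossfade length in samples (must be well below len_)
inc_      = 1.5    -- read speed.. above 1 reads faster than it writes (pitch up)

-- zero-filled and 0-based, which matches the 0..len_-1 indexing in the loop
buffer    = ffi.new("float[?]", len_)

in_pos    = 0      -- write position
out_pos   = 0      -- read offset inside the current grain
base      = 0      -- start of the current grain
fade_base = 0      -- start of the grain that is being faded out
```

They are globals here only because the loop in the post refers to them by bare name; file-level locals declared above `on_processBlock` would work the same way.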
avasopht
- KVRist
- 168 posts since 19 Apr, 2014 from London
Are they all running on the same core? I'd imagine they'd all be sharing the same JIT runtime, right? If so, they won't be able to fully utilise all cores.
-
tor.helge.skei
- KVRian
- Topic Starter
- 527 posts since 30 May, 2007
avasopht wrote: Are they all running on the same core? as I'd imagine they'd all be sharing the same JIT runtime, right? and if so will not be able to fully utilise all cores.
not sure.. didn't go very deep with the testing.. i used bitwig studio in linux.. 10 tracks.. don't know how bitwig handles multi-core stuff, if it spreads the tracks out among cores, or something.. but i have a quad core, so perhaps that 25%, and crashing/freezing when i went over that, indicates that only one core is being used? hmm...
the test plugin (64bit linux vst) is statically linked with libluajit.a (v2.0.3, which i compiled myself), and is around 500k (debug build, not stripped).. i'll probably experiment with dynamic linking a little later.. but there's so much else interesting to try first
-
Mayae
- KVRian
- 573 posts since 1 Jan, 2013 from Denmark
tor.helge.skei wrote: nice!! too bad i can't try it out (i'm on linux) :-/ you're using tcc? i tried that too, but it seems like tcc has some issues with 64bit linux shared libraries (the -fPIC part, i think).. a standalone binary (exe) worked great, though.. so i quickly changed focus, and looked at luajit instead.. i need to experiment a bit more with it, i think..
Oh, pity. Technically, the source code should actually be (nearly) compatible with any unix kind.. Only i guarded a lot of the unix specific stuff in mac #defines - i will fix that next release
Yes, there's two compilers included, one of which is tcc (the other is configurable through scripts to invoke system compilers like llvm/gcc etc.). But yeah, x64 tcc is quite funky (and has some weird calling conventions that require dirty hacks) - given your results it wouldn't surprise me if luajit outperformed tcc. When i get time, i'll look at luajit, got high expectations now
e: is it still true for lua that it doesn't have native integers?
- KVRAF
- 7890 posts since 12 Feb, 2006 from Helsinki, Finland
Mayae wrote: e: is it still true for lua that it doesn't have native integers?
Sort of.. there is no separate integer type in the language, but the native arrays treat small integers (that is, integer values of the numeric type) specially, so in some sense integers are meaningful; they just share the same type (in the language) with other numbers.
IIRC LuaJIT will generate integer code (optimization is known as type narrowing) in various situations when it decides it's the right thing to do (induction variables and such should at least get narrowed). Also if I'm not mistaken, you can use the bitops to do integer arithmetic explicitly.
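To make the integer point concrete, a small LuaJIT sketch (not from the thread). It uses the `bit` library that ships with LuaJIT; the ring buffer length and the `wrap` helper are hypothetical and only there for illustration.

```lua
local bit = require("bit")  -- bundled with LuaJIT

-- every Lua number is a double, so integer values stay exact and never wrap
local n = 2^31 - 1
assert(n + 1 == 2147483648)

-- the bit operations work on 32-bit signed integers
assert(bit.tobit(n + 1) == -2147483648)  -- wraps around like an int32
assert(bit.band(0xff, 0x0f) == 0x0f)
assert(bit.lshift(1, 4) == 16)

-- hypothetical use: wrap a ring buffer index when the length is a power of two
local len  = 1024
local mask = len - 1
local function wrap(pos) return bit.band(pos, mask) end
assert(wrap(1023) == 1023)
assert(wrap(1024) == 0)
assert(wrap(1030) == 6)
```

Masking like this only works for power-of-two lengths; for any other length the `%` operator is still needed.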
- KVRAF
- 7890 posts since 12 Feb, 2006 from Helsinki, Finland
camsr wrote: Isn't x64 calling convention limited to only cdecl?
x64 calling conventions (different operating systems use slightly different variants) are different from the 32-bit cdecl convention, but yeah, you should normally get the same thing whether you specify cdecl or something else.