How to call native functions in C++-bytecode

Started by
31 comments, last by Juliean 3 years, 10 months ago

I'm not sure what sort of scripting architecture, scripting features and uses of scripting you are aiming for.

  • Open-ended, high level scripts (like writing a game in Python, using your game engine and arbitrary other libraries as extension modules) or specific purpose “small” scripts (like scripting particle systems by describing the behaviour of every particle and particle source type with something like BulletML)?
    Both cases seem to imply exposing to scripts your game engine with a carefully designed API (respectively a neat and comprehensive interface for everything or small “windows” of restricted primitives), with no need to add arbitrary functions particularly often.
  • To what extent can your scripts be processed and compiled ahead of time? Generating C or C++ code that can be compiled and amalgamated into your game (possibly as a DLL to allow loading it more dynamically) would be both simple and very high performance: there need to be specific reasons (for example, loading scripts from mods and levels) to justify slower and more complex compromises.
  • As compromises go, turning a visual scripting graph into bytecode and bytecode into machine code is potentially high performance (not too much if you cut corners) but at the masochistic end of the complexity and difficulty spectrum.
    What benefits of decoupling the front end and back end of your scripting system with a bytecode representation apply to your needs? Is there a cheaper and simpler way to obtain those benefits?
    For example, a large and sophisticated node graph seems a good fit for generating code in an established high level language, and leveraging infrastructure for that language: C or C++ (or D, Rust, Go…) for direct embedding as suggested above, Lua or Javascript or other languages with ready made high performance embedded interpreters (probably with efficient JIT compilation), GPU shaders (maybe in part).

Omae Wa Mou Shindeiru

LorenzoGatti said:
Open-ended, high level scripts (like writing a game in Python, using your game engine and arbitrary other libraries as extension modules) or specific purpose “small” scripts (like scripting particle systems by describing the behaviour of every particle and particle source type with something like BulletML)? Both cases seem to imply exposing to scripts your game engine with a carefully designed API (respectively a neat and comprehensive interface for everything or small “windows” of restricted primitives), with no need to add arbitrary functions particularly often.

In short, I'm pretty much going the blueprint route, with even a bit more focus on using the visual scripts for gameplay. The C++ side of the engine/plugin architecture is mainly used to represent everything “generic”, in a way. I'm lacking the proper word for it, but think of it like this: everything related to implementing, say, the rendering, collision detection, or entity/game-object representation is done in C++, while everything related to gameplay (player/enemy behaviour, cutscenes, spells, …) is done in my visual scripting. So I do need a good bit of the engine exposed, and as I said, I've already been through having to manually write wrappers and really felt how it impaired productivity (even when using reflection, e.g. to expose all component properties without having to define them myself).

LorenzoGatti said:
To what extent can your scripts be processed and compiled ahead of time? Generating C or C++ code that can be compiled and amalgamated into your game (possibly as a DLL to allow loading it more dynamically) would be both simple and very high performance: there need to be specific reasons (for example, loading scripts from mods and levels) to justify slower and more complex compromises.

While it would be totally OK to precompile all scripts for builds, I want quick iteration times while working in the editor. Let's just say that my own “compiler” so far takes about 0.1s, and I want to keep it that way. So if I can't use a C++ backend for the editor, I'd need another backend, and I don't want to deal with maintaining several of them. Thus, I'm searching for a middle ground, which I believe can be achieved with a bytecode-compiled language (I've already made some great progress since yesterday and got basic stuff running, so I think that'll work). There are a few other reasons why I disregarded direct low-level code generation; mainly that my language is designed to naturally support coroutine-like stopping/resuming with concurrently running code, so having to express this in C++ would make the code generation itself rather complicated, I believe.
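To illustrate why coroutine-like stopping/resuming tends to come naturally with an interpreter: if all execution state (instruction pointer and value stack) lives in a plain object instead of on the C++ call stack, suspending is just returning from the interpreter loop. The following is a minimal sketch under that assumption, not the engine's actual design; `Op`, `Instr`, `Script`, `makeScript` and `demoInterleaved` are hypothetical names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Op { Push, Add, Yield, End };

struct Instr { Op op; int operand; };

// Hypothetical script instance: everything needed to continue later is stored here.
struct Script {
    std::vector<Instr> code;
    std::vector<int> stack;
    std::size_t ip = 0;
    bool finished = false;

    // Runs until the next Yield or End; calling it again continues after the Yield.
    void resume() {
        while (!finished) {
            const Instr instr = code[ip++];
            switch (instr.op) {
            case Op::Push: stack.push_back(instr.operand); break;
            case Op::Add: {
                const int rhs = stack.back(); stack.pop_back();
                stack.back() += rhs;
                break;
            }
            case Op::Yield: return;               // suspend: state stays in *this
            case Op::End: finished = true; break;
            }
        }
    }
};

// Builds "push first; yield; push second; add; end".
Script makeScript(int first, int second) {
    Script s;
    s.code = {{Op::Push, first}, {Op::Yield, 0}, {Op::Push, second}, {Op::Add, 0}, {Op::End, 0}};
    return s;
}

int demoInterleaved() {
    Script a = makeScript(1, 2);
    Script b = makeScript(10, 20);
    a.resume(); b.resume();   // both run up to their Yield
    a.resume(); b.resume();   // both continue where they stopped
    return a.stack.back() + b.stack.back();
}
```

Expressing the same suspend points in generated C++ would need either a state machine per script or language-level coroutines, which is presumably the complication alluded to above.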

LorenzoGatti said:
As compromises go, turning a visual scripting graph into bytecode and bytecode into machine code is potentially high performance (not too much if you cut corners) but at the masochistic end of the complexity and difficulty spectrum.

Uh, I don't mind that. The whole scope of my engine is on the larger end of the spectrum anyway, so it doesn't bother me having to do something that's also difficult.
And you know, with what I got done since yesterday I'm even more confident that it's going to work out. With maybe 8 hours invested, I have a basic compiler + bytecode interpreter, hooked up to my current graph + type system, with basic integer arithmetic and local variables working. There are going to be some things that I'll have to wrap my head around, but it's all going in a direction I very much like. And truthfully, so far the code is way less complicated than what I had before.

LorenzoGatti said:
What benefits of decoupling the front end and back end of your scripting system with a bytecode representation apply to your needs? Is there a cheaper and simpler way to obtain those benefits? For example, a large and sophisticated node graph seems a good fit for generating code in an established high level language, and leveraging infrastructure for that language: C or C++ (or D, Rust, Go…) for direct embedding as suggested above, Lua or Javascript or other languages with ready made high performance embedded interpreters (probably with efficient JIT compilation), GPU shaders (maybe in part).

I'm not gonna use different languages. It's really only a subjective thing; I'm not using any external libraries for the engine (except for the bare basics of DX/WinAPI). There's no reason that's grounded in productivity, it's just how I like to do things.
I could do C++ generation, but mainly for the reasons listed above I'm not going to.

(Don't get me wrong, I really appreciate the suggestions. But for one reason or another, going that way is what I wanna be doing.)

Considering how complex your project is becoming, I would second @lorenzogatti's suggestion. Consider that you might be going beyond the goal of something useful as a personal project. Keep in mind that UnrealScript, as far as I understand, is no longer a thing; it seems the middle ground isn't good enough.

I took the chance to check how I did it, and considering it's about 10 years old and predates the introduction of lambdas (as far as I can remember), I doubt sharing its details would be useful. Still, I will try to distill it a bit in case it sparks an idea.

My scripting was based on walking an AST. Every time you wrote `{ … }` (in script) you would create an `InstructionBlock`. By default, each instance of this class was a list of VM instructions. This was the standard implementation. For calls to native C++ I just (remember, it's before lambdas) defined a new class whose Execution call was implemented as marshalling code for the C++ function. I did this for several dozen calls.
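A rough reconstruction of that class-per-native-call idea might look like the sketch below. The original code was not posted, so every name here (`Instruction`, `PushConstant`, `CallHypot`, `ValueStack`, `demoHypot`) is an assumption, and `std::unique_ptr` is used only for brevity.

```cpp
#include <cassert>
#include <cmath>
#include <memory>
#include <vector>

using ValueStack = std::vector<double>;  // hypothetical VM value stack

// Hypothetical base class: one unit of work executed by the VM.
struct Instruction {
    virtual ~Instruction() = default;
    virtual void Execute(ValueStack& stack) = 0;
};

// An ordinary VM instruction.
struct PushConstant : Instruction {
    explicit PushConstant(double v) : value(v) {}
    void Execute(ValueStack& stack) override { stack.push_back(value); }
    double value;
};

// Hand-written marshalling class for exactly one native function (std::hypot).
struct CallHypot : Instruction {
    void Execute(ValueStack& stack) override {
        const double y = stack.back(); stack.pop_back();  // last argument is on top
        const double x = stack.back(); stack.pop_back();
        stack.push_back(std::hypot(x, y));
    }
};

double demoHypot() {
    // Plays the role of an InstructionBlock: a list of instructions run in order.
    std::vector<std::unique_ptr<Instruction>> block;
    block.push_back(std::make_unique<PushConstant>(3.0));
    block.push_back(std::make_unique<PushConstant>(4.0));
    block.push_back(std::make_unique<CallHypot>());

    ValueStack stack;
    for (auto& instruction : block) instruction->Execute(stack);
    return stack.back();
}
```

The cost of this approach is one hand-written class per exposed function, which is the repetition that template-based binding tries to remove.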

In modern C++ I guess the special implementations would be lambdas, then writing a marshalling function which could force the type match at compile time. It wouldn't even have to be parameter-pack driven, for 6-8 parameters it could be possible to just write the variations by hand. The result of this function call would be the table entry for your compiler/interpreter.

I'd like to see how this could be done today. What is the specific issue stopping you? Your posts seem to describe the desiderata more than the roadblocks you are encountering.

Previously "Krohm"

Krohm said:
Considering how complex your project is becoming, I would second @lorenzogatti's suggestion. Consider that you might be going beyond the goal of something useful as a personal project. Keep in mind that UnrealScript, as far as I understand, is no longer a thing; it seems the middle ground isn't good enough.

I mean, this is kind of hard to argue based on just what I can tell you without showing details (I could; I keep considering picking my blog back up, but I'm too lazy/far too busy with developing)… but it's actually the other way around. Since I started this project about 8 years ago, for most of the time it has been something way too complex to ever be useful. While I still enjoyed working on my own engine simply because it's fun for me, I always knew that it was not really productive. But ever since I made some drastic changes a few years ago (which, granted, took a long time in themselves), it has actually become quite manageable and useful. There are occasional breaks when I “have to” do stuff like this, but I'm really liking the way the project is going and don't think it's growing out of scope or anything.

EDIT: Or, in simpler words, adding this bytecode “language” is not what's going to make the project too complex, trust me :D

Krohm said:
My scripting was based on walking an AST. Every time you wrote `{ … }` (in script) you would create an `InstructionBlock`. Each instance of this class by default was a list of VM instructions. This was the standard implementation. For calls to native C++ I just (remember it's before lambdas) defined a new class with its Execution call implemented as marshalling code to the C++ function. I did this for several dozens of calls.

In modern C++ I guess the special implementations would be lambdas, then writing a marshalling function which could force the type match at compile time. It wouldn't even have to be parameter-pack driven, for 6-8 parameters it could be possible to just write the variations by hand. The result of this function call would be the table entry for your compiler/interpreter.

Well, in modern C++ this is actually very, very easy thanks to (variadic) templates and fold expressions. You can pretty much write a function which does this:

template<typename Functor, typename ReturnT, typename... Args>
void callNative(Functor f, Stack& stack)
{
	// Caution: C++ does not specify the order in which the getArg calls are evaluated.
	if constexpr (!std::is_void_v<ReturnT>)
		stack.Push(f(getArg<Args>(stack)...)); // push the native function's result
	else
		f(getArg<Args>(stack)...);
}

You just have to use a few more templates to get the function's return/arg packs and also have to store the function pointer (yeah, lambdas do help there), but in essence that's what I have already been doing. (The real code is a lot more complicated, as I allow introducing new types for the function signature that need not even be recognized by the type system, e.g. being able to use “Sprite&”, which will act as an entity in the graph but automatically fetch the component so I don't have to do that in the function. What can I say, I love automation and hate writing repetitive code :D )
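One way the “few more templates” could look is sketched below. It is not the engine's real binding code: `Stack`, `getArg`, `Binder`, `bindNative` and `NativeWrapper` are hypothetical, the stack only holds ints, and the sketch assumes the compiler pushes arguments right to left so that the first parameter ends up on top. Building the arguments with a braced initializer makes the pop order well defined, because the elements of a braced initializer list are evaluated left to right.

```cpp
#include <cassert>
#include <tuple>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical minimal value stack (ints only, to keep the sketch short).
struct Stack {
    std::vector<int> values;
    void Push(int v) { values.push_back(v); }
    int Pop() { int v = values.back(); values.pop_back(); return v; }
};

// Hypothetical getArg: pops one value and converts it to the parameter type.
template<typename T>
T getArg(Stack& stack) { return static_cast<T>(stack.Pop()); }

// Uniform signature that the VM's function table can store.
using NativeWrapper = void (*)(Stack&);

// The primary template is only declared; the specialization below splits the
// function pointer type into its return type and parameter pack.
template<auto Function, typename Signature = decltype(Function)>
struct Binder;

template<auto Function, typename ReturnT, typename... Args>
struct Binder<Function, ReturnT (*)(Args...)> {
    static void call(Stack& stack) {
        // Braced initialization evaluates getArg left to right.
        std::tuple<Args...> args{getArg<Args>(stack)...};
        if constexpr (std::is_void_v<ReturnT>)
            std::apply(Function, std::move(args));
        else
            stack.Push(std::apply(Function, std::move(args)));
    }
};

// Table entry for a given native function.
template<auto Function>
constexpr NativeWrapper bindNative = &Binder<Function>::call;

int subtract(int a, int b) { return a - b; }

int demoSubtract() {
    Stack stack;
    stack.Push(3);    // second argument is pushed first
    stack.Push(10);   // first argument ends up on top
    bindNative<&subtract>(stack);
    return stack.Pop();
}
```

Because the function is a template argument rather than a stored pointer, each wrapper is a plain `void (*)(Stack&)`, so the table needs no per-entry state.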

Krohm said:
I'd like to see how this could be done today. What is the specific issue stopping you? Your posts seem to describe mostly the desiderata than the roadblocks you are encountering.

There's nothing stopping me from doing that; I'm already in the process. My question was originally aimed at getting rid of this extra step, i.e. how I would manage to call the function directly without any wrapper. On that question, after @frob's answer, I'm still not sure how I would manage to set up the stack/registers for the function call while executing my own code to perform the setup.
But I'm going to stick with the wrapper approach for now, as it means way less work and only requires minimal modification of my function-binding approach. So at this point we are probably just having a friendly discussion/exchange of ideas about the topic rather than tackling a specific road bump I'm facing, which is fine by me.

Juliean said:
My question originally aimed at getting rid of this extra-step and in how I would manage to call the function directly without any wrapper.

You could maybe look at what Python's `ctypes` module does to call a native DLL/so function. It's not a bytecode feature, so last I checked it is a lot slower than a C extension taking the PyObject* array/tuple and dealing with that itself, but there may be something to learn.

Presumably they have a way in there to set up the stack such that, once done with all the arguments, it is ready to just do a CALL. I'd guess some manual stack manipulation, which would need consideration for C++ exceptions, destructors, etc.

Do you have a common base type like PyObject or VALUE?

Not sure any interpreter I have used does this as its primary way to call native functions, though. Looking up Lua and JS (napi), they do something similar as well with `lua_State` and `napi_callback_info`, basically relying on the native compiled code to unpack some sort of array-like argument into what it wants.

SyncViews said:
You could maybe look at what Python's `ctypes` module does to call a native DLL/so function. It's not a bytecode feature, so last I checked it is a lot slower than a C extension taking the PyObject* array/tuple and dealing with that itself, but there may be something to learn. Presumably they have a way in there to set up the stack such that, once done with all the arguments, it is ready to just do a CALL. I'd guess some manual stack manipulation, which would need consideration for C++ exceptions, destructors, etc.

Thanks, I'll have a look at it.

SyncViews said:
Do you have a common base type like PyObject or VALUE?

Somewhat. I have a class called “Variable” which can hold any value known to the type system, with the caveat that it's expensive when used in conjunction with arrays/strings/objects, as it has to allocate/refcount those, so I'd rather not use it during interpreting. I'm currently using it to dictate what bytecode gets generated, though. I do have a lighter variant called “VariableView” (it's what string_view is to string), but obviously this requires the value to be stored somewhere else, so at that point I might as well use the value directly from the stack.

My idea for handling this is to go for a cdecl-style approach where I push all function arguments on the stack with a default convention (by value for most primitives, by reference for everything costly) and then have my wrapper pop them off in inverse order. This doesn't even require me to have/use a base/wrapper class for the values.
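A minimal sketch of such a convention, assuming a raw byte stack and arguments pushed left to right: `ByteStack`, `repeatedLength`, `repeatedLengthWrapper` and `demoRepeatedLength` are made-up names for illustration, and the real stack layout (alignment, ownership of referenced values) is not described in the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical untyped stack: values are stored as raw bytes, with no wrapper class.
class ByteStack {
public:
    template<typename T>
    void Push(const T& value) {
        static_assert(std::is_trivially_copyable_v<T>, "only raw-copyable values");
        const std::size_t offset = bytes.size();
        bytes.resize(offset + sizeof(T));
        std::memcpy(bytes.data() + offset, &value, sizeof(T));
    }

    template<typename T>
    T Pop() {
        static_assert(std::is_trivially_copyable_v<T>, "only raw-copyable values");
        assert(bytes.size() >= sizeof(T));
        T value;
        std::memcpy(&value, bytes.data() + bytes.size() - sizeof(T), sizeof(T));
        bytes.resize(bytes.size() - sizeof(T));
        return value;
    }

private:
    std::vector<unsigned char> bytes;
};

// Native function to expose: cheap primitive by value, costly string by reference.
std::size_t repeatedLength(const std::string& text, int count) {
    return text.size() * static_cast<std::size_t>(count);
}

// Wrapper: arguments were pushed left to right, so they are popped in inverse order.
void repeatedLengthWrapper(ByteStack& stack) {
    const int count = stack.Pop<int>();
    const std::string* text = stack.Pop<const std::string*>();
    stack.Push(repeatedLength(*text, count));
}

std::size_t demoRepeatedLength() {
    const std::string text = "abc";  // owned elsewhere, e.g. by a script variable
    ByteStack stack;
    stack.Push(&text);  // by reference: only the address goes on the stack
    stack.Push(4);      // by value
    repeatedLengthWrapper(stack);
    return stack.Pop<std::size_t>();
}
```

The types pushed and popped must agree exactly, since nothing on the stack records them; in this scheme that agreement has to be guaranteed by the compiler that emits the bytecode.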

SyncViews said:
Not sure any interpreter I have used does this as its primary way to call native functions, though. Looking up Lua and JS (napi), they do something similar as well with `lua_State` and `napi_callback_info`, basically relying on the native compiled code to unpack some sort of array-like argument into what it wants.

Yeah, this ought to be good enough for me as well. I'm expecting drastic speedups anyway, as simply using a stack gets rid of so many dynamic allocations, indirections, lookups etc. And when I have everything up and running again, I can still try to do wrapper-less native calls later.

Juliean said:
You just have to use a few more templates to get the function's return/arg packs and also have to store the function pointer (yeah, lambdas do help there), but in essence that's what I have already been doing. (The real code is a lot more complicated, as I allow introducing new types for the function signature that need not even be recognized by the type system, e.g. being able to use “Sprite&”, which will act as an entity in the graph but automatically fetch the component so I don't have to do that in the function. What can I say, I love automation and hate writing repetitive code :D )

You might find it useful to think in terms of the “backing store” of your script object.

Ok, some more thinking. I think your `getArg` is pretty much what I would have developed as `Resolver` had the project kept going.

As an aside, what does the parameter pack look like in case you need to debug it?

Previously "Krohm"

Krohm said:
You might find it useful to think in terms of the “backing store” of your script object.

Could you elaborate a bit on what you mean by that? Not sure I follow.

Krohm said:
As an aside, what does the parameter pack look like in case you need to debug it?

Also not sure I fully understand the question. The template parameters are just a 1:1 mapping of the function signature, so a specialisation seen while debugging might look like this:

callNative<XXX, int, float, const Object*>(...); 

Is that what you meant?

Did you already consider using OpCodes instead of byte-code to run your scripting language, or even embedding the C# CLR into your engine instead of writing your own, maybe more limited, “CLR”, while also providing your users with a well-known language?

I once looked through the Lua runtime code on GitHub, and it looked very easy to create your own processor using OpCodes and a well-designed combination of switch and goto statements, along with some registers and a stack that mimic the behavior of a real CPU core. Sure, its speed might still be limited compared to real machine code, but it is fast up to a point, and you could design your “compiler” to optimise calls on the OpCode level. Another pro feature would be to allow users to write “Assembly” in the same way C# does.

Shaarigan said:
Did you already consider using OpCodes instead of byte-code to run your scripting language, or even embedding the C# CLR into your engine instead of writing your own, maybe more limited, “CLR”, while also providing your users with a well-known language?

Hold on, now. That's already what I'm doing (= writing my own OpCodes), so now I'm confused. I assumed that that's what bytecode is/does. I have my own set of OpCodes which are run inside a loop via a switch in some sort of VM/interpreter. Did I use the wrong term for that? Everything I found when searching for bytecode suggested that this is what it is. So what I'm doing is something like this:

enum class OpCodes : uint8_t
{
	PushIntConst,
	AddInt
};

// inside the VM
const auto opCode = bytecodeStream.Read<OpCodes>();
switch (opCode)
{
case OpCodes::PushIntConst:
{
	const auto constant = bytecodeStream.Read<int>();
	stack.Push(constant);
	break;
}
case OpCodes::AddInt:
{
	const auto value1 = stack.Pop<int>();
	const auto value2 = stack.Pop<int>();
	
	stack.Push(value1 + value2);
	break;
}
}
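For readers who want to run the idea end to end, here is a self-contained variant of the loop above. The `emit` and `read` helpers, the `Halt` opcode and the `demoAdd` function are additions made up for this sketch; the thread does not show how `bytecodeStream` or `stack` are actually implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

enum class OpCode : std::uint8_t { PushIntConst, AddInt, Halt };

// Hypothetical helper: appends an opcode or operand to the bytecode as raw bytes.
template<typename T>
void emit(std::vector<std::uint8_t>& code, T value) {
    const std::size_t offset = code.size();
    code.resize(offset + sizeof(T));
    std::memcpy(code.data() + offset, &value, sizeof(T));
}

// Hypothetical helper: reads a T at the instruction pointer and advances it.
template<typename T>
T read(const std::vector<std::uint8_t>& code, std::size_t& ip) {
    T value;
    std::memcpy(&value, code.data() + ip, sizeof(T));
    ip += sizeof(T);
    return value;
}

// Interprets the bytecode until Halt and returns the value on top of the stack.
int run(const std::vector<std::uint8_t>& code) {
    std::vector<int> stack;
    std::size_t ip = 0;
    while (true) {
        switch (read<OpCode>(code, ip)) {
        case OpCode::PushIntConst:
            stack.push_back(read<int>(code, ip));  // the operand follows the opcode
            break;
        case OpCode::AddInt: {
            const int rhs = stack.back(); stack.pop_back();
            const int lhs = stack.back(); stack.pop_back();
            stack.push_back(lhs + rhs);
            break;
        }
        case OpCode::Halt:
            return stack.back();
        }
    }
}

int demoAdd() {
    std::vector<std::uint8_t> code;
    emit(code, OpCode::PushIntConst); emit(code, 2);
    emit(code, OpCode::PushIntConst); emit(code, 40);
    emit(code, OpCode::AddInt);
    emit(code, OpCode::Halt);
    return run(code);
}
```

Operands are copied with `std::memcpy` rather than read through a cast pointer, which avoids alignment problems when an `int` follows a one-byte opcode.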

Shaarigan said:
or even embedding the C# CLR into your engine instead of writing your own, maybe more limited, “CLR”, while also providing your users with a well-known language?

I considered that, but again, for a few reasons I'm not going to use external code in this project. Aside from a few technical points that I assume would be more difficult if I didn't have 100% control over execution, it's mostly just for the fact that “I don't use external code in this project” :D For no particular reason other than that I like having full control and enjoy learning/writing all aspects myself.

Shaarigan said:
Sure, its speed might still be limited compared to real machine code, but it is fast up to a point, and you could design your “compiler” to optimise calls on the OpCode level.

Yeah, I'm already getting good optimization opportunities even on the level of control flow (discarding unused returns, scoping local variables). And again, I'm already getting such a huge improvement in performance over my old system that it doesn't really matter that it's still slower than native code (at which point I could turn to JIT if I really needed to).

Shaarigan said:
Another pro feature would be to allow users to write “Assembly” in the same way C# does

Yes, I also consider this a good addition. The current backend offered the plugins/users a lot of control over the flow of execution, so I definitely wanna keep that (only now on the compilation level rather than at runtime).

This topic is closed to new replies.