Each frame in a Game Boy emulator I’m making, I call a switch on the next opcode, which then calls some functions. The problem is that there are 256 different opcodes, so there are 256 cases in my switch. Is it optimized? If not, is there a more optimized (and somewhat readable) solution? Thx!
Switch cases may or may not be optimized depending on the cases. Switch cases consisting of mostly sequential integers are usually turned into a jump table. Check with a decompiler if you want to be sure.
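As a rough illustration of the "mostly sequential integers" shape, here is a minimal sketch. The Cpu class, its registers and the three opcode meanings are invented for the example and are not real Game Boy encodings; whether a jump table is actually produced depends on the compiler and runtime, so it is worth checking the generated code.

```csharp
using System;

// Hypothetical, heavily trimmed CPU: names and opcode meanings are made up for illustration.
public sealed class Cpu
{
    public byte A, B;
    public ushort PC;

    // Dense, sequential case labels (0x00, 0x01, 0x02, ...) are the shape
    // that is usually lowered to a jump table once there are enough cases.
    public void Execute(byte opcode)
    {
        switch (opcode)
        {
            case 0x00: Nop(); break;
            case 0x01: IncA(); break;
            case 0x02: LoadBFromA(); break;
            // ... the remaining cases would follow the same pattern
            default:
                throw new NotSupportedException($"Unimplemented opcode 0x{opcode:X2}");
        }
    }

    private void Nop() { PC++; }
    private void IncA() { A++; PC++; }
    private void LoadBFromA() { B = A; PC++; }
}

public static class Program
{
    public static void Main()
    {
        var cpu = new Cpu();
        cpu.Execute(0x01);
        cpu.Execute(0x02);
        Console.WriteLine($"A={cpu.A} B={cpu.B} PC={cpu.PC}"); // prints "A=1 B=1 PC=2"
    }
}
```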
It sounds pretty optimized for speed.
Maybe… you could do a Dictionary where you have a key-value pair that is Opcode – Delegate then that delegate goes out to whatever functions you need to run? That might be prettier and more readable but I think it would be slower in execution.
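A minimal sketch of that Opcode – Delegate idea, assuming a hypothetical Cpu class with two made-up handlers (none of these names come from the original poster's code):

```csharp
using System;
using System.Collections.Generic;

// Hypothetical CPU with an opcode -> handler table; names are for illustration only.
public sealed class Cpu
{
    public byte A;
    private readonly Dictionary<byte, Action> handlers;

    public Cpu()
    {
        // Each entry maps one opcode to the method that implements it.
        handlers = new Dictionary<byte, Action>
        {
            [0x00] = Nop,
            [0x01] = IncA,
        };
    }

    public void Execute(byte opcode)
    {
        // TryGetValue avoids a KeyNotFoundException and lets us report the opcode.
        if (handlers.TryGetValue(opcode, out var handler))
            handler();
        else
            throw new NotSupportedException($"Unimplemented opcode 0x{opcode:X2}");
    }

    private void Nop() { }
    private void IncA() { A++; }
}

public static class Program
{
    public static void Main()
    {
        var cpu = new Cpu();
        cpu.Execute(0x01);
        Console.WriteLine(cpu.A); // prints 1
    }
}
```

Each dispatch here costs a hash lookup plus a delegate call, so it is worth measuring before assuming it beats or loses to the switch.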
“it would be slower in execution.”
Not necessarily. Dictionary uses a hash table, so its lookup time is O(1) on average, although each lookup still pays for hashing the key and invoking a delegate.
Switch, on the other hand, can be optimized by the compiler in a number of ways. IIRC, it gets compiled to one of:
an if/else chain
a jump table
a binary search over the case values
an actual Dictionary instance (for string cases in older compilers)
You can see the actual optimization with tools like dotPeek.
The only way to know is to measure. Look up BenchmarkDotNet and run some experiments. There likely isn’t a much better-performing option, but you can try things and see. Results will likely depend on the particular compiler/runtime version, the platform, and the surrounding code/context. So another poster saying “xyz is definitely faster” may be wrong in your particular case.
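A minimal sketch of such an experiment, assuming the BenchmarkDotNet NuGet package is referenced and the project is built in Release mode. The class name, the four toy opcodes and their handlers are all hypothetical; a real test would use the emulator's own handlers and a realistic opcode mix.

```csharp
using System;
using System.Collections.Generic;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

// Hypothetical benchmark comparing two dispatch styles over the same opcode stream.
public class DispatchBenchmarks
{
    private byte[] program = Array.Empty<byte>();
    private Dictionary<byte, Action> handlers = new Dictionary<byte, Action>();
    private int counter;

    [GlobalSetup]
    public void Setup()
    {
        // Fixed seed so every run sees the same pseudo-random opcode stream.
        var rng = new Random(42);
        program = new byte[4096];
        rng.NextBytes(program);
        for (int i = 0; i < program.Length; i++)
            program[i] &= 0x03; // restrict to the four toy opcodes 0..3

        handlers = new Dictionary<byte, Action>
        {
            [0] = Op0, [1] = Op1, [2] = Op2, [3] = Op3,
        };
    }

    [Benchmark(Baseline = true)]
    public int SwitchDispatch()
    {
        counter = 0;
        foreach (byte op in program)
        {
            switch (op)
            {
                case 0: Op0(); break;
                case 1: Op1(); break;
                case 2: Op2(); break;
                case 3: Op3(); break;
            }
        }
        return counter; // returned so the work cannot be optimized away
    }

    [Benchmark]
    public int DictionaryDispatch()
    {
        counter = 0;
        foreach (byte op in program)
            handlers[op]();
        return counter;
    }

    private void Op0() { counter += 1; }
    private void Op1() { counter += 2; }
    private void Op2() { counter ^= 3; }
    private void Op3() { counter -= 1; }
}

public static class Program
{
    public static void Main() => BenchmarkRunner.Run<DispatchBenchmarks>();
}
```

The handlers here are tiny, so the JIT may inline them in the switch version but not behind delegates; that is part of what the measurement would reveal, not a flaw to hide.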
I’ve seen a C# GB emulator that has a switch for the opcode decoder and it works really fast.
I would assume you have a processing method for each opcode.
So (let’s pretend the opcode is an int):
Dictionary<int, Action> would work. To make things easier, you can create a custom attribute that marks which method handles which opcode, then scan your class and populate the dictionary from it.
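A minimal sketch of the attribute-scanning approach, assuming a hypothetical OpcodeAttribute and Cpu class (both names, and the two handlers, are invented for the example). Reflection runs once in the constructor; after that, dispatch is a plain dictionary lookup.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical attribute that tags a method with the opcode it handles.
[AttributeUsage(AttributeTargets.Method, AllowMultiple = true)]
public sealed class OpcodeAttribute : Attribute
{
    public int Code { get; }
    public OpcodeAttribute(int code) => Code = code;
}

public sealed class Cpu
{
    public int A;
    private readonly Dictionary<int, Action> handlers = new Dictionary<int, Action>();

    public Cpu()
    {
        const BindingFlags flags =
            BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic;

        // Scan this class once and register every method tagged with [Opcode].
        foreach (MethodInfo method in typeof(Cpu).GetMethods(flags))
        {
            foreach (OpcodeAttribute attr in method.GetCustomAttributes<OpcodeAttribute>())
            {
                // Add throws on a duplicate key, which surfaces two methods claiming one opcode.
                handlers.Add(attr.Code, (Action)method.CreateDelegate(typeof(Action), this));
            }
        }
    }

    public void Execute(int opcode) => handlers[opcode]();

    [Opcode(0x00)] private void Nop() { }
    [Opcode(0x01)] private void IncA() { A++; }
}

public static class Program
{
    public static void Main()
    {
        var cpu = new Cpu();
        cpu.Execute(0x01);
        Console.WriteLine(cpu.A); // prints 1
    }
}
```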
One of the recommended methods to optimise something like this is to use a Dictionary of Funcs (if you want to return a value).
For example, I’ve created an auto type converter. It has the private field shown below:
private IDictionary<Type, Func<string, object>> typeConverters;
Here the key is the type and the value is the Func.
In the constructor of the class, initialise the Dictionary:
typeConverters = new Dictionary<Type, Func<string, object>>
{
    [typeof(int)] = (string value) => ConvertToInt(value)
};
Which points to
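private int ConvertToInt(string value)
{
    // int.TryParse does not throw on bad input, so no try/catch is needed.
    // On failure it returns false and we fall back to default (0).
    return int.TryParse(value, out int result) ? result : default;
}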
This way I do away with the switch statement completely, but get the same effect by looking up the converter in the Dictionary and invoking it, as below:
object result = typeConverters[typeof(int)]("42");
Hope this is useful.
Two important questions:
Are the opcodes sequential?
Do the calls have many (>2) parameters?
If the opcodes are sequential, the JIT emits very efficient code, consisting of one load, one add and one indirect jump, and you cannot get much better than that.
As for the second question, the nature of RyuJIT’s register allocator means that this pattern of switch+calls has some overhead: one load, one store and one reg-to-reg move (these are mainly a problem in terms of code size) per parameter. See https://github.com/dotnet/runtime/issues/46391.