Motivation
I am trying to create my own math library in C# (.NET Core 3.1), which of course includes vectors. The main reason is that the Matrix structs in the System.Numerics namespace are heavily biased towards computer graphics, and they don’t seem to benefit from the same hardware optimizations as System.Numerics.VectorX. I could of course just implement my own Matrix structs and use them together with the vectors from System.Numerics, but then I get an ugly mix, and I prefer to implement everything I need in my own namespace.
Issue
However, even though I am using the SIMD intrinsics (SSE4.1), I seem unable to implement a Vector type that has the same performance as the Vector from the System.Numerics namespace.
I ended up with the following implementation for my own Vector2 struct:
Unfortunately, whatever I have tried so far ends up being much slower than the Vector2 struct from System.Numerics. I know that the vectors from System.Numerics are vectorized directly by the CLR, but I would have expected the same for the .NET intrinsics, especially since I marked the method as “inlined”. Looking at the actual machine code it generates (not the IL), it turns out that the method does not get inlined. The extra function call makes it much slower than the System.Numerics vector, which does get inlined.
Is there something wrong with my implementation or is this just a limitation of the CLR?
You can find a working sample here: https://pastebin.com/tT4gP0BG
So first, you don’t want to lay things out so that you store X and Y individually and also store them in a Vector128. That doubles the memory each instance uses, which wastes time on creation and also eats memory bandwidth when iterating over collections of these things.
You also don’t want to store a Vector2 in just half of an SSE SIMD register. But not to worry, this is the most common mistake when starting to work with SIMD! You want to vectorize over multiple instances of 2-dimensional vectors, so that each register holds the same component of several vectors.
I go over almost exactly your use case here:
https://www.youtube.com/watch?v=8RcjQPbvvRU
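To make "vectorize over multiple instances" concrete, here is a minimal sketch of a structure-of-arrays layout, not the code from the talk or from the pastebin. The class name Vector2Batch and the parameter names are made up for illustration; the only real APIs used are MemoryMarshal.Cast and the Sse.Add / Sse.Multiply intrinsics. It assumes all X components live in one array and all Y components in another, so each SSE register processes four dot products at once.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.X86;

// Hypothetical helper: a batch of 2D vectors stored as separate X and Y arrays.
public static class Vector2Batch
{
    // Computes result[i] = aX[i] * bX[i] + aY[i] * bY[i] for every i.
    public static void Dot(ReadOnlySpan<float> aX, ReadOnlySpan<float> aY,
                           ReadOnlySpan<float> bX, ReadOnlySpan<float> bY,
                           Span<float> result)
    {
        int n = result.Length;
        if (aX.Length != n || aY.Length != n || bX.Length != n || bY.Length != n)
            throw new ArgumentException("All spans must have the same length.");

        int i = 0;
        if (Sse.IsSupported)
        {
            // Reinterpret each float span as a span of 4-wide vectors (no copy).
            // Leftover elements that do not fill a whole vector are not included.
            var vaX = MemoryMarshal.Cast<float, Vector128<float>>(aX);
            var vaY = MemoryMarshal.Cast<float, Vector128<float>>(aY);
            var vbX = MemoryMarshal.Cast<float, Vector128<float>>(bX);
            var vbY = MemoryMarshal.Cast<float, Vector128<float>>(bY);
            var vr = MemoryMarshal.Cast<float, Vector128<float>>(result);

            for (int v = 0; v < vr.Length; v++)
            {
                // Four dot products per iteration, all lanes doing useful work.
                vr[v] = Sse.Add(Sse.Multiply(vaX[v], vbX[v]),
                                Sse.Multiply(vaY[v], vbY[v]));
            }
            i = vr.Length * 4;
        }

        // Scalar loop for the remainder (or for everything if SSE is unavailable).
        for (; i < n; i++)
            result[i] = aX[i] * bX[i] + aY[i] * bY[i];
    }
}
```

With this layout no lane is wasted and nothing is stored twice. Whether it beats System.Numerics.Vector2 for a given workload is something to measure rather than assume.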
Thank you very much. Looking into the video. In general, there is barely anything to be found on the topic.
Just finished the video. Great talk! It’s still somewhat cumbersome, though, that C# basically does not allow one to encapsulate intrinsic operations in more readable methods that then get inlined. For whatever reason, though, Microsoft is doing exactly this in an implementation of a ray tracer using intrinsics: https://github.com/dotnet/runtime/blob/master/src/tests/JIT/Performance/CodeQuality/HWIntrinsic/X86/PacketTracer/VectorMath.cs
However, if I do something similar, it still ends up as a function call instead of being inlined.
I guess I have to live with it.
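For readers following along, this is roughly the kind of wrapper being discussed: a small static helper that hides the intrinsics behind a readable name and asks the JIT to inline it. It is only a sketch. SimdMath and its methods are hypothetical names, not taken from the linked VectorMath.cs or from the pastebin, and MethodImplOptions.AggressiveInlining is a request to the JIT rather than a guarantee.

```csharp
using System.Runtime.CompilerServices;
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.X86;

// Hypothetical helper class; callers are expected to check Sse.IsSupported first.
public static class SimdMath
{
    // Four 2D dot products at once: lane i holds aX[i] * bX[i] + aY[i] * bY[i].
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static Vector128<float> Dot(Vector128<float> aX, Vector128<float> aY,
                                       Vector128<float> bX, Vector128<float> bY)
        => Sse.Add(Sse.Multiply(aX, bX), Sse.Multiply(aY, bY));

    // Squared length of four 2D vectors, built on top of Dot.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static Vector128<float> LengthSquared(Vector128<float> x, Vector128<float> y)
        => Dot(x, y, x, y);
}
```

These helpers take and return Vector128<float> directly instead of going through a wrapper struct. That is a design choice for the sketch, not a claim about why the method in the original sample is not inlined.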
How exactly did you determine that the method wasn’t inlined?
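Decompiling the output DLL should work, although that only shows the IL. Inlining is decided by the JIT, so the generated machine code is what ultimately tells you.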
Slightly off-topic, but I am a beginner in C# and I get overwhelmed by the level of knowledge in the questions that get asked here. I am a junior developer creating MVC websites and APIs, but that’s all googled. How do I get good like the OP, who wants to create his own libraries because he knows the limitations of the built-in ones? Any tips? How do I progress?
Hey, we would welcome you on the C# Discord (#lowlevel channel, aka.ms/csharp-discord), where there are people who are proficient with these things :).
Regarding your sample: in this static function, everything gets inlined as expected: public static float Dot(SrVector a, SrVector b) => SrVector.Dot(a, b).