I’m a CS student studying for the C# developer exam and I’m wondering how much LINQ is used in actual coding, considering its overhead. I love LINQ and all its features; its filtering and data manipulation are awesome, but if I’m not mistaken it comes with a performance loss.

IIRC, I was developing an in-house program last year at work where I was reading some JSON data from an API and then filtering the deserialized data using LINQ. It took a very long time (hundreds of milliseconds), and considering the application’s use case, that was unacceptable. I ended up writing the logic with foreach loops instead and it was almost 3 times faster.
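A hypothetical sketch of the kind of change I mean (made-up record and field names, not my actual code):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

record Reading(string Sensor, double Value);

class Program
{
    static void Main()
    {
        var data = new List<Reading>
        {
            new("a", 10), new("b", 60), new("a", 75), new("c", 40)
        };

        // LINQ version: short and readable
        var hot = data.Where(r => r.Value > 50).ToList();

        // foreach version: more code, but no iterator/delegate overhead
        var hot2 = new List<Reading>();
        foreach (var r in data)
            if (r.Value > 50)
                hot2.Add(r);

        Console.WriteLine(hot.Count == hot2.Count); // True
    }
}
```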

I can only imagine that in the real world, performance is important. How do you get around this performance loss? Is PLINQ the replacement? Or am I mistaken on how much performance is prioritized? Thanks in advance.

I use LINQ all the time. I have never encountered a significant performance impact. But I don’t write a lot of performance-critical code either. I am fortunate enough to be able to prioritize code readability and maintainability over pure performance.

If I was in your situation I would have done it the same way: write it using LINQ first then if performance was unacceptable, replace the critical sections with looping code.

Thanks for answering, what do you work with if you don’t mind me asking?
Edit: Also, do you use Parallel Linq as well or mostly just plain Linq?
Linq is used constantly, everywhere, except in places where its small performance overhead can’t be tolerated: for example, in the main game loop of a game, or a really high-throughput endpoint of a website. I work on a very high-traffic web application, and we have linq all over the place.

There are a few spots where we have gotten rid of it though, either in high throughput endpoints or when we process large, simple collections to make caches out of them.

How much overhead linq adds depends a lot on the use case. Generally, with things like select/where clauses etc., you have a very small one-time allocation of memory, and then a per-iteration time cost. So say you have N elements in your collection: iterating over it with LINQ will cost a small bit of extra time per element. If all you are doing in each iteration is a little bit of math, then the overhead can make it something like 3 times slower, like you observed.

On the other hand if you are spending milliseconds doing something each iteration (maybe some IO, or bigger computation) then the linq overhead is basically irrelevant.

BenchmarkDotNet is a nice tool for reliably benchmarking differences; it’s good to play with for a while to build some intuition about when LINQ is slow and when it isn’t. Also keep in mind LINQ performance has gradually improved over time, so something that is slow on .NET 4.6 might be fast on .NET 5.
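A minimal sketch of what such a benchmark might look like (requires the BenchmarkDotNet NuGet package; the actual numbers will vary by runtime and hardware):

```csharp
using System.Linq;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

public class LinqVsLoop
{
    private readonly int[] _data = Enumerable.Range(0, 1_000_000).ToArray();

    [Benchmark]
    public int Linq() => _data.Where(n => n % 2 == 0).Sum();

    [Benchmark]
    public int Loop()
    {
        // Same work as above, written by hand: no iterator or delegate involved
        int sum = 0;
        foreach (var n in _data)
            if (n % 2 == 0) sum += n;
        return sum;
    }
}

public class Program
{
    // Prints a table of mean times and allocations for each [Benchmark] method
    public static void Main() => BenchmarkRunner.Run<LinqVsLoop>();
}
```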

there are libraries like LinqFaster, and others, which let you get some of the same convenience with less overhead.
Interesting, thanks for the detailed answer.
LinqFaster looks amazing. Thanks for the info!

It would be interesting to know the LINQ used. LINQ is pretty optimized. If you were doing something in the LINQ that ended up making duplicate calls that you somehow avoided by converting to a for loop, that might be the reason.

Why do you think LINQ is “pretty optimized”, and what do you think that means? There are lots of languages with features like LINQ: Java Streams, C++ and Rust iterators. All three of those are very well optimized; they compile down to the same thing you would get if you wrote a for loop by hand instead. C# doesn’t do that, for complex historical reasons. Fixing it would be a breaking change, so it may never happen. Improvements have been made for sure, but it’s not a cost-free abstraction or anything.

In my experience with C# (almost 6 years now), I use linq basically everywhere. Only once did I have a hot path so critical that I had to refactor linq code into loops with stackalloc/span magic (700 MB of allocations down to 4 MB, from 230 ms down to 4 ms, and most importantly, time spent in GC reduced from 43% to 0.3%). The difference was astronomical, but like I said, I had to do it once, and the code is abysmal: completely unreadable without comments and very detailed documentation (not to mention that I refactored 40 lines into 720).
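For a flavor of what that kind of rewrite looks like, here is a toy sketch (nothing like the real 720 lines): the LINQ version allocates an iterator and a delegate per call, while the span version allocates nothing on the heap at all.

```csharp
using System;
using System.Linq;

class Program
{
    // LINQ version: allocates a Where iterator + delegate on each call
    static int SumEvensLinq(int[] values) =>
        values.Where(v => v % 2 == 0).Sum();

    // Span version: zero heap allocations, works on any contiguous memory
    static int SumEvensSpan(ReadOnlySpan<int> values)
    {
        int sum = 0;
        foreach (var v in values)
            if (v % 2 == 0) sum += v;
        return sum;
    }

    static void Main()
    {
        int[] data = { 1, 2, 3, 4, 5, 6 };
        Console.WriteLine(SumEvensLinq(data)); // 12
        Console.WriteLine(SumEvensSpan(data)); // 12

        // stackalloc lets a hot path avoid array allocations entirely
        Span<int> tmp = stackalloc int[4];
        tmp[0] = 2; tmp[1] = 4; tmp[2] = 5; tmp[3] = 8;
        Console.WriteLine(SumEvensSpan(tmp)); // 14
    }
}
```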

It’s pretty rare to care about performance in business applications. The most important thing is to write code to be readable, and to optimize only if that’s really, really necessary (and I cannot stress this enough: measure, measure, measure. BenchmarkDotNet is your best friend here). Premature optimization is the root of all evil, so keep that in mind.
I use it all the time.

I use it all the time with Entity Framework. Entity Framework converts your Linq commands into SQL, and they execute with all the speed that you’d expect from SQL.
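The mechanism, roughly: queries written against IQueryable&lt;T&gt; build expression trees instead of compiled delegates, and a provider like Entity Framework translates that tree into SQL. You can see the tree even without a database (a toy illustration, not EF itself):

```csharp
using System;
using System.Linq;

class Program
{
    static void Main()
    {
        // AsQueryable() gives us an IQueryable backed by in-memory data.
        var data = new[] { 1, 2, 3, 4, 5 }.AsQueryable();

        // This Where() doesn't filter anything yet; it records the
        // predicate as an expression tree a provider could translate.
        var query = data.Where(n => n > 2);

        Console.WriteLine(query.Expression); // shows the Where(...) call tree
        Console.WriteLine(query.Count());    // 3 (executed in memory here)
    }
}
```

With EF, the same shape of query is never run in memory at all; the tree becomes a SQL WHERE clause executed by the database.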

When not working with SQL, Linq will have performance costs, yes, but readability is nearly always more important, so I have no problem using Linq. Avoiding it completely would be premature optimisation. If it subsequently proves to be a problem (which is rare, because it only becomes the bottleneck once every other part of your workflow is already faster), then you should remove it and use loops instead.

Iterating a collection of 3 million items with a Where clause is about 4 ms slower than a for loop. If you have performance issues, this is not the first place you should look.

I’d be curious to see the difference between the LINQ code and the explicit loops. If I had to guess, the speedup probably had to do with delegates vs. static code, not LINQ per se. LINQ itself is just doing loops anyway; it’s nothing mysterious and shouldn’t add much overhead at all.

Another possibility is that your LINQ code had unrelated inefficiencies which got inadvertently fixed when you refactored the code. It’s impossible to say without seeing the code itself. I would never expect LINQ to cause a 3x slowdown compared to “longhand” code.

The code I wrote is now unfortunately lost in some DevOps repo, but I do think the scenario was similar to this. The API I was working with only returned data in XML format, which got me to write a deserializer myself. It consisted of, at first, a couple of Linq commands converting the values of tags into new objects. After seeing the performance, I rewrote it using ~10 foreach loops iterating through the data, which was much faster. Now that I think about it, I was probably using the System.Xml.Linq namespace.

This is probably what it looked like, but with ~10 loops.
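A sketch of the kind of System.Xml.Linq mapping I mean (element names and the Item record are made up for illustration):

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

record Item(int Id, string Name);

class Program
{
    static void Main()
    {
        var xml = "<items>" +
                  "<item><id>1</id><name>alpha</name></item>" +
                  "<item><id>2</id><name>beta</name></item>" +
                  "</items>";

        var doc = XDocument.Parse(xml);

        // One Select over the element tree maps each <item> to an object.
        var items = doc.Descendants("item")
                       .Select(e => new Item(
                           (int)e.Element("id"),       // XElement -> int conversion
                           (string)e.Element("name"))) // XElement -> string conversion
                       .ToList();

        Console.WriteLine(items.Count); // 2
    }
}
```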
LINQ is extremely powerful not only with ORM like ef core but actually collection as well. It’s an excellent tool and it makes programming easy and more readable.

Unless you are writing some trading/real-time app I don’t think that overhead will be that important.
Check out – https://www.pluralsight.com/courses/linq-fundamentals-csharp-6

I use it enough I don’t think about it, which is sort of a non-answer but indicative of how prevalent it is in modern code.
But it is a tool with implications, so I don’t doubt you had a bad experience with it in some use case. LINQ in general is not a performance hog, but there are some bad cases where it shouldn’t be used, and maybe you were in one of them. It’s generally fun to dissect code that’s performing poorly, explain why, and propose alternatives. My guess is you were accidentally creating one of the most common problems: invoking multiple enumerations. What’s that mean? I’m going to pretend you asked. I don’t think this is underdiscussed, but you do tend to have to stumble into an article about it. Consider this LINQ snippet I’ve expanded for debugging purposes:
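A small example along those lines, with a Where() that logs every test it performs:

```csharp
using System;
using System.Linq;

class Program
{
    static void Main()
    {
        var numbers = Enumerable.Range(0, 10);

        // The filter logs each element it tests, so we can see
        // exactly when (and how often) it actually runs.
        var evens = numbers.Where(n =>
        {
            Console.WriteLine($"Testing {n}");
            return n % 2 == 0;
        });

        if (evens.Contains(8))           // first enumeration of evens
        {
            foreach (var n in evens)     // second enumeration, from scratch
                Console.WriteLine($"Got {n}");
        }
    }
}
```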

Try to figure out what you think it will print before you run it. Now run it. Did you get what you expected?
Most people early in their LINQ journey would expect each filter test to print only once.
But that’s not what you see, is it? Instead, you see evidence that the Where() runs its tests for each item a second time! That’s the trick with enumerables: when you make most LINQ calls, what you get back isn’t an “actual” collection. It’s a teeny state machine that will iterate from start to finish every time you use it. This is a feature of LINQ called “deferred execution”: the intent is that you don’t pay the cost of filtering up-front when you make the LINQ call, but instead amortize it over iterations in a loop. However, this assumes you’ll take care to only perform one iteration!
So a good “fix” here would be to avoid the Contains() call and integrate that logic into the foreach loop:
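One possible shape of that fix, folding the membership check into a single enumeration (names chosen to match the scenario above):

```csharp
using System;
using System.Linq;

class Program
{
    static void Main()
    {
        var numbers = Enumerable.Range(0, 10);
        var evens = numbers.Where(n => n % 2 == 0);

        // evens is now enumerated exactly once; the check that
        // Contains() used to do happens inside the same pass.
        bool sawEight = false;
        foreach (var n in evens)
        {
            if (n == 8) sawEight = true;
            Console.WriteLine($"Got {n}");
        }
        Console.WriteLine($"Saw 8: {sawEight}");
    }
}
```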

Another alternative might be to use ToList() or ToArray() on evens. This loses deferred execution, but gives you a pre-filtered list/array so you don’t pay for enumeration each time you use it. It’s on you to know these decisions exist, and when each is what you want.

I think it’s really likely this is what burned you in that early project you did. It’s an easy mistake to make, and when the LINQ query happens far away from where you next use the enumerable, even experts can forget.
I’m just getting into learning to code C# and I was just wondering what LINQ is and what the use of it is. I’m really new, so I might need it explained super simply if that’s ok.

LINQ always and everywhere. I can’t remember any case where I had to use something else for performance reasons. I look at my own crappy algorithms first: sometimes replacing a scan with a hash set improves performance, other times changing the filtering order or the data structures.
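A quick sketch of the hash-set point: a membership test inside a query can dominate the runtime, because List&lt;T&gt;.Contains is a linear scan per call while HashSet&lt;T&gt;.Contains is an O(1) average lookup. The LINQ stays identical; only the data structure changes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Program
{
    static void Main()
    {
        var ids = Enumerable.Range(0, 100_000).ToList();
        var wanted = new[] { 5, 500, 99_999, -1 };

        // Slow path: each Contains walks the list, O(n) per lookup
        int viaList = wanted.Count(w => ids.Contains(w));

        // Fast path: build the set once, then O(1) average per lookup
        var idSet = new HashSet<int>(ids);
        int viaSet = wanted.Count(w => idSet.Contains(w));

        Console.WriteLine(viaList == viaSet); // True: same answer, very different cost
    }
}
```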

Unlike binary trees that your professors make you think are important, LINQ is actually used everywhere. The exceptions are in hot paths, and even then LINQ is only refactored away after profiling.

Writing maintainable and easy to read code often outweighs the performance benefits of writing optimized code. LINQ, at least the method syntax, tells you exactly what it is doing.
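For example, a method-syntax chain reads as a pipeline where each step names its intent (the Person record here is made up):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

record Person(string FullName, string LastName, int Age);

class Program
{
    static void Main()
    {
        var people = new List<Person>
        {
            new("Ada Lovelace", "Lovelace", 36),
            new("Carl Gauss", "Gauss", 17),
            new("Alan Turing", "Turing", 41),
        };

        var adults = people
            .Where(p => p.Age >= 18)      // filter: keep adults only
            .OrderBy(p => p.LastName)     // sort by surname
            .Select(p => p.FullName)      // project down to the names
            .ToList();                    // materialize once

        Console.WriteLine(string.Join(", ", adults));
        // prints: Ada Lovelace, Alan Turing
    }
}
```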
i use it anywhere where I want to showboat to anyone that knows a little c# but doesnt know linq 🤣
All the time.
Loops are ofc faster, but they can get messy really fast. We are talking milliseconds, even nanoseconds, here. This is not your bottleneck if your application is slow.
