I have never used LINQ and never will...
For me, the optimum form of access to a database is through stored procedures and functions behind a Data Access Layer.
All the other inline-code frameworks, from LINQ to ORMs, simply add massive layers of software between the client and the database, which is inherently inefficient.
Over the years, the trend has been to push more and more onto the client, but this violates good standards of n-tiered development, which has always been the overriding standard for quality system design.
No matter how you slice or dice it, n-tiered implementations done properly simply cannot be beaten in terms of efficiency and ease-of-maintenance.
However, recent generations of developers seem to insist on stuffing as much as possible into the client, especially in web environments, where it only makes applications less secure.
The thin-client has always been and will always be the most efficient form of development given the state of current architectures and topologies...
Steve Naidamast
Sr. Software Engineer
Black Falcon Software, Inc.
blackfalconsoftware@outlook.com
|
There are places where LINQ works even in the scenarios you describe.
|
Thank you for the controversial opinion.
I wrote this article from the perspective that many C# developers in my environment think that every problem can be solved with C#.
There is a saying in German: Those who only have a hammer see a nail in every problem.
I take a more differentiated view and would recommend not ruling out either approach (for loops or LINQ) from the outset for ideological reasons. The same applies to the trench warfare between C++ and C#.
|
I don't believe I have ever aligned myself with any technical ideology.
It's just that with a lot of the "new" techniques for doing various things in development, I see little to no advantage in using them.
For example, the use of a List(Type Of) over an ArrayList. It is of course true that with a List(Type Of) you eliminate the costly internal boxing and unboxing processes. On the other hand, how many objects would one ever put into an ArrayList before a user would notice the reduced efficiency? Not many, I suggest.
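To make that boxing point concrete, a minimal sketch (the values are invented for illustration):

using System;
using System.Collections;
using System.Collections.Generic;

class BoxingSketch
{
    static void Main()
    {
        ArrayList untyped = new ArrayList();
        untyped.Add(42);                 // boxes the int into a heap object
        int a = (int)untyped[0];         // unboxes; a wrong cast would only fail at runtime

        List<int> typed = new List<int>();
        typed.Add(42);                   // stored as a plain int, no boxing
        int b = typed[0];                // no cast needed, checked at compile time

        Console.WriteLine(a + b);
    }
}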
I never understood the attraction of LINQ. As one who has done a lot of database development with stored procedures over a very long career, it made little sense to me to begin moving that aspect of my development into a software layer.
True, one does not have to learn the complexities of using stored procedures but then again, years ago, that was what was just expected of a developer.
Also, I am not sure how many databases other than SQL Server LINQ supports. And since I have worked with quite a few database engines, using LINQ would appear to be more of a limitation than an enhancement to my skills.
In the end however, it is usually up to the individual developer and how they see the efficiencies and inefficiencies they want to deal with when developing...
Steve Naidamast
Sr. Software Engineer
Black Falcon Software, Inc.
blackfalconsoftware@outlook.com
|
I agree with you almost completely. For me, too, new technologies must first prove that they offer advantages before I use them. But I think you're leaving out one aspect: Deferred execution, especially for chained LINQ statements.
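A minimal sketch of that deferred execution (the list and numbers are invented for illustration): the chained statement below only builds a pipeline, and no element is touched until the enumeration starts.

using System;
using System.Collections.Generic;
using System.Linq;

class DeferredExecutionSketch
{
    static void Main()
    {
        var numbers = new List<int> { 1, 2, 3, 4 };

        // Nothing is filtered or projected yet; this only builds the pipeline.
        IEnumerable<int> query = numbers
            .Where(n => n % 2 == 0)
            .Select(n => n * 10);

        numbers.Add(6); // still included, because execution has not started yet

        foreach (int n in query)   // enumerating triggers the actual work
            Console.WriteLine(n);  // prints 20, 40, 60
    }
}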
|
Steve,
I agree, a new technology needs to offer benefits in excess of the learning curve. That's hard to know up front, which is why I try to focus on learning small, incremental changes that help me write higher-quality code (easier to maintain, fewer bugs, less code, more performance - in that order) in less time. (In most cases, the performance of the code is secondary to the efficiency of the developer, the main exception being tight loops.)
Sometimes LINQ lets me be more efficient, and that outweighs the lower performance of LINQ. Other times, the overhead of figuring out how to do the work in LINQ, or the performance penalty, outweighs the benefits.
On List<T>, you left out two key benefits: (1) it eliminates the need to write casts all the time, and (2) fewer bugs, since type safety blocks invalid casts at compile time, whereas with ArrayList they fail at runtime. Consider:
class Employee { public string FirstName; public string LastName; }

// ArrayList version: every element must be cast back to Employee at runtime.
void PrintEmployeeNames(ArrayList EmployeeList)
{
    foreach (object o in EmployeeList)
        Console.WriteLine($"First Name: {((Employee)o).FirstName}, Last Name: {((Employee)o).LastName}");
}

// List<Employee> version: no casts, and the compiler rejects a wrongly typed list.
void PrintEmployeeNames(List<Employee> EmployeeList)
{
    foreach (var e in EmployeeList)
        Console.WriteLine($"First Name: {e.FirstName}, Last Name: {e.LastName}");
}
The first case requires more code, which carries a cost to the developer and at runtime (verifying the cast). I could rewrite it to do the cast only once, but that would mean writing more code overall. It also has the problem that it could be passed an ArrayList of anything (e.g. Invoices), and you wouldn't know until that particular code path ran and the code blew up with an invalid cast.
The second case has a little more code in the declaration but less code overall, and less cost because the type cast is avoided entirely. More importantly, the developer doesn't have to write the cast and can't accidentally call the method with the wrong type of list; the compiler prevents it. So I can't put that bad call into the application for someone to stumble on down the road and then spend hours or days finding, fixing, and redistributing (plus it avoids all the customer-reputation problems that buggy code brings).
Whether you use LINQ or not is up to you, but please do reconsider List<T> and other generics; they're a huge win over their object-based alternatives. (And you can always use List<object>, which has the advantage of telling other developers that you really do have a mix of completely unrelated objects in the list and that they need to code accordingly.)
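As a small sketch of that last point (the mixed values are invented):

using System;
using System.Collections.Generic;

class MixedListSketch
{
    static void Main()
    {
        // List<object> documents the intent: callers must expect unrelated types.
        var mixed = new List<object> { "invoice-42", 19.0, DateTime.Now };

        foreach (object item in mixed)
            Console.WriteLine($"{item.GetType().Name}: {item}");
    }
}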
|
Quote: For me, the optimum form of access to a database is through stored procedures and functions behind a Data Access Layer.
Hmmm, stored procedures strike me as an attempt to move the business layer into the database, which itself violates n-tier development.
There are still good reasons for providing a solid, well-designed abstraction to the actual database, which views and stored procedures can provide.
Quote: No matter how you slice or dice it, n-tiered implementations done properly simply cannot be beaten in terms of efficiency and ease-of-maintenance.
I agree completely. However, sometimes the priority for a project is to "get something out", or "make a required level of functionality available", or "keep it flexible, we don't know where this project is going to end up".
In those cases, the "right" decision might be to not spend a long time designing the database, and to throw some functionality together in C# using LINQ and EF, because it's faster, easier, and more flexible. It carries the risk of creating a monstrosity that is inefficient, but efficiency isn't always the most important criterion for a project.
Pragmatism FTW.
|
Quote: Hmmm, stored procedures strike me as an attempt to move the business layer into the database, which itself violates n-tier development.
Looks like a "typical" answer from a "Microsoft-side" developer.
In larger applications with an Oracle database as the backend, there is no question that you put as much business logic as possible into the database.
Andreas
|
I agree with you about accessing the database (although this opinion is heard very seldom in the age of EF).
But LINQ is not only about accessing the database. LINQ is also used to filter, query, transform, or sort in-memory data with a minimal amount of code. And that's a huge step forward in terms of the ease and readability of the produced code.
Especially for database developers, as they can write SQL-like code in C# to process lists or arrays.
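A small sketch of that SQL-like feel in query syntax (the order data is invented for illustration):

using System;
using System.Collections.Generic;
using System.Linq;

class QuerySyntaxSketch
{
    static void Main()
    {
        var orders = new List<(string Customer, double Amount)>
        {
            ("Smith", 120.0), ("Jones", 80.0), ("Smith", 40.0)
        };

        // Reads almost like SQL: filter, group, and sort in one statement.
        var totals = from o in orders
                     where o.Amount > 50.0
                     group o by o.Customer into g
                     orderby g.Key
                     select new { Customer = g.Key, Total = g.Sum(x => x.Amount) };

        foreach (var t in totals)
            Console.WriteLine($"{t.Customer}: {t.Total}");
    }
}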
Andreas
|
I would agree with you that the ability to perform in-memory operations on retrieved data is an advantage of LINQ.
Granted, this effectively means using LINQ as an in-memory database system, which can be quite advantageous for database development.
However, practically all of the database processes I have worked with in my very long career, against a variety of database engines, rely on getting direct access to data at the database level.
This is not to say that using LINQ in the way you describe is not a benefit, but I would hazard a guess that unless one is a rather advanced database developer, most people will not take advantage of such capabilities in LINQ.
As a result, most of these capabilities are ignored, leaving LINQ to be simply a rather heavy layer between the application and the database.
I could be wrong since I have not been in the corporate environments for several years now, but most companies do not change their processes all that quickly...
Steve Naidamast
Sr. Software Engineer
Black Falcon Software, Inc.
blackfalconsoftware@outlook.com
|
I'm a huge fan of fat database development, but in all the client applications my colleagues and I have written over the last decade, there has always been a need to somehow process data from the database in memory on the client.
One example is transforming "tabular data" entered by the user in some datagrid and showing sums, aggregations, groupings, charts, etc. BEFORE the data is stored in the database.
And with LINQ this is very easy and similar to SQL handling.
Andreas
|
Wouldn't simple arithmetic sums computed over the datagrid data be more efficient than using a heavy layer like LINQ for such operations?
Steve Naidamast
Sr. Software Engineer
Black Falcon Software, Inc.
blackfalconsoftware@outlook.com
|
Hi,
I don't know why you speak of a "heavy layer" when talking about LINQ. It's just a language feature of C#.
Here's a short (simplified, but working) example (see the method CalculateSums):
using System.Collections.Generic;
using System.Linq;

internal class OrderSumCalculater
{
    internal List<OrderItem> OrderItems = new List<OrderItem>
    {
        new OrderItem { Product = "Bike", NumUnits = 1, TaxRate = 19.0, Price = 1000.0 },
        new OrderItem { Product = "Car", NumUnits = 1, TaxRate = 19.0, Price = 10000.0 },
        new OrderItem { Product = "Hamburger", NumUnits = 10, TaxRate = 7.0, Price = 7.0 },
        new OrderItem { Product = "Drink", NumUnits = 10, TaxRate = 7.0, Price = 5.0 }
    };

    internal List<OrderItem.TaxSum> TaxSums;

    internal void CalculateSums()
    {
        // Group the items by tax rate and sum up the amount per rate.
        TaxSums = OrderItems
            .GroupBy(x => x.TaxRate)
            .Select(x => new OrderItem.TaxSum { TaxRate = x.Key, Amount = x.Sum(y => y.NumUnits * y.Price) })
            .ToList();
    }
}

internal class OrderItem
{
    public string Product { get; set; }
    public double TaxRate { get; set; }
    public double Price { get; set; }
    public int NumUnits { get; set; }

    internal class TaxSum
    {
        public double TaxRate { get; set; }
        public double Amount { get; set; }
    }
}

internal class Program
{
    private static void Main(string[] args)
    {
        new OrderSumCalculater().CalculateSums();
    }
}
Andreas
|
In the last few decades, how many developers said they would never use a higher-level language (vs. assembly), object-oriented design, event-driven architecture, and so on?
|
I believe your comparison is not very accurate.
As one who came up through the ranks during the mainframe years, I can attest that most business developers very much wanted to move on to COBOL, as Assembly was never a good language for most business-line applications.
As for Object-Oriented Programming, it too was assimilated rather quickly into the developer community, as it provided cleaner ways to compartmentalize your work.
LINQ has added some advantages for working with memory-based data but in general offers no advantage when dealing with database access.
Given that with LINQ one still has to access a database in such applications, it is simply not as efficient as using stored procedures and the like...
Steve Naidamast
Sr. Software Engineer
Black Falcon Software, Inc.
blackfalconsoftware@outlook.com
|
Some aspects are missing (see my message).
|
Hello,
this is a great and interesting piece of work to read.
But I have some remarks:
1. I regret that the comparison includes a multiplication. I would prefer the 'order' field to be computed randomly.
2. I regret that the C# version uses the ternary operator (?:), which sometimes introduces hidden effects. Could you use a simple if/then/else block instead?
3. I regret that the Check() time is included in the measured time.
4. I regret that the C++ sort is done using the std::vector class. Perhaps the std::list class would be a better choice when the result set must be sorted.
Best regards
|
Thank you very much for the suggestions!
Each of them could lead to an interesting investigation that delves even deeper into the matter. I was primarily concerned with a fair comparison - i.e. almost identical code in C++ and C#.
1. Sure, you have a good point there: through the multiplication, 'order' follows a pattern but is not presorted in any way. That this pattern would be recognized and exploited for optimizations is, I think, out of the question.
2. The ternary operator (?:) is outside the main loop, and its influence should not be measurable.
3. The Check() method ensures that all elements of the result set are actually touched. Perhaps the name is poorly chosen.
4. Unfortunately, the iterator of std::list only satisfies the LegacyBidirectionalIterator requirement, while std::sort() requires a LegacyRandomAccessIterator (std::list provides its own sort() member function instead).
|
Hello,
I discovered an error in your code while using it.
I have now improved it by implementing my proposals (except the random generation).
For FILTER-ONLY, I obtain the following results:
******************************************************************
* NOT ORDERED
******************************************************************
LINQ , run 0 -> Time: 9,5593 msec
CLASSIC , run 0 -> Time: 2,8850 msec
LINQ , run 1 -> Time: 9,5066 msec
CLASSIC , run 1 -> Time: 2,7031 msec
LINQ , run 2 -> Time: 9,8213 msec
CLASSIC , run 2 -> Time: 2,5691 msec
LINQ , run 3 -> Time: 8,4035 msec
CLASSIC , run 3 -> Time: 2,4614 msec
LINQ , run 4 -> Time: 8,8647 msec
CLASSIC , run 4 -> Time: 2,4003 msec
LINQ , run 5 -> Time: 8,7324 msec
CLASSIC , run 5 -> Time: 2,7045 msec
LINQ , run 6 -> Time: 8,2537 msec
CLASSIC , run 6 -> Time: 2,0996 msec
LINQ , run 7 -> Time: 8,1946 msec
CLASSIC , run 7 -> Time: 2,0260 msec
LINQ , run 8 -> Time: 9,4539 msec
CLASSIC , run 8 -> Time: 2,2876 msec
LINQ , run 9 -> Time: 8,1598 msec
CLASSIC , run 9 -> Time: 2,2953 msec
LINQ Filter only MIN Time: 8,1598 msec MAX Time: 9,8213 msec AVG Time: 8,8950 msec
CLASSIC Filter only MIN Time: 2,0260 msec MAX Time: 2,8850 msec AVG Time: 2,4432 msec
For FILTER + SORT, I obtain:
******************************************************************
* ORDERED
******************************************************************
LINQ , run 0 -> Time: 24,9436 msec
CLASSIC , run 0 -> Time: 20,2982 msec
PARALLEL , run 0 -> Time: 54,4161 msec
LINQ , run 1 -> Time: 24,4057 msec
CLASSIC , run 1 -> Time: 19,1738 msec
PARALLEL , run 1 -> Time: 15,8568 msec
LINQ , run 2 -> Time: 25,9121 msec
CLASSIC , run 2 -> Time: 19,8143 msec
PARALLEL , run 2 -> Time: 38,2151 msec
LINQ , run 3 -> Time: 25,0715 msec
CLASSIC , run 3 -> Time: 19,6121 msec
PARALLEL , run 3 -> Time: 27,8901 msec
LINQ , run 4 -> Time: 24,0634 msec
CLASSIC , run 4 -> Time: 20,8182 msec
PARALLEL , run 4 -> Time: 13,5480 msec
LINQ , run 5 -> Time: 24,5678 msec
CLASSIC , run 5 -> Time: 20,5560 msec
PARALLEL , run 5 -> Time: 14,9940 msec
LINQ , run 6 -> Time: 22,1855 msec
CLASSIC , run 6 -> Time: 16,6719 msec
PARALLEL , run 6 -> Time: 13,2802 msec
LINQ , run 7 -> Time: 21,8709 msec
CLASSIC , run 7 -> Time: 16,4458 msec
PARALLEL , run 7 -> Time: 13,6003 msec
LINQ , run 8 -> Time: 21,1196 msec
CLASSIC , run 8 -> Time: 16,2656 msec
PARALLEL , run 8 -> Time: 10,2886 msec
LINQ , run 9 -> Time: 21,6723 msec
CLASSIC , run 9 -> Time: 15,4793 msec
PARALLEL , run 9 -> Time: 14,0374 msec
LINQ Filter + Sort MIN Time: 21,1196 msec MAX Time: 25,9121 msec AVG Time: 23,5812 msec
CLASSIC Filter + Sort MIN Time: 15,4793 msec MAX Time: 20,8182 msec AVG Time: 18,5135 msec
PARALLEL Filter + Sort MIN Time: 10,2886 msec MAX Time: 54,4161 msec AVG Time: 21,6127 msec
I introduced a new approach for the SORT in the CLASSIC version by adding AsParallel().
Personally, I think the comparison should be based on the MINIMUM values.
In ALL cases, the CLASSIC version is ALWAYS better than the LINQ version.
I ran this code in RELEASE mode on Windows 10 with 16 GB of RAM, using an Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz.
I display the timings in milliseconds and aligned all values so the output is more readable.
Here is my code (an improved version of your code):
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
namespace LINQvsCLASSIC
{
public static class Settings
{
public const int VECTORSIZE = 640000;
public const int NR_RUNS = 10;
}
class Element : IComparable<Element>
{
private int tag;
private int order;
private int data;
public Element(int t, int o, int d)
{
tag = t;
order = o;
data = d;
}
public int Tag { get { return tag; } set { tag = value; } }
public int Order { get { return order; } set { order = value; } }
public int Data { get { return data; } set { data = value; } }
public int CompareTo(Element? b)
{
return (b == null) ? 1 : (order - b.order);
}
}
class TestPerformance
{
public Double iMinTime;
public Double iMaxTime;
public Double iSumTime;
public int iNrTest;
public String sAction = "?";
public String sMethod = "?";
public List<Element> resultVector = new List<Element>();
public TestPerformance()
{
iMinTime = Double.MaxValue;
iMaxTime = 0;
iSumTime = 0;
init();
}
// Runs one timed pass of test(), validates the filtered result, and records min/max/sum times.
public void start(int run, List<Element> testVector)
{
Stopwatch sw = new Stopwatch();
sw.Start();
resultVector = new List<Element>(testVector.Count / 8);
test(testVector);
sw.Stop();
if (resultVector.Count != Settings.VECTORSIZE / 8)
throw new Exception("Number of result elements not correct!");
checkFilterResults(resultVector);
resultVector.Clear();
Double iMilliSec = sw.Elapsed.TotalMilliseconds;
if (iMilliSec < iMinTime) iMinTime = iMilliSec;
if (iMilliSec > iMaxTime) iMaxTime = iMilliSec;
iSumTime += iMilliSec;
Console.WriteLine("{0}, run {1} -> Time: {2,10:0.0000} msec"
,sMethod.PadRight(10)
,run
,iMilliSec
);
}
private void checkFilterResults(IEnumerable<Element> vector)
{
foreach (var element in vector)
{
if (element.Tag != 0)
throw new Exception("Filter failed");
}
}
// Note: defined but never called in this version, so the sort results are not verified.
static void checkSortResults(IEnumerable<Element> vector)
{
int o = 0;
foreach (var element in vector)
{
if (o > element.Order)
throw new Exception("Ordering failed");
o = element.Order;
}
}
public void WriteMinMaxAvg()
{
Console.WriteLine("{0} {1} MIN Time: {2,10:0.0000} msec MAX Time: {3,10:0.0000} msec AVG Time: {4,10:0.0000} msec"
,sMethod.PadRight(10)
,sAction.PadRight(16)
,iMinTime
,iMaxTime
,iSumTime / Settings.NR_RUNS
);
}
public virtual void test(List<Element> testVector) { }
public virtual void init() { }
}
class Test_LINQ :TestPerformance
{
public override void init()
{
sMethod = "LINQ";
sAction = "Filter only";
}
public override void test(List<Element> testVector)
{
resultVector =
testVector.Where(element => element.Tag < 1)
.ToList();
}
}
class Test_LINQ_ORDER : TestPerformance
{
public override void init()
{
sMethod = "LINQ";
sAction = "Filter + Sort";
}
public override void test(List<Element> testVector)
{
resultVector =
testVector.Where(element => element.Tag < 1)
.OrderBy(element => element.Order)
.ToList();
}
}
class Test_CLASSIC : TestPerformance
{
public override void init()
{
sMethod = "CLASSIC";
sAction = "Filter only";
}
public override void test(List<Element> testVector)
{
foreach (var element in testVector)
{
if (element.Tag < 1)
resultVector.Add(element);
}
}
}
class Test_CLASSIC_ORDER : TestPerformance
{
public override void init()
{
sMethod = "CLASSIC";
sAction = "Filter + Sort";
}
public override void test(List<Element> testVector)
{
foreach (var element in testVector)
{
if (element.Tag < 1)
resultVector.Add(element);
}
resultVector.Sort();
}
}
class Test_CLASSIC_PARALLEL : TestPerformance
{
public override void init()
{
sMethod = "PARALLEL";
sAction = "Filter + Sort";
}
public override void test(List<Element> testVector)
{
List<Element> filteredVector = new List<Element>(testVector.Count / 8);
foreach (var element in testVector)
{
if (element.Tag < 1)
filteredVector.Add(element);
}
resultVector = filteredVector.AsParallel<Element>().OrderBy(element => element.Order).ToList();
}
}
class Program
{
static void Main(string[] args)
{
var testVector = new List<Element>(Settings.VECTORSIZE);
for (int index = 0; index < Settings.VECTORSIZE; index++)
{
int tag = index % 8; // exactly every 8th element passes the Tag < 1 filter
int order = (index % 12) * Settings.VECTORSIZE + index; // deterministic but not presorted
testVector.Add(new Element(tag, order, index));
}
Console.WriteLine("******************************************************************");
Console.WriteLine("* NOT ORDERED");
Console.WriteLine("******************************************************************");
TestPerformance t_LINQ = new Test_LINQ();
TestPerformance t_CLASSIC = new Test_CLASSIC();
for (int run = 0; run < Settings.NR_RUNS; run++)
{
t_LINQ.start(run, testVector);
t_CLASSIC.start(run, testVector);
}
Console.WriteLine("");
t_LINQ.WriteMinMaxAvg();
t_CLASSIC.WriteMinMaxAvg();
Console.ReadKey();
Console.WriteLine("******************************************************************");
Console.WriteLine("* ORDERED");
Console.WriteLine("******************************************************************");
TestPerformance t_LINQ_ORDER = new Test_LINQ_ORDER();
TestPerformance t_CLASSIC_ORDER = new Test_CLASSIC_ORDER();
TestPerformance t_CLASSIC_PARALLEL = new Test_CLASSIC_PARALLEL();
for (int run = 0; run < Settings.NR_RUNS; run++)
{
t_LINQ_ORDER.start(run, testVector);
t_CLASSIC_ORDER.start(run, testVector);
t_CLASSIC_PARALLEL.start(run, testVector);
}
Console.WriteLine("");
t_LINQ_ORDER.WriteMinMaxAvg();
t_CLASSIC_ORDER.WriteMinMaxAvg();
t_CLASSIC_PARALLEL.WriteMinMaxAvg();
Console.ReadKey();
}
}
}
|
Great job!
Do I see correctly that your results support the previous statements?
I am not very surprised that the parallel sort is not faster - but measuring is always better than guessing (which is what I had done so far). After all, splitting the work into independently processable clusters/ranges is simply not possible under these test conditions.
Thanks a lot!
|
LINQ got a complete performance overhaul in .NET 7.0. The team addressed a lot of issues, and some LINQ queries will be orders of magnitude faster.
Performance Improvements in .NET 7 - .NET Blog[^]
LINQ comes with a lot of usability, at the cost of performance. The latest iteration, however, definitely makes the performance quite acceptable.
|
Thank you for the suggestion.
If .NET 7 is available on Linux, that would be an interesting thing to test. Under Win10 I have observed too much variance in the results, and I don't expect the runtime to change by orders of magnitude, so a statement based on Win10 would be easy to attack. And this for two reasons: from .NET 3.5 to .NET 6.0, Microsoft has already made impressive performance improvements in LINQ, and my LINQ example is so simple that I don't expect to see any change in runtime here.
|
Seems like the 'order' parameter is ignored in the C++ version.
|
Thank you very much! What a nasty systematic error! That explains the strange results of line 1 vs. 2 and line 3 vs. 4. I'll correct it immediately, of course!
|
Thank you very much for testing and sharing!
LiTe