Part 1: In this part you will learn how to query in-memory collections.

Post date: Mar 17, 2011 12:41:49 PM

Introducing LINQ – Part 1

In this article we will cover only the querying of in-memory collections. In future parts we will look at using LINQ with relational data (LINQ to SQL) and XML (LINQ to XML).

This article has been designed to give you a core understanding of LINQ that we will rely heavily on in subsequent parts of this series.

Before diving into the code it is essential to define what LINQ actually is. LINQ is not C# 3.0, and C# 3.0 is not LINQ. LINQ relies heavily on the language enhancements introduced in C# 3.0; however, LINQ is essentially a composition of standard query operators that let you work with data in a more intuitive way, regardless of the data source.

The benefits of using LINQ are significant: queries are first-class citizens within the C# language, they are checked at compile time, and they can be debugged (stepped through). We can expect the next Visual Studio IDE to take full advantage of these benefits; certainly the March 2007 CTP of Visual Studio Orcas does!

In-Memory Collections

The best way to teach a new technology is just to show you an example and then explain what the heck is going on! That will be my approach throughout this series; hopefully it is a wise decision.

For our first example we will compose a query to retrieve all the items in a generic List collection (Fig. 1).

Figure 1: Selecting all the items in a generic List collection

// Requires: using System; using System.Collections.Generic; using System.Linq;
private static List<string> people = new List<string>()
{
  "Granville", "John", "Rachel", "Betty",
  "Chandler", "Ross", "Monica"
};

public static void Example1()
{
  // Compose the query; nothing is read from the list yet.
  IEnumerable<string> query = from p in people select p;
  // Enumerating the query yields each item in turn.
  foreach (string person in query)
  {
    Console.WriteLine(person);
  }
}

The code example given in Fig. 1 is very basic, and its functionality could have been replicated more easily by simply enumerating the items in the List with a foreach loop.

In Fig. 1 we compose a query that returns each of the items in the people List collection by aliasing the people collection with a variable p and then selecting p (remember that p is of type string, as the people List is a collection of immutable string objects).

You may notice that query is of type IEnumerable<string>. This is because we know that query will hold an enumeration of type string. When we foreach through the query, the GetEnumerator method of query is invoked.
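To see when the query actually runs, the following minimal sketch composes the same query as Fig. 1, changes the list afterwards, and only then enumerates it. The class name DeferredExecutionSketch and the Run helper are hypothetical names introduced for illustration; the sketch relies on Select being evaluated lazily for in-memory collections.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionSketch
{
    // Returns the items the query yields when it is finally enumerated.
    public static List<string> Run()
    {
        List<string> people = new List<string> { "Granville", "John", "Rachel" };

        // Composing the query does not read the list yet.
        IEnumerable<string> query = from p in people select p;

        // Added after the query was composed, but before it is enumerated.
        people.Add("Betty");

        List<string> seen = new List<string>();
        foreach (string person in query) // GetEnumerator is invoked here
        {
            seen.Add(person);
        }
        return seen;
    }

    public static void Main()
    {
        // Prints: Granville, John, Rachel, Betty
        Console.WriteLine(string.Join(", ", Run().ToArray()));
    }
}
```

Because the list is read only when foreach asks the query for its enumerator, the name added after the query was composed still appears in the output.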

At this time it is beneficial to look at exactly what the compiler generated code looks like (Fig. 2).

Figure 2: Compiler generated code for Fig. 1

public static void Example1()
{
  IEnumerable<string> query = people.Select<string, string>(delegate (string p)
  {
    return p;
  });
  foreach (string person in query)
  {
    Console.WriteLine(person);
  }
}

Fig. 2 reveals that our query has actually been converted by the compiler to use an extension method (in this case just the Select extension method is used) taking a delegate as its argument.

You will find that queries and lambda expressions are simply a facade that we deal with in order to make our lives easier – under the covers the compiler is generating the appropriate code using delegates. Be aware of this internal compiler behavior!

Also be aware that the compiler generates a static field that caches the anonymous method's delegate (Fig. 3). We will discuss this particular feature in future articles.

Figure 3: Compiler generated cached anonymous delegate method

[CompilerGenerated]
private static Func<string, string> <>9__CachedAnonymousMethodDelegate1;
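As a quick check of the translation described above, this sketch places the query expression from Fig. 1 next to the explicit Select call from Fig. 2 and compares their results. TranslationSketch, QueryForm and MethodForm are hypothetical names used only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TranslationSketch
{
    private static readonly List<string> people = new List<string>
    {
        "Granville", "John", "Rachel", "Betty",
        "Chandler", "Ross", "Monica"
    };

    // Query expression form, as written in Fig. 1.
    public static IEnumerable<string> QueryForm()
    {
        return from p in people select p;
    }

    // Extension method form with an anonymous delegate, as in Fig. 2.
    public static IEnumerable<string> MethodForm()
    {
        return people.Select<string, string>(delegate (string p) { return p; });
    }

    public static void Main()
    {
        // Both forms yield the same items in the same order.
        bool same = QueryForm().SequenceEqual(MethodForm());
        Console.WriteLine(same); // True
    }
}
```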

We will now take a look at a more complex query of the same collection, which retrieves a sorted sequence of all strings in the List whose length is greater than 5 (Fig. 4).

Figure 4: A more complex query

public static void Example2()
{
  IEnumerable<string> query = from p in people where p.Length > 5
  orderby p select p;

  foreach (string person in query)
  {
    Console.WriteLine(person);
  }
}

The example in Fig. 4 uses two more query clauses, where and orderby, which the compiler maps to the Where and OrderBy standard query operators.

If we examine the code generated by the compiler for the Example2 method, we see the code shown in Fig. 5. Notice as well that we now have another two cached anonymous method delegates (Fig. 6), each of which has the type signature of its corresponding delegate (the Where delegate and the OrderBy delegate).

Figure 5: Compiler generated code for Fig. 4

public static void Example2()
{
  IEnumerable<string> query = people.Where<string>(delegate (string p)
  {
    return (p.Length > 5);
  }).OrderBy<string, string>(delegate (string p)
  {
    return p;
  });
  foreach (string person in query)
  {
    Console.WriteLine(person);
  }
}

Figure 6: Cached anonymous method delegates for the Where and OrderBy delegates defined in Fig. 5

[CompilerGenerated]
private static Func<string, bool> <>9__CachedAnonymousMethodDelegate4;
[CompilerGenerated]
private static Func<string, string> <>9__CachedAnonymousMethodDelegate5;

The type signature of the Where delegate (Fig. 5) is Func<string, bool>. The delegate takes a string argument and returns a bool indicating whether the string is greater than 5 characters in length. Similarly, the OrderBy delegate (Fig. 5) is a Func<string, string>: it takes a string argument and returns a string.
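These delegate signatures can be made explicit by declaring the delegates yourself before passing them to Where and OrderBy. This is only a sketch of the idea, not the code the compiler emits; DelegateSignatureSketch, isLongName and sortKey are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DelegateSignatureSketch
{
    public static List<string> Run()
    {
        List<string> people = new List<string>
        {
            "Granville", "John", "Rachel", "Betty",
            "Chandler", "Ross", "Monica"
        };

        // Where expects a Func<string, bool>: string in, bool out.
        Func<string, bool> isLongName = delegate (string p) { return p.Length > 5; };

        // OrderBy expects a Func<string, TKey>; here the key is the string itself.
        Func<string, string> sortKey = delegate (string p) { return p; };

        return people.Where(isLongName).OrderBy(sortKey).ToList();
    }

    public static void Main()
    {
        // Prints: Chandler, Granville, Monica, Rachel
        Console.WriteLine(string.Join(", ", Run().ToArray()));
    }
}
```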

Standard Query Operators

For completeness I will briefly cover what the standard query operators entail.

When using LINQ with any supported data source (in-memory, relational data, XML) we can make use of a set of standard query operators which empower us to manipulate our data source more effectively. A few of the standard query operators are:

Select

OrderBy

Where

SelectMany

TakeWhile

Take

Skip

First

SkipWhile

...

There are a tonne of standard query operators and I advise you to explore the use of each to gain a richer understanding of how to deal with data.

We will cover many of the standard query operators in future parts of this series.
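As a small taste before those later parts, the sketch below applies a few of the operators listed above to the people list from Fig. 1. The class OperatorSketch and its method names are hypothetical, and anonymous delegates are used so the example relies only on syntax already shown.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OperatorSketch
{
    private static readonly List<string> people = new List<string>
    {
        "Granville", "John", "Rachel", "Betty",
        "Chandler", "Ross", "Monica"
    };

    // Take: the first two items.
    public static IEnumerable<string> FirstTwo() { return people.Take(2); }

    // Skip: everything after the first five items.
    public static IEnumerable<string> AfterFirstFive() { return people.Skip(5); }

    // First: the first item only.
    public static string FirstPerson() { return people.First(); }

    // TakeWhile: items from the start, stopping at the first short name.
    public static IEnumerable<string> LeadingLongNames()
    {
        return people.TakeWhile(delegate (string p) { return p.Length > 4; });
    }

    // SkipWhile: drops items until "Chandler" is reached, then yields the rest.
    public static IEnumerable<string> FromChandler()
    {
        return people.SkipWhile(delegate (string p) { return p != "Chandler"; });
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", FirstTwo().ToArray()));         // Granville, John
        Console.WriteLine(string.Join(", ", AfterFirstFive().ToArray()));   // Ross, Monica
        Console.WriteLine(FirstPerson());                                   // Granville
        Console.WriteLine(string.Join(", ", LeadingLongNames().ToArray())); // Granville
        Console.WriteLine(string.Join(", ", FromChandler().ToArray()));     // Chandler, Ross, Monica
    }
}
```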

Lambda Expressions

Lambda expressions provide a clearer syntax for anonymous delegates. In this section we will replicate the code in Fig. 4 by using lambda expressions and extension methods (Fig. 7).

Figure 7: Same example as Fig. 4 but using lambda expressions and extension methods

public static void Example3()
{
  IEnumerable<string> query = people.Where(x => x.Length > 5).OrderBy(x => x);
  foreach (string person in query)
  {
    Console.WriteLine(person);
  }
}

A lambda expression normally takes the form arguments => expression; the expression is always preceded by the => token. If you want a lambda expression with more than one argument, you must enclose the arguments in parentheses and separate them with commas.

As you may have noticed in Figs. 2 and 5, the compiler generated code actually uses extension methods, not a query. This is purely an implementation detail, much like the lambda expressions in Fig. 7, which the compiler converts to anonymous delegates (Fig. 8).

Figure 8: Compiler generated code for Fig. 7

public static void Example3()
{
  IEnumerable<string> query = people.Where<string>(delegate (string x)
  {
    return (x.Length > 5);
  }).OrderBy<string, string>(delegate (string x)
  {
    return x;
  });
  foreach (string person in query)
  {
    Console.WriteLine(person);
  }
}
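To make the lambda forms described earlier concrete, here is a minimal sketch with a one-argument lambda, a two-argument lambda, and a lambda next to the equivalent anonymous delegate. LambdaFormsSketch and every member name in it are hypothetical; the two-argument example uses the Where overload that also passes the element's index.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LambdaFormsSketch
{
    // One argument: parentheses around the argument are optional.
    public static readonly Func<int, int> Square = x => x * x;

    // More than one argument: parentheses and commas are required.
    public static readonly Func<int, int, int> Add = (a, b) => a + b;

    // The same predicate written as a lambda and as an anonymous delegate.
    public static readonly Func<string, bool> IsLongLambda = x => x.Length > 5;
    public static readonly Func<string, bool> IsLongDelegate =
        delegate (string x) { return x.Length > 5; };

    // Two-argument lambda passed to the Where overload that supplies the index.
    public static List<string> LongNamesAfterFirstTwo(IEnumerable<string> names)
    {
        return names.Where((name, index) => index >= 2 && name.Length > 5).ToList();
    }

    public static void Main()
    {
        string[] people =
        {
            "Granville", "John", "Rachel", "Betty",
            "Chandler", "Ross", "Monica"
        };

        Console.WriteLine(Square(4)); // 16
        Console.WriteLine(Add(2, 3)); // 5

        // Both predicates select the same items.
        Console.WriteLine(people.Where(IsLongLambda)
            .SequenceEqual(people.Where(IsLongDelegate))); // True

        // Prints: Rachel, Chandler, Monica
        Console.WriteLine(string.Join(", ", LongNamesAfterFirstTwo(people).ToArray()));
    }
}
```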

We will use lambda expressions extensively in future parts of this series.