HebaruSan edited this page Feb 4, 2020 · 2 revisions

LINQ

LINQ is a .NET library for processing sequences of items. It is handy when you need to convert data from one format to another, or otherwise shuffle and adjust lists and arrays. It also facilitates and encourages a more functional style in which side effects and mutability are de-emphasized. We use it extensively in CKAN.

    using System.Linq;

Lambdas

A lambda is a function without a name; that is, it accepts zero or more parameters, and it returns a value. Lambdas are useful for passing predicates or callbacks to other code without having to define a full proper function in a class every time. LINQ code typically makes heavy use of lambdas, because most LINQ functions accept other functions as parameters to specify how to do their work.

There are multiple ways to create a lambda in C#, but in CKAN we usually use the => operator. First comes a new variable name for the parameter, then the => operator, followed by the expression to return. If you need more complex logic than can be fit into a simple expression, you can use a code block enclosed in curly braces.

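    // The compiler cannot infer the type of x from var alone, so declare
    // the delegate type (Func<int, int> lives in the System namespace)
    Func<int, int> myLambda = x => x + 2;

    Func<int, int> myLongLambda = x =>
    {
        int y = 10;
        y = y * x;
        return x + y;
    };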
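
As a small sketch of handing a lambda to other code, the example below passes lambdas to a helper that expects a function parameter. The `LambdaDemo` class and its `ApplyTwice` helper are hypothetical names made up for illustration; they are not part of CKAN or .NET.

```csharp
using System;

public static class LambdaDemo
{
    // Hypothetical helper: applies the given function to a value twice
    public static int ApplyTwice(Func<int, int> f, int value)
    {
        return f(f(value));
    }

    public static void Main()
    {
        // Expression lambda: one parameter, one expression to return
        Func<int, int> addTwo = x => x + 2;

        // Two parameters need parentheses around the parameter list
        Func<int, int, bool> isGreater = (a, b) => a > b;

        // Lambdas can be stored in a variable first or written inline
        Console.WriteLine(ApplyTwice(addTwo, 1));      // (1 + 2) + 2 = 5
        Console.WriteLine(ApplyTwice(x => x * 3, 2));  // (2 * 3) * 3 = 18
        Console.WriteLine(isGreater(3, 2));            // True
    }
}
```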

IEnumerable<T>

Most LINQ functions return IEnumerable<T>. This interface represents a generic sequence of elements of type T and is implemented by arrays and by all of the common generic collection classes, such as List<T>. You can call a LINQ function on any object of those types.

For Dictionary<KeyType, ValueType> objects, T is KeyValuePair<KeyType, ValueType>. You can use LINQ to treat a dictionary as a sequence, but each "element" will be made up of some key and value in a pair structure.
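
A minimal sketch of treating a dictionary as a sequence, using the Where and Select functions described in the sections below. The `DictionaryDemo` class, the `KeysWithMinValue` helper, and the sample data are assumptions invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DictionaryDemo
{
    // Hypothetical helper: keys whose values are at least the given minimum
    public static List<string> KeysWithMinValue(Dictionary<string, int> dict, int minimum)
    {
        // Each element is a KeyValuePair<string, int>,
        // so the lambdas read its Key and Value properties
        return dict.Where(kvp => kvp.Value >= minimum)
                   .Select(kvp => kvp.Key)
                   .OrderBy(key => key)
                   .ToList();
    }

    public static void Main()
    {
        var downloads = new Dictionary<string, int>
        {
            { "ModA", 120 },
            { "ModB", 15 },
            { "ModC", 300 },
        };
        Console.WriteLine(string.Join(", ", KeysWithMinValue(downloads, 100)));
        // prints "ModA, ModC"
    }
}
```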

Filtering with Where

If there are elements in a sequence that you want to exclude, you can use Where to select which ones to keep:

    int[] numbers = new int[] { 0, 1, 2, 3, 4, 5, 6, 7, 8 };
    var odds = numbers.Where(element => element % 2 == 1);

Substituting with Select

If you want to replace each element in a sequence with a value generated from it, you can use Select to apply an expression to each element, similar to map in other languages:

    int[] numbers = new int[] { 0, 1, 2, 3, 4, 5, 6, 7, 8 };
    var squares = numbers.Select(element => element * element);

Handling duplicates with Distinct and GroupBy

You can ensure that a sequence has no duplicated elements simply by calling Distinct:

    int[] withDuplicates = new int[] { 0, 1, 2, 1, 3, 1, 4, 1, 5 };
    var nonDuplicated = withDuplicates.Distinct();

You can also group identical or similar elements with GroupBy, which returns a sequence of groups, each of which is a subsequence of the original plus a Key property identifying the group:

    int[] withCommonSquares = new int[] { -4, -3, -2, -1, 0, 1, 2, 3, 4 };
    var groupedBySquares = withCommonSquares.GroupBy(element => element * element);
    foreach (var group in groupedBySquares)
    {
        Console.WriteLine("Processing group {0}", group.Key);
        Console.WriteLine("Elements: {0}", string.Join(", ", group));
    }

Sorting with OrderBy

The OrderBy function rearranges the elements of a sequence in ascending order of the value of some expression based on each element:

    string[] unsorted = new string[] { "Einstein", "Bohr", "Feynman", "Planck", "Maxwell" };
    var sortedBySecondChar = unsorted.OrderBy(element => element[1]);

Creating collections with To<Type>

Several helper functions convert a LINQ sequence to a specific type of collection:

    return original.ToList();
    return original.ToArray();
    return original.ToDictionary(element => element.MakeKey(),
                                 element => element.MakeValue());
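
For a concrete, runnable version of these conversions, here is a sketch with strings standing in for the elements. The `ConversionDemo` class and `LengthsByName` helper are hypothetical; the key and value lambdas play the role of the MakeKey and MakeValue placeholders above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ConversionDemo
{
    // Hypothetical helper: maps each name to its length
    public static Dictionary<string, int> LengthsByName(IEnumerable<string> names)
    {
        // The first lambda picks the key, the second picks the value.
        // ToDictionary throws an ArgumentException if two elements
        // produce the same key.
        return names.ToDictionary(name => name, name => name.Length);
    }

    public static void Main()
    {
        IEnumerable<string> names = new string[] { "Bohr", "", "Planck" }
            .Where(name => name.Length > 0);

        List<string> asList = names.ToList();
        string[] asArray = names.ToArray();
        Dictionary<string, int> lengths = LengthsByName(names);

        Console.WriteLine(asList.Count);       // 2
        Console.WriteLine(asArray.Length);     // 2
        Console.WriteLine(lengths["Planck"]);  // 6
    }
}
```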

Chaining

Since LINQ functions both accept and return IEnumerable<T> sequences, it's possible to call a LINQ function directly on the return value of another LINQ function. You can exploit this to do a great deal of complex processing of a sequence in very few lines:

    return original.Where(x => x.IsGoodElement())
                   .Select(x => x.ImportantProperty())
                   .Distinct()
                   .OrderBy(x => x.SortingProperty())
                   .ToArray();
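
The same chain shape can be sketched with concrete data. In this example, `ChainingDemo` and `InitialsOfLongNames` are hypothetical names, and the length threshold of 4 is an arbitrary choice for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ChainingDemo
{
    // Hypothetical helper: sorted, de-duplicated initials of the longer names
    public static string[] InitialsOfLongNames(IEnumerable<string> names)
    {
        return names.Where(name => name.Length > 4)        // drop short names
                    .Select(name => name.Substring(0, 1))  // keep the first letter
                    .Distinct()                            // remove repeated letters
                    .OrderBy(initial => initial)           // sort alphabetically
                    .ToArray();                            // store the results
    }

    public static void Main()
    {
        string[] names = new string[]
        {
            "Einstein", "Bohr", "Feynman", "Planck", "Maxwell", "Fermi", "Pauli"
        };
        Console.WriteLine(string.Join(", ", InitialsOfLongNames(names)));
        // prints "E, F, M, P"
    }
}
```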

Lazy evaluation

You may be surprised to learn that when most of the above examples execute, no calculations are performed! By default, most LINQ functions are lazily evaluated; rather than returning a simple list of all elements, they return a special object (an enumerable, often loosely called an enumerator) that will generate the elements as needed.

This has an important consequence for performance: Enumerators only generate as many elements as they need to! So if you have a potentially large sequence, but you only need the first few elements, LINQ will only calculate the first few elements, meaning you do not pay CPU cycles for the ones you won't use, and you don't have to write any special "stop early" logic.
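
One way to observe this is to count how often the lambda actually runs. The sketch below uses the standard Enumerable.Range and Take functions; the `LazyDemo` class and `EvaluationsNeededFor` helper are hypothetical names for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyDemo
{
    // Hypothetical helper: counts how many squares are calculated
    // when only the first `count` elements of a large sequence are used
    public static int EvaluationsNeededFor(int count)
    {
        int evaluations = 0;
        IEnumerable<int> squares = Enumerable.Range(1, 1000000)
            .Select(x =>
            {
                evaluations++;   // record each time an element is calculated
                return x * x;
            });

        // Nothing has been calculated yet; ToList forces the evaluation,
        // and Take stops it after `count` elements
        List<int> firstFew = squares.Take(count).ToList();
        return evaluations;
    }

    public static void Main()
    {
        Console.WriteLine(EvaluationsNeededFor(3));  // 3, not 1000000
    }
}
```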

The yield return statement can be used to write a lazily evaluated function without LINQ. Be careful, though: sequence elements are not remembered after they are provided to your code! If your function does something expensive at the start and then returns the elements lazily, the expensive part will be performed again every time you enumerate the sequence from the start!

    var mySequence = myLazyFunc();
    // 1. First evaluation happens here
    if (mySequence.Any())
    {
        Console.WriteLine("OK");
    }
    // 2. Second evaluation happens here
    foreach (var elt in mySequence)
    {
        // Do stuff with elt
    }
    // 3. Third evaluation happens here
    var stuff = mySequence.Select(x => x.Something()).ToList();

You can reduce this to a single evaluation by creating a normal list with ToList. Of course, this means you lose the benefits of lazy evaluation as well, since the entire sequence will be calculated and stored. To get the best of both worlds, you may want to try the CKAN extension function Memoize, which returns an enumerator that generates only as many elements as needed and calculates each one only once, even if you reuse them.
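
The difference can be sketched by counting how often the start of a yield return function runs. Here `ReevaluationDemo` and `MyLazyFunc` are hypothetical stand-ins, and a simple counter takes the place of the expensive work. The sketch covers only ToList; it does not attempt to reproduce CKAN's Memoize.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ReevaluationDemo
{
    private static int setupRuns = 0;

    // Hypothetical lazy function: "expensive" setup, then lazy elements
    public static IEnumerable<int> MyLazyFunc()
    {
        setupRuns++;   // stands in for the expensive part
        for (int i = 1; i <= 3; i++)
        {
            yield return i;
        }
    }

    // Uses the lazy sequence twice, so the setup runs twice
    public static int SetupRunsWhenLazy()
    {
        setupRuns = 0;
        IEnumerable<int> lazy = MyLazyFunc();
        bool any = lazy.Any();   // first evaluation
        int total = lazy.Sum();  // second evaluation
        return setupRuns;
    }

    // Stores the elements with ToList first, so the setup runs once
    public static int SetupRunsWhenStored()
    {
        setupRuns = 0;
        List<int> stored = MyLazyFunc().ToList();  // the only evaluation
        bool any = stored.Any();
        int total = stored.Sum();
        return setupRuns;
    }

    public static void Main()
    {
        Console.WriteLine(SetupRunsWhenLazy());    // 2
        Console.WriteLine(SetupRunsWhenStored());  // 1
    }
}
```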
