Top 6 Examples of Boost Performance Working List in C#

Optimizing C# List Performance (Top 6 Practical Examples)

In the previous post we took an in-depth look at using list in C#. In addition to that, let’s take a look at real examples of how to use lists from the performance point of view. How does using a class or structure in lists affect the speed? Which loop is faster for or foreach? How many times faster is it to use the AddRange method instead of the Add method in a loop? Answers to these and other performance questions will be discussed in this post (in examples and charts). To compare performance in our examples we will use the BenchmarkDotNet nuget package. Well, let’s go further into the subtleties of working with lists in C#.

Initialize C# List with Capacity

If you know the number of elements that will be added to the list in advance, it’s more efficient to initialize the list with that capacity. The Capacity property of a List<T> is quite interesting. It represents the number of elements that the List<T> can hold before it needs to resize its internal array.

When you first initialize a List<T> without specifying any capacity, the default capacity is set, which is typically small. As you add elements to the list, and once the number of elements equals the current capacity, the list automatically resizes its internal array to accommodate more elements. This resizing operation involves allocating a new array (usually double the current size) and copying all elements from the old array to the new one. This process is computationally expensive, especially for large lists.

The Capacity property is crucial for optimizing performance when dealing with large collections. If you know the number of elements you plan to store in a list in advance, you can set the capacity accordingly using the List<T>(int capacity) constructor. This pre-allocation helps avoid frequent resizing, leading to better performance.

Now let’s measure with BenchmarkDotNet the speed of working with a list with and without pre-initialization of the list capacity.

public class BenchmarkListCapacity
{
	[Params(100, 5000, 20000)]
	public int count;

	[Benchmark]
	public void AddToList()
	{
		List<int> list = new List<int>();

		for (int i = 0; i < count; i++)
		{
			list.Add(i);
		}
	}

	[Benchmark]
	public void AddToListWithCapacity()
	{
		List<int> list = new List<int>(count);

		for (int i = 0; i < count; i++)
		{
			list.Add(i);
		}
	}
}

Let’s see the result of our measurements.

As you can see from the diagram, setting the capacity on average decrease the time of adding items to the list by from 20 to 55 percent, regard of the list size.

Remember, the Capacity of a List<T> is different from its Count property. Count indicates the actual number of elements in the list, whereas Capacity is about the potential number of elements it can hold before resizing.

Use AddRange() Instead of Multiple Add() Calls

In C#, using the AddRange() method when adding to a List<T> instead of multiple Add() calls can significantly improve performance. Especially when you are working with large datasets. The AddRange() method is designed to add an entire collection of items to the end of a List, which is more efficient compared to adding items one at a time using Add().

Here’s why AddRange() is a better choice:

  1. Reduced Overhead: Each call to Add() might trigger a check for the capacity of the List and potentially resize the underlying array if the capacity is exceeded. AddRange(), however, calculates the required capacity once and resizes the array only if necessary, reducing the overhead of multiple capacity checks and resizes.
  2. Better Memory Management: AddRange() can allocate the exact memory needed for the entire collection in one operation, which is more efficient than potentially multiple memory allocations that might occur with successive Add() calls.
  3. Improved Performance: Because AddRange() minimizes the number of resize operations and memory allocations, it generally offers better performance, especially when adding a large number of elements.
  4. Cleaner Code: Using AddRange() leads to more concise and readable code. Instead of multiple lines of Add() calls, you have a single line that clearly expresses the intent of adding a collection of items to the list.

Let’s now measure the speed of working with a list using the Add() and AddRange() methods.

public class BenchmarkListAddRange
{
	private List<int> _items;

	[Params(5000, 20000)]
	public int count;

	[GlobalSetup]
	public void Setup()
	{
		_items = Enumerable.Range(1, count).ToList();
	}

	[Benchmark]
	public void AddToList()
	{
		List<int> list = new List<int>();

		for (int i = 0; i < _items.Count; i++)
		{
			list.Add(i);
		}
	}

	[Benchmark]
	public void AddRangeToList()
	{
		List<int> list = new List<int>();

		list.AddRange(_items);
	}
}

Let’s see the result of working with a list using the Add() and AddRange() methods.

As you can see from the diagram, using AddRange() instead of multiple Add() calls can on average decrease time of execution by 90 percent.

Consider Custom Data Structures for Specialized Needs

Using structures instead of classes in C# lists can have a positive impact on performance, especially when working with large collections of small data of value type. In C#, structures are value types and classes are reference types. This distinction has several implications:

  1. Memory Allocation: Structs are stored in the stack, which usually results in faster allocation and deallocation. Meanwhile, classes are stored in the heap, which may take longer due to the added task of garbage collection.
  2. Value Semantics: Structures are passed by value, which creates a copy when they are assigned to other variables or passed to methods. This approach can be more efficient for small data structures, as it avoids the overhead of creating and managing references.
  3. Performance: For small, immutable types, structs can improve performance by reducing heap allocation and garbage collection. This is particularly important for large collections.

However, there may be situations where a struct is not the optimal option:

  • If the struct becomes large (the general recommendation is that the struct should be less than 16 byte), the overhead of the copy process can exceed the benefits of the value-type semantics.
  • Structs do not have support for inheritance, which can be a limitation in certain design scenarios.
  • Manipulating structs within a collection, such as a list, can lead to unexpected errors and glitches because the struct copy is likely to be used instead of the actual one.

Let’s test the efficiency of utilizing a list with struct and class:

public class BenchmarkListStructAndClass
{
	[Params(5000, 20000)]
	public int count;

	[Benchmark]
	public void AddStructToList()
	{
		List<ItemStruct> list = new List<ItemStruct>();

		for (int i = 0; i < count; i++)
		{
			list.Add(new ItemStruct());
		}
	}

	[Benchmark]
	public void AddStructToListWithCapacity()
	{
		List<ItemStruct> list = new List<ItemStruct>(count);

		for (int i = 0; i < count; i++)
		{
			list.Add(new ItemStruct());
		}
	}

	[Benchmark]
	public void AddClassToList()
	{
		List<ItemClass> list = new List<ItemClass>();

		for (int i = 0; i < count; i++)
		{
			list.Add(new ItemClass());
		}
	}

	[Benchmark]
	public void AddClassToListWithCapacity()
	{
		List<ItemClass> list = new List<ItemClass>(count);

		for (int i = 0; i < count; i++)
		{
			list.Add(new ItemClass());
		}
	}
}

Let’s see what we’ve got.

And diagram of result:

Prefer Count Property Over Any() for Checking Emptiness

In C# development, especially when dealing with collections like list, the choice between using the Count property and the Any() method for checking if a list is empty can significantly impact performance and readability. This choice is particularly relevant in performance-critical applications.

The Count property, available on List<T> and other collection types, directly returns the number of elements in the collection. When you need to check if a list is empty, using Count is straightforward and efficient. For instance, if (myList.Count == 0) is a clear and direct way to check for an empty list. This approach is extremely fast because Count is a property that holds the current count of the list, and accessing it is a matter of reading this value.

On the other hand, the Any() method, provided by LINQ (Language Integrated Query), is a more general-purpose tool. It’s designed to work with any IEnumerable<T>, not just lists, and it determines whether any elements of a sequence satisfy a condition (or just any elements exist in the case of an empty predicate). When you call myList.Any(), it starts iterating over the list to check if at least one element exists. While this is not a significant performance hit for small or moderate-sized lists, it can become more noticeable with larger collections or in performance-critical code paths, as it involves more overhead than simply reading a property.

Prefer myList.Count == 0 over myList.Any() for checking the emptiness of a list in C#. This choice leads to more performant code, particularly in scenarios where this check is performed frequently or on large collections. It also enhances code readability by explicitly stating the intention to check for an empty collection.

Let’s see how fast Count and Any() work:

public class BenchmarkListAnyAndCount
{
	private List<int> _items;

	[Params(20000)]
	public int count;

	[GlobalSetup]
	public void Setup()
	{
		_items = Enumerable.Range(1, 10000).ToList();
	}

	[Benchmark]
	public void CheckEmptyByCount()
	{
		for (int i = 0; i < count; i++)
		{
			if (_items.Count > 0)
			{
			}
		}
	}

	[Benchmark]
	public void CheckEmptyByAny()
	{
		for (int i = 0; i < count; i++)
		{
			if (_items.Any())
			{
			}
		}
	}
}

Result of our test:

As you can see Count works many times faster compared to Any().

Use foreach Loop for Iteration

Using a foreach loop in C# is often a preferred choice over a for loop when iterating over collections like list, especially when you don’t require the index of the current element. The foreach loop stands out for its readability and simplicity. It automatically iterates through each element in the collection, eliminating the need for manual index tracking and bounds checking.

Here’s why choosing foreach can be beneficial:

  1. Readability: The syntax of foreach is straightforward and self-explanatory. It clearly indicates that you’re iterating over each element in a collection, making the code easier to understand.
  2. Safety: Since foreach handles the iteration process internally, it reduces the risk of errors like off-by-one mistakes or index out of bounds exceptions.
  3. Efficiency: In some cases, foreach can be more efficient than a for loop because it directly accesses the elements in the collection without needing to use an index to retrieve them.
  4. Compatibility with Various Collections: foreach can iterate over any collection that implements the IEnumerable or IEnumerable<T> interface, making it versatile for use with arrays, lists, dictionaries, and other collection types.

Remember, foreach is best used when you don’t need to modify the collection or require the index of each element during iteration. In scenarios where you need to modify the collection or the indexes are essential, a for loop or other iteration methods might be more appropriate.

Let’s take a look at how work loop foreach and for:

public class BenchmarkListForeachAndFor
{
	private List<ItemClass> _items;

	[Params(2000000)]
	public int count;

	[GlobalSetup]
	public void Setup()
	{
		_items = Enumerable.Range(1, count)
			.Select(x => new ItemClass { ItemId = x })
			.ToList();
	}

	[Benchmark]
	public void CheckForeach()
	{
		int sum = 0;

		foreach (var item in _items)
		{
			sum += item.ItemId;
		}
	}

	[Benchmark]
	public void CheckFor()
	{
		int sum = 0;

		for (int i = 0; i < _items.Count; i++)
		{
			sum += _items[i].ItemId;
		}
	}
}

Binary Search for Sorted Lists

Using the BinarySearch method with a C# List is a powerful approach for optimized searching, especially when dealing with large, sorted collections. This method, inherent to the List<T> class in C#, operates on the principle of the binary search algorithm, offering a significant performance advantage over linear search methods.

Here’s a breakdown of how to use BinarySearch effectively:

  1. Precondition – Sorted List: Ensure your List<T> is sorted. BinarySearch works only on sorted collections. If the list isn’t sorted, the results are undefined. Use List<T>.Sort() method to sort the list if necessary.
  2. Implementing BinarySearch: Call BinarySearch(T item) on your List<T> instance, where T is the type of elements in the list. This method returns the zero-based index of the item in the sorted list, if the item is found; otherwise, a negative number, which is the bitwise complement of the index of the next element that is larger or, if there is no larger element, the bitwise complement of Count.
  3. Handling the Return Value: If the return value is non-negative, it denotes the index of the item in the list. If it’s negative, the item isn’t in the list. You can compute the insertion point (if needed) using ~returnValue.
  4. Performance Aspect: BinarySearch has a time complexity of O(log n), making it much faster for large datasets compared to O(n) for a linear search.

Let’s see the difference in performance between a normal search and using the BinarySearch method:

public class BenchmarkListBinarySearch
{
	private List<int> _items;
	private int _findValue;

	[Params(50000)]
	public int count;

	[GlobalSetup]
	public void Setup()
	{
		_items = Enumerable.Range(1, count).ToList();
		var random = new Random();
		_findValue = random.Next(count);
	}

	[Benchmark]
	public void CheckBinarySearch()
	{
		int index = _items.BinarySearch(_findValue);
	}

	[Benchmark]
	public void CheckFirst()
	{
		int index = _items.First(x => x == _findValue);
	}
}

As you can see from the result, searching a sorted list with BinarySearch is many times more efficient.

The diagram shows that search with BinarySearch is more than 100 times more efficient than with LINQ method First().

Conclusion

The right optimization strategy depends on the specific use case. Be sure to profile and test your application to ensure that these optimizations are beneficial in your particular scenario.

I’ve been working tirelessly on a comprehensive GitHub project that’s packed with an array of BenchmarkDotNet examples specifically focusing on List operations. These examples are designed to provide you with real insights and practical knowledge that can elevate your coding skills to the next level.

I’ve been working tirelessly on a comprehensive GitHub project that’s packed with an array of BenchmarkDotNet examples specifically focusing on List operations. These examples are designed to provide you with real insights and practical knowledge that can elevate your coding skills to the next level.

I’ve just released this GitHub project on my Telegram Channel! Don’t miss out on this opportunity to boost your .NET prowess.

๐Ÿ‘‰ Head over to my Telegram channel, download the project, and start exploring today.

๐Ÿ”— Go to Telegram Channel

Let’s dive into the world of high-performance .NET together! ๐Ÿš€

One thought on “Top 6 Examples of Boost Performance Working List in C#

Leave a Reply

Your email address will not be published. Required fields are marked *