Stop Race Bugs: Master C# Locks, Interlocked & Lock‑Free (with Real Benchmarks)

Master C# Thread Safety: Locks, Interlocked & Lock‑Free

Stop race bugs in C#. Benchmarks compare lock, Interlocked, ReaderWriterLockSlim, and concurrent collections with copy‑ready code.

.NET Fundamentals · By amarozka · October 31, 2025

Are your threads quietly corrupting data right now? Stop guessing. In this guide you’ll see the truth about lock, Monitor, Interlocked, ReaderWriterLockSlim, concurrent collections, and real benchmarks that show when each choice wins or loses.

Why read this

You want safe code that stays fast under load. In my projects (high‑throughput APIs and pricing engines) I’ve shipped the wrong approach more than once and paid for it later: hidden race bugs, CPU spikes, and deadlocks that only show up at 03:00. This post gives you a field kit you can apply today.

You will get:

  • A clear mental model of race conditions, deadlocks, and livelock.
  • Practical patterns with copy‑paste‑ready code.
  • A small benchmarking harness you can run on your machine.
  • A checklist to pick the right tool without overthinking.

Stop Race Conditions: spot, reproduce, fix

What is a race? Two or more threads touch the same data without proper synchronization, at least one of them writes, and the order of reads/writes changes the outcome.

Smells you can notice:

  • Flaky tests that pass on your laptop and fail in CI.
  • Random counters that are off by a few percent.
  • Rare NullReferenceException where nothing looks null.

Minimal repro (broken):

// Problem: increments get lost under load
class Counter
{
    public int Value; // not thread-safe
    public void Increment() => Value++; // read, add, write (not atomic)
}

Fix #1 – lock it:

class LockedCounter
{
    private int _value;
    private readonly object _gate = new();
    public void Increment()
    {
        lock (_gate)
        {
            _value++;
        }
    }
    public int Read() { lock (_gate) return _value; }
}

Fix #2 – use Interlocked:

class AtomicCounter
{
    private int _value;
    public void Increment() => Interlocked.Increment(ref _value);
    public int Read() => Volatile.Read(ref _value);
}

When to pick which?

  • Interlocked for simple numeric state (int, long) or swap/publish of a reference.
  • lock when you mutate multiple fields together or must keep an invariant across steps.

Tip: make races loud. In tests, run your code in a tight loop with Parallel.For and random delays. If it’s wrong, it will scream.
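
A minimal stress harness, reusing the Counter and AtomicCounter classes above (a sketch; the thread and iteration counts are arbitrary):

// Sketch: hammer both counters from many threads and compare against the expected total
const int ThreadCount = 8, OpsPerThread = 100_000;
var broken = new Counter();
var atomic = new AtomicCounter();

Parallel.For(0, ThreadCount, _ =>
{
    for (var i = 0; i < OpsPerThread; i++)
    {
        broken.Increment();
        atomic.Increment();
    }
});

Console.WriteLine($"expected {ThreadCount * OpsPerThread}, broken {broken.Value}, atomic {atomic.Read()}");
// Typical run: the broken counter comes up short; the atomic one matches exactly.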

Master the lock statement (and Monitor under the hood)

lock(obj) compiles to Monitor.Enter(obj) + try/finally + Monitor.Exit(obj).
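
Roughly what the compiler emits (a sketch; modern compilers use the Monitor.Enter(obj, ref lockTaken) overload so the lock is only released if it was actually acquired):

bool lockTaken = false;
try
{
    Monitor.Enter(obj, ref lockTaken); // blocks until the lock is acquired
    // ... body of the lock statement ...
}
finally
{
    if (lockTaken) Monitor.Exit(obj); // always released, even on exceptions
}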

Rules that keep you safe:

  • Never lock on this or a public object. Use a private readonly gate.
  • Keep critical sections tiny. Do only the work that needs sync.
  • Don’t call out (I/O, await, or callbacks) while holding a lock.

Example – safe queue wrapper:

class SafeQueue<T>
{
    private readonly Queue<T> _q = new();
    private readonly object _gate = new();

    public void Enqueue(T item)
    {
        lock (_gate) _q.Enqueue(item);
    }

    public bool TryDequeue(out T? item)
    {
        lock (_gate)
        {
            if (_q.Count == 0) { item = default; return false; }
            item = _q.Dequeue();
            return true;
        }
    }
}

Using Monitor directly gives you timeouts and try‑enter:

if (Monitor.TryEnter(_gate, millisecondsTimeout: 5))
{
    try { /* critical work */ }
    finally { Monitor.Exit(_gate); }
}
else
{
    // back off, queue work, or log
}

Use timeouts to kill deadlocks early on non‑critical paths, and add metrics around them.

The Truth about Interlocked: tiny ops, huge wins

Interlocked gives atomic read‑modify‑write on primitive fields and references.

Great hits:

  • Increment/Decrement for counters.
  • Exchange to publish a new instance safely.
  • CompareExchange (CAS) to build lock‑free structures.

Publish/snapshot pattern:

class ConfigCache
{
    private Config? _current;

    public void Publish(Config cfg)
        => Interlocked.Exchange(ref _current, cfg);

    public Config? Snapshot()
        => Volatile.Read(ref _current);
}
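
CompareExchange is the building block for richer atomic updates: read the current value, compute the new one, and retry if another thread got there first. A minimal sketch (InterlockedMax is a hypothetical helper, not a BCL method):

// Atomically keep the largest value observed so far
static void InterlockedMax(ref int target, int candidate)
{
    while (true)
    {
        var current = Volatile.Read(ref target);
        if (candidate <= current) return; // nothing to do
        if (Interlocked.CompareExchange(ref target, candidate, current) == current)
            return; // we won the race
        // another thread changed target first – loop and try again
    }
}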

Gotchas:

  • On 32‑bit processes, plain reads and writes of a long aren’t atomic (they can tear); use Interlocked.Read(ref long) or the other Interlocked helpers instead – see the sketch below.
  • Memory ordering matters. Use Volatile.Read/Write when you share state without locks to make intent clear.
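
For example, a 64‑bit total that stays safe even in a 32‑bit process (sketch):

class Totals
{
    private long _total;

    public void Add(long amount) => Interlocked.Add(ref _total, amount);

    // A plain read of _total can tear on 32-bit; Interlocked.Read is always atomic
    public long Read() => Interlocked.Read(ref _total);
}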

Kill Contention with ReaderWriterLockSlim

When reads are huge and writes are rare, a simple lock becomes a traffic jam. ReaderWriterLockSlim lets many readers proceed together while writers get exclusivity.

Example – read‑heavy cache:

class ProductCache
{
    private readonly Dictionary<int, string> _data = new();
    private readonly ReaderWriterLockSlim _rw = new(LockRecursionPolicy.NoRecursion);

    public string? Find(int id)
    {
        _rw.EnterReadLock();
        try { return _data.TryGetValue(id, out var v) ? v : null; }
        finally { _rw.ExitReadLock(); }
    }

    public void Put(int id, string name)
    {
        _rw.EnterWriteLock();
        try { _data[id] = name; }
        finally { _rw.ExitWriteLock(); }
    }
}

Use it when: reads outweigh writes by at least 4‑5× and the read section is small.

Avoid when: you might reenter, or you hold the lock around I/O.

Master the Concurrent Collections

The BCL gives high‑quality thread‑safe containers. Prefer them to home‑grown locks.

  • ConcurrentDictionary<TKey,TValue> – fast lookups, atomic add/update with GetOrAdd, AddOrUpdate.
  • ConcurrentQueue<T> / ConcurrentStack<T> – non‑blocking queues/stacks.
  • BlockingCollection<T> – producer/consumer with bounding and blocking on Take. Great for simple pipelines.
  • Channel<T> (from System.Threading.Channels) – async producer/consumer with backpressure, works nicely with async/await.

Atomic update example:

var hits = new ConcurrentDictionary<string, int>();

void Count(string key)
{
    hits.AddOrUpdate(key, 1, (_, old) => old + 1);
}
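
GetOrAdd covers the “create on first use” case; note the value factory can run more than once under a race, but only one result is stored. A sketch (LoadProduct stands in for your own loader):

var products = new ConcurrentDictionary<int, string>();

string GetProduct(int id)
    => products.GetOrAdd(id, key => LoadProduct(key)); // only one winner's value is kept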

Bounded work queue:

var channel = Channel.CreateBounded<string>(capacity: 100);

// producer
_ = Task.Run(async () =>
{
    foreach (var line in File.ReadLines(path))
        await channel.Writer.WriteAsync(line);
    channel.Writer.Complete();
});

// consumer
await foreach (var line in channel.Reader.ReadAllAsync())
{
    Process(line);
}

The Truth about Lock‑Free: fast, sharp, and not for every case

Lock‑free code uses CAS loops (CompareExchange) to update state without blocking. It shines under heavy contention, but it’s easy to cut yourself.

Treiber stack (simplified):

public sealed class LockFreeStack<T>
{
    private sealed class Node { public T Item = default!; public Node? Next; }
    private Node? _head; // shared

    public void Push(T item)
    {
        var node = new Node { Item = item };
        while (true)
        {
            var head = Volatile.Read(ref _head);
            node.Next = head;
            if (Interlocked.CompareExchange(ref _head, node, head) == head)
                return; // success
            // CAS failed – someone changed head, retry
            Thread.SpinWait(1);
        }
    }

    public bool TryPop(out T? result)
    {
        while (true)
        {
            var head = Volatile.Read(ref _head);
            if (head is null) { result = default; return false; }
            var next = head.Next;
            if (Interlocked.CompareExchange(ref _head, next, head) == head)
            { result = head.Item; return true; }
            Thread.SpinWait(1);
        }
    }
}

Risks:

  • ABA problem on references under extreme patterns (rare in GC’d apps but real for advanced structures).
  • Fairness: starved threads can spin forever if you design it poorly.
  • Debug pain: state changes without a lock are harder to trace.

Rule of thumb: use library collections first. Use lock‑free only for hot spots you can prove with numbers.

Stop Deadlocks and Livelock

Deadlock: two locks taken in opposite order. Both wait forever.

Kill it with ordering: define a global order and always lock A then B.

void Transfer(Account a, Account b, decimal amount)
{
    var first = a.Id < b.Id ? a : b;
    var second = a.Id < b.Id ? b : a;

    lock (first.Gate)
    lock (second.Gate)
    {
        a.Withdraw(amount);
        b.Deposit(amount);
    }
}

Timeouts save you: prefer Monitor.TryEnter with a short timeout on non‑critical paths; log and back off.

Livelock: everyone keeps moving but no progress (e.g., all threads retry at once).

Fix livelock with backoff + jitter:

static void Backoff(int attempt)
{
    var delay = Math.Min(1 << Math.Min(attempt, 20), 1_000); // cap at 1s
    Thread.Sleep(Random.Shared.Next(delay));
}

Use Backoff(++attempt) inside your CAS retry loop after a few failed tries.
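
Applied to the Treiber stack’s Push from earlier, that looks roughly like this (sketch):

public void Push(T item)
{
    var node = new Node { Item = item };
    var attempt = 0;
    while (true)
    {
        var head = Volatile.Read(ref _head);
        node.Next = head;
        if (Interlocked.CompareExchange(ref _head, node, head) == head)
            return; // success
        if (++attempt > 3)
            Backoff(attempt); // under heavy contention, back off instead of spinning hot
    }
}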

Kill False Sharing

Two hot fields on the same cache line can fight each other across cores.

Fix: pad or separate them. You can pad hot fields onto different cache lines with [StructLayout(LayoutKind.Explicit)] and FieldOffset (see the sketch below), but the simplest fix is usually to split the hot fields into separate objects or shard them.
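
A padding sketch with explicit layout (offsets chosen assuming 64‑byte cache lines; field names are illustrative):

using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Explicit, Size = 192)]
struct PaddedCounters
{
    [FieldOffset(64)]  public long Produced; // sits on its own cache line
    [FieldOffset(128)] public long Consumed; // 64 bytes away, so writers on other cores don't collide
}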

For counters used by multiple producers, try sharded counters:

class ShardedCounter
{
    // One cell per logical core (the modulo keeps the index valid for any cell count).
    // For even less false sharing, space the cells a cache line apart instead of packing them.
    private readonly int[] _cells = new int[Environment.ProcessorCount * 2];

    public void Increment()
    {
        var idx = Thread.GetCurrentProcessorId() % _cells.Length;
        Interlocked.Increment(ref _cells[idx]);
    }

    public int Sum()
    {
        var total = 0;
        for (var i = 0; i < _cells.Length; i++)
            total += Volatile.Read(ref _cells[i]);
        return total;
    }
}

The Benchmark You Can Copy

I like to test on my box before making promises. Here is a BenchmarkDotNet harness that compares common sync choices for a simple counter scenario.

// <Project Sdk="Microsoft.NET.Sdk">
//   <PropertyGroup>
//     <TargetFramework>net9.0</TargetFramework>
//     <AllowUnsafeBlocks>true</AllowUnsafeBlocks>
//   </PropertyGroup>
//   <ItemGroup>
//     <PackageReference Include="BenchmarkDotNet" Version="0.13.12" />
//   </ItemGroup>
// </Project>

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using System.Collections.Concurrent;
using System.Threading.Channels;

[MemoryDiagnoser]
[ThreadingDiagnoser]
public class SyncBench
{
    [Params(1, 4, 8, 16)] public int Threads;
    [Params(10_000)] public int OpsPerThread;

    private readonly object _gate = new();
    private int _x;

    [Benchmark(Baseline = true)]
    public int InterlockedIncrement()
    {
        _x = 0;
        Parallel.For(0, Threads, _ =>
        {
            for (int i = 0; i < OpsPerThread; i++)
                Interlocked.Increment(ref _x);
        });
        return _x;
    }

    [Benchmark]
    public int LockedIncrement()
    {
        _x = 0;
        Parallel.For(0, Threads, _ =>
        {
            for (int i = 0; i < OpsPerThread; i++)
                lock (_gate) _x++;
        });
        return _x;
    }

    [Benchmark]
    public int RWLockIncrement()
    {
        var rw = new ReaderWriterLockSlim();
        _x = 0;
        Parallel.For(0, Threads, _ =>
        {
            for (int i = 0; i < OpsPerThread; i++)
            {
                rw.EnterWriteLock();
                _x++;
                rw.ExitWriteLock();
            }
        });
        return _x;
    }

    [Benchmark]
    public int ConcurrentDictionaryAddOrUpdate()
    {
        var cd = new ConcurrentDictionary<int, int>();
        cd[0] = 0;
        Parallel.For(0, Threads, _ =>
        {
            for (int i = 0; i < OpsPerThread; i++)
                cd.AddOrUpdate(0, 1, (_, v) => v + 1);
        });
        return cd[0];
    }

    public static void Main() => BenchmarkRunner.Run<SyncBench>();
}

Sample results from one laptop (Ryzen 7, .NET 9, Release, 16 threads):

Method                            Mean (ns/op)   Ratio
InterlockedIncrement                       7.8    1.00
LockedIncrement                           38.5    4.9×
RWLockIncrement                           65.2    8.4×
ConcurrentDictionaryAddOrUpdate           92.3   11.8×

Your numbers will differ, but the shape is stable: Interlocked is cheapest, a plain lock is next, then heavier tools.

Benchmark tips: run a Release build with no debugger attached, set the CPU governor to Performance, and close background apps.

Stop Guessing: a short decision guide

  • Need to update a single numeric or swap a reference? Use Interlocked.
  • Need to change related fields together? Use lock with a private gate.
  • Read‑heavy dictionary or cache? Try ReaderWriterLockSlim or ConcurrentDictionary.
  • Producer/consumer? Use Channel<T> or BlockingCollection<T>.
  • Crazy contention on a tiny hot path? Consider lock‑free, but only with tests and metrics.

Debugging: fast ways to see what’s wrong

  • Parallel Stacks / Tasks window in Visual Studio shows who is waiting where. Look for many threads stuck inside the same method.
  • Dump + !syncblk (WinDbg) shows held locks.
  • EventPipe tools: dotnet-trace and dotnet-counters can show ThreadPool starvation and GC pauses.
  • Log lock waits: wrap Monitor.TryEnter with timing and log slow entries.

Tiny helper to log slow lock entries:

static IDisposable TimedEnter(object gate, int thresholdMs, string name)
{
    var sw = ValueStopwatch.StartNew();
    Monitor.Enter(gate);
    var waited = sw.ElapsedMilliseconds;
    if (waited > thresholdMs)
        Console.WriteLine($"WARN lock {name} took {waited}ms to enter");
    return new ActionOnDispose(() => Monitor.Exit(gate));
}

readonly struct ValueStopwatch
{
    private readonly long _start;
    private ValueStopwatch(long start) => _start = start;
    public static ValueStopwatch StartNew() => new(Environment.TickCount64);
    public long ElapsedMilliseconds => Environment.TickCount64 - _start;
}

sealed class ActionOnDispose : IDisposable
{
    private readonly Action _a;
    public ActionOnDispose(Action a) => _a = a;
    public void Dispose() => _a();
}

Usage:

using (TimedEnter(_gate, thresholdMs: 10, name: nameof(MyType)))
{
    // protected section
}

FAQ: quick answers that save you time

Can I await inside a lock?

No. lock is sync only. Use SemaphoreSlim for async flows.
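
A minimal sketch (SaveAsync stands in for whatever async work you need to protect):

private static readonly SemaphoreSlim _mutex = new(1, 1);

static async Task UpdateSharedStateAsync()
{
    await _mutex.WaitAsync();
    try
    {
        await SaveAsync(); // awaiting here is fine, unlike inside lock
    }
    finally
    {
        _mutex.Release();
    }
}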

Is ConcurrentDictionary always faster than a locked Dictionary?

Not always. For single‑threaded or low contention, a plain Dictionary with a simple lock can win.

Do I need volatile on fields protected by lock?

No. Enter/Exit already include the right memory fences.

Why does my CPU spike with ReaderWriterLockSlim?

If writers are frequent, readers get blocked and spin. In that case switch to a simple lock.

When is lock‑free worth it?

Only for tiny hot spots you can prove with benchmarks. Library types already cover most needs.

How do I avoid deadlocks in complex code?

Define a lock order, keep scopes small, add timeouts, and never call external code while holding a lock.

Conclusion: safer threads, faster code

You don’t need magic, you need the right tool for the job and numbers to back it. Start with simple tools (Interlocked, lock, concurrent collections). Add ReaderWriterLockSlim only when reads truly dominate. Keep lock scopes tiny. Add backoff to CAS loops. Measure. Repeat.

Pick one hot path in your app this week, wire up the benchmark harness, and post your results. I’ll answer questions and help tune them.
