Can a Blazor Server app really keep 10,000 people online at the same time without melting your servers? Yes – if you treat SignalR like a high‑throughput system, not a chat toy.
Over the last few years I’ve shipped Blazor Server apps that handle spikes during live events: thousands of dashboards open, heavy broadcasts, and a sea of reconnects when Wi‑Fi hiccups. In this guide I’ll show you the concrete steps that kept those systems fast and stable. No magic – just good limits, lean payloads, and the right Azure setup.
What you’re building (and why it’s hard)
Blazor Server rides on a SignalR connection. Each browser holds a long‑lived connection (WebSocket when possible). At 10,000 concurrent users your app is mostly about:
- Connections: tracking, reconnecting, and keeping them alive.
- Messages: small, frequent renders from the server to the browser.
- CPU & memory: JSON/MessagePack serialization and diffing of render batches.
- Scale‑out: more instances and/or Azure SignalR Service.
If any of these is wasteful, you bleed CPU and RAM per connection and hit limits fast.
The scaling path at a glance
- Scale up a single node: Kestrel, server GC, WebSockets on, MessagePack on, strict limits.
- Scale out app instances: shared Data Protection keys, health probes, ARR affinity if you don’t use Azure SignalR.
- Offload fan‑out to Azure SignalR Service (Default mode for Blazor Server): better connection density, smoother bursts, simpler routing.
- Automate: autoscale rules + visibility (counters, logs, end‑to‑end traces).
Below I’ll walk through each step with code and config that you can copy into a real project today.
Project baseline
Create a plain Blazor Server app on .NET 8 (or newer).
Key packages
<ItemGroup>
  <PackageReference Include="Microsoft.AspNetCore.SignalR.Protocols.MessagePack" Version="8.*" />
</ItemGroup>
Program.cs – minimal but fast
var builder = WebApplication.CreateBuilder(args);

// Blazor + SignalR with strict hub limits
builder.Services
    .AddServerSideBlazor(options =>
    {
        // keep render pipeline under control
        options.MaxBufferedUnacknowledgedRenderBatches = 5; // backpressure to slow clients
        options.DisconnectedCircuitRetentionPeriod = TimeSpan.FromMinutes(3);
        options.JSInteropDefaultCallTimeout = TimeSpan.FromSeconds(10);
    })
    .AddHubOptions(o =>
    {
        o.MaximumReceiveMessageSize = 64 * 1024; // 64 KB per incoming message
        o.EnableDetailedErrors = false; // never in prod
        o.ClientTimeoutInterval = TimeSpan.FromSeconds(30); // drop dead connections faster
        o.KeepAliveInterval = TimeSpan.FromSeconds(15); // keep LB happy (WebSockets)
        o.HandshakeTimeout = TimeSpan.FromSeconds(15);
        o.StreamBufferCapacity = 8; // per-stream buffer size
    });

// SignalR protocol: prefer MessagePack for smaller payloads
builder.Services.AddSignalR().AddMessagePackProtocol();

// Presence and fan-out pipeline (types defined later in this post)
builder.Services.AddSingleton<PresenceStore>();
builder.Services.AddSingleton<BroadcastQueue>();
builder.Services.AddHostedService<BroadcastWorker>(); // drains the queue; see the fan-out section

var app = builder.Build();

app.MapBlazorHub();
app.MapHub<TickerHub>("/hubs/ticker"); // custom hub from the fan-out section; the path is an example
app.MapFallbackToPage("/_Host");
app.Run();
Why these numbers? They’re safe defaults to stop noisy clients from pushing the server over the edge. Tune them with your traffic profile, but keep the mindset: deny by default, allow by measurement.
Cut payload size first (MessagePack + lean models)
Serialization burns CPU and memory. Two simple wins:
- Use MessagePack for SignalR. It is binary and compact.
- Send lean DTOs to the client, not EF models or full view models.
DTO example
public sealed record StockTickDto(string Symbol, decimal Price, long EpochMs);
Sending
public class TickerHub : Hub
{
    public Task Subscribe(string symbol) => Groups.AddToGroupAsync(Context.ConnectionId, symbol);
}

// Elsewhere: broadcast to a group
public sealed class TickerFanout
{
    private readonly IHubContext<TickerHub> _hub;
    public TickerFanout(IHubContext<TickerHub> hub) => _hub = hub;

    public Task PublishAsync(string symbol, StockTickDto dto, CancellationToken ct)
        => _hub.Clients.Group(symbol).SendAsync("tick", dto, ct);
}
Tip: don’t send strings with extra whitespace or long property names. Every byte counts at 10,000 users.
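If you go further and use attribute-based MessagePack contracts, integer keys keep even the property names off the wire: the payload becomes a compact array. A sketch of the tick DTO from above, annotated (verify against the resolver your AddMessagePackProtocol setup actually uses):

using MessagePack;

// Serialized as [Symbol, Price, EpochMs] rather than a map of property-name strings.
[MessagePackObject]
public sealed record StockTickDto(
    [property: Key(0)] string Symbol,
    [property: Key(1)] decimal Price,
    [property: Key(2)] long EpochMs);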
Don’t do heavy work inside hubs
A hub method runs on the request path of a live connection. Block here and you stall the socket. Use a bounded channel to offload work to background workers.
Bounded channel + backpressure
public sealed class BroadcastQueue
{
    private readonly Channel<(string Group, object Payload)> _channel =
        Channel.CreateBounded<(string, object)>(new BoundedChannelOptions(10_000)
        {
            FullMode = BoundedChannelFullMode.DropOldest, // protect server under burst
            SingleReader = true,
            SingleWriter = false
        });

    public bool TryEnqueue(string group, object payload)
        => _channel.Writer.TryWrite((group, payload));

    public IAsyncEnumerable<(string Group, object Payload)> ReadAllAsync(CancellationToken ct)
        => _channel.Reader.ReadAllAsync(ct);
}
public sealed class BroadcastWorker : BackgroundService
{
    private readonly BroadcastQueue _queue;
    private readonly IHubContext<TickerHub> _hub;

    public BroadcastWorker(BroadcastQueue queue, IHubContext<TickerHub> hub)
        => (_queue, _hub) = (queue, hub);

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        await foreach (var (group, payload) in _queue.ReadAllAsync(stoppingToken))
        {
            try
            {
                await _hub.Clients.Group(group).SendAsync("tick", payload, stoppingToken);
            }
            catch (OperationCanceledException) when (stoppingToken.IsCancellationRequested) { }
            catch (Exception ex)
            {
                // log ex and continue; never block the loop
            }
        }
    }
}
Now hub methods just validate input and enqueue – constant time under load.
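As an illustration, a producer-facing hub method in that style; the hub name and the validation rule here are hypothetical:

public sealed class PublishHub : Hub // hypothetical producer hub
{
    private readonly BroadcastQueue _queue;
    public PublishHub(BroadcastQueue queue) => _queue = queue;

    public Task Publish(string symbol, decimal price)
    {
        // cheap validation only; no I/O on the connection's path
        if (string.IsNullOrWhiteSpace(symbol) || symbol.Length > 12)
            throw new HubException("invalid symbol");

        // TryEnqueue never blocks; under burst the bounded channel drops the oldest item
        _queue.TryEnqueue(symbol.ToUpperInvariant(),
            new StockTickDto(symbol, price, DateTimeOffset.UtcNow.ToUnixTimeMilliseconds()));
        return Task.CompletedTask;
    }
}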
Presence tracking without global locks
You’ll need to know who is online, and which groups they’re in, fast.
public sealed class PresenceStore
{
    private readonly ConcurrentDictionary<string, HashSet<string>> _groups = new();

    public void Join(string connectionId, string group)
    {
        var set = _groups.GetOrAdd(group, _ => new HashSet<string>(StringComparer.Ordinal));
        lock (set)
        {
            set.Add(connectionId);
        }
    }

    public void Leave(string connectionId, string group)
    {
        if (_groups.TryGetValue(group, out var set))
        {
            lock (set)
            {
                set.Remove(connectionId);
                if (set.Count == 0)
                    _groups.TryRemove(group, out _);
            }
        }
    }

    public int Count(string group)
        => _groups.TryGetValue(group, out var set) ? set.Count : 0;
}
public sealed class PresenceHub : Hub
{
    private readonly PresenceStore _presence;
    public PresenceHub(PresenceStore presence) => _presence = presence;

    public override Task OnConnectedAsync()
    {
        // join a personal group for targeted pushes
        _presence.Join(Context.ConnectionId, Context.UserIdentifier ?? Context.ConnectionId);
        return base.OnConnectedAsync();
    }

    public override Task OnDisconnectedAsync(Exception? ex)
    {
        _presence.Leave(Context.ConnectionId, Context.UserIdentifier ?? Context.ConnectionId);
        return base.OnDisconnectedAsync(ex);
    }
}
Rationale:
- ConcurrentDictionary + per‑set locks keep the hot path cheap.
- Avoid global locks; at 10k connections they turn into choke points.
Streaming for large result sets
When you need to push many items, use server‑to‑client streaming to keep memory low and give the client first bytes early.
public sealed record ReportRow(int Id, string Name);

public sealed class ReportHub : Hub
{
    public async IAsyncEnumerable<ReportRow> StreamReport(
        string reportId,
        [EnumeratorCancellation] CancellationToken ct = default)
    {
        await foreach (var row in LoadRowsAsync(reportId, ct))
        {
            yield return row; // client receives rows as they are ready
        }
    }

    private static async IAsyncEnumerable<ReportRow> LoadRowsAsync(
        string id, [EnumeratorCancellation] CancellationToken ct)
    {
        for (var i = 0; i < 10_000; i++)
        {
            yield return new ReportRow(i, $"R{i}");
            await Task.Yield();
        }
    }
}
Client
var stream = hubConnection.StreamAsync<ReportRow>("StreamReport", reportId, cancellationToken);

await foreach (var row in stream.WithCancellation(cancellationToken))
{
    // render as items arrive
}
Guard rails: rate limits and quotas
You don’t want one buggy tab to ruin the party.
- HubOptions.MaximumReceiveMessageSize: drop huge payloads.
- StreamBufferCapacity: limit per‑connection memory during streaming.
- MaxBufferedUnacknowledgedRenderBatches: slow render spam to clients that can’t keep up.
- Per‑user rate limit (simple counter) inside a hub filter.
Lightweight rate limit with a hub filter
public sealed class SimpleRateLimitFilter : IHubFilter
{
    private static readonly ConcurrentDictionary<string, (int Count, long WindowStart)> _counters = new();
    private const int Limit = 30; // 30 calls
    private static readonly TimeSpan Window = TimeSpan.FromSeconds(10);

    public async ValueTask<object?> InvokeMethodAsync(
        HubInvocationContext context, Func<HubInvocationContext, ValueTask<object?>> next)
    {
        var key = context.Context.UserIdentifier ?? context.Context.ConnectionId;
        var now = Stopwatch.GetTimestamp();
        var windowStartTicks = now - (long)(Window.TotalSeconds * Stopwatch.Frequency);

        _counters.AddOrUpdate(key,
            _ => (1, now),
            (_, v) => v.WindowStart < windowStartTicks ? (1, now) : (v.Count + 1, v.WindowStart));

        var (count, start) = _counters[key];
        if (start >= windowStartTicks && count > Limit)
            throw new HubException("rate limit");

        return await next(context);
    }
}
// Program.cs – replace the earlier AddSignalR call so the filter runs for every hub invocation
builder.Services.AddSingleton<SimpleRateLimitFilter>();
builder.Services.AddSignalR(o => o.AddFilter<SimpleRateLimitFilter>())
    .AddMessagePackProtocol();
Blazor Server specifics that matter at scale
Blazor Server has a render queue per circuit. Keep these in check:
- MaxBufferedUnacknowledgedRenderBatches: if a browser lags, the server pauses sending renders instead of hoarding them in memory.
- DisconnectedCircuitRetentionPeriod and DisconnectedCircuitMaxRetained: let users reconnect after a brief network drop without losing state, but don’t keep thousands of dead circuits.
- JSInteropDefaultCallTimeout: stuck JS calls should fail fast.
Also:
- Avoid large @foreach renders on every tick. Use Virtualize or diff small parts.
- Throttle UI: if you push ticks at 10/sec, render at 2-4/sec and aggregate values (see the sketch after this list).
- Minimize StateHasChanged calls; batch updates in a timer.
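Here’s the throttling idea as a sketch; the component name is mine and the markup is omitted. Hold the newest tick in a field and let a slow timer decide when to re-render, instead of rendering per message:

using Microsoft.AspNetCore.Components;

public class ThrottledTicker : ComponentBase, IDisposable
{
    private readonly PeriodicTimer _renderTimer = new(TimeSpan.FromMilliseconds(400)); // ~2.5 renders/sec
    private StockTickDto? _latest;   // written by the message handler
    private StockTickDto? _rendered; // what the UI last showed

    protected override void OnInitialized() => _ = RenderLoopAsync();

    // Wire this to your SignalR handler; it only stores the value, no render here.
    public void OnTick(StockTickDto dto) => _latest = dto;

    private async Task RenderLoopAsync()
    {
        while (await _renderTimer.WaitForNextTickAsync())
        {
            if (!Equals(_latest, _rendered))
            {
                _rendered = _latest;
                await InvokeAsync(StateHasChanged); // marshal onto the renderer's sync context
            }
        }
    }

    public void Dispose() => _renderTimer.Dispose(); // pending WaitForNextTickAsync returns false
}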
Azure setup: the safe defaults
1) App Service or containers
- Linux plans tend to have fewer surprises with WebSockets.
- Enable WebSockets on the App Service.
- If you’re not using Azure SignalR Service, keep ARR Affinity ON so circuits stick to the same instance.
- Share Data Protection keys across instances (Blob or Key Vault) so auth cookies stay valid after scale‑out.
2) Azure SignalR Service (Default mode)
For big fan‑out and high connection counts, place Azure SignalR Service in front of your app.
- Use Default mode for Blazor Server.
- Choose a SKU that covers your expected peak connections and messages per day; start modest, autoscale the app, and watch service metrics.
- In App Service, ARR Affinity can be OFF when you use Azure SignalR; the service handles routing.
- Keep WebSockets allowed end‑to‑end (CDN/Front Door/Application Gateway must pass them through).
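Since WebSockets have to survive every hop anyway, you can also refuse the long-polling fallback at the endpoint so a broken hop fails loudly instead of silently degrading. A minimal sketch; the trade-off is that clients that can’t open a WebSocket won’t connect at all:

using Microsoft.AspNetCore.Http.Connections;

app.MapBlazorHub(options =>
{
    // WebSockets only: no silent long-polling fallback behind a misconfigured proxy
    options.Transports = HttpTransportType.WebSockets;
});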
appsettings.json
{
  "Azure": {
    "SignalR": {
      "ConnectionString": "Endpoint=...;AccessKey=...;Version=1.0;"
    }
  }
}
Program.cs (when using Azure SignalR Service)
builder.Services.AddSignalR()
    .AddAzureSignalR() // Microsoft.Azure.SignalR package; reads Azure:SignalR:ConnectionString by default
    .AddMessagePackProtocol();

// later
app.MapBlazorHub();
Keep the app stateless beyond the Blazor circuit. For anything shared across instances, use Redis, SQL, or a durable store.
3) Timeouts & keep‑alives
- Most proxies drop idle sockets after a few minutes. Your KeepAliveInterval (15s above) is fine.
- ClientTimeoutInterval at ~30s helps prune dead connections when users close laptops.
4) Autoscale rules that actually work
Start with:
- CPU at 60-65% over 10 minutes,
- Connections per instance (WebSocket connections) threshold that keeps memory comfortable,
- Queue length if you use a broker for background jobs.
Scale out before you’re in pain; scale in slowly.
Monitoring: what to watch during a load test
Server
- Current connections
- Messages/sec (send & receive)
- Average hub invocation time
- GC pauses and LOH allocations
- Exceptions (especially HubException) and disconnect reasons
Client
- Reconnect attempts
- Mean time to render after a message
How I collect it
- Application Insights for logs + custom metrics (track counts on connect/disconnect, queue depth).
- EventCounters via dotnet-counters during test runs.
- A small /healthz endpoint that returns connection counts and queue sizes (sample below, followed by a custom-metrics sketch).
Sample health endpoint
app.MapGet("/healthz", (PresenceStore p) => Results.Ok(new
{
    Utc = DateTime.UtcNow,
    TickerSubscribers = p.Count("TICKER")
}));
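For the custom metrics side, a minimal sketch using System.Diagnostics.Metrics; the meter and instrument names here are mine, and you’d call the two methods from your hub’s OnConnectedAsync/OnDisconnectedAsync overrides:

using System.Diagnostics.Metrics;

public sealed class ScaleMetrics
{
    private static readonly Meter Meter = new("MyApp.Blazor", "1.0");
    private long _currentConnections;

    public ScaleMetrics()
    {
        // observable gauge: sampled on demand by dotnet-counters or any MeterListener
        Meter.CreateObservableGauge("current-connections",
            () => Volatile.Read(ref _currentConnections));
    }

    public void ConnectionOpened() => Interlocked.Increment(ref _currentConnections);
    public void ConnectionClosed() => Interlocked.Decrement(ref _currentConnections);
}

Register it as a singleton; during a run, something like dotnet-counters monitor -p <pid> --counters System.Runtime,MyApp.Blazor samples the gauge next to the built-in GC and thread-pool counters.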
Load test recipe (works on a laptop and scales up)
- Spin up a k6 script (or your tool of choice; a C# alternative is sketched below) that opens N WebSockets and keeps them subscribed.
- Run for 15 minutes, broadcasting a small DTO to all clients every second.
- Record: CPU, memory, connections, send time P95, server exceptions.
- Increase by 2× until you hit the limit; note the first bottleneck (CPU, memory, socket caps). Fix that, repeat.
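If you’d rather stay in C# than write a k6 script, here’s a minimal load-client sketch; it assumes the TickerHub from earlier is mapped at /hubs/ticker, shares the StockTickDto type, and references the Microsoft.AspNetCore.SignalR.Client package:

using Microsoft.AspNetCore.SignalR.Client;

var n = 1_000; // double until something breaks; run several of these processes for 10k
var received = 0L;
var connections = new List<HubConnection>(n);

for (var i = 0; i < n; i++)
{
    var conn = new HubConnectionBuilder()
        .WithUrl("https://localhost:5001/hubs/ticker") // adjust to your host
        .WithAutomaticReconnect()
        .Build();

    conn.On<StockTickDto>("tick", _ => Interlocked.Increment(ref received));
    await conn.StartAsync();
    await conn.InvokeAsync("Subscribe", "TICKER");
    connections.Add(conn);
}

Console.WriteLine($"{n} connections open; press Enter to stop.");
Console.ReadLine();
Console.WriteLine($"Received {Interlocked.Read(ref received)} messages.");

await Task.WhenAll(connections.Select(c => c.DisposeAsync().AsTask()));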
Broadcast loop for tests
// runs inside a BackgroundService; `queue` is the BroadcastQueue singleton from earlier
var timer = new PeriodicTimer(TimeSpan.FromSeconds(1));
var rnd = new Random();

while (await timer.WaitForNextTickAsync())
{
    var dto = new StockTickDto(
        "ACME",
        Math.Round((decimal)rnd.NextDouble() * 100, 2),
        DateTimeOffset.UtcNow.ToUnixTimeMilliseconds());

    queue.TryEnqueue("TICKER", dto);
}
Common pitfalls (I’ve fixed all of these in real apps)
- Big JSON everywhere: switch to MessagePack and DTOs.
- Hub does I/O or EF per call: enqueue and let a worker do it.
- No limits: one slow browser fills server memory with render batches.
- ARR Affinity off without Azure SignalR: circuits bounce between instances and die.
- Long polling allowed: force WebSockets when possible; long polling is a last resort.
- Giant @foreach updates: render diffs, not the world.
- No autoscale: a live event arrives and your single instance cries.
Quick checklist before your next spike
- MessagePack on
- MaximumReceiveMessageSize set
- Backpressure on renders (MaxBufferedUnacknowledgedRenderBatches)
- Bounded channel for fan‑out
- Azure SignalR Service in Default mode (for big scale)
- WebSockets enabled end‑to‑end
- ARR Affinity set correctly
- Autoscale rules active
- Health endpoint + counters visible
FAQ: Blazor Server & SignalR at scale
Do I need Azure SignalR Service to reach 10,000 users?
It’s the safest route. You can reach high numbers without it on big machines, but the service gives better connection density, buffer management, and simpler routing.

Is there such a thing as connection pooling for SignalR?
Not in the database sense. Think in terms of connection density and server connections to Azure SignalR. You pool work, not sockets: use a bounded channel and background workers to smooth bursts.

Will compression shrink my payloads?
Compression for WebSockets is separate and not always available across proxies. The more reliable gain is switching to MessagePack and trimming DTOs.

Which transport should I optimize for?
SignalR prefers WebSockets. Keep those healthy and you’ll be fine.

How do I survive reconnect storms?
Make the server cheap per connection: short timeouts, strict limits, and a queue for fan‑out. The system should bend, not break.

When should I scale out?
When CPU stays above ~65% for 10 minutes, memory keeps growing, queue depth rises, or send P95 crosses your UX target.

Can I run this on Windows App Service or IIS?
Yes, but check WebSocket limits/timeouts and prefer in‑process hosting. For fewer moving parts at scale I usually pick Linux plans.
Conclusion: Win the crowd with limits, not luck
You don’t need exotic gear to serve 10,000 live users. You need tight limits, small payloads, background fan‑out, and an Azure layout that respects WebSockets. Start with the code in this post, run a load test, and tweak with data — not vibes. If you’ve tried different settings or hit a tricky bottleneck, drop a comment: what did your graphs show when the load doubled?