Are you sure your first OpenAI request isn’t leaking secrets, stalling under load, or silently eating money? In this post I’ll show you the exact C# patterns I use to call OpenAI for text generation – clean, fast, stream-friendly, and production-safe. We’ll go from “hello world” to streaming, retries, DI, and structured JSON outputs you can ship today.
What you’ll build
- A minimal console app that generates text
- Streaming responses with await foreach
- ASP.NET Core DI setup with typed clients
- Safe configuration (User Secrets / env vars)
- Raw HttpClient alternative if you can’t use the SDK
- JSON-validated outputs (for predictable parsing)
- Resilience: retries, timeouts, cancellation
I’ll use the official OpenAI .NET SDK for most examples and sprinkle in a raw REST approach so you can choose.
Prerequisites
- .NET 8+ (works with .NET Standard 2.0 for consuming the library)
- An OpenAI API key (store it outside source control)
- Basic familiarity with C# async/await
Install the official SDK
# from your project folder
dotnet add package OpenAI
Keep secrets out of Git
# Local dev (User Secrets)
dotnet user-secrets init
# set once per project
dotnet user-secrets set "OPENAI_API_KEY" "<your_key>"
In CI/containers, prefer environment variables:
# Linux/macOS
export OPENAI_API_KEY=***
# Windows (PowerShell)
$env:OPENAI_API_KEY="***"
Your first text generation (Console)
Let’s start with the simplest possible call using the Chat API. (Yes, “text generation” uses chat-style messages under the hood.)
using System;
using OpenAI.Chat;
class Program
{
static void Main()
{
var apiKey = Environment.GetEnvironmentVariable("OPENAI_API_KEY")
?? throw new InvalidOperationException("OPENAI_API_KEY is not set");
// Choose a balanced default. Swap to "gpt-4o-mini" for lower cost, or a bigger model when needed.
var client = new ChatClient(model: "gpt-4o", apiKey: apiKey);
ChatCompletion completion = client.CompleteChat(
"Write a 2-sentence product update about faster search results."
);
Console.WriteLine(completion.Content[0].Text);
}
}
Result: a short paragraph ready for your UI, email, or logs (but don’t log secrets!).
Tip: The SDK ships async variants for every call. Prefer await in real apps.
Streaming responses (print while it thinks)
When latency matters, stream tokens as they arrive.
using System;
using System.Threading.Tasks;
using OpenAI.Chat;
class Program
{
static async Task Main()
{
var apiKey = Environment.GetEnvironmentVariable("OPENAI_API_KEY")!;
var client = new ChatClient("gpt-4o", apiKey);
await foreach (var update in client.CompleteChatStreamingAsync(
"Explain why indexes speed up SQL queries in plain English."
))
{
if (update.ContentUpdate.Count > 0)
{
Console.Write(update.ContentUpdate[0].Text);
}
}
Console.WriteLine();
}
}
This pattern is perfect for chat UIs, CLI tools, or long-running content generation.
ASP.NET Core: register once, use everywhere
In web apps you want one shared client (thread-safe) via DI.
Program.cs
using OpenAI.Chat;
var builder = WebApplication.CreateBuilder(args);
var apiKey = builder.Configuration["OPENAI_API_KEY"]
?? Environment.GetEnvironmentVariable("OPENAI_API_KEY")
?? throw new InvalidOperationException("OPENAI_API_KEY missing");
builder.Services.AddSingleton(new ChatClient("gpt-4o", apiKey));
var app = builder.Build();
app.MapPost("/api/generate", async (ChatClient chat, PromptDto dto) =>
{
// Include prior messages if you need context
ChatCompletion completion = await chat.CompleteChatAsync($"Summarize in 3 bullets: {dto.Text}");
return Results.Ok(new { text = completion.Content[0].Text });
});
app.Run();
record PromptDto(string Text);
Why singleton? The SDK internally reuses HTTP connections; one instance per app avoids socket churn.
Structured JSON output (parse safely)
Free-form text is great until you need to parse it. Ask the model for JSON and validate on your side. The SDK’s Responses client can push the model toward structured output (this part of the SDK is newer and still evolving, so double-check the option names against the version you install).
using System.Text.Json;
using OpenAI.Responses;
var apiKey = Environment.GetEnvironmentVariable("OPENAI_API_KEY")!;
var responses = new OpenAIResponseClient(model: "gpt-4o-mini", apiKey: apiKey);
var response = await responses.CreateResponseAsync(
userInputText: "Extract product name and price from: 'Acme Turbo Blender — $129.99'",
new ResponseCreationOptions
{
JsonSchema = new()
{
// A tiny schema: { name: string, price: number }
Schema = JsonDocument.Parse(
"{" +
"\"type\":\"object\",\"required\":[\"name\",\"price\"]," +
"\"properties\":{\"name\":{\"type\":\"string\"},\"price\":{\"type\":\"number\"}}}"
).RootElement
}
}
);
// The response is already aligned to your schema.
var first = response.OutputItems.OfType<MessageResponseItem>().First();
var json = first.Content.First().Text; // JSON string per schema
Console.WriteLine(json);
When to use it: invoices, entity extraction, config generation, anything you would otherwise regex.
Raw REST with HttpClient (no SDK)
Sometimes policy or a shared gateway requires calling REST yourself. Here’s a minimal request using the Chat Completions endpoint.
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Text.Json;
var apiKey = Environment.GetEnvironmentVariable("OPENAI_API_KEY")!;
using var http = new HttpClient { Timeout = TimeSpan.FromSeconds(60) };
using var req = new HttpRequestMessage(HttpMethod.Post, "https://api.openai.com/v1/chat/completions");
req.Headers.Authorization = new AuthenticationHeaderValue("Bearer", apiKey);
req.Content = new StringContent(
JsonSerializer.Serialize(new
{
model = "gpt-4o-mini",
messages = new[]
{
new { role = "system", content = "You are a concise marketing copywriter." },
new { role = "user", content = "Give me a headline for a summer sale on shoes." }
},
temperature = 0.7
}), Encoding.UTF8, "application/json");
using var res = await http.SendAsync(req, HttpCompletionOption.ResponseHeadersRead);
res.EnsureSuccessStatusCode();
using var stream = await res.Content.ReadAsStreamAsync();
using var doc = await JsonDocument.ParseAsync(stream);
var content = doc.RootElement.GetProperty("choices")[0]
.GetProperty("message").GetProperty("content").GetString();
Console.WriteLine(content);
Prefer the SDK unless you need custom plumbing, protocol-level control, or to target an OpenAI-compatible proxy via a base URL.
Streaming over REST (SSE)
Server-Sent Events let you print tokens while they arrive.
using System;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Text.Json;
var apiKey = Environment.GetEnvironmentVariable("OPENAI_API_KEY")!;
using var http = new HttpClient();
var payload = new
{
model = "gpt-4o-mini",
stream = true,
messages = new[]
{
new { role = "user", content = "List 5 lightweight snacks for a meetup." }
}
};
using var req = new HttpRequestMessage(HttpMethod.Post, "https://api.openai.com/v1/chat/completions")
{
Content = new StringContent(JsonSerializer.Serialize(payload), Encoding.UTF8, "application/json")
};
req.Headers.Authorization = new AuthenticationHeaderValue("Bearer", apiKey);
using var res = await http.SendAsync(req, HttpCompletionOption.ResponseHeadersRead);
res.EnsureSuccessStatusCode();
using var reader = new StreamReader(await res.Content.ReadAsStreamAsync(), Encoding.UTF8);
while (!reader.EndOfStream)
{
var line = await reader.ReadLineAsync();
if (string.IsNullOrWhiteSpace(line)) continue; // keep-alives
if (!line.StartsWith("data:")) continue;
var data = line[5..].Trim();
if (data == "[DONE]") break;
var json = JsonDocument.Parse(data);
var delta = json.RootElement
.GetProperty("choices")[0]
.GetProperty("delta");
if (delta.TryGetProperty("content", out var text))
Console.Write(text.GetString());
}
Console.WriteLine();
Gotcha: SSE yields multiple JSON chunks; terminate when you see [DONE].
Controlling style & cost
A few knobs you’ll use daily:
- Model: gpt-4o-mini (cheap & fast), gpt-4o (richer), and reasoning models like o3-mini when you need deliberate steps.
- System prompt: set tone/policy once; keep it short & stable.
- Temperature: creativity. 0.0–0.3 deterministic, 0.7 lively, >0.9 wild.
- Max tokens: cap costs; ensure enough room for the answer.
- Stop sequences: prevent rambling when embedding into templates.
Example options (Chat):
var options = new ChatCompletionOptions
{
Temperature = 0.4f,
MaxOutputTokenCount = 400,
StopSequences = { "\n--\n" }
};
ChatCompletion completion = await client.CompleteChatAsync(
[ new SystemChatMessage("You are a terse assistant."),
new UserChatMessage("Show 3 bullet rules for PR reviews.") ],
options);
Console.WriteLine(completion.Content[0].Text);
Reliability: retries, timeouts, cancellation
Production hurts without guardrails. Here’s how I keep calls healthy.
Use cancellation tokens everywhere
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(20));
ChatCompletion result = await client.CompleteChatAsync(
[new UserChatMessage("Summarize this:")],
cancellationToken: cts.Token);
Add retry and timeout policies (Polly) for raw HttpClient
using Microsoft.Extensions.DependencyInjection;
using Polly;
using Polly.Extensions.Http;
builder.Services.AddHttpClient("openai")
.AddPolicyHandler(HttpPolicyExtensions
.HandleTransientHttpError()
.OrResult(r => (int)r.StatusCode == 429)
.WaitAndRetryAsync(3, retry => TimeSpan.FromSeconds(Math.Pow(2, retry))))
.AddPolicyHandler(Policy.TimeoutAsync<HttpResponseMessage>(TimeSpan.FromSeconds(30)));
The official SDK already retries common transient errors with exponential backoff; combine that with short timeouts and cancellation for sanity.
Don’t log secrets (ever)
- Never print the API key
- Scrub prompts if they contain PII
- Log request IDs and durations, not raw model output (unless redacted)
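A cheap way to enforce the first rule is to scrub log messages before they reach your sink. This is a rough sketch; the Redact helper and the sk- key pattern are my assumptions, not an official key-format guarantee:

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical log scrubber: masks anything that looks like an OpenAI
// API key (sk-...) before the message is written to a logging sink.
static string Redact(string message) =>
    Regex.Replace(message, @"sk-[A-Za-z0-9_\-]{8,}", "sk-***REDACTED***");

// The key itself never reaches the log output.
Console.WriteLine(Redact("calling api with key sk-abc123def456ghi789"));
```

Wire this in as a formatter or enricher in whatever logging pipeline you use, so redaction happens in one place instead of at every call site.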
Prompt patterns that actually work
System template (one-liner policy):
You are a precise technical writer. Prefer bullet points. Keep answers under 120 words unless asked.
Few-shot to anchor style:
User: Write a crisp release note about faster build times.
Assistant: • Build time -30% on average
• Parallel test runs
• Smarter cache invalidation
Guardrails for JSON:
Reply ONLY with JSON matching this schema: {"title": string, "bullets": string[]}
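The guardrail above only asks for JSON; you still need to validate what comes back before trusting it. Here is a minimal sketch (the ParseReleaseNote helper and the sample reply are hypothetical) that checks a reply against the {"title": string, "bullets": string[]} shape using System.Text.Json:

```csharp
using System;
using System.Text.Json;

// Validate a model reply against the guardrail shape before using it.
// Throws FormatException immediately on any mismatch instead of
// failing somewhere deeper in your pipeline.
static (string Title, string[] Bullets) ParseReleaseNote(string reply)
{
    // Models sometimes wrap JSON in a markdown fence; extract the object.
    var trimmed = reply.Trim();
    int start = trimmed.IndexOf('{');
    int end = trimmed.LastIndexOf('}');
    if (start < 0 || end <= start)
        throw new FormatException("No JSON object in reply");

    using var doc = JsonDocument.Parse(trimmed.Substring(start, end - start + 1));
    var root = doc.RootElement;

    if (!root.TryGetProperty("title", out var title) || title.ValueKind != JsonValueKind.String)
        throw new FormatException("Missing string 'title'");
    if (!root.TryGetProperty("bullets", out var bullets) || bullets.ValueKind != JsonValueKind.Array)
        throw new FormatException("Missing array 'bullets'");

    var items = new string[bullets.GetArrayLength()];
    int i = 0;
    foreach (var b in bullets.EnumerateArray())
        items[i++] = b.GetString() ?? throw new FormatException("Non-string bullet");

    return (title.GetString()!, items);
}

var note = ParseReleaseNote("```json\n{\"title\":\"v2\",\"bullets\":[\"faster builds\"]}\n```");
Console.WriteLine($"{note.Title}: {note.Bullets.Length} bullet(s)");
```

Failing fast here means a malformed reply becomes a retryable error instead of a corrupted record downstream.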
End-to-end example: summarizer API (minimal)
A tiny endpoint that turns long text into a short summary with predictable JSON.
using System.Text.Json;
using Microsoft.AspNetCore.Mvc;
using OpenAI.Responses;
var builder = WebApplication.CreateBuilder(args);
var apiKey = Environment.GetEnvironmentVariable("OPENAI_API_KEY")!;
builder.Services.AddSingleton(new OpenAIResponseClient("gpt-4o-mini", apiKey));
var app = builder.Build();
app.MapPost("/summarize", async ([FromBody] string text, OpenAIResponseClient client) =>
{
var schema = JsonDocument.Parse("{\n \"type\": \"object\",\n \"required\": [\"summary\"],\n \"properties\": { \"summary\": { \"type\": \"string\" } }\n}").RootElement;
var response = await client.CreateResponseAsync(
userInputText: $"Summarize in 60 words: {text}",
new ResponseCreationOptions { JsonSchema = new() { Schema = schema } });
var message = response.OutputItems.OfType<MessageResponseItem>().First();
return Results.Text(message.Content.First().Text, "application/json");
});
app.Run();
Why this rocks: one call, one JSON blob, zero post-processing pain.
Common errors (and how I fix them)
- 401 Unauthorized: key is missing/typo; check env var scope and process user.
- 429 Too Many Requests: back off (retry with jitter), lower concurrency, or use a cheaper/smaller model.
- 408/5xx: transient; rely on retries + circuit breaker.
- Model name invalid: double-check spelling and availability in your account.
- Long prompts blow context: cut boilerplate, compress history, or move to a model with a larger context window.
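For the 429 case, "retry with jitter" just means randomizing the exponential backoff so many clients don't retry in lockstep. A small sketch of the delay calculation (the base, cap, and attempt count here are illustrative values, not official rate-limit guidance):

```csharp
using System;

// Exponential backoff with full jitter:
// delay = random(0, min(cap, base * 2^attempt))
static TimeSpan BackoffWithJitter(int attempt, Random rng)
{
    const double capSeconds = 30;   // never wait longer than this
    const double baseSeconds = 1;   // first retry ceiling
    double ceiling = Math.Min(capSeconds, baseSeconds * Math.Pow(2, attempt));
    return TimeSpan.FromSeconds(rng.NextDouble() * ceiling);
}

var rng = new Random();
for (int attempt = 0; attempt < 4; attempt++)
    Console.WriteLine($"attempt {attempt}: wait {BackoffWithJitter(attempt, rng).TotalSeconds:F2}s");
```

The same formula plugs straight into the Polly WaitAndRetryAsync delegate shown earlier if you want jitter there instead of fixed powers of two.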
Architecture notes (mental model)
+----------------------------+ HTTPS +------------------+
| Your .NET App (API/UI) | ───────────▶ | OpenAI API |
| DI: ChatClient/Response | ◀─────────── | (models, tools) |
+----------------------------+ Streaming +------------------+
| retries, CT, timeouts
└──► Telemetry (OTel), logging (redacted), metrics
- SDK first: fastest way to be correct
- Raw REST: when you need total control or a gateway
- Streaming: upgrade UX with token trickle
- Structured: banish brittle scraping of free-form text
FAQ: C# + OpenAI Text Generation
Which model should I pick?
Start with gpt-4o-mini for most text tasks (cheap, fast). Upgrade to gpt-4o when quality matters more than cost. Use o3-mini when you need deliberate reasoning outputs.
Chat Completions or the Responses API?
Plan for the Responses API (it unifies tools like web/file search and structured output). Chat Completions still works, but Responses is the forward-looking path.
How do I keep costs down?
Short prompts, lower temperature, cap MaxOutputTokenCount, pick smaller models, and stream so you can cancel early once you have enough.
Can I point the client at an OpenAI-compatible proxy?
Yes – configure a custom Endpoint in client options or point your HttpClient at the proxy. Keep model names consistent with the target.
How do I unit test without network calls?
Abstract your service, inject clients, and mock them. The official SDK supports mocking via model factories so you can unit test logic without network calls.
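One way to get that seam without extra packages is to depend on a delegate (or a small interface) instead of ChatClient directly. The SummarizeAsync helper below is a hypothetical example of app logic you would test with a canned fake:

```csharp
using System;
using System.Threading.Tasks;

// Business logic under test: builds the prompt and calls the model
// through a Func<string, Task<string>> seam instead of ChatClient,
// so tests can inject a fake that never touches the network.
static async Task<string> SummarizeAsync(Func<string, Task<string>> generate, string text)
{
    var reply = await generate($"Summarize in 3 bullets: {text}");
    return reply.Trim();
}

// In production, the delegate wraps the real client; in tests, it's a stub:
Func<string, Task<string>> fake = prompt => Task.FromResult("• a\n• b\n• c");
Console.WriteLine(await SummarizeAsync(fake, "long article text..."));
```

The delegate keeps the example tiny; in a real codebase you would likely promote it to an interface registered in DI so the production and test wiring stay symmetric.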
Conclusion: Ship text generation you can trust
If your first OpenAI integration is a quick POST in a controller – cool. But a production-ready integration adds streaming, structured outputs, DI, retries, and strict secret hygiene. Start tiny, add the guardrails above, and you’ll have a service that’s fast, predictable, and inexpensive.
Your turn: which part do you want sample code for next – tool calls, RAG with a vector store, or vision inputs? Drop a comment and I’ll expand this into a mini-series.