Are you sure your first OpenAI request isn’t leaking secrets, stalling under load, or silently eating money? In this post I’ll show you the exact C# patterns I use to call OpenAI for text generation – clean, fast, stream-friendly, and production-safe. We’ll go from “hello world” to streaming, retries, DI, and structured JSON outputs you can ship today.
What you’ll build
- A minimal console app that generates text
- Streaming responses with await foreach
- ASP.NET Core DI setup with typed clients
- Safe configuration (User Secrets / env vars)
- Raw HttpClient alternative if you can’t use the SDK
- JSON-validated outputs (for predictable parsing)
- Resilience: retries, timeouts, cancellation
I’ll use the official OpenAI .NET SDK for most examples and sprinkle in a raw REST approach so you can choose.
Prerequisites
- .NET 8+ (works with .NET Standard 2.0 for consuming the library)
- An OpenAI API key (store it outside source control)
- Basic familiarity with C# async/await
Install the official SDK
# from your project folder
dotnet add package OpenAI
Keep secrets out of Git
# Local dev (User Secrets)
dotnet user-secrets init
# set once per project
dotnet user-secrets set "OPENAI_API_KEY" "<your_key>"
In CI/containers, prefer environment variables:
# Linux/macOS
export OPENAI_API_KEY=***
# Windows (PowerShell)
$env:OPENAI_API_KEY="***"
Your first text generation (Console)
Let’s start with the simplest possible call using the Chat API. (Yes, “text generation” uses chat-style messages under the hood.)
using System;
using OpenAI.Chat;
class Program
{
static void Main()
{
var apiKey = Environment.GetEnvironmentVariable("OPENAI_API_KEY")
?? throw new InvalidOperationException("OPENAI_API_KEY is not set");
// Choose a balanced default. Swap to "gpt-4o-mini" for lower cost, or a bigger model when needed.
var client = new ChatClient(model: "gpt-4o", apiKey: apiKey);
ChatCompletion completion = client.CompleteChat(
"Write a 2-sentence product update about faster search results."
);
Console.WriteLine(completion.Content[0].Text);
}
}
Result: a short paragraph ready for your UI, email, or logs (but don’t log secrets!).
Tip: The SDK ships async variants for every call. Prefer await in real apps.
Streaming responses (print while it thinks)
When latency matters, stream tokens as they arrive.
using System;
using System.Threading.Tasks;
using OpenAI.Chat;
class Program
{
static async Task Main()
{
var apiKey = Environment.GetEnvironmentVariable("OPENAI_API_KEY")!;
var client = new ChatClient("gpt-4o", apiKey);
await foreach (var update in client.CompleteChatStreamingAsync(
"Explain why indexes speed up SQL queries in plain English."
))
{
if (update.ContentUpdate.Count > 0)
{
Console.Write(update.ContentUpdate[0].Text);
}
}
Console.WriteLine();
}
}
This pattern is perfect for chat UIs, CLI tools, or long-running content generation.
ASP.NET Core: register once, use everywhere
In web apps you want one shared client (thread-safe) via DI.
Program.cs
using OpenAI.Chat;
var builder = WebApplication.CreateBuilder(args);
var apiKey = builder.Configuration["OPENAI_API_KEY"]
?? Environment.GetEnvironmentVariable("OPENAI_API_KEY")
?? throw new InvalidOperationException("OPENAI_API_KEY missing");
builder.Services.AddSingleton(new ChatClient("gpt-4o", apiKey));
var app = builder.Build();
app.MapPost("/api/generate", async (ChatClient chat, PromptDto dto) =>
{
// Include prior messages if you need context
ChatCompletion completion = await chat.CompleteChatAsync($"Summarize in 3 bullets: {dto.Text}");
return Results.Ok(new { text = completion.Content[0].Text });
});
app.Run();
record PromptDto(string Text);
Why singleton? The SDK internally reuses HTTP connections; one instance per app avoids socket churn.
Structured JSON output (parse safely)
Free-form text is great until you need to parse it. Ask the model for JSON and validate on your side. The SDK’s Responses client can push the model toward structured output (this part of the SDK is newer and still evolving, so double-check the option names against the version you install).
using System.Text.Json;
using OpenAI.Responses;
var apiKey = Environment.GetEnvironmentVariable("OPENAI_API_KEY")!;
var responses = new OpenAIResponseClient(model: "gpt-4o-mini", apiKey: apiKey);
var response = await responses.CreateResponseAsync(
userInputText: "Extract product name and price from: 'Acme Turbo Blender — $129.99'",
new ResponseCreationOptions
{
JsonSchema = new()
{
// A tiny schema: { name: string, price: number }
Schema = JsonDocument.Parse(
"{" +
"\"type\":\"object\",\"required\":[\"name\",\"price\"]," +
"\"properties\":{\"name\":{\"type\":\"string\"},\"price\":{\"type\":\"number\"}}}"
).RootElement
}
}
);
// The response is already aligned to your schema.
var first = response.OutputItems.OfType<MessageResponseItem>().First();
var json = first.Content.First().Text; // JSON string per schema
Console.WriteLine(json);
When to use it: invoices, entity extraction, config generation, anything you would otherwise regex.
Raw REST with HttpClient (no SDK)
Sometimes policy or a shared gateway requires calling REST yourself. Here’s a minimal request using the Chat Completions endpoint.
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Text.Json;
var apiKey = Environment.GetEnvironmentVariable("OPENAI_API_KEY")!;
using var http = new HttpClient { Timeout = TimeSpan.FromSeconds(60) };
using var req = new HttpRequestMessage(HttpMethod.Post, "https://api.openai.com/v1/chat/completions");
req.Headers.Authorization = new AuthenticationHeaderValue("Bearer", apiKey);
req.Content = new StringContent(
JsonSerializer.Serialize(new
{
model = "gpt-4o-mini",
messages = new[]
{
new { role = "system", content = "You are a concise marketing copywriter." },
new { role = "user", content = "Give me a headline for a summer sale on shoes." }
},
temperature = 0.7
}), Encoding.UTF8, "application/json");
using var res = await http.SendAsync(req, HttpCompletionOption.ResponseHeadersRead);
res.EnsureSuccessStatusCode();
using var stream = await res.Content.ReadAsStreamAsync();
using var doc = await JsonDocument.ParseAsync(stream);
var content = doc.RootElement.GetProperty("choices")[0]
.GetProperty("message").GetProperty("content").GetString();
Console.WriteLine(content);
Prefer the SDK unless you need custom plumbing, protocol-level control, or to target an OpenAI-compatible proxy via a base URL.
Streaming over REST (SSE)
Server-Sent Events let you print tokens while they arrive.
using System;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Text.Json;
var apiKey = Environment.GetEnvironmentVariable("OPENAI_API_KEY")!;
using var http = new HttpClient();
var payload = new
{
model = "gpt-4o-mini",
stream = true,
messages = new[]
{
new { role = "user", content = "List 5 lightweight snacks for a meetup." }
}
};
using var req = new HttpRequestMessage(HttpMethod.Post, "https://api.openai.com/v1/chat/completions")
{
Content = new StringContent(JsonSerializer.Serialize(payload), Encoding.UTF8, "application/json")
};
req.Headers.Authorization = new AuthenticationHeaderValue("Bearer", apiKey);
using var res = await http.SendAsync(req, HttpCompletionOption.ResponseHeadersRead);
res.EnsureSuccessStatusCode();
using var reader = new StreamReader(await res.Content.ReadAsStreamAsync(), Encoding.UTF8);
while (!reader.EndOfStream)
{
var line = await reader.ReadLineAsync();
if (string.IsNullOrWhiteSpace(line)) continue; // keep-alives
if (!line.StartsWith("data:")) continue;
var data = line[5..].Trim();
if (data == "[DONE]") break;
var json = JsonDocument.Parse(data);
var delta = json.RootElement
.GetProperty("choices")[0]
.GetProperty("delta");
if (delta.TryGetProperty("content", out var text))
Console.Write(text.GetString());
}
Console.WriteLine();
Gotcha: SSE yields multiple JSON chunks; terminate when you see [DONE].
Controlling style & cost
A few knobs you’ll use daily:
- Model: gpt-4o-mini (cheap & fast), gpt-4o (richer), and reasoning models like o3-mini when you need deliberate steps.
- System prompt: set tone/policy once; keep it short & stable.
- Temperature: creativity. 0.0–0.3 deterministic, 0.7 lively, >0.9 wild.
- Max tokens: cap costs; ensure enough room for the answer.
- Stop sequences: prevent rambling when embedding into templates.
Example options (Chat):
var options = new ChatCompletionOptions
{
Temperature = 0.4f,
MaxOutputTokenCount = 400,
StopSequences = { "\n--\n" }
};
ChatCompletion completion = await client.CompleteChatAsync(
[ new SystemChatMessage("You are a terse assistant."),
new UserChatMessage("Show 3 bullet rules for PR reviews.") ],
options);
Console.WriteLine(completion.Content[0].Text);
Reliability: retries, timeouts, cancellation
Production hurts without guardrails. Here’s how I keep calls healthy.
Use cancellation tokens everywhere
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(20));
ChatCompletion result = await client.CompleteChatAsync(
[new UserChatMessage("Summarize this:")],
cancellationToken: cts.Token);
Add retry and timeout policies (Polly) for raw HttpClient
using Microsoft.Extensions.DependencyInjection;
using Polly;
using Polly.Extensions.Http;
builder.Services.AddHttpClient("openai")
.AddPolicyHandler(HttpPolicyExtensions
.HandleTransientHttpError()
.OrResult(r => (int)r.StatusCode == 429)
.WaitAndRetryAsync(3, retry => TimeSpan.FromSeconds(Math.Pow(2, retry))))
.AddPolicyHandler(Policy.TimeoutAsync<HttpResponseMessage>(TimeSpan.FromSeconds(30)));
The official SDK already retries common transient errors with exponential backoff; combine that with short timeouts and cancellation for sanity.
Don’t log secrets (ever)
- Never print the API key
- Scrub prompts if they contain PII
- Log request IDs and durations, not raw model output (unless redacted)
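A cheap way to enforce the first rule is to scrub log messages before they reach your sink. This is a rough sketch; the Redact helper and the sk- key pattern are my assumptions, not an official key-format guarantee:

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical log scrubber: masks anything that looks like an OpenAI
// API key (sk-...) before the message is written to a logging sink.
static string Redact(string message) =>
    Regex.Replace(message, @"sk-[A-Za-z0-9_\-]{8,}", "sk-***REDACTED***");

// The key itself never reaches the log output.
Console.WriteLine(Redact("calling api with key sk-abc123def456ghi789"));
```

Wire this in as a formatter or enricher in whatever logging pipeline you use, so redaction happens in one place instead of at every call site.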
Prompt patterns that actually work
System template (one-liner policy):
You are a precise technical writer. Prefer bullet points. Keep answers under 120 words unless asked.
Few-shot to anchor style:
User: Write a crisp release note about faster build times.
Assistant: • Build time -30% on average
• Parallel test runs
• Smarter cache invalidation
Guardrails for JSON:
Reply ONLY with JSON matching this schema: {"title": string, "bullets": string[]}
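The guardrail above only asks for JSON; you still need to validate what comes back before trusting it. Here is a minimal sketch (the ParseReleaseNote helper and the sample reply are hypothetical) that checks a reply against the {"title": string, "bullets": string[]} shape using System.Text.Json:

```csharp
using System;
using System.Text.Json;

// Validate a model reply against the guardrail shape before using it.
// Throws FormatException immediately on any mismatch instead of
// failing somewhere deeper in your pipeline.
static (string Title, string[] Bullets) ParseReleaseNote(string reply)
{
    // Models sometimes wrap JSON in a markdown fence; extract the object.
    var trimmed = reply.Trim();
    int start = trimmed.IndexOf('{');
    int end = trimmed.LastIndexOf('}');
    if (start < 0 || end <= start)
        throw new FormatException("No JSON object in reply");

    using var doc = JsonDocument.Parse(trimmed.Substring(start, end - start + 1));
    var root = doc.RootElement;

    if (!root.TryGetProperty("title", out var title) || title.ValueKind != JsonValueKind.String)
        throw new FormatException("Missing string 'title'");
    if (!root.TryGetProperty("bullets", out var bullets) || bullets.ValueKind != JsonValueKind.Array)
        throw new FormatException("Missing array 'bullets'");

    var items = new string[bullets.GetArrayLength()];
    int i = 0;
    foreach (var b in bullets.EnumerateArray())
        items[i++] = b.GetString() ?? throw new FormatException("Non-string bullet");

    return (title.GetString()!, items);
}

var note = ParseReleaseNote("```json\n{\"title\":\"v2\",\"bullets\":[\"faster builds\"]}\n```");
Console.WriteLine($"{note.Title}: {note.Bullets.Length} bullet(s)");
```

Failing fast here means a malformed reply becomes a retryable error instead of a corrupted record downstream.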
End-to-end example: summarizer API (minimal)
A tiny endpoint that turns long text into a short summary with predictable JSON.
using System.Text.Json;
using Microsoft.AspNetCore.Mvc;
using OpenAI.Responses;
var builder = WebApplication.CreateBuilder(args);
var apiKey = Environment.GetEnvironmentVariable("OPENAI_API_KEY")!;
builder.Services.AddSingleton(new OpenAIResponseClient("gpt-4o-mini", apiKey));
var app = builder.Build();
app.MapPost("/summarize", async ([FromBody] string text, OpenAIResponseClient client) =>
{
var schema = JsonDocument.Parse("{\n \"type\": \"object\",\n \"required\": [\"summary\"],\n \"properties\": { \"summary\": { \"type\": \"string\" } }\n}").RootElement;
var response = await client.CreateResponseAsync(
userInputText: $"Summarize in 60 words: {text}",
new ResponseCreationOptions { JsonSchema = new() { Schema = schema } });
var message = response.OutputItems.OfType<MessageResponseItem>().First();
return Results.Text(message.Content.First().Text, "application/json");
});
app.Run();
Why this rocks: one call, one JSON blob, zero post-processing pain.
Common errors (and how I fix them)
- 401 Unauthorized: key is missing/typo; check env var scope and process user.
- 429 Too Many Requests: back off (retry with jitter), lower concurrency, or use a cheaper/smaller model.
- 408/5xx: transient; rely on retries + circuit breaker.
- Model name invalid: double-check spelling and availability in your account.
- Long prompts blow context: cut boilerplate, compress history, or move to a model with a larger context window.
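For the 429 case, "retry with jitter" just means randomizing the exponential backoff so many clients don't retry in lockstep. A small sketch of the delay calculation (the base, cap, and attempt count here are illustrative values, not official rate-limit guidance):

```csharp
using System;

// Exponential backoff with full jitter:
// delay = random(0, min(cap, base * 2^attempt))
static TimeSpan BackoffWithJitter(int attempt, Random rng)
{
    const double capSeconds = 30;   // never wait longer than this
    const double baseSeconds = 1;   // first retry ceiling
    double ceiling = Math.Min(capSeconds, baseSeconds * Math.Pow(2, attempt));
    return TimeSpan.FromSeconds(rng.NextDouble() * ceiling);
}

var rng = new Random();
for (int attempt = 0; attempt < 4; attempt++)
    Console.WriteLine($"attempt {attempt}: wait {BackoffWithJitter(attempt, rng).TotalSeconds:F2}s");
```

The same formula plugs straight into the Polly WaitAndRetryAsync delegate shown earlier if you want jitter there instead of fixed powers of two.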
Architecture notes (mental model)
+----------------------------+ HTTPS +------------------+
| Your .NET App (API/UI) | ───────────▶ | OpenAI API |
| DI: ChatClient/Response | ◀─────────── | (models, tools) |
+----------------------------+ Streaming +------------------+
| retries, CT, timeouts
└──► Telemetry (OTel), logging (redacted), metrics
- SDK first: fastest way to be correct
- Raw REST: when you need total control or a gateway
- Streaming: upgrade UX with token trickle
- Structured: banish brittle scraping of free-form text
FAQ: C# + OpenAI Text Generation
Which model should I pick?
Start with gpt-4o-mini for most text tasks (cheap, fast). Upgrade to gpt-4o when quality matters more than cost. Use o3-mini when you need deliberate reasoning outputs.
Chat Completions or the Responses API?
Plan for the Responses API (it unifies tools like web/file search and structured output). Chat Completions still works, but Responses is the forward-looking path.
How do I keep costs down?
Short prompts, lower temperature, cap MaxOutputTokenCount, pick smaller models, and stream so you can cancel early once you have enough.
Can I point the client at an OpenAI-compatible proxy?
Yes – configure a custom Endpoint in client options or point your HttpClient at the proxy. Keep model names consistent with the target.
How do I unit test without network calls?
Abstract your service, inject clients, and mock them. The official SDK supports mocking via model factories so you can unit test logic without network calls.
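One way to get that seam without extra packages is to depend on a delegate (or a small interface) instead of ChatClient directly. The SummarizeAsync helper below is a hypothetical example of app logic you would test with a canned fake:

```csharp
using System;
using System.Threading.Tasks;

// Business logic under test: builds the prompt and calls the model
// through a Func<string, Task<string>> seam instead of ChatClient,
// so tests can inject a fake that never touches the network.
static async Task<string> SummarizeAsync(Func<string, Task<string>> generate, string text)
{
    var reply = await generate($"Summarize in 3 bullets: {text}");
    return reply.Trim();
}

// In production, the delegate wraps the real client; in tests, it's a stub:
Func<string, Task<string>> fake = prompt => Task.FromResult("• a\n• b\n• c");
Console.WriteLine(await SummarizeAsync(fake, "long article text..."));
```

The delegate keeps the example tiny; in a real codebase you would likely promote it to an interface registered in DI so the production and test wiring stay symmetric.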
Conclusion: Ship text generation you can trust
If your first OpenAI integration is a quick POST in a controller – cool. But a production-ready integration adds streaming, structured outputs, DI, retries, and strict secret hygiene. Start tiny, add the guardrails above, and you’ll have a service that’s fast, predictable, and inexpensive.
Your turn: which part do you want sample code for next – tool calls, RAG with a vector store, or vision inputs? Drop a comment and I’ll expand this into a mini-series.