Ever tried to teach an assistant to fetch coffee only to discover it remembers yesterday’s cappuccino order but forgets the sugar again? If yes, welcome to the world of context‑hungry AI, where memory is gold and “skills” decide whether your bot is a genius or a goldfish. Microsoft’s Semantic Kernel (SK) lets you weave classic C# methods with large‑language‑model prompts – no glue gun required. By the end of this article you’ll know exactly when to write C# delegates, when to reach for YAML, and how to wire in long‑term memory so your bot stops acting like Groundhog Day.
The Hybrid Philosophy – Why Two Function Types Beat One
Semantic Kernel takes a hybrid‑function model approach:
- Native functions (C# delegates) for deterministic logic, database hits, or anything that absolutely must not hallucinate.
- Semantic functions (prompt templates) for fuzzy language tasks – summaries, polite rewrites, chain‑of‑thought reasoning.
Think of it like having a calculator and a poet share the same brain. You call the calculator for “2 + 2”, and the poet for “explain zero‑trust in pirate slang”. The glue is SK’s Pipeline – it treats both delegates and prompts as drop‑in skills, so orchestration is uniform.
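That uniformity is easiest to see in a pipeline call – a sketch assuming the classic RunAsync pipeline overload and two previously imported skills ("util"/"Clean" and "writer"/"Summarize" are invented names, not part of SK):

```csharp
// Native pre-processing feeds straight into a generative step;
// SK pipes each function's output into the next one's {{$input}}.
var result = await kernel.RunAsync(
    rawEmailText,
    kernel.Skills.GetFunction("util", "Clean"),        // deterministic C#
    kernel.Skills.GetFunction("writer", "Summarize")); // prompt template

Console.WriteLine(result);
```

The caller never cares which step is code and which is a prompt – both are just functions in the chain.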
Defining a Skill – Anatomy of a Folder
A skill is just a folder until you breathe prompts into it. Inside skills/, each subfolder is a skill exposing one or more functions. Typical layout:
skills/
└── SummarizeEmail/
    ├── SummarizeEmail.yaml   # Semantic prompt
    ├── config.json           # Optional metadata (description, auth, schema)
    └── SummarizeEmail.cs     # Optional native helper(s)
Your First YAML Prompt
name: SummarizeEmail
description: "Turn verbose emails into bullets."
template: |
  Write a concise, friendly summary of the email below
  that a busy CTO can read in 10 seconds.

  {{$input}}
input_variables:
  - name: input
    type: string
    description: "Raw email body"
    default: ""
Key points:
- template – may contain placeholders in SK's prompt-template syntax ({{$input}}, {{$time}}).
- input_variables – strongly typed, so SK can surface IntelliSense inside C#.
- config.json (optional) – versions your prompt and sets max_tokens, temperature, or the required OpenAI model.
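For reference, a minimal config.json for the skill above might look like this (field names follow the classic SK prompt‑config schema; treat the exact values as illustrative):

```json
{
  "schema": 1,
  "type": "completion",
  "description": "Turn verbose emails into bullets.",
  "completion": {
    "max_tokens": 256,
    "temperature": 0.2
  }
}
```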
Because every prompt lives in a folder, you can ship skills as NuGet‑ready assets – drop skills/ZipCodeLookup/ into any repo and you’re off.
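Loading and invoking that folder‑based skill then takes a couple of lines – a sketch against SK’s classic directory import (emailBody is a placeholder string):

```csharp
// Imports every prompt folder under ./skills/SummarizeEmail
// as functions of a skill named "SummarizeEmail".
var skill = kernel.ImportSemanticSkillFromDirectory("./skills", "SummarizeEmail");

// {{$input}} in the template is bound to the string we pass in.
var bullets = await kernel.RunAsync(emailBody, skill["SummarizeEmail"]);
Console.WriteLine(bullets);
```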
When to Go Native – Writing C# Functions
Not everything deserves a token‑hungry prompt. Use delegates when you need:
- Zero risk of hallucination (financial calculations).
- Heavy I/O (SQL, REST) best done outside the LLM loop.
- Ultra‑low latency (millisecond SLA).
public class UtilitySkill
{
    // Your own HTTP client abstraction, injected via constructor (omitted for brevity)
    private readonly IExchangeRateClient _fxClient;

    [SKFunction("Generate a cryptographically‑secure random GUID")]
    public static string NewGuid() => Guid.NewGuid().ToString();

    [SKFunction("Fetch today's USD → EUR rate")]
    public async Task<decimal> ExchangeRateAsync()
        => await _fxClient.GetRateAsync("USD", "EUR");
}
Register it with your kernel:
kernel.ImportSkill(new UtilitySkill(), "utility");
The delegate now shows up as {{utility.NewGuid}} inside any semantic template, bridging deterministic output into generative flows.
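Bridged into a template, that looks like this – a sketch in which the ticket‑reply scenario is invented:

```yaml
template: |
  Ticket reference: {{utility.NewGuid}}
  Generated at: {{$time}}

  Write a polite reply to the message below.
  {{$input}}
```

The GUID and timestamp are computed deterministically in C# before the prompt ever reaches the model.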
Working with Memory – Embeddings, Stores & the 2025 Unified API
SK 0.93 shipped the Unified Memory API – one façade to rule OpenAI, Azure AI, Qdrant, Pinecone, Redis Vector, and any custom store you write.
var memory = new MemoryBuilder()
    .WithAzureOpenAITextEmbeddings(deployment: "gpt-4o-embeddings")
    .WithQdrant(endpoint: "http://localhost:6333")
    .Build();
Writing Memories
await memory.SaveReferenceAsync(
    collection: "emails",
    externalId: email.Id,
    text: email.Body,
    description: "Customer support thread");
Behind the scenes SK vectorizes email.Body, stores it, and returns a MemoryRecord. Need metadata? Stick it into the description field or attach a JSON blob.
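One way to attach that JSON blob – a sketch assuming the externalSourceName and additionalMetadata parameters of SK’s classic memory API (the Priority/Region fields and "crm" source name are invented):

```csharp
using System.Text.Json;

// Serialize arbitrary metadata next to the vector; read it back later
// from the result's metadata and deserialize.
var blob = JsonSerializer.Serialize(new { Priority = "high", Region = "EMEA" });

await memory.SaveReferenceAsync(
    collection: "emails",
    text: email.Body,
    externalId: email.Id,
    externalSourceName: "crm",
    description: "Customer support thread",
    additionalMetadata: blob);
```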
Recalling Context
var recalls = await memory.SearchAsync(
    "emails",
    query: "budget approval status",
    limit: 3,
    minRelevance: 0.8);
You get the three closest chunks plus their relevance scores for chaining back into your prompt – “Here’s what the customer said earlier…”.
Tip: Swap stores at runtime – testing locally with SQLite vectors, then flipping to Azure Cognitive Search in production – without touching business code.
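In practice the swap is one line at startup – a sketch using SK’s in‑process VolatileMemoryStore for local runs (isProduction, the endpoint, and vectorSize: 1536 are placeholder values; a SQLite‑backed store plugs in the same way):

```csharp
// The rest of the app only ever sees the memory facade,
// so the store choice is invisible to business code.
IMemoryStore store = isProduction
    ? new QdrantMemoryStore("http://qdrant.internal:6333", vectorSize: 1536)
    : new VolatileMemoryStore(); // in-memory, ideal for tests

var memory = new MemoryBuilder()
    .WithMemoryStore(store)
    .WithAzureOpenAITextEmbeddings(deployment: "gpt-4o-embeddings")
    .Build();
```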
Hands‑On: Building a Hybrid Q&A Bot
Let’s wire the pieces into a console chatbot that:
- Loads FAQSkill (semantic) and UtilitySkill (native).
- Stores every user question in memory.
- Retrieves the top‑k related past Qs before answering.
var kernel = new KernelBuilder()
    .WithAzureOpenAIChatCompletion("gpt-4o", key)
    .WithMemory(memory)
    .Build();

// import skills: semantic from disk, native from the class
kernel.ImportSemanticSkillFromDirectory("./skills", "FAQSkill");
kernel.ImportSkill(new UtilitySkill(), "utility");

while (true)
{
    Console.Write("👤 ");
    var question = Console.ReadLine();
    if (string.IsNullOrWhiteSpace(question)) continue;

    // store the question for future recall
    await kernel.Memory.SaveInformationAsync("chat", text: question, id: Guid.NewGuid().ToString());

    // recall the two most similar past questions
    var history = await kernel.Memory.SearchAsync("chat", question, limit: 2);

    var context = new ContextVariables(question);
    context.Set("history", string.Join("\n", history.Select(h => h.Metadata.Text)));

    // ask the semantic function
    var answer = await kernel.RunAsync(
        context,
        kernel.Skills.GetFunction("FAQSkill", "Answer"));

    Console.WriteLine($"🤖 {answer}");
}
With fewer than 60 lines you get a bot that remembers earlier discussions and answers consistently – no sugar left behind.
FAQ: Semantic Kernel Memory & Vector Databases
Does memory persist between sessions and restarts?
Yes. Memory lives in whichever vector store you configure. Local stores (SQLite, LiteDB) write to disk, while managed stores (Qdrant Cloud, Pinecone, Azure AI Search) persist indefinitely – retention depends on your service tier.
Which vector stores does SK support out of the box?
Qdrant, Pinecone, Typesense, Azure AI Search, Cosmos DB with vectors, Redis Vector, Chroma, and Postgres pgvector. Adding a new one is one interface away – implement IMemoryStore.
Do I need to re‑embed if I switch embedding models?
Yes. Embeddings from different models live in different vector spaces, so relevance collapses if you swap models mid‑collection. To migrate, re‑embed existing text and upsert the new vectors; SK’s ReembedAsync() helper streams through a collection for you.
How much text can a single memory hold?
A practical ceiling is your embedding model’s token limit (8,192 tokens for GPT‑4o embeddings in 2025). SK auto‑chunks oversized text when you set EnableAutoChunking = true on MemoryBuilder.
Are stored memories encrypted?
Encryption depends on the backing store. Azure AI Search, Cosmos DB, and Pinecone default to AES‑256. Self‑hosted Qdrant/Chroma can enable full‑disk encryption; SK always transmits over HTTPS, so transport is covered.
Conclusion: From Skills to Agents
You’ve seen how skills, semantic functions, and the new memory API turn Semantic Kernel into a Swiss‑army orchestrator. Next up is SK’s Agent Framework, where these building blocks grow decision‑making brains, schedule background tasks, and even spawn child kernels. Curious? Smash that subscribe and share your wildest hybrid‑skill ideas in the comments – I read every one.