Ever tried to teach an assistant to fetch coffee only to discover it remembers yesterday’s cappuccino order but forgets the sugar again? If yes, welcome to the world of context‑hungry AI, where memory is gold and “skills” decide whether your bot is a genius or a goldfish. Microsoft’s Semantic Kernel (SK) lets you weave classic C# methods with large‑language‑model prompts – no glue gun required. By the end of this article you’ll know exactly when to write C# delegates, when to reach for YAML, and how to wire in long‑term memory so your bot stops acting like Groundhog Day.
The Hybrid Philosophy – Why Two Function Types Beat One
Semantic Kernel takes a hybrid‑function model approach:
- Native functions (C# delegates) for deterministic logic, database hits, or anything that absolutely must not hallucinate.
- Semantic functions (prompt templates) for fuzzy language tasks – summaries, polite rewrites, chain‑of‑thought reasoning.
Think of it like having a calculator and a poet share the same brain. You call the calculator for “2 + 2”, and the poet for “explain zero‑trust in pirate slang”. The glue is SK’s Pipeline – it treats both delegates and prompts as drop‑in skills, so orchestration is uniform.
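That uniformity is easiest to see in a pipeline call – a sketch assuming the classic RunAsync pipeline overload and two previously imported skills ("util"/"Clean" and "writer"/"Summarize" are invented names, not part of SK):

```csharp
// Native pre-processing feeds straight into a generative step;
// SK pipes each function's output into the next one's {{$input}}.
var result = await kernel.RunAsync(
    rawEmailText,
    kernel.Skills.GetFunction("util", "Clean"),        // deterministic C#
    kernel.Skills.GetFunction("writer", "Summarize")); // prompt template

Console.WriteLine(result);
```

The caller never cares which step is code and which is a prompt – both are just functions in the chain.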
Defining a Skill – Anatomy of a Folder
A skill is just a folder until you breathe prompts into it. Inside skills/, each subfolder is a skill exposing one or more functions. Typical layout:
skills/
└── SummarizeEmail/
    ├── SummarizeEmail.yaml   # Semantic prompt
    ├── config.json           # Optional metadata (description, auth, schema)
    └── SummarizeEmail.cs     # Optional native helper(s)
Your First YAML Prompt
name: SummarizeEmail
description: "Turn verbose emails into bullets."
template: |
  Write a concise, friendly summary of the email below
  that a busy CTO can read in 10 seconds.

  {{$input}}
input_variables:
  - name: input
    type: string
    description: "Raw email body"
    default: ""
Key points:
- template – may contain placeholders in SK's prompt-template syntax ({{$input}}, {{$time}}).
- input_variables – strongly typed, so SK can surface IntelliSense inside C#.
- config.json (optional) – versions your prompt and sets max_tokens, temperature, or the required OpenAI model.
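For reference, a minimal config.json for the skill above might look like this (field names follow the classic SK prompt‑config schema; treat the exact values as illustrative):

```json
{
  "schema": 1,
  "type": "completion",
  "description": "Turn verbose emails into bullets.",
  "completion": {
    "max_tokens": 256,
    "temperature": 0.2
  }
}
```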
Because every prompt lives in a folder, you can ship skills as NuGet‑ready assets – drop skills/ZipCodeLookup/ into any repo and you’re off.
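Loading and invoking that folder‑based skill then takes a couple of lines – a sketch against SK’s classic directory import (emailBody is a placeholder string):

```csharp
// Imports every prompt folder under ./skills/SummarizeEmail
// as functions of a skill named "SummarizeEmail".
var skill = kernel.ImportSemanticSkillFromDirectory("./skills", "SummarizeEmail");

// {{$input}} in the template is bound to the string we pass in.
var bullets = await kernel.RunAsync(emailBody, skill["SummarizeEmail"]);
Console.WriteLine(bullets);
```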
When to Go Native – Writing C# Functions
Not everything deserves a token‑hungry prompt. Use delegates when you need:
- Zero risk of hallucination (financial calculations).
- Heavy I/O (SQL, REST) best done outside the LLM loop.
- Ultra‑low latency (millisecond SLA).
public class UtilitySkill
{
    // Your own HTTP client abstraction, injected via constructor (omitted for brevity)
    private readonly IExchangeRateClient _fxClient;

    [SKFunction("Generate a cryptographically‑secure random GUID")]
    public static string NewGuid() => Guid.NewGuid().ToString();

    [SKFunction("Fetch today's USD → EUR rate")]
    public async Task<decimal> ExchangeRateAsync()
        => await _fxClient.GetRateAsync("USD", "EUR");
}
Register it with your kernel:
kernel.ImportSkill(new UtilitySkill(), "utility");
The delegate now shows up as {{utility.NewGuid}} inside any semantic template, bridging deterministic output into generative flows.
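Bridged into a template, that looks like this – a sketch in which the ticket‑reply scenario is invented:

```yaml
template: |
  Ticket reference: {{utility.NewGuid}}
  Generated at: {{$time}}

  Write a polite reply to the message below.
  {{$input}}
```

The GUID and timestamp are computed deterministically in C# before the prompt ever reaches the model.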
Working with Memory – Embeddings, Stores & the 2025 Unified API
SK 0.93 shipped the Unified Memory API – one façade to rule OpenAI, Azure AI, Qdrant, Pinecone, Redis Vector, and any custom store you write.
var memory = new MemoryBuilder()
    .WithAzureOpenAITextEmbeddings(deployment: "gpt-4o-embeddings")
    .WithQdrant(endpoint: "http://localhost:6333")
    .Build();
Writing Memories
await memory.SaveReferenceAsync(
    collection: "emails",
    externalId: email.Id,
    text: email.Body,
    description: "Customer support thread");
Behind the scenes SK vectorizes email.Body, stores it, and returns a MemoryRecord. Need metadata? Stick it into the description field or attach a JSON blob.
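One way to attach that JSON blob – a sketch assuming the externalSourceName and additionalMetadata parameters of SK’s classic memory API (the Priority/Region fields and "crm" source name are invented):

```csharp
using System.Text.Json;

// Serialize arbitrary metadata next to the vector; read it back later
// from the result's metadata and deserialize.
var blob = JsonSerializer.Serialize(new { Priority = "high", Region = "EMEA" });

await memory.SaveReferenceAsync(
    collection: "emails",
    text: email.Body,
    externalId: email.Id,
    externalSourceName: "crm",
    description: "Customer support thread",
    additionalMetadata: blob);
```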
Recalling Context
var recalls = await memory.SearchAsync(
    "emails",
    query: "budget approval status",
    limit: 3,
    minRelevance: 0.8);
You get the three closest chunks plus their relevance scores for chaining back into your prompt – “Here’s what the customer said earlier…”.
Tip: Swap stores at runtime – testing locally with SQLite vectors, then flipping to Azure Cognitive Search in production – without touching business code.
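In practice the swap is one line at startup – a sketch using SK’s in‑process VolatileMemoryStore for local runs (isProduction, the endpoint, and vectorSize: 1536 are placeholder values; a SQLite‑backed store plugs in the same way):

```csharp
// The rest of the app only ever sees the memory facade,
// so the store choice is invisible to business code.
IMemoryStore store = isProduction
    ? new QdrantMemoryStore("http://qdrant.internal:6333", vectorSize: 1536)
    : new VolatileMemoryStore(); // in-memory, ideal for tests

var memory = new MemoryBuilder()
    .WithMemoryStore(store)
    .WithAzureOpenAITextEmbeddings(deployment: "gpt-4o-embeddings")
    .Build();
```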
Hands‑On: Building a Hybrid Q&A Bot
Let’s wire the pieces into a console chatbot that:
- Loads FAQSkill (semantic) and UtilitySkill (native).
- Stores every user question in memory.
- Retrieves the top‑k related past Qs before answering.
var kernel = new KernelBuilder()
    .WithAzureOpenAIChatCompletion("gpt-4o", key)
    .WithMemory(memory)
    .Build();

// import skills: semantic from disk, native from the class
kernel.ImportSemanticSkillFromDirectory("./skills", "FAQSkill");
kernel.ImportSkill(new UtilitySkill(), "utility");

while (true)
{
    Console.Write("👤 ");
    var question = Console.ReadLine();
    if (string.IsNullOrWhiteSpace(question)) continue;

    // store the question for future recall
    await kernel.Memory.SaveInformationAsync("chat", text: question, id: Guid.NewGuid().ToString());

    // recall the two most similar past questions
    var history = await kernel.Memory.SearchAsync("chat", question, limit: 2);

    var context = new ContextVariables(question);
    context.Set("history", string.Join("\n", history.Select(h => h.Metadata.Text)));

    // ask the semantic function
    var answer = await kernel.RunAsync(
        context,
        kernel.Skills.GetFunction("FAQSkill", "Answer"));

    Console.WriteLine($"🤖 {answer}");
}
With fewer than 60 lines you get a bot that remembers earlier discussions and answers consistently – no sugar left behind.
FAQ: Semantic Kernel Memory & Vector Databases
Does memory persist between sessions and restarts?
Yes. Memory lives in whichever vector store you configure. Local stores (SQLite, LiteDB) write to disk, while managed stores (Qdrant Cloud, Pinecone, Azure AI Search) persist indefinitely – retention depends on your service tier.
Which vector stores does SK support out of the box?
Qdrant, Pinecone, Typesense, Azure AI Search, Cosmos DB with vectors, Redis Vector, Chroma, and Postgres pgvector. Adding a new one is one interface away – implement IMemoryStore.
Do I need to re‑embed if I switch embedding models?
Yes. Embeddings from different models live in different vector spaces, so relevance collapses if you swap models mid‑collection. To migrate, re‑embed existing text and upsert the new vectors; SK’s ReembedAsync() helper streams through a collection for you.
How much text can a single memory hold?
A practical ceiling is your embedding model’s token limit (8,192 tokens for GPT‑4o embeddings in 2025). SK auto‑chunks oversized text when you set EnableAutoChunking = true on MemoryBuilder.
Are stored memories encrypted?
Encryption depends on the backing store. Azure AI Search, Cosmos DB, and Pinecone default to AES‑256. Self‑hosted Qdrant/Chroma can enable full‑disk encryption; SK always transmits over HTTPS, so transport is covered.
Conclusion: From Skills to Agents
You’ve seen how skills, semantic functions, and the new memory API turn Semantic Kernel into a Swiss‑army orchestrator. Next up is SK’s Agent Framework, where these building blocks grow decision‑making brains, schedule background tasks, and even spawn child kernels. Curious? Smash that subscribe and share your wildest hybrid‑skill ideas in the comments – I read every one.