| Author(s) | Collection number | Pages |
|---|---|---|
| Vasiuta S. P., Dulka O. B. | № 1 (70) | 142-147 |
The article addresses the practical aspects of extending the Retrieval-Augmented Generation (RAG) architecture by integrating the Semantic Kernel framework into the .NET environment. Against the backdrop of the growing popularity of large language models (LLMs) such as GPT-4 and Gemini, the authors analyze how Semantic Kernel (SK) helps minimize the "friction" between application business logic and external AI services. The SDK manages request memory, connects to various LLMs, works with vector databases, and calls internal plugin functions. The article demonstrates how an ASP.NET Core web service can be configured in just a few dozen lines of code to combine a document search module, Semantic Kernel Memory, and the Gemini model to generate responses with source citations.
The core of the discussion compares two approaches, direct HTTP API calls and integration via SK, from the perspectives of security, maintainability, cost, and development speed. A key focus is on how SK simplifies the implementation of the RAG pattern: the kernel stores embeddings in ChromaDB or Azure Cognitive Search, and relevant context is automatically retrieved and inserted into the LLM prompt, significantly reducing the risk of hallucinations. The article highlights the architectural principle of loose coupling, achieved through interfaces and DI containers, which ensures that an LLM or vector store can be replaced without breaking the rest of the system. Furthermore, SK plugins enable the LLM to invoke domain-specific C# code, allowing it to act as an "agent-orchestrator" that combines multiple data sources into a single response.
The scientific novelty lies in the synthesis of the RAG architecture and the Semantic Kernel's agent-based approach within the .NET ecosystem, enabling rapid prototyping and scaling of AI functions in enterprise environments. Practical recommendations cover secure key storage, using OpenTelemetry for cost tracking, and A/B testing different models (e.g., GPT-4 vs. Gemini) without changing the application code. The authors conclude that this integration lowers the entry barrier for .NET teams, increases the transparency of AI modules, and makes RAG services suitable for critical business processes, from customer support chatbots to internal analytical assistants.
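The loose-coupling idea described in the abstract can be illustrated with a minimal, library-free sketch. It is written in Python rather than C# and does not use the Semantic Kernel API. All names here (`VectorStore`, `ChatModel`, `InMemoryStore`, `EchoModel`, `RagService`) are hypothetical, and the bag-of-words similarity is only a stand-in for real embeddings. It is not the authors' implementation. It only shows how a RAG service that depends on two small interfaces lets the store or the model be swapped without touching the calling code.

```python
from __future__ import annotations

import math
from collections import Counter
from dataclasses import dataclass
from typing import Protocol


class VectorStore(Protocol):
    """Hypothetical interface: anything that can index and search documents."""
    def add(self, doc_id: str, text: str) -> None: ...
    def search(self, query: str, top_k: int) -> list[tuple[str, str]]: ...


class ChatModel(Protocol):
    """Hypothetical interface: anything that turns a prompt into a reply."""
    def complete(self, prompt: str) -> str: ...


def _bag_of_words(text: str) -> Counter:
    # Toy stand-in for an embedding: lowercase token counts.
    return Counter(text.lower().split())


def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class InMemoryStore:
    """Assumed in-memory store; a real system would use a vector database."""
    def __init__(self) -> None:
        self._docs: dict[str, str] = {}

    def add(self, doc_id: str, text: str) -> None:
        self._docs[doc_id] = text

    def search(self, query: str, top_k: int) -> list[tuple[str, str]]:
        q = _bag_of_words(query)
        ranked = sorted(
            self._docs.items(),
            key=lambda kv: _cosine(q, _bag_of_words(kv[1])),
            reverse=True,
        )
        return ranked[:top_k]


class EchoModel:
    """Stub model: reports how many cited sources reached the prompt."""
    def complete(self, prompt: str) -> str:
        return f"[stub answer based on {prompt.count('[source:')} source(s)]"


@dataclass
class RagService:
    store: VectorStore  # swappable: any object with add/search
    model: ChatModel    # swappable: any object with complete

    def ask(self, question: str, top_k: int = 2) -> str:
        # Retrieve context, label each passage with its source, build the prompt.
        hits = self.store.search(question, top_k)
        context = "\n".join(f"[source: {doc_id}] {text}" for doc_id, text in hits)
        prompt = (
            "Answer using only the context below and cite sources.\n"
            f"{context}\nQuestion: {question}"
        )
        return self.model.complete(prompt)
```

Because `RagService` only sees the two protocols, replacing `EchoModel` with a client for a hosted model, or `InMemoryStore` with a database-backed store, would leave `ask` unchanged. This is the same property the abstract attributes to interfaces and DI containers in .NET.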
Keywords: Semantic Kernel, Retrieval-Augmented Generation (RAG), large language models (LLM), GPT-4, Gemini, .NET 8, vector databases, plugins, loose coupling, agent orchestration, OpenTelemetry.
doi: 10.32403/1998-6912-2025-1-70-132-141