#3 Custom API for Open WebUI API Endpoints
In part 3 of our local AI series, we turn the Open WebUI chat endpoint into a clean, programmatic API. The example shows a simple JSON request that returns the model’s answer while filtering out extraneous metadata. This setup makes it straightforward to integrate RAG into your .NET applications by giving you an API for your own local, RAG-aware chatbots.
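A call against the API could look roughly like the following C# sketch. The /api/chat/completions path, the port 3000, the Bearer API key, and the gemma3:4b model name are assumptions based on a default Open WebUI setup, so adapt them to your own instance:

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Text.Json;

// Minimal client for a local Open WebUI instance (assumed at http://localhost:3000).
// Endpoint path and API key are assumptions; check your own deployment and settings.
using var http = new HttpClient { BaseAddress = new Uri("http://localhost:3000") };
http.DefaultRequestHeaders.Authorization =
    new AuthenticationHeaderValue("Bearer", "YOUR_OPEN_WEBUI_API_KEY");

var payload = new
{
    model = "gemma3:4b",   // any model available in your Open WebUI instance
    messages = new[] { new { role = "user", content = "Summarize our vacation policy." } },
    stream = false
};

var response = await http.PostAsync(
    "/api/chat/completions",
    new StringContent(JsonSerializer.Serialize(payload), Encoding.UTF8, "application/json"));
response.EnsureSuccessStatusCode();

// Pull only the answer text out of the response and ignore the rest of the metadata.
using var doc = JsonDocument.Parse(await response.Content.ReadAsStringAsync());
string answer = doc.RootElement
    .GetProperty("choices")[0]
    .GetProperty("message")
    .GetProperty("content")
    .GetString()!;

Console.WriteLine(answer);
```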
#2 Retrieval-Augmented Generation (RAG) with Open WebUI
This article explains Retrieval‑Augmented Generation (RAG), a method that combines a retrieval system with a generative language model, and shows how to implement it locally with Ollama and Open WebUI. A user’s query is first embedded and matched against a database of document embeddings. The most relevant documents are fetched and fed to an LLM, which generates a response grounded in that external data, reducing hallucinations and improving accuracy. RAG is ideal for chatbots, research tools, and enterprise AI where up‑to‑date or domain‑specific knowledge is essential.
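To make that flow concrete, here is a minimal in-memory sketch in C#. The Chunk record, the top-3 cutoff, and the embed/complete delegates (stand-ins for calls to your local embedding model and LLM) are illustrative assumptions, not code from the article:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Naive in-memory RAG sketch: embed the query, rank stored chunks by cosine
// similarity, and ground the LLM's answer in the best matches.
public record Chunk(string Text, float[] Embedding);

public static class RagSketch
{
    public static float CosineSimilarity(float[] a, float[] b)
    {
        float dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.Length; i++) { dot += a[i] * b[i]; na += a[i] * a[i]; nb += b[i] * b[i]; }
        return dot / (MathF.Sqrt(na) * MathF.Sqrt(nb) + 1e-12f);
    }

    public static async Task<string> AnswerAsync(
        string question,
        IReadOnlyList<Chunk> store,
        Func<string, Task<float[]>> embed,     // query -> embedding vector (your local embedding model)
        Func<string, Task<string>> complete)   // prompt -> answer (your local LLM)
    {
        // 1. Embed the user's query.
        float[] q = await embed(question);

        // 2. Retrieve the most similar document chunks.
        var context = store
            .OrderByDescending(c => CosineSimilarity(q, c.Embedding))
            .Take(3)
            .Select(c => c.Text);

        // 3. Let the LLM answer, grounded in the retrieved text.
        string prompt = "Answer using only the context below.\n\nContext:\n" +
                        string.Join("\n---\n", context) +
                        $"\n\nQuestion: {question}";
        return await complete(prompt);
    }
}
```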
#1 Create your own custom LLM
The article explains how to “create your own custom LLM” by tweaking an existing model in Ollama rather than training a new one. It uses a ComfyUI workflow that needs detailed photographic prompts, so it shows how to define a Modelfile (based on gemma3:4b) that sets temperature, context size, and a system prompt that expands simple scene descriptions into full camera‑style prompts. With Docker and Open WebUI (which acts like a local ChatGPT), you can run the modified model and then generate tailored prompts for ComfyUI. The result is a lightweight, reusable setup that delivers consistent, high‑quality prompts for realistic image generation without heavy fine‑tuning.
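A Modelfile along those lines might look roughly like this sketch; the parameter values and the system prompt are placeholders rather than the article’s originals:

```
FROM gemma3:4b

# Keep output focused but still creative enough for varied prompts.
PARAMETER temperature 0.7
# Allow long prompt expansions without truncation.
PARAMETER num_ctx 8192

SYSTEM """
You expand short scene descriptions into detailed photographic prompts:
subject, composition, lens and focal length, lighting, mood, and style.
"""
```

You would then register and run it with `ollama create photo-prompter -f Modelfile` and `ollama run photo-prompter` (the name photo-prompter is just an example).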
C#: Using a Local Language Model with LM Studio to Sort Screenshots
Learn how to use a local language model with C# and LM Studio to automatically sort screenshots into categories using Vision Language Models (VLMs). Utilize the power of visual recognition with .NET!
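As a rough sketch of the idea (not the article’s code), the snippet below sends one screenshot to LM Studio’s OpenAI-compatible local server, assumed at its default http://localhost:1234, and asks a vision-capable model, loaded under a placeholder name, to pick a category:

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Text;
using System.Text.Json;

// Classify one screenshot via LM Studio's OpenAI-compatible local server.
// The address, model name, and category list are assumptions; use whatever
// vision model you have loaded in LM Studio.
string path = "screenshot.png";
string dataUri = "data:image/png;base64," + Convert.ToBase64String(File.ReadAllBytes(path));

var payload = new
{
    model = "local-vision-model",
    messages = new object[]
    {
        new
        {
            role = "user",
            content = new object[]
            {
                new { type = "text", text = "Categorize this screenshot as one of: Code, Invoice, Chat, Game, Other. Reply with the category only." },
                new { type = "image_url", image_url = new { url = dataUri } }
            }
        }
    }
};

using var http = new HttpClient();
var response = await http.PostAsync(
    "http://localhost:1234/v1/chat/completions",
    new StringContent(JsonSerializer.Serialize(payload), Encoding.UTF8, "application/json"));
response.EnsureSuccessStatusCode();

using var doc = JsonDocument.Parse(await response.Content.ReadAsStringAsync());
string category = doc.RootElement.GetProperty("choices")[0]
    .GetProperty("message").GetProperty("content").GetString()!.Trim();

Console.WriteLine($"{path} -> {category}");   // e.g. move the file into a folder named after the category
```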
Using the Model Context Protocol (MCP) with C#
With the new MCP C# SDK, developers can efficiently manage communication between AI models and applications.
Anthropic’s Model Context Protocol (MCP) is currently making waves. It's a standardized protocol that streamlines communication between AI models and applications by defining a structured way to exchange context and data between a model and its clients. Whether you're developing AI-powered applications or integrating multiple models into a cohesive system, MCP ensures interoperability and scalability.
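As a hedged sketch rather than the article’s code, a minimal stdio MCP server exposing one tool might look like this with the preview ModelContextProtocol NuGet package; attribute and method names may differ between SDK versions:

```csharp
using System;
using System.ComponentModel;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;
using ModelContextProtocol.Server;

// Stdio MCP server exposing a single tool, following the attribute-based pattern
// from the MCP C# SDK preview samples. Verify names against your SDK version.
var builder = Host.CreateApplicationBuilder(args);
builder.Services
    .AddMcpServer()
    .WithStdioServerTransport()
    .WithToolsFromAssembly();
await builder.Build().RunAsync();

[McpServerToolType]
public static class TimeTool
{
    [McpServerTool, Description("Returns the current local time.")]
    public static string GetTime() => DateTime.Now.ToString("O");
}
```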
LLM Temperature Explained
Temperature is a key parameter in large language models (LLMs) that controls the randomness of generated text. It adjusts the probability distribution used when predicting the next token in a sequence. A lower temperature makes the model more deterministic, favoring highly probable tokens and producing more consistent, coherent outputs — ideal for factual or technical tasks. A higher temperature increases randomness by allowing the selection of less probable tokens, leading to more varied and creative responses, though sometimes at the cost of coherence.
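The usual formulation is a temperature-scaled softmax, p_i = exp(z_i / T) / Σ_j exp(z_j / T). The small C# example below, run over made-up logits, shows how T reshapes the distribution:

```csharp
using System;
using System.Linq;

// Temperature-scaled softmax: p_i = exp(z_i / T) / sum_j exp(z_j / T).
// Lower T sharpens the distribution (more deterministic), higher T flattens it (more random).
static double[] Softmax(double[] logits, double temperature)
{
    double max = logits.Max();   // subtract the max for numerical stability
    double[] exps = logits.Select(z => Math.Exp((z - max) / temperature)).ToArray();
    double sum = exps.Sum();
    return exps.Select(e => e / sum).ToArray();
}

double[] logits = { 2.0, 1.0, 0.1 };   // made-up next-token logits
foreach (double t in new[] { 0.2, 1.0, 2.0 })
    Console.WriteLine($"T={t}: " + string.Join(", ", Softmax(logits, t).Select(p => p.ToString("F3"))));
```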
AI Tools Are Hallucinating Software Dependencies – And Cybercriminals Are Taking Advantage
Discover how AI-generated code from large language models (LLMs) is introducing new cybersecurity risks through hallucinated software dependencies and slopsquatting. Learn how attackers exploit these vulnerabilities and what developers can do to stay safe.
Exploring the Next Level of AI Image Generation with OpenAI's GPT-4o
The world of AI-generated images has reached a new frontier with the release of OpenAI's latest model, GPT-4o. Unlike previous iterations, GPT-4o seamlessly integrates advanced image generation capabilities, producing highly detailed, contextually accurate, and incredibly realistic visuals. One of the most groundbreaking advancements in this model is its ability to generate photorealistic images that rival human-created photography. This leap forward opens up unprecedented possibilities across industries such as marketing, design, entertainment, and education.
DeepSeek Jailbreak Attack
The Chinese government is famous, or rather notorious, for its censorship. The Chinese LLM DeepSeek naturally follows suit and does not talk about content that is sensitive for China's leadership, such as the Tiananmen Square Massacre, Taiwan, or the Cultural Revolution. With some simple jailbreaking you can get DeepSeek to bypass this censorship.