Retrieval-Augmented Generation (RAG) with Open WebUI
This article explains Retrieval‑Augmented Generation (RAG), a method that combines a retrieval system with a generative language model, and shows how to implement it locally with Ollama and Open WebUI. A user’s query is first embedded and matched against a database of document embeddings. The most relevant documents are retrieved and passed to an LLM, which generates a response grounded in that external data, reducing hallucinations and improving accuracy. RAG is ideal for chatbots, research tools, and enterprise AI where up‑to‑date or domain‑specific knowledge is essential.
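As a minimal sketch of that embed‑retrieve‑generate loop, the following Python example uses the `ollama` client against a local Ollama instance. The model names (`nomic-embed-text`, `gemma3:4b`) and the toy document list are illustrative assumptions, not prescriptions from the article; a real setup would use a vector database rather than an in‑memory list.

```python
import numpy as np
import ollama

# Toy document store; a real RAG setup would use a vector database.
docs = [
    "Open WebUI supports document uploads for RAG out of the box.",
    "Ollama runs large language models locally on your own hardware.",
]

def embed(text: str) -> np.ndarray:
    # Embed text with a local embedding model (assumed: nomic-embed-text).
    result = ollama.embeddings(model="nomic-embed-text", prompt=text)
    return np.array(result["embedding"])

doc_vecs = [embed(d) for d in docs]

def retrieve(query: str) -> str:
    # Rank documents by cosine similarity to the query embedding.
    q = embed(query)
    scores = [v @ q / (np.linalg.norm(v) * np.linalg.norm(q)) for v in doc_vecs]
    return docs[int(np.argmax(scores))]

query = "How do I run an LLM locally?"
context = retrieve(query)

# Ground the generation in the retrieved document.
response = ollama.chat(
    model="gemma3:4b",
    messages=[
        {"role": "system", "content": f"Answer using this context:\n{context}"},
        {"role": "user", "content": query},
    ],
)
print(response["message"]["content"])
```

Open WebUI wraps this same pattern in its document-upload feature, so the code above is only meant to make the underlying mechanics concrete.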
Create your own custom LLM
The article explains how to “create your own custom LLM” by tweaking an existing model in Ollama rather than training a new one. The motivating example is a ComfyUI workflow that needs detailed photographic prompts, so it shows how to define a Modelfile (based on gemma3:4b) that sets the temperature, context size, and a system prompt that expands simple scene descriptions into full camera‑style prompts (a sketch follows below). With Docker and Open WebUI (which acts like a local ChatGPT), you can run the modified model and generate tailored prompts for ComfyUI. The result is a lightweight, reusable setup that delivers consistent, high‑quality prompts for realistic image generation without heavy fine‑tuning.
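A minimal Modelfile along the lines the article describes might look like the sketch below. The parameter values and the system-prompt wording are illustrative assumptions; only the base model, `gemma3:4b`, comes from the article.

```
FROM gemma3:4b

# Sampling and context-window parameters (values are illustrative).
PARAMETER temperature 0.8
PARAMETER num_ctx 8192

# System prompt that turns short scene ideas into photographic prompts.
SYSTEM """You are a prompt engineer for photorealistic image generation.
Expand the user's short scene description into a detailed photographic
prompt covering subject, composition, lighting, lens and camera
settings, and mood."""
```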
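Building the model and bringing up Open WebUI then looks roughly like this. The model name `photo-prompter` is a placeholder, and the container flags follow the Open WebUI documentation for connecting to an Ollama instance running on the host.

```
# Build the custom model from the Modelfile above (name is a placeholder).
ollama create photo-prompter -f Modelfile

# Start Open WebUI in Docker; it reaches Ollama on the host machine.
docker run -d -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  --name open-webui \
  ghcr.io/open-webui/open-webui:main
```

Once the container is up, the custom model appears in Open WebUI’s model picker, and its expanded prompts can be pasted straight into the ComfyUI workflow.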