Retrieval-Augmented Generation (RAG) with Open WebUI
This article explains Retrieval‑Augmented Generation (RAG), a method that combines a retrieval system with a generative language model, and shows how to implement it locally with Ollama and Open WebUI. A user’s query is first embedded and matched against a database of document embeddings. The most relevant documents are retrieved and passed to an LLM, which generates a response grounded in that external data, reducing hallucinations and improving accuracy. RAG is ideal for chatbots, research tools, and enterprise AI where up‑to‑date or domain‑specific knowledge is essential.
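As a minimal sketch of that embed‑retrieve‑generate loop, the following Python example uses the `ollama` client against a local Ollama instance. The model names (`nomic-embed-text`, `gemma3:4b`) and the toy document list are illustrative assumptions, not prescriptions from the article; a real setup would use a vector database rather than an in‑memory list.

```python
import numpy as np
import ollama

# Toy document store; a real RAG setup would use a vector database.
docs = [
    "Open WebUI supports document uploads for RAG out of the box.",
    "Ollama runs large language models locally on your own hardware.",
]

def embed(text: str) -> np.ndarray:
    # Embed text with a local embedding model (assumed: nomic-embed-text).
    result = ollama.embeddings(model="nomic-embed-text", prompt=text)
    return np.array(result["embedding"])

doc_vecs = [embed(d) for d in docs]

def retrieve(query: str) -> str:
    # Rank documents by cosine similarity to the query embedding.
    q = embed(query)
    scores = [v @ q / (np.linalg.norm(v) * np.linalg.norm(q)) for v in doc_vecs]
    return docs[int(np.argmax(scores))]

query = "How do I run an LLM locally?"
context = retrieve(query)

# Ground the generation in the retrieved document.
response = ollama.chat(
    model="gemma3:4b",
    messages=[
        {"role": "system", "content": f"Answer using this context:\n{context}"},
        {"role": "user", "content": query},
    ],
)
print(response["message"]["content"])
```

Open WebUI wraps this same pattern in its document-upload feature, so the code above is only meant to make the underlying mechanics concrete.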
Create your own custom LLM
The article explains how to “create your own custom LLM” by tweaking an existing model in Ollama rather than training a new one. The motivating example is a ComfyUI workflow that needs detailed photographic prompts, so it shows how to define a Modelfile (based on gemma3:4b) that sets the temperature, context size, and a system prompt that expands simple scene descriptions into full camera‑style prompts (a sketch follows below). With Docker and Open WebUI (which acts like a local ChatGPT), you can run the modified model and generate tailored prompts for ComfyUI. The result is a lightweight, reusable setup that delivers consistent, high‑quality prompts for realistic image generation without heavy fine‑tuning.
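A minimal Modelfile along the lines the article describes might look like the sketch below. The parameter values and the system-prompt wording are illustrative assumptions; only the base model, `gemma3:4b`, comes from the article.

```
FROM gemma3:4b

# Sampling and context-window parameters (values are illustrative).
PARAMETER temperature 0.8
PARAMETER num_ctx 8192

# System prompt that turns short scene ideas into photographic prompts.
SYSTEM """You are a prompt engineer for photorealistic image generation.
Expand the user's short scene description into a detailed photographic
prompt covering subject, composition, lighting, lens and camera
settings, and mood."""
```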
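Building the model and bringing up Open WebUI then looks roughly like this. The model name `photo-prompter` is a placeholder, and the container flags follow the Open WebUI documentation for connecting to an Ollama instance running on the host.

```
# Build the custom model from the Modelfile above (name is a placeholder).
ollama create photo-prompter -f Modelfile

# Start Open WebUI in Docker; it reaches Ollama on the host machine.
docker run -d -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  --name open-webui \
  ghcr.io/open-webui/open-webui:main
```

Once the container is up, the custom model appears in Open WebUI’s model picker, and its expanded prompts can be pasted straight into the ComfyUI workflow.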