Previously on this channel
In part #1 of this series we learned how to set up Ollama and Open WebUI locally.
In part #2 we learned how to implement a RAG pipeline with Open WebUI locally.
Today, in part #3, we will implement a .NET WebAPI that queries the previously created RAG pipeline programmatically.
Create API Key in Open WebUI
According to our previous setup, Open WebUI runs locally on http://localhost:3000. In order to access the Open WebUI API endpoints, we need to create an API key within Open WebUI for our "my-rag" model.
In Open WebUI go to Settings/Account and click on (+) to create an API Key:
Copy it to your clipboard and click "Save". Now we have everything we need to talk programmatically to our instance of Open WebUI. So let's do this with Postman!
In Postman, enter the URL http://localhost:3000/api/chat/completions for the endpoint that we are currently interested in. Refer to the Open WebUI documentation for a list of all exposed endpoints.
In the Authorization tab enter the previously created API key and - important - leave the type as Bearer Token:
In the Headers tab, enter Content-Type / application/json as a Key/Value pair:
In the Body tab, choose "raw" and JSON, and enter a JSON request that our RAG model can answer, for example:
{
"model": "my-rag",
"messages":
[
{
"role": "user",
"content": "who works at Microsoft?"
}
]
}
That's it, we're done. Let's talk to Open WebUI by hitting "Send":
As you can see, the response body contains a lot of data that we're not (at least not at this point) really interested in, such as user information with a profile picture:
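By the way, the same request can also be sent from the command line with curl instead of Postman. The sketch below assumes the API key has been exported as OPENWEBUI_API_KEY (a placeholder variable name, not something Open WebUI sets for you):

```shell
# Placeholder: export your real key first, e.g.
# export OPENWEBUI_API_KEY="sk-..."

# The same JSON body we entered in Postman.
PAYLOAD='{
  "model": "my-rag",
  "messages": [
    { "role": "user", "content": "who works at Microsoft?" }
  ]
}'

# POST it to the chat completions endpoint, authenticated via Bearer token.
curl -s http://localhost:3000/api/chat/completions \
  -H "Authorization: Bearer $OPENWEBUI_API_KEY" \
  -H "Content-Type: application/json" \
  -d "$PAYLOAD" || echo "request failed - is Open WebUI running on port 3000?"
```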
With our own .NET WebAPI, we can significantly slim down the response and focus on the model's answer. Heck, we'll even "fine-tune" our model, setting parameters such as the temperature!
Simple C# WebAPI skeleton
Below is a minimal .NET 9 Web API skeleton that talks to the Open WebUI endpoint for your RAG-powered model ("my-rag").
You can download the sample .NET project here.
Project structure
Program.cs
using RagApi.Services;

var builder = WebApplication.CreateBuilder(args);

// Register the HttpClient that will be used by OpenWebUiService
builder.Services.AddHttpClient<OpenWebUiService>(client =>
{
    client.BaseAddress = new Uri("http://localhost:3000"); // Docker Open WebUI base URL
    client.DefaultRequestHeaders.Authorization =
        new System.Net.Http.Headers.AuthenticationHeaderValue("Bearer", "PASTE-YOUR-OPENWEBUI-API-KEY");
});

builder.Services.AddControllers();

var app = builder.Build();
app.MapControllers();
app.Run();
Models
RagRequest
namespace RagApi.Models;

public class RagRequest
{
    public string Prompt { get; set; } = string.Empty;
}
RagResponse
using System.Text.Json.Serialization;

namespace RagApi.Models;

public class RagResponse
{
    public string Id { get; set; } = string.Empty;
    public string Object { get; set; } = string.Empty;
    public long Created { get; set; }
    public string Model { get; set; } = string.Empty;
    public List<RagChoice> Choices { get; set; } = new();
}

public class RagChoice
{
    public int Index { get; set; }
    public RagMessage Message { get; set; } = new();

    // Open WebUI returns this field as "finish_reason", so map it explicitly.
    [JsonPropertyName("finish_reason")]
    public string FinishReason { get; set; } = string.Empty;
}

public class RagMessage
{
    public string Role { get; set; } = string.Empty;
    public string Content { get; set; } = string.Empty;
}
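For orientation, these models mirror the OpenAI-style JSON that Open WebUI returns for a chat completion. An abbreviated sketch with purely illustrative field values (not actual output) looks like this:

```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "my-rag",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "..." },
      "finish_reason": "stop"
    }
  ]
}
```

Note the snake_case key finish_reason, which is why the FinishReason property needs an explicit mapping.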
Services
OpenWebUiService
using RagApi.Models;

namespace RagApi.Services;

public class OpenWebUiService
{
    private readonly HttpClient _client;

    public OpenWebUiService(HttpClient client)
    {
        _client = client;
    }

    /// <summary>
    /// Sends a prompt to the Open WebUI "my-rag" model (linked to the Knowledge of 3 PDF files) and
    /// returns the full response.
    /// </summary>
    public async Task<RagResponse?> AskAsync(string prompt)
    {
        var payload = new
        {
            model = "my-rag", // the name created in Open WebUI for the model linked to the Knowledge
            messages = new[]
            {
                new { role = "user", content = prompt }
            },
            temperature = 0.7, // fine-tuning model parameters programmatically here
            max_tokens = 512
        };

        // Same endpoint we tested in Postman earlier.
        var response = await _client.PostAsJsonAsync("/api/chat/completions", payload);
        if (!response.IsSuccessStatusCode)
        {
            return null; // log or throw as needed
        }

        return await response.Content.ReadFromJsonAsync<RagResponse>();
    }
}
Controllers
RagController
using Microsoft.AspNetCore.Mvc;
using RagApi.Models;
using RagApi.Services;

namespace RagApi.Controllers;

[ApiController]
[Route("api/[controller]")]
public class RagController : ControllerBase
{
    private readonly OpenWebUiService _service;

    public RagController(OpenWebUiService service)
    {
        _service = service;
    }

    /// <summary>
    /// POST api/rag
    /// Body: { "prompt": "Your question here" }
    /// </summary>
    [HttpPost]
    public async Task<IActionResult> Ask([FromBody] RagRequest request)
    {
        if (string.IsNullOrWhiteSpace(request.Prompt))
        {
            return BadRequest("Prompt cannot be empty.");
        }

        var result = await _service.AskAsync(request.Prompt);
        if (result == null)
        {
            return StatusCode(502, "Failed to get a response from Open WebUI.");
        }

        var assistantText = result.Choices.FirstOrDefault()?.Message?.Content ?? string.Empty;
        return Ok(new { reply = assistantText });
    }
}
Test our WebAPI with Postman
dotnet run
exposes the newly created WebAPI at https://localhost:7092/api/rag according to our current launchSettings.json, so we enter this as the endpoint in Postman.
Note that this time we don't need to enter the Open WebUI API key in the Authorization tab: our WebAPI adds it internally when calling Open WebUI.
In the Headers tab we again need the Key/Value pair Content-Type / application/json.
We choose raw/JSON for the body type and enter our prompt as a JSON string.
And here is the reply from our WebAPI:
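If you prefer the command line over Postman, the equivalent request can be sketched with curl like this (port 7092 comes from launchSettings.json; the -k flag tells curl to accept ASP.NET Core's self-signed development certificate):

```shell
# Our WebAPI expects a much simpler body: just the prompt.
PROMPT_BODY='{ "prompt": "who works at Microsoft?" }'

# No API key needed here - the WebAPI adds it internally when calling Open WebUI.
# -s silences the progress output, -k accepts the self-signed dev certificate.
curl -sk https://localhost:7092/api/rag \
  -H "Content-Type: application/json" \
  -d "$PROMPT_BODY" || echo "request failed - is the WebAPI running (dotnet run)?"
```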
Conclusion
In this final part, we created an API key in Open WebUI and tested the chat completions endpoint with Postman, illustrating how a raw JSON request yields the model's answer. We then built a minimal .NET WebAPI that wraps this endpoint, allowing us to fine-tune parameters (e.g. temperature) and deliver only the assistant's reply. The skeleton includes the data models, an OpenWebUiService, and a controller exposing the /api/rag endpoint. Running the API and testing it in Postman confirms a clean, streamlined response, ready for integration into any .NET application.