Fetching models from your API and loading chat adapter templates…
Used in chat display and in requests to the model. Character cards can use {{user}} for your name and {{char}} for the character’s name (like SillyTavern).
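The {{user}} / {{char}} substitution described above can be sketched as follows (the function name and case-insensitive matching are illustrative assumptions, not the WebUI's actual code):

```python
import re

def fill_placeholders(text: str, user: str, char: str) -> str:
    """Replace SillyTavern-style {{user}} / {{char}} markers, case-insensitively."""
    text = re.sub(r"\{\{user\}\}", user, text, flags=re.IGNORECASE)
    return re.sub(r"\{\{char\}\}", char, text, flags=re.IGNORECASE)

card = "{{char}} waves at {{user}}."
print(fill_placeholders(card, user="Alex", char="Mira"))  # → Mira waves at Alex.
```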
JSON templates from the adapters folder (KoboldCPP-style). Choosing None keeps the standard OpenAI /v1/chat/completions request with a messages array. When you pick an adapter, the UI builds a single prompt string and calls /v1/completions instead; your backend must support that endpoint. Edit adapters/manifest.json to change the list.
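The two request shapes can be sketched like this (the adapter keys `user_start`, `user_end`, `assistant_start`, `assistant_end` are an assumed layout in the KoboldCPP style, not a guaranteed schema):

```python
def chat_body(messages, max_tokens):
    # Default mode: standard OpenAI /v1/chat/completions payload.
    return {"messages": messages, "max_tokens": max_tokens}

def completions_body(messages, adapter, max_tokens):
    # Adapter mode: flatten the chat into one prompt string for /v1/completions.
    parts = []
    for m in messages:
        if m["role"] == "user":
            parts.append(adapter["user_start"] + m["content"] + adapter["user_end"])
        else:
            parts.append(adapter["assistant_start"] + m["content"] + adapter["assistant_end"])
    parts.append(adapter["assistant_start"])  # cue the model to reply
    return {"prompt": "".join(parts), "max_tokens": max_tokens}

msgs = [{"role": "user", "content": "Hi"}]
adapter = {"user_start": "<|user|>", "user_end": "\n",
           "assistant_start": "<|assistant|>", "assistant_end": "\n"}
print(completions_body(msgs, adapter, 256)["prompt"])
```

Either body is then POSTed to the matching endpoint; only the adapter mode requires /v1/completions support.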
Caps the length of each assistant reply. It does not limit the whole conversation; use "Context in API requests" below to trim what is sent as input.
The full chat is sent by default. Trim older turns here to save tokens. Sending an empty message ("continue") adds the line below to the request only, not to your saved transcript.
Uses context_length / n_ctx from /v1/models when present; otherwise falls back to the budget above. When trimming, room for the next reply is reserved using the max reply tokens setting above.
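The trimming rule above amounts to a token budget: context window minus reserved reply tokens. A rough sketch, assuming a crude 4-characters-per-token estimate (the estimate and function name are illustrative, not the WebUI's actual code):

```python
def trim_history(messages, context_length, max_reply_tokens):
    """Drop oldest turns until the estimated prompt fits, reserving room for the reply."""
    budget = context_length - max_reply_tokens
    est = lambda m: len(m["content"]) // 4 + 4  # crude per-message token estimate
    kept = list(messages)
    while kept and sum(est(m) for m in kept) > budget:
        kept.pop(0)  # discard the oldest turn first
    return kept

history = [{"content": "x" * 400}, {"content": "y" * 40}]
print(len(trim_history(history, context_length=100, max_reply_tokens=50)))  # → 1
```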
Applies to message times and the conversation list. Automatic follows your browser (—).
Supports SillyTavern / Chub / chara_card_v2 and common exports.
This deletes all characters, conversations, and settings (including API keys) from this browser.
Private. Chats and settings stay in this browser unless you export them.
Private. Chats and settings stay in this browser unless you export them. Markdown and syntax highlighting in messages are rendered only in your browser from bundled scripts (no third-party formatter or CDN).
The server that hosts these static files only sends you the WebUI (HTML, CSS, JavaScript). In normal use, your prompts and model replies are not sent to that host; your browser talks directly to whatever API base URL you configure in Settings (for example your own computer or another service you control).
Your chats and settings are stored in your browser unless you export them. Whoever runs the LLM / API server you point at may process and log traffic according to their own setup; you choose that endpoint.
You are responsible for your use of this tool, for the content you create, for obeying applicable laws and any terms of the APIs or models you use, and for any misuse. This interface is provided as-is, without warranty. To the extent allowed by law, the people who publish this hosted WebUI are not liable for user-generated content, third-party backends, or how others use the software.
This host only serves the interface. Your API calls go to the server you set in Settings, not to the operator of this page, unless you deliberately configure otherwise.
Retrograde WebUI runs in your browser. Chats and settings stay on this device unless you export them.
Your requests go to the API URL you set in Settings, not to whoever hosts these files.
More detail is in Help (top of the page).