Enterprise Assistant Console
Stack architecture
Runtime: loading…
GPU Live: loading…
Power Consumption: loading…

Workspace configuration

Voice / vision controls
Live camera preview, shown when Voice+Video or “Camera for text chat” is enabled. Requires a secure context (HTTPS or localhost). This block stays in normal flow so it does not cover the saved instructions or the prompt below.
Camera off — enable Voice+Video or “Camera for text chat”
Vision & routing status
Live summary of the vision caption model (Ollama), the dialogue route (local vs Ollama), and VRAM guidance. The text below updates automatically from the running stack.

Vision / routing: loading…
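As a rough illustration of how that status line could be assembled, the minimal sketch below asks Ollama whether the configured caption model has been pulled. It assumes Ollama's default endpoint on 127.0.0.1:11434 and reuses the VISION_OLLAMA_MODEL variable mentioned later; the wording of the status string is illustrative only.

    # Sketch: report whether the Ollama vision caption model is available.
    # Assumes Ollama's default HTTP endpoint; status wording is illustrative.
    import json
    import os
    import urllib.request

    OLLAMA_URL = os.environ.get("OLLAMA_URL", "http://127.0.0.1:11434")
    vision_model = os.environ.get("VISION_OLLAMA_MODEL", "")

    def vision_routing_status() -> str:
        try:
            with urllib.request.urlopen(f"{OLLAMA_URL}/api/tags", timeout=2) as resp:
                tags = {m["name"] for m in json.load(resp).get("models", [])}
        except OSError:
            return "Vision / routing: Ollama unreachable"
        ok = vision_model in tags
        return (f"Vision / routing: caption model {vision_model or '(unset)'} "
                f"{'available' if ok else 'NOT pulled'} on Ollama")

    print(vision_routing_status())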

Text camera & debug toggles
When checked, JPEG frames are sent with vision_session_id on text chat requests. Use Voice+Video for spoken questions such as “what do you see?”
Attaches a fresh camera caption on every user turn while vision is on (adds latency). Leave it off for faster chat; camera context is then merged only when your message matches vision phrases (“see”, “camera”, “what we’re doing”, …). With Voice+Video connected, toggling syncs immediately over the voice link; text+camera syncs on the next send.
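A minimal sketch of that phrase gate is shown below: camera context rides along with a text turn only when the message mentions a vision phrase, or when the always-attach toggle is on. The helper names and the (shortened) phrase list are illustrative, not the console's actual code.

    # Sketch of the phrase gate described above.
    VISION_PHRASES = ("see", "camera", "what we're doing")  # real list is longer

    def should_attach_camera_caption(message: str, always_attach: bool) -> bool:
        if always_attach:                      # "fresh caption every turn" toggle
            return True
        text = message.lower()
        return any(phrase in text for phrase in VISION_PHRASES)

    def build_chat_payload(message: str, vision_session_id: str | None,
                           always_attach: bool) -> dict:
        payload = {"message": message}
        if vision_session_id and should_attach_camera_caption(message, always_attach):
            payload["vision_session_id"] = vision_session_id  # server adds the JPEG caption
        return payload

    print(build_chat_payload("what do you see on the desk?", "sess-123", False))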
When enabled, extra technical detail from the assistant response (where available) is shown in the transcript for debugging.
When the model returns a separate thinking / reasoning section, this opens those blocks expanded by default in the chat log. Visibility still depends on the model and API returning that content.
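For illustration, the sketch below splits a reply into reasoning blocks and the visible answer, assuming the backend wraps reasoning in <think>…</think> tags; as noted above, whether such content exists at all depends on the model and API.

    # Sketch: separate reasoning blocks from the answer, assuming <think> tags.
    import re

    THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)

    def split_reasoning(reply: str) -> tuple[list[str], str]:
        reasoning = [m.strip() for m in THINK_RE.findall(reply)]
        answer = THINK_RE.sub("", reply).strip()
        return reasoning, answer

    thoughts, answer = split_reasoning("<think>check the caption first</think>The mug is red.")
    print(thoughts, "|", answer)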
Core session & instructions
Session memory actions
LLM generation (expand)

Sampling settings apply to text chat and to the next voice connect (toggle voice off and on to re-send them). Model context length is set when the stack starts (e.g. the local llama --ctx-size in start_current_stack.sh, overridable via LOCAL_LLAMA_CTX_SIZE). “Text history lines” limits how many prior chat lines are included; “Text chat max prompt tokens/chars” add separate caps on total prompt size before the request is sent.
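The minimal sketch below shows the shape of those caps: keep only the last N history lines, then enforce a character ceiling before sending. Names and defaults are illustrative, and tokens are approximated by characters here because the exact tokenizer depends on the backend.

    # Sketch of the prompt caps described above (illustrative names/defaults).
    def build_prompt(history: list[str], user_msg: str,
                     max_history_lines: int = 40,
                     max_prompt_chars: int = 24000) -> str:
        lines = history[-max_history_lines:] + [f"User: {user_msg}"]
        prompt = "\n".join(lines)
        if len(prompt) > max_prompt_chars:
            prompt = prompt[-max_prompt_chars:]   # drop the oldest text first
        return prompt

    history = [f"User: message {i}\nAssistant: reply {i}" for i in range(200)]
    print(len(build_prompt(history, "summarise the last answer")))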

Local vs Ollama: effective context is not the same. Long prompts can use more of the window on local llama than on Ollama, where the server may truncate inputs near a fixed token ceiling (large prompts were observed to cap at around 32k prompt tokens, while the same character padding counted higher on local). Max tokens is only a completion cap; prompt + reply must still fit the active backend’s context.
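A sketch of that fit rule, assuming a crude 4-characters-per-token estimate (the ceilings shown are illustrative, not measured limits):

    # Sketch: prompt tokens plus the completion cap must fit the backend's window.
    def fits_context(prompt: str, max_tokens: int, ctx_size: int) -> bool:
        est_prompt_tokens = len(prompt) // 4          # crude heuristic
        return est_prompt_tokens + max_tokens <= ctx_size

    prompt = "x" * 120_000                            # ~30k estimated tokens
    print("ctx 64k:", fits_context(prompt, 1024, 65536))
    print("ctx 32k:", fits_context(prompt, 1024, 32768))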

Loading limits for the active model…

Voice only: if your max-tokens setting is below this number, the server raises it to this floor. The value is sent as max_tokens_floor when you connect voice.
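In other words, the effective cap on the voice path is simply the larger of the two values; a one-line sketch:

    # Sketch of the voice-only floor: never use a completion cap below the floor.
    def effective_max_tokens(ui_max_tokens: int, max_tokens_floor: int) -> int:
        return max(ui_max_tokens, max_tokens_floor)

    print(effective_max_tokens(256, 1024))   # raised to the floor
    print(effective_max_tokens(4096, 1024))  # kept as configured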

Local RAG index (expand)

Offline lexical index (SQLite FTS5). Text chat: enable Local RAG below with agent tools. Voice / voice+video: snippets are prepended when VOICE_LOCAL_RAG is on (stack default). Chat attachments are mirrored under from_chat/ for later questions.
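For reference, the sketch below runs an offline lexical lookup against an SQLite FTS5 table of the kind described above. The table name, schema, sample rows and snippet formatting are assumptions for illustration; the console's real index may differ.

    # Sketch: lexical retrieval from an SQLite FTS5 index (illustrative schema).
    import sqlite3

    con = sqlite3.connect(":memory:")
    con.execute("CREATE VIRTUAL TABLE chunks USING fts5(path, body)")
    con.executemany("INSERT INTO chunks VALUES (?, ?)", [
        ("docs/network.md", "The bot listens on port 7861 behind HTTPS on 7860."),
        ("from_chat/notes.txt", "Calendar sync runs locally every morning."),
    ])

    def rag_snippets(query: str, k: int = 3) -> list[tuple[str, str]]:
        rows = con.execute(
            "SELECT path, snippet(chunks, 1, '[', ']', ' … ', 8) "
            "FROM chunks WHERE chunks MATCH ? ORDER BY rank LIMIT ?",
            (query, k),
        )
        return list(rows)

    print(rag_snippets("port"))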

Dir paths in use: loading…
No. of files (indexable on disk)
File types
Total size
Files chunked / pending
FTS chunk rows

Tool Management (expand)

Offline only: tools run on this machine. HTTP connectors must use allowlisted hosts (default 127.0.0.1 / localhost; see TOOL_HTTP_ALLOW_* in stack docs). No public internet.
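A minimal sketch of that allowlist check is below: a connector target is accepted only when its host is on the list. The TOOL_HTTP_ALLOW_HOSTS variable name is an assumption standing in for the TOOL_HTTP_ALLOW_* settings referenced in the stack docs.

    # Sketch of the host allowlist rule above (env var name is an assumption).
    import os
    from urllib.parse import urlparse

    DEFAULT_ALLOWED = {"127.0.0.1", "localhost"}

    def allowed_hosts() -> set[str]:
        extra = os.environ.get("TOOL_HTTP_ALLOW_HOSTS", "")
        return DEFAULT_ALLOWED | {h.strip() for h in extra.split(",") if h.strip()}

    def connector_target_ok(url: str) -> bool:
        return urlparse(url).hostname in allowed_hosts()

    print(connector_target_ok("http://127.0.0.1:8080/automation"))  # True
    print(connector_target_ok("https://example.com/api"))           # False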

Local RAG tool settings
Local RAG (indexed docs)
Local calendar
Local automation connector
LLM Routing (expand)

Vision: image captions always go through Ollama (VISION_OLLAMA_MODEL in the stack), independent of this dialogue route. With local dialogue + vision + XTTS on one GPU, VRAM can spike; with Ollama for both dialogue and captions, use smaller tags (e.g. gemma3n:e2b-it-q4_K_M) or run ./offline_setup/lisa_stack.sh status to inspect the GPU rows.
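If you want to inspect VRAM directly, a sketch along the lines of those GPU rows is shown below, using nvidia-smi's standard query mode. Whether lisa_stack.sh uses exactly this command is an assumption.

    # Sketch: list per-GPU memory use via nvidia-smi's query mode.
    import subprocess

    def gpu_rows() -> list[str]:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=index,name,memory.used,memory.total",
             "--format=csv,noheader"],
            capture_output=True, text=True, check=True,
        )
        return out.stdout.strip().splitlines()

    for row in gpu_rows():
        print(row)   # e.g. "0, NVIDIA GeForce RTX 4090, 18211 MiB, 24564 MiB"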

Full stack restart (expand)

Runs offline_setup/start_current_stack.sh (ASR, XTTS, LLM, bot on :7861, HTTPS on :7860). Takes several minutes; this page may disconnect until the bot is back.
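If you are scripting around a restart, the sketch below simply polls the bot port until it accepts connections again. The port matches the :7861 mentioned above; the timeout values are illustrative.

    # Sketch: wait for the bot to come back after a full stack restart.
    import socket
    import time

    def wait_for_bot(host: str = "127.0.0.1", port: int = 7861,
                     timeout_s: float = 600.0) -> bool:
        deadline = time.monotonic() + timeout_s
        while time.monotonic() < deadline:
            try:
                with socket.create_connection((host, port), timeout=2):
                    return True
            except OSError:
                time.sleep(5)
        return False

    print("bot back up" if wait_for_bot() else "still down after 10 minutes")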

Session: Disconnected | Conversation stream: voice + text
Switching LLM Routing
Preparing switch...