Skip to main content

AI Services And Agent Stacks

Moltern's catalog includes AI-focused services that can be combined into chat systems, agent builders, model gateways, retrieval workflows, and evaluation stacks.

Common AI Building Blocks

Building BlockExamples
Chat UIOpen WebUI, LibreChat, LobeChat, AnythingLLM
Agent or workflow builderFlowise, Langflow
Model runtime or gatewayOllama, LiteLLM, NewAPI
Vector databaseQdrant, Chroma, Weaviate
Observability and evaluationLangfuse, Argilla
Search and translationSearXNG, LibreTranslate
Labeling and data prepLabel Studio, Argilla

The exact catalog can change. Use the in-app service docs and service catalog as the source of truth.

Pattern 1: Private Chat UI

Use this when a team needs a private AI chat interface.

Typical services:

  • Chat UI service.
  • Model runtime or external model provider.
  • Optional model gateway.
  • Optional vector database for knowledge retrieval.

Deployment flow:

  1. Create an ai or production environment.
  2. Deploy the model runtime or configure the external provider.
  3. Deploy the chat UI.
  4. Add required variables in the chat UI.
  5. Open the generated URL.
  6. Configure users and authentication inside the chat UI.
  7. Add a custom domain only after access control is ready.

Pattern 2: Agent Builder

Use this when teams want visual workflows or agent chains.

Typical services:

  • Flowise or Langflow.
  • Model gateway or model runtime.
  • Vector database.
  • Optional observability service.

Deployment flow:

  1. Deploy the vector database if retrieval is needed.
  2. Deploy the model gateway or runtime.
  3. Deploy the agent builder.
  4. Add provider credentials as variables.
  5. Create a test workflow.
  6. Confirm logs and service status.
  7. Document which environment owns the stack.

Pattern 3: Model Gateway

Use a model gateway when multiple apps or tools should call AI providers through one managed endpoint.

Benefits:

  • Centralized provider configuration.
  • Easier spend tracking inside the gateway.
  • Common authentication layer.
  • One endpoint for several AI tools.

Do not expose a model gateway publicly without authentication and rate limits inside the gateway.

Pattern 4: Retrieval Stack

Use this when users need chat or agents over documents.

Typical components:

  • Chat or agent UI.
  • Embedding provider.
  • Vector database.
  • Optional document processing service.
  • Optional observability.

Checklist:

  • Decide which documents may be indexed.
  • Confirm data owners approved indexing.
  • Keep provider API keys in variables.
  • Restrict public access until auth is configured.
  • Monitor storage growth.
  • Test retrieval with non-sensitive sample data first.

AI Security Guidance

  • Do not paste production secrets into prompts.
  • Do not index customer data without approval.
  • Keep model provider keys in variables.
  • Use separate environments for experiments.
  • Avoid public access until service-level authentication is configured.
  • Review service logs before sharing a new AI URL.
  • Track who owns each AI stack.

AI Cost Guidance

AI workloads can grow quickly because they combine compute, storage, and external provider spend.

Before production use:

  1. Review Moltern plan limits.
  2. Check the size of model runtimes.
  3. Watch storage for vector databases.
  4. Set spend controls in external model providers.
  5. Use pay-as-you-go guardrails only with owner approval.

Example Stack Plans

GoalSuggested Stack
Team chat over local modelsOpen WebUI + Ollama
Multi-provider chatLibreChat or LobeChat + LiteLLM
Visual agent workflowsFlowise or Langflow + LiteLLM + Qdrant
Prompt and trace reviewLangfuse + model gateway
Dataset labelingLabel Studio or Argilla
Search-assisted assistantSearXNG + agent builder + model gateway