(SOLVED: I can send the updated .yaml if anyone needs it; I just wasn’t cloning the repos beforehand.)
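For anyone who hits the same loop: roughly the steps that fixed it for me, run next to the compose file. The repo URL is the upstream bolt.diy project and the image tag matches what my compose file below expects; I believe the Dockerfile also has separate dev/production targets, so you may want --target depending on your setup:

git clone https://github.com/stackblitz-labs/bolt.diy.git ./bolt.diy
docker build -t bolt-diy:latest ./bolt.diy
docker compose up -d boltdiy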
Hi all!
I’m excited to be here! I’ve been a fan of Cole’s for a bit, and I’ve finally developed to the point of being able to add Bolt.diy to my stack as a solid stand-alone open-source coding assistant I can use with local models. I know Cole had referenced Open WebUI (OWUI) in his stack previously.
The idea is to use Bolt.diy when I’m doing coding/development work and keep Open WebUI/Ollama as my main “playground” for intro work, with serious development (any agentic setup) happening on the Bolt.diy side. I know this involves a fair amount of API work: how do I need to chain the API calls together, and does my YAML have that configured correctly? I’m trying to add Bolt.diy, along with TabbyAPI and LiteLLM, to my existing OWUI/Ollama stack. I’m getting crashing errors on bolt.diy where the logs say something to the effect of “did you mean to install” and loop over and over. I’ll post the logs as a reply to this message, as I don’t have them handy right this second.
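In case it helps to see what I mean by chaining: below is the rough shape of the litellm-config/config.yml I’m working from. The model names are placeholders, and the Tabby entry assumes it exposes an OpenAI-compatible /v1 endpoint:

model_list:
  - model_name: local-gguf
    litellm_params:
      model: ollama/qwen2.5-coder          # placeholder Ollama model tag
      api_base: http://ollama:11434
  - model_name: local-exl2
    litellm_params:
      model: openai/my-exl2-model          # placeholder name served by the tabby container
      api_base: http://tabby:8080/v1
      api_key: "none"

Bolt.diy would then point an OpenAI-compatible provider at http://litellm:8000/v1, so every request flows Bolt.diy -> LiteLLM -> Ollama or Tabby.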
The ultimate idea is to a) use Open WebUI as my daily driver with access to 2-3 local GGUF models, and b) use Bolt.diy/TabbyAPI/LiteLLM as my coding environment, hosting EXL2 models (as opposed to GGUF) and starting to play with more robust development capabilities than are currently present in Open WebUI. My first order of business would be to build a UI similar to OWUI that just talks to my Bolt.diy environment and my local EXL2 models (I’d set up separate API keys in all my API calls so I can track API usage).
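(For the separate keys, my plan is to use LiteLLM’s virtual keys, something like the call below against the proxy with the master key; as far as I understand, that feature needs LiteLLM’s Postgres-backed key management, which isn’t in my compose file yet.)

curl -X POST http://localhost:8000/key/generate \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{"models": ["local-exl2"], "metadata": {"app": "boltdiy"}}'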
Any help or expertise on the Docker side of things would be most welcome! Thank you so much! Here is my .yaml below…
services:
  # -----------------------
  # 1) WATCHTOWER
  # -----------------------
  watchtower:
    image: containrrr/watchtower
    container_name: watchtower
    runtime: nvidia
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    environment:
      - WATCHTOWER_CLEANUP=true
      - WATCHTOWER_POLL_INTERVAL=3000
      - WATCHTOWER_TIMEOUT=300s
      - WATCHTOWER_INCLUDE_STOPPED=true
    networks:
      - shared-network
    restart: unless-stopped
    env_file:
      - my-keys.env
  # -----------------------
  # 2) OLLAMA
  # -----------------------
  ollama:
    image: ollama/ollama
    container_name: ollama
    depends_on:
      - watchtower
    ports:
      - "11434:11434"
    runtime: nvidia
    volumes:
      - ./ollama:/root/.ollama
    networks:
      - shared-network
    restart: always
    env_file:
      - my-keys.env
  # -----------------------
  # 3) TABBY
  # -----------------------
  # (Not 100% sure I have the right project here: ghcr.io/tabbyml/tabby is TabbyML's
  # coding assistant, while the EXL2 server is theroyallab's tabbyAPI. Corrections welcome.)
  tabby:
    image: ghcr.io/tabbyml/tabby:latest
    container_name: tabby
    depends_on:
      - watchtower
    ports:
      - "8085:8080"
    runtime: nvidia
    volumes:
      - ./tabby-models:/models
      - ./tabby-config:/config
    environment:
      - NVIDIA_VISIBLE_DEVICES=all
      - TABBY_MODEL_PATH=/models
      - TABBY_CUDA_DEVICE=0
      - TABBY_MAX_TOKENS=8192
    networks:
      - shared-network
    restart: always
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 30s
      timeout: 300s
      retries: 3
    env_file:
      - my-keys.env
  # -----------------------
  # 4) LITELLM
  # -----------------------
  litellm:
    image: ghcr.io/berriai/litellm:main-latest
    container_name: litellm
    depends_on:
      - watchtower
      - tabby
      - ollama
    ports:
      - "8000:8000"
    runtime: nvidia
    environment:
      - PORT=8000
      - HOST=0.0.0.0
      - STORE_MODEL_IN_DB=True
      - OLLAMA_API_BASE_URL=http://ollama:11434
      - TABBY_API_BASE_URL=http://tabby:8080
      - LITELLM_MASTER_KEY=sk-1234
      - LITELLM_SALT_KEY=sk-1234
    volumes:
      # mount the config file itself, not the directory, so --config finds a file at /app/config.yml
      - ./litellm-config/config.yml:/app/config.yml
      - ./litellm-data:/app/backend/data
    command: --config /app/config.yml --port 8000 --num_workers 8
    networks:
      - shared-network
    restart: always
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 300s
      retries: 3
    env_file:
      - my-keys.env
  # -----------------------
  # 5) WEBUI
  # -----------------------
  webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    depends_on:
      - watchtower
      - ollama
      - litellm
    ports:
      - "3000:8080"
    runtime: nvidia
    environment:
      - OLLAMA_API_BASE_URL=http://ollama:11434/api
      - OPENAI_API_BASE_URL=http://litellm:8000/v1
      - NVIDIA_VISIBLE_DEVICES=all
      - NVIDIA_DRIVER_CAPABILITIES=compute,utility
      - LOCAL_INTERFACE=1
      - PLUGIN_DOWNLOAD=1
    volumes:
      - ./appdata:/app/backend/data
      - ./shared-data:/app/shared
    networks:
      - shared-network
    restart: always
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/"]
      interval: 30s
      timeout: 300s
      retries: 5
      start_period: 30s
    env_file:
      - my-keys.env
  # -----------------------
  # 6) BOLT.DIY
  # -----------------------
  boltdiy:
    # image is built locally from the cloned repo; ./bolt.diy must actually contain that clone,
    # otherwise the container loops with the "did you mean to install" errors mentioned above
    image: bolt-diy:latest
    container_name: boltdiy
    depends_on:
      - watchtower
      - litellm
      - ollama
    ports:
      - "5173:5173"
    runtime: nvidia
    environment:
      - NODE_ENV=production
      - PORT=5173
      - OLLAMA_API_BASE_URL=http://ollama:11434
      - RUNNING_IN_DOCKER=true
      - HUGGINGFACE_API_KEY=${HUGGINGFACE_API_KEY}
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
      - OPEN_ROUTER_API_KEY=${OPEN_ROUTER_API_KEY}
      - GOOGLE_GENERATIVE_AI_API_KEY=${GOOGLE_GENERATIVE_AI_API_KEY}
      - TOGETHER_API_KEY=${TOGETHER_API_KEY}
      - TOGETHER_API_BASE_URL=${TOGETHER_API_BASE_URL}
      - VITE_LOG_LEVEL=${VITE_LOG_LEVEL}
      - DEFAULT_NUM_CTX=${DEFAULT_NUM_CTX}
    volumes:
      - ./bolt.diy:/app
    networks:
      - shared-network
    restart: always
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:5173/health"]
      interval: 30s
      timeout: 300s
      retries: 3
    env_file:
      - my-keys.env
  # -----------------------
  # 7) PIPELINES
  # -----------------------
  pipelines:
    image: ghcr.io/open-webui/pipelines:main
    container_name: pipelines
    depends_on:
      - watchtower
      - webui
    ports:
      - "9099:9099"
    runtime: nvidia
    extra_hosts:
      - "host.docker.internal:host-gateway"
    environment:
      PIPELINES_URLS: "https://github.com/open-webui/pipelines/blob/main/examples/filters/detoxify_filter_pipeline.py"
    volumes:
      - pipelines:/app/pipelines
    networks:
      - shared-network
    restart: always
    env_file:
      - my-keys.env
# -----------------------
# NETWORK & VOLUMES
# -----------------------
networks:
  shared-network:
    driver: bridge

volumes:
  pipelines:
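Once everything is up, this is how I’ve been sanity-checking the chain from the host (sk-1234 is the master key from the compose file, and local-gguf comes from the LiteLLM config sketch above):

curl http://localhost:8000/v1/models -H "Authorization: Bearer sk-1234"

curl http://localhost:8000/v1/chat/completions \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{"model": "local-gguf", "messages": [{"role": "user", "content": "hello"}]}'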
Again, thanks for taking the time, and I hope to become a big part of the community!