
Run MoltBot with Local Models via Ollama
Ollama lets you run AI models locally, giving you complete privacy and zero API costs. This guide shows how to set up MoltBot with Ollama.
Why Local Models?
- Complete Privacy: Data never leaves your machine
- No API Costs: Free after initial setup
- Offline Capable: Works without internet
- Full Control: Choose any open-source model
- Fast: No network latency
Prerequisites
- 8GB+ RAM (16GB+ recommended)
- macOS, Windows, or Linux
- GPU optional but recommended
Installing Ollama
macOS
```
brew install ollama
```
Or download from ollama.com.
Windows
Download the installer from ollama.com.
Linux
```
curl -fsSL https://ollama.com/install.sh | sh
```
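Whichever installer you use, a quick version check confirms the CLI is available on your PATH:
```
# Print the installed Ollama version
ollama --version
```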
Starting Ollama
Start the Ollama service:
```
ollama serve
```
Ollama runs on http://localhost:11434 by default.
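To confirm the server is responding, you can query its HTTP API; /api/tags returns the models installed locally:
```
# Returns JSON listing locally installed models (an empty list on a fresh install)
curl http://localhost:11434/api/tags
```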
Downloading Models
Recommended Models
```
# Best quality (needs ~40GB+ RAM or VRAM)
ollama pull llama3.1:70b
# Good balance (requires 8GB+ RAM)
ollama pull llama3.1
# Fast and light (requires 4GB+ RAM)
ollama pull llama3.2:3b
# Coding specialist
ollama pull codellama
# Uncensored
ollama pull dolphin-mistral
```
Check Downloaded Models
```
ollama list
```
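To make sure a model actually loads and responds, give it a one-off prompt; the command exits after the model answers:
```
# One-off prompt; Ollama pulls the model first if it isn't already downloaded
ollama run llama3.1 "Say hello in one sentence."
```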
Configuring MoltBot
Basic Setup
```
moltbot config set aiProvider ollama
moltbot config set ollamaModel llama3.1
```
Full Configuration
Edit ~/.moltbot/config.json:
```
{
  "aiProvider": "ollama",
  "ollama": {
    "baseUrl": "http://localhost:11434",
    "model": "llama3.1",
    "options": {
      "temperature": 0.7,
      "numCtx": 8192,
      "numGpu": 1
    }
  }
}
```
Configuration Options
| Option | Description | Default |
|---|---|---|
| baseUrl | Ollama server URL | http://localhost:11434 |
| model | Model to use | Required |
| temperature | Response creativity (0-1) | 0.7 |
| numCtx | Context window size | 4096 |
| numGpu | Number of GPUs to use | 0 (CPU) |
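These keys presumably map onto Ollama's request options, which use snake_case names (num_ctx, num_gpu, and so on); the camelCase-to-snake_case mapping is an assumption based on the names. You can try the same values directly against the Ollama API before committing them to MoltBot's config:
```
# Send the equivalent options straight to Ollama (note the snake_case names)
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.1",
  "prompt": "Summarize what Ollama does in one sentence.",
  "stream": false,
  "options": {
    "temperature": 0.7,
    "num_ctx": 8192,
    "num_gpu": 1
  }
}'
```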
Model Selection Guide
For General Use
```
{
  "ollama": {
    "model": "llama3.1"
  }
}
```
Llama 3.1 offers the best balance of quality and performance.
For Coding
```
{
  "ollama": {
    "model": "codellama:34b"
  }
}
```
For Fast Responses
```
{
  "ollama": {
    "model": "llama3.2:3b",
    "options": {
      "numCtx": 4096
    }
  }
}
```
For Complex Analysis
```
{
  "ollama": {
    "model": "llama3.1:70b",
    "options": {
      "numCtx": 16384,
      "numGpu": 1
    }
  }
}
```
GPU Acceleration
NVIDIA
Ollama automatically uses NVIDIA GPUs. Verify:
```
ollama run llama3.1 --verbose
# Look for "GPU" in output
```
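On recent Ollama releases, ollama ps also reports whether a loaded model is running on the CPU or GPU, and nvidia-smi shows how much GPU memory it occupies:
```
# Show loaded models and the processor they are using (CPU vs GPU)
ollama ps

# On NVIDIA systems, check GPU memory usage while a model is loaded
nvidia-smi
```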
Apple Silicon
Native Metal support on M-series chips (M1 and later):
```
{
  "ollama": {
    "options": {
      "numGpu": 1
    }
  }
}
```
AMD (ROCm)
Install ROCm first, then Ollama will detect it automatically.
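Before starting Ollama, you can check that ROCm actually sees your GPU; rocminfo ships with ROCm, and discrete GPUs show up with a gfx identifier (which varies by card):
```
# List ROCm agents; a supported GPU appears with a gfx target name
rocminfo | grep -i gfx
```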
Memory Optimization
Reducing Memory Usage
```
{
  "ollama": {
    "model": "llama3.1",
    "options": {
      "numCtx": 4096,
      "numBatch": 512,
      "numThread": 4
    }
  }
}
```
Loading Models
Pre-load models on startup:
```
# Keep model in memory
ollama run llama3.1 &
```
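If you would rather not keep a shell attached, the Ollama API's keep_alive parameter does the same job: a request with no prompt loads the model and controls how long it stays resident (a duration string, or -1 for indefinitely):
```
# Load the model and keep it in memory for 30 minutes
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.1",
  "keep_alive": "30m"
}'
```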
Unloading Models
Free memory when not in use:
```
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.1",
  "keep_alive": 0
}'
```
Running Ollama Remotely
Server Setup
On your server:
```
# Allow external connections
OLLAMA_HOST=0.0.0.0 ollama serve
```
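If Ollama runs as a systemd service on Linux (the unit is typically named ollama), setting the variable in your shell is not enough; here is a sketch of persisting it on the service itself:
```
# Add an environment override to the Ollama systemd service
sudo systemctl edit ollama
# In the override file, add:
#   [Service]
#   Environment="OLLAMA_HOST=0.0.0.0"
sudo systemctl restart ollama
```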
Client Configuration
```
{
  "ollama": {
    "baseUrl": "http://192.168.1.100:11434",
    "model": "llama3.1"
  }
}
```
With Authentication
Use a reverse proxy like nginx with basic auth:
```
location /api/ {
    auth_basic "Ollama";
    auth_basic_user_file /etc/nginx/.htpasswd;
    proxy_pass http://localhost:11434;
}
```
Hybrid Mode
Use Ollama for simple tasks and a cloud API for complex ones:
```
{
  "aiProvider": "hybrid",
  "hybrid": {
    "default": "ollama",
    "fallback": "anthropic"
  },
  "ollama": {
    "model": "llama3.1"
  },
  "anthropic": {
    "model": "claude-3-sonnet"
  },
  "skills": {
    "quickChat": {
      "aiProvider": "ollama"
    },
    "complexAnalysis": {
      "aiProvider": "anthropic"
    }
  }
}
```
Performance Tuning
For Speed
```
{
  "ollama": {
    "model": "llama3.2:3b",
    "options": {
      "numCtx": 2048,
      "numBatch": 1024,
      "numGpu": 1
    }
  }
}
```
For Quality
```
{
  "ollama": {
    "model": "llama3.1:70b",
    "options": {
      "numCtx": 8192,
      "temperature": 0.5,
      "repeatPenalty": 1.1
    }
  }
}
```
Troubleshooting
Model Not Found
```
# Pull the model first
ollama pull llama3.1
```
Connection Refused
```
# Start Ollama service
ollama serve
```
Out of Memory
- Use a smaller model
- Reduce numCtx
- Close other applications
Slow Performance
- Enable GPU: numGpu: 1
- Use a smaller model
- Reduce context size
Custom Models
Using Custom Modelfiles
Create Modelfile:
```
FROM llama3.1
SYSTEM "You are MoltBot, a helpful AI assistant."
PARAMETER temperature 0.7
```
Build and use:
```
ollama create moltbot-custom -f Modelfile
```
Configure MoltBot:
```
{
  "ollama": {
    "model": "moltbot-custom"
  }
}
```
Enjoy private, local AI with MoltBot and Ollama! Need help? Join our Discord community.