Self-Hosted AI Stack Series
    Part 5 of 8

    Tabby — Self-Hosted Code Completion

    Replace GitHub Copilot with your own private AI code assistant.

    Prerequisites

    Completed Part 1 (Ollama), VS Code or JetBrains IDE, Docker

    Time to Complete

    30–40 minutes

    Recommended Plan

    8GB ($40/mo) recommended for responsive code completion

    Looking for a quick-start guide? Check out our standalone Tabby Deployment Guide for a streamlined setup walkthrough.

    Introduction

    GitHub Copilot costs $19/user/month and sends your proprietary code to Microsoft's servers. For a team of 10, that's $190/month — and every line of your codebase is processed externally.

    Tabby runs on your VPS, uses open models, and keeps every line of code private. The ROI case practically makes itself.

    💰 Cost comparison: GitHub Copilot Business for 5 users: $95/month. Tabby on your RamNode VPS: $0/user — unlimited users, zero code exposure.
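    The per-seat arithmetic scales with head-count while the VPS cost stays flat. A quick sketch using the prices quoted above:

    ```shell
    # Copilot Business is billed per seat; a self-hosted Tabby VPS is a flat fee
    users=10
    copilot_per_seat=19   # $/user/month, as quoted above
    vps=40                # $/month for the 8GB plan
    copilot=$((users * copilot_per_seat))
    savings=$((copilot - vps))
    echo "Copilot: \$${copilot}/mo  Tabby VPS: \$${vps}/mo  Savings: \$${savings}/mo"
    ```

    At 10 seats the VPS pays for itself several times over, and the gap widens with every additional developer.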

    Tabby vs Continue.dev

    Feature | Tabby | Continue.dev
    Architecture | Dedicated server + IDE extension | VS Code extension only
    Model serving | Built-in, optimized | Connects to Ollama
    Repo context | Indexes your codebase | File-level context
    Team features | Multi-user, analytics | Single user
    IDE support | VS Code, JetBrains, Vim | VS Code only

    Deploying Tabby

    Create a working directory and a compose file:

    mkdir -p ~/ai-stack/tabby && cd ~/ai-stack/tabby

    docker-compose.yml
    version: "3.8"
    
    services:
      tabby:
        image: tabbyml/tabby:latest
        container_name: tabby
        restart: unless-stopped
        command: serve --model StarCoder-1B --device cpu
        ports:
          - "8080:8080"
        volumes:
          - tabby-data:/data
        environment:
          - TABBY_DISABLE_USAGE_COLLECTION=1
    
    volumes:
      tabby-data:
    Start the service:

    docker compose up -d

    The Tabby dashboard is now available at http://your-server-ip:8080.
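    To verify the server from your workstation before wiring up an IDE, you can hit the health endpoint. The /v1/health path is assumed from Tabby's HTTP API; adjust if your version differs:

    ```shell
    # Returns JSON with the loaded model and device when Tabby is ready
    # (/v1/health path assumed; your-server-ip is a placeholder, as above)
    curl -s http://your-server-ip:8080/v1/health
    ```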

    Model Selection for Code Completion

    Model | Size | RAM | Latency | Quality
    StarCoder-1B | 1B | ~2 GB | Fast | Good for basic completion
    StarCoder-3B | 3B | ~4 GB | Moderate | Better suggestions
    CodeLlama-7B | 7B | ~6 GB | Slower | Best quality
    DeepSeek-Coder-1.3B | 1.3B | ~2 GB | Fast | Excellent for size

    Remember to account for RAM used by Ollama (Part 1). On an 8GB VPS, use StarCoder-1B or DeepSeek-Coder-1.3B alongside Ollama.
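    Switching models is a one-line change to the compose file. A hedged sketch, assuming the compose file above (check Tabby's model registry for the exact model identifier; the first start with a new model downloads its weights, so expect a delay):

    ```shell
    # Swap the served model, then recreate the container to load it
    sed -i 's/--model StarCoder-1B/--model DeepSeek-Coder-1.3B/' docker-compose.yml
    docker compose up -d --force-recreate
    ```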

    IDE Integration

    VS Code

    1. Install the Tabby extension from the VS Code marketplace
    2. Open Settings → search "Tabby"
    3. Set Server URL to http://your-server-ip:8080
    4. Set your authentication token (from Tabby dashboard)
    5. Start typing — completions appear inline automatically
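    If you prefer configuring the client by file rather than the Settings UI, the Tabby client can read a config file in your home directory. A sketch assuming the ~/.tabby-client/agent/config.toml location; verify the path and keys against your extension version:

    ```toml
    # ~/.tabby-client/agent/config.toml (location and keys assumed; check your Tabby client docs)
    [server]
    endpoint = "http://your-server-ip:8080"
    token = "your-auth-token"
    ```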

    JetBrains IDEs

    1. Go to Settings → Plugins → Marketplace
    2. Search and install "Tabby"
    3. Configure the server endpoint under Settings → Tools → Tabby
    4. The plugin works with IntelliJ IDEA, PyCharm, WebStorm, and other JetBrains IDEs

    Repository Context

    Tabby can index your codebase for context-aware completions that understand your project's patterns, naming conventions, and architecture:

    1. Navigate to the Tabby dashboard → Repositories
    2. Add your Git repositories (supports GitHub, GitLab, or local paths)
    3. Configure indexing schedules (hourly/daily)
    4. Select which repositories and branches to index

    This dramatically improves suggestion quality — completions reference your actual code patterns rather than generic suggestions.
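    You can also exercise the completion endpoint directly to sanity-check indexing and model behavior. A sketch assuming Tabby's /v1/completions HTTP API; the request shape may differ across versions:

    ```shell
    # Ask for a completion for a Python prefix; expects a JSON response
    # containing candidate completions
    curl -s -X POST http://your-server-ip:8080/v1/completions \
      -H 'Content-Type: application/json' \
      -d '{"language": "python", "segments": {"prefix": "def fib(n):\n    "}}'
    ```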

    Team Configuration

    Set up multi-user access:

    • Create user accounts from the Tabby dashboard
    • Generate per-user authentication tokens
    • View usage analytics per developer
    • Configure different models for different teams (Python team might prefer CodeLlama, TypeScript team might prefer StarCoder)
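    Per-user tokens are sent with each request, so a token can be verified from the command line. A sketch assuming a bearer-token scheme and the /v1/health path:

    ```shell
    # Replace YOUR_TOKEN with a token generated in the Tabby dashboard
    curl -s http://your-server-ip:8080/v1/health \
      -H 'Authorization: Bearer YOUR_TOKEN'
    ```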

    Performance Optimization

    Completions should feel instant — aim for under 500ms latency:

    Model | 4GB VPS | 8GB VPS | Feel
    StarCoder-1B | ~200ms | ~150ms | Instant
    DeepSeek-1.3B | ~300ms | ~200ms | Snappy
    StarCoder-3B | ~800ms | ~400ms | Noticeable
    CodeLlama-7B | Too slow | ~700ms | Acceptable
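    To see where your setup lands in that table, curl's built-in timing gives a quick end-to-end number from a developer machine (the /v1/completions path is assumed from Tabby's HTTP API):

    ```shell
    # Prints total request time in seconds; run a few times and take the median
    curl -s -o /dev/null -w 'total: %{time_total}s\n' \
      -X POST http://your-server-ip:8080/v1/completions \
      -H 'Content-Type: application/json' \
      -d '{"language": "python", "segments": {"prefix": "import "}}'
    ```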

    Continue.dev Alternative

    If you prefer a simpler setup that connects directly to Ollama (Part 1):

    1. Install the Continue extension in VS Code
    2. Open ~/.continue/config.json
    3. Configure your Ollama endpoint:
    ~/.continue/config.json
    {
      "models": [{
        "title": "Ollama - Mistral",
        "provider": "ollama",
        "model": "mistral",
        "apiBase": "http://your-server-ip:11434"
      }],
      "tabAutocompleteModel": {
        "title": "DeepSeek Coder",
        "provider": "ollama",
        "model": "deepseek-coder:6.7b",
        "apiBase": "http://your-server-ip:11434"
      }
    }

    What's Next?

    Your developers now have private AI code assistance that rivals Copilot — at a fraction of the cost and with zero code exposure. In Part 6: CrewAI, we go beyond single-model interactions with multi-agent AI workflows.