How to Self-Host Tube Archivist with Docker

What Is Tube Archivist?

Tube Archivist is a self-hosted YouTube media manager that downloads, indexes, and organizes YouTube videos into a personal media library. Subscribe to channels, download their entire catalogs, search across all your archived content, and watch everything through a clean web interface — all stored on your own hardware.

Think of it as Plex/Jellyfin but specifically designed for YouTube content. Videos are yours permanently, even if the original is deleted, copyright-claimed, or the channel disappears. Tube Archivist uses Elasticsearch for full-text search across video titles, descriptions, and subtitles, and yt-dlp for downloading.

Official site: tubearchivist.com

Prerequisites

A Linux server (Ubuntu 22.04+ recommended)
Docker and Docker Compose installed
4 GB+ of free RAM (Elasticsearch alone uses 1 GB heap)
20+ GB of free disk space for application data, plus storage for video files
vm.max_map_count kernel parameter set to 262144 (required for Elasticsearch)

Set the required kernel parameter:

# Temporary (resets on reboot)
sudo sysctl -w vm.max_map_count=262144

# Permanent (persists across reboots)
echo "vm.max_map_count=262144" | sudo tee -a /etc/sysctl.conf
sudo sysctl -p
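
To confirm the setting took effect, a quick check along these lines can help. This is a minimal sketch: `map_count_ok` is a hypothetical helper name, not something shipped with Elasticsearch or Tube Archivist.

```shell
# Hypothetical helper: succeeds when the given value meets the minimum
# Elasticsearch expects (262144 unless a second argument overrides it).
map_count_ok() {
  [ "$1" -ge "${2:-262144}" ]
}

# Read the live value; fall back to 0 if sysctl is unavailable.
current=$(sysctl -n vm.max_map_count 2>/dev/null || echo 0)

if map_count_ok "$current"; then
  echo "vm.max_map_count=$current is high enough"
else
  echo "vm.max_map_count=$current is too low; rerun the commands above"
fi
```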

Docker Compose Configuration

Tube Archivist requires three services: the main application, Elasticsearch for search indexing, and Redis for caching and task queuing.

Create a docker-compose.yml file:

services:
  tubearchivist:
    image: bbilly1/tubearchivist:v0.5.9
    container_name: tubearchivist
    restart: unless-stopped
    ports:
      - "8000:8000"
    volumes:
      # Video storage — point this to a drive with plenty of space
      - media:/youtube
      # Application cache (thumbnails, metadata)
      - cache:/cache
    environment:
      # Elasticsearch connection
      - ES_URL=http://archivist-es:9200
      # Redis connection
      - REDIS_CON=redis://archivist-redis:6379
      # File ownership (match your host user)
      - HOST_UID=1000
      - HOST_GID=1000
      # Full URL of your instance (include protocol and port)
      - TA_HOST=http://localhost:8000
      # Initial admin credentials (change the password!)
      - TA_USERNAME=admin
      - TA_PASSWORD=CHANGE_THIS_PASSWORD
      # Must match ELASTIC_PASSWORD on the ES service
      - ELASTIC_PASSWORD=CHANGE_THIS_ES_PASSWORD
      # Timezone for download scheduler
      - TZ=America/New_York
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/api/health/"]
      interval: 2m
      timeout: 10s
      retries: 3
      start_period: 30s
    depends_on:
      - archivist-es
      - archivist-redis

  archivist-redis:
    image: redis:7.4
    container_name: archivist-redis
    restart: unless-stopped
    expose:
      - "6379"
    volumes:
      - redis:/data
    depends_on:
      - archivist-es

  archivist-es:
    image: bbilly1/tubearchivist-es
    # For ARM64, use: image: elasticsearch:8.19.0
    container_name: archivist-es
    restart: unless-stopped
    environment:
      # Must match ELASTIC_PASSWORD on tubearchivist service
      - "ELASTIC_PASSWORD=CHANGE_THIS_ES_PASSWORD"
      # Java heap size — 1 GB minimum, increase for large libraries
      - "ES_JAVA_OPTS=-Xms1g -Xmx1g"
      - "xpack.security.enabled=true"
      - "discovery.type=single-node"
      - "path.repo=/usr/share/elasticsearch/data/snapshot"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - es:/usr/share/elasticsearch/data
    expose:
      - "9200"

volumes:
  media:
  cache:
  redis:
  es:

Replace CHANGE_THIS_PASSWORD and CHANGE_THIS_ES_PASSWORD with strong passwords. The ELASTIC_PASSWORD must be identical on both the tubearchivist and archivist-es services.
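
One way to produce suitable values is to read random bytes from the kernel and print them as hex. The sketch below assumes a Linux host with `/dev/urandom`; `gen_password` is a hypothetical helper name. Hex output contains only digits and the letters a-f, so the result needs no quoting or escaping in the Compose file.

```shell
# Hypothetical helper: print N random bytes (16 by default) as lowercase hex,
# which gives a 32-character string with 128 bits of randomness.
gen_password() {
  od -An -N"${1:-16}" -tx1 /dev/urandom | tr -d ' \n'
}

echo "TA_PASSWORD:      $(gen_password)"
echo "ELASTIC_PASSWORD: $(gen_password)"
```

Paste the second value into both ELASTIC_PASSWORD entries so the two services agree.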

Start the stack:

docker compose up -d

First startup takes 2-3 minutes while Elasticsearch initializes and Tube Archivist runs migrations. Monitor with:

docker compose logs -f tubearchivist
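
Instead of watching the logs, you can poll the health status that the Compose file's healthcheck feeds into Docker. The following is a sketch, not an official tool: `wait_healthy` is a hypothetical helper, and the 5-second polling step and 300-second default timeout are arbitrary choices.

```shell
# Hypothetical helper: poll Docker's health status for a container until it
# reports "healthy" or the timeout (in seconds, default 300) runs out.
wait_healthy() {
  container="$1"
  remaining="${2:-300}"
  while [ "$remaining" -gt 0 ]; do
    # Prints starting, healthy, or unhealthy for containers with a healthcheck.
    status=$(docker inspect --format '{{.State.Health.Status}}' "$container" 2>/dev/null || true)
    if [ "$status" = "healthy" ]; then
      echo "$container is healthy"
      return 0
    fi
    sleep 5
    remaining=$((remaining - 5))
  done
  echo "$container did not become healthy in time" >&2
  return 1
}

# Usage: wait_healthy tubearchivist 300
```

Because the healthcheck above runs every two minutes, the status can stay at "starting" for a while after the application is already reachable.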

Initial Setup

1. Open http://your-server-ip:8000 in your browser
2. Log in with the credentials set in TA_USERNAME and TA_PASSWORD
3. Navigate to Settings → Downloads:
   - Set your preferred video quality (720p is a good balance of quality vs storage)
   - Configure subtitle languages if desired
   - Set the download scheduler interval
4. Subscribe to your first channel:
   - Click + → Subscribe
   - Paste a YouTube channel URL
   - Tube Archivist imports the channel metadata and queues videos for download
5. Start downloading:
   - Go to Downloads to see queued videos
   - Click Start Download to begin processing the queue

Configuration

Download Settings

Setting	Recommendation	Description
Video quality	720p	Best quality/storage balance. 1080p doubles storage per video.
Subtitle languages	en	Downloads subtitle tracks for full-text search
Auto-download new	Enabled	Automatically downloads new videos from subscribed channels
Download limit	0 (unlimited)	Max videos to download per channel per scan. 0 = all
Download speed	0 (unlimited)	Limit in KB/s. Useful to avoid YouTube throttling
Throttle rate	Enabled	Adds delays between downloads to reduce ban risk

Storage Planning

YouTube video sizes vary significantly:

Quality	Average Size per Hour	1,000 Videos (~5 min avg)
360p	~250 MB	~20 GB
720p	~900 MB	~75 GB
1080p	~1.8 GB	~150 GB
4K	~7 GB	~580 GB
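
The last column follows from simple arithmetic: videos × average minutes × MB per hour, divided by 60 and then by 1,000 to get GB. A small sketch for plugging in your own numbers (`estimate_gb` is a hypothetical helper; it uses integer math, so results round down):

```shell
# Hypothetical helper: estimate library size in whole GB.
#   $1 = number of videos
#   $2 = average video length in minutes
#   $3 = size per hour of video in MB (from the table above)
estimate_gb() {
  echo $(( $1 * $2 * $3 / 60 / 1000 ))
}

estimate_gb 1000 5 900    # 720p  -> prints 75
estimate_gb 1000 5 1800   # 1080p -> prints 150
```

These are averages only; actual sizes depend on the codec and on how much motion the footage contains.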

Plan your storage accordingly. Use bind mounts instead of named volumes if you want to store media on a specific drive:

volumes:
  - /mnt/storage/youtube:/youtube  # Large media drive
  - cache:/cache

Using an External Media Drive

To store videos on an external or NAS-mounted drive, replace the media volume with a bind mount:

volumes:
  - /mnt/nas/youtube:/youtube
  - cache:/cache

Ensure the directory exists and is writable by the UID/GID specified in HOST_UID/HOST_GID.
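
A minimal sketch of that preparation step follows. Both function names are hypothetical, and the numeric IDs you pass should match HOST_UID and HOST_GID in your Compose file.

```shell
# Hypothetical helper: create the media directory and hand it to the UID/GID
# that the container writes as.
#   $1 = directory, $2 = UID, $3 = GID
prepare_media_dir() {
  mkdir -p "$1" && chown "$2:$3" "$1"
}

# Hypothetical helper: succeeds when the directory's numeric owner and group
# match the expected UID/GID.
owner_matches() {
  [ "$(ls -nd "$1" | awk '{print $3 ":" $4}')" = "$2:$3" ]
}

# Usage (from a root shell, since chown to another user needs root):
#   prepare_media_dir /mnt/nas/youtube 1000 1000
#   owner_matches /mnt/nas/youtube 1000 1000 && echo "ownership OK"
```

On NAS mounts such as NFS or SMB, ownership may be controlled by the export or mount options rather than chown, so treat a failed `owner_matches` there as a hint to check the mount configuration.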

Reverse Proxy

Tube Archivist listens on port 8000. Update TA_HOST to match your public URL when using a reverse proxy:

- TA_HOST=https://tubearchivist.yourdomain.com

For Nginx Proxy Manager, create a proxy host pointing to tubearchivist on port 8000. For Caddy:

tubearchivist.yourdomain.com {
    reverse_proxy localhost:8000
}

Backup

What to Back Up

Volume	Priority	Description
`es`	Critical	Elasticsearch indices (search data, metadata, subscriptions)
`media`	Important	Downloaded videos (can be re-downloaded but may be slow)
`cache`	Low	Thumbnails and temporary data (auto-regenerated)
`redis`	Low	Task queue state (auto-recovers)

Elasticsearch Snapshots

Tube Archivist has built-in Elasticsearch snapshot support:

1. Go to Settings → Scheduler → Snapshot
2. Enable the snapshot schedule
3. Snapshots are stored in the ES data volume under the path.repo directory set in the Compose file (/usr/share/elasticsearch/data/snapshot)

Manual Backup

# Stop the stack to ensure consistency
docker compose stop

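# Back up Elasticsearch data. Compose prefixes volume names with the project
# name, so check "docker volume ls" first (tubearchivist_es is an example name)
sudo tar czf es-backup.tar.gz -C "$(docker volume inspect tubearchivist_es --format '{{ .Mountpoint }}')" .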

# Back up video files (if needed)
rsync -avz /mnt/storage/youtube/ /backup/youtube/

# Restart
docker compose start
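
If you would rather not read from /var/lib/docker directly, a common alternative is to mount the volume read-only into a throwaway container and create the archive from there. The sketch below assumes the public `alpine` image; `backup_volume` is a hypothetical helper name. Run it while the stack is stopped, as above.

```shell
# Hypothetical helper: archive a named Docker volume into the current
# directory by mounting it read-only in a temporary Alpine container.
#   $1 = volume name, $2 = archive file name (default: <volume>-backup.tar.gz)
backup_volume() {
  volume="$1"
  archive="${2:-$1-backup.tar.gz}"
  docker run --rm \
    -v "$volume":/source:ro \
    -v "$PWD":/backup \
    alpine tar czf "/backup/$archive" -C /source .
}

# Usage: find the real volume name with "docker volume ls", then for example:
#   backup_volume tubearchivist_es
```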

Troubleshooting

Elasticsearch Won’t Start

Symptom: archivist-es container exits immediately with “bootstrap check failure” or “max virtual memory areas vm.max_map_count too low.”

Fix: Set the kernel parameter:

sudo sysctl -w vm.max_map_count=262144
echo "vm.max_map_count=262144" | sudo tee -a /etc/sysctl.conf

Elasticsearch Goes Read-Only

Symptom: Downloads fail. Logs show “FORBIDDEN/12/index read-only / allow delete.” Tube Archivist dashboard shows errors.

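Fix: Elasticsearch marks indices read-only when disk usage on its data path crosses the flood-stage watermark (95% by default). Free up space first; current Elasticsearch versions lift the block automatically once usage drops back below the high watermark. If you need temporary headroom while cleaning up, raise the flood-stage watermark. Port 9200 is not published to the host and security is enabled, so run the request inside the container with the elastic user's password (this assumes curl is available in the image, as it is in the official Elasticsearch image):

docker exec archivist-es curl -s -u "elastic:CHANGE_THIS_ES_PASSWORD" \
  -X PUT "http://localhost:9200/_cluster/settings" \
  -H 'Content-Type: application/json' \
  -d '{"transient":{"cluster.routing.allocation.disk.watermark.flood_stage":"97%"}}'

Long-term: keep at least 10% free space on the drive that holds the es volume.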
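
To catch this before Elasticsearch does, you can check disk usage yourself. The sketch below is illustrative: `disk_used_pct` is a hypothetical helper, `/var/lib/docker` is assumed to be where your named volumes live (Docker's default data root), and the 90% warning threshold is an arbitrary choice below the 95% limit.

```shell
# Hypothetical helper: print the used-space percentage (digits only) of the
# filesystem that holds the given path.
disk_used_pct() {
  df -P "$1" | awk 'NR == 2 { sub("%", "", $5); print $5 }'
}

# Assumed location of Docker volumes; fall back to / if it does not exist.
target=/var/lib/docker
[ -d "$target" ] || target=/

pct=$(disk_used_pct "$target")
if [ "$pct" -ge 90 ]; then
  echo "Warning: $target is ${pct}% full"
else
  echo "$target is ${pct}% full"
fi
```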

ulimits Error in LXC/Proxmox

Symptom: Elasticsearch fails with “error setting rlimit” in LXC containers or Proxmox.

Fix: Remove the ulimits section from the archivist-es service in your docker-compose.yml. LXC containers don’t support changing memlock limits.

Timezone ValueError on Startup

Symptom: Tube Archivist crashes with ValueError related to timezone.

Fix: Use canonical timezone names (e.g., America/Chicago) instead of aliases (e.g., US/Central). Check valid names at /usr/share/zoneinfo/.

“Connection to Elasticsearch Failed” on Startup

Symptom: Tube Archivist logs show repeated connection failures to Elasticsearch.

Fix: Elasticsearch takes 30-60 seconds to start. Tube Archivist retries automatically. If it persists, verify the ELASTIC_PASSWORD matches between both services.
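
A quick way to rule out a password mismatch is to compare the ELASTIC_PASSWORD entries in the Compose file itself. This is a rough text check, not a YAML parser; `es_passwords_match` is a hypothetical helper and it assumes the passwords contain no spaces or double quotes.

```shell
# Hypothetical helper: succeeds when every ELASTIC_PASSWORD=... entry in the
# given Compose file carries the same value.
es_passwords_match() {
  distinct=$(grep -o 'ELASTIC_PASSWORD=[^"[:space:]]*' "$1" | sort -u | wc -l | tr -d ' ')
  [ "$distinct" -eq 1 ]
}

if [ -f docker-compose.yml ]; then
  if es_passwords_match docker-compose.yml; then
    echo "ELASTIC_PASSWORD entries match"
  else
    echo "ELASTIC_PASSWORD entries differ or are missing"
  fi
fi
```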

Resource Requirements

RAM: 2 GB minimum (1 GB for ES heap + 512 MB for app + 256 MB for Redis), 4 GB recommended
CPU: Dual-core minimum, quad-core recommended for concurrent downloads and indexing
Disk: 20 GB for application data. Video storage depends on your library size — see storage planning table above

Verdict

Tube Archivist is the best self-hosted tool for building a personal YouTube archive. The combination of channel subscriptions, automated downloads, full-text search across subtitles, and a clean watching interface makes it genuinely useful — not just a yt-dlp wrapper with a web UI.

The main cost is resources: Elasticsearch is hungry (1 GB+ RAM just for search), and video storage grows fast at higher quality settings. If you just want to watch YouTube without ads and tracking (but not download videos), Invidious is a better fit. If you want to archive specific videos for offline viewing, Tube Archivist is the right tool.

Frequently Asked Questions

Is Tube Archivist legal to use?

Tube Archivist downloads YouTube videos for personal archival purposes. Whether this is legal depends on your jurisdiction: some countries permit copies for private use, while redistributing downloaded content generally is not permitted. Downloading can also conflict with YouTube's Terms of Service. Tube Archivist is designed for personal media archival, not piracy.

How much disk space do I need for a YouTube archive?

It depends on video quality. Using the averages from the storage planning table, expect roughly 900 MB per hour of video at 720p, about 1.8 GB per hour at 1080p, and around 7 GB per hour at 4K. At those rates, 1,000 five-minute videos at 720p take about 75 GB. Plan your storage based on the quality settings and channel sizes you intend to archive.

Can I watch archived videos without YouTube?

Yes. Tube Archivist has its own built-in media player with a clean watching interface. Videos play directly from your server with no connection to YouTube needed. You can also integrate with Jellyfin for a more polished media center experience, or use Plex via community plugins.

Why does Tube Archivist need Elasticsearch?

Elasticsearch provides full-text search across video titles, descriptions, and subtitles. It enables instant search across your entire archive, including searching within subtitle text. The downside is Elasticsearch’s memory footprint — it requires at least 1 GB of RAM dedicated to its heap.

How does Tube Archivist compare to Invidious?

Invidious is a YouTube frontend proxy — it lets you watch YouTube without tracking but doesn’t download or store videos. Tube Archivist downloads and archives videos locally for offline access. Use Invidious for private YouTube browsing; use Tube Archivist for building a permanent local video library.

Can Tube Archivist automatically download new videos from subscribed channels?

Yes. Tube Archivist supports channel subscriptions with configurable download schedules. You can subscribe to channels or playlists, and the built-in scheduler will automatically download new uploads according to your configured frequency and quality settings.