How to Self-Host Tube Archivist with Docker
What Is Tube Archivist?
Tube Archivist is a self-hosted YouTube media manager that downloads, indexes, and organizes YouTube videos into a personal media library. Subscribe to channels, download their entire catalogs, search across all your archived content, and watch everything through a clean web interface — all stored on your own hardware.
Think of it as Plex/Jellyfin but specifically designed for YouTube content. Videos are yours permanently, even if the original is deleted, copyright-claimed, or the channel disappears. Tube Archivist uses Elasticsearch for full-text search across video titles, descriptions, and subtitles, and yt-dlp for downloading.
Official site: tubearchivist.com
Prerequisites
- A Linux server (Ubuntu 22.04+ recommended)
- Docker and Docker Compose installed
- 4 GB+ of free RAM (Elasticsearch alone uses 1 GB heap)
- 20+ GB of free disk space for application data, plus storage for video files
- vm.max_map_count kernel parameter set to 262144 (required for Elasticsearch)
Set the required kernel parameter:
# Temporary (resets on reboot)
sudo sysctl -w vm.max_map_count=262144
# Permanent (persists across reboots)
echo "vm.max_map_count=262144" | sudo tee -a /etc/sysctl.conf
sudo sysctl -p
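To confirm the setting took effect, a small check like the following can help. It is a minimal sketch: the function name check_map_count is hypothetical, and it assumes a Linux host where /proc/sys/vm/max_map_count is readable.

```shell
# Succeeds when vm.max_map_count meets the 262144 minimum Elasticsearch needs.
# With no argument, the current value is read from /proc (Linux only).
# check_map_count is a hypothetical helper name, not part of any tool.
check_map_count() {
  required=262144
  current="${1:-$(cat /proc/sys/vm/max_map_count)}"
  [ "$current" -ge "$required" ]
}

# Example:
#   check_map_count && echo "vm.max_map_count is high enough"
```

Passing a number as the first argument checks that value instead of the live one, which is handy for dry runs.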
Docker Compose Configuration
Tube Archivist requires three services: the main application, Elasticsearch for search indexing, and Redis for caching and task queuing.
Create a docker-compose.yml file:
services:
  tubearchivist:
    image: bbilly1/tubearchivist:v0.5.9
    container_name: tubearchivist
    restart: unless-stopped
    ports:
      - "8000:8000"
    volumes:
      # Video storage: point this to a drive with plenty of space
      - media:/youtube
      # Application cache (thumbnails, metadata)
      - cache:/cache
    environment:
      # Elasticsearch connection
      - ES_URL=http://archivist-es:9200
      # Redis connection
      - REDIS_CON=redis://archivist-redis:6379
      # File ownership (match your host user)
      - HOST_UID=1000
      - HOST_GID=1000
      # Full URL of your instance (include protocol and port)
      - TA_HOST=http://localhost:8000
      # Initial admin credentials (change the password!)
      - TA_USERNAME=admin
      - TA_PASSWORD=CHANGE_THIS_PASSWORD
      # Must match ELASTIC_PASSWORD on the ES service
      - ELASTIC_PASSWORD=CHANGE_THIS_ES_PASSWORD
      # Timezone for download scheduler
      - TZ=America/New_York
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/api/health/"]
      interval: 2m
      timeout: 10s
      retries: 3
      start_period: 30s
    depends_on:
      - archivist-es
      - archivist-redis

  archivist-redis:
    image: redis:7.4
    container_name: archivist-redis
    restart: unless-stopped
    expose:
      - "6379"
    volumes:
      - redis:/data
    depends_on:
      - archivist-es

  archivist-es:
    image: bbilly1/tubearchivist-es
    # For ARM64, use: image: elasticsearch:8.19.0
    container_name: archivist-es
    restart: unless-stopped
    environment:
      # Must match ELASTIC_PASSWORD on tubearchivist service
      - "ELASTIC_PASSWORD=CHANGE_THIS_ES_PASSWORD"
      # Java heap size: 1 GB minimum, increase for large libraries
      - "ES_JAVA_OPTS=-Xms1g -Xmx1g"
      - "xpack.security.enabled=true"
      - "discovery.type=single-node"
      - "path.repo=/usr/share/elasticsearch/data/snapshot"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - es:/usr/share/elasticsearch/data
    expose:
      - "9200"

volumes:
  media:
  cache:
  redis:
  es:
Replace CHANGE_THIS_PASSWORD and CHANGE_THIS_ES_PASSWORD with strong passwords. The ELASTIC_PASSWORD must be identical on both the tubearchivist and archivist-es services.
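One way to produce strong values for both placeholders is to pull random alphanumeric characters from /dev/urandom. This is a sketch, not the only approach: gen_password is a hypothetical helper name, and sticking to letters and digits is an assumption made here to avoid quoting problems inside the compose file.

```shell
# Print a random alphanumeric password of the requested length (default 32).
# gen_password is a hypothetical helper; it reads from /dev/urandom.
gen_password() {
  length="${1:-32}"
  LC_ALL=C tr -dc 'A-Za-z0-9' < /dev/urandom | head -c "$length"
}

# Example:
#   gen_password 32; echo
```

Run it once per password and paste each result into docker-compose.yml, using the same value for ELASTIC_PASSWORD on both services.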
Start the stack:
docker compose up -d
First startup takes 2-3 minutes while Elasticsearch initializes and Tube Archivist runs migrations. Monitor with:
docker compose logs -f tubearchivist
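If you would rather script the wait than watch logs, a generic retry loop can poll the health endpoint used by the container's healthcheck. The sketch below assumes curl is installed on the host; wait_for is a hypothetical helper name.

```shell
# Retry a command until it succeeds or the attempts run out.
# Usage: wait_for ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# wait_for is a hypothetical helper name.
wait_for() {
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    # Sleep only when another attempt remains
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Example (30 tries, 10 seconds apart):
#   wait_for 30 10 curl -fsS http://localhost:8000/api/health/ && echo "ready"
```

The function returns 0 as soon as the command succeeds and 1 if every attempt fails, so it can gate later steps in a setup script.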
Initial Setup
- Open http://your-server-ip:8000 in your browser
- Log in with the credentials set in TA_USERNAME and TA_PASSWORD
- Navigate to Settings → Downloads:
  - Set your preferred video quality (720p is a good balance of quality vs storage)
  - Configure subtitle languages if desired
  - Set the download scheduler interval
- Subscribe to your first channel:
  - Click + → Subscribe
  - Paste a YouTube channel URL
  - Tube Archivist imports the channel metadata and queues videos for download
- Start downloading:
  - Go to Downloads to see queued videos
  - Click Start Download to begin processing the queue
Configuration
Download Settings
| Setting | Recommendation | Description |
|---|---|---|
| Video quality | 720p | Best quality/storage balance. 1080p doubles storage per video. |
| Subtitle languages | en | Downloads subtitle tracks for full-text search |
| Auto-download new | Enabled | Automatically downloads new videos from subscribed channels |
| Download limit | 0 (unlimited) | Max videos to download per channel per scan. 0 = all |
| Download speed | 0 (unlimited) | Limit in KB/s. Useful to avoid YouTube throttling |
| Throttle rate | Enabled | Adds delays between downloads to reduce ban risk |
Storage Planning
YouTube video sizes vary significantly:
| Quality | Average Size per Hour | 1,000 Videos (~5 min avg) |
|---|---|---|
| 360p | ~250 MB | ~20 GB |
| 720p | ~900 MB | ~75 GB |
| 1080p | ~1.8 GB | ~150 GB |
| 4K | ~7 GB | ~580 GB |
Plan your storage accordingly. Use bind mounts instead of named volumes if you want to store media on a specific drive:
    volumes:
      - /mnt/storage/youtube:/youtube  # Large media drive
      - cache:/cache
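The estimates in the storage table above come from simple arithmetic: size per hour, times number of videos, times average length. A rough sketch for plugging in your own numbers follows; estimate_gb is a hypothetical helper, it uses integer math, and it treats 1 GB as 1000 MB, as the table does.

```shell
# Estimate library size in GB.
# Usage: estimate_gb MB_PER_HOUR VIDEO_COUNT AVG_MINUTES
# estimate_gb is a hypothetical helper; results are rounded down.
estimate_gb() {
  mb_per_hour="$1"
  videos="$2"
  avg_minutes="$3"
  echo $(( mb_per_hour * videos * avg_minutes / 60 / 1000 ))
}

# Example: 1,000 videos of 5 minutes each at 720p (~900 MB per hour)
#   estimate_gb 900 1000 5   # prints 75
```

Actual sizes depend on codec and content, so treat the result as a planning figure and leave headroom.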
Using an External Media Drive
To store videos on an external or NAS-mounted drive, replace the media volume with a bind mount:
    volumes:
      - /mnt/nas/youtube:/youtube
      - cache:/cache
Ensure the directory exists and is writable by the UID/GID specified in HOST_UID/HOST_GID.
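A quick ownership check before starting the stack can catch permission problems early. The sketch below assumes GNU stat (standard on Linux); dir_owned_by is a hypothetical helper name. Matching ownership is a sufficient condition for write access here, not the only possible one.

```shell
# Succeeds when DIR exists and is owned by UID:GID.
# Usage: dir_owned_by DIR UID GID
# dir_owned_by is a hypothetical helper; it relies on GNU stat.
dir_owned_by() {
  dir="$1"
  uid="$2"
  gid="$3"
  [ -d "$dir" ] || return 1
  [ "$(stat -c '%u:%g' "$dir")" = "$uid:$gid" ]
}

# Example, using the HOST_UID/HOST_GID values from the compose file:
#   dir_owned_by /mnt/nas/youtube 1000 1000 || echo "fix ownership first"
```

If the check fails, creating the directory with sudo mkdir -p and assigning it with sudo chown 1000:1000 is the usual remedy.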
Reverse Proxy
Tube Archivist listens on port 8000. Update TA_HOST to match your public URL when using a reverse proxy:
- TA_HOST=https://tubearchivist.yourdomain.com
For Nginx Proxy Manager, create a proxy host pointing to tubearchivist on port 8000. For Caddy:
tubearchivist.yourdomain.com {
    reverse_proxy localhost:8000
}
Backup
What to Back Up
| Volume | Priority | Description |
|---|---|---|
| es | Critical | Elasticsearch indices (search data, metadata, subscriptions) |
| media | Important | Downloaded videos (can be re-downloaded but may be slow) |
| cache | Low | Thumbnails and temporary data (auto-regenerated) |
| redis | Low | Task queue state (auto-recovers) |
Elasticsearch Snapshots
Tube Archivist has built-in Elasticsearch snapshot support:
- Go to Settings → Scheduler → Snapshot
- Enable snapshot schedule
- Snapshots are stored in the ES data volume at the snapshot repository path
Manual Backup
# Stop the stack to ensure consistency
docker compose stop
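# Back up Elasticsearch data. Compose prefixes volume names with the project
# name, so confirm the real name with "docker volume ls" first
# (tubearchivist_es below is an assumed example).
sudo tar czf es-backup.tar.gz -C "$(docker volume inspect tubearchivist_es --format '{{ .Mountpoint }}')" .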
# Back up video files (if needed)
rsync -avz /mnt/storage/youtube/ /backup/youtube/
# Restart
docker compose start
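For repeated backups, it can help to timestamp each archive and confirm it is readable before relying on it. This is a minimal sketch, assuming the stack is stopped and you have read access to the source directory; backup_dir is a hypothetical helper name.

```shell
# Create a timestamped tar.gz of SRC inside DEST_DIR and verify it.
# Usage: backup_dir SRC DEST_DIR
# Prints the archive path on success. backup_dir is a hypothetical helper.
backup_dir() {
  src="$1"
  dest_dir="$2"
  archive="$dest_dir/es-backup-$(date +%Y%m%d-%H%M%S).tar.gz"
  tar czf "$archive" -C "$src" . || return 1
  # Listing the archive confirms it can be read end to end
  tar tzf "$archive" >/dev/null || return 1
  echo "$archive"
}

# Example (run as root if SRC is a Docker volume path):
#   backup_dir /path/to/es/volume/_data /backup
```

Keeping several dated archives rather than overwriting one file gives you a fallback if the latest backup turns out to be bad.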
Troubleshooting
Elasticsearch Won’t Start
Symptom: archivist-es container exits immediately with “bootstrap check failure” or “max virtual memory areas vm.max_map_count too low.”
Fix: Set the kernel parameter:
sudo sysctl -w vm.max_map_count=262144
echo "vm.max_map_count=262144" | sudo tee -a /etc/sysctl.conf
Elasticsearch Goes Read-Only
Symptom: Downloads fail. Logs show “FORBIDDEN/12/index read-only / allow delete.” Tube Archivist dashboard shows errors.
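Fix: Elasticsearch marks indices read-only when disk usage on its data path exceeds the 95% flood-stage watermark. Free up space first; current Elasticsearch releases lift the block on their own once usage drops back below the high watermark. If the block persists, clear it manually. Port 9200 is not published to the host in the compose file above and security is enabled, so run the request inside the container as the elastic user (this assumes curl is available in the image, as it is in the official Elasticsearch image):
docker exec archivist-es curl -s -u "elastic:CHANGE_THIS_ES_PASSWORD" \
  -X PUT "http://localhost:9200/_all/_settings" \
  -H 'Content-Type: application/json' \
  -d '{"index.blocks.read_only_allow_delete": null}'
Long-term: keep at least 10% free space on the drive that holds the Elasticsearch data volume.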
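To notice a filling disk before Elasticsearch goes read-only, a periodic check along these lines may be useful. It is a sketch: both function names are hypothetical, and the 90% threshold in the example is an assumption chosen to match the 10% free-space guideline.

```shell
# Print the used-space percentage (digits only) for the filesystem holding PATH.
# disk_usage_percent and over_threshold are hypothetical helper names.
disk_usage_percent() {
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeeds when USED is at or above THRESHOLD (both whole percentages).
over_threshold() {
  [ "$1" -ge "$2" ]
}

# Example, warning at 90% used:
#   used=$(disk_usage_percent /var/lib/docker)
#   over_threshold "$used" 90 && echo "disk nearly full: ${used}%"
```

Point the first function at whichever path actually stores the Elasticsearch data volume on your host.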
ulimits Error in LXC/Proxmox
Symptom: Elasticsearch fails with “error setting rlimit” in LXC containers or Proxmox.
Fix: Remove the ulimits section from the archivist-es service in your docker-compose.yml. LXC containers don’t support changing memlock limits.
Timezone ValueError on Startup
Symptom: Tube Archivist crashes with ValueError related to timezone.
Fix: Use canonical timezone names (e.g., America/Chicago) instead of aliases (e.g., US/Central). Check valid names at /usr/share/zoneinfo/.
“Connection to Elasticsearch Failed” on Startup
Symptom: Tube Archivist logs show repeated connection failures to Elasticsearch.
Fix: Elasticsearch takes 30-60 seconds to start. Tube Archivist retries automatically. If it persists, verify the ELASTIC_PASSWORD matches between both services.
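A mismatch is easy to miss by eye, so a small script can compare the ELASTIC_PASSWORD values in the compose file. This is a sketch under stated assumptions: es_passwords_match is a hypothetical helper, and it expects passwords without spaces or double quotes, written in the KEY=value list style shown earlier.

```shell
# Succeeds when every ELASTIC_PASSWORD=... entry in FILE has the same value.
# Comment lines are ignored. es_passwords_match is a hypothetical helper.
es_passwords_match() {
  file="$1"
  count=$(grep -v '^[[:space:]]*#' "$file" \
    | grep -o 'ELASTIC_PASSWORD=[^"[:space:]]*' \
    | sort -u | wc -l)
  [ "$count" -eq 1 ]
}

# Example:
#   es_passwords_match docker-compose.yml || echo "ELASTIC_PASSWORD values differ"
```

The check also fails when no ELASTIC_PASSWORD entry is found at all, which flags a missing variable as well as a mismatch.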
Resource Requirements
- RAM: 2 GB minimum (1 GB for ES heap + 512 MB for app + 256 MB for Redis), 4 GB recommended
- CPU: Dual-core minimum, quad-core recommended for concurrent downloads and indexing
- Disk: 20 GB for application data. Video storage depends on your library size — see storage planning table above
Verdict
Tube Archivist is the best self-hosted tool for building a personal YouTube archive. The combination of channel subscriptions, automated downloads, full-text search across subtitles, and a clean watching interface makes it genuinely useful — not just a yt-dlp wrapper with a web UI.
The main cost is resources: Elasticsearch is hungry (1 GB+ RAM just for search), and video storage grows fast at higher quality settings. If you just want to watch YouTube without ads and tracking (but not download videos), Invidious is a better fit. If you want to archive specific videos for offline viewing, Tube Archivist is the right tool.
Frequently Asked Questions
Is Tube Archivist legal to use?
Tube Archivist downloads YouTube videos for personal archival purposes. Whether this is legal depends on your jurisdiction. Some countries permit copying for personal use under private-copying or fair-use rules, but redistributing downloaded content is generally not allowed. Tube Archivist is designed for personal media archival, not piracy.
How much disk space do I need for a YouTube archive?
It depends on video quality. At 720p, expect roughly 900 MB per hour of video. At 1080p, about 1.8 GB per hour. At 4K, around 7 GB per hour. An archive of 500 videos at 720p averaging 5 minutes each would use around 40 GB, and longer videos scale proportionally. Plan your storage based on the quality settings and channel sizes you intend to archive.
Can I watch archived videos without YouTube?
Yes. Tube Archivist has its own built-in media player with a clean watching interface. Videos play directly from your server with no connection to YouTube needed. You can also integrate with Jellyfin for a more polished media center experience, or use Plex via community plugins.
Why does Tube Archivist need Elasticsearch?
Elasticsearch provides full-text search across video titles, descriptions, and subtitles. It enables instant search across your entire archive, including searching within subtitle text. The downside is Elasticsearch’s memory footprint — it requires at least 1 GB of RAM dedicated to its heap.
How does Tube Archivist compare to Invidious?
Invidious is a YouTube frontend proxy — it lets you watch YouTube without tracking but doesn’t download or store videos. Tube Archivist downloads and archives videos locally for offline access. Use Invidious for private YouTube browsing; use Tube Archivist for building a permanent local video library.
Can I subscribe to channels and auto-download new videos?
Yes. Tube Archivist supports channel subscriptions with configurable download schedules. You can subscribe to channels or playlists, and the built-in scheduler will automatically download new uploads according to your configured frequency and quality settings.