Install Paperless-ngx on Ubuntu Server
Why Ubuntu for Paperless-ngx
Ubuntu Server is a common choice for self-hosted deployments, and Paperless-ngx runs well on it. The full stack (Paperless, PostgreSQL, and Redis) runs in containers managed by Docker Compose. Ubuntu's mature package ecosystem makes it easy to set up scanner integration, network shares, and automated backups around the container stack.
Prerequisites
- Ubuntu 22.04 or 24.04 LTS server
- Docker and Docker Compose installed (see Install Docker below)
- 4 GB RAM minimum (OCR is memory-intensive)
- 20 GB free disk space (more for large document libraries)
- A scanner that can output to a network folder (optional, for auto-import)
- Root or sudo access
Install Docker
If Docker is not already installed:
sudo apt update
sudo apt install -y ca-certificates curl gnupg
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg
echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt update
sudo apt install -y docker-ce docker-ce-cli containerd.io docker-compose-plugin
sudo usermod -aG docker $USER
Log out and back in for the group change to take effect.
Docker Compose Configuration
Create the project directory:
mkdir -p ~/paperless && cd ~/paperless
Create docker-compose.yml:
services:
  paperless:
    image: ghcr.io/paperless-ngx/paperless-ngx:2.20.9
    container_name: paperless
    restart: unless-stopped
    depends_on:
      db:
        condition: service_healthy
      redis:
        condition: service_started
    ports:
      - "8000:8000"
    environment:
      PAPERLESS_DBHOST: db
      PAPERLESS_DBPORT: 5432
      PAPERLESS_DBNAME: paperless
      PAPERLESS_DBUSER: paperless
      PAPERLESS_DBPASS: paperless_db_password # Change this
      PAPERLESS_REDIS: redis://redis:6379
      PAPERLESS_SECRET_KEY: change-this-to-a-long-random-string # Generate with: openssl rand -hex 32
      PAPERLESS_URL: http://localhost:8000 # Change to your domain/IP
      PAPERLESS_ADMIN_USER: admin # Initial admin username
      PAPERLESS_ADMIN_PASSWORD: admin # Change this -- initial admin password
      PAPERLESS_OCR_LANGUAGE: eng # See OCR Language section below
      PAPERLESS_TIME_ZONE: America/New_York # Your timezone
      PAPERLESS_CONSUMER_POLLING: 30 # Check consume folder every 30 seconds
      PAPERLESS_CONSUMER_RECURSIVE: "true" # Process subdirectories in consume folder
      PAPERLESS_CONSUMER_SUBDIRS_AS_TAGS: "true" # Create tags from subfolder names
      PAPERLESS_TASK_WORKERS: 2 # Number of OCR workers (adjust based on CPU cores)
      PAPERLESS_THREADS_PER_WORKER: 2 # Threads per OCR worker
      USERMAP_UID: 1000
      USERMAP_GID: 1000
    volumes:
      - paperless_data:/usr/src/paperless/data
      - paperless_media:/usr/src/paperless/media
      - ./consume:/usr/src/paperless/consume # Drop PDFs here for auto-import
      - ./export:/usr/src/paperless/export # For document exports/backups
    networks:
      - paperless-net

  db:
    image: postgres:16-alpine
    container_name: paperless-db
    restart: unless-stopped
    environment:
      POSTGRES_USER: paperless
      POSTGRES_PASSWORD: paperless_db_password # Must match PAPERLESS_DBPASS
      POSTGRES_DB: paperless
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD", "pg_isready", "-U", "paperless"]
      interval: 10s
      timeout: 5s
      retries: 5
    networks:
      - paperless-net

  redis:
    image: redis:7-alpine
    container_name: paperless-redis
    restart: unless-stopped
    volumes:
      - redis_data:/data
    networks:
      - paperless-net

volumes:
  paperless_data:
  paperless_media:
  postgres_data:
  redis_data:

networks:
  paperless-net:
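The compose file above ships with placeholder values for the database password and the secret key. The following is a minimal sketch, not part of the original setup, that swaps both placeholders for random hex strings. The function names gen_secret and fill_secrets are hypothetical, and the sketch assumes GNU sed (the default on Ubuntu) and that the placeholder strings are still exactly as written in this guide.

```shell
#!/usr/bin/env bash
# Sketch: replace the placeholder credentials in docker-compose.yml.

# gen_secret prints 64 hex characters (32 random bytes).
gen_secret() {
  if command -v openssl >/dev/null 2>&1; then
    openssl rand -hex 32
  else
    # Fallback when openssl is not installed
    head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
    echo
  fi
}

# fill_secrets FILE rewrites the two placeholders used in this guide, in place.
# Both occurrences of the database password receive the same value.
fill_secrets() {
  local file="$1"
  local db_pass secret_key
  db_pass="$(gen_secret)"
  secret_key="$(gen_secret)"
  sed -i \
    -e "s/paperless_db_password/${db_pass}/g" \
    -e "s/change-this-to-a-long-random-string/${secret_key}/g" \
    "$file"
}

# Usage:
#   fill_secrets ~/paperless/docker-compose.yml
```

Run it once before the first docker compose up -d. PostgreSQL only reads POSTGRES_PASSWORD when it initializes an empty data volume, so changing the value later does not update an existing database. PAPERLESS_ADMIN_PASSWORD still needs to be changed by hand.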
Create the consume and export directories:
mkdir -p consume export
Start the stack:
docker compose up -d
First startup takes 1-2 minutes as the database initializes and Paperless runs migrations. Monitor progress:
docker compose logs -f paperless
Wait for Listening on 0.0.0.0:8000 before accessing the web UI.
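Instead of watching the logs by hand, a script can poll the web port until it answers. This is a minimal sketch under the assumption that curl is installed (it is part of the Docker install step above) and that the UI is published on localhost:8000 as in this guide. The function name wait_for_http is hypothetical.

```shell
#!/usr/bin/env bash
# wait_for_http URL [TIMEOUT_SECONDS]
# Polls URL every 2 seconds until it responds without an HTTP error,
# or gives up once roughly TIMEOUT_SECONDS of sleep time has passed.
wait_for_http() {
  local url="$1" timeout="${2:-180}" waited=0
  until curl -fs -o /dev/null --max-time 5 "$url"; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "Timed out after ${timeout}s waiting for ${url}" >&2
      return 1
    fi
    sleep 2
    waited=$((waited + 2))
  done
  echo "${url} is responding"
}

# Usage:
#   docker compose up -d && wait_for_http http://localhost:8000 180
```

The timeout only counts the sleep intervals, so a server that accepts connections but answers slowly can stretch the total wait somewhat beyond the given value.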
OCR Language Configuration
Paperless-ngx uses Tesseract for OCR. The default eng (English) language is included. For additional languages:
# Single additional language
PAPERLESS_OCR_LANGUAGE: eng+deu # English + German
# Multiple languages
PAPERLESS_OCR_LANGUAGE: eng+deu+fra+spa # English, German, French, Spanish
Common language codes: eng (English), deu (German), fra (French), spa (Spanish), ita (Italian), por (Portuguese), nld (Dutch), jpn (Japanese), chi_sim (Simplified Chinese), kor (Korean).
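The official image ships with a limited set of Tesseract language packs: English, German, French, Italian, and Spanish. For any other language, list its code in PAPERLESS_OCR_LANGUAGES (plural, space-separated) so the container installs the pack at startup, then reference it in PAPERLESS_OCR_LANGUAGE as usual. Adding more languages slightly increases OCR processing time.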
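To check which language packs a running container actually has, Tesseract can list them with its --list-langs flag. The sketch below compares that list against the value you plan to use for PAPERLESS_OCR_LANGUAGE. It assumes the tesseract binary is on the container's PATH, and missing_langs is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# missing_langs REQUESTED
# Reads installed language codes on stdin (one per line, as printed by
# `tesseract --list-langs`) and prints every code from REQUESTED
# (for example "eng+deu+jpn") that is not installed.
missing_langs() {
  local requested="$1" installed code
  installed="$(cat)"
  printf '%s\n' "$requested" | tr '+' '\n' | while IFS= read -r code; do
    # -x matches whole lines only, so the header line never counts as a hit
    if ! printf '%s\n' "$installed" | grep -qxF -- "$code"; then
      printf '%s\n' "$code"
    fi
  done
}

# Usage:
#   docker exec paperless tesseract --list-langs | missing_langs "eng+deu+jpn"
```

Empty output means every requested code is available. Any code that is printed has to be installed before OCR can use it.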
First-Time Setup
Open http://your-server-ip:8000 in your browser. Log in with the admin credentials set in PAPERLESS_ADMIN_USER and PAPERLESS_ADMIN_PASSWORD.
The first things to configure:
- Upload a test document — drag and drop a PDF onto the dashboard. Paperless processes it with OCR and adds it to the library.
- Create tags — organize documents by type (invoice, receipt, contract, etc.)
- Create correspondents — track who sent/received documents
- Create document types — categorize documents
- Set up matching rules — auto-assign tags and correspondents based on document content
Consume Folder Setup
The consume folder (./consume mapped to /usr/src/paperless/consume inside the container) is the primary automation mechanism. Any PDF, PNG, JPG, or TIFF dropped into this folder is automatically imported, OCR-processed, and added to your library.
Scanner Integration
Configure your scanner to save directly to the consume folder:
Network scanner (Samba/SMB share):
sudo apt install -y samba
# Add to /etc/samba/smb.conf:
sudo tee -a /etc/samba/smb.conf > /dev/null <<EOF
[paperless-consume]
path = /home/$USER/paperless/consume
browseable = yes
writable = yes
valid users = $USER
create mask = 0644
directory mask = 0755
EOF
# Set Samba password
sudo smbpasswd -a $USER
# Restart Samba
sudo systemctl restart smbd
Point your scanner to \\server-ip\paperless-consume.
Subdirectories as tags: With PAPERLESS_CONSUMER_SUBDIRS_AS_TAGS: "true", you can create folders like consume/invoices/ and consume/receipts/ — documents dropped in each get tagged automatically.
inotify Watches
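Paperless watches the consume folder with inotify only when polling is disabled (PAPERLESS_CONSUMER_POLLING set to 0 or removed). In that mode, a consume folder with many subdirectories can exceed the default inotify watch limit. Increase it: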
echo "fs.inotify.max_user_watches=524288" | sudo tee -a /etc/sysctl.conf
sudo sysctl -p
This prevents “inotify watch limit reached” errors when Paperless monitors many files.
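Appending with tee -a adds a duplicate line every time the command is repeated. As an alternative sketch, the helper below writes a key to a sysctl file exactly once, replacing any earlier value. The function name set_sysctl_entry and the drop-in file name are hypothetical. Loading files from /etc/sysctl.d with sysctl --system is standard on Ubuntu.

```shell
#!/usr/bin/env bash
# set_sysctl_entry FILE KEY VALUE
# Ensures FILE contains exactly one "KEY=VALUE" line for KEY.
set_sysctl_entry() {
  local file="$1" key="$2" value="$3" tmp
  tmp="$(mktemp)"
  # Keep every existing line except earlier assignments of this key
  if [ -f "$file" ]; then
    grep -v "^[[:space:]]*${key}[[:space:]]*=" "$file" > "$tmp" || true
  fi
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  mv "$tmp" "$file"
}

# Usage (build the file locally, then install it as root):
#   set_sysctl_entry ./99-paperless-inotify.conf fs.inotify.max_user_watches 524288
#   sudo install -m 0644 ./99-paperless-inotify.conf /etc/sysctl.d/
#   sudo sysctl --system
```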
UFW Firewall Rules
# Paperless web UI
sudo ufw allow 8000/tcp comment 'Paperless-ngx'
# If using Samba for scanner integration
sudo ufw allow samba comment 'Samba for Paperless consume'
# If behind a reverse proxy
sudo ufw allow 443/tcp comment 'HTTPS'
sudo ufw status
Backup Strategy
Paperless stores data in three locations that all need backing up:
- PostgreSQL database: document metadata, tags, correspondents, matching rules
- Media volume: original documents, OCR-processed (archived) versions, and thumbnails
- Data volume: search index and the trained classification model
Using Paperless Built-in Export
Paperless has a built-in document exporter that creates a portable backup:
docker exec paperless document_exporter /usr/src/paperless/export
This writes all documents and metadata to the ./export directory. Schedule it with cron:
# Daily export at 1 AM
0 1 * * * docker exec paperless document_exporter /usr/src/paperless/export
Database Backup
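#!/bin/bash
# backup-paperless.sh: dump the database and archive the media volume
set -euo pipefail  # Abort on the first failed command

BACKUP_DIR="/opt/backups/paperless/$(date +%Y-%m-%d)"
mkdir -p "$BACKUP_DIR"

# Database dump (plain SQL, restorable with psql)
docker exec paperless-db pg_dump -U paperless paperless > "$BACKUP_DIR/paperless-db.sql"

# Media files: mount the named volume read-only and archive it.
# Compose names volumes <project>_<volume>; the project defaults to the directory name (paperless).
docker run --rm \
  -v paperless_paperless_media:/media:ro \
  -v "$BACKUP_DIR":/backup \
  alpine tar czf /backup/paperless-media.tar.gz -C / media

echo "Backup complete: $BACKUP_DIR"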
See Backup Strategy for the full 3-2-1 approach.
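Daily backups in dated directories accumulate until the disk fills. The following is a minimal retention sketch, not part of the original script. It assumes the YYYY-MM-DD directory layout that backup-paperless.sh creates under /opt/backups/paperless, and prune_backups is a hypothetical function name.

```shell
#!/usr/bin/env bash
# prune_backups ROOT KEEP
# Deletes all but the newest KEEP date-named directories (YYYY-MM-DD) in ROOT.
# Anything that does not match the date pattern is left alone.
prune_backups() {
  local root="$1" keep="$2"
  # Date-named directories sort chronologically; list newest first
  # and skip the first KEEP entries.
  find "$root" -mindepth 1 -maxdepth 1 -type d -name '????-??-??' \
    | sort -r \
    | tail -n +"$((keep + 1))" \
    | while IFS= read -r dir; do
        rm -rf -- "$dir"
      done
}

# Usage: keep the 14 most recent daily backups
#   prune_backups /opt/backups/paperless 14
```

Calling it at the end of the backup script, after the new backup has completed, avoids deleting old copies when the current run failed.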
HTTPS via Reverse Proxy
For remote access, put Paperless behind a reverse proxy with SSL. Update the environment variable:
PAPERLESS_URL: https://docs.yourdomain.com
Then configure your reverse proxy to forward to localhost:8000. See Reverse Proxy Setup.
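Important: Paperless uses WebSocket connections for real-time task updates. Your reverse proxy must support WebSocket proxying. Nginx Proxy Manager handles this by default. For raw Nginx, add the following to the location block that proxies to Paperless (WebSocket upgrades require HTTP/1.1 to the backend):
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";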
Troubleshooting
OCR Produces Garbled Text
Symptom: Document content is recognized but text is wrong or mixed characters.
Fix: Set the correct OCR language. If your documents are in German, PAPERLESS_OCR_LANGUAGE: eng will produce garbage. Set it to deu or eng+deu for mixed-language documents.
Consume Folder Not Processing Files
Symptom: Files sit in the consume folder and are not imported.
Fix: Check permissions. The container runs as UID/GID set by USERMAP_UID/USERMAP_GID. The consume folder must be writable by that user:
sudo chown -R 1000:1000 ~/paperless/consume
Also confirm that PAPERLESS_CONSUMER_POLLING is set, and check the container logs for errors:
docker compose logs paperless | grep -i consume
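Before changing ownership, it can help to confirm that the mismatch is real. The sketch below compares a directory's numeric owner with the expected UID and GID. It assumes GNU stat (the default on Ubuntu), and check_owner is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# check_owner DIR UID GID
# Returns 0 if DIR is owned by UID:GID, 1 on a mismatch, 2 if DIR cannot be read.
check_owner() {
  local dir="$1" want="$2:$3" have
  have="$(stat -c '%u:%g' "$dir")" || return 2
  if [ "$have" != "$want" ]; then
    echo "$dir is owned by $have, expected $want" >&2
    return 1
  fi
}

# Usage: the values must match USERMAP_UID and USERMAP_GID in docker-compose.yml
#   check_owner ~/paperless/consume 1000 1000 && echo "ownership OK"
```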
Database Connection Errors on Startup
Symptom: Paperless logs show could not connect to server: Connection refused.
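Fix: PostgreSQL may not be ready. The depends_on condition with the health check should handle this. If the error persists, keep that condition and give the database more time by raising retries or adding a start_period to the db healthcheck:
  db:
    healthcheck:
      test: ["CMD", "pg_isready", "-U", "paperless"]
      interval: 10s
      timeout: 5s
      retries: 10
      start_period: 30s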
Verify the database is healthy:
docker compose ps
High Memory Usage During OCR
Symptom: System becomes unresponsive during document processing. OOM killer terminates processes.
Fix: Reduce the number of OCR workers:
PAPERLESS_TASK_WORKERS: 1
PAPERLESS_THREADS_PER_WORKER: 1
Each worker uses 300-500 MB during OCR processing. On a 4 GB system, 2 workers is the safe maximum.
Search Not Finding Documents
Symptom: Documents exist but full-text search returns no results.
Fix: The search index may need rebuilding:
docker exec paperless document_index reindex
This can take a while for large libraries. Check progress in the Paperless web UI under Tasks.
Resource Requirements
- RAM: ~500 MB idle, 1-2 GB during active OCR processing (per worker)
- CPU: Low idle. OCR processing is CPU-intensive — each page takes 2-10 seconds on modern x86 hardware
- Disk: 500 MB for the application, plus document storage (plan 1-5 MB per page depending on originals)
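As a worked example of the disk figures above, the sketch below turns a page count into a low and high storage estimate, using the 500 MB application footprint and the 1-5 MB per page range from this section. The function name estimate_storage_mb is hypothetical, and the result is only as good as that per-page range.

```shell
#!/usr/bin/env bash
# estimate_storage_mb PAGES
# Prints "<low> <high>" in MB: 500 MB for the application
# plus 1 MB (low) or 5 MB (high) per page.
estimate_storage_mb() {
  local pages="$1"
  echo "$((500 + pages * 1)) $((500 + pages * 5))"
}

estimate_storage_mb 10000   # prints: 10500 50500
```

For a library of 10,000 pages this gives roughly 10 to 50 GB, which is why the 20 GB prerequisite is only a starting point for larger archives.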