Best Self-Hosted Document Management Systems

Updated March 2026: Verified with latest Docker images and configurations.

Two Different Problems

“Document management” covers two distinct use cases, and the best tool depends on which one you need:

Problem	Best Tool	What It Does
Archive and search documents	Paperless-ngx	Scans, OCRs, tags, and organizes documents permanently
Edit and manipulate PDFs	Stirling-PDF	Merge, split, convert, compress, sign PDFs on demand

These tools complement each other rather than compete. Most self-hosters who care about documents run both — Paperless-ngx as the permanent archive, Stirling-PDF as the workbench.

Paperless-ngx — Best Document Archive

Paperless-ngx is a document management system that ingests physical and digital documents, OCRs them, applies machine-learning-based categorization, and makes everything searchable. Drop a scanned receipt into the consumption folder and Paperless reads it, extracts the text, suggests a title, assigns a correspondent and document type, and files it. Months later, search “electric bill 2025” and find it instantly.

The consumption pipeline is the core feature. Point Paperless at a folder or an IMAP mailbox and it processes everything that arrives automatically. The ML-based classification improves over time: after you correct a few categorizations, it starts getting them right on its own.

The web interface is well-designed for browsing and searching a document archive. Filter by correspondent, document type, tag, date range, or full-text content. Preview PDFs inline. Download originals or the OCR-processed versions.

Pros:

Automatic OCR on every ingested document (Tesseract, plus optional Tika for Office files)
ML-based auto-classification (correspondent, document type, tags)
Multiple consumption sources: watched folder, web upload, REST API
Full-text search across all documents
Clean web UI with inline PDF preview
Email-based document submission
Workflow rules for automatic processing
Audit trail — original files preserved alongside OCR versions
Active development with frequent releases

Cons:

Resource-heavy: the recommended stack runs PostgreSQL, Redis, and the app itself
~500 MB RAM minimum for the full stack
OCR processing is CPU-intensive during ingestion
Initial categorization requires manual correction to train the ML
No PDF editing capabilities (can’t merge, split, or modify PDFs)
Setup is involved — several services to configure

Docker Compose:

services:
  paperless:
    image: ghcr.io/paperless-ngx/paperless-ngx:2.20.11
    container_name: paperless
    ports:
      - "8000:8000"
    environment:
      - PAPERLESS_REDIS=redis://paperless-redis:6379
      - PAPERLESS_DBHOST=paperless-db
      - PAPERLESS_DBUSER=paperless
      - PAPERLESS_DBPASS=paperless_secret     # Change this
      - PAPERLESS_SECRET_KEY=change-me-long-random-string  # Change this
      - PAPERLESS_OCR_LANGUAGE=eng
      - PAPERLESS_TIME_ZONE=America/New_York
      - PAPERLESS_ADMIN_USER=admin             # Change this
      - PAPERLESS_ADMIN_PASSWORD=changeme      # Change this
    volumes:
      - paperless-data:/usr/src/paperless/data
      - paperless-media:/usr/src/paperless/media
      - paperless-export:/usr/src/paperless/export
      - paperless-consume:/usr/src/paperless/consume  # Drop files here (swap for a bind mount such as ./consume for easy host access)
    depends_on:
      - paperless-db
      - paperless-redis
    restart: unless-stopped

  paperless-db:
    image: postgres:16-alpine
    container_name: paperless-db
    environment:
      - POSTGRES_USER=paperless
      - POSTGRES_PASSWORD=paperless_secret     # Match above
      - POSTGRES_DB=paperless
    volumes:
      - paperless-pgdata:/var/lib/postgresql/data
    restart: unless-stopped

  paperless-redis:
    image: redis:7-alpine
    container_name: paperless-redis
    restart: unless-stopped

volumes:
  paperless-data:
  paperless-media:
  paperless-export:
  paperless-consume:
  paperless-pgdata:

Resources: ~500 MB RAM (app + PostgreSQL + Redis). CPU spikes during OCR processing. Storage depends on document volume: Paperless keeps both the original and an OCR-processed copy, so plan for roughly twice the size of your originals.
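
The values marked "Change this" in the compose file should be replaced before the first start. A minimal sketch for generating them with Python's standard library, assuming any long random string is acceptable for PAPERLESS_SECRET_KEY and the database password (the function name is illustrative, not part of Paperless-ngx):

```python
import secrets


def generate_secret(num_bytes: int = 48) -> str:
    """Return a URL-safe random string built from num_bytes of OS randomness."""
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    # Paste these values into the environment sections of the compose file.
    print("PAPERLESS_SECRET_KEY=" + generate_secret())
    print("PAPERLESS_DBPASS=" + generate_secret(24))
```

Whatever value you use for PAPERLESS_DBPASS must also be set as POSTGRES_PASSWORD on the database service.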

[Read our full guide: How to Self-Host Paperless-ngx]

Stirling-PDF — Best PDF Toolkit

Stirling-PDF is a self-hosted PDF manipulation toolkit. Merge multiple PDFs, split a PDF into pages, convert between formats (PDF to Word, images to PDF, HTML to PDF), compress files, add watermarks, rotate pages, add/remove passwords, flatten forms, and perform OCR on scanned documents. Over 40 PDF operations in a single web interface.

It’s not a document archive: Stirling-PDF doesn’t store your files. You open it when you need to do something to a PDF, process the file, download the result, and close the tab. Think of it as a self-hosted replacement for iLovePDF, SmallPDF, or Adobe Acrobat’s online tools.

The security benefit is significant. Instead of uploading sensitive documents (tax returns, contracts, medical records) to cloud PDF services, process them on your own server. Files never leave your infrastructure.

Pros:

40+ PDF operations in one tool
No file storage — process and download, nothing persists
OCR support via Tesseract
PDF/A conversion for archival compliance
Digital signature support
API for automation (batch processing via scripts)
Single container, no dependencies
Lightweight (~100 MB RAM)
Active development

Cons:

Not a document management system — no search, no categorization
Files are processed transiently — no history or audit trail
OCR quality depends on document scanning quality
Some advanced operations (e.g., PDF forms) are less polished
No batch upload via web UI (API-only for batch processing)

Docker Compose:

services:
  stirling-pdf:
    image: stirlingtools/stirling-pdf:2.7.3
    container_name: stirling-pdf
    ports:
      - "8080:8080"
    volumes:
      - stirling-data:/usr/share/tessdata       # OCR language data
      - stirling-configs:/configs               # Custom configurations
    environment:
      - DOCKER_ENABLE_SECURITY=false
      - LANGS=en_US                              # UI language
    restart: unless-stopped

volumes:
  stirling-data:
  stirling-configs:

Resources: ~100 MB RAM idle. CPU spikes during heavy operations (OCR, large file conversion). Minimal disk.

[Read our full guide: How to Self-Host Stirling-PDF]

Comparison

Feature	Paperless-ngx	Stirling-PDF
Primary purpose	Document archive + search	PDF manipulation toolkit
OCR	Automatic on ingestion	On-demand per file
Document storage	Yes (permanent archive)	No (transient processing)
Full-text search	Yes	No
Auto-categorization	Yes (ML-based)	No
PDF editing	No	Yes (40+ operations)
Format conversion	No	Yes (PDF ↔ Word, images, HTML)
Merge/split PDFs	No	Yes
Digital signatures	No	Yes
API	REST API	REST API
Multi-user	Yes (permissions per document)	Yes (basic auth)
RAM usage	~500 MB	~100 MB
Docker containers	3 (app + PostgreSQL + Redis)	1
License	GPL-3.0	MIT (core)

Run Both

The ideal document workflow uses both tools:

Stirling-PDF to prepare documents — merge related pages, OCR scanned documents, convert formats, compress large files
Paperless-ngx to archive and organize the prepared documents — auto-categorize, make searchable, store permanently

Both run on Docker and the combined overhead is manageable (~600 MB RAM total). Set up a workflow where Stirling-PDF’s output feeds into Paperless-ngx’s consumption folder for automatic processing.
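
The hand-off can be as simple as a small script that moves Stirling-PDF's downloaded output into the consumption folder. The sketch below assumes the consumption folder is reachable from the host (for example through a bind mount) and that the staging and consumption directories sit on the same filesystem. The function name and all paths are hypothetical.

```python
import os
import shutil
from pathlib import Path


def hand_off(pdf_path: str, staging_dir: str, consume_dir: str) -> Path:
    """Copy a prepared PDF into the consume folder without exposing a partial file."""
    src = Path(pdf_path)
    staging = Path(staging_dir)
    consume = Path(consume_dir)
    staging.mkdir(parents=True, exist_ok=True)
    consume.mkdir(parents=True, exist_ok=True)

    # The slow copy happens outside the watched folder.
    staged = staging / src.name
    shutil.copy2(src, staged)

    # The rename is atomic when both directories share a filesystem,
    # so the watcher only ever sees the complete file.
    final = consume / src.name
    os.replace(staged, final)
    return final
```

A call might look like hand_off("merged.pdf", "./staging", "./consume"). Staging first is a design choice: it avoids the watcher picking up a PDF that is still being written.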

Frequently Asked Questions

Can Paperless-ngx read handwritten documents?

Paperless-ngx uses Tesseract OCR, which works well on printed text but poorly on handwriting. If your handwritten documents are neat and high-contrast, some text may be recognized. For reliable handwriting recognition, you’d need to pre-process documents through a specialized service before importing into Paperless-ngx.

How do I get documents into Paperless-ngx?

There are several ways: drop files into the consumption folder (a directory Paperless watches), email documents to a mailbox that Paperless polls over IMAP, use the web upload interface, or use a third-party mobile app. Many users connect a network scanner that saves directly to the consumption folder, so a scanned receipt is automatically OCR’d, categorized, and filed.
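
Scripts can also submit documents through the REST API. The sketch below builds the upload request without sending it. It assumes token authentication and the post_document endpoint with a "document" form field, which is how the Paperless-ngx API documentation describes uploads. The base URL, token, and function name are placeholders.

```python
import uuid
import urllib.request
from pathlib import Path


def build_upload_request(base_url: str, token: str, pdf_path: str) -> urllib.request.Request:
    """Build (but do not send) a multipart POST that uploads one document."""
    path = Path(pdf_path)
    boundary = uuid.uuid4().hex

    # Hand-rolled multipart body with a single file part named "document".
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="document"; filename="{path.name}"\r\n'.encode(),
        b"Content-Type: application/pdf\r\n\r\n",
        path.read_bytes(),
        f"\r\n--{boundary}--\r\n".encode(),
    ])

    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/documents/post_document/",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

To send it, pass the result to urllib.request.urlopen, for example with "http://localhost:8000" as the base URL and an API token created for your user.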

Is Stirling-PDF safe for sensitive documents?

Yes — that’s one of its main advantages. Stirling-PDF processes files on your server and doesn’t store them after processing. Unlike cloud services (SmallPDF, iLovePDF), your tax returns, contracts, and medical records never leave your infrastructure. The processed file is returned to your browser and then discarded.

Can I use both Paperless-ngx and Stirling-PDF together?

Yes, and this is the recommended setup. Use Stirling-PDF as a workbench to prepare documents (merge, split, compress, OCR) and then feed the results into Paperless-ngx’s consumption folder for permanent archival and search. The combined overhead is about 600 MB RAM.

How much storage does a document archive need?

It depends on your volume. A typical household generating 50-100 documents per year needs a few GB. Paperless-ngx stores both the original file and an OCR-processed version, roughly doubling storage per document. A 10-year archive of household documents (receipts, bills, medical, tax) typically fits in 10-20 GB.
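
As a rough worked example, the estimate is just documents times years times average size times the number of stored copies. The 5 MB average below is an assumption for multi-page scans, not a measured figure, and text-only PDFs are usually far smaller.

```python
def estimate_archive_gb(docs_per_year: int, years: int, avg_original_mb: float,
                        copies_per_doc: float = 2.0) -> float:
    """Estimate archive size in decimal GB (1 GB = 1000 MB).

    copies_per_doc defaults to 2 because the original and the
    OCR-processed version are both kept.
    """
    total_mb = docs_per_year * years * avg_original_mb * copies_per_doc
    return total_mb / 1000


# 100 documents per year for 10 years at an assumed 5 MB per scan
print(estimate_archive_gb(100, 10, 5))  # → 10.0
```

Doubling the assumed scan size to 10 MB gives 20 GB, which matches the upper end of the range quoted above.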

Does Paperless-ngx support multiple users?

Yes — Paperless-ngx has user accounts with permissions. You can control who can view, edit, or delete documents. Documents can be assigned to specific users or shared across the instance. This works well for households where each person manages their own documents but shares some categories.
