How to Self-Host Typesense with Docker Compose

What Is Typesense?

Typesense is an in-memory search engine designed for sub-millisecond search latency. It stores the entire search index in RAM for maximum speed, with automatic typo tolerance, faceting, and geo-search built in. Written in C++, it’s designed as a faster, easier alternative to Elasticsearch for application search. Think of it as Algolia that you self-host.

Prerequisites

  • A Linux server (Ubuntu 22.04+ recommended)
  • Docker and Docker Compose installed (guide)
  • 1 GB+ RAM (scales with index size — 2-3x indexed data size)
  • 5 GB+ free disk space
  • No GPU required
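
As a rough planning aid, the 2-3x rule of thumb above can be turned into simple arithmetic. This is a minimal sketch; the function name estimate_ram_mb is made up for illustration, and the real footprint depends on your schema and data.

```shell
# Rough RAM estimate from the 2-3x rule of thumb.
# Argument: size of the indexed fields in MB (integer).
# estimate_ram_mb is a hypothetical helper, not part of Typesense.
estimate_ram_mb() {
  local indexed_mb="$1"
  local low=$(( indexed_mb * 2 ))
  local high=$(( indexed_mb * 3 ))
  echo "${low}-${high}"
}

# Example: 1024 MB of indexed fields
estimate_ram_mb 1024   # prints 2048-3072
```

Treat the result as a starting point and check actual memory use once real data is loaded.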

Docker Compose Configuration

Create a docker-compose.yml file:

services:
  typesense:
    image: typesense/typesense:30.1
    container_name: typesense
    ports:
      - "8108:8108"
    volumes:
      - typesense_data:/data
    command: >
      --data-dir /data
      --api-key=your-api-key-change-this-to-something-strong
      --enable-cors
    restart: unless-stopped

volumes:
  typesense_data:

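Note: This example configures Typesense with CLI arguments. Each flag also has an environment-variable form with a TYPESENSE_ prefix (for example, TYPESENSE_API_KEY for --api-key) if you prefer to keep the key out of the command line.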
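
The placeholder API key in the Compose file should be replaced with a random value before the first start. One possible way to generate one is sketched below; the helper name generate_api_key is an assumption for illustration, and any sufficiently long random string works.

```shell
# Generate a random 64-character hex string to use as the API key.
# generate_api_key is a hypothetical helper name.
generate_api_key() {
  # 32 random bytes, printed as hex with spaces and newlines stripped
  head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
}

generate_api_key
echo
```

A hex-only key avoids characters that would need quoting in the Compose file or in the curl commands used later in this guide.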

Start the stack:

docker compose up -d

Initial Setup

Verify the Server

curl -H "X-TYPESENSE-API-KEY: your-api-key-change-this-to-something-strong" \
  http://localhost:8108/health
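
The server can take a moment to load its data after the container starts, so scripts often poll the health endpoint instead of checking it once. The sketch below assumes the endpoint answers with a JSON body containing "ok": true when ready; the function names are hypothetical.

```shell
# Returns success if a /health response body reports ok=true.
is_healthy_response() {
  printf '%s' "$1" | grep -Eq '"ok"[[:space:]]*:[[:space:]]*true'
}

# Poll /health until the server is ready or the attempts run out.
# Usage: wait_for_typesense [base_url] [max_attempts]
wait_for_typesense() {
  local base_url="${1:-http://localhost:8108}"
  local attempts="${2:-30}"
  local i=0
  while [ "$i" -lt "$attempts" ]; do
    if is_healthy_response "$(curl -s "$base_url/health")"; then
      echo "Typesense is ready"
      return 0
    fi
    i=$(( i + 1 ))
    sleep 1
  done
  echo "Typesense did not become healthy in time" >&2
  return 1
}
```

Calling wait_for_typesense right after docker compose up -d keeps later setup steps from racing the server startup.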

Create a Collection (Schema)

Unlike Meilisearch, Typesense expects a collection schema up front (automatic schema detection exists, but an explicit schema is the usual choice):

curl -X POST http://localhost:8108/collections \
  -H "X-TYPESENSE-API-KEY: your-api-key-change-this-to-something-strong" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "movies",
    "fields": [
      {"name": "title", "type": "string"},
      {"name": "genre", "type": "string", "facet": true},
      {"name": "year", "type": "int32", "facet": true},
      {"name": "rating", "type": "float", "optional": true}
    ],
    "default_sorting_field": "year"
  }'

Index Documents

curl -X POST http://localhost:8108/collections/movies/documents/import \
  -H "X-TYPESENSE-API-KEY: your-api-key-change-this-to-something-strong" \
  -H "Content-Type: text/plain" \
  --data-binary '{"title": "The Matrix", "genre": "sci-fi", "year": 1999, "rating": 8.7}
{"title": "Interstellar", "genre": "sci-fi", "year": 2014, "rating": 8.7}
{"title": "The Dark Knight", "genre": "action", "year": 2008, "rating": 9.0}'
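
The import endpoint answers with one JSON line per document, and the HTTP status can be 200 even when individual documents are rejected, so it is worth inspecting the response body. A minimal sketch, assuming failed lines carry "success": false; the helper name count_import_failures is made up for illustration.

```shell
# Count failed lines in an import response read from stdin.
# count_import_failures is a hypothetical helper name.
count_import_failures() {
  # grep -c prints 0 and exits non-zero when nothing matches,
  # so "|| true" keeps the function's exit status clean.
  grep -cE '"success"[[:space:]]*:[[:space:]]*false' || true
}

# Example with a canned response: one success, one failure
printf '%s\n' '{"success":true}' '{"success":false,"error":"Bad JSON."}' \
  | count_import_failures   # prints 1
```

In practice the import curl command's output would be piped into the function in place of the canned lines.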

Search

curl "http://localhost:8108/collections/movies/search?q=matrx&query_by=title" \
  -H "X-TYPESENSE-API-KEY: your-api-key-change-this-to-something-strong"

Typo tolerance is automatic — “matrx” finds “The Matrix”.

Configuration

Key CLI Arguments

Argument                           Default          Description
--data-dir                         Required         Directory for persistent data
--api-key                          Required         Admin API key
--enable-cors                      false            Enable CORS for browser requests
--api-port                         8108             HTTP API port
--peering-port                     8107             Port for cluster communication
--thread-pool-size                 NUM_CORES * 8    Number of request handling threads
--num-collections-parallel-load    NUM_CORES * 4    Collections to load in parallel on startup
--cache-num-entries                1000             Number of search results to cache
--log-dir                          (unset)          Directory for log files; logs go to stdout when unset

Scoped API Keys

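Create a search-only API key for frontend use; never ship the admin key to browsers. Scoped keys, which embed fixed search parameters such as filters, are derived from a search-only key like this one: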

curl -X POST http://localhost:8108/keys \
  -H "X-TYPESENSE-API-KEY: your-api-key-change-this-to-something-strong" \
  -H "Content-Type: application/json" \
  -d '{
    "description": "Search-only key",
    "actions": ["documents:search"],
    "collections": ["movies"]
  }'
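
A scoped key is derived locally from a search-only key, without another API call. The sketch below follows the scheme used by the official client libraries as I understand it: an HMAC-SHA256 of the embedded parameters, then the first four characters of the parent key, then the parameters, all Base64-encoded. It assumes openssl and base64 are installed; the function name and the sample key are hypothetical, and the client library helper is the safer choice in production.

```shell
# Derive a scoped search key from a search-only key and embedded parameters.
# Assumed scheme:
#   base64( base64(HMAC-SHA256(key, params)) + first 4 chars of key + params )
# generate_scoped_key is a hypothetical helper name.
generate_scoped_key() {
  local search_key="$1"
  local params_json="$2"
  local digest prefix
  digest="$(printf '%s' "$params_json" \
    | openssl dgst -sha256 -hmac "$search_key" -binary \
    | base64 | tr -d '\n')"
  prefix="$(printf '%s' "$search_key" | cut -c1-4)"
  printf '%s%s%s' "$digest" "$prefix" "$params_json" | base64 | tr -d '\n'
  echo
}

# Example: restrict searches to sci-fi titles (the key value is a placeholder)
generate_scoped_key 'abcd1234searchkey' '{"filter_by":"genre:=sci-fi"}'
```

The parent key should be the search-only key created above, not the admin key, since the scoped key inherits the parent's permissions.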

Advanced Configuration

Clustering (High Availability)

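Typesense supports built-in Raft-based clustering. The example below wires up two nodes to keep it short; Raft needs a majority of nodes online, so run three or more (an odd number) for real fault tolerance: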

services:
  typesense-1:
    image: typesense/typesense:30.1
    command: >
      --data-dir /data
      --api-key=your-api-key
      --nodes=/config/nodes.txt
      --api-port=8108
      --peering-port=8107
    volumes:
      - ts1_data:/data
      - ./nodes.txt:/config/nodes.txt:ro
    restart: unless-stopped

  typesense-2:
    image: typesense/typesense:30.1
    command: >
      --data-dir /data
      --api-key=your-api-key
      --nodes=/config/nodes.txt
      --api-port=8108
      --peering-port=8107
    volumes:
      - ts2_data:/data
      - ./nodes.txt:/config/nodes.txt:ro
    restart: unless-stopped

volumes:
  ts1_data:
  ts2_data:

Create nodes.txt:

typesense-1:8107:8108,typesense-2:8107:8108
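
Each entry in the nodes file has the form host:peering_port:api_port, so the line can be generated from a list of hostnames. The sketch below also computes the Raft majority for a given cluster size, which shows why two nodes tolerate no failures. Both helper names are hypothetical, and the ports are the defaults used in this guide.

```shell
# Build the comma-separated nodes line: host:peering_port:api_port
# make_nodes_line is a hypothetical helper name.
make_nodes_line() {
  local line="" host
  for host in "$@"; do
    line="${line:+$line,}${host}:8107:8108"
  done
  echo "$line"
}

# Raft needs a majority of nodes online: floor(n / 2) + 1.
quorum_size() {
  echo $(( $1 / 2 + 1 ))
}

make_nodes_line typesense-1 typesense-2 typesense-3
# prints typesense-1:8107:8108,typesense-2:8107:8108,typesense-3:8107:8108
quorum_size 2   # prints 2, so a two-node cluster cannot lose a node
quorum_size 3   # prints 2, so a three-node cluster survives one failure
```

Redirecting the output of make_nodes_line into nodes.txt keeps the file consistent with the service names in the Compose file.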

Vector Search

Typesense supports vector search for semantic similarity. Documents that are already indexed have no embedding yet, so the new field is added as optional:

# Add a vector field to your collection
curl -X PATCH http://localhost:8108/collections/movies \
  -H "X-TYPESENSE-API-KEY: your-api-key" \
  -H "Content-Type: application/json" \
  -d '{"fields": [{"name": "embedding", "type": "float[]", "num_dim": 384, "optional": true}]}'
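
Once documents carry embeddings, a nearest-neighbor query passes the query vector in the vector_query parameter, usually through the multi_search endpoint because the vector is too long for a URL. A minimal sketch under those assumptions; the function names are hypothetical, and a real query vector needs as many values as num_dim (384 here) rather than the three shown.

```shell
# Build a vector_query string of the form: field:([v1,v2,...], k:N)
# build_vector_query is a hypothetical helper name.
build_vector_query() {
  local field="$1" k="$2"
  shift 2
  local values
  values="$(printf '%s,' "$@")"
  values="${values%,}"          # drop the trailing comma
  echo "${field}:([${values}], k:${k})"
}

# Send the query through multi_search (not called here; needs a running server).
vector_search() {
  local vq="$1"
  curl -s -X POST "http://localhost:8108/multi_search" \
    -H "X-TYPESENSE-API-KEY: your-api-key" \
    -H "Content-Type: application/json" \
    -d "{\"searches\":[{\"collection\":\"movies\",\"q\":\"*\",\"vector_query\":\"${vq}\"}]}"
}

build_vector_query embedding 5 0.1 0.2 0.3
# prints embedding:([0.1,0.2,0.3], k:5)
```

The embeddings themselves have to come from a model of your choice; the query vector must be produced by the same model as the stored ones.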

Result Curation (Overrides)

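Pin specific results for specific queries (useful for merchandising). The id under includes must match a document's id field, which Typesense assigns automatically when a document is imported without one: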

curl -X PUT "http://localhost:8108/collections/movies/overrides/featured-scifi" \
  -H "X-TYPESENSE-API-KEY: your-api-key" \
  -H "Content-Type: application/json" \
  -d '{"rule": {"query": "best sci-fi", "match": "contains"}, "includes": [{"id": "1", "position": 1}]}'

Reverse Proxy

Configure your reverse proxy to forward to port 8108. See Reverse Proxy Setup.

Backup

Snapshots

Create a snapshot:

curl -X POST "http://localhost:8108/operations/snapshot?snapshot_path=/data/snapshots" \
  -H "X-TYPESENSE-API-KEY: your-api-key"

Volume Backup

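# Reuse the volumes of the container named typesense. Compose prefixes volume
# names with the project name, so a bare typesense_data may not be the real volume.
docker run --rm --volumes-from typesense -v "$(pwd)":/backup alpine \
  tar czf /backup/typesense-backup.tar.gz /data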

See Backup Strategy.

Troubleshooting

High Memory Usage

Symptom: Typesense uses more RAM than expected.

Fix: Typesense keeps the entire index in RAM. Expect 2-3x the size of your indexed fields. Remove unused fields from the schema. Consider Meilisearch if your dataset is too large for RAM.

Collection Creation Fails

Symptom: 400 error when creating a collection.

Fix: Check that every field type is valid (for example string, int32, int64, float, bool, string[], float[]). Ensure default_sorting_field references a numeric field that is not optional.

Search Returns No Results

Symptom: Documents indexed but search finds nothing.

Fix: Verify that the query_by parameter references the correct field(s). Check that the field is of type string or string[] (non-string fields aren’t text-searchable).

Resource Requirements

  • RAM: 2-3x the size of indexed data (in-memory engine)
  • CPU: Low (C++ binary, very efficient)
  • Disk: Index size + snapshot space

Verdict

Typesense is the fastest application search engine you can self-host. Sub-millisecond latency is real — and it makes search-as-you-type feel instant. The trade-off is RAM usage: keeping the entire index in memory means larger datasets get expensive. Built-in Raft clustering and result curation are standout features.

Choose Typesense for application search where speed is critical and your index fits in RAM. Choose Meilisearch for similar functionality with lower RAM usage (disk-based). Choose Elasticsearch for analytics, logging, and massive-scale search.

Frequently Asked Questions

How does Typesense compare to Meilisearch?

Both are fast, developer-friendly search engines. Typesense keeps the entire index in RAM for sub-millisecond latency. Meilisearch uses disk-based storage with lower RAM requirements. Typesense has built-in Raft clustering and result curation. Meilisearch has a simpler API and lower barrier to entry. Choose Typesense when speed is critical and your index fits in RAM; choose Meilisearch for larger datasets with tighter RAM budgets.

How much RAM does Typesense need?

Typesense keeps the entire index in memory. Plan for 2-3x the size of your indexed fields. A 1 GB dataset of searchable text fields needs roughly 2-3 GB of RAM. Non-indexed fields and vector embeddings add to this. If your dataset exceeds available RAM, consider Meilisearch or Elasticsearch instead.

Can I use Typesense as a drop-in replacement for Algolia?

Close, but not entirely. Typesense provides an adapter (typesense-instantsearch-adapter) that lets InstantSearch.js widgets query Typesense instead of Algolia. Most Algolia search UIs can switch with minimal code changes: swap the client, then update the API key and endpoint. Some advanced Algolia features (A/B testing, analytics) aren’t available.

Does Typesense support vector search?

Yes. Typesense supports vector search for semantic similarity alongside keyword search. You can add float[] fields with a specified dimension count and search using embedding vectors. This enables hybrid search, which combines traditional keyword matching with semantic similarity in a single query.

Can Typesense cluster across multiple servers?

Yes. Typesense has built-in Raft-based clustering for high availability. Deploy multiple nodes with the --nodes configuration, and they automatically replicate data and handle leader election. No external coordination service needed. This is a significant advantage over Meilisearch, which doesn’t support clustering in its open-source edition.

Is Typesense suitable for full-text search in a self-hosted app?

Yes. Typesense is commonly used as the search backend for self-hosted applications like wikis, documentation sites, and e-commerce platforms. It integrates with tools like DocSearch and works well with static site generators. The typo tolerance and instant search capabilities make it ideal for application search.
