
Update text search best practices regarding scalability #1496

@Josipmrden

Description


per @DavIvek

I’ve experimented with imports against a text index and found that parallel imports can be extremely slow: when a text index exists and is updated from parallel transactions, the import can be around 20x slower than the same import without a text index.

This happens because the text index takes a unique lock during commit, and commits can be expensive in the text index context, so only one transaction can write to the index at a time. Since these writes also persist data to disk, the slowdown is expected. This can be optimized in the future, but it is the current behavior.

Running the entire import in a single transaction is significantly faster than many small transactions, because writes don’t block each other and everything is flushed in a single commit.
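For reference, here is a minimal sketch of the single-transaction approach using the Neo4j Python driver over Bolt (Memgraph accepts Bolt connections). The connection details, the `:Document` label, and the node payload are placeholder assumptions, and the text index is assumed to already exist on that label:

```python
# Hypothetical single-transaction import: all 100k nodes are created and the
# text index is flushed in one commit. Label, property names, and connection
# details are placeholders.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("", ""))
rows = [{"id": i, "body": f"document {i}"} for i in range(100_000)]

with driver.session() as session:
    with session.begin_transaction() as tx:
        tx.run(
            "UNWIND $rows AS row CREATE (:Document {id: row.id, body: row.body})",
            rows=rows,
        )
        tx.commit()  # one commit, so the text index is flushed once

driver.close()
```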

That said, practical testing with 100k nodes showed that the best import speed comes from batching transactions rather than using a single large transaction. Batches of 10k nodes per worker gave the fastest import of 100k nodes:

  • ~50% faster than importing everything in a single transaction.
  • ~30% faster than using 1k-node batches with 10 parallel workers.
  • The worst performance occurred when each node was created in its own transaction.

Conclusion: batching is strongly preferred. For large imports with a text index, fewer transactions with larger batches provide the best balance between parallelism and commit overhead (see the sketch below). This would be worth calling out explicitly in the import best practices section of the documentation.
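A minimal sketch of the batched approach, with the same placeholder connection details, label, and payload as above: 100k nodes split into 10k-node batches, each batch committed in its own transaction by a separate worker thread.

```python
# Hypothetical batched import: 10 workers, 10k nodes per batch, one commit per
# batch. Label, property names, and connection details are placeholders.
from concurrent.futures import ThreadPoolExecutor
from neo4j import GraphDatabase

BATCH_SIZE = 10_000
TOTAL_NODES = 100_000

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("", ""))

def import_batch(rows):
    # Each worker opens its own session; the batch is written in a single
    # transaction, so the text-index commit happens once per 10k nodes.
    with driver.session() as session:
        session.execute_write(
            lambda tx: tx.run(
                "UNWIND $rows AS row CREATE (:Document {id: row.id, body: row.body})",
                rows=rows,
            ).consume()
        )

batches = [
    [{"id": i, "body": f"document {i}"} for i in range(start, start + BATCH_SIZE)]
    for start in range(0, TOTAL_NODES, BATCH_SIZE)
]

with ThreadPoolExecutor(max_workers=len(batches)) as pool:
    list(pool.map(import_batch, batches))

driver.close()
```

The exact batch size that works best will depend on available memory and on how expensive each text-index commit is, so it is worth benchmarking against the target dataset.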
