Consumer backoff halts all processing even if one target collection is in bad state. #89

@sigram

Description

Here's the scenario:

  • create a number of 1:1 collections in source / target Solr
  • start indexing (in my tests I was simply sending a constant stream of random documents in a round-robin fashion to each collection)
  • simulate a problem in ONE of the target collections. I simply deleted it, but in a real-life scenario it could have been any other kind of breakdown, or the following hypothetical:
    • with high enough traffic there will be a number of messages in flight between source and target. Assume ops decide to remove one of the collections at both source and target simultaneously, without waiting for all messages for that collection to drain. The Consumer will then pick up queued in-flight messages intended for the target collection that no longer exists.
  • the Consumer will attempt to send the picked-up requests to a collection that no longer exists (or no longer functions), and Solr will respond with errors.
  • these errors will trigger back-offs, which will eventually halt ALL processing, including for the remaining healthy collections.

Is this behavior the best we can do? I'm not sure. I would expect the Consumer to continue processing requests for healthy collections. At the very least we should offer some protection against the hypothetical scenario described above.
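
To make the expectation concrete, here is a minimal sketch of what tracking back-off per target collection (rather than globally) might look like. This is not the Consumer's actual code: the class `PerCollectionBackoff`, its methods, and the delay parameters are all hypothetical names I made up for illustration, and the exponential-with-cap policy is just one possible choice.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical helper (not part of the existing Consumer): keeps
 * back-off state per target collection, so errors from one collection
 * do not block requests addressed to the others.
 */
class PerCollectionBackoff {
    /** Failure count and earliest retry time for a single collection. */
    private static final class State {
        int failures;
        long retryAtMillis;
    }

    private final long baseDelayMillis;
    private final long maxDelayMillis;
    private final Map<String, State> states = new HashMap<>();

    PerCollectionBackoff(long baseDelayMillis, long maxDelayMillis) {
        this.baseDelayMillis = baseDelayMillis;
        this.maxDelayMillis = maxDelayMillis;
    }

    /** True if a request for this collection may be sent at nowMillis. */
    boolean canSend(String collection, long nowMillis) {
        State s = states.get(collection);
        return s == null || nowMillis >= s.retryAtMillis;
    }

    /** Record an error response; doubles the delay for this collection only. */
    void recordFailure(String collection, long nowMillis) {
        State s = states.computeIfAbsent(collection, k -> new State());
        s.failures++;
        // Delay is base * 2^(failures - 1), capped at maxDelayMillis.
        // The shift is bounded to avoid overflow for long failure streaks.
        int shift = Math.min(s.failures - 1, 20);
        long delay = Math.min(maxDelayMillis, baseDelayMillis << shift);
        s.retryAtMillis = nowMillis + delay;
    }

    /** Record a successful request; clears the back-off for this collection. */
    void recordSuccess(String collection) {
        states.remove(collection);
    }
}
```

With something along these lines, the Consumer would check `canSend(collection, now)` before forwarding a request, so a deleted or broken collection only delays its own traffic. What to do with the messages for a collection that is still backing off (park them, re-queue them, or eventually drop them) is a separate question that this sketch does not answer.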
