Horizontal scaling
Quix Streams provides out of box horizontal scaling using streaming context for automatic partitioning together with the underlying broker technology, such as kafka.
Imagine the following example:
Each car produces one stream with its own time-series data, and each stream is processed by a replica of the deployment, labelled "Process". By default the message broker will assign each stream to one replica via the RangeAssignor strategy.
When the purple replicas crashes "stream 4" is assigned automatically to the blue replica.
This situation will trigger an event on topic consumer in the blue replica indicating that "stream 4" has been received:
output on blue replica:
When the purple replica has restarted and becomes available again, it signals to the broker and takes control of "stream 4".
This will trigger two events, one in the blue replica indicating that "stream 4" has been revoked, and one in the purple replica indicating that "stream 4" has been assigned again:
def on_stream_received_handler(stream_received: StreamConsumer):
print("Stream received:" + stream_received.stream_id)
def on_streams_revoked_handler(topic_consumer: TopicConsumer, streams_revoked: [StreamConsumer]):
for stream in streams_revoked:
print("Stream revoked:" + stream.stream_id)
topic_consumer.on_stream_received = on_stream_received_handler
topic_consumer.on_streams_revoked = on_streams_revoked_handler
Output on the blue replica:
Output on the purple replica:
The same behavior will happen if we scale the "Process" deployment up or down, increasing or decreasing the number of replicas. Kafka will trigger the rebalancing mechanism internally and this will trigger the same events on Quix Streams. Note that this example assumes perfect conditions, but in reality a rebalance event can shift all streams to different processes. In the library we ensured you only get revocation raised for a stream if it is not assigned back to the same consumer, not while it is rebalancing.
Rebalancing mechanism and partitions
Kafka uses partitions and the RangeAssignor strategy to decide which consumers receive which messages.
Partitions and the Kafka rebalancing protocol are internal details of the Kafka implementation behind Quix Streams. You don’t need to worry about them because everything is abstracted within the Streaming Context feature of the library. The events described above will remain the same, even if Quix Streams uses another message broker technology or another rebalancing mechanism in the future.
Warning
Because of how the Kafka rebalancing mechanism works, you should follow one golden rule: you should not have more replicas than the number of partitions the topic you're subscribing to has, as additional replicas will idle.