Latency and cost-aware consumer group autoscaling in message broker systems

Publication date

2025-07

Authors

Landau, Diogo
Saurabh, NishantORCID 0000-0002-1926-4693ISNI 0000000512605880
Andrade, Xavier
Barbosa, Jorge G.

Editors

Advisors

Supervisors

Document Type

Article
Open Access logo

License

cc_by

Abstract

Message brokers often facilitate communication between data producers and consumers by adding variable-sized messages to ordered distributed queues. Our goal is to determine the number of consumers and consumer partition assignments needed to ensure that the data consumption rate matches the data production rate. We model this problem as a variable item size bin packing problem. As the production rate varies, new consumer–partition assignments are computed, potentially requiring the reallocation of partitions from one consumer to another. During reallocation, data in the queue are not read, leading to increased latency costs. To address this problem, we focus on the multiobjective optimization cost of minimizing the number of consumers and reducing latency. We introduce several heuristic algorithms and compare them to state-of-the-art heuristics. In our experimental setup, the proposed modified worst fit (MWF) heuristic achieves a 48% reduction, with a similar number of consumers, in comparison with the best fit decrease (BFD). In addition, MWF achieves a 99th percentile latency of 2.24 seconds compared with that of 364.66 with the approach by Kafka using the same number of consumers. Alternatively, to obtain a lower 99th percentile latency than our approach does, Kafka requires at least 60% more consumers than our method requires.

Keywords

Bin packing, Consumer group autoscaling, Message broker, Variable item size, Software, Theoretical Computer Science, Hardware and Architecture, Computer Networks and Communications, Artificial Intelligence

Citation

Landau, D, Saurabh, N, Andrade, X & Barbosa, J G 2025, 'Latency and cost-aware consumer group autoscaling in message broker systems', Journal of Parallel and Distributed Computing, vol. 201, 105071. https://doi.org/10.1016/j.jpdc.2025.105071