stream processing systems

Load balancing with partial key grouping

I also studied workload skew in the context of stream processing systems. I proposed partial key grouping, one of the first load balancing techniques to address this problem. The technique has been incorporated into Apache Storm and has become a popular baseline for other load balancing algorithms.
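As a rough illustration of the idea behind partial key grouping, each key gets two candidate workers from two hash functions, and every message goes to whichever candidate the sender believes is less loaded. The code below is a minimal sketch under stated assumptions, not the Apache Storm implementation: the names `PartialKeyGrouper` and `_hash`, the SHA-256-based hashing, and the toy stream are all hypothetical choices made for this example.

```python
import hashlib


def _hash(key: str, seed: int, num_workers: int) -> int:
    """Deterministic seeded hash mapping a key to a worker index (illustrative choice)."""
    digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_workers


class PartialKeyGrouper:
    """Hypothetical sender-side router: two candidate workers per key, pick the less loaded."""

    def __init__(self, num_workers: int) -> None:
        self.num_workers = num_workers
        # Local estimate of how many messages this sender has routed to each worker.
        self.load = [0] * num_workers

    def route(self, key: str) -> int:
        # Two independent hash functions give the key's two candidate workers.
        first = _hash(key, 1, self.num_workers)
        second = _hash(key, 2, self.num_workers)
        # Send the message to the candidate with the smaller local load estimate.
        chosen = first if self.load[first] <= self.load[second] else second
        self.load[chosen] += 1
        return chosen


# A skewed toy stream: one hot key plus a tail of rare keys.
stream = ["hot"] * 1000 + [f"key-{i}" for i in range(200)]
grouper = PartialKeyGrouper(num_workers=8)
for message_key in stream:
    grouper.route(message_key)
print("messages per worker:", grouper.load)
```

Because a key can land on either of its two candidates, each worker holds only a partial result for that key, and a downstream step has to merge at most two partial results per key.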

References

2016

  1. When Two Choices are not Enough: Balancing at Scale in Distributed Stream Processing
    Muhammad Anis Uddin Nasir, Gianmarco De Francisci Morales, Nicolas Kourtellis, and Marco Serafini
    In Proceedings of the 32nd IEEE International Conference on Data Engineering (ICDE), 2016

2015

  1. The Power of Both Choices: Practical Load Balancing for Distributed Stream Processing Engines
    Muhammad Anis Uddin Nasir, Gianmarco De Francisci Morales, David Garcia-Soriano, Nicolas Kourtellis, and Marco Serafini
    In Proceedings of the 31st IEEE International Conference on Data Engineering (ICDE), 2015