I lead the Data Systems group, which works on systems for data science and machine learning, data management systems, and distributed systems.
I am a member of the Data systems Research for Exploration, Analytics, and Modeling (DREAM) lab and of the Center for Data Science.
Phone: 413 577 0354
Email: marco # cs.umass.edu
Office: CS 348
- [8/19/21] - I received a Distinguished Reviewer Award at VLDB’21.
- [5/1/21] - The paper “Zab: High-Performance Broadcast for Primary-Backup Systems”, which I co-authored with Flavio Junqueira and Benjamin Reed, received the Test-of-Time Award at the IEEE/IFIP International Conference on Dependable Systems and Networks (DSN) 2021. It was originally published at DSN 2011.
- [1/29/21] - I have received an Adobe Research Collaboration Grant.
- [12/20/20] - I have become an ACM Senior member.
- [02/07/20] - I was a recipient of the Facebook Systems for ML Research Award, with Arjun Guha.
- [10/23/19] - I visited the Bay Area and gave a talk titled “Connected Data: Pushing the Envelope of Data Management Systems” at Facebook, Adobe, and LinkedIn.
- [08/30/19] - Our VLDB 2019 paper on benchmarking Cloud DBMSs was featured in The morning paper.
- [03/26/19] - I gave a keynote at the DataStax 2019 Product and Engineering Summit.
Selected and Recent Papers
Systems for graph machine learning, mining, and data management
- Case for Sampling: Marco Serafini, Hui Guan, Scalable Graph Neural Network Training: The Case for Sampling. ACM SIGOPS Operating Systems Review, 55(1), July 2021. Paper.
- NextDoor: Abhinav Jangda, Sandeep Polisetty, Arjun Guha, Marco Serafini, Accelerating Graph Sampling for Graph Machine Learning Using GPUs. Eurosys 2021. Paper - Project page - Code - Reproduce the results.
- LiveGraph: Xiaowei Zhu, Guanyu Feng, Marco Serafini, Xiaosong Ma, Jiping Yu, Lei Xie, Ashraf Aboulnaga, Wenguang Chen, LiveGraph: A Transactional Graph Storage System with Purely Sequential Adjacency List Scans. Int. Conf. on Very Large Data Bases (VLDB) 2020. Paper - Presentation - Code.
- QFrag: Marco Serafini, Gianmarco De Francisci Morales, Georgos Siganos, “QFrag: Distributed Graph Search via Subgraph Isomorphism”. ACM Symp. on Cloud Computing (SoCC), 2017. Paper
- Arabesque: Carlos T. H. Teixeira, Alex J. Fonseca, Marco Serafini, Georgos Siganos, Mohammed J. Zaki, Ashraf Aboulnaga, “Arabesque: A System for Distributed Graph Mining”. ACM Symp. on Operating Systems Principles (SOSP), 2015. Paper - Presentation - Project page - Code - “The morning paper” coverage.
Cloud data processing systems
- Hybrid Caching + Pushdown: Yifei Yang, Matt Youill, Matthew Woicik, Yizhou Liu, Xiangyao Yu, Marco Serafini, Ashraf Aboulnaga, Michael Stonebraker, FlexPushdownDB: Hybrid Pushdown and Caching in a Cloud DBMS. Int. Conf. on Very Large Data Bases (VLDB) 2021. Paper
- Cloud Configurations: Muhammad Bilal, Marco Serafini, Marco Canini, Rodrigo Rodrigues, Do the Best Cloud Configurations Grow on Trees? An Experimental Evaluation of Black Box Algorithms for Optimizing Cloud Workloads. Int. Conf. on Very Large Data Bases (VLDB) 2020. Paper
- PushdownDB: Xiangyao Yu, Matt Youill, Matthew Woicik, Abdurrahman Ghanem, Marco Serafini, Ashraf Aboulnaga, Michael Stonebraker. PushdownDB: Accelerating a DBMS using S3 Computation. IEEE Int. Conf. on Data Engineering (ICDE) 2020 (short). Paper
- Cloud DBMS: Junjay Tan, Matthew Perron, Xiangyao Yu, Thanaa Ghanem, Michael Stonebraker, David DeWitt, Marco Serafini, Ashraf Aboulnaga, Tim Kraska. Choosing a Cloud DBMS: Architectures and Tradeoffs. Int. Conf. on Very Large Data Bases (VLDB) 2019. Paper - “The morning paper” coverage.
- P-Store: Rebecca Taft, Nosayba El-Sayed, Marco Serafini, Ashraf Aboulnaga, Michael Stonebraker, Ricardo Mayerhofer, Francisco Andrade. “P-Store: An Elastic Database System with Predictive Provisioning”. ACM SIGMOD Int. Conf. on Management of Data (SIGMOD) 2018. Paper
- Clay: Marco Serafini, Rebecca Taft, Aaron Elmore, Andrew Pavlo, Ashraf Aboulnaga, Michael Stonebraker. “Clay: Fine-Grained Adaptive Partitioning for General Database Schemas”. Int. Conf. on Very Large Data Bases (VLDB) 2017. Paper
- E-Store: Rebecca Taft, Essam Mansour, Marco Serafini, Jennie Duggan, Aaron J. Elmore, Ashraf Aboulnaga, Andrew Pavlo, Michael Stonebraker. “E-Store: Fine-Grained Elastic Partitioning for Distributed Transactions Processing Systems”. Int. Conf. on Very Large Data Bases (VLDB) 2015. Paper
- Accordion: Marco Serafini, Essam Mansour, Ashraf Aboulnaga, Kenneth Salem, Taha Rafiq, Umar Farooq Minhas. “Accordion: Elastic Scalability for Database Systems Supporting Distributed Transactions”. Int. Conf. on Very Large Data Bases (VLDB) 2014. Paper
Stream processing systems
- Both choices: Muhammad Anis Uddin Nasir, Gianmarco De Francisci Morales, David García-Soriano, Nicolas Kourtellis, Marco Serafini, “The Power of Both Choices: Practical Load Balancing for Distributed Stream Processing Engines”. IEEE Int. Conf. on Data Engineering (ICDE) 2015. Paper
- N choices: Muhammad Anis Uddin Nasir, Gianmarco De Francisci Morales, Nicolas Kourtellis and Marco Serafini. “When Two Choices Are not Enough: Balancing at Scale in Distributed Stream Processing.” IEEE Int. Conf. on Data Engineering (ICDE), 2016. Paper
Fault-tolerant distributed systems
- Gyro: Habib Saissi, Marco Serafini and Neeraj Suri, Gyro: A Modular Scale-Out Layer for Single-Server DBMSs. IEEE Int’l Symp. on Reliable Distributed Systems (SRDS) 2019. Paper
- SEI: Diogo Behrens, Marco Serafini, Flavio P Junqueira, Sergei Arnautov, Christof Fetzer, Scalable error isolation for distributed systems. USENIX Symp. on Networked Systems Design and Implementation (NSDI) 2015. Paper
- PASC: M. Correia, D. Gómez Ferro, F. Junqueira, M. Serafini, “Practical Hardening of Crash-Tolerant Systems”. USENIX Annual Technical Conference (ATC) 2012. Paper
- Zookeeper: F. Junqueira, B. Reed and M. Serafini. “Zab: High-Performance Broadcast for Primary-Backup Systems”. IEEE Int’l Conf. on Dependable Systems and Networks (DSN) 2011. Paper
- Eventual linearizability: M. Serafini, D. Dobre, M. Majuntke, P. Bokor and N. Suri, “Eventually Linearizable Shared Objects”. ACM Symp. on Principles of Distributed Computing (PODC), 2010. Paper
- Scrooge: M. Serafini, P. Bokor, D. Dobre, M. Majuntke, and N. Suri, “Scrooge: Reducing the cost of fast Byzantine replication in presence of unresponsive replicas”. IEEE Int. Conf. on Dependable Systems and Networks (DSN), 2010. Paper
- Information Systems (COMPSCI 445)
- Systems for Machine Learning, Machine Learning for Systems (COMPSCI 692S)
Program Committee member: SIGMOD 2022, OSDI 2022, Eurosys 2022, SIGMOD 2022 Demo, Eurosys 2021, ASPLOS 2021, VLDB 2021, ICDE 2021, SIGMOD 2020, SIGMOD 2020 Demo, PaPoC 2020, SRDS 2020, VLDB 2019 Demo, SIGMOD 2019 Industry, DSN 2019, DASFAA 2019, VLDB 2018, ICDE 2018, APSys 2018, OPODIS 2018, SOSP 2017, VLDB 2017, Eurosys 2017, WWW 2017, DSN 2017, IC2E 2017, APSys 2017, Eurosys 2016, ICDCS 2016, ICDE 2016, SRDS 2016, IC2E 2016, SIGMOD 2016 Demo, ICDCS 2015, WWW 2015, SRDS 2015, OPODIS 2014, SSS 2014, Middleware 2012 (Industry), EDCC 2012, SOFSEM 2011.
Program Committee chair: DSN 2022 student workshop; LADIS 2018 workshop (Large-Scale Distributed Systems and Middleware), colocated with PODC 2018; PaPoC 2015 workshop (Principles and Practice of Consistency for Distributed Data), colocated with Eurosys 2015.
Other service: Award Committee member: Eurosys Roger Needham PhD Thesis Award 2019. Workshop Chair: IEEE Symp. on Reliable Distributed Systems (SRDS) 2019.
DSN 2021 Test-of-Time Award for the paper “Zab: High-Performance Broadcast for Primary-Backup Systems”.
ACM Senior Member.
Nomination for the “Best PhD Thesis of 2010” by the German, Swiss and Austrian Computer Science societies and the German chapter of ACM.