Marco Serafini

Data Systems @ UMass


Office: LGRC A335

740 N Pleasant St

Amherst, MA 01003, USA

I am an Assistant Professor in the College of Information and Computer Sciences at the University of Massachusetts Amherst. I am a member of the Data systems Research for Exploration, Analytics, and Modeling (DREAM) lab and of the Center for Data Science.

I lead the Data Systems group, which works on systems for machine learning and data science, data management systems, and parallel and distributed systems. We focus on performance, scalability, fault tolerance, and programming abstractions. My research areas and projects are listed here.

Before joining UMass, I was with Yahoo Research and QCRI. I got my PhD from the Technical University of Darmstadt, Germany.


Jul 2024 Our paper “FlexPushdownDB: Rethinking Computation Pushdown for Cloud OLAP DBMSs” was accepted for publication at the VLDB Journal. We propose a new adaptive mechanism to avoid overloading the computational capacity of the storage layer during computation pushdown, and identify new opportunities for pushing down query operators.
Jun 2024 We published a preprint presenting an extensive comparison of full-graph and mini-batch GNN training systems. We found that mini-batch training systems achieve lower time-to-accuracy in all scenarios we considered and comparable accuracy. More interesting results in the paper!
Mar 2024 I received a new Adobe Research Collaboration Grant on using Temporal Graph Neural Networks for query prediction. Many thanks to Adobe for the continued support!
Jan 2024 Our paper “GMorph: Accelerating Multi-DNN Inference via Model Fusion” was accepted at EuroSys’24. The paper proposes “model fusion”, a new approach that fuses multiple task-specific, pre-trained, and heterogeneous DNNs into a single multi-task model to reduce inference latency.
Aug 2023 GraphMini paper accepted at PACT’23. GraphMini speeds up graph pattern matching, a key step in graph mining, by up to an order of magnitude compared to GraphPi and Dryadic. It builds auxiliary graphs by proactively pruning the input graph during query execution.
Mar 2023 GSplit preprint published. GSplit is a multi-GPU Graph Neural Network training system that introduces split parallelism to reduce sampling, loading, and training overheads.
Oct 2022 Amazon Research Award on split-parallel graph neural network training (PI).
Jul 2022 NSF CNS Core Small grant on split-parallel graph neural network training (PI).
Aug 2021 FlexPushdownDB paper appeared at VLDB. It investigates the tradeoff between caching data at the query execution server vs. pushing computation to storage in analytical query workloads.
Jul 2021 Test-of-time award for the ZooKeeper Atomic Broadcast (Zab) paper at DSN’21.
Jun 2021 Our paper on scalable graph neural network training using sampling appeared in the ACM SIGOPS Operating Systems Review.
Apr 2021 NextDoor paper appeared at EuroSys. NextDoor proposes pushing graph sampling to the GPU in order to significantly speed up end-to-end training time for GNNs and graph ML.
Jan 2021 Adobe Research Collaboration Grant on distributed data caching (PI).
Dec 2020 I became an ACM Senior Member.
Aug 2020 Our paper on finding optimal resource configurations on the cloud appeared at VLDB. We evaluate and compare several commonly used black-box optimization algorithms.
Aug 2020 LiveGraph paper appeared at VLDB. LiveGraph is the first graph storage system that supports transactions.
Jul 2020 Facebook Systems for ML Research Award on the NextDoor project, which pushes graph sampling to the GPU for graph machine learning (PI).
Apr 2020 PushdownDB paper appeared at ICDE. It studies the effectiveness of pushing parts of DBMS analytics queries onto the storage layer, specifically the S3 service by AWS.
Aug 2019 Our paper on choosing a cloud DBMS appeared at VLDB. We discuss the tradeoffs involved in using shared-nothing vs. shared-storage designs on the cloud, considering different databases.
Mar 2019 I gave a keynote at the DataStax 2019 Product and Engineering Summit.



Md. Ashraful Islam

Juelin Liu

Sandeep Polisetty

Hojae Son


Abhinav Jangda - now at Microsoft Research (co-advised with Arjun Guha)


DSN 2021 Test-of-Time Award for the paper “Zab: High-Performance Broadcast for Primary-Backup Systems”.

Nomination for the “Dissertationspreis” (Doctoral Dissertation Award) by the German, Swiss and Austrian Computer Science societies and the German chapter of ACM.

ACM Senior Member.