Marco Serafini

Data Systems @ UMass


Office: LGRC A335

740 N Pleasant St

Amherst, MA 01003

I am an Assistant Professor in the College of Information and Computer Sciences at the University of Massachusetts Amherst. I am a member of the Data systems Research for Exploration, Analytics, and Modeling (DREAM) lab and of the Center for Data Science.

I lead the Data Systems group, which works on systems for machine learning and data science, data management systems, and parallel and distributed systems. We focus on performance, scalability, fault tolerance, and programming abstractions.

Before joining UMass, I was with Yahoo Research and QCRI. I got my PhD from the Technical University of Darmstadt, Germany.


Aug 2023 GraphMini paper accepted at PACT’23. GraphMini speeds up graph pattern matching, a key step in graph mining, by up to an order of magnitude compared to GraphPi and Dryadic. It builds auxiliary graphs by proactively pruning the input graph during query execution.
Mar 2023 GSplit preprint published. GSplit is a multi-GPU Graph Neural Network training system that introduces split parallelism to reduce sampling, loading, and training overheads.
Oct 2022 Amazon Research Award on split-parallel graph neural network training (PI).
Jul 2022 NSF CNS Core Small grant on split-parallel graph neural network training (PI).
Aug 2021 FlexPushDownDB paper appeared at VLDB. It investigates the tradeoff between caching data at the query execution server vs. pushing computation to storage in analytical query workloads.
Jul 2021 Test-of-Time Award for the ZooKeeper Atomic Broadcast (Zab) paper at DSN’21.
Jun 2021 Our paper on scalable graph neural network training using sampling appeared in the ACM SIGOPS Operating Systems Review.
Apr 2021 NextDoor paper appeared at EuroSys. NextDoor proposes pushing graph sampling to the GPU to significantly speed up end-to-end training for GNNs and graph ML.
Jan 2021 Adobe Research Collaboration Grant on distributed data caching (PI).
Dec 2020 I became an ACM Senior Member.
Aug 2020 Our paper on finding optimal resource configurations on the cloud appeared at VLDB. We evaluate and compare several commonly used black-box optimization algorithms.
Aug 2020 LiveGraph paper appeared at VLDB. LiveGraph is the first graph storage system that supports transactions.
Jul 2020 Facebook Systems for ML Research Award on the NextDoor project, which pushes graph sampling to the GPU for graph machine learning (PI).
Apr 2020 PushDownDB paper appeared at ICDE. It studies the effectiveness of pushing parts of DBMS analytics queries onto the storage layer, specifically the S3 service by AWS.
Aug 2019 Our paper on choosing a cloud DBMS appeared at VLDB. We discuss the tradeoffs involved in using shared-nothing vs. shared-storage designs on the cloud, considering different databases.
Mar 2019 I gave a keynote at the DataStax 2019 Product and Engineering Summit.



Md. Ashraful Islam

Juelin Liu

Sandeep Polisetty

Hojae Son


Abhinav Jangda - now at Microsoft Research


DSN 2021 Test-of-Time Award for the paper “Zab: High-Performance Broadcast for Primary-Backup Systems”.

Nomination for the “Best PhD thesis of the year” by the German, Swiss and Austrian Computer Science societies and the German chapter of ACM.

ACM Senior Member.