Systems for Machine Learning, Machine Learning for Systems (COMPSCI 692S)
Machine learning is employed in an increasingly wide range of applications. Using ML entails developing end-to-end pipelines to collect data, clean it, and run learning and inference algorithms at scale. This results in computationally intensive workloads and complex software pipelines. Systems for ML help users organize their data and scale these workloads to ever larger datasets. At the same time, ML is having an increasing impact on systems design: fine-tuned analytical heuristics and cost models are being replaced by learned models, following trends observed in other fields. This seminar will review cutting-edge research on these topics and give students the opportunity to work on a hands-on project. The course consists primarily of reading, presenting, and discussing papers (1 credit); students registered for 3 credits also complete a final project building an end-to-end machine learning pipeline.
Class meetings: Wednesday 11:15 AM-1:15 PM, CS 142
All students will be required to prepare a tutorial. Each tutorial will be presented by a group of 3 students. A tutorial will cover an area and present at least 3 papers from the reading list.
The typical structure of a presentation would be along these lines:
- What is the problem being addressed? Give context assuming people know nothing about the area. Why is the problem important? (Approx. 15 minutes)
- Present the papers. Focus on the big ideas rather than the technicalities, but give enough details to make the presentation informative. (Approx. 40 minutes)
- Discussion: Comparison between the different papers, strengths and weaknesses of each. (Approx. 10 minutes)
- Q&A (Approx. 10 minutes)
Ahead of its class session, each group will announce the 3 main papers it will present. All students must read these papers and submit a short review by the end of the day before the class. Links will be provided on Piazza and the course website.
Projects (3 credits only)
Students registered for the 3-credit section will also complete a project. Working in groups of 3, students will pick either (1) a systems problem that can benefit from the use of ML algorithms, or (2) an ML application use case that requires systems support for data collection, data cleaning, machine learning, inference, or a pipeline composing these steps. The choice must be agreed upon with the instructor.
Students will then prepare a “problem statement” report in which they describe the application and identify its challenges in terms of scalability, running time, and/or usability. Students will also propose a solution. The claims need to be validated experimentally.
Students will then implement the solution and finally write a report describing it. The final report will validate the system design through performance measurements and/or user studies. There will be a final presentation for all projects.