COMPSCI 692S: SEMINAR - SYSTEMS FOR MACHINE LEARNING, MACHINE LEARNING FOR SYSTEMS

Graduate Seminar, UMass Amherst, 2020

Over the last few years, a wave of excitement about ML and deep learning has spread from academia to industry, turning research-lab prototypes into viable solutions to real-world problems. Using ML entails developing end-to-end pipelines to collect data, clean it, and run learning and inference algorithms in a scalable manner. This results in computationally intensive workloads and complex software pipelines. Systems for ML help users organize their data and scale these computationally intensive problems to ever larger datasets. At the same time, ML is having an increasing impact on systems design. Fine-tuned analytical heuristics and cost models are being replaced by learned models, following trends observed in other fields.

This seminar follows the same structure as COMPSCI 692S offered by Dr. Marco Serafini. It primarily involves reading, presenting, and discussing recent papers in the domains of ML for systems and systems for ML (1 credit), plus a final project focusing on a specific ML system topic (3 credits).

Class Meetings: 08/24/2020 - 11/20/2020, Wed. 11:15AM - 1:15PM, Online via Zoom

Piazza Signup Link: www.piazza.com/umass/fall2020/compsci692s01

Seminar Structure

This seminar has three components: presentations, paper reviews, and projects.

Presentations

All students will be required to prepare at least one presentation. Each presentation will be given by a group of 2-3 students, will cover one area, and will present 3 papers from the reading list.

The typical structure of a presentation would be along these lines:

  1. What is the problem being addressed? Give context assuming people know nothing about the area. Why is the problem important? (Approx. 15 minutes)
  2. Present the papers. Focus on the big ideas rather than the technicalities, but give enough details to make the presentation informative. (Approx. 40 minutes)
  3. Discussion: Comparison between the different papers, strengths and weaknesses of each. (Approx. 10 minutes)
  4. Q&A (Approx. 10 minutes)

Reviews

  • Groups will announce the 3 main papers they will present at least 1 week before the class.
  • All students will have to read the papers and write a short review by the end of the day before the class (deadline: Tuesday, 11:59PM).
  • Links to the paper review form will be provided on Piazza and the course website.

Projects (3 credits only)

Students who have registered for the 3-credit section will also have to prepare a project. Students (in groups of at most 3) have to pick either

  1. a system problem that can benefit from the use of ML algorithms, or
  2. an ML application use case that requires systems support for data collection, training, inference, or a pipeline composing these steps.

The choice will be agreed upon with the instructor.

Students will then prepare a “problem statement” report in which they describe the application and identify challenges in terms of scalability, running time, and/or usability. Students will also propose a solution. The claims need to be validated experimentally.

Students will need to implement the solution and finally write a report describing it. The final report will validate the system design through performance measurements and/or user studies.

Tentative project timeline:

  • Identify a Sys4ML or ML4Sys problem and prepare a 1-page problem statement (Sept. 20)
  • Evaluate the existing solutions and propose improvements (Nov. 17)
  • Final presentation: poster or slides (Nov. 18)
  • Project deadline (Dec. 4)

Grading

For the 3-credit section:

  • 20% participation (presentation and discussion)
  • 20% reviews
  • 60% course project:
    • presentation (20%)
    • report (20%)
    • code (20%)

For the 1-credit section:

  • 50% participation (presentation and discussion)
  • 50% reviews