
ECE 8101: Nonconvex Optimization for Machine Learning
(Autumn 2024)


Personnel

Instructor: Jia (Kevin) Liu, Associate Professor, Dept. of Electrical and Computer Engineering
Contact: 620 Dreese Labs, Email: liu@ece.osu.edu
Time & Location: TuTh 11:10 AM - 12:30 PM, Baker Systems 140
Office Hours: Wed 5:00 PM - 6:00 PM
TA: Peiwen Qiu, Email: qiu.617@osu.edu
TA Hours: TBD

Course Description

This course introduces algorithm design and convergence analysis in nonconvex optimization theory, as well as their applications to modern machine learning and data science problems. The goal of this course is to equip graduate students with a solid theoretical and mathematical foundation at the intersection of optimization and machine learning, so that they can use optimization to solve advanced machine learning problems and/or conduct research in related fields. The course assumes the traditional linear, nonlinear, and convex optimization taught in operations research or related engineering fields (e.g., ECE, CSE) as a prerequisite, and focuses on topics in nonconvex optimization that are of special interest to the machine learning community.
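For concreteness, most of the algorithms covered target one of two canonical nonconvex problem templates, which the schedule below calls general expectation minimization and finite-sum minimization (Topic 2-4):

\[ \min_{x \in \mathbb{R}^d} f(x) \triangleq \mathbb{E}_{\xi \sim \mathcal{D}}\big[ F(x, \xi) \big] \qquad \text{and} \qquad \min_{x \in \mathbb{R}^d} f(x) \triangleq \frac{1}{n} \sum_{i=1}^{n} f_i(x), \]

where \( f \) and each component \( f_i \) are typically assumed smooth but possibly nonconvex, so convergence guarantees are usually stated in terms of reaching approximate stationary points (e.g., driving \( \|\nabla f(x)\| \) below a tolerance \( \epsilon \)) rather than global minima.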

Course Materials

There is no required textbook. Most of the material covered in class will be based on classical books and on recently published papers and monographs. A list of historically important and/or trending papers on ML optimization theory will be provided on the course website.

Paper Reading Assignments

There will be approximately six paper reading assignments, one assigned during each topic set. Reading assignments must be typeset in ICML format. In each assignment, each student writes a review of a set of related papers within a topic set, published in recent major machine learning venues (e.g., ICML, NeurIPS, ICLR, AAAI) or on arXiv. Some papers may be drawn from those lectured on in class. Each review may include: 1) a summary of the papers and their connections; 2) strengths/weaknesses of the papers in terms of soundness of assumptions/theorems, empirical evaluation, novelty, significance, etc.; 3) which parts are difficult to understand, and any questions about proofs/results/experiments; and 4) how the papers could be improved and extended.

Final Project

You may complete the project individually or in a team of no more than two people. Final reports are due after the project presentations in the final week and should follow the ICML format. Each project requires a 20-minute presentation in the final week, and attendance at your fellow students' presentations is required. Potential project ideas include, but are not limited to: i) a nontrivial extension of the results introduced in class; ii) a novel application in your own research area; or iii) a new theoretical analysis of an existing algorithm. Each project should contain something new, and it is important that you justify its novelty.

Grading Policy

    • Class Participation: 10%
    • Paper Reading Assignments: 60%
    • Final Project: 30%

Late Policy

Without the consent of the instructor, late paper reading assignments or final reports will not be accepted and will receive a grade of zero. In the case of a conference deadline or the like, written notice is required at least five days in advance to request an extension. In the case of an emergency (sudden sickness, family problems, etc.), an after-the-fact notice is acceptable, but we emphasize that this is reserved for true emergencies.

Schedule

The full class schedule is given below; it follows the lecture progress and class interests, and the syllabus may be adjusted accordingly.

Class | Date  | Topic                                                                                         | Lecture Notes | Recording
1     | 8/20  | 1. Course Info & Introduction                                                                 | Lecture 1     | Video 01-01

Topic Set 2: First-Order Methods for Nonconvex Optimization
2     | 8/22  | 2-1. Math Background Review                                                                   | Lecture 2-1   | Video 01-02
3     | 8/27  | 2-1. Math Background Review (cont.)                                                           |               | Video 02-01
4     | 8/29  | 2-2. Convexity                                                                                | Lecture 2-2   | Video 02-02
5     | 9/3   | 2-3. Gradient Descent                                                                         | Lecture 2-3   | Video 03-01
6     | 9/5   | 2-3. Gradient Descent (cont.)                                                                 |               | Video 03-02
7     | 9/10  | 2-4. Stochastic Gradient Descent (General Expectation Minimization, Finite-Sum Minimization)  | Lecture 2-4   | Video 04-01
8     | 9/12  | 2-4. Stochastic Gradient Descent (cont.)                                                      |               | Video 04-02
9     | 9/17  | 2-4. Stochastic Gradient Descent (cont.)                                                      |               | Video 05-01
10    | 9/19  | 2-5. Variance-Reduced Methods (SAG, SVRG, SAGA, SPIDER, PAGE)                                 | Lecture 2-5   | Video 05-02
11    | 9/24  | 2-5. Variance-Reduced Methods (cont.)                                                         |               | Video 06-01
12    | 9/26  | 2-5. Variance-Reduced Methods (cont.)                                                         |               | Video 06-02
13    | 10/1  | 2-6. Adaptive Methods (AdaGrad, RMSProp, Adam)                                                | Lecture 2-6   | Video 07-01
14    | 10/3  | 2-6. Adaptive Methods (cont.)                                                                 |               | Video 07-02
15    | 10/8  | 2-6. Adaptive Methods (cont.)                                                                 |               | Video 08-01

Autumn Break

Topic Set 3: Federated and Decentralized Learning
16    | 10/15 | 3-1. Federated Learning (Distributed Learning, FedAvg)                                        | Lecture 3-1   | Video 09-01
17    | 10/17 | 3-1. Federated Learning (cont.)                                                               |               | Video 09-02
18    | 10/22 | 3-1. Federated Learning (cont.)                                                               |               | Video 10-01
19    | 10/24 | 3-2. Decentralized Learning (Decentralized SGD, Gradient Tracking)                            | Lecture 3-2   | Video 10-02
20    | 10/29 | 3-2. Decentralized Learning (cont.)                                                           |               | Video 11-01
21    | 10/31 | Canceled                                                                                      |               |

Topic Set 4: Zeroth-Order Methods for Nonconvex Optimization
22    | 11/5  | 4-1. ZO Methods with Random Directions of Gradient Estimation                                 | Lecture 4-1   | Video 12-01
23    | 11/7  | 4-1. ZO Methods with Random Directions of Gradient Estimation (cont.)                         |               | Video 12-02
24    | 11/12 | 4-2. Variance-Reduced Zeroth-Order Methods                                                    | Lecture 4-2   | Video 13-01

Topic Set 5: Complex-Structured Learning
25    | 11/14 | 5-1. Min-Max and Bilevel Optimization                                                         | Lecture 5-1   | Video 13-02
26    | 11/19 | 5-1. Min-Max and Bilevel Optimization (cont.)                                                 |               | Video 14-01
27    | 11/21 | 5-2. Multi-Objective Optimization                                                             | Lecture 5-2   | Video 14-02
28    | 11/26 | 5-2. Multi-Objective Optimization (cont.)                                                     |               | Video 15-01

Thanksgiving Break

29    | 12/3  | In-Class Project Presentations                                                                |               | Video 16-01
30    | 12/5  | In-Class Project Presentations (cont.)                                                        |               | Video 16-02
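As a minimal sketch of the kind of first-order method analyzed in Topic 2-4 (this is not course-provided code; the toy objective, step size, and batch size are all illustrative assumptions):

import numpy as np

def sgd(grad_batch, x0, n, step=0.01, batch=8, iters=2000, seed=0):
    """Minimal mini-batch SGD for min_x (1/n) * sum_i f_i(x)."""
    rng = np.random.default_rng(seed)
    x = x0.astype(float).copy()
    for _ in range(iters):
        idx = rng.integers(0, n, size=batch)  # sample a mini-batch uniformly at random
        x -= step * grad_batch(x, idx)        # unbiased stochastic gradient step
    return x

# Toy nonconvex finite-sum problem: f_i(x) = (a_i^T tanh(x) - b_i)^2.
rng = np.random.default_rng(1)
n, d = 200, 5
A, b = rng.normal(size=(n, d)), rng.normal(size=n)

def grad_batch(x, idx):
    t = np.tanh(x)                            # elementwise nonlinearity makes each f_i nonconvex
    r = A[idx] @ t - b[idx]                   # residuals on the mini-batch
    return 2.0 * (A[idx] * r[:, None]).mean(axis=0) * (1.0 - t**2)

x_hat = sgd(grad_batch, x0=np.zeros(d), n=n)
print("approximate stationary point:", x_hat)

Under standard smoothness and bounded-variance assumptions, and with a suitably chosen step size, a scheme of this form drives the expected squared gradient norm to zero at a sublinear rate; making such statements precise is exactly the style of analysis developed in Topics 2-3 through 2-5.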

Academic Integrity

This course will follow OSU's Code of Academic Conduct. Discussions of homework assignments and final projects are encouraged. However, what you turn in must be your own. You should not directly copy solutions from others. Any reference (including online resources) used in your solution must be clearly cited.

 