# Course Projects

Students will be expected to work on a theory-focused project (in groups of up to 2 students). The aim of this project is to prepare students for research and publication in the space of learning, dynamics and control. While it is strictly speaking optional, students are encouraged to think of the project as a potentially publishable research project.

### Important Dates

- **Project Proposal Due:** Friday, October 11
- **Midterm Report Due:** Friday, November 8
- **Final Report Due:** Friday, December 20

### Reports and preliminary deadlines

- **Report Format:** The report must be written in LaTeX, in single-column style, using the article document class. Please use the letterpaper and 11pt options with standard line spacing.
- **Project Proposal:** Your proposal should be 2 pages maximum (not including references) and should include a title, team members, abstract, related work, problem formulation, and goals.
- **Midterm Report:** Your report should be 4 pages maximum (not including references). It should build on your project proposal and outline your solution approach, current progress, and preliminary results, as well as highlight any challenges you are facing.
- **Final Report:** Your report should be 10 pages maximum (not including references and supplementary material). Your final report will be evaluated by the following criteria:
  - **Merit:** Is your problem formulation and solution strategy well-motivated? Can you justify the complexity level of your approach?
  - **Technical depth:** Is your project technically challenging? Did you write your own code, or did you use available software packages? While it is OK for a project to lean more towards theory or implementation, the sum of theoretical and implementation efforts should remain at least constant (i.e., if you use existing software packages rather than write your own code, the theoretical component of your project should be more ambitious).
  - **Presentation:** Are your solution approach, assumptions, results, and interpretations of experimental/theoretical outcomes clearly explained and/or justified? Is the report clearly written? Are the mathematical arguments rigorous and easy to follow? Are graphs/visualizations clear?
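As a starting point, the report format requirements above could be met with a skeleton along the following lines. This is only a minimal sketch: the package choices, section names, title, and author names are placeholder assumptions, not course requirements. Only the document class and its two options come from the format description.

```latex
% Minimal report skeleton (a sketch; everything except the class and its
% options is a placeholder assumption).
\documentclass[letterpaper,11pt]{article} % article is single-column with standard spacing by default

\usepackage{amsmath,amssymb,amsthm} % assumption: useful for a theory-focused report
\usepackage{graphicx}               % assumption: only needed if you include figures

\title{Project Title}                               % placeholder
\author{Team Member One \and Team Member Two}       % placeholder names
\date{}

\begin{document}
\maketitle

\begin{abstract}
One-paragraph summary of the problem and goals.
\end{abstract}

\section{Related Work}
\section{Problem Formulation}
\section{Goals}

% References do not count toward the page limit.
% \bibliographystyle{plain}
% \bibliography{refs} % hypothetical file name

\end{document}
```

Avoid packages or commands that change the margins, font size, or line spacing, since the page limits assume the standard layout.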

### Project Ideas

Note that you are free (and encouraged) to suggest your own project ideas. However, we ask that you discuss these with the instructor in a timely manner, before the initial proposal is due.

A few tentative ideas along with papers to read to get you started are also listed below:

- System Identification with Partial Observability: Understand and survey how non-asymptotic results are obtained when the learner is not given direct state access.
  - It may also be of interest to explore connections to L1- and nuclear-norm regularization (and even implicit regularization).
  - Suggested reading: System Identification: Theory for the User; Revisiting Ho–Kalman-Based System Identification: Robustness and Finite-Sample Analysis; Finite Sample Analysis of Stochastic System Identification; Rate-Optimal Non-Asymptotics for the Quadratic Prediction Error Method

- Imitation Learning: Roughly speaking, imitation learning combines supervised learning with the mitigation of a special form of distribution shift. Understand and survey this field.
  - A good place to start might be to understand how this field relates to the material in the course.
  - Suggested reading: Toward the Fundamental Limits of Imitation Learning; TaSIL: Taylor Series Imitation Learning; Is Behavior Cloning All You Need? Understanding Horizon in Imitation Learning

- Self-Supervised Learning: In a nutshell, next-token prediction, used in pre-training large language models, is structurally not much different from the autoregressions we have analyzed in this course.
  - Survey and understand the available theoretical progress in this space.
  - It may be helpful to focus on understanding how things change once we leave the square loss behind and use the logarithmic loss function.
  - Suggested reading: Mamba: Linear-Time Sequence Modeling with Selective State Spaces; From Self-Attention to Markov Models: Unveiling the Dynamics of Generative Transformers

- Policy Optimization Methods:
  - Conduct a survey of results showing that direct policy optimization is effective for linear optimal control.
  - Can these ideas be extended to nonlinear systems?
  - Suggested reading: Global convergence of policy gradient methods for the linear quadratic regulator; Linear Convergence of Gradient and Proximal-Gradient Methods Under the Polyak-Łojasiewicz Condition; Stabilizing Dynamical Systems via Policy Gradient Methods; Towards a Theoretical Foundation of Policy Optimization for Learning Control Policies

- Information-Theoretic Analyses: While much of the standard analysis of learning algorithms proceeds via uniform convergence, a powerful alternative is the information-theoretic (roughly, change-of-measure or PAC-Bayes) approach.
  - Survey this approach and discuss how it can be applied to learning in control and dynamical systems.
  - Can you extend "A Short Information-Theoretic Analysis of Linear Auto-Regressive Learning" to more general system classes?
  - Suggested reading: From ε-Entropy to KL-Entropy: Analysis of Minimum Information Complexity Density Estimation; Information-theoretic analysis of generalization capability of learning algorithms; Information Theoretic Regret Bounds for Online Nonlinear Control; Information-Theoretic Foundations for Machine Learning; A Short Information-Theoretic Analysis of Linear Auto-Regressive Learning

- Multi-Task/Representation Learning:
  - Conduct a survey of existing results that highlights their connections and significance to the L4DC space.
  - Suggested reading: Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks; Few-Shot Learning via Learning the Representation, Provably; Guarantees for Nonlinear Representation Learning: Non-identical Covariates, Dependent Data, Fewer Samples

- Control as Online Optimization:
  - Survey this line of work and discuss how it relates to traditional adaptive control.
  - Suggested reading: Introduction to Online Nonstochastic Control; Improper Learning for Non-Stochastic Control

- The Interplay Between Computation and Information: Under certain widely believed conjectures in theoretical computer science, it can be shown that there are fundamental trade-offs between computational and statistical complexity in sparse estimation. In particular, the LASSO incurs an additional dependence on the data’s condition number, and this is believed to be unimprovable in general for polynomial-time algorithms.
  - Survey this area and discuss its significance for learning high-dimensional linear dynamical systems.
  - Suggested reading: Lower Bounds on the Performance of Polynomial-time Algorithms for Sparse Linear Regression
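
For the self-supervised learning idea above, the structural parallel between autoregression and next-token prediction can be sketched as follows. The notation is assumed here for illustration and is not taken from the course notes: $x_{1:T+1}$ is an observed sequence, $f_\theta$ is a parametric predictor, and $p_\theta$ is a parametric conditional distribution.

```latex
\begin{align*}
\text{square loss (autoregression):}\quad
  & \hat\theta \in \arg\min_{\theta} \frac{1}{T}\sum_{t=1}^{T}
    \bigl\| x_{t+1} - f_\theta(x_{1:t}) \bigr\|_2^2, \\
\text{logarithmic loss (next-token prediction):}\quad
  & \hat\theta \in \arg\min_{\theta} -\frac{1}{T}\sum_{t=1}^{T}
    \log p_\theta\bigl(x_{t+1} \mid x_{1:t}\bigr).
\end{align*}
```

If $p_\theta(\cdot \mid x_{1:t})$ is Gaussian with mean $f_\theta(x_{1:t})$ and a fixed covariance $\sigma^2 I$, the second objective reduces to the first up to scaling and an additive constant. The interesting questions arise when $p_\theta$ is, for example, a categorical distribution over tokens.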

**Please do not feel restricted by the above**: The theme of “learning, dynamics and control” is to be interpreted broadly and you are more than welcome to suggest other topics, including, but not limited to: diffusion models, reinforcement learning, next-token prediction, adaptive control, robot learning, and so on.