Spring 2025 Projects
Implementing Undetectable Backdoors in Machine Learning Models
Faculty Mentor: Daniel Shumow
Project Description: A recent paper showed how to implement undetectable backdoors in machine learning models: https://ieeexplore.ieee.org/document/9996741
The work is fairly theoretical, and it is unclear how difficult this would be to implement in practice. As such, the purpose of this project will be to go over this paper, understand it, and then try to implement these backdoors.
Project Level: Intermediate: students who have taken Math 300
Additional Course Requirements: Linear algebra would be helpful.
Programming Requirements: Python
Random Projections and Large-Scale Networks in AI Recommendation Systems
Faculty Mentor: Drs. Tvrtko Tadić, Raunak Kumar, Cassiano Becker
Project Description: Networks, also known as graphs, are powerful structures that encapsulate a wealth of information about the relationships between various data entities. In today’s data-driven world, these networks have grown significantly in size and complexity, presenting a formidable challenge in extracting meaningful insights from them.
In this project, we will delve into the technique of using random projections to encode node information for large graphs. This approach aims to simplify the representation of nodes while preserving their essential characteristics, making it easier to analyze and interpret the data.
We will explore several methods to define node similarity and measure the quality of a recommendation system. Additionally, there will be opportunities to investigate both theoretical problems like concentration estimates and practical challenges such as evaluating quality using Large Language Models.
Participants in this project will gain hands-on experience with advanced data science techniques and tools based on recent work and experience from Microsoft’s Graph Intelligence Sciences Team. They will develop a deep understanding of heterogeneous graphs, Gaussian and Rademacher random projections, and similarity measures. Furthermore, they will learn how to implement and evaluate recommendation systems in today’s AI landscape.
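As a first taste of the technique, the sketch below (a toy example of ours, not the team's code) projects high-dimensional node vectors with Gaussian and Rademacher random matrices; by the Johnson-Lindenstrauss lemma, distances are roughly preserved, so similarity computations can run in the much smaller projected space:

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_projection(X, k, rng):
    """Project rows of X to k dimensions with i.i.d. N(0, 1/k) entries."""
    R = rng.normal(size=(X.shape[1], k)) / np.sqrt(k)
    return X @ R

def rademacher_projection(X, k, rng):
    """Same idea with +-1/sqrt(k) entries (cheaper to generate and store)."""
    R = rng.choice([-1.0, 1.0], size=(X.shape[1], k)) / np.sqrt(k)
    return X @ R

# Two high-dimensional "node" vectors: after projection their distance is
# roughly preserved, so recommendations can be computed in k dims, not n.
n, k = 1000, 128
x, y = rng.normal(size=n), rng.normal(size=n)
Z = gaussian_projection(np.stack([x, y]), k, rng)
print(np.linalg.norm(x - y), np.linalg.norm(Z[0] - Z[1]))
```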
Project Level: Advanced: students who have taken multiple upper-level mathematics courses
Additional Course Requirements: We expect students to have taken their second course in Linear Algebra and Probability (or courses covering similar material). Some knowledge of Statistics and Numerical Mathematics would be good to have.
Programming Requirements: Python
Wild Knot Mosaics
Faculty Mentor: Dr. Allison K. Henrich and Andrew Tawfeek
Project Description: Mosaic diagrams were developed in 2008 by Lomonaco and Kauffman to build quantum knot systems. Since then, the structure of mosaics has been widely studied by many others due to their convenient way of encoding a knot in three-dimensional space as a matrix. In 2014, it was shown by Kuriya and Shehab that mosaics are a complete invariant for tame knots, which are the concrete knots we can make with string in everyday life.
In this project, we attempt to push the capabilities of the mosaic representation further by adapting it to capture knots that are not tame, namely wild knots. These knots can have infinite tangles and can diverge toward infinity at various locations along the strand, but by adapting the structure of mosaic diagrams, some of these pathological objects, such as the Alexander horned sphere, may be represented. We will spend this project formalizing the theory behind these wild mosaics, writing code along the way for their study and investigation.
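To illustrate the matrix encoding, the sketch below checks whether a small mosaic is "suitably connected," i.e., whether strand endpoints match across neighboring tiles. The tile alphabet here is our own simplified encoding, not the standard T0-T10 tiles:

```python
# Tiles are encoded by which edge midpoints their strands touch
# (our own simplified alphabet, not the standard T0-T10 numbering).
TILES = {
    " ": "",      # blank tile
    "r": "SE",    # arc joining south and east
    "l": "SW",    # arc joining south and west
    "R": "NE",    # arc joining north and east
    "L": "NW",    # arc joining north and west
    "|": "NS",    # vertical strand
    "-": "EW",    # horizontal strand
    "+": "NSEW",  # crossing (or double arc): touches all four edges
}

def suitably_connected(mosaic):
    """True if every touched tile edge is matched by the neighboring tile
    touching the shared edge (edges on the board boundary must be free)."""
    n = len(mosaic)
    def touches(i, j, side):
        if not (0 <= i < n and 0 <= j < n):
            return False  # outside the board: nothing to connect to
        return side in TILES[mosaic[i][j]]
    for i in range(n):
        for j in range(n):
            for side, di, dj, opp in (("N", -1, 0, "S"), ("E", 0, 1, "W"),
                                      ("S", 1, 0, "N"), ("W", 0, -1, "E")):
                if touches(i, j, side) != touches(i + di, j + dj, opp):
                    return False
    return True

print(suitably_connected(["rl", "RL"]))  # a 2x2 unknot mosaic: connected
print(suitably_connected(["r-", "RL"]))  # mismatched edges: not connected
```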
Project Level: Intermediate: students who have taken Math 300
Additional Course Requirements:
Programming Requirements:
Musical Billiards
Faculty Mentor: Dr. Jayadev Athreya
Project Description: We will explore the language of billiards in regular polygons: a point mass, moving at unit speed, with no friction, and bouncing off walls with angle of incidence = angle of reflection. The associated language is given by labeling all the sides, and considering the set of all possible codings of trajectories. We will explore this with coding and music, by assigning musical notes to the sides, and seeing what musical phrases we can get.
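A minimal simulation of this coding is easy to write. The sketch below (our own illustration, for the simplest polygon, the unit square) bounces a point off the walls and records the label of each side it hits, a word one could then map to musical notes:

```python
import math

def billiard_coding(x, y, dx, dy, bounces):
    """Simulate a point bouncing in the unit square and return the word of
    side labels hit: 'L' (x=0), 'R' (x=1), 'B' (y=0), 'T' (y=1)."""
    word = []
    for _ in range(bounces):
        # time until the trajectory hits a vertical / horizontal wall
        tx = (1 - x) / dx if dx > 0 else (-x / dx if dx < 0 else math.inf)
        ty = (1 - y) / dy if dy > 0 else (-y / dy if dy < 0 else math.inf)
        t = min(tx, ty)
        x, y = x + t * dx, y + t * dy
        if tx <= ty:
            word.append("R" if dx > 0 else "L")
            dx = -dx  # angle of incidence = angle of reflection
        else:
            word.append("T" if dy > 0 else "B")
            dy = -dy
    return "".join(word)

# An irrational slope gives an aperiodic coding; mapping the four labels
# to notes (say L->C, R->E, B->G, T->B) turns the word into a phrase.
print(billiard_coding(0.2, 0.3, math.cos(1.0), math.sin(1.0), 12))
```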
Project Level: Intermediate: students who have taken Math 300
Additional Course Requirements: Differential Equations
Programming Requirements: Python and Java would be great
Self-Organized Criticality
Faculty Mentor: Dr. Christopher Hoffman
Project Description: Self-Organized Criticality is a concept from mathematical physics that claims to explain diverse phenomena such as earthquakes, avalanches, and forest fires. These systems all have energy that builds up slowly and is then quickly released. The amount of energy released can vary widely, from very small earthquakes that can barely be felt to large earthquakes that can destroy cities.
Activated random walk is a probabilistic model that is believed to share some crucial properties with models from self-organized criticality. This model involves many particles performing random walks that interact with each other. We will numerically study the distribution of the amount of energy released in the activated random walk model.
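A first numerical experiment might look like the following simplified sketch (our own toy version: one walker driven at a time on an interval with absorbing boundaries, sleeping with a fixed probability when alone and waking any sleeper it meets), recording the number of steps per avalanche as a proxy for released energy:

```python
import random

def stabilize(n, lam, occupied, rng):
    """Drop one active walker at the center of {0,...,n-1} and run until no
    active walkers remain. Sleeping particles are marked in `occupied`; an
    active walker alone at a site falls asleep with prob lam/(1+lam), and
    an active walker at an occupied site wakes the sleeper. Walkers leaving
    [0, n) are absorbed. Returns the number of steps (the avalanche size)."""
    active = [n // 2]
    steps = 0
    while active:
        i = active.pop()
        if i < 0 or i >= n:
            continue  # absorbed at the boundary
        if not occupied[i] and rng.random() < lam / (1 + lam):
            occupied[i] = True  # fall asleep at an empty site
            continue
        if occupied[i]:
            occupied[i] = False  # wake the sleeper; both are now active
            active.append(i)
        steps += 1
        active.append(i + rng.choice((-1, 1)))
    return steps

rng = random.Random(1)
n, lam = 50, 0.5
occupied = [False] * n
avalanches = [stabilize(n, lam, occupied, rng) for _ in range(200)]
print(max(avalanches), sum(avalanches) / len(avalanches))
```

The interesting statistics are exactly the ones printed: how the avalanche-size distribution behaves as the system builds up toward its critical density.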
Project Level: Intermediate: students who have taken Math 300
Additional Course Requirements: 394/5/6 would be helpful.
Programming Requirements: programming skills are important
Number Theory and Noise
Faculty Mentor: Dr. Matthew Conroy
Project Description: Students will learn some elementary number theory (and other mathematics) via the creation of audio representations of integer sequences. The project page is here: https://sites.math.washington.edu/~conroy/WXML/integerSequenceNoise/project.htm Students should be open to listening carefully to noisy sounds and doing a little programming (zero experience required or expected!). By listening to the sounds and asking questions about phenomena heard therein, we will investigate number-theoretic topics (and topics related to hearing, digital audio, Fourier analysis, etc.). Students will contribute to the mathematical community by adding to a library of sounds publicly available to other researchers.
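A tiny example in the spirit of the project (our own sketch, not the project's codebase): render an integer sequence as a WAV file with one click per term, so that listening reveals the sequence's rhythm; here, the primes below 200:

```python
import math
import struct
import wave

RATE = 44100  # samples per second

def sequence_to_clicks(seq, seconds=2.0, path="seq.wav"):
    """Write a mono WAV with one click per term of `seq`, placed at time
    proportional to the term's size. Returns the output path."""
    nframes = int(RATE * seconds)
    samples = [0.0] * nframes
    top = max(seq)
    for n in seq:
        samples[int(n / top * (nframes - 1))] = 1.0  # one click per term
    with wave.open(path, "w") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 16-bit samples
        w.setframerate(RATE)
        w.writeframes(b"".join(struct.pack("<h", int(32767 * s))
                               for s in samples))
    return path

# A simple trial-division sieve for the sequence of primes below 200.
primes = [p for p in range(2, 200)
          if all(p % d for d in range(2, int(p**0.5) + 1))]
sequence_to_clicks(primes)
```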
Project Level: Open: students who have taken Math 126
Additional Course Requirements:
Programming Requirements: None, but some Python or PARI/GP programming will happen. Students will be given much assistance and guidance in this. No previous knowledge necessary.
Creating 3D Prints and Animations to Help Teach Calculus
Faculty Mentor: Dr. Andy Loveless
Project Description: We will create a collection of 3D printed shapes that come directly from our calculus course materials. The goal will be to make them for the MSC and for instructors to use in their classrooms. We will attempt to make 3D animations related to these objects in order to bring several ideas from our courses to life in an interesting and beautiful way.
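As one possible starting point (our own sketch; the function name and ASCII STL layout are our choices), the code below meshes a surface of revolution, a standard object from the calculus sequence, and writes it in the STL format that 3D-printing slicers accept. This is the surface only; a printable model would also need end caps:

```python
import math

def revolve_to_stl(f, a, b, path="solid.stl", n=60, m=24):
    """Write an ASCII STL of the surface obtained by revolving y = f(x),
    a <= x <= b, about the x-axis. Returns the number of triangles."""
    def pt(x, theta):
        r = f(x)
        return (x, r * math.cos(theta), r * math.sin(theta))
    tris = []
    for i in range(n):           # slices along the x-axis
        x0 = a + (b - a) * i / n
        x1 = a + (b - a) * (i + 1) / n
        for j in range(m):       # slices around the axis of revolution
            t0, t1 = 2 * math.pi * j / m, 2 * math.pi * (j + 1) / m
            p00, p01 = pt(x0, t0), pt(x0, t1)
            p10, p11 = pt(x1, t0), pt(x1, t1)
            tris += [(p00, p10, p11), (p00, p11, p01)]  # quad -> 2 triangles
    with open(path, "w") as out:
        out.write("solid revolution\n")
        for tri in tris:
            out.write("facet normal 0 0 0\n outer loop\n")
            for v in tri:
                out.write("  vertex %.6f %.6f %.6f\n" % v)
            out.write(" endloop\nendfacet\n")
        out.write("endsolid revolution\n")
    return len(tris)

# Gabriel's horn section: revolve y = 1/x on [1, 4].
print(revolve_to_stl(lambda x: 1 / x, 1.0, 4.0))
```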
Project Level: Open: students who have taken Math 126
Additional Course Requirements:
Programming Requirements: It would be helpful if students have some experience with 3D graphing in some form and/or if someone in the group has experience with 3D printing.
Structured and Sparse Covariance Estimation
Faculty Mentor: Dr. Bamdad Hosseini
Project Description: Covariance estimation and approximation are fundamental tasks in statistics and machine learning. We will begin by studying the relationship between sparsity structures in the inverse of the covariance matrix and Cholesky factors, and conditional independence of the multivariate Gaussian random variables. Next, we will examine the problem of estimating a covariance matrix given structural knowledge, such as a parametric form, sparsity constraint, or tensorization.
The main goal of the project will be to develop efficient algorithms for estimating and computing with Gaussian processes. Some particular topics of interest are nonparametric covariance models for Gaussian processes, fast sparse approximate factorizations of kernel matrices, and structured inverse covariance estimation.
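The starting observation can be seen in a few lines of NumPy: for a Gaussian Markov chain, conditional independence makes the precision matrix (inverse covariance) tridiagonal and its Cholesky factor bidiagonal, even though the covariance itself is dense. The construction below is a standard toy example, not code from the project:

```python
import numpy as np

n, phi = 6, 0.8
# Bidiagonal factor of the precision: unit diagonal, -phi on the subdiagonal,
# corresponding to the Markov chain X_{i+1} = phi * X_i + noise.
L = np.eye(n) - phi * np.eye(n, k=-1)
Q = L.T @ L               # tridiagonal precision: X_i indep. of X_j given the
                          # rest whenever |i - j| > 1
Sigma = np.linalg.inv(Q)  # the covariance itself has no zeros at all

def nnz(A, tol=1e-10):
    """Number of entries of A larger than tol in absolute value."""
    return int((np.abs(A) > tol).sum())

print(nnz(Q), nnz(np.linalg.cholesky(Q)), nnz(Sigma))
```

The punchline is the count: the precision and its Cholesky factor are sparse (3n-2 and 2n-1 nonzeros), while the covariance is fully dense; structured estimation exploits exactly this.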
Project Level: Advanced: students who have taken multiple upper-level mathematics courses
Additional Course Requirements: Linear Algebra is necessary (MATH 208/AMATH 352). Any experience with probability and statistics would be very helpful.
Programming Requirements: Some basic programming is necessary. Python is preferred; any other experience is a plus.
Quantum Probability via Arbitrary Functions
Faculty Mentor: Dr. Benjamin Feintzeig
Project Description: How do deterministic physical systems give rise to probabilities? In classical probability theory, one can show that some differential equations governing the dynamical evolution of physical systems lead to “universal behavior” in the limit of large times. More specifically, the system almost always approaches the same final distribution for certain physical variables, independent of the distribution of initial conditions one began with. This is called the classical method of arbitrary functions—where the arbitrary functions refer to the possible initial conditions, which turn out to be irrelevant to the long time behavior. Such results are thought to explain why physical systems display the probabilities they do, e.g., why a tossed coin lands heads with probability 1/2 or a roulette wheel has equal probability of landing on each number. In this project, we aim to extend the method of arbitrary functions to systems described by quantum physics, where both the dynamical equations and conceptual framework for probability have novel features. We will explore the behavior for concrete physical models in the limit of large times and the classical limit of small values of Planck’s constant.
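A classical-side illustration of the method (our own toy simulation): give the angular velocity of a wheel an "arbitrary" lumpy density and evolve the angle for a long time; the final position on the circle becomes uniform regardless of which initial density we chose:

```python
import numpy as np

rng = np.random.default_rng(0)

# An "arbitrary function": a lumpy two-bump density for the angular velocity.
omega = np.concatenate([rng.normal(3.0, 0.2, 5000),
                        rng.normal(7.5, 0.5, 5000)])

def final_position(t):
    """Angle on the circle [0, 1) at time t for every sampled velocity."""
    return (omega * t) % 1.0

# At small t the angle distribution remembers the bumps; at large t it is
# nearly uniform -- the universal behavior the method of arbitrary
# functions is about.
for t in (1.0, 1000.0):
    hist, _ = np.histogram(final_position(t), bins=10, range=(0, 1))
    print(t, hist / len(omega))
```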
Project Level: Advanced: students who have taken multiple upper-level mathematics courses
Additional Course Requirements: Some background in linear algebra, analysis, and abstract algebra will be helpful for this project. No background in physics is required.
Programming Requirements:
Optimizing L_2 discrepancies
Faculty Mentor: François Clément
Project Description: Monte Carlo methods approximate integrals by sampling random points, using the regularity of randomness to provide estimates with a known error bound. In practice, we can do better by using low-discrepancy point sets: these are deterministic constructions, where the points are designed to be as well-spread as possible according to some measure, the L_2 discrepancy. Constructing these points is an old topic (nearly 100 years old), and we still do not know how good these point sets can be! Recent breakthroughs using machine learning and mathematical programming suggest that optimization methods can greatly improve on mathematical constructions. The aim of this project is to use simpler approaches to obtain good constructions, or to identify key structural elements that would make optimization easier.
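For orientation, the L_2 star discrepancy of a finite point set in [0,1]^d can be evaluated exactly with Warnock's formula, which is the natural objective for the optimization approaches mentioned above. A direct O(N^2 d) implementation (our own sketch):

```python
import numpy as np

def l2_star_discrepancy(P):
    """L_2 star discrepancy of the N x d array P of points in [0,1]^d,
    via Warnock's exact formula:
      D^2 = 3^{-d} - (2/N) sum_i prod_k (1 - x_ik^2)/2
                   + (1/N^2) sum_{i,j} prod_k (1 - max(x_ik, x_jk))."""
    P = np.asarray(P, dtype=float)
    N, d = P.shape
    term1 = 3.0 ** -d
    term2 = np.prod((1.0 - P**2) / 2.0, axis=1).sum() * 2.0 / N
    M = np.maximum(P[:, None, :], P[None, :, :])  # pairwise coordinate maxima
    term3 = np.prod(1.0 - M, axis=2).sum() / N**2
    return np.sqrt(term1 - term2 + term3)

# The midpoint is the optimal single point in [0,1]: discrepancy sqrt(1/12).
print(l2_star_discrepancy([[0.5]]))
# Four points of the base-2 van der Corput sequence do much better.
vdc = [[0.0], [0.5], [0.25], [0.75]]
print(l2_star_discrepancy(vdc))
```

Optimization enters exactly here: move the N points to drive this quantity down.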
Project Level: Advanced: students who have taken multiple upper-level mathematics courses
Additional Course Requirements: Math 408
Programming Requirements: Strong programming skills (ideally C/C++, but other languages are fine) are a must, as well as knowledge of optimization techniques and how to implement/use them.
Statistical model selection over partially ordered sets
Faculty Mentor: Dr. Armeen Taeb
Project Description: A common approach to statistical model selection – particularly in scientific domains in which discoveries must be both reliable and reproducible – is to develop powerful procedures that control false discoveries (including attributes into the learned model that are unnecessary and do not improve fit to data). The classical perspective on false discovery control is based on formulating and testing a (possibly large) collection of binary hypotheses. However, in many modern data analysis tasks such as clustering and causal discovery, the underlying decision space is not naturally characterized by Boolean logical structure, but rather by partially ordered sets (posets).
Thinking about posets for model selection is a powerful new perspective (see my paper https://www.pnas.org/doi/10.1073/pnas.2314228121), with a lot of interesting open questions. Examples of some of the questions that we can think about together: Can we improve the speed of the existing algorithms in my paper by using clever computational tricks? Can we use problem-specific poset structure to speed up computations when searching for good models? What attributes of a poset structure determine the power and computational complexity of algorithms for model selection over posets?
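For a concrete instance of a non-Boolean decision space: in clustering, the candidate models are the set partitions of the data, partially ordered by refinement. The sketch below (our own toy code) implements that partial order and exhibits two incomparable partitions, something a collection of binary hypotheses cannot express:

```python
def refines(P, Q):
    """True if partition P refines partition Q, i.e., every block of P is
    contained in some block of Q. Partitions are lists of frozensets."""
    return all(any(block <= B for B in Q) for block in P)

P = [frozenset({1}), frozenset({2}), frozenset({3, 4})]
Q = [frozenset({1, 2}), frozenset({3, 4})]
R = [frozenset({1, 3}), frozenset({2, 4})]

print(refines(P, Q))  # P is below Q in the refinement order
print(refines(Q, P))  # but not conversely
print(refines(P, R), refines(R, P))  # P and R are incomparable
```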
Project Level: Advanced: students who have taken multiple upper-level mathematics courses
Additional Course Requirements: Math 381, Math 394, Math 395
Programming Requirements: Python or R
Solving polynomial equations
Faculty Mentor: Dr. Rekha Thomas
Project Description: In this project we will first learn how to solve systems of polynomial equations, a problem that underlies numerous applications. Then we will focus on a number of applications to areas such as optimization, combinatorics, and computer science.
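In one variable, the linear-algebra approach to solving polynomial equations is already visible: the roots of a polynomial are the eigenvalues of its companion matrix, the same idea that eigenvalue methods extend to systems of several variables via Gröbner bases. A short sketch (our own illustration):

```python
import numpy as np

def roots_via_companion(coeffs):
    """Roots of c0 + c1*x + ... + cn*x^n, found as the eigenvalues of the
    companion matrix, i.e., the matrix of 'multiplication by x' on the
    basis 1, x, ..., x^{n-1} modulo the polynomial."""
    c = np.asarray(coeffs, dtype=float)
    c = c / c[-1]               # normalize to a monic polynomial
    n = len(c) - 1
    C = np.zeros((n, n))
    C[1:, :-1] = np.eye(n - 1)  # x * x^k = x^{k+1} for k < n-1
    C[:, -1] = -c[:-1]          # x * x^{n-1} reduces via the polynomial
    return np.linalg.eigvals(C)

# x^2 - 3x + 2 = (x - 1)(x - 2)
print(sorted(roots_via_companion([2.0, -3.0, 1.0]).real))
```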
Project Level: Intermediate: students who have taken Math 300
Additional Course Requirements: Math 318/340, 402 if possible
Programming Requirements: Students must have a solid interest in computing. Potential software to be used: SageMath, Maple/Mathematica.
Managing delays in public transit systems
Faculty Mentor: Dr. Giovanni Inchiostro
Project Description: Mathematical optimization models are used everywhere in our lives to solve real-world problems, with a wide range of applications in computer science, mechanical engineering, operations research, and in other areas of mathematics. A less-recognized application area for optimization is the management of public transportation systems.
In this project, we will analyze problems and algorithms used for managing delays in public transit systems. Drawing from the book Optimization in Public Transportation by Anita Schöbel, we will develop models and test algorithms on small examples, with the goal of implementing our algorithms on real-world data from Seattle’s public transportation network.
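As a warm-up for the models in Schöbel's book, here is a toy wait/depart delay-management instance (all numbers invented for illustration): a feeder train arrives late, and for each connecting line we decide whether it waits, trading the transferring passengers' missed connection against delaying the through passengers:

```python
from itertools import product

# Each connection: (transfer passengers, through passengers, headway in min).
# "Wait" delays everyone on that line by the feeder's delay; "depart" makes
# the transfer passengers wait a full headway for the next service.
connections = [
    (30, 100, 15),
    (5, 80, 30),
    (40, 20, 10),
]
delay = 6  # feeder's arrival delay in minutes

def cost(decisions):
    """Total passenger-minutes of delay for a tuple of wait/depart choices."""
    total = 0
    for (transfer, through, headway), wait in zip(connections, decisions):
        if wait:
            total += (transfer + through) * delay
        else:
            total += transfer * headway
    return total

# Brute force over all 2^k wait/depart decisions (fine for toy instances;
# the book develops integer-programming models for realistic networks).
best = min(product([True, False], repeat=len(connections)), key=cost)
print(best, cost(best))
```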
Project Level: Intermediate: students who have taken Math 300
Additional Course Requirements: Linear Algebra, some familiarity with optimization is preferred (e.g. MATH 407)
Programming Requirements: Python is preferred.
Formalizing polyhedral geometry
Faculty Mentor: Caelan Ritter
Project Description: This project involves formalizing mathematics with the Lean Theorem Prover. Students will learn how to use the Lean programming language to verify the proofs of mathematical statements. If you have never heard of the concept of mathematical formalization and are curious, check out the “Natural Number Game” (https://www.ma.imperial.ac.uk/~buzzard/xena/natural_number_game/).
In this project, the goal is to formalize the basic results in polyhedral geometry, up to the equivalence of definitions of a polyhedron in terms of halfspaces and convex hulls.
A polyhedron is a generalization of a polygon to n dimensions. We can define it either as an intersection of finitely many half-spaces (i.e., the closure of one “side” of a hyperplane) or, if it is bounded, as the “convex hull” of finitely many points. An explicit goal of this project is to formalize enough of the basic definitions and results in polyhedral geometry so that we can write down the equivalence of the two definitions above. We will follow the proof outline in chapter 1 of Gaku Liu’s notes.
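A sketch of how the first definitions might look in Lean (the names `Halfspace` and `IsHPolyhedron` are our own, not Mathlib's, and this is an untested outline rather than working project code):

```lean
import Mathlib

variable {V : Type*} [AddCommGroup V] [Module ℝ V]

/-- A halfspace: the locus where a linear functional is at most a constant. -/
def Halfspace (f : V →ₗ[ℝ] ℝ) (c : ℝ) : Set V :=
  {x | f x ≤ c}

/-- An H-polyhedron: a finite intersection of halfspaces. -/
def IsHPolyhedron (P : Set V) : Prop :=
  ∃ s : Finset ((V →ₗ[ℝ] ℝ) × ℝ), P = ⋂ p ∈ s, Halfspace p.1 p.2

-- Target theorem of the project, informally: for bounded P,
-- IsHPolyhedron P ↔ ∃ t : Finset V, P = convexHull ℝ ↑t
```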
Project Level: Intermediate: students who have taken Math 300
Additional Course Requirements:
Programming Requirements:
Reinforcement learning to efficiently compute polynomials
Faculty Mentor: Dr. Jarod Alper
Project Description: We will attempt to use reinforcement learning to train computers to search for efficient arithmetic circuits computing specific polynomials. In some sense, this is a simplified version of proof generation, where computers are trained to search for proofs of theorems, except that in this case the search space is smaller. Our inspiration is the AlphaZero algorithm for two-player games, which we will adapt to a one-player game. The problem we will attack is to find efficient ways to compute a polynomial f(x_1, …, x_n) using an arithmetic circuit consisting of + and x gates together with scalar multiplication. Our strategy is to use Monte Carlo Tree Search to train a deep neural network to learn an effective evaluation function (giving a score of how close the current state is to computing the polynomial) and a policy (a probability distribution over the next moves, with higher probability given to moves that simplify the expression). The computer is rewarded when it finds efficient arithmetic circuits.
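To make the search space concrete, here is a brute-force baseline of ours (no learning involved): a breadth-first search over circuits built from the inputs x and 1 that finds the minimum number of +/× gates needed to compute a small polynomial, i.e., the quantity the reinforcement-learning agent would be rewarded for minimizing:

```python
def poly_add(p, q):
    """Add two polynomials given as coefficient tuples (low degree first)."""
    return tuple(a + b for a, b in zip(p, q))

def poly_mul(p, q, deg=4):
    """Multiply coefficient tuples, truncating above degree `deg`."""
    r = [0] * (deg + 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if i + j <= deg:
                r[i + j] += a * b
    return tuple(r)

def min_gates(target, max_gates=4):
    """BFS over circuits: a state is the set of polynomials computed so far,
    starting from {1, x}; each gate adds the sum or product of two computed
    values. Returns the fewest gates computing `target`, or None."""
    x, one = (0, 1, 0, 0, 0), (1, 0, 0, 0, 0)
    if target in (x, one):
        return 0
    frontier = {tuple(sorted((x, one)))}
    for gates in range(1, max_gates + 1):
        new = set()
        for have in frontier:
            for p in have:
                for q in have:
                    for r in (poly_add(p, q), poly_mul(p, q)):
                        if r == target:
                            return gates
                        if r not in have:
                            new.add(tuple(sorted(have + (r,))))
        frontier = new
    return None

# (x^2 + 1)^2 = x^4 + 2x^2 + 1: three gates (x*x, then +1, then squaring)
# beat naive term-by-term evaluation -- the kind of shortcut the trained
# search should discover at scale.
print(min_gates((1, 0, 2, 0, 1)))
```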
Project Level: Intermediate: students who have taken Math 300
Additional Course Requirements:
Programming Requirements:
Neural Theorem Proving
Faculty Mentor: Vasily Ilin
Project Description: Formalization is the process of translating human-written mathematical proofs into a form that can be verified by a computer. A popular tool for this is Lean, a proof assistant that represents proofs as code. However, the process of formalizing proofs in Lean can be slow and time-consuming. Our research explores neural theorem proving strategies, which aim to automate the generation of Lean proofs.
We propose a tree-based search framework to formalize mathematical theorems in Lean using Language Models. This approach explores potential proof steps as branches in a tree, using AI models to suggest “tactics” at each node. This has the benefit of avoiding hallucinations by rigorously checking that AI suggestions represent valid Lean code. We employ both Large Language Models such as Claude Sonnet 3.5 and specialized fine-tuned Small Language Models such as Lean-Dojo. We use Pantograph to interact with Lean, leveraging its native support of Monte Carlo tree search. We assemble a small set of simple and medium-difficulty mathematical theorems to benchmark against, called nanoF2F. Additionally, we benchmark our system on the well-established miniF2F benchmark created by OpenAI.
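The skeleton of such a search loop can be sketched in a few lines. In the toy version below, `suggest` stands in for the language model proposing tactics and `apply_tactic` stands in for Lean (via Pantograph) checking them; both are hard-coded stubs of ours, and the search shown is best-first rather than full Monte Carlo tree search:

```python
import heapq
from itertools import count

def suggest(goal):
    """Return (prior, tactic) pairs for a goal. A real system queries a
    language model; this toy table stands in for it."""
    table = {
        "a + b = b + a": [(0.9, "ring"), (0.1, "rfl")],
        "n + 0 = n": [(0.6, "simp"), (0.4, "rfl")],
    }
    return table.get(goal, [(1.0, "sorry")])

def apply_tactic(goal, tactic):
    """Return the remaining goals if `tactic` succeeds on `goal`, else None.
    A real system would have Lean check the tactic, ruling out hallucinated
    steps; these toy rules stand in for the kernel."""
    closing = {("a + b = b + a", "ring"), ("n + 0 = n", "simp")}
    return [] if (goal, tactic) in closing else None

def best_first_search(goals, budget=50):
    """Expand the most promising proof states first. A state is the list of
    open goals plus the tactic script so far; its score is the product of
    the priors of the tactics used."""
    tie = count()  # tiebreaker so the heap never compares goal lists
    heap = [(-1.0, next(tie), goals, [])]
    while heap and budget > 0:
        score, _, open_goals, script = heapq.heappop(heap)
        if not open_goals:
            return script  # every goal closed: a complete, checked proof
        budget -= 1
        goal, rest = open_goals[0], open_goals[1:]
        for prior, tactic in suggest(goal):
            result = apply_tactic(goal, tactic)
            if result is not None:  # only verified steps enter the tree
                heapq.heappush(heap, (score * prior, next(tie),
                                      result + rest, script + [tactic]))
    return None

print(best_first_search(["a + b = b + a"]))
```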
Project Level: Intermediate: students who have taken Math 300
Additional Course Requirements:
Programming Requirements:
Metaprogramming and writing new tactics
Faculty Mentor: Dhruv Bhatia
Project Description: The goal is to learn Lean’s metaprogramming framework for tactic writing and to experiment with writing our own tactics. The current task at hand is to write a tactic that can, given a set of hypotheses of the form a < b where a and b come from a partially ordered type, prove a goal of the same form by chaining together relevant hypotheses.
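Concretely, the tactic should automate proofs that one currently writes by chaining transitivity by hand; in Lean this looks like the following (the tactic name `chain_lt` is hypothetical):

```lean
import Mathlib

-- Today: chain the hypotheses manually with `lt_trans`.
example {α : Type*} [Preorder α] {a b c d : α}
    (h₁ : a < b) (h₂ : b < c) (h₃ : c < d) : a < d :=
  lt_trans (lt_trans h₁ h₂) h₃

-- Goal of the project: a tactic that finds the chain itself, so the
-- proof becomes `by chain_lt`.
```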
Project Level: Intermediate: students who have taken Math 300
Additional Course Requirements:
Programming Requirements: