Optimization for Data Science

This project explores advanced optimization techniques for solving minimum cost flow problems with quadratic, convex, and separable objective functions. The work focuses on implementing and comparing different algorithmic approaches to network flow optimization.

The core problem addresses the challenge of finding optimal flow allocation in directed networks where transportation costs follow a quadratic relationship. Mathematically, the problem is formulated as minimizing xTQx + qx subject to flow conservation constraints Ex = b and capacity constraints 0 ≤ x ≤ u, where Q is a positive semidefinite diagonal matrix representing quadratic costs, E is the node-arc incidence matrix, and b represents supply/demand at each node.

This problem has practical applications in logistics, resource allocation, and network design, where costs often increase non-linearly with flow volumes. Our investigation implements three distinct solution approaches: a Lagrangian dual-based algorithm with smoothing, Nesterov's Fast Gradient Method with dynamic parameters, and the CVXPY optimization library with the SCS solver.

This project provided insights into the practical challenges of convex optimization and the significant gap between theoretical guarantees and real-world performance. I learned that the smoothing parameter μ plays a critical role in algorithm convergence, too small leads to numerical instability and tiny step sizes, while too large distorts the original problem excessively.

The implementation revealed that dynamic parameter adjustment dramatically improves performance over fixed parameters. Our variant with dynamic μ achieved convergence in approximately 1,676 iterations on small matrices compared to 28,784 iterations with constant parameters. However, accessing the optimal value a priori provides substantial advantages, reducing relative errors from 4.1 to 0.082 on complex instances.

I discovered the importance of problem conditioning in optimization algorithms. Fast Gradient methods performed well on well-conditioned problems but struggled with ill-conditioned matrices, while the SCS solver maintained consistent performance across all test cases. The Lagrangian relaxation approach proved powerful for decomposing the problem into parallelizable sub-problems, exploiting the separable structure of the objective function.

Perhaps most importantly, I learned that specialized solvers like SCS, built on years of research and optimization, significantly outperform custom implementations in terms of both speed (25-2925 iterations vs. 150,000) and robustness. This underscored the value of understanding when to build custom solutions versus using existing tools, and highlighted the challenges of translating theoretical convergence rates into practical algorithms.