Class of 2019 - Information Engineering - Mathematical Modeling - On My International Outstanding Winner Award and Some Reflections

Foreword: I am someone who comes across as very low-key in daily life, but who is willing to work hard quietly. Three years at Shanghai University have flown by like a white colt passing a crevice, like a spark from flint. Even so, thanks to my persistence and effort, I think I may be qualified to share, in this SHUfly community full of talented people, some personal insights from undergraduate mathematical modeling contests. Consider this a starting point for discussion; corrections are welcome.

The core of this article is the Mathematical Contest in Modeling / Interdisciplinary Contest in Modeling (MCM/ICM), with some other content mixed in (studies, daily life, and so on), because I hope to present a fairly complete picture for readers. If readers are interested in other topics, I am also happy to share more useful experiences and guides. Let us encourage each other.

My background

  • Class of 2019, Shanghai University undergraduate, Computer Engineering
  • From Jiangxi
  • Undergraduate GPA: 3.8/4.0 (4.5/5.0); best ranking: 3rd out of 200+ in the school
  • Undergraduate awards: Undergraduate President's Scholarship*, National Scholarship, MCM/ICM Outstanding Winner (international top award, as team leader), Silver Award in the China College Students' "Internet+" Innovation and Entrepreneurship Competition (with one patent), one national-level undergraduate innovation project, national First Prize in the Asia-Pacific Mathematical Contest in Modeling, top-grade academic scholarship, and others, for a total of 6 international/provincial competition awards and 13 national/university-level scholarships
  • Graduate plan: gave up the domestic postgraduate recommendation to study abroad
  • Graduate school: Alliance Sorbonne Université (Sorbonne) - Artificial Intelligence and Data Science
  • Languages: Chinese, English, French
  • Current direction: Operations Research Engineer

Contents

  • Preface
  • Introduction of MCM/ICM
  • Preparation for the contest

Preface

Introduction of MCM/ICM

MCM is the Mathematical Contest in Modeling, and ICM is the Interdisciplinary Contest in Modeling. The sponsors of MCM/ICM are several well-known organizations: MAA, SIAM, INFORMS, ASA, AMS, and Two Sigma.

COMAP’s Mathematical Contest in Modeling (MCM) and Interdisciplinary Contest in Modeling (ICM) are international contests for high school students and college undergraduates. The contests challenge teams of up to three students to analyze, model, solve, and present solution reports to an open-ended application problem. In these contests research, analytics, and applied intelligence reign along with less-quantifiable factors like timing and luck.

Contest teams of up to three students address one of the following six problem choices over the contest weekend:

  1. MCM Problem A (continuous)
  2. MCM Problem B (discrete) (our team chose it for the discrete optimization and simulation)
  3. MCM Problem C (data insights)
  4. ICM Problem D (operations research/network science)
  5. ICM Problem E (sustainability)
  6. ICM Problem F (policy)

Duration of MCM/ICM: about 96 hours (4 days). Deliverables: an academic paper (20-25 pages), plus code (possibly).

The 2022 MCM/ICM results (Outstanding Winners only):

| Problem | Total number of teams | Number of Outstanding Winners | Rate  |
|---------|-----------------------|-------------------------------|-------|
| A       | 2883                  | 7                             | 0.24% |
| B       | 2179                  | 6                             | 0.27% |
| C       | 10043                 | 10                            | 0.10% |
| D       | 955                   | 4                             | 0.42% |
| E       | 8181                  | 10                            | 0.12% |
| F       | 2964                  | 7                             | 0.24% |

Preparation for the contest

Team and teammates

  1. Common characteristics of the three teammates

    • Good understanding and application of basic mathematical knowledge ((partial) differential equations, probability and statistics, matrix operations, numerical methods, game theory, graph theory, optimization theory, machine learning, data mining, etc.)
    • Fast learning, self-learning, and an attitude of constant improvement
    • Motivation, ambition
    • Teamwork and team communication (project experience)
    • Ability to read and draft/write papers (research experience), and a good level of English (CET-6 520+). Research experience is very important!
    • An excellent leader (one is enough)
  2. Combinations of teammates (majors). Here are some combinations of majors that I recommend:

    • Mathematics (applied) + Big Data (Python or Matlab) + Other
    • Mathematics (applied) + Finance + Other (Python or Matlab)
    • Mathematics (applied) + Management + Other (Python or Matlab)
    • Mathematics (applied) + Communication Engineering (Python or Matlab) + Other. Note that most students in CS (computer science) are excellent at C++, Java, HTML, etc., but may not be good at data processing and analysis. You have to check this. : )
  3. A schedule to evaluate and improve your team

    1. If you already have some experience with these contests, I recommend finding your teammates and building your team before the summer vacation begins (July); a small mathematical contest in modeling in August can then serve as a first test and evaluation of your team's modeling abilities.
    2. If you also take part in the Contemporary Undergraduate Mathematical Contest in Modeling (CUMCM) in September, evaluate your teammates during and after CUMCM with the next collaboration in mind.
    3. It is strongly recommended that you deliberately organize 1 to 5 mock contests, in which you solve a past MCM/ICM problem over a weekend or during a vacation (within 96 hours).
    4. After each mock contest, submit the paper to your supervisor/tutor to obtain feedback, and revise and refine your preparation plan accordingly.
    5. Earlier, I mentioned the ability to read papers. That is because it was one of the first things I asked my team to do: set a time limit (a week, say) to read a well-known paper, dissect it (goals, methods, innovations, structure of the paper, etc.), and present it to the teammates.

Knowledge in Optimization

Here I would like to share part of my preparation notes, which took me a lot of time to translate into English and type up in Markdown (the original notes run to more than 10k Chinese characters). Ah, the suffering.

Programming Models

  1. Linear programming model
    1. Decision variables, slack variables, surplus variables
    2. [x, fval] = linprog(c, A, b, Aeq, beq, VLB, VUB, x0)
  2. Graphical method (unreliable)
  3. (Mixed) Integer Programming Models
    1. Extreme point test (basically useless because there are too many variables)
    2. Pruning / branch-and-bound method (narrow the feasible domain according to the value of the objective function); the relaxation problem needs to be determined
    3. 0-1 integer programming
      1. [x, fval] = intlinprog(f, intcon, A, b, Aeq, beq, VLB, VUB)
      2. intcon ~ the indices of the variables whose domain is restricted to integers
  4. Assignment problems (assigning different people to different tasks): can be solved by searching over permutations (a SciPy sketch covering the linear programming model above and this assignment problem appears after this list)
    1. Hungarian assignment method (HA) (complexity O(n^3))
    2. Ones Assignment Method (OAM)
  5. Goal Programming Model
    1. Sequential algorithm (decompose the goal programming problem into a series of single-objective programming problems according to their priority order, then solve them in sequence)
      1. Set the deviation variables (difference between actual value and target value)
      2. Unified processing of goals and constraints (hard constraints, soft constraints)
      3. Set the priority and weight coefficient of each target (set the priority first, then set the weight)
    2. Multi-objective programming
      1. Pareto efficient solution method
        1. Calculate the respective optimal solutions for different objective functions
        2. Construct high-dimensional rectangles from these solutions and divide them equally, and the corresponding objective function values are their weights.
        3. [x fval] = fgoalattain(fun, x0, goal, weight, A, b, Aeq, beq, lb, ub, nonlcon)
          1. weight~ the weight of multiple objective functions (need to explain)
          2. nonlcon ~ constraint function [c; ceq]
      2. Priority method
      3. Constrained Model Approach
        1. For part of the objective function, we only need it to be less than a reasonable value
        2. Sensitivity analysis of linear models
        3. Parametric Linear Programming
          1. Linear programming problem with parameters in the objective function (simplex method)
          2. Constrained constants contain parameters of linear programming problems
        4. Stochastic Linear Programming Model
          1. Expected value model method (do not know the variance, and the probability of being the optimal solution is not high)
            1. Convexity Theorem (if the formula is convex, the expectation is also convex)
          2. Chance Constraint Model Approach
            1. Equivalent properties
            2. Use the upper quantile fraction to simplify and solve
            3. Alpha reliability linear programming
            4. Disadvantage: the equivalent form of this formulation is not fixed and is not necessarily a linear program, so the simplex method cannot always be used to find the optimal solution
  6. Dual programming problem
    1. The right-hand sides of the constraints and the coefficients of the objective function are all random variables
    2. Process the constraints, take the dual, then use the distribution law and the upper quantile, and finally return to the dual form
    3. Deterministic programming of normal stochastic programming and its sensitivity analysis
      1. Reliable linear programming of normal stochastic programming
      2. Sensitivity analysis of normal stochastic programming
        1. Estimate the effect of the bias on the optimal basis
        2. Estimate the effect of the bias on the optimal value
        3. Estimate the effect of the bias on the optimal solution
  7. Two-stage stochastic programming model with simple recourse
    1. The deterministic equivalent of the model
  8. Dependent-chance programming model approach
  9. Nonlinear programming model
    1. General form (it is difficult to guarantee a global optimal solution; if the objective is a convex function, a local optimum must be the global optimum)
    2. x = fmincon(fun, x0, A, b, Aeq, beq, VLB, VUB, nonlcon, options)
    3. Quadratic programming
      1. [x, fval]=quadprog(H, f, A, b, Aeq, beq, VLB, VUB, x0, options)
    4. MATLAB solution of constrained extreme values
      1. Optimizing a univariate function
        1. [x, fval]=fminbnd('fun', x1, x2, options)
        2. fminimax()
        3. x = fseminf('fun', x0, ntheta, seminfcon, A, b, Aeq, beq, lb, ub)
        4. seminfcon ~ semi-infinite constraint function that constrains c(x) and ceq(x) through semi-infinite constraint variables and functions
        5. ntheta ~ the number of semi-infinite constraints
      2. optimtool toolbox interface operation
    5. (External) penalty function method - Sequential Unconstrained Minimization Technique (SUMT) (constrained problems are converted into a series of unconstrained problems)
    6. Sequential Quadratic Programming (SQP) method (approximate the objective function with a quadratic model at each iteration, and approximate the general constraints with linear constraints); see the SciPy SLSQP sketch after this list
  10. Dynamic programming (solving multi-stage decision optimization problems)
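
To make items 1 and 4 above concrete, here is a minimal Python/SciPy sketch on made-up toy numbers; it only illustrates the standard solver calls (the SciPy counterparts of MATLAB's linprog and the Hungarian method) and is not our contest code.

```python
# Toy linear program and assignment problem, solved with SciPy.
import numpy as np
from scipy.optimize import linprog, linear_sum_assignment

# --- Linear programming: maximize 3*x1 + 5*x2 subject to
#     x1 <= 4, 2*x2 <= 12, 3*x1 + 2*x2 <= 18, x >= 0 ---
c = np.array([-3.0, -5.0])                 # linprog minimizes, so negate the objective
A_ub = np.array([[1.0, 0.0],
                 [0.0, 2.0],
                 [3.0, 2.0]])
b_ub = np.array([4.0, 12.0, 18.0])
lp = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
print("LP optimum:", lp.x, "objective:", -lp.fun)   # (2, 6), objective 36

# --- Assignment problem: Hungarian method, O(n^3) ---
cost = np.array([[4, 1, 3],
                 [2, 0, 5],
                 [3, 2, 2]])               # cost[i, j] = cost of person i doing task j
rows, cols = linear_sum_assignment(cost)
print("assignment:", list(zip(rows, cols)), "total cost:", cost[rows, cols].sum())
```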
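For the nonlinear entries in item 9 (fmincon, quadprog, SQP), a rough Python counterpart is scipy.optimize.minimize with the SLSQP (Sequential Least Squares Programming) method. The quadratic objective and linear constraints below are a made-up toy problem, not anything from the contest.

```python
# Toy constrained quadratic program solved with SciPy's SLSQP (an SQP-type method).
import numpy as np
from scipy.optimize import minimize

def f(x):
    # Quadratic objective: (x1 - 1)^2 + (x2 - 2.5)^2
    return (x[0] - 1.0) ** 2 + (x[1] - 2.5) ** 2

# Linear inequality constraints, written as g(x) >= 0 (SciPy's convention)
constraints = [
    {"type": "ineq", "fun": lambda x:  x[0] - 2 * x[1] + 2},
    {"type": "ineq", "fun": lambda x: -x[0] - 2 * x[1] + 6},
    {"type": "ineq", "fun": lambda x: -x[0] + 2 * x[1] + 2},
]
bounds = [(0, None), (0, None)]            # x1 >= 0, x2 >= 0

res = minimize(f, x0=np.array([2.0, 0.0]), method="SLSQP",
               bounds=bounds, constraints=constraints)
print("optimum:", res.x, "objective:", res.fun)
```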

Optimization Theory

  1. ADMM optimization algorithm / dual method / step-by-step optimization algorithm / dynamic optimization algorithm
  2. Extreme value search calculation of multivariate functions (multivariate continuous optimization problem)
    1. Steepest descent method (negative gradient direction), Newton's method (uses the inverse of the Hessian), quasi-Newton methods (approximate the Hessian matrix) (a small Python illustration of these unconstrained methods follows this list)
    2. Midpoint-reflection simplex algorithm (works best when the dimension of the data is similar to the number of data points) (better than the Nelder-Mead algorithm)
    3. Nelder-Mead algorithm (unconstrained problems, relatively complicated computation, slow convergence) (inferior to the midpoint-reflection algorithm)
      1. The initial value of the variable (possible maximum point)
      2. Simplex vertices (high-dimensional regular polygons passing through the origin)
        1. Calculate the midpoint M
        2. Reflection
        3. Extension
        4. Compression (Contract)
        5. Shrink
        6. Choice of initial value
      3. Multivariate direction-set algorithms
        1. Gradient algorithm
        2. Steepest descent algorithm
        3. Taxicab (coordinate-descent) algorithm (the optimization path is more tortuous)
        4. Powell's algorithm (a direction-set search; the optimization path is more tortuous)
        5. Improved three-direction-set algorithm under Nelder-Mead (high accuracy, simple computation, fast convergence)
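
As a small illustration of items 2.1-2.3 above (steepest descent vs. a derivative-free simplex-type method), here is a toy Python comparison on the 2-D Rosenbrock function; the step size and iteration count are arbitrary choices for the example, not tuned recommendations.

```python
# Hand-rolled steepest descent vs. SciPy's derivative-free Nelder-Mead,
# both started from the classic point (-1.2, 1.0) on the Rosenbrock function.
import numpy as np
from scipy.optimize import minimize, rosen

def rosen_grad(x):
    # Gradient of f(x) = (1 - x0)^2 + 100 * (x1 - x0^2)^2
    return np.array([-2 * (1 - x[0]) - 400 * x[0] * (x[1] - x[0] ** 2),
                     200 * (x[1] - x[0] ** 2)])

def steepest_descent(x0, lr=5e-4, steps=50_000):
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        x -= lr * rosen_grad(x)            # move along the negative gradient direction
    return x

x0 = [-1.2, 1.0]
print("steepest descent:", steepest_descent(x0))                    # crawls toward (1, 1)
print("Nelder-Mead     :", minimize(rosen, x0, method="Nelder-Mead").x)
```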

Equation Solving Algorithms (Finding Zeros)

  1. Ridders' algorithm (see the Python toy example after this list)
  2. Inverse quadratic interpolation and false-position algorithms (usually no more than three times)
  3. Tikhonov regularization, i.e. ridge regression (adds a quadratic penalty term)
    1. The noise is white Gaussian noise, so least squares applies
    2. Output: a smooth image
  4. The LASSO problem and its variants (make the variable x as sparse as possible; feature selection on x)
  5. Matrix separation (robust PCA) (separate the matrix M into a low-rank matrix X and a sparse matrix S; remove the noise in the original data as much as possible and find the best projection of the data onto a low-dimensional space)
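
A toy Python illustration of two entries above: Ridders' root-finding method (item 1) and the contrast between Tikhonov/ridge regression and LASSO (items 3-4). The test function and the synthetic data set are made up for the example.

```python
# Ridders' method for a bracketed root, then ridge vs. LASSO on synthetic data.
import numpy as np
from scipy.optimize import ridder
from sklearn.linear_model import Ridge, Lasso

# --- Root of f(x) = x^3 - 2x - 5 on the bracket [1, 3] ---
root = ridder(lambda x: x ** 3 - 2 * x - 5, 1.0, 3.0)
print("root:", root)

# --- Synthetic regression: only the first two of ten features matter ---
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
y = 3 * X[:, 0] - 2 * X[:, 1] + 0.1 * rng.normal(size=100)        # white Gaussian noise
print("ridge coef:", Ridge(alpha=1.0).fit(X, y).coef_.round(2))   # others shrunk, but nonzero
print("lasso coef:", Lasso(alpha=0.1).fit(X, y).coef_.round(2))   # sparse: most exactly 0
```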

Graphs and Networks

  1. Shortest path
    1. (undirected graph) adjacency matrix (symmetric matrix, degree = row sum) (when the number of edges is much smaller than the square of the number of vertices, the adjacency-matrix representation wastes a lot of space)
    2. Dijkstra's algorithm (generates shortest paths in order of increasing path length; a priority-queue optimization is used for sparse graphs)
    3. Floyd's algorithm (also applicable when some edge weights are negative)
  2. Minimum spanning tree problem
  3. Hamiltonian graphs (the "one-stroke" drawing problem)
  4. Matching and Covering
  5. Network maximum flow problem (e.g., a petroleum pipeline); a NetworkX sketch of the shortest-path and max-flow items follows this list
    1. Capacity constraints
    2. Conservation (equilibrium) conditions
    3. Augmenting chains, saturated arcs, unsaturated arcs, zero-flow arcs, non-zero-flow arcs
    4. Ford-Fulkerson labeling algorithm (find augmenting chains step by step and adjust the flow)
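
Below is a minimal NetworkX sketch of the shortest-path and maximum-flow items above, on a made-up graph; it only demonstrates the library calls, not the network model we built in the contest.

```python
# Dijkstra shortest path on a small weighted graph, then max flow on a tiny pipeline.
import networkx as nx

G = nx.Graph()
G.add_weighted_edges_from([("A", "B", 4), ("A", "C", 2), ("C", "B", 1),
                           ("B", "D", 5), ("C", "D", 8)])
print("shortest path :", nx.dijkstra_path(G, "A", "D"))
print("path length   :", nx.dijkstra_path_length(G, "A", "D"))

# Maximum flow needs a directed graph with 'capacity' attributes on the arcs.
D = nx.DiGraph()
D.add_edge("s", "a", capacity=3.0)
D.add_edge("s", "b", capacity=2.0)
D.add_edge("a", "b", capacity=1.0)
D.add_edge("a", "t", capacity=2.0)
D.add_edge("b", "t", capacity=3.0)
flow_value, flow_dict = nx.maximum_flow(D, "s", "t")
print("max flow value:", flow_value)
```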

Discrete model

  1. Cellular analysis, cellular automata (CA) (discrete states, spatial distribution with some regularity, local rules, time evolution) (a tiny Python toy follows this list)
    1. State transition function
    2. Cell movement traversal algorithm
    3. Lattice-gas automata: the FHP (Frisch-Hasslacher-Pomeau) model based on a hexagonal lattice
      1. Reversible cellular automata model
      2. Margolus model (2×2 block partitioning of the grid, sliding diagonally)
  2. Time Series Analysis - (essentially a difference equation, Z transform)
    1. Major change possibilities
      1. Long-term trend changes
      2. Seasonal changes
      3. Cyclic changes
      4. Irregular changes
    2. Identify the main types of time series
      1. Additive model
      2. Multiplicative Model
      3. Mixed models
    3. AR model (Autoregressive Model) (the weights are found automatically)
      1. Advantages
        1. The technology is simple, and the parameters are interpretable and selectable
        2. All data is used
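
To make the cellular-automaton idea in item 1 concrete (discrete states, a local transition rule, synchronous time evolution), here is a tiny, self-contained toy: a one-dimensional elementary cellular automaton (Wolfram rule 90). It is only an illustration and is not taken from my notes.

```python
# Elementary 1-D cellular automaton: each cell looks at (left, self, right),
# and the new state is read off from the bits of the rule number.
import numpy as np

def step(cells, rule=90):
    """One synchronous update with periodic boundary conditions."""
    left, right = np.roll(cells, 1), np.roll(cells, -1)
    idx = 4 * left + 2 * cells + right     # encode each neighbourhood as 0..7
    return (rule >> idx) & 1               # look up the new state in the rule's bits

cells = np.zeros(31, dtype=int)
cells[15] = 1                              # start from a single live cell
for _ in range(12):
    print("".join("#" if c else "." for c in cells))
    cells = step(cells)                    # time evolution, one generation per line
```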

About two thirds of the notes are still not typed up; I will keep updating slowly. : )

Writing skills

In general, there are several common structures for these papers. Here is a useful example (the outline of my team's Outstanding Winner paper):

  1. Summary
  2. Contents
  3. Introduction
    1. Background
    2. Problem Restatement and Analysis
    3. Overview of our work
  4. Assumptions
  5. List of Notation
  6. A Resource Allocation Network for the Colorado River
    1. Solution for the Allocation Network: A Pseudo-Code
    2. Results and Analysis
  7. Marginal Utility Model : Resources Interests and Valuation of Each State
  8. Auction Theory for Competing Interests : Independent Private Value (IPV) Model
    1. Solution of the Independent Private Value (IPV) Model
    2. Results and Analysis
  9. Sensitivity Analysis - Our Model Dealing with More Complex Situations
    1. Growth or Shrinkage in Affected Areas
    2. Increase of the Proportion of Renewable Energy Technologies
    3. Water and Electricity Conservation Measures
  10. Discussion
    1. Strengths and Weaknesses
    2. Future Work
  11. Conclusion
  12. References

Even after being skimmed for just a few seconds, this Outstanding Winner paper already shows a very clear model structure with an appropriate level of detail.

Rambling thoughts

This part is rather fragmented and has no particular logic; I simply write down whatever comes to mind. It is not about knowledge, mainly some thoughts and observations.

Thanks to auditing a course taught by an optimization professor in 2021, Selected Lectures on Optimization Theory (最优化理论选讲), I got my start in the field of operations research and optimization. Through that course I came into contact with the optimization-algorithm problems and solutions of roughly the last 10-20 years, a field broader in applicability than numerical mathematics and stronger in interpretability than machine learning / deep learning. I also read the book the lecturer recommended, Optimization: Modeling, Algorithms and Theory (最优化:建模、算法与理论/最优化计算方法), which is the main reference textbook for the optimization course taken by Peking University students. After that, I grounded myself in this field and went on to learn more algorithmic topics: simple heuristics, machine learning of medium difficulty, and then deep learning a step further. Looking at this path as a whole, I actually started from the underlying mathematics, gradually moved up to the level of understanding algorithms, and finally reached the application level.

Therefore, as the team leader, I decided very early on the range of problems we would choose in the contest, namely (discrete) operations research and optimization. This was in fact one of the main reasons our team won the Outstanding award. Thanks to the efforts of myself and "Nan-shen" (a nickname), a 2019-cohort mathematics student with a very solid mathematical foundation, the mathematical parts of our model (formulas, derivations, solution methods) were able to stand out among 20000+ submitted papers. This was a very distinctive advantage of ours. Taking the first sub-question of the 2022 MCM Problem B as an example, it was precisely our distinctive and deep understanding of the mathematics behind the model that allowed us to reduce the extremely complex water-supply and power-generation model, under discrete conditions, to a linear model. This also contributed greatly to the efficiency of our later model simulations.

If you have read our Outstanding paper, I am sure you noticed that we built on auction theory from economics (the Nobel Prize in Economics around 2018, if I remember correctly) and, through mathematical derivation and programmatic solution, obtained the optimal market-selling scheme. We also simulated the bidding market to verify the optimality and stability of our theory. A key point here is being able to understand auction theory and apply it to the water-allocation problem in the Colorado River system in the United States (of course, this is also tied to the American market economy; Chinese readers may find it hard to understand why water allocation can be auctioned at all). If you have read research papers on optimal bidding strategies in auction theory, you know how hard the mathematics in them is. So I have to put it bluntly here: if your applied-mathematics ability is not strong, you will not get through, or get far along, this road.

In fact, paper writing involves a great deal of technique, and the same holds for doing research and publishing. There are far too many details; if I had the time, I could talk about it non-stop for a whole class. I remember once discussing research and academia with an expert in deep-learning hardware (I think he was a vice dean of the School of Mechatronic Engineering and Automation, I do not remember exactly). His final assessment was that my research thinking is good and agile, that I am the kind of person well suited to a PhD, and he strongly recommended that I continue toward a PhD in some field. (Of course, I have my own ideas about my immediate plans, which I will not go into here.) So what I want to say is: you should have research experience; and even if you do not, when writing your paper please always remember that the people reading it are researchers. A small advertisement: if you are interested in this topic, you can find me through the SHUFly community QQ group :), but please be polite when asking someone for help.

Some Q&A that may be useful to you (taken from one of my public accounts):

I.

  • Q: Hi, I would like to ask how you studied optimization.
  • A: In the spring of my sophomore year I took an introductory graduate-level optimization course that covered the algorithms of the last decade; it was difficult but very interesting (Peking University seems to offer the same course, with the same textbook, slides, and code). I also took Tsinghua's open course on data mining. Then, during last year's summer vacation, I read a great many books on optimization (for example on stochastic-parameter optimization models), implemented with MATLAB and Lingo. At that time I also looked at some machine-learning algorithms, such as t-SNE and UMAP, advanced methods rarely mentioned in undergraduate courses. In the second half of last year I read the book Optimization: Modeling, Algorithms and Theory (最优化:建模、算法与理论), which is quite good. I spent a lot of time on all this, and not only for the modeling contests. During this year's winter vacation I learned deep learning and ran models in Python. There are probably other experiences I have forgotten.
  • Q: If it is convenient, could you share the corresponding courses (the name of the optimization course, and the algorithm and deep-learning ones)? Many thanks!
  • A: Optimization theory: this is an in-person course offered by our own university; I heard that a version for undergraduates has also been opened recently, probably called Selected Lectures on Optimization Theory (最优化理论选讲). Deep learning: there are plenty of resources; you surely know the "watermelon book" and the like, and I would additionally recommend Dive into Deep Learning by Mu Li and his collaborators (it is open source, and there are videos on Bilibili). Data mining: the Data Mining course on Bilibili (I think that is the one; the difficulty is introductory).
  • Q: Thank you so much!!

II.

  • Q: Wow, I just read this Outstanding paper. I could follow the earlier parts, and it felt very pleasant to read, but the later parts were not so clear to me. Still, it is really impressive~
  • A: Hahaha, thank you for reading our paper. During the contest our team actually simplified a lot; otherwise the paper would have been even longer TAT. The algorithms are indeed not simple, because they involve some quite theoretical material (research output from roughly the last 10 years, I believe), and the code also took us a long time back then [facepalm]. Besides, since the reviewers are professors, we assumed they understand the basics of the field; but it is true that for students who have not studied this area it is still hard to follow.

III.

  • Q: We also chose Problem B (no intention of riding on your success). Our model was basically the same as yours (judging from the constraints, the only differences were probably the parameter names and the sub/superscripts), and we also had some innovative methods, such as multi-objective optimization and benefit quantification. I think our figures were decent too, but in the end we still got an S. Sob sob sob.
  • A: Hi there. Here is my understanding: many factors go into the award, for example the summary, the writing, the logic, the depth of research into the problem, the formatting, the reasonableness of the model, the analysis of data and conclusions, and so on (in no particular order). (You might also reflect: how novel is your method? Are there logical or common-sense errors? Did you verify its effectiveness? Is it reliable? Has it been studied before? Which prior work did you reference? Is the presentation of the model easy to understand? And so on.) The screenshot I put in my answer is only the first sub-question, which is also the easiest part; a linear model in difference form is nothing to be particularly innovative about... Besides, everyone's method for the first sub-question is more or less the same [laughing through tears]. What mainly matters is what the highlights of your paper are.
  • A: I also took part in this contest last year, with the same method as an Outstanding winner (Google's PageRank algorithm), but after internal analysis and reflection we realized that we really did many things poorly at the time: our team had only scratched the surface and had never understood or studied the problem in depth, and in the end we did not get an F/O either. That is how the MCM/ICM is: before we have read truly strong papers, it is easy to sink into the mire of over-approving of our own work. A small setback is nothing; calm down and think carefully about the reasons behind it, and that is the key.
  • Q: Sob, thank you for replying so seriously. I just feel we were a bit unlucky, hahaha, and that is how I comfort myself, since we did prepare for almost a year after all. There are many commonly used methods that we never studied systematically; we only focused on programming/optimization-type problems, which may have been a bit limiting [facepalm][facepalm][facepalm]. We are now discussing with our advisor, hoping to win one back in the national contest (CUMCM) first, since CUMCM feels more controllable.

(To be continued)
