Hamilton 环多项式算法、 时间复杂度及其详细证明 本文已在 arxiv 上公布。相对于以前,本次算法略有调整,但证明方法是全新的。以前的证明方法当然是逻辑严密的高超的和有效的,但需要 ”combine and split” ,这个非常难理解,也非常难实现,所以这次我换了一种全新的方法,实现起来要容易得多。当然,还是属相当难看懂的。能透彻看懂本算法及其证明的人,绝对在计算机专业算法方向实力极其强大!总之,本次的证明比我以前的 ”combine and split” 方法要容易理解得多。 说明: 1, 本文的一些表达,是直接借用了某审稿专家的表达方式,这是一个非常不错的审稿专家,在此表示由衷的感谢!明显看出了英语非我的母语,我英语不行,他将我文中许多不规范的英文表达,重新帮我表达了一次。当然,他说他还是有不理解的地方,并详细指出了哪些地方的意思他不理解,当然,我据此作了根本性修改。 2, 本文希望同行高人给予批评指导,交流合作。 3, 希望有专长的博主给予英语表达和 SCI 写作方面的纠正指点或合作。 Algorithm Definitions: For an undirected graph G with n vertices, let x_1, x_2, ... ,x_n denote the n vertices. A broad cycle is defined as a cyclic sequence x_1, x_2, ... ,x_n, x_1 of the vertices of G where every pair x_i, x_{i+1} may or may not be adjacent in G. We call a pair (x_i, x_{i+1}) (including (x_n, x_1)) of non-adjacent vertices a break point. So the number of break points is between 0 and n for a broad cycle. Apparently, a break point is constituted by two vertices(say vertex a and b). We use a*b to denote this break point. If two consecutive vertices a and b are adjacent, we call them “connecting point”, we use ab to denote this connecting point. We use a…b to denote that there are some vertices between a and b. For an undirected graph with n vertices, the number of all break points and connecting points is n(n-1)/2. A connecting point or a break point is also called “a general point”. A general point(a break point or a connecting point) is constituted by two vertices, we say that the general point contains these two vertices. A segment: any number of vertices sit side by side to shape a sequence, we call this sequence a “segment”. So, for a broad cycle, any part between two general points is a segment, and there are n(n-1) different segments for this broad cycle. For each general point, we construct a broad cycle which contains this general point. We call the broad cycle is the general point’s broad cycle, and the general point is the broad cycle’s general point. Because for an n vertices graph, there are n(n-1)/2 general points, we must construct n(n-1)/2 broad cycles(some of them may repeat). For a broad cycle, cut a segment at any place of this broad cycle, insert the segment between two consecutive vertices in some other place of the broad cycle. Before inserting, we may do the rotation and extension on the segment and then insert it. Note: for this rotation or extension, it is not necessary that the two end vertices are adjacent. We call this operation a “cut and insert” . The main operation in our algorithm is the “cut and insert”, now we explain it: Let x_1, ... ,x_n, x_1 denote a broad cycle, let s = x_i, x_{i+1}, ... ,x_{i+r} be a subsequence of this broad cycle(i.e., a segment), and let j be an index such that x_j and x_{j+1} are not in s. A broad cycle C is obtained by cutting s and inserting it into x_j and x_{j+1} if either C = x_j, x_i, x_{i+1}, ... ,x_{i+r}, x_{j+1}, x_{j+2}, ... ,x_n, x_1, ... ,x_{j-1}, x_{j}, or C = x_{j}, x_{i+r}, x_{i+r-1}, ... ,x_{i}, x_{j+1}, x_{j+2}, ..., x_n, x_1, ... ,x_{j-1}, x_{j} (addition is meant modulo n). Also, before inserting, we may do the rotation and extension on the segment and then insert it. Algorithm FindHCycle: For an undirected graph with n vertices and m edges, we first delete all the edges, then we get a graph with n vertices and no edges. This graph has n(n-1)/2 break points. For each break point(or connecting point), we construct a broad cycle which contains this break point(or connecting point). So, we construct n(n-1)/2 broad cycles(some of them may repeat). For each broad cycle, we try to get the least number of break points under the current edges. Apparently, at first, when no edges, all our n(n-1)/2 broad cycles fulfil the condition(least number of break points). Then, our algorithm includes m steps, we call each of the m steps “a big step”. At each big step, we add one edge. We call it “the current added edg e ”. Note, before adding this edge, all broad cycles are with the least number of break points, but after adding the edge, some broad cycles may not. If so, we try all the broad cycles for all possible one “cut and insert”, at least one “cut and insert” can get a new broad cycle which has the least number of break points(its old one does not, i.e., its break points number minus one). Apparently, this new broad cycle contains the new added edge. We let the new broad cycle replace its old broad cycle whose general point is the same as the new broad cycle’s. We call this a “little step”, each little step contains a successful cut and insert and a replacement. At each big step, we call all the broad cycles before adding the edge “old broad cycles”. If after the added edge, some general point’s broad cycle’s break points should minus one, if we can get this broad cycle with the least number of break points, we call it this general point’s “new broad cycle”. If a general point’s broad cycle, no matter how to arrange it, can always get a new broad cycle by a cut and insert, we call this general point “key general point”, call this broad cycle “key broad cycle”, and call this cut and insert “a useful cut and insert”. We call that the key broad cycle contains this useful cut and insert, call the key general point the useful cut and insert’s key general point, call the new broad cycle this useful cut and insert’s new broad cycle, call the new broad cycle’s general point the useful cut and insert’s new key general point. At each big step, when there will be some new broad cycles, we always can get at least one useful cut and insert to produce a new broad cycle. Then we use the new broad cycle to replace its old one. Then we can continue this process until there will be no new broad cycles. So continue to try the all possible one “cut and insert”, i.e., continue to finish the rest little steps, until we cannot get any broad cycle’s break points become less. This means all the broad cycles fulfil the condition(with the least number of break points) after adding the new edge. Go to the next big step, i.e., add another edge, do the same job as above. Until all the m edges have been added and finished. After all the m big steps, we can get all the n(n-1)/2 broad cycles with the least number of break points, of course, each broad cycle at least contains one different break point or one different connecting point. So, if there are some Hamilton Cycles, we can get at least one. Time Complexity: at each step, for one broad cycle, when we cut a segment, try to insert it between two consecutive vertices at any place of the rest part of this broad cycle, there are O(n) different points to be inserted, after the insert, we need to compare to decide if the result is a new broad cycle which contains a general point with the least number of break points under the new added edge(its old one does not, i.e., its break points number minus one). The comparison needs O(1) time(by memory). For each broad cycle, there are O(n 2 ) segments, for each segment, the rotation and extension need O(n) time, at each step there are O(n 2 ) broad cycles(even the new broad cycle replaces the old one, all possible new broad cycles are within O(n 2 )), also there are m big steps, so, all the time complexity is O(n 2 )* O(n 2 )* O(n)*O(n)* O(1)*m = O(n 6 )*m. Too long, right? But this is the worst case or the theoretical case, also it can get all the n(n-1)/2 broad cycles. In fact, by a lot of test, if an undirected graph contains at least one Hamilton cycle, our algorithm’s average time complexity for getting one Hamilton cycle is only about O(n 3 ). Some of the time in the O(n 6 )*m may not be necessary. For example, trying all the O(n 2 ) broad cycles is not necessary, we can make good choice. So is the O(n 2 ) segments. How to prove the algorithm? Our way is: to check every undirected graphs to find if it fulfil our algorithm and all the properties. But how to check “every”? we can see that for each cut and insert, only several vertices take action, other vertices do not. We can delete most vertices and only study the rest part with several vertices being left. First, we explain how to delete a vertex. 1) The deleting rule is: before deleting a vertex, all the broad cycles are with the least number of break points, after deleting, this property should hold and we do not need to rearrange all the broad cycles. 2) For the …abc…, after deleting vertex b, it becomes ac, even if vertex a is not adjacent to c, at this time, at this place, we assume there is a temporary edge between vertex a and c. 3) For the …a*b*c…, after deleting vertex b, it becomes a*c, even if vertex a is adjacent to c, at this time, at this place, we assume vertex a is not adjacent to c. 4) For the …a*bc…, after deleting vertex b, it becomes a*c, even if vertex a is adjacent to c, at this time, at this place, we assume vertex a is not adjacent to c. 5) If there are some contradictions, using this rule to solve the contradictions: some edges cannot occur at the same broad cycles at the same time. For example, for the …abc… and …dbe.., vertex a and c are not adjacent, nor d and e, after deleting vertex b, we get new edges ac and de, but these two edges cannot occur at the same broad cycles at the same time. Now we decide the vertex number for our testing graphs. When we do the “cut and insert”, if we do the rotation and extension on a segment, the segment needs at least 4 vertices, for the rest of the broad cycle, we need at least one new general point to insert the segment, so the rest of the broad cycle should contain at least 4 vertices. Both together 8 vertices. Also we should keep this broad cycle’s general point, 2 vertices. We should keep the current added edge, 2 vertices. So we must test all graphs with at least 12 vertices. Then we should add a vertex to each 12-vertex graph to see if the added vertex can destroy the algorithm. So we still must test all graphs with 13 vertices. Note: each graph is not only a simple graph, but is the rest part of a big graph after being deleted many vertices. In this graph, some edges cannot occur in the same broad cycle at the same time, we call this “a limitation”. For the test graphs, we should consider all possible combinations of the limitations. How to decide the limitations? At each big step, for the old broad cycles, we can consider any limitations of no contradictions, but no limitation on the current added edge, i.e., this edge is an absolute real edge. Also, when we do the cut and insert, we do not need to add any new limitations, because the added vertex can produce new limitations. Note: adding a vertex is the reverse operation of deleting a vertex. If after we add a vertex to the graph, a former useful cut and insert is no longer a useful cut and insert, we call that the added vertex destroys the useful cut and insert. We mainly have the following six kinds of destroys: Destroy 1: if the useful cut and insert is like this: we insert a…b into …fg…cd…, vertex a is adjacent to c, vertex b is adjacent to d, but when we add a vertex x, …fg…cxd…, if x is also adjacent to a or b, we still can insert a…b… between vertex cx or xd. At this time, we say vertex x does not destroy the useful cut and insert, If x is not adjacent to a and b, we say vertex x destroys the useful cut and insert. Destroy 2: if the useful cut and insert is like this: we insert a…b into …fg…cd…, vertex a is adjacent to c, vertex b is adjacent to d, but when we add a vertex x, …fxg…cd…, but now vertex a is not adjacent to c, only axc , thus, we say vertex x destroys the useful cut and insert. Destroy 3: if the useful cut and insert is like this: we insert a…b into …fg…cd…, vertex a is adjacent to c, vertex b may be adjacent to d or may not, after inserting, we get …fg…ca…b*d…, this is the general point b*d’s new broad cycle. When we add a vertex x, …fg…cxd…, we say vertex x destroys the useful cut and insert. But if this time we still can insert a…b between cx to get a new broad cycle of the general point b*x, at this time, we still say vertex x destroys the useful cut and insert. But it produce a new useful cut and insert. Destroy 4: usually one vertex destroys one useful cut and insert, but only in next case each of the two vertices can destroy a useful cut and insert: we insert a…b into …cdef…, vertex a is adjacent to d, vertex b is adjacent to f, vertex e is adjacent to c, thus, we say each of the vertices d and e destroys the useful cut and insert. Destroy 5: if the useful cut and insert is like this: we insert a…b into …f*g…cd…, vertex a is adjacent to c, vertex b is adjacent to d, but when we add a vertex x, …f*x*g…cd…, we still can insert a…b into cd, but this time the result is not with the least number of break points. So we say vertex x destroys the useful cut and insert. Destroy 6: if vertex a and b(or more) together destroy a useful cut and insert, but deleting any one of them cannot get this useful cut and insert, only deleting both of them we can get this useful cut and insert, then we do not think vertex a or b destroys this useful cut and insert. But, if after deleting vertex a, vertex b becomes destroying. In fact, the destroy 6 is not a destroy, we only give an explanation for this case. Note: only these six kinds of destroys, no others, because the test graph is the rest part of some big graph being deleted some vertices. Note: a destroy means: there is a real new broad cycle for this destroy. If there is no such a new broad cycle for this “destroy”, it is not a destroy. For a useful cut and insert, we call the broad cycle, which can produce a new broad cycle by this useful cut and insert, a parent broad cycle, call its general point a parent general point, call this new broad cycle a child broad cycle, call the child broad cycle’s general point a child general point. If a vertex destroys some useful cut and insert, we call it a destroying vertex. If a vertex does not destroy any useful cut and insert, we call it an un-destroying vertex. Our main proof idea is: we check that any graph with 12 or 13 or 14 vertices at each of its little steps has at least one useful cut and insert. Also, we check that for any graph with 12 or 13 or 14 vertices at each of its little steps, its un-destroying vertices is at least 2 more than its destroying vertices. Then we prove that any graph with more than 14 vertices also has this property by deduction on vertex number. So, for a 15-vertex graph, if we delete an un-destroying vertex, it becomes a 14-vertex graph, because each 14-vertex graph always has at least one useful cut and insert, so does the 15-vertex graph, and so on. Now we explain a very important concept “main child general point”. The main child general point has the following properties(we can check them): 1) For each 12-vertex graph at each of its little steps, the least number of break points of any broad cycle of its one main child general point must decrease by 1 after adding the current added edge. 2) Each 12-vertex graph at each of its little steps, has at least one main child general point and also has at least one useful cut and insert for this main child general point unless it finish all the little steps of the current big step. 3) Each 13-vertex graph at each of its little steps, has at least one main child general point and also has at least one useful cut and insert for this main child general point. When we delete any one vertex except the two vertices of the main child general point, the rest 12-vertex graph still has the same main child general point and also has at least one useful cut and insert for this main child general point. 4) For each 13-vertex graph at each of its little steps, let vertex a, b, c are three distinct vertices in this graph which do not belong to the two vertices of the main child general point(say g). We want to add one vertex d into this graph. If either we delete vertex a, add vertex d, or we delete b, add d, both cases the g is the 13-vertex graph’s main child general point, then, for these 14 vertices, deleting any one vertex except the two of the main child general point, g is always the rest 13-vertex graph’s main child general point. We call g the 14-vertex graph’s main child general point. If either we delete vertex a, add vertex d, or we delete b, add d, both cases the 13-vertex graph’s main child general point is (d,c), then, for these 14 vertices, deleting any one vertex except the two of the main child general point, (d,c) is always the rest 13-vertex graph’s main child general point. If we delete a, add d, g or (d,c) is the main child general point, but if we delete b, add d, g or (d,c) is not the main child general point, then, for these 14 vertices, deleting any one vertex except the two of the main child general point (d,a), (d,a) is always the rest 13-vertex graph’s main child general point. At this time, we call (d,a) the 14-vertex graph’s main child general point. 5) From 4), we can get the corollary: for any graph with more than 13 vertices, it also has the property 4). 6) For each 12 or 13 or 14 vertices graph, at each of its little steps, for one main child general point, the number of un-destroying vertices is at least 2 bigger than the number of destroying vertices. Also we check any 14-vertex graph has at least one useful cut and insert for the main child general point at any little step. Note: this destroying only is for such useful cut and inserts: they produce the new broad cycles whose general point is the main child general point. So is the un-destroying. Now we prove any graph with more than 14 vertices also has property 6). We prove this by deduction on vertex number. Proof: we suppose any k-vertex graph has property 6) (k=14), for a k+1-vertex graph, let vertices (a,b) is one main child general point at its one little step, let d is the number of its destroying vertices, let u is the number of its un-destroying vertices, both for this main child general point. If u is at least 2 bigger than d, it is ok. If not, we first suppose d=u. At this time, the worst case is: for each vertex of u, its all possible parent general point(i.e. added any one of the other vertices except the two of the main child general point, i.e., for all the possible parent broad cycles whose general points contain this vertex of u), are destroyed by 2 vertices of d. So, if we delete one vertex of u, the number of destroying vertices is d-2, the number of un-destroying vertices is u+1, i.e., we move 2 vertices from d to u. But we can delete these two vertices again, then, the number of destroying vertices is d-2(may be more, because for the case Destroy 6, deleting one vertex may let a vertex become a destroying vertex, but this does not affect our proof.), the number of un-destroying vertices is u-1, u-1-(d-2)=1, i.e., for the k-2-vertex graph, the number of un-destroying vertices is 1 bigger than the number of destroying vertices. A contradiction. We then suppose u=d+1. At this time, the worst case is: at least for one vertex of u, its all possible parent general point(i.e. added any one of the other vertices except the two of the main child general point), are destroyed by only 1 vertex of d. So, if we delete this vertex, the number of destroying vertices is d-1, the number of un-destroying vertices is u, i.e., we move 1 vertex from d to u. But we can delete this vertex again, then, the number of destroying vertices is d-1(may be more), the number of un-destroying vertices is u-1, u-1-(d-1)=1, i.e., for the (k-1)-vertex graph, the number of un-destroying vertices is 1 bigger than the number of destroying vertices. Also a contradiction. Other cases also have the same way contradictions. So we proved any graph with more than 14 vertices also has property 6), i.e. has a lot of vertices which do not destroy any useful cut and insert for the main child general point. Because any graph with 14 vertices has at least one useful cut and insert for the main child general point at each of its little steps, so does any graph with more than 14 vertices.
伯克利工业工程系的前辈David Gale教授-稳定婚姻问题 (stable matching or stable marriage problem ) 稳定婚姻,就是指男女结婚后,双方都不会发生出轨行为。那怎样才能做到双方都不出轨呢?如果双方都是对方的最爱,自然不会出轨;如果有一方或双方都不是对方的最爱,则必须保证想出轨的人找不到出轨的对象。简言之,只要满足“除妻子(丈夫)外,我爱的人不爱我,爱我的人我不爱”条件,就可形成稳定的婚姻。 稳定婚姻问题(Stable Marriage Problem) 已经被证明是一个 NP 问题, 在过去的几十年里很多人开始对这一课题产生兴趣,同时越来越多的 paper 对这一问题进行深入的研究。 最为著名的研究是由 美国数学家、加州大学伯克利工业工程系教授 David Gale和加州大学洛杉矶分校 Lloyd Shapley在1962年发明的稳定婚姻的寻优策略,人们称之为延迟认可算法(Gale-Shapley算法)。 先对所有男士进行落选标记,称其为自由男。当存在自由男时,进行以下操作: ① 每一位自由男在所有尚未拒绝她的女士中选择一位被他排名最优先的女士; ② 每一位女士将正在追求她的自由男与其当前男友进行比较,选择其中排名优先的男士作为其男友,即若自由男优于当前男友,则抛弃前男友;否则保留其男友,拒绝自由男。 ③ 若某男士被其女友抛弃,重新变成自由男。 在算法执行期间,自由男们主动出击,依次对最喜欢和次喜欢的女人求爱,一旦被接受,即失去自由身,进入订婚状态;而女人们则采取“守株待兔”和“喜新厌旧”策略,对前来求爱的男士进行选择:若该男子比未婚夫强,则悔婚,选择新的未婚夫;否则拒绝该男子的求婚。被女友抛弃的男人重获自由身,重新拥有了追求女人的权利——当然,新的追求对象比不过前女友。 这样,在算法执行期间,每个人都有可能订婚多次——也有可能一开始就找到了自己的最爱,从一而终——每订一次婚,女人们的选择就会更有利,而男人们的品味则越来越差。只要男女生的数量相等,则经过多轮求婚,订婚,悔婚和再订婚之后,每位男女最终都会找到合适的伴侣——虽然不一定是自己的最爱(男人没能追到自己的最爱,或女人没有等到自己的最爱来追求),但绝对不会出现“虽然彼此相爱,却不能在一起”的悲剧,所有人都会组成稳定的婚姻。 Gale-Shapley 算法最大的意义就在于,作为一个为这些男女牵线的媒人,你并不需要亲自计算稳定婚姻匹配,甚至根本不需要了解每个人的偏好,只需要按照这个算法组织一个男女配对活动就可以了。你需要做的仅仅是把算法流程当作游戏规则告诉大家,游戏结束后会自动得到一个大家都满意的婚姻匹配。对于男性来说,从最喜欢的女孩儿开始追起是顺理成章的事;对于女性来说,不断选择最好的男人也正好符合她的利益。因此,大家会自动遵守游戏规则,不用担心有人虚报自己的偏好。 历史上,这样的“配对游戏”还真有过实际应用,并且更有意思的是,这个算法的应用居然比算法本身的提出还早 10 年。早在 1952 年,美国就开始用这种办法给医学院的学生安排工作,这被称之为“全国住院医师配对项目”。配对的基本流程就是,各医院从尚未拒绝这一职位的医学院学生中选出最佳人选并发送聘用通知,当学生收到来自各医院的聘用通知后,系统会根据他所填写的意愿表自动将其分配到意愿最高的职位,并拒绝掉其它的职位。如此反复,直到每个学生都分配到了工作。当然,那时人们并不知道这样的流程可以保证工作分配的稳定性,只是凭直觉认为这是很合理的。直到 10 年之后, Gale 和 Shapley 才系统地研究了这个流程,提出了稳定婚姻问题,并证明了这个算法的正确性。 伯克利官方资料介绍伯克利工业工程系的前辈David Gale教授 David Gale joined Berkeley in 1966 and held appointments in the departments of IEOR, Mathematics, and Economics. His work was instrumental to the development of game theory, and linear and non-linear programming. Gale’s unorthodox but influential research paper describes an “algorithmic procedure to find… a stable marriage.” Their findings from this paper have been expanded and used to match students to schools。 Gale published an influential paper on the so-called stable matching or stable marriage problem with Lloyd S. Shapley, now professor emeritus of mathematics and economics at the University of California, Los Angeles. Gale and Shapley proved that, given an equal number of men and women who rank one another in terms of romantic interest, it is possible so to pair them in marriage that no two people would rather be married to each other than to their partners. They also provided an algorithmic procedure to find such a stable marriage. Gale and Shapley’s theoretical work on this problem legitimized a procedure used to match newly graduated doctors to hospital residency programs. The paper generated an enormous literature, and variations of the matching algorithm are used to match students to schools.