As for sentiment extraction itself, there are different layers:

1. Sentiment classification: thumbs-up or thumbs-down (optionally plus neutral).
2. Sentiment association: associating a sentiment with a topic or brand.
3. Fine-grained sentiment extraction. For example: Who made the sentiment comment? Which topic or brand is the sentiment about (= 2 above)? How intense is the sentiment? What is the reason for the sentiment? Can the system associate sentiments not only with topics or brands (e.g. iPhone) but also with a feature of a brand (e.g. screen), and how well does it do so? In addition to sentiments that express agents' emotions (love/hate/happy/annoyed etc.), can the system identify positive or negative evaluations of a topic or brand (cost-effective/poorly designed)? What about the agents' needs and wish lists for brands? What about agents' positive or negative actions towards a brand (consumers' purchase intent such as "will buy"; negative actions such as "abandon" or "discontinue the use of")? What is the brand's functionality, i.e. its positive features (what it is designed to do)? Can the system identify comparisons between brands or topics (iPhone is better than Blackberry)?

Most learning systems stop at 1 and sometimes reach 2. We do all 3 based on deep parsing. The most popular and easiest approach is sentiment classification of documents based on keyword density. Such classifiers perform well in domains where large amounts of labeled data are available (e.g. Amazon reviews, movie reviews), but they are too coarse-grained, and they face challenges when they move to a new domain where the labeled data are not sufficient for the algorithm to learn a classifier. The more severe challenge comes from 2, when comparative text mentions two brands in proximity to a detected sentiment (I prefer iPhone to Blackberry; iPhone is better than Blackberry): because these systems are based on keywords and do not understand sentence structure, they cannot tell whether to associate the sentiment with iPhone or with Blackberry.
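The comparative-sentence problem can be illustrated with a toy sketch. This is not the deep-parsing system described here; the regular-expression patterns, the function names, and the keyword list are all assumptions made up for illustration. The keyword baseline labels both brands positive, while even a crude structural pattern separates the preferred brand from the other one.

```python
import re

# Hypothetical toy patterns standing in for real sentence structure;
# a production system would use a full parse instead of regexes.
COMPARATIVE_PATTERNS = [
    # "I prefer X to Y"
    re.compile(r"\bprefer\s+(?P<better>\w+)\s+to\s+(?P<worse>\w+)", re.I),
    # "X is better than Y"
    re.compile(r"\b(?P<better>\w+)\s+is\s+better\s+than\s+(?P<worse>\w+)", re.I),
    # "X is worse than Y"
    re.compile(r"\b(?P<worse>\w+)\s+is\s+worse\s+than\s+(?P<better>\w+)", re.I),
]


def associate_comparative(sentence):
    """Return {brand: polarity} for simple comparative sentences."""
    for pattern in COMPARATIVE_PATTERNS:
        match = pattern.search(sentence)
        if match:
            return {
                match.group("better"): "positive",
                match.group("worse"): "negative",
            }
    return {}


def keyword_proximity(sentence, brands, positive_words=("prefer", "better")):
    """Keyword baseline: every brand in a sentence with a positive keyword is positive."""
    tokens = re.findall(r"\w+", sentence.lower())
    if any(word in tokens for word in positive_words):
        return {b: "positive" for b in brands if b.lower() in tokens}
    return {}


if __name__ == "__main__":
    text = "iPhone is better than Blackberry"
    print(keyword_proximity(text, ["iPhone", "Blackberry"]))
    print(associate_comparative(text))
```

The baseline has no way to assign opposite polarities to the two brands because it never looks at which side of the comparison each brand is on; that is the information a parser supplies.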
Finally, for 3: so far no learning system in industry has even attempted that degree of fine-grained sentiment extraction, yet it is extremely important for a social media monitoring product, which can then extract actionable intelligence for decision makers. We are one of the first, if not THE first, to do 3. For fine-grained extraction, rules are more flexible to apply, especially once a parser has been built to support them.

Having said that, QA (Quality Assurance) should usually still focus on 1 and 2, and not so much on 3, for cost reasons. We want to make sure the benchmarks reflect the global picture of how well the system performs on sentiment. As long as the global quality control is in place, the fine-grained extraction in 3 cannot go too wrong. Testing every detail of sentiment-related intelligence would come at a huge cost that we cannot afford. Instead, we work in a self-adjusting mode: whenever a development or change makes the system's output differ, we developers are required to eyeball the differences and decide whether they are good catches or not. This way, we ensure that 3 stays on track and improves every day.
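The self-adjusting mode described above amounts to diffing the system's output before and after a change and reviewing only the differences. A minimal sketch, assuming each run is stored as a mapping from a sentence id to a set of extracted tuples (the data layout and the function name are hypothetical, not the actual tooling):

```python
def diff_extractions(before, after):
    """Compare two runs of the extractor on the same corpus.

    before/after: dict mapping sentence id -> set of extracted tuples.
    Returns a list of (sentence_id, lost, gained) for every sentence
    whose output changed, ready for a developer to eyeball.
    """
    changes = []
    for sid in sorted(set(before) | set(after)):
        old = before.get(sid, set())
        new = after.get(sid, set())
        if old != new:
            changes.append((sid, old - new, new - old))
    return changes


if __name__ == "__main__":
    run_a = {1: {("iPhone", "positive")}, 2: {("Blackberry", "negative")}}
    run_b = {
        1: {("iPhone", "positive")},
        2: {("Blackberry", "negative"), ("iPhone", "positive")},
        3: {("screen", "negative")},
    }
    for sid, lost, gained in diff_extractions(run_a, run_b):
        print(sid, "lost:", lost, "gained:", gained)
```

Only sentences 2 and 3 are reported here; unchanged output never reaches the reviewer, which is what keeps the cost of checking layer 3 manageable.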
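Transportation Analytics and Traffic Sentiment Analysis: A New Direction for ITS Research?

During last year's National Day holiday I was travelling out of town. At a highway toll station the car still had to stop so that a staff member could hand over a card. I asked the driver: isn't the highway toll-free? Why still issue a card? The driver guessed it was probably to count the traffic and calculate the lost revenue. Only that evening did I learn that the major cities were badly jammed and the Internet was boiling over. That gave me the idea of studying traffic-related public opinion as a research problem, and I immediately phoned Ke Zeng (曾轲) and Wenli Liu (文礼) to prepare data, analyze it, and write a paper (see Attachment 3). A stray thought at the time: if a disaster or unrest ever struck, cars and tanks would be useless; only aircraft would do.

On January 5, 2013, I set out at 9 a.m. and ran into the longest traffic jam I have experienced in Beijing: a trip that normally takes 20 minutes took 1 hour and 56 minutes! With nothing else to do in the car, I noted down every time point, planning to write my first empirical Field Traffic Studies article when I have time.

Wenli wrote a short on-the-scene analysis of the yellow light, which I hope can be expanded into a paper later (see Attachment 1).

Attachment 2 collects the traffic-related posts I reposted on my Sina Weibo account. I hope Zeng and Liu will build on it to write a second paper on traffic sentiment, so that this truly becomes a research direction that provides useful information for public travel and public policy making.

Attachment 1: "Three-Second Yellow Light" vs. "Gridlock": The Traffic Bottleneck Arrives Early
范文博 (Fan Wenbo): Also on the traffic problems of the yellow light, with suggestions for improvement [edited Jan. 12, 2013] http://blog.sciencenet.cn/blog-92473-652210.html
Attachment 2: Traffic-related posts reposted on my Sina Weibo
Attachment 3: Traffic Congestion and Social Media in China

Attachment 1: "Three-Second Yellow Light" vs. "Gridlock": The Traffic Bottleneck Arrives Early

Since the New Traffic Regulations (《新交规》) took effect on January 1, the violation data published by various localities show traffic violations trending downward. Cracking down on "running the yellow light" has been credited as one of the measures that reduced driver violations. Admittedly, it improves safety to some extent, but it also brings many negative effects. Some joke that the rule violates "Newton's laws", because a car cannot stop instantly the moment green turns to yellow. Some drivers have even caused rear-end collisions by braking hard at a yellow light. And most people's experience is that the road home seems longer, jams more easily, and more readily turns into gridlock.

Since the new regulations took effect, everyone has been asking whether the "three-second yellow light" is the cause of the gridlock of the past few days. In my view, opinions may differ on whether it is the main cause, but these three seconds are bound to make the traffic system reach its congestion bottleneck earlier, so that gridlock forms earlier than it used to in the same peak periods.

In a traffic system, traffic lights control vehicle flow. From this perspective, the new regulations turn the yellow light into one of the signals that stop traffic, which greatly lengthens the stop phase of the signal cycle. The harm is considerable: an intersection that used to take 5 minutes to pass may now take 20; an intersection that used to reach its congestion bottleneck only when 100 vehicles arrived at once may now gridlock with 80. The three seconds of yellow therefore increase the time vehicles need to clear the intersection and raise the probability of congestion.

The following is a real case of gridlock on the section from Zhongguancun East Road to Baofusi Bridge in Beijing on the evening of January 5, 2013. As Figure 1 shows, south of Baofusi Bridge that evening, traffic pressure on Zhongguancun East Road was heavy in both the northbound and southbound directions, and congestion was severe.

Figure 1: Traffic conditions south of Baofusi Bridge at 18:45 on January 5.

Why do these three seconds of yellow make an intersection reach its throughput bottleneck earlier and raise the probability of congestion? In fact, every traffic light has a limit to the flow it can control. As Figure 2 shows, at traffic light 1 the length of the queue of stopped vehicles is Lc, and the maximum queue length that traffic light 1 can accommodate within one red phase is Lr. In general, when Lc < Lr, the traffic system runs normally; when Lc ≥ Lr, the queue at traffic light 1 spills over. The new regulations effectively classify the yellow light as a stop signal. Vehicles wait longer at the light while more vehicles keep arriving, so Lc reaches the limit Lr faster and more easily than before. Once the queue at a light reaches its limit and spills over, congestion is no longer confined to one direction but spreads over a large area. Figures 3 to 8 show the actual north-south queue spillover on Zhongguancun East Road south of Baofusi Bridge on January 5. After three periods, T1 (Figures 3 and 4), T2 (Figures 5 and 6), and T3 (Figures 7 and 8), heavy congestion appeared in every direction at the intersection.

Figure 2: Red-phase capacity of a traffic light. Lc is the length of the vehicle queue while traffic light 1 signals stop; Lr is the maximum queue length that traffic light 1 can accommodate within one cycle, i.e. its red-phase capacity.

Figure 3: Photo of the north-south queue spillover in period T1; traffic toward Zhongguancun South Road can still pass.

Figure 4: North-south queue spillover in period T1; traffic toward Zhongguancun South Road can still pass.

Figure 5: Photo of period T2: the light for the northward turn from Zhongguancun South Road is green, but the turning traffic runs into the queue spilling over from Zhongguancun East Road.

Figure 6: Period T2: the light for the northward turn from Zhongguancun South Road is green, but the turning traffic runs into the queue spilling over from Zhongguancun East Road.

Figure 7: Photo of period T3: the straight-ahead light on Zhongguancun East Road is green, but the traffic runs into the vehicles stranded by the queue spillover.

Figure 8: Congestion in period T3: the southbound straight-ahead light on Zhongguancun East Road is green, but the traffic runs into the vehicles stranded by the queue spillover.

At this point the intersection is gridlocked over a wide area, and traffic keeps building up in every direction. As Figure 9 shows, the whole road network is in an "all red" state of paralysis.

Figure 9: Baidu real-time traffic information. Green roads are flowing freely; red roads are congested. After several red-light cycles, at 18:21 on January 5, widespread congestion appeared on Zhongguancun South Road, the Baofusi Bridge section of the North Fourth Ring Road, and Zhongguancun East Road.

A mere three seconds make certain fixed road sections reach their "red-phase capacity" earlier and thus hit the congestion bottleneck earlier. Queue spillover in any one direction quickly causes congestion in the other directions. This vicious circle has long been hard to break, and the three seconds of yellow are an important factor that sets it off earlier. This is a problem that deserves our attention.

Attachment 2: Traffic-related posts reposted on my Sina Weibo

// @岂不快哉wzk: Nonsense. Whether running a yellow light is allowed is not a question of rules; it is a question of science. // @凡事望聞問切: I have always walked to and from work. My personal impression: since the ban on running yellow lights took effect, crossing the street has become somewhat safer, both for me and for the other pedestrians I observe. That is the difference in perspective between sitting inside a car and walking outside it. If both pedestrians and motor vehicles could run the yellow light, wouldn't it be just the same as a green light? // @为明天忧: Another form of entrapment enforcement!
◆◆ @李开复: [The fallacies of the Chinese-style yellow light] 1) "Chinese drivers speed up at yellow, foreigners don't": I have lived in the US for more than thirty years, and many Americans also speed up at yellow; don't make claims if you have never been abroad. 2) "Dangerous driving should be punished severely": please don't swap one concept for another; the existence of a problem does not make the solution correct. Reasonable solutions such as flashing countdowns are available. 3) "It needs to be tried in practice and then revised": the world has already run this experiment 1,000 trillion times. There are bridges and boats; there is no need to cross the river by feeling for stones.
(Original post: Jan 5, 09:30, via 脉搏网; reposts 31157, comments 7767. Reposted: Jan 9, 08:56, via Sina Weibo; reposts 2.)

// @Geek李睿蛟: // @jlijames: // @邓侃: There is a joke in America that a driver who sees a yellow light is like a Spanish bull that sees a red cloth. // @leekayak: Repost.
◆◆ @李开复: [The fallacies of the Chinese-style yellow light] (same post as above)
(Original post: Jan 5, 09:30, via 脉搏网; reposts 31157, comments 7767. Reposted: Jan 6, 06:21, via Sina Weibo.)

My personal view: the more adaptive the system, the more it should display the red and green times so that drivers know the situation clearly; otherwise drivers inevitably have to estimate from experience, which may lead to more accidents. // @智能交通-王志彪: An area-control adaptive system does not adjust the green time arbitrarily according to traffic flow either; it always executes a minimum green time at the end to keep the phase transitions orderly. Hangzhou's .....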
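so this argument does not hold up. //
◆◆ @郭继孚: The Ministry of Public Security explains why Beijing has not adopted signals with countdown timers: Beijing's main urban area uses an area-control adaptive system with a fairly high degree of intelligence, in which the signal controllers adjust the signal timing promptly as road traffic flow changes. To prevent the confusion that abrupt signal changes would cause, Beijing has not adopted countdown timers. @中国新闻网 #中新分享# http://t.cn/zjR754G (shared from @中国新闻网)
(Original post: Jan 3, 12:48, via 中新网微博; reposts 45, comments 20. Reposted: Jan 4, 13:00, via Sina Weibo.)

// @智能交通王志彪: An area-control adaptive system does not adjust the green time arbitrarily according to traffic flow either; it always executes a minimum green time at the end to keep the phase transitions orderly. Hangzhou's SCATS system is an adaptive control system; it starts a countdown in the last few seconds before a phase change and displays nothing the rest of the time. So when you see the countdown display light up, you should get ready to stop or to start moving. So this argument does not hold up. //
◆◆ @郭继孚: (same post as above)
(Original post: Jan 3, 12:48, via 中新网微博; reposts 45, comments 20. Reposted: Jan 4, 12:49, via Sina Weibo.)

// @马少平THU: "Beware of bears" // @Avatar_C: Beware of yellow lights // @左南ZuoNan: Professor Ma ...... // @Leon_里昂王: Green-light phobia. // @酒井鱼鱼: Absolutely must repost!
◆◆ @马少平THU: Q: The road is so clear, why are you driving so slowly? A: Didn't you see that the light ahead is green?
(Original post: Jan 3, 09:33, via Sina Weibo; reposts 75, comments 4. Reposted: Jan 3, 13:04, via Sina Weibo; reposts 1.)

// @科研战线上的工兵: // @译言: If you meet a yellow light and cannot stop in time, the traffic authority is at fault, because its yellow-light timing is unreasonable; if you do not stop at all, you are at fault. #If you won't even obey the traffic rules, how can you talk about respecting the law#
◆◆ @译言: [You think the yellow light is a new problem?] In fact, drivers all over the world get a headache at yellow lights. Why? The duration of a yellow light is generally set as a 1-second reaction time plus the vehicle's braking time. The problem is that drivers' reaction times are far longer than 1 second: http://t.cn/zj8s3pQ But in the end, under the old rules and the new ones alike, #when you see a yellow light you should prepare to stop#. If you don't believe it, go home and look through the textbook you used for your driving test years ago.
(Original post: Jan 3, 11:17, via 专业版微博; reposts 225, comments 77. Reposted: Jan 3, 13:04, via Sina Weibo.)

"In 1927 he invented the variable-speed motor and the yellow light that appears between red and green when the signal changes, and received an award for it. From then on, the red, yellow, and green three-color signal spread as a complete command-signal system through land, sea, and air transport around the world." This description only lets us infer that Mr. Hu implemented a yellow signal light; it cannot confirm that he invented it, and Baidu is not scientific literature anyway.
◆◆ @瀟湘墨人: [The yellow light was actually invented by a Chinese] 胡汝鼎 (Hu Ruding, 1905~1985), a native of Hangzhou, studied in the United States in his early years. In 1927, young Hu was at a busy intersection in the US. The light turned green and he was about to step forward when a car brushed past him, giving him a fright. Thinking it over, Hu felt that a yellow light should be added between the red and green lights to remind drivers to slow down and watch for pedestrians. His suggestion was taken seriously and adopted by the Americans, and eventually came into use worldwide.
(Original post: Jan 3, 08:21, via 享拍微博通; reposts 36, comments 10. Reposted: Jan 3, 09:50, via Sina Weibo; reposts 2.)

被堵 [sic] 胡汝鼎: http://t.cn/zOIvyFC // @中科院王飞跃: A yellow-skinned person inventing the yellow signal light: is that reliable history? Is there any documentation? It should be checked. // @中科院王飞跃: Repost.
◆◆ @瀟湘墨人: [The yellow light was actually invented by a Chinese] (same post as above)
(Original post: Jan 3, 08:21, via 享拍微博通; reposts 36, comments 10. Reposted: Jan 3, 09:44, via Sina Weibo; reposts 3, comments 1.)

A yellow-skinned person inventing the yellow signal light: is that reliable history? Is there any documentation? It should be checked. // @中科院王飞跃: Repost.
◆◆ @瀟湘墨人: [The yellow light was actually invented by a Chinese] (same post as above)
(Original post: Jan 3, 08:21, via 享拍微博通; reposts 36, comments 10. Reposted: Jan 3, 09:42, via Sina Weibo; reposts 4, comments 1.)

// @概而不论: In essence, the yellow light works like this: if it doesn't equal red, it equals green! If it doesn't equal green, it equals red!
◆◆ @刘仰: Does the new traffic rule on yellow lights amount to abolishing the yellow light? Yellow = red?
(Original post: Jan 3, 00:12, via Sina Weibo; reposts 20, comments 23. Reposted: Jan 3, 09:27, via Sina Weibo; reposts 1, comments 1.)

// @古长宏: Public opinion is flashing a yellow light; which way will power go? Good night.
◆◆ @人民日报: [Hello, Tomorrow] Stop on red, go on green, but what about yellow? At the start of the new year, the strictest traffic regulations in history have set off a wave of complaints online. People support strict enforcement but wonder whether the rule is reasonable and workable. The uniformity and seriousness of regulations need to be upheld, and the public's opinions also need to be heard patiently. Coercive power may bring obedience, but it does not necessarily bring authority and public trust; only a willingness to review and revise can truly win people over. Public opinion is flashing a yellow light; which way will power go? Good night.
(Original post: Jan 2, 23:33, via 人民日报微博; reposts 4598, comments 1533. Reposted: Jan 3, 09:20, via Sina Weibo; reposts 2.)

// @silverspring: With the current system, it seems that showing a countdown rules out intelligent control that adjusts signal durations in real time according to flow. Beijing once evaluated both options, and countdowns were piloted in the suburbs. They were not rolled out, presumably partly because of retrofit costs. Back then nobody expected a traffic rule like this to be issued. // @传奇没那么简单: Everyone can comply with the new law. But the question is why Beijing cannot show a countdown on its signals when other cities all can.
◆◆ @新华社中国网事: [Ministry of Public Security: "rushing the yellow light" and rear-end collisions can be avoided] In response to netizens' concern that emergency braking when the light turns yellow leaves vehicles carried forward by inertia, an official of the Ministry's Traffic Management Bureau said that when the yellow light comes on, a motor vehicle may continue through as long as any part of its body has already crossed the stop line. If drivers stay focused, keep a safe distance from the vehicle ahead, and slow down and drive carefully through intersections, "rushing the yellow light" and rear-end collisions can be avoided.
(Original post: Jan 2, 18:23, via 新华新媒体; reposts 5509, comments 1173. Reposted: Jan 3, 09:19, via Sina Weibo; reposts 2, comments 1.)

// @钅彡壴: Data mining in big data // @cnsns: Sir, this is a question of where one sits, not a question of data! // @叶开: From a big-data perspective, Beijing has massive amounts of intersection data, signal data, and accident-monitoring data; with data mining and predictive analysis, these would be more than enough to support a reasonable decision. So why still decide off the top of one's head?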
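// @我是大刚: It reminds me of the saying: "Those who eat meat are short-sighted and cannot plan far ahead." // @牛文文: Someone who runs an arty hotel
◆◆ @桔子水晶吴海: "How much is admitting a mistake worth? A trial calculation of how much the new yellow-light rule costs Beijing": under the current yellow-light scheme, Beijing's indirect losses come to about 3.6 billion a year; if an even dumber patch is adopted, Beijing's indirect losses could reach around 6 billion a year. Original article: http://t.cn/zjQHX7h
(Original post: Jan 1, 17:00, via Sina Weibo; reposts 20679, comments 3426. Reposted: Jan 2, 23:29, via Sina Weibo; reposts 1.)

// @云泉微博: // @蓝旗主的当代微博版史记: // @漆洪波: This is what comes of promoting an innovative society; now even the traffic bureau is innovating. Next time the education commission can innovate in mathematics and declare that one plus one no longer equals two. // @王冉: I really wonder whether the traffic authority is in some kind of mood to invite abuse on purpose like this. // @光头王凯: From now on just install a single green light: go when it's on, stop when it's off. A low-carbon China starts with the traffic authority, yeah!
◆◆ @光头王凯: The function of the yellow light is to signal that red is about to come. If running a yellow costs six penalty points, the yellow light has in effect become a red light. At that point the red light can no longer find itself and can only find some value with sex workers.
(Original post: Jan 1, 20:54, via Sina Weibo; reposts 2955, comments 528. Reposted: Jan 2, 09:50, via Sina Weibo; reposts 2.)

// @段永朝: // @曹增辉: #Revise the yellow-light rule# // @王亚彬: Repost
◆◆ @楊葵: On the road on the first day of the new regulations, I tested the yellow-light problem dozens of times, and it is very hard to handle. Either you drop to a very low speed about 50 meters before the light, which is bound to bring even greater disaster to traffic that is already unbearably congested; or you stay ready to brake hard at the light at any moment, which is bound to cause more rear-end collisions. I actively support stricter traffic rules, but this one really is extremely unreasonable, and public opposition is also very strong. I solemnly call on @北京交通 to revise this rule. Please repost if you agree.
(Original post: Jan 1, 13:26, via Sina Weibo; reposts 18306, comments 3388. Reposted: Jan 2, 09:48, via Sina Weibo; reposts 1.)

// @云泉微博: // @施力勤: Who is the top player in signal-light systems?
◆◆ @人民日报: [Micro-commentary: Let the yellow light stop being a "dilemma"] On the first day of the new traffic regulations, "rushing the yellow light" became the focus of complaints. Fostering civilized driving and improving road safety is the intent of the new rules. But can enforcement standards be made uniform? How should evidence collection be standardized? Are the existing signal facilities reasonable? Behind netizens' "dilemma" lie doubts about loopholes in the rules and an expectation of fair enforcement. Raising the level of civility cannot rely on strict laws alone; it needs well-designed and humane supporting measures even more.
(Original post: Jan 1, 22:10, via 人民日报微博; reposts 1901, comments 744. Reposted: Jan 2, 09:48, via Sina Weibo; reposts 1.)

// @马少平THU: Another possible fix is to add a yellow line ahead of the signal: vehicles that have already passed the yellow line may run the yellow light but not the red light. It may well be the most economical remedy. // @程序员邹欣: The article is well reasoned and well supported. Does the Celestial Empire have to cross the river by feeling its own stones on the yellow-light issue too?! // @牛文文: An angry-young-man boss who runs an arty hotel has worked out this little yellow-light business so thoroughly!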
◆◆ @桔子水晶吴海: "How much is admitting a mistake worth? A trial calculation of how much the new yellow-light rule costs Beijing" (same post as above). Original article: http://t.cn/zjQHX7h
(Original post: Jan 1, 17:00, via Sina Weibo; reposts 20679, comments 3426. Reposted: Jan 1, 20:57, via Sina Weibo; reposts 4.)

An older post: traffic design and management must take behavioral and psychological factors seriously http://t.cn/zlEARiy Some intersections are designed with almost no regard for vehicle behavior or human psychology, so not only do pedestrians cross at random, vehicles are also forced to barge through. Some intersections are designed too large, so one signal cycle means a very long wait; people lose patience long before it ends and cannot resist the urge to jump the queue. As a result, when the green light comes, vehicles from other directions are occupying the road and nobody can get through: "green is not green, red is not red", a chaotic vicious circle.
(Posted: 2012-10-20, 23:01, via Sina Weibo; reposts 2.)

Attachment 3: Traffic Congestion and Social Media in China

Ke Zeng, Xi'an Jiaotong University, China
Wenli Liu, National University of Defense Technology, China
Xiao Wang and Songhang Chen, Chinese Academy of Sciences

The 1st of October is Chinese National Day; in celebration, Chinese citizens get an eight-day holiday, known as the Golden Week, from 30 September to 7 October. During this week, the Chinese government implements a policy under which vehicles with seven seats or fewer travel toll-free on the highways. Unfortunately, this policy overstimulates domestic travel and creates more traffic congestion. As Figure 1 shows, the highway turns into a giant parking lot. In 2012, Chinese citizens took 189 million trips while this policy was in place and received 6.54 billion free tolls; however, 68,422 traffic accidents occurred over the Golden Week (statistics via http://finance.people.com.cn/money/BIG5/n/2012/1011/c218900-19226060.html), and 794 people lost their lives (statistics via http://leaders.people.com.cn/n/2012/1008/c58278-19186170.html).

Figure 1. Traffic congestion on China's National Day. The government's policy of allowing many vehicles to use the highways for free during the Golden Week increases congestion. (Figure via http://tieba.baidu.com/p/1893369318 and http://tieba.baidu.com/p/1922476418.)
Unexpectedly, traffic congestion on roads in physical space also caused an "opinion explosion" in cyberspace. Participants in social media included potential travelers and people who got caught in the holiday traffic jams. Online social media participants have shown great enthusiasm for road trips. Figure 2 shows the growth of topics about road trips on social media sites over the past few years. Road-trip-related topics were far fewer before 2008, but their number has increased significantly over the past four years, especially in 2012. Such a high growth rate and high occupancy on these sites could provide an early warning signal: a preview, via social media, that holiday road traffic could experience unprecedented growth.

Figure 2. Growth in road-trip-related topics. Such topics have increased steadily on social media over the past few years.

Here, we examine Golden-Week-related topics' evolution over the past 10 years using comprehensive online communities such as xcar.com.cn, tianya.cn, autohome.com.cn, and Sina Weibo (weibo.com, a Twitter equivalent in China). Using methods based on topic clustering, we can analyze the attention users give to various topics, study the geographic distribution of online topics concerning Golden Week, get travelers' growth tendencies and geographic distributions, and provide primary research for traffic emergencies during holidays.

Topic Evolution

With the development of social network sites (SNS), IM, Twitter, and other social media platforms, more people prefer to communicate and exchange opinions online.
[1,2] Plenty of topic threads let people exchange tourism experiences, such as discussions of travel routes. Right before the 2012 holiday, a few topics about the Golden Week and travel became popular. As highway traffic congestion emerged in real time, related topics quickly grew on the Internet. As time passed, the traffic jam became a larger traffic disaster. Concurrently, the deteriorating traffic situation attracted the attention of users from various social communities and triggered a furious discussion about the Golden Week traffic policy and the traffic congestion. Moreover, online users put forward some emergency strategies for dealing with the congestion, and these received great support.

To illustrate the topic transformation during the entire Golden Week, we constructed dynamic evolutionary social networks to reveal topic participants' characteristics in accordance with online discussions on Tianya.cn. The Internet contains many Golden-Week-related topics, but most of these have low participation, so their discussions are insignificant. Such topic threads can't reflect the overall attention and tendencies of online holiday-related topics. For our study, we chose only threads with high reply and user-engagement rates, selecting 11 threads on road travel from different online communities.

Figure 3a shows different communities' discussions concerning holiday trips from 28 September. The content and direction of the discussion threads were very different; although many people participated, the overlap coefficient among the threads is small, and no users participated in different topics at the same time. Topics at that moment mainly focused on scenic spots and travel routes.

Figure 3.
The evolution network of Golden-Week-related topics on social media. We can see (a) the discussion concerning holiday trips on 28 September; (b) topics related to the toll-free policy; (c) and (d) policy-related topics becoming more prevalent over time; and (e) our social network of participants in Golden-Week-related topics. Nodes represent participants in road-trip-related topics; node size is proportional to the degree of online users. Lines between nodes represent different users' reply and reposting behaviors, whereas the node color represents different topics.

In the time leading up to the Golden Week, new topics arose (Figure 3b) about the toll-free policy and road traffic conditions (in pink and red, respectively). As we can see, these two topics attracted few online users until 30 September. As time passed, the policy-related topics became more prevalent (Figures 3c and 3d). They attracted many participants from other travel-related topics and communities. More and more online users took part in the discussion about the toll-free policy; some also suggested reasonable emergency responses.

By 8 October, we could construct a social network of participants in Golden-Week-related topics (Figure 3e). Almost every topic thread included some participants who also took part in the policy-related discussion threads. Simultaneously, topics about road traffic conditions and policy feedback were attracting participants from different communities and became hot topics on Tianya.cn. Moreover, the crossover and overlap rates of holiday-related topics were at their highest at that time.

Opinion Analysis

Topics about the Golden Week not only attracted considerable attention in social communities but also generated furious discussion among Sina Weibo users. Analysis of the traffic jams, feedback on traffic flow and conditions, and suggestions for relieving congestion are still topics of debate.
The most popular post on Sina Weibo came from Daokui Li, a professor at Tsinghua University, who asked, "Why not increase the highway toll by 50 percent and give this money to welfare institutions?" This drew immediate and enormous attention.

As shown in Figure 4, we constructed a dynamic network to illustrate how holiday-related topics evolved and grew on social networks and thus study their dynamic evolution characteristics. The growth process is similar to that of the topic evolution we discussed previously, where topics analyzing traffic conditions and providing feedback on the policy were the most critical. Many online users gave suggestions for relieving traffic congestion, while others also put forward temporary suggestions such as "cancel the 'pass' card at the highway entrance," which attracted considerable attention from online users. We extracted various relevant suggestions, analyses, and viewpoints and classified them in accordance with their helpfulness as regards traffic congestion (see Figure 5).

Figure 4. A dynamic Sina Weibo network. The nodes represent the participants in policy feedback topic threads; lines between nodes represent the reply and reposting relationships between different users; and node color represents the time period during which the information was posted: the earlier the post, the darker the node.

Figure 5. Categories of suggestions, analysis, and viewpoints. We classified this feedback according to its helpfulness as regards traffic congestion.

Traffic Flow Information from Social Media

Social media can also forecast the geographic distribution of potential tourists. We did considerable data mining work on several communities; analyzed the travel-related topics that emerged online from 1 to 30 September in those communities; and extracted online users' attributes and their published content to study potential travelers' geographic distribution features.
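As a rough illustration of this step, the sketch below turns user attributes into a geographic distribution by counting posts per province and converting the counts to shares. The `province` field name, the record layout, and the sample values are assumptions for illustration only; the paper does not describe its data schema.

```python
from collections import Counter


def province_shares(posts):
    """Share of travel-related posts per province.

    posts: iterable of dicts with a hypothetical 'province' key taken
    from the poster's profile; records without a province are skipped.
    """
    counts = Counter(p["province"] for p in posts if p.get("province"))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {province: n / total for province, n in counts.items()}


if __name__ == "__main__":
    # Made-up sample records, not data from the study.
    sample = [
        {"user": "u1", "province": "Beijing"},
        {"user": "u2", "province": "Beijing"},
        {"user": "u3", "province": "Shaanxi"},
        {"user": "u4", "province": None},
    ]
    print(province_shares(sample))
```

Running the same function on the posts of two different years would give the per-province shares needed to compare one year's distribution with the next.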
We studied the data in detail and categorized it in accordance with its geographic distribution. Simultaneously, we gathered travelers' geographic distributions and the proportional relationships between travel-related topics from 2011 and 2012 (see Figure 6). More importantly, as our analysis indicates, the geographic distribution features of different provinces reflect road travelers' potential distribution as well. We can also get each region's growth proportion in accordance with last year's travel-related topics and participants. Thus, we can use our analysis to strengthen the accuracy of geographic distribution forecasts as compared to real-world traffic conditions.

In the future, besides looking at the number of tourists, we will apply text mining technologies [3] to travel-related content from social media to identify tourists' time of departure, origin, and destination. These data are key for traffic-flow forecasts. As more people use social media while traveling, this method could provide more detailed and useful data for traffic management. For example, travelers could access information about traffic emergencies (such as subway crashes or floods) from social media in real time.

Figure 6. The geographic distribution of traffic flow on social media. This chart includes data from 298 online communities, 1,842 travel-related topics, and 42,232 posted messages from autohome.com.cn.

Social media plays an increasingly important role in our lives. [4] Here, we've used social network analysis [5] methods to analyze online topic evolution for traffic management. Our results could potentially support government decisions and help officials manage traffic more efficiently. Social media has obvious advantages during traffic emergencies: compared to wireless sensors, for example, it isn't limited by information sources and can get different types of online data, such as videos or pictures.
Compared to traditional media, social media is more flexible and more applicable to road emergency management and traffic dispersion.

References

1. J. Zhang et al., "Data-Driven Intelligent Transportation Systems: A Survey," IEEE Trans. Intelligent Transportation Systems, vol. 12, no. 4, 2011, pp. 1624-1639.
2. F. Qu, F.-Y. Wang, and L. Yang, "Intelligent Transportation Spaces: Vehicles, Traffic, Communications, and Beyond," IEEE Comm., vol. 48, no. 11, 2010, pp. 136-142.
3. K. Lerman, "Social Information Processing in News Aggregation," IEEE Internet Computing, vol. 11, no. 6, 2007, pp. 16-28.
4. J. Sutton, L. Palen, and I. Shklovski, "Backchannels on the Front Lines: Emergent Uses of Social Media in the 2007 Southern California Wildfires," Proceedings of the 5th International Conference on Information Systems for Crisis Response and Management (ISCRAM), Captus Press Inc. (Eds. F. Fiedrich and B. Van de Walle), 2008, pp. 624-632.
5. F.-Y. Wang et al., "Social Computing: From Social Informatics to Social Intelligence," IEEE Intelligent Systems, vol. 22, no. 4, 2007, pp. 79-83.

Ke Zeng is a PhD student in the School of Electronic and Information Engineering at Xi'an Jiaotong University, China. Contact him at ke.zeng@live.cn.

Wenli Liu is a PhD student in the Center of Military Computational Experiments and Parallel Systems Technology, National University of Defense Technology, China. Contact him at lwl_david@163.com.

Xiao Wang is a PhD candidate at the State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences. Contact her at kara0807@gmail.com.

Songhang Chen is a PhD student at the State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences. Contact him at chensohg@gmail.com.

Research on social media has been applied to various academic fields.
During this year's Chinese National Holiday traffic congestion event, online users showed great enthusiasm on social media, such as forums, Weibo, communities, and other platforms. This article describes the construction of a dynamic evolution network, analyzes the transformation of online users' concentration, and studies the geographic distribution of travelers by analyzing online users' attributes.

Keywords: social media, Chinese National Holiday, traffic congestion, online forums, Weibo, dynamic evolution network
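The paper's Topic Evolution section reports that the overlap coefficient among discussion threads was small at first and rose as the policy topics drew in participants. The paper does not give its formula; a common definition, assumed here, is the Szymkiewicz-Simpson overlap coefficient, |A ∩ B| / min(|A|, |B|), applied to the sets of users in two threads. The thread names and user ids below are made up for illustration.

```python
from itertools import combinations


def overlap_coefficient(users_a, users_b):
    """Szymkiewicz-Simpson overlap between two participant sets (assumed definition)."""
    a, b = set(users_a), set(users_b)
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def pairwise_overlap(threads):
    """threads: dict mapping thread name -> iterable of user ids.

    Returns {(thread_1, thread_2): overlap} for every unordered pair,
    with thread names in sorted order.
    """
    return {
        (s, t): overlap_coefficient(threads[s], threads[t])
        for s, t in combinations(sorted(threads), 2)
    }


if __name__ == "__main__":
    # Hypothetical threads and users, not data from the study.
    threads = {
        "routes": ["u1", "u2", "u3"],
        "policy": ["u3", "u4"],
        "spots": ["u5"],
    }
    for pair, value in pairwise_overlap(threads).items():
        print(pair, value)
```

A value of 0 for every pair corresponds to the 28 September situation in which no user took part in more than one topic; values rising toward 1 correspond to the later stage in which most threads shared participants with the policy discussion.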