科学网 标签: WSD

相关日志
【科普小品:伟哥的关键词故事】
liwei999 2016-1-27 02:25
讲个伟哥的故事。

当年在水牛城的时候,我们开始开发信息抽取挖掘(如今叫知识图谱)的产品,名叫 Brand Dashboard,就是从在线新闻和论坛等专门收集品牌的全方位信息。这个产品生不逢时,超出时代了,因为那时社交媒体还没诞生,网络舆情和品牌情报还处于 BBS 和论坛新闻的时代。即便如此,大企业客户的 market 还是有的,我们的顾客之一就是这个伟哥的厂商,大名鼎鼎的 Pfizer。

当时为了这个产品,我领导开发了一个品牌和术语的消歧模块,其中用到的排歧条件包括利用句法关系(如 SVO)的限制,backoff 到 keywords。关键词条件就是所谓共现关系,可以根据距离进一步区分为在同一个句子、同一个段落,或者同一篇文章。所以这个排歧的 backoff model 实际上就是:

SVO --> keywords within S --> keywords within P --> keywords within D

(文末附有一段极简的示意代码。)SVO 不用说,条件最严苛,一旦 match 了,歧义自然七窍生烟被打趴下了,非常精准,但覆盖面常常不够。这关键词怎么用呢?需要给新人讲解为什么关键词共现也可以排歧。于是,顺手牵羊,就用了这么个案例:ED 是两个字母的缩写,歧义得很,查查缩略语词典,可以找出一长列可能的词义来,包括不举。但是,哪怕是 backoff 到 document level,这个排歧也是有效的,因为有的时候,词与词之间有很强的 semantic coherence(其实关键词技术横行 NLP 领域多年,其诀窍就在于此)。具体说来,如果 ED 所在的同一篇文章中出现了关键词 Viagra 或 Pfizer,它就死定了,绝不会有其他的解释。这时候,句法结构就不必要了(而且句法也不能跨句,更不用说跨越段落去影响了)。

俗话说,戏不够,词来凑,这戏就是结构:如果 SVO 太窄或太不全,recall 不够,那就用词的共现来凑呗。懂得这个原理,NLP 就入门了。话说这个讲解还真有效,甚至实习生也一听就明白,原来语法结构与关键词共现还有这样的后备关系啊。

伟哥故事完。

【相关】
【立委科普:NLP 中的一袋子词是什么】
《立委科普:关键词革命》
《立委科普:关键词外传》
《朝华午拾:创业之路》
《朝华午拾 - 水牛风云》
【置顶:立委科学网博客NLP博文一览(定期更新版)】
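为说明上文的 backoff 排歧思路,这里补一段极简的 Python 玩具示意。其中的义项划分、触发词表和函数名都是为演示而虚构的假设,并非当年产品的实际实现:

# 玩具示意:按 SVO -> 句内关键词 -> 段内关键词 -> 篇内关键词 逐级 backoff 的排歧
ED_SENSES = {
    "erectile_dysfunction": {"viagra", "pfizer", "impotence"},
    "emergency_department": {"hospital", "ambulance", "triage"},
}

def disambiguate_ed(svo_triples, sentence, paragraph, document):
    """从最严苛的句法条件开始,逐级放宽到篇章级关键词共现,一旦命中即返回。"""
    # 1) SVO 条件,例如 "treat/cure ED" 这类谓宾搭配(仅示意)
    for subj, verb, obj in svo_triples:
        if obj == "ED" and verb in {"treat", "cure"}:
            return "erectile_dysfunction"
    # 2)-4) backoff:在句子、段落、篇章这三个越来越大的窗口里找共现关键词
    for window in (sentence, paragraph, document):
        words = {w.lower() for w in window}
        for sense, triggers in ED_SENSES.items():
            if words & triggers:
                return sense
    return None  # 证据不足:保留歧义,留给后续模块处理

print(disambiguate_ed(
    svo_triples=[],
    sentence=["ED", "rates", "rose"],
    paragraph=["ED", "rates", "rose", "sharply", "last", "year"],
    document=["Pfizer", "reported", "strong", "Viagra", "sales", "and", "ED", "awareness"],
))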
个人分类: 立委科普|4420 次阅读|0 个评论
【新智元笔记:WSD与分析器,兼谈知识图谱与分析器】
热度 1 liwei999 2015-12-4 01:07
我:热闹啊。一路扫过去,印象一是这里大概是搞 NLP 和语义的人最集中的地儿了,托白老师的福,大树底下好乘凉;二是现在讨论很杂,大概是大家伙儿热情太高。

wang:白老师在,人正常说话都能被看出破绽,那机器就无抬头之日啊。@wei 昨天挺不好意思,耽误老师太晚。

我:昨天泼冷水,是从我个人角度,不知道你已经钻进去了,没有退路了 :)

wang:李老师,精力也是令人佩服!还好,基本出来了啊。

白:权当预处理了,第 0 层。也符合伟哥的分层思想。

我:那是有熬出头的意思,因为成果在望?我以前枪毙过一个加拿大的 WSD 公司。

wang:前面我已经提到,不用太好的 WSD 也可以支持不错的句法分析——这是我的结论。因为我的 3 级语义码识别可达 85% 的精度,而这是纯多义词情况的比例,正常句子中一般有 47% 的多义词。嗯,李老师文中提过。

我:哦,文中说过了。人老了,记不住自己都说过啥。反正喷得多了,就成了维吾尔族姑娘,也不怕白老师们抓小辫子了。

白:WSD 不是一个解决方案,只是可以和分析器形成流水作业的一道工序。当解决方案用就大错特错了——如果目标在深层的话。

wang:白老师确实皆通,句句问到点子上。白老师总结得对,确实是这种流水协同作用。

我:关键的要害是,吃力不讨好。

wang:这样,全句的词义消歧 92% 左右,包括单义词,这个正确率确实不影响太多的句法分析。若算一级语义分类正确率的话,还要再高些。

我:WSD 肯定可以帮到句法,但是费工太大。世界上的事体,没有不能补偿的。譬如眼瞎了,耳朵就灵敏了。不用 WSD,别的资源就来补偿了,也可以走得很远。实在绕不过去,就 keep ambiguity untouched,等到语用的时候再对付。语用的时候,语义问题一下子缩小到一个子集、一个 domain,所以原来大海一样的 WSD 就变得 tractable 了,有时甚至自然而然就消失了,不再是问题了。

wang:嗯,我也不总是一下有解,有些留到后层处理。结果良好,可以接受。同意,确实有些看似问题,后来不用解决也自然解决了。

白:伟哥的意思是,解空间是人定的,你搞不清是 a 还是 b,就在论域里增加一个 ab 好了,后面自有机会把论域再缩小。不要为了一定要在信息不足的条件下强行分出 a 还是 b,把系统搞重。

我:@白硕 对,白老师说话清楚多了。第 0 层的想法也对。因为 WSD 这东西可以依靠 density,而 density 是可以在一篇文章的 discourse 下做的。这有拉动全局帮助局部的好处。

白:嗯,董振东老师举的“薄熙come”的例子犹在耳边。

我:这个加 ab 的做法,完美主义者心里会觉得别扭。但其实,模糊是自然常见的状态,而清晰才是少见的人力的结果,而且还保不定会被翻盘。既然是自然状态,那么就应该到不得不清晰的时候再去对付它,而不是先清晰了,再去等着不断翻盘。

白:这个就是量子力学里的叠加态,保留到最后坍缩。

wang:嗯,刚才也谈到翻盘的,有些压根前期就清晰不了。

我:不过话说回来,如果先做 WSD,多少把太不像话的枝枝蔓蔓减除一些,然后做句法,应该还是有益的,只要小心就好。

wang:嗯,的确减不少。比如一个句子的多义词,按平均 5 个义项算,句子长了各种组合也有很大的规模。

白:这个,人有时不是这样的。在信息不足时强行坍缩,遇到 trigger 再翻盘的情况,在段子里一把一把的。我们都被耍弄得很开心。

我:WSD 是个不一定需要结构就可以做个大概的东西。因为全局的 density 对于 WSD 的影响,比局部的结构对它的影响,一般来说更大一些。这样,discourse 的威力就可以发挥了。道理就在,WSD 虽然是针对个体的词,但是一个 discourse 里面的词的共现,有很自然的语义相谐性。n 个多义词在同一个 discourse 里,互相作用,互相消歧。

白:我就给它定位第 0 层。它窗口很小,哪里看得见 density。

wang:我接受白老师定义的第 0 层。是这样的,况且更多是单义词。连续几个多义词在一起也有,处理也还可以,就是连续未登录词会出问题。

白:伟哥知道“薄熙come”的典故吗?

我:不知道这个典故,但是似乎可以想见董老师的机智和幽默,跟董老师太熟了。“薄熙来了”“薄熙来走了”“薄熙come了”“薄熙come走了”,类似这样的?

白:说的是某汉语文章译成英语,文中出现了 5 次“薄熙来”,译成英语后,四次翻译成“Bo Xilai”,一次翻译成“Bo Xi Come”。

wang:这样啊。

我:那个系统还是蛮了不起的,敢于对抗 one sense per discourse 的大原则。我们一般是不敢的。

wang:从篇章提取关键核心词进行制导,会有改善,但也有改错的时候。

我:你反正是做粗线条,而且是 n-best。目标不是真地消歧,而是减负,譬如从原来的 5 个减到 3 个(3-best)。(文末附有一段极简的示意代码。)

wang:把句法分析结果进行分层,组成篇章理解框架,这样的高层处理也许比单句作战要好——现阶段只是想想,不敢干。说得对。

白:某年我在百度和谷歌翻译上测试周恩来、薄熙来、朱云来,效果依次递减。

wang:@白硕 专有名词词典能及时跟进,可能就好很多。

白:分析器的 lookahead 也是减负,一个道理。

wang:我目前是选 3 个,有些很明显分数很大,基本取 Top1。

白:但他只看 cat 不看 subcat,典型的活人叫那啥憋死。

wang:白老师说我?

白:不是,说分析器,LR(k),包括我自己提出的角色反演算法,都是这个毛病。

wang:main cat 确实误导很多。

我:哪家分析器只看 cat 不看 subcat?cat 算个球啊,太大太空太少。

白:不是工程用的。@wei

wang:同意李老师。subcat 太细也不是好事,但是解说容易懂。

我:想做分析器,基本靠 cat,那是 CL 教科书玩具系统留下的后遗症。最大的后遗症来自:

S --> NP VP
NP --> DT JJ* NN+
VP --> V
VP --> V NP

被这么灌输了一阵子,看自然语言就当儿戏了。所以才会有共识:lexicalist,这可能是 NLP 领域这么多年最大的共识了。没有人不认为需要词典化,词典化的方案各个不同而已。

白:这话分两截说,一是那么定义的问题要用那种系统去做,二是那么定义问题是不对的,所以不该那么做。

wang:我觉得 CFG 自由太过了,加上 cat 太粗,因此这个处理很难跳出。加上词汇化,又太稀疏;词汇化 n 元开大了,稀疏问题相当严重。

白:cat 是可自定义的,没有谁一定说非得 NP、VP。关键是自定义 work 的,都要到词例化层级。

我:POS 的地位是阴错阳差弄出来的。结果是大家误以为必须做 POS,而且 assume POS 是个 solved problem,然后在 POS 上做分析器,擦不完的屁股。

白:@wang 你这个 n=5 也是醉了。

wang:我是语义码,同义词词林义项 1400 个,比起几万、十万词构成的规模,还是轻量级。跳过 POS 我认为是个进步,但是后面还是有很多问题要解决。

刘:在 SMT 里面 ngram 的 n=5 甚至更多都不少见,现在的 neural language model 已经超过 ngram 了,RNN、LSTM 可以更好地利用远距离依赖。

wang:刘老师晚上好!
刘:你好!好久不见了。

wang:是啊,好久不见。白老师来大连,我不凑巧没见着,李老师太远,呵呵。

白:如果想要处理段子,还是激进一点好,太保守会消灭笑点的。

我:觉得白老师有时也走火入魔,一天到晚想着段子,这个对做 real life NLP 是“过度思维”。

白:@wei real life NLP 并不是只有一种。

我:段子的事儿,可以启迪思维,但做的时候,就该放在一边。

白:看应用场景。@刘群 处理 WSD 的 RNN 可以和处理句法的 RNN 流水。我刚想说 5-gram 真是巧合,记得多年前你的学生和骆卫华同一天答辩那次,就是用的 5-gram。

洪:李维擂鼓佟佟佟,分明书生老黄忠。转战各群显神勇,定军山找不轻松。

我:最后一句湿不懂 @洪涛Tao

雷:@wei 老当益壮的意思。

我:哦。四大名著唯一没看下去的是《三国》,不知道定军山与黄忠的实体关系,这个需要 IE 一下就好了,看“三国图谱”一目了然。

洪:@wei 你需要找你的定军山,具体地说,找你的夏侯渊。

我:特佩服读破万卷书的人,譬如洪涛这样的简直就是神人,或人神。我从小读书就慢,所以读书少,要是在西方的教育体系下,早就淘汰了。看我女儿上课,那教科书参考书都比砖头还厚,都是一目十行的人才能对付。我看一个句子,要读三遍,咀嚼五遍,然后进一步退三步地反刍。

洪:老李今天的作业:看在一个陌生领域,如何迅速建图谱。

我:图谱的问题已经解决,就是工作量了。这是说的真心话,不是胡吹。图谱的抽取挖掘,比起舆情真地不是一个量级的难度。舆情都做了的,回头做图谱,没有做不成的,不管啥 domain,你给钱,我就做。

白:可以和郝总 PK 了。

wang:各位老师,我先下了,各位多聊。温馨提示:白老师也要注意休息!各位聊好,88!

洪:@wei 要不说你老黄忠。可能比老黄忠还老黄忠,因为都不用诸葛亮使激将法。

我:陌生领域做图谱,关键是要有一个好的分析器。只有这样,domain 的 porting 才可以做得很薄很快。而分析器基本是不变的,现成的,那剩下还有啥难的?你 parsing 做浅了,IE 图谱就必须做深;反之亦然,parsing deep 了,IE 就是薄薄的一层。反正不管到哪个领域,语言还是那个语言,文法还是那个文法,只有词汇(术语、ontology)才有最大的差异。

洪:国内大家都晚安。我也赶紧跑,否则十有八九成为老李刀下的夏侯渊。

我:晚安晚安。

【相关】
词义消歧(WSD)
【置顶:立委科学网博客NLP博文一览(定期更新版)】
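上面聊到的“不强行消歧、只做减负(n-best)”以及篇章级 density 的思路,可以用下面这段极简的 Python 玩具代码示意。其中的义项划分、触发词表都是为演示而虚构的假设,并非文中任何系统的实际做法:

from collections import Counter

# 玩具示意:用篇章级关键词共现给义项打分,只保留 n-best,不强行坍缩
SENSE_TRIGGERS = {
    "bank/financial": {"loan", "interest", "deposit"},
    "bank/river": {"river", "fishing", "shore"},
    "bank/tilt": {"aircraft", "turn"},
}

def nbest_senses(doc_tokens, n=2):
    """按篇章内触发词的出现次数给各义项打分,保留得分最高的 n 个义项(减负而非消歧)。"""
    counts = Counter(t.lower() for t in doc_tokens)
    scores = {sense: sum(counts[w] for w in triggers)
              for sense, triggers in SENSE_TRIGGERS.items()}
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:n]  # 剩下的歧义留给后续句法、语用模块再缩小论域

print(nbest_senses(["He", "went", "to", "the", "bank", "to", "ask", "about", "loan", "interest"]))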
个人分类: 立委科普|3292 次阅读|1 个评论
美国纽柯钢铁的竞争力从何而来?
热度 3 lanxum 2014-9-14 12:58
中国有“纽柯”吗?——美国纽柯钢铁的竞争力从何而来? 140914 李健

美国纽柯(Nucor)钢铁是1966年才开始进入钢铁行业的,起初在钢铁巨鳄面前它只是一个小电炉炼钢厂。如今,它已发展为美国产量第二、利润第一的钢铁公司,在今年最新 WSD 全球最具竞争力钢企排名中位列第二,仅次于韩国浦项,但其在一些“硬”指标方面优于浦项,如 ROE(净资产收益率)、ROA(资产收益率)。(注:WSD 为美国钢铁行业分析机构世界钢动态公司。在 WSD 排名中,纽柯从2006到2010年的第五、第六,到2011年上升至第二名。)

纽柯专攻钢铁的几十年,除全球经济危机的2008年亏损外,其它年度一直保持盈利。这在全球钢铁产业低迷、利润下滑的整体形势下,引起了世界钢铁行业界的强烈震动,被视为世界钢厂的样板和学习的典范。纽柯钢铁之所以能保持数十年的持续竞争力,主要原因在于其“低成本、高效率”的管理运营模式,这个口号在中国经常能听得到,但纽柯实实在在做到了。

减少机构、减少分层,节约大量管理费用,决策迅速。与全球其它大型钢企不同,纽柯没有专门的 R&D 研发部门,但它一直是世界最先进的短流程工艺和世界最先进的炼钢技术的领导者。它的技术是靠设备制造商开发和外包引进。另外,纽柯说两万多的纽柯人是创新的源泉,这是纽柯成长的主要方式。纽柯也没有专门的技改设计和建造部门,所有这些工作都是通过招投标外委承建。纽柯下属企业有90家,但所有总部人员不到100人。纽柯充分放权,总部除了对资金实行集中管控、负责战略规划和战略控制外,其它工作都由下属单位负责。纽柯前总裁艾佛逊曾这样评价纽柯总部的职责:“除了对现金流进行管理外,我们在总部没什么事情好做……”管理层次扁平,与其它大型钢企10多个管理层次相比,纽柯总部到一线工人总共只有5层。纽柯认为这道理很简单——是全体纽柯人驱动纽柯成功,而不只是经理们。

我算过国内某大型钢企集团的管理层次,从董事长发话到传达到一线工人要经历13层,依次是:1、集团董事长——2、集团主管部门——3、区域公司经理——4、区域公司部门——5、区域下属公司经理——6、区域下属公司部门——7、分公司经理——8、分公司部门——9、分公司下属厂厂长——10、下属厂部门——11、下属厂车间主任——12、车间班组组长——13、一线工人。董事长离工人们有多远?冗长的管理层次,效率低下不说,可怕的是在经历了13层后,董事长发话的原意估计被扭曲得差不多了!

纽柯前总裁艾佛逊说,要提高和保持工人士气,很重要一条是“摧毁”特权等级制度!面临困境时,纽柯经理们要与工人共患难(Pain Sharing)。公司经营效益好,每名员工都会受益;公司面临困境时,所有员工共同承受。而经理们则采取逐级递增比例的风险承受方式:纽柯工人工资下降25%,主管工资下降60%,CEO工资则下降75%(1982年最困难时,纽柯董事长自降收入76.5%,这在美国十分少见)。Pain Sharing 的做法,正与纽柯摧毁特权等级制度、平等待人的企业文化相一致。

世界500强的纽柯总部办公室是租的,总经理没有专用餐厅和专用停车位,任何人出差一律坐经济舱,重要决策竟然是在熟食店吃午饭时做出的!据“美国纽柯钢铁公司管理方法纪事”一文,上个世纪90年代已成为财富世界500强的纽柯总部办公室,是500强中最简陋的,是在一家购物中心附近一幢四层楼内租的,面积只有几个牙科诊所那么大。后来由于业务发展,原所在楼房陈旧,维修费上涨,放文件的地方不足,纽柯总部搬到一幢新楼房里,但仍是租的。按理说,一个成功的公司总是要建一座公司大楼,一是办公,二是公司成功的象征,可纽柯公司总裁坚决反对建造公司大楼,不愿意自我庆祝。纽柯管理人员不享受任何特殊待遇,他们与工人享受同样的医疗保险,同样长短的节日与休假。纽柯没有总经理专用餐厅,公司没有专用的小汽车、飞机和游艇。纽柯各工厂都不设厂长、经理专用停车位,和工人一样,早来就有近停车位,晚来只好停在远的位置上。公司里不论任何级别的人出差,一律坐飞机的经济舱,连总裁也不例外。纽柯总部管理人员从不正式开会,但他们定期在办公楼对面购物中心里的一个熟食店里借吃午饭的时候,商谈公司业务并做出重要决策,纽柯一个重要的 CSP 连铸项目就是在这个熟食店里决定的!

信任和自由——尊重每个人!纽柯自己总结的文化有8条,其中第4条是“信任和自由”(Trust and Freedom)。纽柯每个工厂都有高度自主权,从产品采购到制定生产目标、建立和管理安全计划,其最终的权力、责任都由每个工厂的总经理承担。纽柯领导层认为第一线的管理人员最了解情况,因而最有资格对日常生产经营做出决策,而不是公司总部再来设人管理这些业务。纽柯文化鼓励工人建言献策,即使是失败的建议,也可以直接到厂长办公室与厂长交谈。前总裁艾佛逊最喜爱的格言之一是“好的管理者也会做出坏的决策”,“当我们管理者做出错误决策时,每个雇员都有责任向我们提出,以便我们能修正或改变那些错误的决策”。纽柯总部要求各分厂每年至少要和他们的全体员工面对面地交流一次,每次会见的员工不得超过50人。一家有500人的工厂,意味着厂长每年至少要召集10次见面会。总部对见面会的另一个要求是,“厂长不应该多说话,员工是主角。”艾佛逊认为,劳资之间的矛盾是由经理人员专制的管理方式造成的。他说,管理人员不应当是“使”工人做事情,而应当是帮助工人做事情。……

纽柯的成功,证明了规模不是绝对因素(我国有多少“大而不强”!),是技术、成本和管理这些核心竞争力要素最终起了决定性作用。尽管纽柯采用的是以回收废钢为原料的“短流程”炼钢方式,与我国大多“长流程”的联合钢企不同,似乎没有可比性,但大道相通,技术、成本和管理是任何生产方式的企业都回避不了的话题。纽柯“低成本、高效率”,其精髓就是让似乎复杂的管理变得简单再简单,只有简单,才能低成本、高效率、高效益,大道至简同样是简单管理之追求境界。(注:2013全球钢产能排名纽柯为第14位,未进10强,而产能10强中国就占了6个。2014 WSD 竞争力最新排名纽柯位居第2,而 WSD 竞争力前20强中,中国一家企业也没有,中国宝钢为第21。详见:中国钢企在全球中的地位 http://blog.sciencenet.cn/blog-903722-807870.html )

2001年《从优秀到卓越》一书,是曾毕业于美国斯坦福大学的吉姆·柯林斯从全球1435家公司中选出11家卓越公司,以揭示这些卓越公司的成功秘诀,纽柯(Nucor)即在11家之中。我相信柯林斯的观点:“新经济中根本没有什么新东西。”“只有从过眼烟云的变革中看到背后永恒的管理法则,人们才能真正了解到伟大公司的伟大之处。”

部分数据来源: http://www.nucor.com/
个人分类: 管理idea|6826 次阅读|5 个评论
NLP 迷思之四:词义消歧(WSD)是NLP应用的瓶颈
热度 5 liwei999 2012-1-6 10:00
引用老友:受教了。谢谢立委。我同意“成语从来不是问题”。成问题的应该是一词多义,或歧义,对吧?

这个迷思不再局限于中文处理,它在整个NLP领域和NLP爱好者圈子里颇有迷惑性。WSD (Word Sense Disambiguation) 确系 NLP 难点,但在NLP应用上基本不是问题。

泛泛而言,一切歧义(词汇的,也包括结构歧义)都是自然语言的难点。形式语言(如计算机语言)好就好在基本不歧义。但是,如果以信息抽取作为终极目标,绝大多数的一词多义也不是真正的问题,除非这种歧义影响了句子的结构分析(多数词汇歧义并不影响结构分析)。原因在于信息抽取的时候,目标是明确的,建立的规则大多是词汇驱动的,而不是词义类别驱动的,因此歧义在抽取的时候有自动消失的条件。

举例说明:英语 buy 至少有两个义项:

buy:
(1)购买:Microsoft bought Powerset for $100 million
(2)相信:I am not going to buy his argument

不做 WSD (Word Sense Disambiguation),也并不影响结构分析。信息抽取也可以绕开 WSD。譬如,如果抽取的目标是公司购并(company acquisition)事件,下列由 buy 这几个词驱动的规则一样可以逮住上述(1)的事件,而并不需要对 buy 先行 WSD 再行事件抽取。因为事件抽取的条件自动排除了歧义,使得句子(2)不会被误抓为公司购并(argument 不是公司名):

动词:buy|purchase|acquire
逻辑主语 (Actor):公司名 @1
逻辑宾语 (Undergoer):公司名 @2
==> 《公司并购事件》:
    收购公司:@1
    被收购公司:@2

(下文附有一段极简的示意代码。)

总之,很多时候可以绕开WSD来开发系统。实际上,多数时候必须要绕着走。domain independent WSD 差不多是 NLP 难度最大的课题了,幸好可以绕开。神佑世人,感谢上帝!

@MyGod9:如果以机器翻译为目标呢?

如果是有近亲关系的语言之间做机器翻译,基本不需要 WSD,多数 ambiguity can carry over untouched。即便是不同语系的语言之间做翻译,也要针对这个语言对来区分歧义,最好不要在不考虑目标语前先行WSD,因为后者大多吃力不讨好。非统计类型的机器翻译系统的主流是转换式(transfer-based)机器翻译。词汇转换(包括针对目标语的词义消歧)与结构转换同步进行比较经济有利,利于维护。这就意味着机器翻译也与信息抽取有一定的共通之处:利用结构转换的条件同时消歧。

当然,机器翻译是NLP的一个特殊case,现在的主流都是统计模型了,因为 labeled data(双语语料库)只要有人类翻译活动就会作为副产品大量存在。这为机器学习创造了天然的好条件。统计模型支持的机器翻译,本质上也是转换式的,因此也不需要一个单独先行的WSD来支持。

WSD 可以作为 NLP 庙堂里的一尊菩萨供起来,让学者型研究家去烧香,实际系统开发者大可以敬鬼神而远之。:=)

说到这里,想起数年前我被华尔街 VC 在 due diligence 阶段请去鉴定一家做WSD的技术公司(名字就不提了)。这家公司声称解决WSD有独到的技术,可以用来支持下一代搜索引擎,超越Google,因此吸引了华尔街投资家的注意。他们在白皮书中说得天花乱坠:WSD 是语言技术的皇冠,谁摘下了这颗皇冠,就掌握了核武器,可以无坚不摧。这些脱离实际的空谈乍听起来很有理由,很能迷惑人。可我是业内“达人”(开玩笑啦),不吃这一套。我给出的鉴定基本是否定性的,断定为极高风险,不建议投资:他们的demo系统也许确实做出了比其他系统更好的WSD结果(存疑,我 interview 他们的时候发现他们其实并没有做真正的业内系统的 apple-to-apple 比较),但是即便如此,其 scale up、适应不同 domain 并得到实用,是几乎不可能的。我的小组以前做过WSD研究,也发表过 state-of-the-art 的结果和论文,知道这不是好吃的果子,也知道这是研究性强实用性弱的题目。我投票枪毙了这项风险投资。(如果是国家科学基金,WSD 当然是可以立项的。)

需要说明一句:枪毙技术投资的事情是不能轻易做的。大家都是技术人,都指望凭着技术和资金去改造世界,成就一番大事业。本是同根生,相煎何太急?今天我枪毙了你的技术投资项目,明天我要创业,说动了资本家后,是绝对不希望也被同仁给毙了。人同此心。本来就是风险投资嘛,资本家早就做好了失败的心理准备,他们打10枪只要中了一次,就不算亏本买卖了。要允许技术带有风险,要允许技术人“忽悠”资本家(他们大多是只听得懂“忽悠”型话语方式的人,真的,行内的“规矩”了,想不忽悠都不成),作为技术人要鼓励资本家拥抱风险。尽管如此,那次枪毙 WSD 我觉得做得很坦然,这是箭在弦上不得不发。工业上 WSD 在可见的将来完全没有前途是注定的事情,用脚后跟都可以明白的事情,没有丝毫袒护的空间。这根本不是什么高风险高回报的问题,这是零回报的case。俗话都说了,女怕嫁错郎,男怕入错行,专业怕选错方向。方向错了,再努力都没戏。对于工业开发,WSD 就是这么一个错得离谱的方向。

朋友说了,如果这真是一个错误的方向,你为什么也拿政府的grant,做这个方向的研究了?(话说回来,不拿这个钱做这个研究,我能有这个权威和自信如此斩钉截铁地判断其应用价值几近于零么?)这个问题要这么看:其一,科学研究烧钱与工业投资烧钱本质不同,后者是以纯经济回报作为存在的理由。其二,政府的grant是竞标夺来的,我不拿,别人也要拿,总之,这纳税人的钱也省不下来。如果有问题,那是立项的问题。

说到立项,再多说几句。我们拿到的WSD研究项目是海军的SBIR创新基金,其主旨不同于鼓励纯科学研究的NSF,而是推动应用型技术的发展。从应用意义上说,这个立项方向是有错的。立项虽然是政府项目经理人之间竞标最后胜出的,但项目经理人不是一线科技人,他们的 idea 也是受到技术人影响的结果。说白了,还是技术人的忽悠。这个项目不大,问题还不大;如果一个大项目选错了方向,才真是糟蹋人民的钱财。历史上这样的案例还是不少的。远的有日本在上个世纪80年代上马的所谓“第五代计算机”项目,忽悠得昏天黑地,似乎这个大项目的完成,新一代能够理解自然语言的人工智能电脑就会面世,日本就会成为世界电脑技术翘楚。结果呢,无疾而终(当然,那么大的投资砸下去,总会有一些零星的技术进步,也培养一批技术和研究人才,但作为整体目标,这个项目可以说是完败,头脑发热的日式大跃进)。美国呢,这样的热昏项目也有过。赫赫有名的 DARPA 是美国国家项目最成功的典范了,它推动了美国的高技术创新,催生了一些重要的技术产业,包括信息抽取(Information Extraction)和搜索技术,包括问答系统(Question Answering)。然而,即便如此成功的 program,有时也会有热昏如五代机这样的项目出台:完全错误的方向,不成比例的投资,天方夜谭的前景描述。笔者当年为找研究基金,研读某 DARPA 项目的描述,当时的震撼可以说是目瞪口呆,满篇热昏的胡话,感觉与中国的大跃进可以一比。惊异于科学界整体怎么会出现允许这样项目出来的环境,而且大家都争抢着分一杯羹,全然不顾其中的假大空。点到为止,就此打住。

【置顶:立委科学网博客NLP博文一览(定期更新版)】
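把上面这条词汇驱动的抽取规则直接写成代码,大致是下面这个样子。这是一段极简的 Python 玩具示意:谓词表和公司名的判定方式都是虚构的假设,实际系统中这些条件由命名实体识别和句法分析模块提供,并非原文系统的实现:

# 玩具示意:词汇驱动的"公司并购事件"抽取规则,绕开 WSD
ACQUISITION_VERBS = {"buy", "buys", "bought", "purchase", "purchased", "acquire", "acquired"}
KNOWN_COMPANIES = {"Microsoft", "Powerset"}  # 示意用;实际应由 NE 识别模块提供

def extract_acquisition(svo):
    """输入 (逻辑主语, 动词, 逻辑宾语);满足规则则产出公司并购事件,否则返回 None。"""
    actor, verb, undergoer = svo
    if (verb.lower() in ACQUISITION_VERBS
            and actor in KNOWN_COMPANIES        # @1 必须是公司名
            and undergoer in KNOWN_COMPANIES):  # @2 必须是公司名
        return {"事件": "公司并购", "收购公司": actor, "被收购公司": undergoer}
    return None

# (1) 命中:Microsoft bought Powerset for $100 million
print(extract_acquisition(("Microsoft", "bought", "Powerset")))
# (2) 不命中:I am not going to buy his argument —— argument 不是公司名
print(extract_acquisition(("I", "buy", "argument")))

可以看到,规则本身并不需要先判定 buy 是“购买”还是“相信”,词义消歧被抽取条件顺带完成了。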
个人分类: 立委科普|15875 次阅读|5 个评论
[转载]World Standards Day 2011—14 October 2011
LEOLAND 2011-8-29 15:15
World Standards Day 2011 — 14 October 2011
International standards – Creating confidence globally

Message from IEC, ISO and ITU:
Dr. Klaus WUCHERER, IEC President
Dr. Boris ALESHIN, ISO President
Dr. Hamadoun TOURÉ, ITU Secretary-General

In today's world we need to have a high level of expectation that things will work the way we expect them to work. We expect that when we pick up the phone we will be able to instantly connect to any other phone on the planet. We expect to be able to connect to the Internet and be provided with news and information… instantly. When we fall ill, we rely on the healthcare equipment used to treat us. When we drive our cars, we have confidence that the engine management, steering and braking, and child safety systems are reliable. We expect to be protected against electrical power failure and the harmful effects of pollution.

International standards give us this confidence globally. Indeed, one of the key objectives of standardization is to provide this confidence. Systems, products and services perform as we expect them to because of the essential features specified in international standards. International standards for products and services underpin quality, ecology, safety, reliability, interoperability, efficiency and effectiveness. They do all of this while giving manufacturers confidence in their ability to reach out to global markets, safe in the knowledge that their product will perform globally. Interoperability creates economies of scale and ensures users can obtain equal service wherever they travel. So international standards benefit consumers, manufacturers and service providers alike. Importantly, in developing countries this accelerates the deployment of new products and services and encourages economic development.

International standards create this confidence by being developed in an environment of openness and transparency, where every stakeholder can contribute. It is the stated aim of the WSC partners – IEC, ISO and ITU – to facilitate and augment this confidence globally, so as to connect the world with international standards.
个人分类: 标准文存|1793 次阅读|0 个评论
今天是第十屆世界睡眠日WORLD SLEEP DAY 2011
LEOLAND 2011-3-21 18:28
曲津華

睡眠是個好東西——無論怎樣強調和讚美,都不過分!

世界睡眠日據說是由 World Association of Sleep Medicine (WASM) 發起的(我翻譯為“世界睡療協會”)。但每年都要慶祝的世界睡眠日,具體是哪一天?有 3 月 21 日版本,也有如下維基百科的多種版本。

World Sleep Day 2008:14 March — Sleep well, live fully awake(睡得好,生活好)
World Sleep Day 2009:20 March — Drive alert, arrive safe(睡眠質量高,駕車更安全)
World Sleep Day 2010:19 March — Sleep Well, Stay Healthy(好睡好健康)
World Sleep Day 2011:18 March — Sleep Well, Grow Healthy(睡得好,更健康)

不管它了,今晚就好好睡一下先——默念口號“睡得好,更健康”,美美地睡……

P.S. 睡療,比水療簡便、綠色、低碳。咱,值得擁有!

早前有一篇歌頌睡眠的小文 SLEEPING IS NEVER TIME WASTING 供參考。
http://bbs.sciencenet.cn/home.php?mod=spaceuid=247430do=blogid=290123

2011-03-21
个人分类: 科学劄记|3008 次阅读|0 个评论
[转载]suspicion on WSD
xrtang 2010-4-17 23:20
转 哈工大信息检索中心论坛上的一个帖子,是Yorick Wilks在本世纪之初写的对WSD的疑惑。帖子原地址: http://bbs.langtech.org.cn/viewthread.php?tid=2046 全文如下: Is Word Sense Disambiguation just one more NLP task? Is Word Sense Disambiguation just one more NLP task? Yorick Wilks Abstract: The paper compares the tasks of part-of-speech (POS) tagging and word-sense-tagging or disambiguation (WSD), and argues that the tasks are not related by fineness of grain or anything like that, but are quite different kinds of task, particularly because there is nothing in POS corresponding to sense novelty. The paper also argues for the reintegration of sub-tasks that are being separated for evaluation. Introduction I want to make clear right away that I am not writing as a sceptic about word-sense disambiguation (WSD) let alone as a recent convert: on the contrary, since my PhD thesis was on the topic thirty years ago. That (Wilks, 1968) was what we would now call a classic AI toy system approach, one that used techniques later called Preference Semantics, but applied to real newspaper texts, as controls on the philosophical texts that were my real interest at the time. But it did attach single sense representations to words drawn from a polysemous lexicon of 800 or so. If Boguraev was right, in his informal survey twelve years ago, that the average NLP lexicon was under fifty words, then that work was ahead of its time and I do therefore have a longer commitment to, and perspective on, the topic than most, for whatever that may be worth!. I want to raise some general questions about WSD as a task, aside from all the busy work in SENSEVAL: questions that should make us worried and wary about what we are doing here, but definitely NOT stop doing it. I can start by reminding us all of the obvious ways in which WSD is not like part-of-speech (POS) tagging, even though the two tasks are plainly connected in information terms, as Stevenson and I pointed out in (Wilks and Stevenson, 1998a), and were widely misunderstood for doing so. From these differences, of POS and WSD, I will conclude that WSD is not just one more partial task to be hacked off the body of NLP and solved. What follows acknowledges that Resnik and Yarowsky made a similar comparison in 1997 (Resnik and Yarowsky, 1997) though this list is a little different from theirs: There is broad agreement about POS tags in that, even for those committed to differing sets, there is little or no dispute that they can be put into one-many correspondence. That is not generally accepted for the sets of senses for the same words from different lexicons. There is little dispute that humans can POS tag to a high degree of consistency, but again this is not universally agreed for WS tagging, as various email discussions leading up to this workshop have shown. I'll come back to this issue below, but its importance cannot be exaggerated -- if humans cannot do it then we are wasting our time trying to automate it. I assume that fact is clear to everyone: whatever maybe the case in robotics or fast arithmetic, in the NL parts of AI there is no point modelling or training for skills that humans do not have! 
I do not know the genesis of the phrase `` lexical tuning, but the phenomenon has been remarked, and worked on, for thirty years and everyone seems agreed that it happens, in the sense that human generators create, and human analysers understand, words in quite new senses, ungenerated before or, at least, not contained in the point-of-reference lexicon, whether that be thought of as in the head or in the computer. Only this view is consistent with the evident expansion of sense lists in dictionaries with time; these new additions cannot plausibly be taken as established usages not noticed before. If this is the case, it seems to mark an absolute difference from POS tagging (where novelty does not occur in the same way), and that should radically alter our view of what we are doing here, because we cannot apply the standard empirical modelling method to that kind of novelty. The now standard empirical paradigm of assumes prior markup, in the sense of a positive answer to the question (2) above. But we cannot, by definition, mark up for new senses, those not in the list we were initially given, because the text analysed creates them, or they were left out of the source from which the mark up list came. If this phenomenon is real, and I assume it is, it sets a limit on phenomenon (2), the human ability to pre-tag with senses, and therefore sets an upper bound on the percentage results we can expect from WSD, a fact that marks WSD out quite clearly from POS tagging. The contrast here is in fact quite subtle as can be seen from the interesting intermediate case of semantic tagging: which is the task of attaching semantic, rather than POS, tags to words automatically, a task which can then be used to do more of the WSD task (as in Dini et al., 1998) than POS tagging can, since the ANIMAL or BIRD versus MACHINE tags can then separate the main senses of `` crane. In this case, as with POS, one need not assume novelty in the tag set, but must allow for novel assignments from it to corpus words e.g. when a word like `` dog or `` pig was first used in a human sense. It is just this sense of novelty that POS tagging does also have, of course, since a POS tag like VERB can be applied to what was once only a noun, as with `` ticket. This kind of novelty, in POS and semantic tagging, can be pre-marked up with a fixed tag inventory, hence both these techniques differ from genuine sense novelty which cannot be premarked. As I said earlier, the thrust of these remarks is not intended sceptically, either about WSD in particular, or about the empirical linguistic agenda of the last ten years more generally. I assume the latter has done a great deal of good to NLP/CL: it has freed us from toy systems and fatuous example mongering, and shown that more could be done with superficial knowledge-free methods than the whole AI knowledge-based-NLP tradition ever conceded: the tradition in which every example, every sentence, had in principle to be subjected to the deepest methods. Minsky and McCarthy always argued for that, but it seemed to some even then an implausible route for any least-effort-driven theory of evolution to have taken. The caveman would have stood paralysed in the path of the dinosaur as he downloaded deeper analysis modules, trying to disprove he was only having a nightmare. 
However, with that said, it may be time for some corrective: time to ask not only how we can continue to slice off more fragments of partial NLP as tasks to model and evaluate, but also how to reintegrate them for real tasks that humans undoubtedly can evaluate reliably, like MT and IE, and which are therefore unlike some of the partial tasks we have grown used to (like syntactic parsing) but on which normal language users have no views at all, for they are expert-created tasks, of dubious significance outside a wider framework. It is easy to forget this because it is easier to keep busy, always moving on. But there are few places left to go after WSD:-empirical pragmatics has surely started but may turn out to be the final leg of the journey. Given the successes of empirical NLP at such a wide range of tasks, it is not to soon to ask what it is all for, and to remember that, just because machine translation (MT) researchers complained long ago that WSD was one of their main problems, it does not follow that high level percentage success at WSD will advance MT. It may do so, and it is worth a try, but we should remember that Martin Kay warned years ago that no set of individual solutions to computational semantics, syntax, morphology etc. would necessarily advance MT. However, unless we put more thought into reintegrating the new techniques developed in the last decade we shall never find out. Can humans sense tag? I wish now to return to two of the topics raised above: first, the human task: itself. It seems obvious to me that, aside from the problems of tuning and other phenomena that go under names like vagueness, humans, after training, can sense-tag texts at reasonably high levels and reasonable inter-annotator consistency. They can do this with alternative sets of senses for words for the same text, although it may be a task where some degree of training and prior literacy are essential, since some senses in such a list are usually not widely known to the public. This should not be shocking: teams of lexicographers in major publishing houses constitute literate, trained teams and they can normally achieve agreement sufficient for a large printed dictionary for publication (about sense sets, that is, a closely related skill to sense-tagging). Those averse to claims about training and expertise here should remember that most native speakers cannot POS tag either, though there seems substantial and uncontentious consistency among the trained. There is strong evidence for this position on tagging ability, which includes (Green, 1989 see also Jorgensen, 1990) and indeed the high figures obtained for small word sets by the techniques pioneered by Yarowsky (Yarowsky, 1995). Many of those figures rest on forms of annotation (e.g. assignment of words to thesaurus head sets in Roget), and the general plausibility of the methodology serves to confirm the reality of human annotation (as a consistent task) as a side effect. The counterarguments to this have come explicitly from the writings of Kilgarriff (1993), and sometimes implicitly from the work of those who argue from the primacy of lexical rules or of notions like vagueness in regard to WSD. In Kilgarriff's case I have argued elsewhere (Wilks, 1997) that the figures he produced on human annotation are actually consistent with very high levels of human ability to sense-tag and are not counter-arguments at all, even though he seems to remain sceptical about the task in his papers. 
He showed only that for most words there are some contexts for which humans cannot assign a sense, which is of course not an argument against the human skill being generally successful. On a personal note, I would hope very much to be clearer when I see his published reaction to the SENSEVAL workshop what his attitude to WSD really is. In writing he is a widely published sceptic, in the flesh he is the prime organiser of this excellent event (SENSEVAL Workshop) to test a skill he may, or may not, believe in. There need be no contradiction there, but a fascinating question about motive lingers in the air. Has he set all this up so that WSD can destroy itself when rigourously tested? One does not have to be a student of double-blind tests, and the role of intention in experimental design, to take these questions seriously, particularly as he has designed the SENSEVAL methodology and the use of the data himself. The motive question here is not mere ad hominem argument but a serious question needing an answer. These are not idle questions, in my view, but go to the heart of what the SENSEVAL workshop is for: is it to show how to do better at WSD, or is to say something about wordsense itself (which might involve saying that you cannot do WSD by computer at all, or cannot do it well enough to be of interest?). In all this discussion we should remember that, if we take the improvement of (assessable) real tasks as paramount, those like MT, Information Retrieval and Information Extraction (IE), then it may not in the end matter whether humans are ever shown psycholinguistically to need POS tagging or WSD for their own language performance;-there is much evidence they do not. But that issue is wholly separate from what concerns us here; it may still be useful to advance MT/IE via partial tasks like WSD, if they can be shown performable, assessable, and modelable by computers, no matter how humans turn out to work. The implicit critique of the broadly positive position above (i.e. that WSD can be done by people and machines and we should keep at it) sometimes seems to come as well from those who argue (a) for the inadequacy of lexical sense sets over productive lexical rules and (b) for the inherently vague quality of the difference between senses of a given word. I believe both these approaches are muddled if their proponents conclude that WSD is therefore fatally flawed as a task;- and clearly not all do since some of them are represented here as participants. Lexical Rules Lexical rules go back at least to Givon's (1967) thirty-year old sense-extension rules and they are in no way incompatible with a sense-set approach, like that found in a classic dictionary. Such sense sets are normally structured (often by part of speech and by general and specific senses) and the rules are, in some sense, no more than a compression device for predicting that structuring. But the set produced by any set of lexical rules is still a set, just as a dictionary list of senses is a set, albeit structured. It is mere confusion to think one is a set and one not: Nirenburg and Raskin (1997) have pointed out that those who argue against lists of senses (in favour of rules, e.g. Pustejovsky 1995) still produce and use such lists. What else could they do? I myself cannot get sufficient clarity at all on what the lexical rule approach, whatever its faults or virtues, has to do with WSD? 
The email discussion preceding this workshop showed there were people who think the issues are connected, but I cannot see it, but would like to be better informed before I go home from here. If their case is that rules can predict or generate new senses then their position is no different (with regard to WSD) from that of anyone else who thinks new senses important, however modelled or described. The rule/compression issue itself has nothing essential to do with WSD: it is simply one variant of the novelty/tuning/new-sense/metonymy problem, however that is described. The vagueness issue is again an old observation, one that, if taken seriously, must surely result in a statistical or fuzzy-logic approach to sense discrimination, since only probabilistic (or at least quantitative) methods can capture real vagueness. That, surely, is the point of the Sorites paradox: there can be no plausible or rational qualitatively-based criterion (which would include any quantitative system with clear limits: e.g. tall = over 6 feet) for demarcating `` tall, `` green or any inherently vague concept. If, however, sense sets/lists/inventories are to continue to play a role, vagueness can mean no more than highlighting what all systems of WSD must have, namely some parameter or threshold for the assignment to one of a list of senses versus another, or setting up a new sense in the list. Talk of vagueness adds nothing specific to help that process for those who want to assign on some quantitative basis to one sense rather than another; algorithms will capture the usual issue of tuning to see what works and fits our intuitions. Vagueness would be a serious concept only if the whole sense list for a word (in rule form or not) was abandoned in favour of statistically-based unsupervised clusters of usages or contexts. There have been just such approaches to WSD in recent years (e.g. Bruce and Wiebe, 1994, Pedersen and Bruce, 1997, Schuetze Pederson, 1995) and the essence of the idea goes back to Sparck Jones 1964/1986) but such an approach would find it impossible to take part in any competition like SENSEVAL because it would inevitably deal in nameless entities which cannot be marked up for. Vague and Lexical Rule based approaches also have the consequence that all lexicographic practice is, in some sense, misguided: dictionaries according to such theories are fraudulent documents that could not help users, whom they systematically mislead by listing senses. Fortunately, the market decides this issue, and it is a false claim. Vagueness in WSD is either false (the last position) or trivial, and known and utilised within all methodologies. This issue owes something to the systematic ignorance of its own history so often noted in AI. A discussion email preceding this workshop referred to the purported benefits of underspecification in lexical entries, and how recent formalisms had made that possible. How could anyone write such a thing in ignorance of the 1970s and 80s work on incremental semantic interpretation of Hirst, Mellish and Small (Hirst, 1987; Mellish, 1983; Small et al., 1988) among others? None of this is a surprise to those with AI memories more than a few weeks long: in our field people read little outside their own notational clique, and constantly `` rediscover old work with a new notation. This leads me to my final point which has to do, as I noted above, with the need for a fresh look at technique integration for real tasks. 
We all pay lip service to this while we spend years on fragmentary activity, arguing that that is the method of science. Well, yes and no, and anyway WSD is not science: what we are doing is engineering and the scientific method does not generally work there, since engineering is essentially integrative, not analytical. We often write or read of `` hybrid systems in NLP, which is certainly an integrative notion, but we have little clear idea of what it means. If statistical or knowledge-free methods are to solve some or most cases of any linguistic phenomenon, like WSD, how do we then locate that subclass of the phenomena that other, deeper, techniques like AI and knowledge-based reasoning are then to deal with? Conversely, how can we know which cases the deeper techniques cannot or need not deal with? If there is an upper bound to empirical methods, and I have argued that that will be lower for WSD than for some other NLP tasks for the reasons set out above, then how can we pull in other techniques smoothly and seamlessly for the `` hard examples? The experience of POS tagging, to return to where we started, suggests that rule-driven taggers can do as well as purely machine learning-based taggers, which, if true, suggests that symbolic methods, in a broad sense, might still be the right approach for the whole task. Are we yet sure this is not the case for WSD? I simply raise the question. Ten years ago, it was taken for granted in most of the AI/NLP community that knowledge-based methods were essential for serious NLP. Some of the successes of the empirical program (and especially the TIPSTER program) have caused many to reevaluate that assumption. But where are we now, if a real ceiling to such methods is already in sight? Information Retrieval languished for years, and maybe still does, as a technique with a practical use but an obvious ceiling, and no way of breaking through it; there was really nowhere for its researchers to go. But that is not quite true for us, because the claims of AI/NLP to offer high quality at NLP tasks have never been really tested. They have certainly not failed, just got left behind in the rush towards what could be easily tested! Large or Small-scale WSD? Which brings me to my final point: general versus small-scale WSD. Our group is one of the few that has insisted on continuing with general WSD: the tagging and test of all content words in a text, a group that includes CUP, XERC-Grenoble and CRL-NMSU. We currently claim about 90% correct sense assignment (Wilks and Stevenson, 1998b) and do not expect to be able to improve much on that for the reasons set out above; we believe the rest is AI or lexical tuning! The general argument for continuing with the all-word paradigm, rather than the highly successful paradigm of Yarowsky et al., is that that is the real task, and there is no firm evidence that the small scale will scale up to the large because much of sense-disambiguation is mutual between the words of the text, which cannot be used by the small set approach. I am not sure this argument is watertight but it seems plausible to me. Logically, if you claim to do all the content words you ought, in principle, to be able to enter a contest like SENSEVAL that does only some of the words with an unmodified system. This is true, but you will also expect to do worse, as you have not have had as much training data for the chosen word set. 
Moreover you will have to do far more preparation to enter if you insist, as we would, on bringing the engines and data into play for all the training and test set words; the effort is that much greater and it makes such an entry self-penalising in terms of both effort and likely outcome, which is why we decided not to enter in the first round, regretfully, but just to mope and wail at the sidelines. The methodology chosen for SENSEVAL was a natural reaction to the lack of training and test data for the WSD task, as we all know, and that is where I would personally like to see effort put in the future, so that everyone can enter all the words; I assume that would be universally agreed to if the data were there. It is a pity, surely, to base the whole structure of a competition on the paucity of the data. Conclusion What we would like to suggest positively is that we cooperate to produce more data, and use existing all-word systems, like Grenoble, CUP, our own and others willing to join, possibly in combination, so as to create large-scale tagged data quasi-automatically, rather in the way that the Penn tree bank was produced with the aid of parsers, not just people. We have some concrete suggestions as to how this can be done, and done consistently, using not only multiple WSD systems but also by cross comparing the lexical resources available, e.g. WordNet (or EuroWordNet) and a major monolingual dictionary. We developed our own reasonably large test/training set with the WordNet-LDOCE sense translation table (SENSUS, Knight and Luk, 1994) from ISI. Some sort of organised effort along those lines, before the next SENSEVAL, would enable us all to play on a field not only level, but much larger. Bibliography 1 Bruce, R. and Wiebe, J. (1994) Word-sense disambiguation using decomposable models, Proc. ACL-94. 2 Dini, L., di Tommaso, V. and Segond, F. (1998) Error-driven word sense disambiguation. In Proc. COLING-ACL98, Montreal. 3 Givon, T. (1967) Transformations of Ellipsis, Sense Development and Rules of Lexical Derivation. SP-2896, Systems Development Corp., Sta. Monica, CA. 4 Green, G. (1989) Pragmatics and Natural Language Understanding. Erlbaum: Hillsdale, NJ. 5 Hirst, G. (1987) Semantic Interpretation and the Resolution of Ambiguity, CUP: Cambridge, England. 6 Jorgensen, J. (1990) The psychological reality of word senses, Journal of Psycholinguistic Research, vol 19. 7 Kilgarriff, A. (1993) Dictionary word-sense distinctions: an enquiry into their nature, Computers and the Humanities, vol 26. 8 Knight, K. and Luk, S. (1994) Building a Large Knowledge Base for Machine Translation, Proceedings of the American Association for Artificial Intelligence Conference AAAI-94, pp. 185-109, Seattle, WA. 9 Mellish, C. (1983) Incremental semantic interpretation in a modular parsing system. In K. Sparck-Jones and Y. Wilks (eds.) Automatic Natural Language Parsing, Ellis Horwood/Wiley: Chichester/NYC. 10 Nirenburg, S. and Raskin., V. (1997) Ten choices for lexical semantics. Research Memorandum, Computing Research Laboratory, Las Cruces, NM. 11 Pedersen, T. and Bruce, R. (1997) Distinguishing Word Senses in Untagged Text, Proceedings of the Second Conference on Empirical Methods in Natural Language Processing, pp. 197-207, Providence, RI. 12 Pustejovsky, J. (1995) The Generative Lexicon, MIT Press: Cambridge, MA. 13 Resnik, P. and Yarowsky, D. 
(1997) A Perspective on Word Sense Disambiguation Techniques and their Evaluation, Proceedings of the SIGLEX Workshop "Tagging Text with Lexical Semantics: What, why and how?", pp. 79-86, Washington, D.C.
14 Schutze, H. (1992) Dimensions of Meaning, Proceedings of Supercomputing '92, pp. 787-796, Minneapolis, MN.
15 Schutze, H. and Pederson, J. (1995) Information Retrieval based on Word Sense, Proc. Fourth Annual Symposium on Document Analysis and Information Retrieval. Las Vegas, NV.
16 Small, S., Cottrell, G., and Tanenhaus, M. (Eds.) (1988) Lexical Ambiguity Resolution, Morgan Kaufmann: San Mateo, CA.
17 Sparck Jones, K. (1964/1986) Synonymy and Semantic Classification. Edinburgh UP: Edinburgh.
18 Wilks, Y. (1968) Argument and Proof. Cambridge University PhD thesis.
19 Wilks, Y. (1997) Senses and Texts. Computers and the Humanities.
20 Wilks, Y. and Stevenson, M. (1998a) The Grammar of Sense: Using part-of-speech tags as a first step in semantic disambiguation, Journal of Natural Language Engineering, 4(1), pp. 1-9.
21 Wilks, Y. and Stevenson, M. (1998b) Optimising Combinations of Knowledge Sources for Word Sense Disambiguation, Proceedings of the 36th Meeting of the Association for Computational Linguistics (COLING-ACL-98), Montreal, Canada.
22 Yarowsky, D. (1995) Unsupervised Word-Sense Disambiguation Rivaling Supervised Methods, Proceedings of the 33rd Annual Meeting of the Association for Computational Linguistics (ACL-95), pp. 189-196, Cambridge, MA.
个人分类: 生活点滴|3143 次阅读|2 个评论
WSD and lexicographer
xrtang 2010-1-29 17:38
Well, this is the very first post on my blog. I plan to use it to record my thinking and to exchange ideas with fellows up here.

I am currently reading Word Sense Disambiguation: Algorithms and Applications. In the first two articles, one topic is in focus, namely the inventory of word senses. Adam Kilgarriff, Nancy Ide and Yorick Wilks propose that the lexicographer's work may not be suitable for WSD and other NLP tasks such as information retrieval and machine translation. Thus they think it is high time to reexamine the theories of word sense and redo the word sense inventory.

I find this kind of idea interesting, as I have been annotating some Chinese polysemous words such as 大 and 小, using HowNet as the reference semantic system. What always puzzles me is that there are always cases where I cannot decide which sense option to apply. Sometimes more than one sense seems acceptable by intuition, and at other times none seems to fit. The idea of homographs, as proposed by Ide and Wilks, can surely avoid this problem. But when it comes to tasks where a more precise sense division is needed, for example metaphor recognition, the homograph approach may not work. One thing is for sure: the topic of the sense inventory needs to be considered further.
个人分类: 生活点滴|3267 次阅读|1 个评论
