Enclosed is a piece of (old) news about the language gene:

Scientists Identify a Language Gene
Bijal P. Trivedi for National Geographic Today
October 4, 2001

Researchers in England have identified the first gene to be linked to language and speech, suggesting that our human urge to babble and chat is innate, and that our linguistic abilities are at least partially hardwired.

"It is important to realize that this is a gene associated with language, not the gene," said Anthony Monaco of the University of Oxford, England, who led the genetic aspects of the study.

The gene is required during early embryonic development for formation of brain regions associated with speech and language.

The gene, called FOXP2, was identified through studies of a severe speech and language disorder that affects almost half the members of a large family, identified only as "KE." Individuals with the disorder are unable to select and produce the fine movements with the tongue and lips that are necessary to speak clearly.

"The most obvious feature is that they are unintelligible both to naive listeners and to other KE family members without the disorder," said neurologist Faraneh Vargha-Khadem of London's Institute for Child Health, who studied the family. The members of the family also have dyslexic tendencies, difficulty processing sentences, and poor spelling and grammar.

FOXP2 is responsible for the rare disorder seen in the KE family that is a unique mixture of motor and language impediments, said Monaco. But, Monaco cautioned, "FOXP2 is unlikely to be the cause of less severe language deficits that affect approximately 4 percent of schoolchildren. FOXP2 will not be the major gene involved in most of these cases."

Their findings are published in the October 4 issue of the journal Nature.

Using data from the KE family, researchers narrowed the location of the FOXP2 gene to a region of chromosome 7 that contained about 70 genes.
Analyzing these genes one by one is a task that could easily have taken more than a year. But Monaco's team made a breakthrough when researcher Jane Hurst of Oxford Radcliffe Hospital identified a British boy, unrelated to the KE family, who had an almost identical language deficit.

The boy, known as "CS," had a visible defect in chromosome 7 that specifically affected the FOXP2 gene. "The defect was like a signpost, precisely highlighting the gene responsible for the speech disorder," said Monaco.

The FOXP2 gene produces a protein called a transcription factor, which attaches itself to other regions of DNA and switches genes on and off. In the KE family, one of 2,500 units of DNA that make up the FOXP2 gene is mutated. Monaco suggested that this mutation prevents FOXP2 from activating the normal sequence of genes required for early brain development.

"It is extraordinary that such a minute change in the gene is sufficient to disrupt a faculty as vital as language," he said.

Although humans have two copies of every gene, just one mutated copy of FOXP2 (as in the case of both CS and the KE family) can have devastating effects on brain development, said Vargha-Khadem.

Brain imaging studies of the KE family revealed that affected members have abnormal basal ganglia, a region in the brain involved with movement, which could explain difficulty in moving the lips and tongue. Regions of the cortex involved in speech and language also appear aberrant.

The discovery of FOXP2 offers Monaco and other geneticists a probe to fish for other genes involved in development, specifically those directly controlled by FOXP2.

Also in progress is a collaborative project to study the evolution of the human FOXP2 gene by comparing it with versions in chimps and other primates. Monaco speculates that differences between the FOXP2 gene in humans and chimps may reveal a genetic basis for differing abilities to communicate.
http://news.nationalgeographic.com/news/2001/10/1004_TVlanguagegene_2.html

A related passage about the language gene:

'Language gene' speeds learning

Mouse study suggests that mutation to FOXP2 gene may have helped humans learn the muscle movements for speech.

A mutation that appeared more than half a million years ago may have helped humans learn the complex muscle movements that are critical to speech and language. The claim stems from the finding that mice genetically engineered to produce the human form of the gene, called FOXP2, learn more quickly than their normal counterparts.

The work was presented by Christiane Schreiweis, a neuroscientist at the Max Planck Institute (MPI) for Evolutionary Anthropology in Leipzig, Germany, at the Society for Neuroscience meeting in Washington DC this week.

Scientists discovered FOXP2 in the 1990s by studying a British family known as 'KE' in which three generations suffered from severe speech and language problems. Those with language problems were found to share an inherited mutation that inactivates one copy of FOXP2.

Most vertebrates have nearly identical versions of the gene, which is involved in the development of brain circuits important for the learning of movement. The human version of FOXP2, the protein encoded by the gene, differs from that of chimpanzees at two amino acids, hinting that changes to the human form may have had a hand in the evolution of language.
A team led by Schreiweis' colleague Svante Pääbo discovered that the gene is identical in modern humans (Homo sapiens) and Neanderthals (Homo neanderthalensis), suggesting that the mutation appeared before these two human lineages diverged around 500,000 years ago.

Altered squeaks

A few years ago, researchers at the MPI Leipzig engineered mice to make the human FOXP2 protein. The 'humanized' mice were less intrepid explorers and, when separated from their mothers, pups produced altered ultrasonic squeaks compared to pups with the mouse version of FOXP2.

Their brains, compared with those of normal mice, contained neurons with more and longer dendrites, the tendrils that help neurons communicate with each other. Another difference was that cells in a brain region called the basal ganglia were quicker to become unresponsive after repeated electrical stimulation, a trait called 'long-term depression' that is implicated in learning and memory.

At the neuroscience meeting, Schreiweis reported that mice with the human form of FOXP2 learn more quickly than ordinary mice. She challenged mice to solve a maze that involved turning either left or right to find a water reward. A visual clue, such as a star, along with the texture of the maze's surface, showed the correct direction to turn.
After eight days of practice, mice with the human form of FOXP2 learnt to follow the clues to the water 70% of the time. Normal mice took an additional four days to reach this level. Schreiweis says that the human form of the gene allowed mice to more quickly integrate the visual and tactile clues when learning to solve the maze.

In humans, she says, the mutation to FOXP2 might have helped our species learn the complex muscle movements needed to form basic sounds and then combine these sounds into words and sentences.

Another MPI team member, Ulrich Bornschein, presented work at the neuroscience meeting showing that the changes to brain circuitry that lead to quicker learning come about with just one of the two amino-acid changes in the human form of FOXP2. The second mutation may do nothing.

"That makes sense," says Genevieve Konopka, a neuroscientist at the University of Texas Southwestern Medical Center in Dallas, who also studies FOXP2. Carnivores, including dogs and wolves, independently evolved the other human FOXP2 mutation, with no obvious effect on their brains.

Faraneh Vargha-Khadem, a neuroscientist at University College London who has studied the KE family in which FOXP2 is mutated, thinks that the new findings could help explain the gene's role in perfecting the facial movements involved in speech.
But she does not see how changes in basic learning circuitry could explain how FOXP2 helps humans to automatically and effortlessly translate their thoughts into spoken language. "You are not deciding how you are going to move your muscles to form these sounds," she says.

http://blog.sina.com.cn/s/blog_70f7edbc0100ydq3.html
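My research direction is dependency parsing. Let me sort through the papers I have been reading recently.

I started with machine learning (more specifically, deep learning) papers and came across several on neural probabilistic language models. Language models are closely related to dependency parsing, and I hope the two can be combined.

Here is a reading list on neural probabilistic language models:

[1] A Neural Probabilistic Language Model
This is the first paper on neural probabilistic language models.

[2] Hierarchical Probabilistic Neural Network Language Model
Builds on [1] by organizing the vocabulary into a hierarchy, which reduces the time complexity of the algorithm.

[3] Three New Graphical Models for Statistical Language Modeling
Geoffrey Hinton is a co-author (with Andriy Mnih). He is one of the pioneers of deep learning, which is extremely popular in machine learning right now; his work from 2006 onward marked the revival of neural networks in the machine learning community. This paper applies some of the ideas of deep learning to language modeling.

[4] A Scalable Hierarchical Distributed Language Model
Again co-authored by Geoffrey Hinton. It builds on [3] by organizing the vocabulary into a hierarchy, much as [2] improves on [1]. The difference is that [2] relies on WordNet as prior knowledge to build the hierarchy, whereas [4] builds it automatically.

Below I pick the most central of these papers, [1] A Neural Probabilistic Language Model, and analyze it in detail.

1. Background

1.1 Language models

A language model is a model trained on a given training set with some algorithm. It can (1) compute the probability of a sentence under the model, and (2) estimate the probability of the next word given the preceding words.

An example:

P("I love you") = 0.003, but P("I you love") = 0.00000003. Even in a purely statistical sense, a grammatical sentence is far more probable than an ungrammatical one.
P("you" | "I love") = 0.1 means that, given that the previous two words are "I love", the probability that the next word is "you" is 0.1.
P("reading" | "I love") = 0.001 means that, given that the previous two words are "I love", the probability that the next word is "reading" is 0.001.

These probabilities show that a language model captures statistical regularities, not purely grammatical ones.

1.2 N-gram language models

The traditional language model is the N-gram model, which rests on one important assumption: the probability of each word depends only on the N words before it. As a formula:

P(w_t | w_1, ..., w_{t-1}) ≈ P(w_t | w_{t-N}, ..., w_{t-1})

N-gram models are widely used in practice, but they have obvious drawbacks:
1) The parameter space is not smooth (many word sequences never occur in the training data), so smoothing algorithms are usually needed to compensate.
2) Words that do not appear in the vocabulary cannot be handled.

3. Neural language model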
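Before turning to the neural model, the N-gram assumption from section 1.2 can be made concrete with a small sketch. The code below is not taken from paper [1] or from the original post: the toy corpus, the function names (`train_ngram`, `cond_prob`, `sentence_prob`), the `<s>` padding token, and the choice of add-one (Laplace) smoothing are all assumptions made for illustration. It conditions each word on the previous n words, as in the text, and shows that a word order seen in training scores higher than a scrambled one.

```python
from collections import Counter

BOS = "<s>"  # hypothetical padding token marking the start of a sentence


def train_ngram(sentences, n):
    """Count (context, word) pairs, where the context is the previous n words."""
    context_counts = Counter()
    pair_counts = Counter()
    vocab = set()
    for sentence in sentences:
        words = sentence.split()
        vocab.update(words)
        padded = [BOS] * n + words
        for i in range(n, len(padded)):
            context = tuple(padded[i - n:i])
            context_counts[context] += 1
            pair_counts[(context, padded[i])] += 1
    return context_counts, pair_counts, vocab


def cond_prob(model, context, word):
    """P(word | context) with add-one smoothing, so unseen pairs are not zero."""
    context_counts, pair_counts, vocab = model
    context = tuple(context)
    return (pair_counts[(context, word)] + 1) / (context_counts[context] + len(vocab))


def sentence_prob(model, sentence, n):
    """Probability of a sentence as a product of conditional word probabilities."""
    padded = [BOS] * n + sentence.split()
    prob = 1.0
    for i in range(n, len(padded)):
        prob *= cond_prob(model, padded[i - n:i], padded[i])
    return prob


# Toy corpus (an assumption for illustration only).
corpus = ["I love you", "I love reading", "you love reading", "I love you"]
model = train_ngram(corpus, n=2)

p_good = sentence_prob(model, "I love you", n=2)
p_bad = sentence_prob(model, "I you love", n=2)
print(p_good > p_bad)  # the word order seen in training is more probable
```

Add-one smoothing is only the simplest way to patch the sparsity described in drawback 1). It does nothing useful for drawback 2): a word outside the vocabulary still has no principled probability here, which is part of the motivation for the neural approach.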
Let us also look at how concepts from Chinese medicine, and their relationship to information and knowledge, are understood internationally.

Yin Deficiency (阴虚) (Disease)
Synonyms (9): Yin Deficiencies; Yinxu; YIN DEFIC; Deficiency, Yin; Hsu, Yin; Xu, Yin; DEFIC YIN; Yin Hsu; Yin Xu
Definition: In the YIN-YANG system of philosophy and medicine, an insufficiency of body fluid (called yinxu), manifesting often as irritability, thirst, constipation, etc. (The Pinyin Chinese-English Dictionary, 1979). (MeSH)

Yang Deficiency (阳虚) (Disease)
Synonyms (8): DEFIC YANG; Yangxu; Yang Xu; YANG DEFIC; Yang Hsu; Deficiency, Yang; Hsu, Yang; Xu, Yang
Definition: In the YIN-YANG system of philosophy and medicine, a lack of vital energy (called yangxu in Chinese). It manifests itself in various systemic and organic diseases. (The Pinyin Chinese-English Dictionary, 1979) (MeSH)

Ontology semantic network and related information and knowledge:
http://www.coremine.com/medical/#search?ids=521582,521583,826846tt=8191org=hst=TERMi=521582kb=summary
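Speaking English
Jia Wei

A while back I wrote a blog post called "Learning English." By my count, more than five months have passed since then. Generally speaking, after five-plus months of studying a foreign language you can just about get by speaking it. So today, let's talk about speaking English.

There is a joke that goes like this. Just before Christmas last year, a patrol officer in Shanghai stopped a man who had run a red light on an electric scooter and went over to question him. "What's your name?" The man answered: "Jiang Yingyu" (which sounds exactly like 讲英语, "speak English"). The officer was puzzled: "What name?" The man raised his voice: "Jiang Yingyu!" At a loss, the officer hesitated, then asked again: "Ok, what's your name?" The man pulled out his ID card. The officer took a look: 蒋英宇, Jiang Yingyu!

My first reaction to this joke was: in an international metropolis like Shanghai, even the police are well educated!

Once, on the Xuhui campus of Jiao Tong University, a woman asked me in French-accented English how to get to the School of International Education. I gave her directions and, worried that she still might not understand, walked with her part of the way. When we parted she thanked me profusely, and I replied evenly: "You're quite welcome. In fact, I hope that when I ask for directions on the streets of Paris, someone will answer me in English the same way." That last sentence came straight from experience, because I have suffered when asking for directions in Paris. Every time I asked in English, the reply was a torrent of French of which I understood not a word, which left me thoroughly frustrated. If the great French nation looks down on English, that is not for us to interfere with; but then go ahead and ask for directions in Shanghai in French (or in Chinese)!

Because of this linguistic self-regard, the French come across as arrogant and stubborn in a somewhat endearing way. On the same point, though, people in Hong Kong left me with a worse impression. In this special administrative region of China, there were many times when I addressed shop assistants enthusiastically in Mandarin and they all said (in Cantonese) that they did not understand. When I then explained myself in a string of English, they nodded repeatedly as if half understanding, while hurrying off to fetch the manager to rescue them. Over time I began to pick up on what was going on. Hong Kong is not in fact a place where Mandarin is so completely unknown that communication is impossible. For some Hong Kongers, understanding Mandarin in the head does not mean accepting it in the heart; and of course, worshipping English in the heart does not mean the head can follow it. Once I took a taxi to Baptist University. After getting in, I told the driver the address in Mandarin, and he told me in Cantonese that he did not understand. I repeated it slowly, word by word, and he still shook his head firmly and said he did not understand! As the car sped along, I repeated it once more in English, and he shook his head again. I cursed under my breath and stopped talking to him. My instinct told me that this fellow understood Mandarin perfectly well; otherwise, why was the car not slowing down but driving straight on? Sure enough, before long the car pulled up at the gate of Baptist University. When I got out, I made an exception and gave no tip. After all, he could not understand anything I said.

Some American colleagues have asked me what differences I notice on returning to live in the United States after a ten-year gap. I tell them that today's English is very different from the English of ten years ago, although the differences lie in small details. For example, I arrived in North Carolina just as the building on our research campus had its grand opening. A legislator in suit and tie took the stage and solemnly delivered an opening line that we would once have considered very colloquial: "I am blown away by this building." In our regular meetings, the chair starts by giving everyone a "heads up"; its meaning has shifted a little and is now close to "updates." Many people like to use "24/7" on formal occasions as shorthand for "24 hours, 7 days a week." It is not that these everyday expressions did not exist before, but they seem to have moved from the home and the restaurant into the office and the conference room. In the 1990s you would hear the exclamation "cool" only from children. Now, at an academic conference, I can comment on the previous speaker's method with "This is a really cool method" without worrying that my wording is too casual. Of course, the biggest source of new English vocabulary is still the internet. A few years ago, if someone asked you to "friend" them, you probably had no idea what that meant; today many people understand that it means adding someone as a friend on Facebook or a blog. Westerners now quite naturally describe the two sides of a thing as "yin and yang," and "networking" has been formally adopted as a term for building personal connections. In addition, many Westerners like to show off a few words borrowed from Chinese, such as guanxi (关系, connections), jiayou (加油, "go for it"), and ma-ma-hu-hu (马马虎虎, so-so), to demonstrate how widely read they are in languages. A month ago I had breakfast with an American businessman. While telling me why China needed to keep its economy stable, he used the word "Baoba." It took me several seconds to work out that he meant 保八, our goal of keeping GDP growth at 8 percent in 2012. Hearing that word from his mouth really did surprise me.

Language is part of culture, and learning a language means stepping into a culture. On the one hand, to understand a language accurately we need a thorough grasp of the cultural information it carries. On the other hand, as culture develops, every language (not only English) is in a state of constant change, and speaking good English means keeping up with the times. From the changes in English we can also catch a glimpse of changes in the way Westerners think, behave socially, and live.

I remember that in the 1990s I loved eating at country restaurants in Missouri. Some of the waitresses there were quite elderly and called me "honey" with every sentence, as kind and warm as my own grandmother. Things are a little different now. Not long ago I had lunch at a sushi restaurant in the Charlotte airport, where the servers were all tall, fine-featured young women who greeted me in affectionate chorus with "Sweetie." This old man could barely keep his composure; it was something like the feeling poets describe as 心花怒放, a heart bursting into bloom (by the way, in English that is presumably "heart flower angry open"). This term of endearment accompanied me happily through lunch. When the bill arrived, one more "sweetie" sounded especially melodious, lingered in my ears, and firmly steered me toward writing in a larger tip. Only later, when I heard another "sweetie" addressed to an elderly gentleman shuffling in with his luggage, did I wake from the dream: it meant exactly, exactly the same thing as the Missouri grandmothers calling me "honey."

I would like to ramble a little more about language and culture. Chinese professors in the United States rarely move into administration. The mainstream view names two obstacles: language (our English is not as good as the Americans') and discrimination within the system. In recent years I have discussed this "ceiling" effect with Chinese professors at American universities who have become department chairs or deans. The fairly consistent view is that the ceiling exists, but it does not seem to have been placed over our heads by others; we installed it ourselves, in our own minds. Put simply, as I see it, the English of these Chinese friends of mine who hold office is not much better than anyone else's, and in some cases is quite poor (with unmistakable hometown accents as rustic as they come). Some of them also work in places that are entirely white and rather conservative. What these Chinese university administrators have in common is this: they understand the management philosophy of American universities, they are good at expressing their own views, and, most important of all, they communicate with their American colleagues as equals, neither servile nor overbearing.

Sometimes the "simple" and the "elaborate" in language collide in daily life and throw off interesting sparks. In another blog post I described a dinner with the Nobel laureate Oliver Smithies and his wife. There was a small episode that evening. After dinner we ordered dessert, Crème Brûlée. The table was one dessert spoon short, so Oliver raised a hand and said to the waiter: "Will you produce me a spoon?" I was a little surprised to hear this myself, but then remembered that Oliver grew up in England, and supposed "produce" was an old-fashioned way of saying "bring." The waiter in front of us, though, a young man in his twenties, seized on it as if he had discovered a new continent and began teasing Oliver. He said gravely: "Sir, I can't produce the spoon for you! A spoon cannot be produced. You know, if I could produce a spoon for you right now, I wouldn't be working here; my ideal employer would be a casino in Las Vegas." Having had his fun at the expense of a man in his eighties, he changed tack and, putting on a British accent, said: "However, I can fetch you a spoon, if that's what you want!" We all burst out laughing. Faced with a youngster who had just talked back to him, Oliver showed his sparse teeth and, not offended in the least, chuckled good-naturedly for a long while.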