ScienceNet (科学网)


Tag: topic


Related blog posts

Topic Linkages between Papers and Patents
xiaohai2008 2012-9-27 16:33
@INPROCEEDINGS{XXQS+12,
  author = {Shuo Xu and Lijun Zhu and Xiaodong Qiao and Qingwei Shi and Jie Gui},
  title = {Topic Linkages between Papers and Patents},
  booktitle = {Proceedings of the 4th International Conference on Advanced Science and Technology},
  year = {2012},
  pages = {176--183},
  location = {Beijing, China},
  publisher = {Science and Engineering Research Support soCiety (SERSC)},
  abstract = {The papers and patents are usually considered as the indicators of basic science studies and technologies, respectively. Previous linkage research between papers and patents mainly focus on the analysis of non-patent literature cited by patent from the viewpoint of citation analysis. Thus, one will miss many valuable scientific papers that are not cited by patents until now. This paper proposes a simple procedure for constructing topic linkages between papers and patents by analyzing these two kinds of information resources simultaneously. Experimental results on \emph{new energy vehicles} indicate that our approach is feasible and efficient.},
  keywords = {Topic Linkages \sep Topic Models \sep Topic Similarity \sep Latent Dirichlet Allocation \sep Optimal Transportation Problem},
}
Full text: TopicLinkage.pdf
Presentation slides: TopicLinkages (Slides).pdf
Category: Machine Learning | 3468 views | 0 comments
Busy
hagmhsn 2012-8-24 14:58
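I had guessed long ago that this week would be busy, with a class to teach and student thesis defenses to sit on. I just did not expect it to be quite this relentless.

I start teaching this semester, and the first class of my life is a big one: a frontiers-of-science course for PhD students, four periods back to back, a whole morning. Terrifying! The university has gone fully multimedia, so there is no more writing on the blackboard, which also means that much more material has to be covered. My topic is a hot area in polymer science from the last few years, and there are simply too many papers. For a talk you can pick a few at will, but for a lecture you have to pick the representative ones. So I skimmed three or four hundred papers and selected sixty or seventy to read closely. After a month (a good part of which went to writing a grant report), the PPT was finally done: 99 slides in total.

On Monday afternoon I finalized the PPT. I needed to run through it several times, and each pass takes at least two hours; after all, this is material for a class of more than three hours. Before I had finished the first pass, the phone rang: my postdoc supervisor's own advisor had arrived in Shanghai. I had discussed inviting him to give a talk at our university a while ago, but his schedule could not be fixed at the time. I did not expect him to turn up in Shanghai at exactly this moment. I sprang into action: booking the seminar room, making the poster, and a pile of other preparations. Everything was finally settled by Tuesday morning. That left me a single afternoon to rehearse the PPT.

Wednesday's class starts early, and I got to the classroom at 7:40. Modern teaching equipment comes with modern trouble, but luckily everything went smoothly: the computer, projector, and wireless microphone all behaved, so I sat there idly for a quarter of an hour waiting for the students.

I was a little nervous at first, but the students responded well and everything fell into place. I was surprised to find that just as the slide marked "break here" came up, the bell rang outside. After a 10-minute break came the second half. Toward the end my throat was getting sore, and standing for three hours straight is no small test either. When I finished the last slide, the room actually broke into applause. It seems a lecture-style course is a bit different from a basic course after all :)

There was no rest at noon either, because a round of master's thesis defenses followed immediately, from 1:30 to 5:30. We committee members had no break in between. For the first time I felt how exhausting a single day can be.

The class was the first big item; next came preparations for Thursday morning's seminar. I arranged for someone to pick up the speaker, who was staying at a hotel in Pudong, nearly 30 km away. Nothing to be done about it; Shanghai is just too big. Then the venue, then a place for lunch. It was all a first for me, so some scrambling was inevitable. When things were nearly ready, the speaker arrived. After some small talk it was time to exchange business cards, and I found that mine were not on the desk. Argh. That morning I had deliberately put them somewhere I could obviously find them, and now I could not remember where. Soon it was 10 o'clock and we went straight to the meeting room, which was already full; the topic is clearly popular. By the way, my supervisor's advisor is Gregory S Girolami, the previous head of the chemistry department at UIUC, a real heavyweight. The mark of a heavyweight is being able to explain very deep problems simply and clearly. The faculty and students in the audience were all absorbed, and you could tell everyone was very interested. The discussion after the talk was lively. Everyone's spoken English is better than mine these days :) http://news.ecust.edu.cn/index_news_view.php?id=11435keyword=cataid =

After lunch Professor Girolami wanted to visit the Shanghai Museum, and I took him there. We chatted about light topics on the way. I guessed from his last name that he is of Italian descent. How? Well, I used to be something of a Serie A fan and know that many famous footballers have last names ending in "-ni", "-li", or "-mi". Makes sense! After a few years in the US, though, I became a basketball fan; how could I not, with Michael Jordan being a North Carolina man? I like the NCAA too. UNC is once again a red-hot title favorite this year. In high spirits I naturally recalled everyone celebrating on Franklin St when the Tar Heels won the championship. Professor Girolami said: yes, and the team UNC met in the final that year was UIUC. Oh yeah, sorry about that. We exchanged a smile.

At noon on Friday I was pulled over to the administration building for the "Young Cadre Training Class" (青干班). I have no interest in becoming an administrator, but since the school had nominated me, it would have been ungracious to refuse. I expected a boring sermon, but the teacher turned out to be a master-level scholar versed in both the ancient and the modern. There were none of the usual slogans such as the "Three Represents", "preserving advancedness", or "harmony", only the essence of the thought of China's sages over five thousand years. I had always prided myself on my study of classical literature, and here I found I was a mere schoolchild. With my interest piqued, time flew; before I knew it, it was five o'clock and the week was over.

During the class discussion, one teacher again stressed the importance of literature. Looking back, I realize my own literary skills could do with some training too.

Just a running account; bear with it :)

Written on March 21, 2008
4389 views | 0 comments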
[Repost] Cell cycle analysis by flow cytometry
taodan2003 2012-7-16 22:37
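Here I summarize the cell cycle assay I have used in the past, to share with everyone. Please post any questions in the replies and we will do our best to answer.

I. Procedure

1. Cell culture: Take cells in logarithmic growth phase and seed them at 1×10^6/mL, either 1 mL per well in a 24-well plate or 2 mL per well in a 6-well plate. Apply the required treatment (for example, add a drug), stop the culture after the chosen time, and go on to the next step. (Personal note: the cell density depends on your experiment, but in my experience the total cell number is best kept above 1×10^6. If there are too few cells, the washing and fixation steps that follow lose still more, and you can end up with too few cells at acquisition.)

2. Cell fixation: Collect the cells by centrifugation, discard the supernatant, wash the cells twice with pre-chilled PBS, add pre-chilled 70% ethanol, and fix overnight at 4 °C, or at -20 °C for longer-term storage. (With overnight fixation at 4 °C, samples are usually analyzed the next day. If you want to delay analysis by a few days, store them at -20 °C. Some sources say samples keep for a month at -20 °C, but I recommend analyzing as soon as possible. In some experiments cells are harvested at different time points; in that case I wait until the last sample is fixed and run them all together, which still mostly means within one week. I have not specifically compared the effect of storage time on the results.)

3. Cell staining: Collect the cells by centrifugation, wash once with 1 mL PBS, then add 500 µL PBS containing 50 µg/mL propidium iodide (PI), 100 µg/mL RNase A, and 0.2% Triton X-100, and incubate at 4 °C in the dark for 30 minutes. (I make up PI directly in PBS at the working concentration, then add it to the cell pellet and mix. RNase is added fresh, although I have found that omitting it sometimes makes little difference to the result.)

4. Flow analysis: Acquire on the flow cytometer using the standard procedure, generally counting 20,000-30,000 cells, and analyze the data with the cell cycle fitting software ModFit.

II. A few of my own results, to give beginners a feel for the data: A. Cell cycle under normal culture conditions; B. Cells synchronized in G0/G1 phase by cytokine starvation; C. A sub-diploid population ahead of the G0/G1 peak, representing apoptotic cells.
6067 views | 0 comments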
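The staining step above fixes final concentrations (50 µg/mL PI, 100 µg/mL RNase A, 0.2% Triton X-100 in 500 µL per sample), so the volumes for a batch follow from C1·V1 = C2·V2. The sketch below works that arithmetic through. The stock concentrations (1 mg/mL PI, 10 mg/mL RNase A, 10% Triton X-100), the 10% pipetting overage, and the function names are assumptions for illustration only; the post itself makes PI up directly in PBS and does not state any stock concentrations.

```python
def stock_volume_ul(final_conc, stock_conc, final_volume_ul):
    """Solve C1*V1 = C2*V2 for the stock volume V1 (in microlitres).

    final_conc and stock_conc must be in the same unit.
    """
    if stock_conc <= 0 or final_conc > stock_conc:
        raise ValueError("stock must be positive and at least as concentrated as the target")
    return final_conc * final_volume_ul / stock_conc


def staining_mix(n_samples, volume_per_sample_ul=500.0, overage=0.1):
    """Volumes for a PI staining master mix.

    Final concentrations come from the protocol text; the stock
    concentrations below are hypothetical and should be replaced
    with the ones actually on your bench.
    """
    total = n_samples * volume_per_sample_ul * (1.0 + overage)
    pi = stock_volume_ul(50.0, 1000.0, total)       # ug/mL: 50 from an assumed 1 mg/mL stock
    rnase = stock_volume_ul(100.0, 10000.0, total)  # ug/mL: 100 from an assumed 10 mg/mL stock
    triton = stock_volume_ul(0.2, 10.0, total)      # percent: 0.2 from an assumed 10% stock
    return {
        "total_ul": total,
        "pi_stock_ul": pi,
        "rnase_stock_ul": rnase,
        "triton_stock_ul": triton,
        "pbs_ul": total - pi - rnase - triton,      # PBS makes up the remainder
    }


if __name__ == "__main__":
    for name, volume in staining_mix(10).items():
        print(f"{name}: {volume:.1f}")
```

With these assumed stocks, ten samples need roughly 275 µL PI stock, 55 µL RNase A stock, 110 µL Triton stock, and 5060 µL PBS.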
What on earth is a "BioMedLib top ten paper"?
Popularity: 2 沈海军 2012-6-16 16:48
Over the past two years I have repeatedly received emails from the biomedical search engine BioMedLib saying that a paper of mine was ranked first among the ten best papers published in its field. One such email is appended below.

I have also noticed that many people online take pride in having their articles make BioMedLib's top ten. And I have come across this view: "Whenever the BioMedLib site sees that you have published an article, it runs a search based on the article's content, finds nine other articles, automatically puts yours first and the other nine after it. This neither shows that these ten are the ten best in a field nor that yours is the best of the ten. Put bluntly, it is a similarity search and has nothing to do with research quality."

Is that so? What exactly is a "BioMedLib top ten paper"? Does anyone know?

===================================
BioMedLib: "Who is Publishing in My Domain?"
===================================

For your article
Shen HJ: Yao Xue Xue Bao; 2006 Sep;41(9):888-92 PMID: 17111839
the following section is the top 20 articles published on the same topic since you published yours.

Please sign up to continue receiving this service (view the following link in your browser). This literature-monitoring service is provided to you free of charge by BioMedLib.
http://wipimd.com/nlnsrvys9034fnoi?srvyi=47091wft=wimsqt11=17111839.001qt03=shj@nuaa.edu.cnmld=BLD2045TTTeiREovadt01=D2045T

The monthly "Who Is Publishing in My Domain" service also includes free full-text publications (free PDF downloads), plus publications citing your article. You will be able to customize these lists to your informational needs in the registration page.

Please forward this email to your co-authors, so that they can sign up as well. You can also sign up for a different article.
http://wipimd.com/nlnsrvys9034fnoi?srvyi=47091wft=wimscmpgn89116=BLD2045TTTeiREova

Regards,
Article Delivery Services
www.WIPIMD.com
Email correspondence: custserv@bmlsearch.com

--------------------------------------------------------------------------------
List 1: Top 20 Articles, in the Domain of Article 17111839, Since its Publication (2006)

1. Shen HJ: Yao Xue Xue Bao; 2006 Sep;41(9):888-92
   Go to the online record: http://bmlsearch.com/?kwr=17111839%5Bpmid%5Dcmpgn83301=BLD2045TTTeiREovaxpclps3=Matches
2. Molecular dynamics simulations of flexible polymer chains wrapping single-walled carbon nanotubes. Tallury SS, Pasquinelli MA: J Phys Chem B; 2010 Apr 1;114(12):4122-9
   Go to the online record: http://bmlsearch.com/?kwr=20205372%5Bpmid%5Dcmpgn83301=BLD2045TTTeiREovaxpclps3=Matches
3. Molecular dynamics simulations of polymers with stiff backbones interacting with single-walled carbon nanotubes. Tallury SS, Pasquinelli MA: J Phys Chem B; 2010 Jul 29;114(29):9349-55
   Go to the online record: http://bmlsearch.com/?kwr=20593831%5Bpmid%5Dcmpgn83301=BLD2045TTTeiREovaxpclps3=Matches
4. Molecular dynamics simulations of deformation and rupture of super carbon nanotubes under tension. Qin Z, Feng XQ, Zou J, Yin Y, Yu SW: J Nanosci Nanotechnol; 2008 Dec;8(12):6274-82
   Go to the online record: http://bmlsearch.com/?kwr=19205194%5Bpmid%5Dcmpgn83301=BLD2045TTTeiREovaxpclps3=Matches
5. Probing diameter-selective solubilisation of carbon nanotubes by reversible cyclic peptides using molecular dynamics simulations. Friling SR, Notman R, Walsh TR: Nanoscale; 2010 Jan;2(1):98-106
   Go to the online record: http://bmlsearch.com/?kwr=20648370%5Bpmid%5Dcmpgn83301=BLD2045TTTeiREovaxpclps3=Matches
6. Multicomponent ballistic transport in narrow single wall carbon nanotubes: analytic model and molecular dynamics simulations. Mutat T, Adler J, Sheintuch M: J Chem Phys; 2011 Jan 28;134(4):044908
   Go to the online record: http://bmlsearch.com/?kwr=21280799%5Bpmid%5Dcmpgn83301=BLD2045TTTeiREovaxpclps3=Matches
7. The thermal conductivity and thermal rectification of carbon nanotubes studied using reverse non-equilibrium molecular dynamics simulations. Alaghemandi M, Algaer E, Böhm MC, Müller-Plathe F: Nanotechnology; 2009 Mar 18;20(11):115704
   Go to the online record: http://bmlsearch.com/?kwr=19420452%5Bpmid%5Dcmpgn83301=BLD2045TTTeiREovaxpclps3=Matches
8. Application of molecular dynamics simulations for structural studies of carbon nanotubes. Bródka A, Kołoczek J, Burian A: J Nanosci Nanotechnol; 2007 Apr-May;7(4-5):1505-11
   Go to the online record: http://bmlsearch.com/?kwr=17450918%5Bpmid%5Dcmpgn83301=BLD2045TTTeiREovaxpclps3=Matches
9. Molecular dynamics simulation studies of structural and mechanical properties of single-walled carbon nanotubes. Mashapa MG, Ray SS: J Nanosci Nanotechnol; 2010 Dec;10(12):8083-7
   Go to the online record: http://bmlsearch.com/?kwr=21121299%5Bpmid%5Dcmpgn83301=BLD2045TTTeiREovaxpclps3=Matches
10. Kinetics of water filling the hydrophobic channels of narrow carbon nanotubes studied by molecular dynamics simulations. Wu K, Zhou B, Xiu P, Qi W, Wan R, Fang H: J Chem Phys; 2010 Nov 28;133(20):204702
    Go to the online record: http://bmlsearch.com/?kwr=21133447%5Bpmid%5Dcmpgn83301=BLD2045TTTeiREovaxpclps3=Matches
11. Molecular dynamics simulations on hydrogen adsorption in finite single walled carbon nanotube bundles. Knippenberg MT, Stuart SJ, Cheng H: J Mol Model; 2008 May;14(5):343-51
    Go to the online record: http://bmlsearch.com/?kwr=18286311%5Bpmid%5Dcmpgn83301=BLD2045TTTeiREovaxpclps3=Matches
12. Investigation of the influence of thermostat configurations on the mechanical properties of carbon nanotubes in molecular dynamics simulations. Heo S, Sinnott SB: J Nanosci Nanotechnol; 2007 Apr-May;7(4-5):1518-24
    Go to the online record: http://bmlsearch.com/?kwr=17450920%5Bpmid%5Dcmpgn83301=BLD2045TTTeiREovaxpclps3=Matches
13. Acute and long-term effects after single loading of functionalized multi-walled carbon nanotubes into zebrafish (Danio rerio). Cheng J, Chan CM, Veca LM, Poon WL, Chan PK, Qu L, Sun YP, Cheng SH: Toxicol Appl Pharmacol; 2009 Mar 1;235(2):216-25
    Go to the online record: http://bmlsearch.com/?kwr=19133284%5Bpmid%5Dcmpgn83301=BLD2045TTTeiREovaxpclps3=Matches
14. Stabilization of aqueous carbon nanotube dispersions using surfactants: insights from molecular dynamics simulations. Tummala NR, Morrow BH, Resasco DE, Striolo A: ACS Nano; 2010 Dec 28;4(12):7193-204
    Go to the online record: http://bmlsearch.com/?kwr=21128672%5Bpmid%5Dcmpgn83301=BLD2045TTTeiREovaxpclps3=Matches
15. Molecular dynamics simulation study of ionic hydration in negatively charged single-walled carbon nanotubes. Guo X, Shao Q, Lu L, Zhu Y, Wei M, Lu X: J Nanosci Nanotechnol; 2010 Nov;10(11):7620-4
    Go to the online record: http://bmlsearch.com/?kwr=21137996%5Bpmid%5Dcmpgn83301=BLD2045TTTeiREovaxpclps3=Matches
16. Molecular dynamics simulations of ion transport through carbon nanotubes. I. Influence of geometry, ion specificity, and many-body interactions. Beu TA: J Chem Phys; 2010 Apr 28;132(16):164513
    Go to the online record: http://bmlsearch.com/?kwr=20441294%5Bpmid%5Dcmpgn83301=BLD2045TTTeiREovaxpclps3=Matches
17. Catalyzed growth of carbon nanotube with definable chirality by hybrid molecular dynamics-force biased Monte Carlo simulations. Neyts EC, Shibuta Y, van Duin AC, Bogaerts A: ACS Nano; 2010 Nov 23;4(11):6665-72
    Go to the online record: http://bmlsearch.com/?kwr=20939511%5Bpmid%5Dcmpgn83301=BLD2045TTTeiREovaxpclps3=Matches
18. Perylene-based nanotweezers: enrichment of larger-diameter single-walled carbon nanotubes. Backes C, Schmidt CD, Hauke F, Hirsch A: Chem Asian J; 2011 Feb 1;6(2):438-44
    Go to the online record: http://bmlsearch.com/?kwr=21254422%5Bpmid%5Dcmpgn83301=BLD2045TTTeiREovaxpclps3=Matches
19. Molecular dynamics analysis on buckling of defective carbon nanotubes. Kulathunga DD, Ang KK, Reddy JN: J Phys Condens Matter; 2010 Sep 1;22(34):345301
    Go to the online record: http://bmlsearch.com/?kwr=21403253%5Bpmid%5Dcmpgn83301=BLD2045TTTeiREovaxpclps3=Matches
20. Molecular dynamics simulation study of the structural characteristics of water molecules confined in functionalized carbon nanotubes. Huang LL, Zhang LZ, Shao Q, Wang J, Lu LH, Lu XH, Jiang SY, Shen WF: J Phys Chem B; 2006 Dec 28;110(51):25761-8
    Go to the online record: http://bmlsearch.com/?kwr=17181218%5Bpmid%5Dcmpgn83301=BLD2045TTTeiREovaxpclps3=Matches
9604 views | 2 comments
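The view quoted in the post, that the list is a plain similarity search in which the recipient's own article necessarily comes first, can be illustrated with a toy ranking. The sketch below is not BioMedLib's actual method, which the post does not describe; it assumes Jaccard similarity over title words, and the query title is hypothetical because the email omits the title of the author's own article. The other two titles are taken from the list above.

```python
def tokens(title):
    """Lower-cased set of words in a title, with surrounding punctuation stripped."""
    words = (w.strip(".,:;()").lower() for w in title.split())
    return {w for w in words if w}


def jaccard(a, b):
    """Jaccard similarity of two sets: |A & B| / |A | B|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_by_similarity(query, corpus):
    """Return (score, title) pairs sorted from most to least similar to the query."""
    q = tokens(query)
    scored = [(jaccard(q, tokens(title)), title) for title in corpus]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical title standing in for the author's own article.
QUERY = "Molecular dynamics simulation of carbon nanotubes"

CORPUS = [
    "Acute and long-term effects after single loading of functionalized "
    "multi-walled carbon nanotubes into zebrafish (Danio rerio).",
    "Molecular dynamics analysis on buckling of defective carbon nanotubes.",
    QUERY,  # the query article is itself part of the indexed corpus
]

if __name__ == "__main__":
    for score, title in rank_by_similarity(QUERY, CORPUS):
        print(f"{score:.2f}  {title}")
```

Under any similarity measure where an item is maximally similar to itself, the query article ranks first regardless of its quality, which is the point the quoted commenter makes.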
Correction: Gibbs sampling algorithms for LDA, AT, and ToT
Popularity: 2 xiaohai2008 2012-6-15 16:40
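Today I found a few problems in the formula derivations of the Gibbs sampling algorithms for LDA, AT, and ToT that I posted earlier. The final formulas are correct, however, so anyone writing code from them is unaffected. The derivations have now been corrected and are released together here: Gibbs for LDA, ToT AT.pdf. I apologize to anyone whose understanding was affected, and I hope this new version proves more helpful.
Category: Machine Learning | 12355 views | 3 comments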
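The PDF itself is not reproduced on this page, so as a reminder of what such a sampler looks like in code, here is a minimal sketch of the widely used collapsed Gibbs update for LDA only (AT and ToT are not covered). It assumes symmetric priors alpha and beta and samples each token's topic with probability proportional to (n_dk + alpha) * (n_kw + beta) / (n_k + V * beta), with the current token removed from the counts. The function name, count-array names, and toy corpus are illustrative and may not match the notation of the PDF.

```python
import random


def lda_gibbs(docs, n_topics, vocab_size, alpha=0.1, beta=0.01, n_iters=50, seed=0):
    """Collapsed Gibbs sampling for LDA on docs given as lists of word ids."""
    rng = random.Random(seed)
    n_dk = [[0] * n_topics for _ in docs]               # topic counts per document
    n_kw = [[0] * vocab_size for _ in range(n_topics)]  # word counts per topic
    n_k = [0] * n_topics                                # total tokens per topic

    # Random initial topic assignment for every token.
    z = []
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            n_dk[d][k] += 1
            n_kw[k][w] += 1
            n_k[k] += 1
        z.append(z_d)

    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current token from all counts.
                k = z[d][i]
                n_dk[d][k] -= 1
                n_kw[k][w] -= 1
                n_k[k] -= 1
                # Unnormalized conditional p(z_i = t | all other assignments, words).
                weights = [
                    (n_dk[d][t] + alpha) * (n_kw[t][w] + beta) / (n_k[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                # Add the token back under its newly sampled topic.
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][w] += 1
                n_k[k] += 1
    return z, n_dk, n_kw, n_k


if __name__ == "__main__":
    toy_docs = [[0, 1, 0, 1], [2, 3, 2, 3], [0, 1, 1, 0]]  # word ids in a 4-word vocabulary
    assignments, doc_topic, topic_word, topic_total = lda_gibbs(toy_docs, n_topics=2, vocab_size=4)
    print(doc_topic)
```

Plain Python lists keep the sketch dependency-free; a practical implementation would normally use arrays and many more iterations.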
SUBJECT AND TOPIC: A NEW TYPOLOGY OF LANGUAGE
carldy 2012-5-11 14:46
SUBJECT AND TOPIC.mht
This article is one of the best of its kind. In the nearly forty years since its publication it has inspired many scholars. After searching many libraries and asking many friends and teachers for help, I have finally obtained an electronic copy, which I share here so that more readers can benefit.

SUBJECT AND TOPIC: A NEW TYPOLOGY OF LANGUAGE*
by Charles N. Li and Sandra A. Thompson
1976. In: Charles N. Li (ed.). Subject and Topic. London/New York: Academic Press, pp. 457-489.

I. Introduction
II. Subject vs. Topic
III. Characteristics of Topic-Prominent Languages
IV. On the Basicness of Topic-Comment Sentences in Tp Languages
V. The Typology and Some Diachronic Implications

*This paper is an amalgamation of three earlier papers and renders them obsolete: (1) "Chinese as a Topic-Prominent Language," prepared and circulated for the 7th International Conference on Sino-Tibetan Languages and Linguistics, October, 1974; (2) "Subject and Topic: A New Typology of Language," presented at the LSA Annual Meeting, New York, December, 1974; (3) "Evidence Against Topicalization in Topic-Prominent Languages," circulated prior to the Symposium on Subject and Topic. We are grateful to the participants of the Symposium and to James H-Y Tai for their valuable comments and to Dr. Edward Hope, who responded from Bangkok to our inquiries about a number of Lisu constructions. During the preparation of this paper, Charles N. Li was supported by a fellowship from the American Council of Learned Societies.

I. Introduction.

Since the emergence of descriptive linguistics, linguists have disagreed among themselves over the question of the extent to which languages could be expected to differ from one another. The present paper is an attempt to lay the foundation for a typology based on the grammatical relations subject-predicate and topic-comment. The notion of subject has long been considered a basic grammatical relation in the sentential structure of a language.
However, the evidence we have gathered from certain languages suggests that in these languages the basic constructions manifest a topic-comment relation rather than a subject-predicate relation. This evidence shows not only that the notion of topic may be as basic as that of subject in grammatical descriptions, but also that languages may differ in their strategies in constructing sentences according to the prominence of the notions of topic and subject.

According to our study, there are four basic types of languages: (i) languages that are subject-prominent (a term introduced by E.L. Keenan); (ii) languages that are topic-prominent; (iii) languages that are both subject-prominent and topic-prominent; (iv) languages that are neither subject-prominent nor topic-prominent. In subject-prominent (Sp) languages, the structure of sentences favors a description in which the grammatical relation subject-predicate plays a major role; in topic-prominent (Tp) languages, the basic structure of sentences favors a description in which the grammatical relation topic-comment plays a major role. In type (iii) languages, there are two equally important distinct sentence constructions, the subject-predicate construction and the topic-comment construction; in type (iv) languages, the subject and the topic have merged and are no longer distinguishable in all sentence types.

In order to clarify the subject-predicate construction and the topic-comment construction, we may use two types of English sentences as examples:

1  John      hit Mary.
   Subject   Predicate

2  As for education,   John prefers Bertrand Russell's ideas.
   Topic               Comment

In Sp languages, the basic sentence structure is similar to 1, whereas in Tp languages, the basic sentence structure is similar to 2. However, this is not to say that in Tp languages, one cannot identify subjects, or that Sp languages do not have topics.
In fact, all the languages we have investigated have the topic-comment construction, and although not all languages have the subject-predicate construction, there appear to be ways of identifying subjects in most Tp languages.

Our typological claim will simply be that some languages can be more insightfully described by taking the concept of topic to be basic, while others can be more insightfully described by taking the notion of subject as basic. This is due to the fact that many structural phenomena of a language can be explained on the basis of whether the basic structure of its sentences is analyzed as subject-predicate or topic-comment.

According to a number of criteria which we will outline below, and a small sample of languages which we have investigated, the following typological table may be established.

Subject-Prominent Languages      Topic-Prominent Languages
Indo-European                    Chinese
Niger-Congo                      Lahu (Lolo-Burmese)
Finno-Ugric                      Lisu (Lolo-Burmese)
Semitic                          ...
Dyirbal (Australian)
Indonesian
Malagasy
...

Subject-Prominent and            Neither Subject-Prominent nor
Topic-Prominent Languages        Topic-Prominent Languages
Japanese                         Tagalog
Korean                           Illocano
...                              ...

It is obvious that the above table touches on only a very small number of languages in the world. This is partly due to the fact that in order to establish topic-prominence, a careful investigation of the syntactic structures of a language is necessary. Since the tradition in linguistic studies emphasizes the subject as the basic, universal grammatical relation, grammarians tend to assume that sentences of a language are naturally structured in terms of subject, object, and verb. In general, it is not considered that the basic structure of a sentence could be described in terms of topic and comment.[1] There are exceptions. For example, Schachter and Otanes (1972) stated that the Tagalog basic sentence structure should not be described in terms of the notion subject. Another example is E.
Hope (1974), who has described a remarkable Tp language, Lisu, a Lolo-Burmese language. But in general, it is often difficult to determine the typology of a language in terms of subject-prominence and topic-prominence on the basis of reference grammars since many such grammars are biased toward the subject-predicate analysis. Modern generative linguistics does not represent any advance in this particular area. The assumption remains that the basic sentence structure should be universally described in terms of subject, object, and verb.

Our goal in this paper is, therefore, a modest one: we wish to establish the value and the validity of a typology based on the notions of subject-prominence and topic-prominence. We will proceed as follows. First, we will outline the differences between subjects and topics in terms of a number of properties which they do not share; then we will discuss some of the characteristics of Tp languages. We will then show that the topic-comment structure in Tp languages is indeed a basic sentence type, and finally we will explain the implications of the typology for the study of universal grammar.

II. Subject vs. Topic.

(a) Definite. According to Chafe (this volume), a definite noun phrase is one for which "I think you already know and can identify the particular referent I have in mind." One of the primary characteristics of topics, then, is that they must be definite[2] (see Chafe, this volume, for further remarks on definiteness). According to this characterization of definiteness, proper and generic NPs are also understood as definite. The conditions regarding the speaker's assessment of the hearer's knowledge under which a proper noun can be appropriately used are the same as those under which a definite common noun phrase can be used.
A generic noun phrase is definite because its referent is the class of items named by the noun phrase, which the hearer can be assumed to know about if he knows the meaning of that noun phrase.[3] A subject, on the other hand, need not be definite. For example, the subjects of 3 and 4 are indefinite:

3  A couple of people have arrived.
4  A piece of pie is on the table.

(b) Selectional relations. An important property of the topic is that it need not have a selectional relation with any verb in a sentence; that is, it need not be an argument of a predicative constituent. This property of topic is particularly noticeable among the topic-prominent languages since the topic-comment construction in such languages, as we will try to show, represents the basic sentence type. Consider sentences 5, 6, 7, and 8, all of which represent common sentence types in their respective languages. The underlined constituent in each sentence is the topic. The topics in these sentences, 5 "this field," 6 "elephants," 7 "that fire," 8 "those trees," have no selectional relation with the verbs. Similarly, in Japanese, the topic marked by the particle wa, and in Korean, the topic marked by the particle (n)un need not be selectionally related to the verb of the sentence, as shown in 9 and 10:

9  siban-un           kakkjo-ga               manso      (Korean)
   now-topic marker   school-subject marker   many
   "The present time (topic), there are many schools."

10 Gakkoo-wa             boku-ga            isogasi-kat-ta     (Japanese)
   school-topic marker   I-subject marker   busy-past tense
   "School (topic), I was busy."

The subject, on the other hand, always has a selectional relation with some predicate in the sentence. It is true that the surface subject of some sentences may not be selectionally related to the main surface verb. For example, classical transformational analyses (e.g., Chomsky 1965; Rosenbaum 1967; Postal 1971; Postal and Ross 1971) recognize the surface subject, "John," in the following sentences to be selectionally unrelated to the main predicates, "be easy" and "appear."

11 John is easy to please.
12 John appears to be angry.

This fact, however, does not contradict our claim that the subject of a sentence is always selectionally related to some predicate in the sentence. In the surface structure, the subject might not be adjacent to the predicate to which it is selectionally related, and it might even have assumed a new grammatical relationship with a verb to which it is not selectionally related. But the fact remains that a selectional relation must exist between the subject of a sentence and some verb in that sentence,[4] whereas no such relationship need exist between topic and verb.[5]

(c) Verb determines "Subject" but not "Topic." A correlate of the fact that a subject is selectionally related to the verb is the fact that, with certain qualifications, it is possible to predict what the subject of any given verb will be.[6] Thus, in English, if a verb occurs with an agent as well as other noun phrases, the agent will become the subject unless a "special" construction is resorted to, such as the passive. (This way of stating the fact about subjectivalization is due to Fillmore, 1968:37.) If the verb is intransitive, either the patient or the actor, depending on whether the verb is a stative verb or an action verb, will be the subject. If the verb is causative, the causer will be the subject. These facts represent some of the language-independent generalizations about how the subject is determined by the verb. There is no doubt that not all verbs in a language can be classified with respect to subjectivization on a language-independent basis. For example, in English, the verb "enjoy" will take the experiencer but not the accusative as the subject, whereas the verb "please" will have the accusative noun phrase but not the experiencer as the subject.
But the fact remains that given a verb, the subject is predictable. The topic, on the other hand, is not determined by the verb; topic selection is independent of the verb. Discourse may play a role in the selection of the topic, but within the constraints of the discourse, the speaker still has considerable freedom in choosing a topic noun phrase regardless of what the verb is. This characteristic of the topic is clearly demonstrated by our earlier examples, 5-8, with topic-comment structure.

(d) Functional role. The functional role of the topic is constant across sentences; as Chafe (this volume) suggests:

"What the topics appear to do is limit the applicability of the main predication to a certain restricted domain. . . . The topic sets a spatial, temporal, or individual framework within which the main predication holds."

Clearly, this function of specifying the domain within which the predication holds is related to the structure of the discourse in which the sentence is found. The topic is the "center of attention"; it announces the theme of the discourse. This is why the topic must be definite (see section II(a) above). The functional role of the topic as setting the framework within which the predication holds precludes the possibility of an indefinite topic. A feel for the bizarreness of such a topic can be gained from considering the impossibility of interpreting the following English sentence:

13 *A dog, I gave some food to {∅ / it / one} yesterday.

Looking at the functional role of the subject, on the other hand, reveals two facts. First, some NPs which can be clearly identified as subjects do not play any semantic role in the sentence at all; that is, in many subject-prominent languages, sentences may occur with "empty" or "dummy" subjects (see section III(c) below). Second, in case the subject NP is not empty, the functional role of the subject can be defined within the confines of a sentence as opposed to a discourse.
According to Michael Noonan (personal communication), the subject can be characterized as providing the orientation or the point of view of the action, experience, state, etc., denoted by the verb. This difference in the functional roles between the subject and the topic explains the fact that the subject is always an argument of the verb, while the topic need not be (see section II(b) above). The explanation runs as follows: if we are to view the action, experience, state, etc., denoted by the verb from the point of view of an entity (or orient the description towards that entity), the entity must be involved in the action, experience or state, etc., and must therefore be an argument of the verb. Thus we see that the distinct functions of the topic and the subject turn out to explain the differences between them in definiteness and selectional relations.

(e) Verb-agreement. It is well known that the verb in many languages shows obligatory agreement with the subject of a sentence. Topic-predicate agreement, however, is very rare, and we know of no language in which it is widespread or obligatory. The reason for this is quite straightforward: topics, as we have seen, are much more independent of their comments than are subjects of their verbs. Evidence of this independence can be found in the fact, discussed in sections II(b) and II(c), that the topic need not have any selectional relationship to any verb and that the topic is not determined by the verb of the sentence. Given this independence, it is to be expected that a constituent in the comment is not normally marked to agree with some grammatical property of the topic. Morphological agreement, then, where some inherent properties of the subject noun are represented by verbal affixes, is a common kind of surface coding for subjects (see E.L. Keenan, Definition of Subject, this volume).[7]

(f) Sentence-initial position.
Although the surface coding of the topic may involve sentence position as well as morphological markers, it is worth noting that the surface coding of the topic in all the languages we have examined always involves the sentence-initial position. In Lisu, Japanese, and Korean, the topic is obligatorily codified by morpheme markers. In Lahu, the topic is optionally codified by morpheme markers. But regardless of the morpheme markers, the topic in these languages must remain in sentence-initial position. Subject, on the other hand, is not confined to the sentence-initial position. In Malagasy and Chumash, for example, the subject occurs in sentence-final position, while Arabic and Jacaltec, for example, are VSO. The reason that the topic but not the subject must be in sentence-initial position may be understood in terms of discourse strategies. Since speech involves serialization of the information to be communicated, it makes sense that the topic, which represents the discourse theme, should be introduced first. The subject, being a more sentence-oriented notion, need not receive any priority in the serialization process.

(g) Grammatical processes. The subject but not the topic plays a prominent role in such processes as reflexivization, passivization, Equi-NP deletion, verb serialization and imperativization (see E.L. Keenan, Definition of Subject, this volume). Thus the reflexive pronoun generally marks a co-referential relation with the subject of the sentence; passivization may be viewed, at least in part, as a process promoting the patient to subjecthood; in Equi-NP deletion, the deleted constituent in the complement is generally the subject; verb serialization, which is found in the Niger-Congo languages and the Sino-Tibetan languages, involves the concatenation of a series of verb phrases with one identical subject; the deleted second person morpheme in an imperative sentence is always the subject. The reason that the topic is not involved in such grammatical processes is partially due to the fact that the topic, as we have shown earlier, is syntactically independent of the rest of the sentence. Reflexivization, passivization, Equi-NP deletion, verb serialization etc., are concerned with the internal syntactic structure of sentences. Since the topic is syntactically independent in the sentence, it is not surprising that it does not play a role in the statement of these processes.

To sum up this section on the differences between the subject and the topic, we note that seven criteria have been established. These criteria are not intended to constitute a definition of either notion, but are rather designed to serve as guidelines for distinguishing the topic from the subject. We may single out three basic factors underlying these criteria: discourse strategy, noun-verb relations, and grammatical processes. The subject has a minimal discourse function in contrast with the topic. Hence, the topic but not necessarily the subject is discourse-dependent, serves as the center of attention of the sentence, and must be definite. As for noun-verb relations and grammatical processes, it is the subject rather than the topic that figures prominently. Thus, subject is normally determined by the verb, and is selectionally related to the verb; and the subject often obligatorily controls verb agreement. These properties of the subject are not shared by the topic. In conclusion, the topic is a discourse notion, whereas the subject is to a greater extent a sentence-internal notion. The former can be understood best in terms of the discourse and extra-sentential considerations; the latter in terms of its functions within the sentence structure.

III. Characteristics of Topic-Prominent Languages.
Having examined a number of properties of topics as opposed to subjects, let us now turn to a discussion of some of the grammatical implications of topic-prominence and subject-prominence.

(a) Surface coding. In Tp languages, there will be a surface coding for the topic, but not necessarily for the subject. For example, in Mandarin, the topic is always in initial position; in Lisu and Lahu, the topic is coded by a morphological marker. In none of these languages is there any surface coding for subject, though, as we have pointed out, the subject notion can be identified as playing a role in certain grammatical processes. In Japanese and Korean, which are both Tp and Sp, there is a morpheme marking the topic (wa and (n)un, respectively) as well as one marking the subject (ga and ka, respectively).

(b) The passive construction. The passive construction is common among Sp languages. Among Tp languages, on the other hand, passivization either does not occur at all (e.g., Lahu, Lisu), or appears as a marginal construction, rarely used in speech (e.g., Mandarin), or carries a special meaning (e.g., the "adversity" passive in Japanese).[8] The relative insignificance of the passive in Tp languages can be explained as follows: in Sp languages, the notion of subject is such a basic one that if a noun other than the one which a given verb designates as its subject becomes the subject, the verb must be marked to signal this "non-normal" subject choice. Fillmore states this requirement as follows for the verb "give" in English:

"The 'normal' choice of subject for sentences containing an A(gent). . . is the A. The verb give also allows either O(bject) or D(ative) to appear as subject as long as this 'non-normal' choice is 'registered' in the V. This 'registering' of a 'non-normal' subject takes place via the association of the feature [+passive] with the V."
(Fillmore, 1968:37)

In Tp languages, it is the topic, not the subject, that plays a more significant role in sentence construction. Any noun phrase can be the topic of a sentence without registering anything on the verb. It is, therefore, natural that the passive construction is not as widespread in Tp languages as it is in Sp languages.

(c) "Dummy" subjects. "Dummy" or "empty" subjects, such as the English it and there, the German es, the French il and ce, may be found in an Sp language but not in a Tp language. This is because in an Sp language a subject may be needed whether or not it plays a semantic role. Examples from English include:

14 It is raining.
15 It is hot in here.
16 It is possible that the war will end.
17 There is a cat in the garden.

In a Tp language, as we have emphasized, where the notion of subject does not play a prominent role, there is no need for "dummy" subjects. In cases where no subject is called for, the sentence in a Tp language can simply do without a subject. For example, the Mandarin sentences corresponding to 15-17 are respectively 18-20.

p. 468

18 Zher hen re (Mandarin)
   here very hot
   "It is hot in here."

19 Keneng zhe-chang zhanzheng jiu-yao jieshu le (Mandarin)
   possible this-class. war will soon end aspect
   "It is possible that this war will soon end."

20 You yi-tiao mao zai huayuan-li (Mandarin)
   exist one-class. cat at garden-in
   "There is a cat in the garden."

(d) "Double subject." Tp languages are famous for their pervasive so-called "double subject" constructions. A number of examples have already occurred in our exposition. Here are four more, each from a different language:

21 Sakana wa tai ga oisii (Japanese)
   fish top. red snapper subj. delicious
   "Fish (topic), red snapper is delicious."

22 Pihengki-nun 747-ka khu-ta (Korean)
   airplane-top. 747-subj. big-stative
   "Airplanes (topic), the 747 is big."
23 Neike shu yezi da (Mandarin)
   that tree leaves big
   "That tree (topic), the leaves are big."

24 ho o na-qho yi ve yo (Lahu)
   elephant top. nose long prt. declar.
   "Elephants (topic), noses are long."

Such sentences are, of course, the clearest cases of topic-comment structures. First, the topic and the subject both occur and can thus be distinguished easily. Second, the topic has no selectional relationship with the verb. Third, no argument can be given that these sentences could be derived by any kind of "movement" rule from some other sentence type. Fourth, all Tp languages have sentences of this type, while no pure Sp languages do, as far as we know.

It has been suggested by Teng (1974) for Mandarin and by Park (1973) for Korean that these sentences involve a "sentential predicate." That is, a Mandarin sentence such as

25 Ta tou teng
   he head ache
   "He has a headache."

is analyzed by Teng (1974) as having the following structure:

p. 469

26 (p. 461, Figure 2)

While we agree with the spirit of this approach, we feel that his analysis makes sense only if languages with "double subject" constructions are seen as Tp. This is because in an Sp language, a predicate cannot be a sentence. If it were a sentence, it would leave the subject grammatically "stranded," as it were, with nothing to be the subject of. Viewing such constructions as composed of a topic and a comment, however, involves no anomaly since sentential comments are quite natural, given the grammatical independence of the topic from the rest of the sentence. 9 The pervasiveness of the "double subject" construction, then, is a significant feature of Tp languages. 10 In section IV(d), we will consider the basicness of "double subject" sentences in more detail.

(e) Controlling co-reference. In a Tp language, the topic, and not the subject, typically controls co-referential constituent deletion. 11 Some examples from Mandarin include:

27 Neike shu yezi da, suoyi wo bu xihuan ___.
   that tree leaves big so I not like ___
   "That tree (topic), the leaves are big, so I don't like it."

The deleted object in the second clause can only be understood to refer to the topic "that tree," and not to the subject "leaves."

28 Nei kuai tian daozi zhangde hen da,
   that piece land rice grow very big
   suoyi ______ hen zhiqian.
   so ______ very valuable
   "That piece of land (topic), rice grows very big, so it (the land) is very valuable."

Similarly, the deleted constituent in 28 refers to the topic "that piece of land," and not to the subject "rice."

p. 470

Sentence 29 illustrates a case in which the subject "fire brigade" cannot control the deletion in the second clause, and the topic "that fire" is incompatible with that clause, so it is incoherent:

29 Nei chang huo xiaofangdui laide zao,
   that classifier fire fire brigade came early
   (*)suoyi ______ hen lei
   so ______ very tired
   "That fire (topic), the fire brigade came early, so they're very tired."

The point we are making is that in a Tp language, the topic takes precedence over the subject in controlling co-reference. 12

(f) V-final languages. Tp languages tend to be verb-final languages, as has been pointed out by Hsieh Hsin-I and W.P. Lehmann (personal communication). Japanese, Korean, Lisu, and Lahu are mature and indisputable verb-final languages, and Chinese, as we have argued elsewhere (see Li and Thompson 1974a and 1974b), is in the process of becoming one. In the final section, we will suggest a possible explanation for this fact.

(g) Constraints on topic constituent. In certain Sp languages, the topic-comment type of sentence is highly constrained in terms of what can serve as the topic constituent. Indonesian, for instance, only allows the surface subject constituent and the genitive of the surface subject constituent to be the topic.
13 Consider sentence 30, a simple subject-predicate construction in Indonesian,

30 Ibu anak itu membeli sepatu (Indonesian)
   mother child that buy shoe
   "That child's mother bought shoes."

where the subject is ibu anak itu "that child's mother." The entire subject may be the topic:

31 Ibu anak itu, dia membeli sepatu (Indonesian)
   mother child that, she buy shoe
   "That child's mother, she bought shoes."

The genitive of the subject, anak itu "that child," may also be the topic:

32 Anak itu, ibu-nja membeli sepatu (Indonesian)
   child that, mother-poss. suffix buy shoe
   "That child, his mother bought shoes."

p. 471

However, if the object noun phrase, sepatu "shoe," of sentence 30 is the topic, the sentence is ungrammatical:

33 *Sepatu itu, ibu anak itu membeli (Indonesian)
   shoe that, mother child that buy

In topic-prominent languages, on the other hand, there are no constraints on what may be the topic.

(h) Basicness of topic-comment sentences. Perhaps the most striking difference between a Tp language and a non-Tp language is the extent to which the topic-comment sentence can be considered to be part of the repertoire of basic sentence types in the former but not in the latter. In the next section we will provide evidence for the basicness of topic-comment sentences in Tp languages.

To summarize this section, we have brought out a number of distinguishing characteristics of Tp languages. In these languages, topics are coded in the surface structure and they tend to control co-referentiality; the topic-oriented "double subject" construction is a basic sentence type, while such subject-oriented constructions as the passive and "dummy" subject sentences are rare or non-existent.

IV. On the Basicness of Topic-Comment Sentences in Tp Languages.*

Our aim in this section will be to show that topic-comment structures in Tp languages cannot be viewed as being derived from any other sentence type.
We would like to make it quite clear that we are not arguing against any particular formulation according to which such a derivational relationship might be established; rather we are arguing against the desirability in principle of viewing topic-comment sentences as derivative, marginal, marked, or otherwise unusual sentence types in these languages. That is, we are not saying that some generative apparatus could not be imagined which would "handle" the cases we are about to present. Our claim is that the data which these Tp languages present are most naturally accounted for by taking the topic-comment sentences to be basic and not derived.

(a) On the notion "basic sentence." E.L. Keenan (Definition of Subject, this volume), in discussing the definition of "subject," offers a definition of "basic" sentence:

"i) a sentence A is more basic than a sentence B if, and only if, the syntactic form and the meaning of B are understood as a function of those of A. (E.g., the form of B is some modification of that of A, and the meaning of B is some modification of that of A.)
"ii) a sentence is a basic sentence in L if and only if no other sentence of L is more basic than it."

According to both of these criteria, topic-comment sentences in Tp languages are basic. There are no sentences more basic than they in terms of which their meaning or form can be specified.

p. 472

(b) Lisu. The clearest data supporting this claim can be found in Lisu, a Tp language described in Hope (1974). Our data will be taken from this work and his response to our inquiry about a number of Lisu constructions while he was doing field work in Thailand. In Lisu, as we will endeavor to show, even the grammatical relations Agent and Patient cannot be identified. Thus, there is no way to identify the notion of subject. It is clear, then, that in Lisu, there is simply no subject-predicate sentence form from which the topic-comment sentences could be said to be derived.

(1) Grammatical relations.
The sentence word order in Lisu is verb-final. If there is more than one noun phrase preceding a verb, then the sentence is normally ambiguous as to which noun phrase represents the agent or the actor and which noun phrase represents the patient. The structure of a simple declarative sentence with a transitive verb will only indicate which noun phrase is the topic but not which noun phrase is the agent. Sentences 34 and 35 are typical simple declarative sentences in Lisu.

34 lathyu nya ana khu-a
   people topic marker dog bite-declarative marker
   "People (topic), they bite dogs /dogs bite them."

35 ana nya lathyu khu-a
   dog topic marker people bite-declarative marker
   "Dogs (topic), they bite people /people bite them."

Sentences 34 and 35 are equally ambiguous as far as agency is concerned. Both sentences may mean either people bite dogs or dogs bite people. The two sentences are different only in terms of the topic. In 34, lathyu "people" is the topic, whereas in 35 ana "dog" is the topic.

One may wonder if a language such as Lisu, which completely neglects the codification of agency or subjecthood, would give rise to communication problems. Of course, there are sentences which are ambiguous, such as 36:

36 lame nya ana khu-a
   tiger topic marker dog bite-declarative marker
   "Tigers (topic), they bite dogs /dogs bite them."

p. 473

The fact is, however, that this total disregard for agency or subjecthood in the structure of the language does not impair its communicative function as much as might be expected. First of all, the context, whether linguistic or extra-linguistic, provides a great number of semantic cues. Secondly, semantic properties such as humanness and animacy play a significant role in disambiguating sentences which may be otherwise ambiguous because of the lack of any indication of agency or subjecthood.
In terms of pragmatics, one may safely assume that when one hears either 34 or 35, the intended meaning would be dogs bite people, since people are normally not expected to bite other creatures. Thus, although 34 and 35 are theoretically ambiguous, they do not present a communication problem in most circumstances.

But the structure of the Lisu verb system also serves to minimize the potential ambiguity. For example, let us contrast the Lisu verb thywu "burn" with the English verb burn. Although both verbs share a great deal in meaning, there is a significant semantic difference between them. The Lisu verb thywu implies that what is being burnt must be inanimate. The English verb burn does not have such a selectional restriction. Thus, the Lisu sentence 37, whose English translation is acceptable, is ungrammatical:

37 *lathyu gu nya ana thywu-a
   person that topic marker dog burn-declarative marker
   "That person burned the dog."

Instead, a causative construction would have to be used to express this proposition. Consider another Lisu verb sye "kill." Although it shares most of the meaning of the English verb kill, it has very different selectional properties. The Lisu verb sye obligatorily co-occurs with the noun yi-pe "an end," but need not occur with a patient noun, which is selectionally required by the English verb kill. Sentence 38 illustrates the usage of sye "kill."

38 asa nya yi-pe sye-a
   asa topic marker end kill-declarative marker
   "Asa killed and an end resulted."

To further demonstrate that Agent and Patient are not systematically distinct in the grammar of Lisu, and hence, that there is no possibility of identifying the subject in Lisu sentences, we would like to cite some additional data.

39 lathyu nya aye ami khwa-a mu-a
   people topic marker buffalo field hoe-decl. marker see-decl. marker
   "The people (topic), they saw the buffaloes hoeing the field /the buffaloes saw them hoeing the field."
40 aye nya lathyu ami khwa-a mu-a
   buffalo topic marker people field hoe-decl. marker see-decl. marker
   "The buffaloes (topic), they saw the people hoeing the field /the people saw them hoeing the field."

41 ami nya aye lathyu khwa-a mu-a
   field topic marker buffalo people hoe-decl. marker see-decl. marker
   "The field (topic), the buffaloes saw the people hoeing it /the people saw the buffaloes hoeing it."

42 ana nya lame dza hi-a
   dog topic marker tiger eat difficult-decl. marker
   "Dogs (topic), they are difficult for tigers to eat /tigers are difficult for them to eat."

43 lame nya ana dza hi-a
   tiger topic marker dog eat difficult-decl. marker
   "Tigers (topic), they are difficult for dogs to eat /dogs are difficult for them to eat."

44 ana nya lame dza nisyi-a
   dog topic marker tiger eat want-decl. marker
   "Dogs (topic), tigers want to eat them /they want to eat tigers."

45 lame nya ana dza nisyi-a
   tiger topic marker dog eat want-decl. marker
   "Tigers (topic), dogs want to eat them /they want to eat dogs."

p. 475

These Lisu sentences clearly show that neither word order nor morphology allows a grammatical distinction to be made between nouns in different relationships with the verb, and that there is, therefore, no identifiable subject in the sentence structure of this language.

(2) Reflexive. In the Thailand dialect of Lisu, the reflexive consists of a construction which is either of the form repeating the co-referential NP, meaning literally "NP's body," or of the form meaning "his body," where a pronoun is being used.

46 lame nya lame kudwe khu-a
   tiger topic marker tiger body bite-decl. marker
   "The tiger (topic), he bit his body."

47 lame nya yi kudwe khu-a
   tiger topic marker he body bite-decl. marker
   "The tiger (topic), he bit his body (i.e., himself)."

48 lame kudwe nya lame khu-a
   tiger body topic marker tiger bite-decl. marker
   "His body (topic), the tiger bit it."
49 yi kudwe nya lame khu-a
   he body topic marker tiger bite-decl. marker
   "His body (topic), the tiger bit it."

(3) Co-ordination. The co-ordination marker in Lisu is ce. If several topic noun phrases are conjoined, ce is used to replace one or all of the topic markers nya.

p. 476

51 lathyu ce lame nya ana khu-a
   people co-ord tiger topic marker dog bite-decl. marker
   "People and tigers (topic), they bite dogs /dogs bite them."

52 lathyu nya lame ce ana khu-a
   people topic marker tiger co-ord dog bite-decl. marker
   a. "People and tigers (topic), they bite dogs /dogs bite them."
   b. "People (topic), they bite dogs and tigers /dogs and tigers bite them."

Again the above examples indicate that co-ordination does not involve any notion of subject. The two readings of 52 do indicate that co-ordination in Lisu follows the general constraint that the conjoined constituents should be semantically and syntactically parallel. (See Schachter 1974.) Hence, although 51 is ambiguous as far as the agent of biting is concerned, the conjoined NPs, lathyu "people" and lame "tiger," must have the same semantic role with respect to the action of biting. Sentence 52 is four-way ambiguous. However, the (a) readings are related to a surface structure in which lathyu "people" and lame "tiger" are conjoined topics and in which the co-ordination marker ce has replaced the topic marker nya of the second topic, lame "tiger." The (b) readings, on the other hand, have a surface structure in which only the NP lathyu "people" is the topic, and the co-ordination marker ce conjoins the other two NPs, lame "tiger" and ana "dog," which are not topics. These examples show that the notion of subject does not play any role in the structure of compound sentences in Lisu.
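
The four-way ambiguity of sentence 52 can be made concrete by crossing the two surface groupings, (a) and (b), with the two possible directions of biting. The short Python sketch below is only an illustration of that counting: the dictionary GROUPINGS, the function readings, and the English labels are hypothetical names introduced here, not notation from Hope (1974) or from the discussion above.

```python
# Illustrative sketch only: this encoding of sentence 52
# (lathyu nya lame ce ana khu-a) is an assumption made for the example.
GROUPINGS = {
    # (a): ce has replaced nya, so "people" and "tigers" are conjoined topics
    "a": (("people", "tigers"), ("dogs",)),
    # (b): only "people" is the topic; ce conjoins the two non-topic NPs
    "b": (("people",), ("tigers", "dogs")),
}


def readings():
    """Return every reading: each grouping crossed with each choice of agent."""
    result = []
    for label, (topic, rest) in GROUPINGS.items():
        # Agency is not coded, so either group of NPs may be the biter.
        for agent, patient in ((topic, rest), (rest, topic)):
            gloss = " and ".join(agent) + " bite " + " and ".join(patient)
            result.append((label, gloss))
    return result


for label, gloss in readings():
    print(label, gloss)
```

The conjoined noun phrases always share one semantic role in every reading produced, which is the parallelism constraint the text attributes to co-ordination in Lisu.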
The Lisu examples presented above demonstrate that the syntactic relation of a noun phrase to the verb in a sentence is indeterminate, and that the notion of subject is quite irrelevant in the description of the sentences of this language. The only relevant notion in the syntactic structure of Lisu sentences is the topic, which is always marked by the morpheme nya and occupies the sentence-initial position.

It might be suggested that Lisu is actually closer to being a subject-prominent language than we have made it out to be. Recall that, as mentioned in footnote 2, nya does appear as a marker in sentences containing no presupposed noun phrases such as:

p. 477

53 swu nya atha da-a
   one topic marker forge knife-decl. marker
   "Someone is forging a knife."

Recall also that the rule governing the appearance of nya in such sentences is that it goes with the agent if there is one, with the dative if there is no agent, with the object if there is no dative, and with the instrumental if there is no object. Now we might say that this function of nya is a subject-marking function since some noun phrase is being singled out not according to its case role, but according to a hierarchy that is typically invoked for subject-prominent languages. In support of our claim that Lisu is essentially a topic-prominent language, however, we want to point out that this apparently subject-oriented nya-marking mechanism is restricted to sentences involving no presupposed noun phrases, which are extremely rare in actual language use. Even a superficial study of discourse shows that communication typically involves some noun phrase whose referent is assumed by the speaker to be known to the hearer.
Since this subject-marking function of nya occurs only in this relatively rare sentence type, and since the notion of subject seems to play no other role in the grammar of Lisu, we claim that the basic sentence structure is topic-comment, with no candidates for any source from which it can be said to be derived.

(c) Mandarin. We are not the first to suggest that Mandarin Chinese is a Tp language. Hong (1956), Householder and Cheng (1967), Tai (1973), Huang (1973), and Alleton (1973) mention the idea, and Chao (1968:67-104) discusses the Topic-Comment concept at some length. It is important to note that, although he uses the terms subject and predicate throughout, we can interpret these terms as topic and comment. That this is his intention can be seen from the following remark:

"The grammatical meaning of subject and predicate in a Chinese sentence is topic and comment, rather than actor and action." (p. 69)

What we are interested in, of course, is the distinction between topic and subject and its implications for the establishment of a linguistic typology. Now, unlike Lisu, Mandarin does have structures that could be called subject-predicate sentences. For example,

53 Wode didi xihuan chi pingguo
   my brother like eat apple
   "My brother likes to eat apples."

p. 478

In this example, the word order parallels that of its English translation. From examples of this type, one could conclude that Mandarin is, like English, an Sp language with the subject in initial position. In addition, although we are describing Mandarin as Tp, as indicated earlier, the notion of subject clearly plays a role in certain sentence structures. For example, the serial verb construction must be described as a sequence of predicates sharing the same subject:

54 Zhang-san mai le piao jinqu
   Zhang-san buy asp. ticket go in
   "Zhang-san bought a ticket and went in." or "Zhang-san bought a ticket to go in."
Serial verb sentences, as we described them in Li and Thompson (1973), may generally be interpreted as expressing either purpose or actions which are consecutive, simultaneous, or alternating. We can show that the notion of subject must be referred to in an account of this construction by giving an example in which the noun shared by the two predicates is an agent of one and an experiencer of the other. That is, serial verb sentences cannot be described by simply referring to the agent of the two predicates:

55 Wo hua le qian xiangshou
   I spend aspect money enjoy
   "I spent money and had a good time." or "I spent money to have a good time."

Furthermore, 56-59 illustrate that the subject may control reflexivization.

56 John xihuan ta-ziji
   like he-self
   "John likes himself."

57 John da ta-ziji
   hit he-self
   "John hit himself."

58 John shi ta-ziji de pengyou
   is he-self genitive friend
   "John is his own friend."

59 *John, wo xihuan ta-ziji
   I like he-self
   *"John (topic), I like himself."

Sentence 59 shows that when the sentence contains a topic which can be distinguished from what one might want to call the subject, this topic does not control reflexivization.

p. 479

Thus, the grammar of Mandarin must refer to the subject to describe the process of reflexivization (see E.L. Keenan, Definition of Subject, this volume).

However, even for Mandarin, the evidence against considering topic-comment sentences to be derived from sentences of a subject-predicate form is very strong. Thus many normal topic-comment sentences whose topics have no selectional relationship with the verb in the comment have no subject-predicate sources. Following are some examples of this type.

60 Huang-se de tudi dafen zui heshi
   yellow-color relative clause marker soil manure most suitable
   "The yellow soil (topic), manure is most suitable."
61 Nei-zuo fangzi xingkui qu-nian mei xia xue
   that-classifier house fortunate last-year not snow
   "That house (topic), fortunately it didn't snow last year."

62 Dongwu wo zhuzhang bao-shou zhengce
   animal I advocate conservation policy
   "Animals (topic), I advocate a conservation policy."

63 Zhei-jian shiqing ni bu neng guang mafan yi-ge ren
   this-classifier matter you not can only bother one person
   "This matter (topic), you can't just bother one person."

The pervasiveness of sentences of this type provides very clear evidence against a process of topicalization.

In addition, the subject is not systematically codified in the surface structure of Mandarin sentences. There is simply no noun phrase in Mandarin sentences which has what E.L. Keenan has termed "subject properties" (Definition of Subject, this volume). This means that a noun phrase which one might want to defend as a subject is impossible to identify as such. As a case in point, let us look briefly at a certain construction which we think provides a clear illustration of the difference between Sp and Tp languages. We can call this construction the "pseudo-passive." Here are two examples:

p. 480

64 Zhei-jian xinwen guangbo le
   this-classifier news broadcast aspect
   "This news (topic), it has been broadcast."

65 Nei-ben shu yijing chuban le
   that-classifier book already publish aspect
   "That book (topic), it has already been published."

Because the initial noun is in an object case relationship with the verb (see Fillmore 1968), one might try to claim that such sentences are actually passives. A similar sentence type exists in Bahasa Indonesia, as described by S. Chung (this volume). She shows that a sentence which is superficially an object topicalization is actually a passive because the fronted object noun can be shown to be functioning as a subject.
A demonstration of this sort cannot be given for sentences such as 64 and 65 in Mandarin because, except as noted above, there seem to be no processes which refer to subject and no surface clues by which a subject could be identified. 14

(d) The "double subject" construction. The "double subject" sentences, as we have suggested (see above, Section III(d)), are prototypical topic-comment sentences. They are widespread in Chinese, Japanese, Korean, Lisu, and Lahu. If we can show that such sentences are not derived, then we will have given very strong support for our case. Precisely the same arguments against deriving the "double subject" sentences from any other sentence type hold for all the Tp languages we have examined.

The only source which has, to our knowledge, ever been suggested for the "double subject" sentence is a subject-predicate type of sentence in which there is a genitive relationship expressed between NP1 and NP2. Thus, for Korean, we could say that 66 was related to 67:

66 John-un mall-ka aphi-ta
   topic head subj. sick stative
   "John has a headache."

67 John-ui mall-ka aphi-ta
   gen. head subj. sick stative
   "John's head aches."

Or, for Mandarin, we could say that 68 should be derived from 69:

68 Xiang bizi chang
   elephant nose long
   "Elephants have long noses."

69 Xiang de bizi chang
   elephant gen. nose long
   "Elephants' noses are long."

p. 481

However, as pointed out in Yang (1972) and Teng (1974), there are many "double subject" sentences in which there are no genitive or partitive relationships between the two initial noun phrases. Examples include:

70 TV-un Zenith-ka tuntun-ha-ta (Korean)
   strong stative
   "The TV (topic), Zenith is durable."

71 Tamen shei dou bu lai (Mandarin)
   they anyone all not come
   "They (topic), none of them are coming."

Thus, a genitive relationship only exists for a subset of "double subject" sentences.
There is no gain, then, in viewing such sentences as being derived from subject-predicate sentences with genitive phrases as subjects.

Even in those cases in which a genitive relationship between the two noun phrases can be maintained, though, nothing would be gained by postulating a derivation of "double subject" sentences from genitive subject sentences. This is because a "re-interpretation" would have to be claimed to have occurred in order to account for the fact that, in Mandarin at least, these two sentence types control co-referential noun phrase deletion differently. Compare 72 and 73:

72 Neike shu de yezi tai da, suoyi wo bu xihuan ______.
   that tree 's leaves too big so I not like
   "That tree's leaves are too big, so I don't like them."

73 Neike shu yezi tai da, suoyi wo bu xihuan ______.
   that tree leaves too big so I not like
   "That tree (topic), the leaves are too big, so I don't like it."

In 72 the controller of the interpretation of the deleted constituent in the second clause is the subject "that tree's leaves," while in 73 the controller is the topic "that tree." Deriving 73 from 72 thus does not appear to be indicated. Teng (1974) presents a number of other arguments against such a derivation.

p. 482

(e) Distribution. We hope to have shown that there is no reason to view topic-comment sentences in Tp languages as "marked" or otherwise special. However, it has been suggested to us that perhaps such sentences are more restricted in their distribution than other sentence types, in particular that they may not occur as freely in restrictive relative clauses and non-asserted complements. 15 But in fact, this is not the case. We present the following examples from Mandarin which show that clauses which must be analyzed as topic-comment structures can be embedded as restrictive relative clauses and as non-asserted complements.
First, a relative clause structure:

74 Wo bu xihuan nei zhong yi jin sanshi kuai-qian de douzi
   I not like that kind one catty 30 dollars rel. marker beans
   "I don't like that kind of beans that costs 30 dollars a catty."

The source sentence for the underlined relative clause is:

75 Nei zhong douzi yi jin sanshi kuai-qian
   that kind beans one catty 30 dollars
   "That kind of beans (topic), one catty is 30 dollars."

which clearly cannot be analyzed as a subject-predicate construction. Here is another relative clause example:

76 Nei-ke yezi hen da de shu feichang gao
   that-classifier leaves very big rel. marker tree unusual tall
   "That tree with big leaves is exceptionally tall."

Once again, the sentence underlying the underlined relative clause could not be claimed to be a subject-predicate construction:

77 Nei-ke shu yezi hen da
   that-classifier tree leaves very big
   "That tree (topic), the leaves are very big."

We can also easily show that topic-comment sentences can be embedded as presupposed complements. Here is one example:

78 Wo fandui tamen shei dou bu lai
   I oppose they anyone all not come
   "I oppose the fact that none of them are coming."

The underlined complement, once again, can only be a topic-comment clause.

p. 483

What examples 74-78 show, then, is that it is not the case that topic-comment sentences in a Tp language are necessarily restricted to asserted clauses. Thus, the argument that such sentences are more "marked" because of this more limited distribution can be seen to have no empirical basis.

What we have tried to do in this section is to argue that the topic function, which is highly marked and set off from the rest of the sentence in Sp languages, has in Tp languages been integrated into the basic syntax of the sentence.
The topic notion must be reckoned with in constructing an adequate grammatical description of these languages, and topic-comment sentences must be counted among the basic sentence types provided by the language.

V. The Typology and Some Diachronic Implications.*

We have presented evidence in favor of a typological distinction between languages in which the notion of topic plays a prominent role as opposed to those in which the notion of subject plays a prominent role. As with all typological distinctions, of course, it is clear that we are speaking of a continuum. Thus, Lisu, as we have seen, is more Tp than Mandarin. Philippine languages, as suggested by Schachter (this volume), seem to be neither highly Sp nor highly Tp, while Japanese and Korean could be described as both Sp and Tp. Malagasy, as described by E.L. Keenan (Malagasy, this volume), seems to be less Sp than English does. These facts can be schematically represented as follows:

p. 484

On the basis of synchronic as well as diachronic phenomena, it seems clear that subject and topic are not unrelated notions. Subjects are essentially grammaticalized topics; in the process of being integrated into the case frame of the verb (at which point we call them subjects), topics become somewhat impure, and certain of their topic properties are weakened, but their topic-ness is still recognizable. 16 That is why many of the topic properties are shared by subjects in a number of languages. For example, some Sp languages do not allow indefinite subjects.

What we are proposing here is that the universal notion of topic may be manifested in different ways across languages. In some languages, such as Lisu and Mandarin, the topic properties are coded in a topic constituent, and topic-comment sentences figure among the basic sentence structures of these languages. In other languages, such as Malagasy (see E.L.
Keenan, Malagasy , this volume), some topic properties are carried by the subject, the constituent which is grammatically closely related to the verb and which plays a major role in the description of a number of grammatical processes. In such languages, to express unambiguously the topic as the discourse theme involves a separate proposition whose only function is topic establishment. In English, for example, we might do it this way: 79 (Remember /You know) Tom? Well, he fell off his bike yesterday. Interestingly, this strategy is very commonly used by English-speaking children (see E.O. Keenan and B.B. Schieffelin, this volume) and by users of American Sign Language (see L. Friedman, this volume). In topic-prominent languages, on the other hand, topic-establishment is built into the syntactic structure of the sentence. The differences between the two types of languages can have profound structural implications, as we have tried to show. On the basis of the cross-linguistic evidence we have presented, we suggest the diachronic schema shown on the next page. To return to the question raised earlier as to why the Tp languages are overwhelmingly verb-final, we offer the following speculation: in propelling a language from stage (C) through stage (D) and then to stage (A), the sentence type that plays a major role is the "double subject" type of sentence. The more such sentences are used in the language, the closer the language comes to stage (A), since these are topic-comment structures par excellence. 
Now note that the "double subject" constructions are always of the form:

80   NP1 NP2 V

which is precisely the typical sentence structure of a verb-final language. This sentence type becomes pervasive as the relationship between NP1 and NP2 becomes less and less constrained.

Diachronic schema:
(A) Tp: topic notion integrated into basic sentence structure; topic and subject distinct
(B) Neither Tp nor Sp: topic becomes more closely integrated into case frame of verb
(C) Sp: topic has become integrated into case frame of verb as a subject; subject and topic often indistinct, subjects having some non-topic properties; sentences with clear topics are highly marked
(D) Both Tp and Sp: topic sentences become less marked, more basic

In conclusion, we hope to have pointed to a new arena to observe the enactment of a familiar drama: a synchronic typology is shown to be simply a slice of a diachronic cycle in which different languages are caught at various stages. In our search for linguistic universals, we are reminded that a typology is really a description of strategies for accomplishing the same communicative goals.

*Charles N. Li and Sandra A. Thompson. 1976. "Subject and Topic: A New Typology of Language." In: Charles N. Li (ed.). Subject and Topic. London/New York: Academic Press, pp. 483-485.
Category: Linguistics Discussion (语言学探讨 Linguistics) | 4 views | 0 comments
Gibbs Sampling Algorithm for LDA
Popularity 1 xiaohai2008 2012-5-4 10:53
I have recently been deriving Gibbs sampling algorithms for a number of topic models. The models considered so far include LDA (Latent Dirichlet Allocation), AT (Author-Topic), ACT (Author-Conference-Topic), ToT (Topic over Time), TNG (Topic N-Gram), BTM (Bigram Topic Model), and LDACOL (LDA Collocation).

I will keep posting the derivations here; comments and corrections are welcome.

Today's post is the first one, LDAGibbs (full text in LDAGibbs.pdf). It mainly follows this reference:

@TECHREPORT{Hei09,
  author = {Heinrich, Gregor},
  title = {Parameter Estimation for Text Analysis},
  institution = {vsonix GmbH and University of Leipzig},
  year = {2009},
  type = {Technical Report Version 2.9},
  abstract = {Presents parameter estimation methods common with discrete probability distributions, which is of particular interest in text modeling. Starting with maximum likelihood, a posteriori and Bayesian estimation, central concepts like conjugate distributions and Bayesian networks are reviewed. As an application, the model of latent Dirichlet allocation (LDA) is explained in detail with a full derivation of an approximate inference algorithm based on Gibbs sampling, including a discussion of Dirichlet hyperparameter estimation.},
}
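For readers without access to LDAGibbs.pdf, here is a minimal sketch of a collapsed Gibbs sampler for LDA in the spirit of the Heinrich report. It is not the derivation from the attached PDF. The function name `lda_gibbs`, the symmetric hyperparameters `alpha` and `beta`, and the representation of documents as lists of integer word ids are all assumptions made for illustration.

```python
import random


def lda_gibbs(docs, n_topics, vocab_size, alpha=0.1, beta=0.01, n_iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA (illustrative sketch).

    docs is a list of documents, each a list of word ids in range(vocab_size).
    Returns (theta, phi, z): document-topic estimates, topic-word estimates,
    and the final topic assignment of every token.
    """
    rng = random.Random(seed)
    n_dk = [[0] * n_topics for _ in docs]                # tokens in doc d assigned to topic k
    n_kw = [[0] * vocab_size for _ in range(n_topics)]   # times word w is assigned to topic k
    n_k = [0] * n_topics                                 # total tokens assigned to topic k

    # Random initial topic assignment for every token.
    z = []
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            n_dk[d][k] += 1
            n_kw[k][w] += 1
            n_k[k] += 1
        z.append(z_d)

    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current token from the counts.
                k = z[d][i]
                n_dk[d][k] -= 1
                n_kw[k][w] -= 1
                n_k[k] -= 1
                # Full conditional p(z_i = t | all other assignments, words),
                # up to a constant that does not depend on t.
                weights = [
                    (n_dk[d][t] + alpha) * (n_kw[t][w] + beta) / (n_k[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                # Add the token back under its newly sampled topic.
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][w] += 1
                n_k[k] += 1

    # Point estimates from the final sample.
    theta = [
        [(n_dk[d][k] + alpha) / (len(doc) + n_topics * alpha) for k in range(n_topics)]
        for d, doc in enumerate(docs)
    ]
    phi = [
        [(n_kw[k][w] + beta) / (n_k[k] + vocab_size * beta) for w in range(vocab_size)]
        for k in range(n_topics)
    ]
    return theta, phi, z
```

The sketch reads the estimates off a single final sample to stay short. Averaging over several well-separated samples after burn-in would likely give more stable estimates.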
Category: Machine Learning | 12584 views | 5 comments
[Repost] Academic Talk: Parameter Estimation for Text Analysis
xiaohai2008 2012-4-18 18:04
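Title: Parameter Estimation for Text Analysis
Speaker: Dr. Qingwei Shi (史庆伟), postdoctoral researcher
Time: 2:00 pm, Monday, April 23, 2012
Venue: Conference Room 419, 4th floor, ISTIC (中信所), 15 Fuxing Road, Haidian District, Beijing (west gate of China Central Television)

About the speaker: Qingwei Shi joined ISTIC as a postdoctoral researcher in 2010, working with Research Professor Xiaodong Qiao (乔晓东). His research is mainly on information organization and retrieval.

Outline: Discovering and tracking the topics and hot spots of a specific field in large-scale literature data is an important part of providing high-quality services to researchers. Statistical methods for processing text data have produced a wealth of results over the years. In particular, the topic models proposed by Blei et al. address problems of the vector space model, latent semantic indexing, and similar models, such as high-dimensional sparse data and the inability to handle polysemy, and they have been widely applied to academic mining, sentiment analysis, collaborative filtering, social media mining, and other areas. This talk introduces the basics of Bayesian estimation for discrete probability distributions as a foundation for understanding topic-based text analysis methods such as probabilistic latent semantic analysis (PLSA), latent Dirichlet allocation (LDA), and other mixture-model methods.

Everyone from inside and outside the institute is welcome to attend!

Hosted by: ISTIC Graduate Department and Information Technology Support Center
Category: Machine Learning | 3836 views | 0 comments
[Repost] Data Download Sites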
baixx87 2012-3-12 00:07
综合资料库: Noaa 资料库: www.cdc.noaa.gov NCEP 资料介绍: http://www.lasg.ac.cn/cgi-bin/forum/topic.cgi?forum=13topic=3 欧洲气象中心资料 (grib 和 NC 格式的 ) : http://www.ecmwf.int/ Levitus 资料: http://ingrid.ldgo.columbia.edu/SOURCES/.LEVITUS94/.MONTHLY/ Ucar 资料 http://www.cgd.ucar.edu/cas/guide/Atmos/Surface/data.html NASA 资料: ftp://podaac.jpl.nasa.gov/seasurfaceheight/ 以前某天全国的天气情况   http://www.t7online.com/feature/hi301100.shtml 欧洲多模式超级集合 DEMETER 计划历史回报数据下载网址 http://www.ecmwf.int/research/demeter/index.html NCEP 系统资料: NCEP real-analyses and forecasts http://www.emc.ncep.noaa.gov/data/ NCEP/NCAR REANALYSIS http://dss.ucar.edu/pub/reanalysis/ NCEP Eta http://www.emc.ncep.noaa.gov/mmb/research/meso.products.html NCEP AVN http://www.emc.ncep.noaa.gov/modelinfo/ netCDF format NNRP1: 6 hourly, 2.5 degrees, from 1948 to present ftp://ftp.cdc.noaa.gov/pub/Datasets/ncep.reanalysis/ NNRP2: 6 hourly, 2.5 degrees, from 1979 to 2002 ftp://ftp.cdc.noaa.gov/pub/Datasets/ncep.reanalysis2/ 降水资料 CMAP 资料 : http://tao.atmos.washington.edu/data_sets/cmap_precip/ NCEP 资料 : http://www.lasg.ac.cn/cgi-bin/forum/topic.cgi?forum=2topic=484 地面资料 : http://www.lasg.ac.cn/cgi-bin/forum/topic.cgi?forum=2topic=445 细分辨率的全球径流指标场: http://www.lasg.ac.cn/cgi-bin/forum/topic.cgi?forum=2topic=365 全球土壤资料: http://acdisx.gsfc.nasa.gov/CAMP ... ITE/INT_DIS/readmes 全国 160 个站的降水资料 http://www.lasg.ac.cn/cgi-bin/fo ... 3topic=426start=0 http://ncc.cma.gov.cn/Website/index.php?ChannelID=43WCHID=5 海洋再分析资料: http://iridl.ldeo.columbia.edu/SOURCES/.UMD/.Carton/.goa/.beta7/ 海表高度: ftp://podaac.jpl.nasa.gov/seasurfaceheight/ ftp://podaac.jpl.nasa.gov/pub/se ... GB_423/MGB423.129.Z 中国常规气象观测资料:国气象局气象中心资料室就能拿到,只要一张 资料 50 元的光盘刻录费而已。 600 个站 50 年左右 . (海表)温度资料: 表面温度: http://www.lasg.ac.cn/cgi-bin/forum/topic.cgi?forum=2topic=377 GSST: Gridded Sea Surface Temperature 1990 年至今的海温资料 regcm3 的主页上有连接的 ftp://ftp.cdc.noaa.gov/pub/Datas ... 
ean.1990-present.nc 百年以上的 SST 资料 : http://www.lasg.ac.cn/cgi-bin/forum/topic.cgi?forum=2topic=322 ReynoldsSmith 的重构月平均海表温度资料( 2×2 ) http://www.cdc.noaa.gov/cdc/data.noaa.ersst.html 南海气候态温盐年平均格点资料: http://led.scsio.ac.cn/cgi-bin/topic.cgi?forum=2topic=24show=0    http://www.lasg.ac.cn/cgi-bin/forum/topic.cgi?forum=2topic=1518 ncep1*1 再分析资料和 avn 资料网址 ( by 小歹 ) ftp://ftpprd.ncep.noaa.gov/pub/data/nccf/com/ http://dss.ucar.edu/datasets/ds083.2/data/ 1998 年每周的雪盖资料 (by jaodan) http://www.cpc.ncep.noaa.gov/data/snow/ mm5 中 terrain 部分中的 25 类植被数据 http://www.lasg.ac.cn/cgi-bin/forum/topic.cgi?forum=2topic=2679 ftp://ftp.ucar.edu/mesouser/Data/ 慕士塔格冰芯资料 http://www.lasg.ac.cn/cgi-bin/forum/topic.cgi?forum=13topic=406 卫星资料: http://www.nsoas.gov.cn/ 国家卫星海洋应用中心 Aviation model 的 avn data : http://weather.unisys.com/aviation/index.html Topex/Poseidon 卫星资料: http://sealevel.jpl.nasa.gov/mission/topex.html http://www-ccar.colorado.edu/research/topex/html/topex.html 免费的遥感卫片资料: http://poet.jpl.nasa.gov// 探空资料  http://cdc.cma.gov.cn/publicservice/gaokong.jsp 全国探空资料 http://www.weather.uwyo.edu/upperair/sounding.html1973 年至今的全球探空资料 水文资料: 水文资料 1 : http://espejo.unesco.org.uy/index.html 水文资料 2 : Global runoff data center(GRDC) http://www.bafg.de/grdc.htm 水文资料 3 : US Geologic Survey(USGS) http://water.usgs.gov/ 风场 /OLR/ 指数资料 NECP-QSCAT 混合风场 ( 一天四次 , 空间精度 0.5 度 ) ftp://ncardata.ucar.edu user :nonymous passwd :anonymous 目录 /datasets/ds744.4/data 混合风场 ncep 的 nc 格式的风场 : http://www.cdc.noaa.gov/cdc/reanalysis/ SODA ( Simple Ocean Data Assimilation )的资料: http://iridl.ldeo.columbia.edu/SOURCES/.UMD/.Carton/.goa/ NACR-NECP WIND STRESS   风场资料 http://podaac.jpl.nasa.gov/sst/ 全球或太平洋的风场资料 http://ingrid.ldeo.columbia.edu/SOURCES/ 74-99 年的 OLR 资料 : http://www.lasg.ac.cn/cgi-bin/forum/topic.cgi?forum=13topic=55 长序列南方涛动资料: http://www.jisao.washington.edu/pacs/additional_analyses/soi.html 国内卫星反演的风数据 : 有红外云导风和水汽云导风!联系 国家卫星气象中心张其松研究员和许健民院士 TOMS 臭氧资料: http://toms.gsfc.nasa.gov/ozone/ozone.html 
www.teachearth.com/resources/TOMS_Ozone_WWW.html 国内外地型 / 地图资料: 全球地形资料: http://www.ngdc.noaa.gov/ngdcinfo/newdownloads.html http://www.cdc.noaa.gov/Datasets/ferret/data/etopo60.cdf 中国近海的地形数据: http://www.whigg.ac.cn/bbs/dispbbs.asp?boardID=2ID=51page=1 ETOPO5: 全球 5 分* 5 分的地形资料。就怕是对中国近海水深等值线而言分辨率不高 http://www.lasg.ac.cn/cgi-bin/forum/topic.cgi?forum=2topic=2550 www.odci.gov/cia/publications/factbook/ edina.ed.ac.uk/digimap/ edina.ed.ac.uk/ukborders data.geocomm.com/catalog/index.html www.maproom.psu.edu/dcw/ www.english-nature.org.uk/pubs/gis/GIS_register.asp glcf.umiacs.umd.edu/index.shtml www.landmap.ac.uk http://www.whigg.ac.cn/bbs/dispbbs.asp?boardID=2ID=51page=1 地 面 气 象 电 码 手 册  http://218.22.141.109/ywk/dzgf/Dmsc001.htm Noaa 资料库: www.cdc.noaa.gov NCEP 资料介绍: http://www.lasg.ac.cn/cgi-bin/forum/topic.cgi?forum=13topic=3 NCEP 资料 : http://www.lasg.ac.cn/cgi-bin/forum/topic.cgi?forum=2topic=484 地面资料 : http://www.lasg.ac.cn/cgi-bin/forum/topic.cgi?forum=2topic=445 细分辨率的全球径流指标场: http://www.lasg.ac.cn/cgi-bin/forum/topic.cgi?forum=2topic=365 表面温度: http://www.lasg.ac.cn/cgi-bin/forum/topic.cgi?forum=2topic=377 GSST: Gridded Sea Surface Temperature 百年以上的 SST 资料 : http://www.lasg.ac.cn/cgi-bin/forum/topic.cgi?forum= 2topic=322 ncep1*1 再分析资料和 avn 资料网址 () : ftp://ftpprd.ncep.noaa.gov/pub/data/nccf/com/ http://dss.ucar.edu/datasets/ds083.2/data/ 1998 年每周的雪盖资料 (by jaodan) http://www.cpc.ncep.noaa.gov/data/snow/ 有谁有扩展 svd 程序 (by jaodan) http://www.lasg.ac.cn/cgi-bin/forum/topic.cgi?forum=2topic=448 全球地形资料: http://www.ngdc.noaa.gov/ngdcinfo/newdownloads.html 中国近海的地形数据: http://www.whigg.ac.cn/bbs/dispbbs.asp?boardID=2ID=51page=1 卫星资料: http://www.nsoas.gov.cn/ 国家卫星海洋应用中心 Aviation model 的 avn data : http://weather.unisys.com/aviation/index.html 水文资料 1 : http://espejo.unesco.org.uy/index.html 水文资料 2 : Global runoff data center(GRDC) http://www.bafg.de/grdc.htm 水文资料 3 : US Geologic Survey(USGS) http://water.usgs.gov/ 南海气候态温盐年平均格点资料: 
http://led.scsio.ac.cn/cgi-bin/topic.cgi?forum=2topic=24show=0    http://www.lasg.ac.cn/cgi-bin/forum/topic.cgi?forum=2topic=1518 海洋再分析资料: http://iridl.ldeo.columbia.edu/SOURCES/.UMD/.Carton/.goa/.beta7/ 中国常规气象观测资料:国气象局气象中心资料室就能拿到,只要一张 资料 50 元的光盘刻录费而已。 600 个站 50 年左右 . Levitus 资料: http://ingrid.ldgo.columbia.edu/SOURCES/.LEVITUS94/.MONTHLY/ SODA ( Simple Ocean Data Assimilation )的资料: http://iridl.ldeo.columbia.edu/SOURCES/.UMD/.Carton/.goa/ 全球或太平洋的风场资料 http://ingrid.ldeo.columbia.edu/SOURCES/ 气团轨迹图和风场资料下载 http://cgermetex.nies.go.jp/metex/index.html %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 转自中国气象论坛: http://www.cmabbs.com/thread-4032-1-1.html 另外,大气所的FTP: ftp.iap.ac.cn
Category: Learning NOTE | 0 comments
A Brief Introduction to the HMM-LDA Model
Popularity 5 xiaohai2008 2012-1-29 08:31
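Griffiths et al.~\cite{GSBT05} argue that every word in a sentence is there for a reason. The authors distinguish two kinds of reason: a word may serve a syntactic function, making the sentence conform to the rules of the language (these are what we usually call function words), or it may serve a semantic function, conveying the actual meaning of the sentence (these are what we usually call content words).

Syntactic constraints are usually short-range and generally do not extend beyond a single sentence. Semantic constraints are usually long-range: different sentences of the same document express similar or related content, and so tend to use similar or related vocabulary. Syntactic constraints are typically modeled with an HMM (Hidden Markov Model) or a PCFG (Probabilistic Context-Free Grammar), while semantic constraints are typically modeled with a topic model. Earlier work usually treated the two separately. Griffiths et al. reasoned that combining them should give better results, and so proposed the HMM-LDA model, which is described in detail in~\cite{GSBT05}.

The derivation of the relevant formulas is in the attachment hmm-lda简介.pdf

@STRING(NIPS17="Advances in Neural Information Processing Systems 17")

@INCOLLECTION{GSBT05,
  author = {Griffiths, Thomas L. and Steyvers, Mark and Blei, David M. and Tenenbaum, Joshua B.},
  title = {Integrating Topics and Syntax},
  booktitle = NIPS17,
  publisher = {MIT Press},
  year = {2005},
  editor = {Saul, Lawrence K. and Weiss, Yair and Bottou, L\'{e}on},
  pages = {537--544},
  address = {Cambridge, MA},
}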
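The division of labour between the two components may be easier to see in the generative direction. The sketch below samples one toy document under the HMM-LDA assumptions as described in~\cite{GSBT05}: each word gets a topic drawn from the document's topic mixture and a syntactic class drawn from a Markov chain, and only the designated semantic class emits from the topic's word distribution. The function names, the choice of class 0 as the semantic class, the starting state, and the toy distributions are all assumptions for illustration; this is not the derivation in the attached PDF.

```python
import random


def draw(rng, dist):
    """Draw one outcome from a dict that maps outcomes to probabilities."""
    outcomes = list(dist)
    return rng.choices(outcomes, weights=[dist[o] for o in outcomes])[0]


def generate_document(theta, topic_words, class_words, transitions, length, seed=0):
    """Sample words from a toy HMM-LDA generative process (illustrative sketch).

    theta        document-topic probabilities, one entry per topic
    topic_words  list of dicts, word distribution of each topic
    class_words  dict: class id (1, 2, ...) -> word distribution of that syntactic class
    transitions  row-stochastic matrix over classes; class 0 is the semantic class
    """
    rng = random.Random(seed)
    words, classes, topics = [], [], []
    prev = 0  # assumption: the chain starts as if the previous class were class 0
    for _ in range(length):
        z = rng.choices(range(len(theta)), weights=theta)[0]                     # long-range: topic
        c = rng.choices(range(len(transitions)), weights=transitions[prev])[0]  # short-range: class
        if c == 0:
            w = draw(rng, topic_words[z])   # content word, emitted by the topic
        else:
            w = draw(rng, class_words[c])   # function word, emitted by the class
        words.append(w)
        classes.append(c)
        topics.append(z)
        prev = c
    return words, classes, topics
```

Note that a topic is drawn for every position even when the class is syntactic, in which case it is simply ignored when the word is emitted.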
Category: Machine Learning | 12396 views | 8 comments
[Repost] Reactive Magnetron Sputtering
xpzhanghit 2011-12-29 20:50
Dr Stijn Mahieu, Ghent University. Archived topic page last updated on 16 July 2008. http://www.scitopics.com/Reactive_Magnetron_Sputtering.html

Magnetron sputtering is a widely used PVD (Physical Vapour Deposition) technique to deposit thin films. This technique is based on the generation of a magnetically enhanced glow discharge, the so-called magnetron discharge. When a reactive gas is added to the discharge, it becomes possible to deposit compound materials. This process, i.e. reactive sputter deposition, has been recently reviewed in two papers. Both papers discuss in detail the reactive sputter process and its stability problems. Indeed, the addition of the reactive gas results not only in the formation of a compound on the substrate but also on the target or cathode surface. This can result in a sudden decrease of the deposition rate and an abrupt change in the partial pressure of the reactive gas, the so-called hysteresis or poisoning effect. Although both papers give an excellent overview of the reactive sputter process and the techniques to circumvent the hysteresis effect, recent experimental and modelling results show that several fundamental aspects concerning reactive sputtering have not been elucidated yet.

One of the major problems of the reactive sputter process is its complexity. Indeed, to understand and describe this deposition process in all its details, a complete characterisation and description of the sputter process is necessary. More specifically, the interaction between target processes, plasma processes, the sputter process and subsequent collisional transport of the sputtered particles through the gas phase and all substrate processes should be taken into account since all of them are related to each other. Attempts to obtain such a total description of the sputter process are ongoing in our research group. A good understanding of the reactive sputtering process is essential when tailoring the thin film properties.
Indeed, several authors have shown that the plasma chemistry plays a crucial role in the development of the microstructure and the crystallographic orientation of the deposited thin films. It is beyond doubt that the crystallographic orientation and the microstructure influence a wide variety of thin film properties. For "simple" materials these relationships have been thoroughly examined. However, most of the new technologically interesting materials have a complex chemical (and crystalline) structure. These multi-elemental materials allow many parameters to be tuned, including lattice constants, electronic band structures, and magnetic properties.

References
* W.D. Westwood, Sputter Deposition, AVS Education Committee Books Series, Volume 2, AVS (New York) 2003 (ISBN 0-7354-0105-5)
* "Fundamental understanding and modeling of reactive sputtering processes" S. Berg, T. Nyberg, Thin Solid Films 476 (2005) 215-230
* "Control of reactive sputtering processes" W.D. Sproul, D.J. Christie, D.C. Carter, Thin Solid Films 491 (2005) 1-17
* "Target poisoning during reactive magnetron sputtering: Part I: the influence of ion implantation" D. Depla, R. De Gryse, Surf. Coat. Technol. 183 (2004) 184-189
* "Modeling of the target surface modification by reactive ion implantation during magnetron sputtering" D. Depla, Z.Y. Chen, A. Bogaerts, V. Ignatova, R. De Gryse, R. Gijbels, J. Vac. Sci. Technol. A 22 (2004) 1524-1529
* "Comprehensive perspective on the mechanism of preferred orientation in reactive-sputter-deposited nitrides" Y. Kajikawa, S. Noda, H. Komiyama, J. Vac. Sci. Technol. A 21 (2003) 1943-1959
* "Mechanism of preferential orientation in sputter deposited titanium nitride and yttria-stabilized zirconia layers" S. Mahieu, P. Ghekiere, G. De Winter, S. Heirwegh, D. Depla, R. De Gryse, O.I. Lebedev, G. Van Tendeloo, J. Cryst. Growth 279 (2005) 100-109
* "Reactive Sputter Deposition" edited by D. Depla and S. Mahieu, Springer, 978-3540766629
* "Biaxial alignment in sputter deposited thin films" S. Mahieu, P. Ghekiere, D. Depla, R. De Gryse, Thin Solid Films 515 (2006) 1229
Category: Short Web Articles | 9 views | 0 comments
How Hot Is a Research Field?
Popularity 1 zllzll 2011-11-21 20:14
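Today I used Web of Science to run keyword searches on some research topics that I care about or am familiar with. The number to the right of each keyword below is the number of articles matched when searching in the Title field. A larger number means more papers were found, which suggests that more people work in that area and that the area is relatively hotter.

Whether to work on a relatively hot topic or a relatively cold one has long been a matter of debate. I think that in a hot area, once you publish, your citation counts will certainly be somewhat higher, but potential reviewers are hard to find, and with so many people around, relationships are harder to maintain. In a cold area it is quicker to get into the circle, since there may be only a handful of groups in the whole world, but you still have to produce some good work first to be recognized; otherwise, a few minor papers will attract little attention.

Whether the field is hot or cold, publishing in very high-level journals is hard, because truly breakthrough results are very difficult to obtain; most people are doing everything they can just to fill a gap.

Interdisciplinary work is always a good idea, but the difficulty is that broadening your knowledge also takes time. In short, first do solid work in your own field; keep at it and things will eventually work out.

quantum dot: 29,709
solar cell: 23,671
sol-gel: 22,034
Porphyrin: 20,846
Near-infrared: 18,391
mesoporous: 14,520
cyclodextrin: 13,148
drug delivery: 12,194
graphene: 9,622
DFT: 9,583
cell imaging: 8,960
two photon: 8,224
Microfluidic: 8,005
Photodynamic therapy: 7,952
Lithium battery: 7,608
NADH: 6,019
white light: 4,891
Photochromic: 3,952
organic light emitting diode: 3,326
metal-organic framework: 2,430
fluorescent dye: 2,106
gold nanoparticle: 1,955
logic gate: 1,698
asymmetric catalysis: 1,481
fluorescent sensor: 1,085
supramolecular polymer: 986
coordination assembly: 872
rotaxane: 534
Cucurbituril: 170

(Searched on 2011/11/21)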
Category: Science and Research | 4061 views | 1 comment
pLSI: probabilistic Latent Semantic Index
xiaohai2008 2011-11-10 17:14
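Today, in Conference Room 419 at ISTIC (中信所), I presented a topic model: pLSI (probabilistic Latent Semantic Index).

I have been very busy lately, so I stayed up until 3 am last night to finish the slides. Preparing them deepened my understanding of many issues, and during the talk colleagues raised some very good questions.

Today's slides are in pLSI.pdf; comments and corrections are welcome. I also look forward to more collaboration and exchange with like-minded colleagues.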
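Since the slides are not reproduced here, the following is a minimal sketch of how pLSI is commonly fitted with EM, using the asymmetric parameterization P(w|d) = sum over z of P(w|z) P(z|d). It is an illustration under stated assumptions, not the content of pLSI.pdf: the function name `plsi_em`, the dense count-matrix input, and the random initialization are all choices made for this example.

```python
import math
import random


def plsi_em(counts, n_topics, n_iters=50, seed=0):
    """Fit pLSI with EM on a dense document-by-word count matrix (illustrative sketch).

    Returns (p_w_z, p_z_d, history), where p_w_z[z][w] estimates P(w|z),
    p_z_d[d][z] estimates P(z|d), and history holds the log-likelihood
    evaluated at the start of each iteration.
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])

    def normalized(row):
        total = sum(row)
        if total == 0:  # e.g. an empty document: fall back to a uniform distribution
            return [1.0 / len(row)] * len(row)
        return [x / total for x in row]

    # Random positive initialization of both parameter sets.
    p_w_z = [normalized([rng.random() + 0.1 for _ in range(n_words)]) for _ in range(n_topics)]
    p_z_d = [normalized([rng.random() + 0.1 for _ in range(n_topics)]) for _ in range(n_docs)]

    history = []
    for _ in range(n_iters):
        new_w_z = [[0.0] * n_words for _ in range(n_topics)]
        new_z_d = [[0.0] * n_topics for _ in range(n_docs)]
        loglik = 0.0
        for d in range(n_docs):
            for w in range(n_words):
                n = counts[d][w]
                if n == 0:
                    continue
                # E-step: P(z|d,w) is proportional to P(w|z) * P(z|d).
                joint = [p_w_z[z][w] * p_z_d[d][z] for z in range(n_topics)]
                total = sum(joint)
                loglik += n * math.log(total)
                # Accumulate expected counts for the M-step.
                for z in range(n_topics):
                    r = n * joint[z] / total
                    new_w_z[z][w] += r
                    new_z_d[d][z] += r
        history.append(loglik)
        # M-step: renormalize the expected counts.
        p_w_z = [normalized(row) for row in new_w_z]
        p_z_d = [normalized(row) for row in new_z_d]
    return p_w_z, p_z_d, history
```

The log-likelihood values in `history` should never decrease from one iteration to the next, which gives a cheap sanity check on any EM implementation of this kind.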
Category: Machine Learning | 8498 views | 0 comments
Author Name Disambiguation for Citations Using Topic and ...
chengh3 2011-11-8 16:36
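Author Name Disambiguation for Citations Using Topic and Web Correlation. Kai-Hsiang Yang, from the Institute of Information Science, Academia Sinica, Taiwan. ECDL'08 (European Conference on Digital Library). The paper uses topic correlation and web correlation to judge whether citations carrying the same author name refer to the same person, with a pair-wise clustering model.
Category: Name Disambiguation | 2857 views | 0 comments
[Repost] BOAO Youth Forum for Hong Kong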
whyhoo 2011-11-5 22:32
Prof Hu, distinguished speakers, students, ladies and gentlemen: Today's topic, "Education in transition", is central to the mission of Asian universities, indeed all universities. Our world is in transition. You might say that it is in a state of rapid change, or you would say that it is in turmoil. Either way, big change is the order of the day: changes in the economic system and changes in demand for new skills. Parents and students may expect that a university education will prepare students for a cushy job. But I am sorry to say that this is a misunderstanding of the purpose of higher education, especially in a world that is changing before our very eyes. I would say that education in general, and university education in particular, is "preparation for change" because we cannot have a static education for a fast-changing world. The job market is very different today, compared to your parents' days. Few jobs now last a lifetime. Instead, people change jobs, employers or even occupations almost as often as they change their fashion. Old jobs are lost to the market and new jobs are being created every day. Some jobs go overseas. Others simply disappear. So, how do we prepare for change in Hong Kong through our education system? This is where the 334 reform comes in. For the non-HK people in the audience, 334 means that HK will go from a three year university system, to a four year university program with high school education being shortened from 7 to 6 years. 334 is not just about changing the number of years at each level. It is an opportunity to rethink our approach to education, to break free from our obsession with static examination learning and passive memory learning. Instead, students are asked to learn how to learn, to learn analytically and creatively. For centuries, students in Asia were taught to memorize and regurgitate, to respect scholarship and accept the authority of famous people.
People in the West often compare us unfavorably with Western education which puts a premium on inquiry learning and creativity. Don't get me wrong. There are some things that Asian people do right. We Asians value education greatly. We believe that education will better our lives and our society. Asian students are also known for their discipline and drive. By now, you have all heard about our "Tiger Moms". But discipline and a belief in education alone will not bring out the best in each individual. At HKUST, we ask ourselves, "What kind of graduates do we want to produce?" Well, the ideal graduate is someone who asks a lot of questions, who is curious about things, who never takes anything for granted and who is prepared and has the experience and confidence to learn anything new. Someone who is not curious will not discover new things. Einstein famously said: "I have no special talents. I am only very curious." Well, I think he was a bit too modest but talent without curiosity and drive will not go far. But curiosity alone is not enough. We must learn by doing, often through trial and error. Some wise educator once said: "We remember 10% of what we hear, 20% of what we see, and 80% of what we do." This is why in designing our new undergraduate curriculum, we make an effort to offer opportunities for doing research at the undergraduate level under the guidance and mentorship of some of our leading researchers. Our famous Undergraduate Research Opportunities Program is responsible for sending 50 of our graduates every year into Ph.D. programs in world-leading universities, many with full scholarships. There is another change: to move away from early specialization. Previously, our admission was discipline-based. Now it is school-based. Applicants can wait till their second year before deciding on their majors and all students share a common core program which is broad-based. Students can explore "Signature Courses" from other Schools.
Your minor may turn out to spark your life-long interest or become your career. Two of the Nobel Laureates who came to lecture at our university this year became world-famous experts in a subject they took only as a minor in university. We offer multidisciplinary programs such as environmental studies, and biomedical engineering. Our business majors can learn some S&T and our S&T students can pick up some business and management skills. You have no doubt heard of the complaint that students are often unable to find work in the discipline in which they are trained. We have two responses to that. First, remember I already said that the purpose of university education is not just vocational training and many successful people built their careers in fields other than their university majors. Their successes are often helped by the general skills, logical thinking, the experience and confidence in having learned a subject in depth, and broad and global perspectives. Our second response is to train our students to create their own jobs and jobs for others, to pursue their passion as entrepreneurs. Over the years, our Entrepreneurship Center has helped 46 start-ups, giving them market advice and connections, and even steering them to angel investors. Our first PhD is now the founder and chairman of a tech company that is listed in the Hong Kong Stock Exchange. Just last week, he had a one-on-one with the legendary former GE CEO Jack Welch in a forum held at this very Convention Center. Finally, education is not just about yourself. The duty of the educated is to serve the society that gives you that privilege. That is why community service and engagement is part of our ideal student profile. We have two service learning programs: CONNECT and REDBIRD. To date, over 3,000 students have taken part in one or the other. So you can see that they are not just token projects. Ladies and gentlemen, no jobs are safe and permanent in this new global order.
The wise graduate is one who sees opportunities in difficulties and future trends. That is the new leader for tomorrow. I hope those are graduates that HKUST will produce. Thank you. Original text: http://www.ust.hk/eng/about/speeches_20110915-130.htm
1731 views | 0 comments
Useful English expressions: how to change a topic
Popularity 1 zuojun 2011-4-28 03:12
You are chatting with a friend, or a colleague, or a co-worker. Suddenly, a topic comes up, which makes you uncomfortable. What do you do then? You could say politely: "I don't want to talk about this." Or, you could say: "I don't want to hear about this." "Let's talk about something else." If the other party is not smart enough to change the subject, then you should find a way to get out of there as fast as you can.
Category: From the U.S. | 3919 views | 2 comments
Some Personal Thoughts on Finding a Research Topic
orient 2009-6-27 01:15
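This is how it works in the United States: you have to find your own topic, though it must fit within your advisor's framework. If you cannot find one, you simply keep looking; some people go several years without finding one.

What does doing research really mean? Before coming to the US, I always believed that one had to find a topic with practical value. Later, perhaps under the influence of my environment, I came to understand the idea of research differently. In particular, after watching the film Einstein and Eddington, I saw research in an entirely new light.

When I first arrived, my advisor hoped that I would publish high-quality papers, and I felt some inner resistance. Why? Because back in China many papers, especially those written for promotion reviews, are produced to pad a publication count, so they end up cobbled together, with no theoretical value and no practical value either. In my mind I had therefore come to associate writing papers with that kind of sloppiness.

Only later did I come around. The real purpose of writing papers is to use academic journals as a platform to present your views and report your research progress, and thereby to communicate with other researchers, or even to debate a question. When Einstein talks with Laplace, he too hopes to publish high-quality papers, that is, to have something to say academically.

I digress again; back to the question of finding a topic.

Whenever finding a topic comes up, someone immediately says: read more papers. Reading more is certainly not wrong. But how exactly should you read?

First, look for good papers in the area that interests you. As for what counts as a good paper, papers cited by good papers are generally good. With very rigorous scientists in particular, you only need to read their papers and follow the sources they give for specific points, that is, their citations, and you will find many more papers. You will discover that many of the questions you run into while reading are answered in the corresponding cited papers.

Second, reading alone is not enough. Merely understanding other people's work falls far short, although understanding is the first step and a way of learning from their experience. The key is to find points of inspiration, the places that spark new ideas of your own. This requires giving free rein to association while you read. A good example is the paper on the scaling law of human travel, which I have praised in an earlier post: the authors managed to make a fascinating connection from the circulation of banknotes to the patterns of human travel.

Also, think a lot in everyday life. In A Beautiful Mind there is a scene in which Nash sees birds foraging and is immediately inspired. Of course, such inspiration rests on solid theoretical knowledge, but association and imagination are also very important.

Solid theoretical knowledge, a firm core, is the cornerstone of research. It also helps a great deal in finding a topic. For example, when you see how someone else solved a specific problem, and you happen to be very familiar with a different methodology, you will immediately ask: could this problem also be solved with the method I know?
Category: Uncategorized | 3461 views | 0 comments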
