科学网 (ScienceNet.cn)


Tag: Learning


Related Blog Posts

[Repost] Slides of Machine Learning Summer School
timy 2011-6-17 10:36
From: http://mlss2011.comp.nus.edu.sg/index.php?n=Site.Slides
MLSS 2011 Machine Learning Summer School, 13-17 June 2011, Singapore

Slides:
- Chiranjib Bhattacharyya: Kernel Methods (pdf)
- Wray Buntine: Introduction to Machine Learning (pdf)
- Zoubin Ghahramani: Gaussian Processes; Graphical Model Structure Learning (Part 1 pdf, Part 2 pdf, Part 3 pdf)
- Stephen Gould: Markov Random Fields for Computer Vision (Part 1 pdf, Part 2 pdf, Part 3 pdf)
- Marko Grobelnik: How We Represent Text? ...From Characters to Logic (pptx)
- David Hardoon: Multi-Source Learning: Theory and Application (pdf)
- Mark Johnson: Probabilistic Models for Computational Linguistics (Part 1 pdf, Part 2 pdf, Part 3 pdf)
- Wee Sun Lee: Partially Observable Markov Decision Processes (pdf, pptx)
- Hang Li: Learning to Rank (pdf)
- Sinno Pan and Qiang Yang: Transfer Learning (Part 1 pptx, Part 2 pdf)
- Tomi Silander: Introduction to Graphical Models (pdf)
- Yee Whye Teh: Bayesian Nonparametrics (pdf)
- Ivor Tsang: Feature Selection Using Structural SVM and Its Applications (pdf)
- Max Welling: Learning in Markov Random Fields (pdf, pptx)

Category: Machine Learning | 4265 reads | 0 comments
[Repost] Classical Paper List on ML and NLP
wqfeng 2011-3-25 12:40
Classical Paper List on Machine Learning and Natural Language Processing (from Zhiyuan Liu)

Hidden Markov Models
- Rabiner, L. A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition. (Proceedings of the IEEE, 1989)
- Freitag and McCallum. Information Extraction with HMM Structures Learned by Stochastic Optimization. (AAAI'00)

Maximum Entropy
- Adwait Ratnaparkhi. A Maximum Entropy Model for POS Tagging. (1994)
- A. Berger, S. Della Pietra, and V. Della Pietra. A Maximum Entropy Approach to Natural Language Processing. (CL'96)
- A. Ratnaparkhi. Maximum Entropy Models for Natural Language Ambiguity Resolution. (PhD thesis, University of Pennsylvania, 1998)
- Hai Leong Chieu. A Maximum Entropy Approach to Information Extraction from Semi-Structured and Free Text. (AAAI'02)

MEMM
- McCallum et al. Maximum Entropy Markov Models for Information Extraction and Segmentation. (ICML'00)
- Punyakanok and Roth. The Use of Classifiers in Sequential Inference. (NIPS'01)

Perceptron
- Collins. Discriminative Training Methods for Hidden Markov Models: Theory and Experiments with Perceptron Algorithms. (EMNLP'02)
- Y. Li, K. Bontcheva, and H. Cunningham. Using Uneven-Margins SVM and Perceptron for Information Extraction. (CoNLL'05)

SVM
- Z. Zhang. Weakly-Supervised Relation Classification for Information Extraction. (CIKM'04)
- H. Han et al. Automatic Document Metadata Extraction Using Support Vector Machines. (JCDL'03)
- Aidan Finn and Nicholas Kushmerick. Multi-level Boundary Classification for Information Extraction. (ECML'04)
- Yves Grandvalet and Johnny Mariéthoz. A Probabilistic Interpretation of SVMs with an Application to Unbalanced Classification. (NIPS'05)

CRFs
- J. Lafferty et al. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data. (ICML'01)
- Hanna Wallach. Efficient Training of Conditional Random Fields. (MS thesis, 2002)
- Taskar, B., Abbeel, P., and Koller, D. Discriminative Probabilistic Models for Relational Data. (UAI'02)
- Fei Sha and Fernando Pereira. Shallow Parsing with Conditional Random Fields. (HLT/NAACL'03)
- B. Taskar, C. Guestrin, and D. Koller. Max-Margin Markov Networks. (NIPS'03)
- S. Sarawagi and W. W. Cohen. Semi-Markov Conditional Random Fields for Information Extraction. (NIPS'04)
- Brian Roark et al. Discriminative Language Modeling with Conditional Random Fields and the Perceptron Algorithm. (ACL'04)
- H. M. Wallach. Conditional Random Fields: An Introduction. (2004)
- Kristjansson, T., Culotta, A., Viola, P., and McCallum, A. Interactive Information Extraction with Constrained Conditional Random Fields. (AAAI'04)
- John Lafferty, Xiaojin Zhu, and Yan Liu. Kernel Conditional Random Fields: Representation and Clique Selection. (ICML'04)

Topic Models
- Thomas Hofmann. Probabilistic Latent Semantic Indexing. (SIGIR'99)
- David Blei et al. Latent Dirichlet Allocation. (JMLR'03)
- Thomas L. Griffiths and Mark Steyvers. Finding Scientific Topics. (PNAS'04)

POS Tagging
- J. Kupiec. Robust Part-of-Speech Tagging Using a Hidden Markov Model. (Computer Speech and Language, 1992)
- Hinrich Schütze and Yoram Singer. Part-of-Speech Tagging Using a Variable Memory Markov Model. (ACL'94)
- Adwait Ratnaparkhi. A Maximum Entropy Model for Part-of-Speech Tagging. (EMNLP'96)

Noun Phrase Extraction
- E. Xun, C. Huang, and M. Zhou. A Unified Statistical Model for the Identification of English BaseNP. (ACL'00)

Named Entity Recognition
- Andrew McCallum and Wei Li. Early Results for Named Entity Recognition with Conditional Random Fields, Feature Induction and Web-Enhanced Lexicons. (CoNLL'03)
- Moshe Fresko et al. A Hybrid Approach to NER by MEMM and Manual Rules. (CIKM'05)

Chinese Word Segmentation
- Fuchun Peng et al. Chinese Segmentation and New Word Detection Using Conditional Random Fields. (COLING'04)

Document Data Extraction
- Andrew McCallum, Dayne Freitag, and Fernando Pereira. Maximum Entropy Markov Models for Information Extraction and Segmentation. (ICML'00)
- David Pinto, Andrew McCallum, et al. Table Extraction Using Conditional Random Fields. (SIGIR'03)
- Fuchun Peng and Andrew McCallum. Accurate Information Extraction from Research Papers Using Conditional Random Fields. (HLT-NAACL'04)
- V. Carvalho and W. Cohen. Learning to Extract Signature and Reply Lines from Email. (CEAS'04)
- Jie Tang, Hang Li, Yunbo Cao, and Zhaohui Tang. Email Data Cleaning. (SIGKDD'05)
- P. Viola and M. Narasimhan. Learning to Extract Information from Semi-Structured Text Using a Discriminative Context-Free Grammar. (SIGIR'05)
- Yunhua Hu, Hang Li, Yunbo Cao, Dmitriy Meyerzon, Li Teng, and Qinghua Zheng. Automatic Extraction of Titles from General Documents Using Machine Learning. (Information Processing and Management, 2006)

Web Data Extraction
- Ariadna Quattoni, Michael Collins, and Trevor Darrell. Conditional Random Fields for Object Recognition. (NIPS'04)
- Yunhua Hu, Guomao Xin, Ruihua Song, Guoping Hu, Shuming Shi, Yunbo Cao, and Hang Li. Title Extraction from Bodies of HTML Documents and Its Application to Web Page Retrieval. (SIGIR'05)
- Jun Zhu et al. Mutual Enhancement of Record Detection and Attribute Labeling in Web Data Extraction. (SIGKDD'06)

Event Extraction
- Kiyotaka Uchimoto, Qing Ma, Masaki Murata, Hiromi Ozaku, and Hitoshi Isahara. Named Entity Extraction Based on a Maximum Entropy Model and Transformation Rules. (ACL'00)
- GuoDong Zhou and Jian Su. Named Entity Recognition Using an HMM-Based Chunk Tagger. (ACL'02)
- Hai Leong Chieu and Hwee Tou Ng. Named Entity Recognition: A Maximum Entropy Approach Using Global Information. (COLING'02)
- Wei Li and Andrew McCallum. Rapid Development of Hindi Named Entity Recognition Using Conditional Random Fields and Feature Induction. (ACM Trans. Asian Lang. Inf. Process., 2003)

Question Answering
- Rohini K. Srihari and Wei Li. Information Extraction Supported Question Answering. (TREC'99)
- Eric Nyberg et al. The JAVELIN Question-Answering System at TREC 2003: A Multi-Strategy Approach with Dynamic Planning. (TREC'03)

Natural Language Parsing
- Leonid Peshkin and Avi Pfeffer. Bayesian Information Extraction Network. (IJCAI'03)
- Joon-Ho Lim et al. Semantic Role Labeling Using Maximum Entropy Model. (CoNLL'04)
- Trevor Cohn et al. Semantic Role Labeling with Tree Conditional Random Fields. (CoNLL'05)
- Kristina Toutanova, Aria Haghighi, and Christopher D. Manning. Joint Learning Improves Semantic Role Labeling. (ACL'05)

Shallow Parsing
- Ferran Pla, Antonio Molina, and Natividad Prieto. Improving Text Chunking by Means of Lexical-Contextual Information in Statistical Language Models. (CoNLL'00)
- GuoDong Zhou, Jian Su, and TongGuan Tey. Hybrid Text Chunking. (CoNLL'00)
- Fei Sha and Fernando Pereira. Shallow Parsing with Conditional Random Fields. (HLT-NAACL'03)

Acknowledgement: Dr. Hang Li, for the original paper list.

Category: Pattern Recognition | 3020 reads | 0 comments
[Repost] Clustering - Encyclopedia of Machine Learning (2010)
timy 2011-2-14 23:47
From: http://www.springerlink.com/content/g37847m78178l645/fulltext.html
Encyclopedia of Machine Learning, Springer Science+Business Media, LLC 2011. 10.1007/978-0-387-30164-8_124. Claude Sammut and Geoffrey I. Webb (eds.)

Clustering

Clustering is a type of unsupervised learning in which the goal is to partition a set of examples into groups called clusters. Intuitively, the examples within a cluster are more similar to each other than to examples from other clusters. In order to measure the similarity between examples, clustering algorithms use various distortion or distance measures. There are two major types of clustering approaches: generative and discriminative. The former assumes a parametric form of the data and tries to find the model parameters that maximize the probability that the data was generated by the chosen model. The latter represents graph-theoretic approaches that compute a similarity matrix defined over the input data.

Cross References: Categorical Data Clustering, Cluster Editing, Cluster Ensembles, Clustering from Data Streams, Constrained Clustering, Consensus Clustering, Correlation Clustering, Cross-Language Document Clustering, Density-Based Clustering, Dirichlet Process, Document Clustering, Evolutionary Clustering, Graph Clustering, k-Means Clustering, k-Medoids Clustering, Model-Based Clustering, Partitional Clustering, Projective Clustering, Sublinear Clustering
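The distance-based view the entry describes can be made concrete with the simplest algorithm from its cross-reference list, k-means, whose distortion measure is squared Euclidean distance. A minimal pure-Python sketch of Lloyd's algorithm (my own toy illustration, not part of the encyclopedia entry):

```python
def kmeans(points, k, iters=20):
    """Lloyd's algorithm on 2-D points: alternate between assigning
    each point to its nearest centroid and recomputing centroids."""
    # Initialize centroids with the first k points (a sketch only;
    # real implementations use random restarts or k-means++).
    centers = list(points[:k])
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        # Assignment step: each point joins the nearest centroid
        # under squared Euclidean distance.
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k),
                    key=lambda c: (p[0] - centers[c][0]) ** 2
                                + (p[1] - centers[c][1]) ** 2)
            clusters[i].append(p)
        # Update step: move each centroid to its cluster's mean.
        for i, cl in enumerate(clusters):
            if cl:
                centers[i] = (sum(p[0] for p in cl) / len(cl),
                              sum(p[1] for p in cl) / len(cl))
    return centers, clusters

# Two well-separated blobs; the algorithm recovers them.
pts = [(0.0, 0.0), (0.1, 0.2), (-0.1, 0.1),
       (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
centers, clusters = kmeans(pts, k=2)
```

This is a discriminative-style method in the entry's terminology; the generative counterpart would fit, say, a Gaussian mixture by maximum likelihood instead.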
Category: Machine Learning | 3319 reads | 2 comments
[Repost] Symposium on Learning Language Models from Multilingual Corpora
timy 2011-1-11 22:48
From: http://www.cs.york.ac.uk/aig/LLMMC/

Symposium on Learning Language Models from Multilingual Corpora (LLMMC)
Part of the AISB 2011 Convention, 4-7 April 2011.

Call for Papers

International organizations like the UN and the EU, news agencies, and companies operating internationally are producing large volumes of texts in different languages. As a result, large publicly-available parallel paragraph- or sentence-aligned corpora have been created for many language pairs, e.g., French-English, Chinese-English or Arabic-English. The multilingual nature of the EU has given rise to many documents available in all or many of its official languages, which have been assembled in multilingual parallel corpora such as Europarl (11 languages, 34-55M words each) and JRC-Acquis (22 languages, 11-22M words each). These parallel corpora have been used, both monolingually and multilingually, for a variety of NLP tasks, including but not limited to machine translation, cross-lingual information retrieval, word sense disambiguation, semantic relation extraction, named entity recognition, POS tagging, and syntactic parsing. With the advent of the Internet, there has also been an explosion in the availability of semi-parallel multilingual online resources like Wikipedia, which have been used for similar tasks and hold great potential for future exploration and research.

In this symposium, we are interested in explicit models, usable and verifiable by humans, which could be used either for translation or for modelling individual languages, e.g., as applied to morphology, where the available translations can help identify word forms of the same lexical entry in a given language, or lexical semantics, where parallel corpora can help extract instances of relations like synonymy and hypernymy, which are essential for building thesauri and ontologies. The main purpose of the symposium will be to gather and disseminate the best ideas in this new area.
Thus, we welcome review and position papers alongside original submissions. A considerable part of this one-day symposium will be dedicated to discussions, to encourage the formation of new collaborations and consortia.

Duration: a one-day symposium.

Important Dates:
- Call for papers: December 13, 2010
- Submissions: January 19, 2011
- Notification: February 14, 2011
- Camera-ready versions: February 28, 2011
- Symposium: April 6, 2011

Organizers:
- Dimitar Kazakov, The University of York, UK (kazakov AT cs DOT york DOT ac DOT uk)
- Preslav Nakov, National University of Singapore, Singapore (preslav DOT nakov AT gmail DOT com)
- Ahmad R. Shahid, The University of York, UK (ahmad AT cs DOT york DOT ac DOT uk)

Program Committee:
- Graeme Blackwood, University of Cambridge, UK
- Phil Blunsom, University of Oxford, UK
- Francis Bond, Nanyang Technological University, Singapore
- Yee-Seng Chan, University of Illinois at Urbana-Champaign, USA
- Daniel Dahlmeier, National University of Singapore, Singapore
- Marc Dymetman, Xerox Research Centre Europe, France
- Andreas Eisele, Directorate-General for Translation, Luxembourg
- Michel Galley, Stanford University, USA
- Kuzman Ganchev, University of Pennsylvania, USA
- Corina R. Girju, University of Illinois at Urbana-Champaign, USA
- Philipp Koehn, University of Edinburgh, UK
- Krista Lagus, Aalto University School of Science and Technology, Finland
- Wei Lu, National University of Singapore, Singapore
- Elena Paskaleva, Bulgarian Academy of Sciences, Bulgaria
- Katerina Pastra, Institute for Language and Speech Processing, Greece
- Khalil Sima'an, University of Amsterdam, The Netherlands
- Ralf Steinberger, Joint Research Centre, Italy
- Joerg Tiedemann, Uppsala University, Sweden
- Marco Turchi, Joint Research Centre, Italy
- Jaakko Väyrynen, Aalto University School of Science and Technology, Finland

Category: Peer Exchange | 3256 reads | 0 comments
learning to rank on graph
petrelli 2010-9-29 11:03
Category: mlj | 45 reads | 0 comments
[Repost] Learning by doing
jianfengmao 2010-9-9 17:27
One must learn by doing the thing; for though you think you know it, you have no certainty until you try. (Sophocles, c. 450 B.C.)
Category: R and Statistics | 1927 reads | 0 comments
A Fast Algorithm for Learning a Ranking Function from Large-Scale Data Sets
petrelli 2010-9-6 17:15
Location: E:\petrelli\study\ML\paper\PAMI

@article{raykar2008fast,
  title={A fast algorithm for learning a ranking function from large-scale data sets},
  author={Raykar, V. C. and Duraiswami, R. and Krishnapuram, B.},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  volume={30},
  number={7},
  pages={1158--1170},
  year={2008},
  publisher={IEEE}
}

Summary: The paper uses the sigmoid function as a smooth approximation to the original loss function and then solves the problem with a conjugate gradient algorithm. Since solving directly is inefficient, the paper uses an erfc-based approximation to obtain a fast algorithm.

Contribution: mainly addresses the preference-ranking problem. I feel the proposed algorithm is still rather limited, and compared with the fastest current algorithms, I suspect they would be no worse than the one proposed here.
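The core idea behind the sigmoid approximation can be sketched in a few lines: replace the non-differentiable 0-1 indicator of a mis-ranked pair with a sigmoid, so the surrogate loss has a gradient. This is my own toy illustration with hypothetical names and plain gradient descent; the paper itself uses conjugate gradient plus the erfc-based fast summation:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def pairwise_loss_and_grad(w, pairs):
    """Smooth surrogate for the number of mis-ranked pairs under a
    linear scorer w.x. `pairs` holds (x_pref, x_other) where x_pref
    should score higher; the step indicator 1[margin < 0] is relaxed
    to sigmoid(-margin)."""
    loss, grad = 0.0, [0.0] * len(w)
    for xp, xo in pairs:
        d = [a - b for a, b in zip(xp, xo)]
        margin = sum(wi * di for wi, di in zip(w, d))
        s = sigmoid(-margin)
        loss += s
        # d/dw sigmoid(-margin) = -s * (1 - s) * d
        for i, di in enumerate(d):
            grad[i] += -s * (1.0 - s) * di
    return loss, grad

# Toy data where the first feature decides the preference.
pairs = [((2.0, 0.0), (1.0, 0.0)), ((3.0, 1.0), (1.0, 1.0))]
w = [0.0, 0.0]
for _ in range(200):
    loss, grad = pairwise_loss_and_grad(w, pairs)
    w = [wi - 0.5 * gi for wi, gi in zip(w, grad)]
```

Driving the surrogate toward 0 pushes every preferred item's score above its alternative's, which is the preference-ranking objective the note refers to.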
Category: Research Notes | 86 reads | 0 comments
Reading The Elements of Statistical Learning
petrelli 2010-9-5 20:29
Chapter 2: Overview of Supervised Learning

2.1 Several common, interchangeable terms: in the statistics literature, inputs are called predictors (the classic name is independent variables); in pattern recognition they are called features. Outputs are called responses (classically, dependent variables).

2.2 Gives the basic definitions of the regression and classification problems.

2.3 Introduces two simple classes of prediction methods, least squares and KNN. The linear decision boundary produced by least squares has low variance but potentially high bias; KNN is wiggly and unstable, i.e., high variance and low bias. This summary is a classic: "A large subset of the most popular techniques in use today are variants of these two simple procedures. In fact 1-nearest-neighbor, the simplest of all, captures a large percentage of the market for low-dimensional problems." The following list describes some ways in which these simple procedures have been enhanced:
- Kernel methods use weights that decrease smoothly to zero with distance from the target point, rather than the effective 0/1 weights used by k-nearest neighbors.
- In high-dimensional spaces the distance kernels are modified to emphasize some variables more than others.
- Local regression fits linear models by locally weighted least squares rather than fitting constants locally.
- Linear models fit to a basis expansion of the original inputs allow arbitrarily complex models.
- Projection pursuit and neural network models consist of sums of non-linearly transformed linear models.

2.4 Theoretical analysis of statistical decision theory. I couldn't get into it and didn't really follow; I'll reread it tomorrow before moving on. Today's reading: pp. 35-43.

2.5 Discusses the problem local methods such as KNN face with high-dimensional features: as the dimension grows, capturing a fraction r of the samples requires an edge length close to 1, which leads to very high variance.

2.6 Organized into statistical models, an introduction to supervised learning, and function approximation: the first gives the general probabilistic model, the second describes fitting a function to training examples, and the third covers common parameter-estimation approaches, choosing the parameters that maximize the objective function.

2.7 Introduces structured regression methods, which can handle certain problems that are otherwise hard to solve.

2.8 A survey of classes of estimators:
2.8.1 Roughness penalties, essentially regularized methods: a penalty term limits the complexity of the function space.
2.8.2 Kernel methods and local regression: kernel functions work much like local-neighborhood methods; the kernel reflects the distance between samples.
2.8.3 Basis functions and dictionary methods: a number of basis functions selected from a dictionary are superposed to form the fitted function. Single-hidden-layer feedforward neural networks, boosting, MARS, and MART all belong to this class.

2.9 Model selection and the bias-variance trade-off: the more complex the model (e.g., the smaller the regularization term), the lower the bias but the higher the variance, giving very low training error but high test error, and vice versa.
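The least-squares/KNN contrast in 2.3 is easy to reproduce. A hypothetical 1-D example (pure Python, my own illustration, not from the book): the linear fit is rigid (low variance, but biased when the truth is nonlinear), while 1-NN reproduces the training data exactly (low bias, high variance):

```python
def fit_least_squares(xs, ys):
    """Simple linear regression y = a + b*x: low variance, but
    biased when the true relationship is nonlinear."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    return lambda x: a + b * x

def knn_predict(xs, ys, x, k):
    """k-nearest-neighbor regression: average of the k closest
    responses. Low bias, high variance (wiggly for small k)."""
    order = sorted(range(len(xs)), key=lambda i: abs(xs[i] - x))
    return sum(ys[i] for i in order[:k]) / k

# Quadratic truth: the linear fit is biased at the boundary,
# while 1-NN interpolates the training points exactly.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [x * x for x in xs]
lin = fit_least_squares(xs, ys)
```

Here `lin(0.0)` comes out negative even though the true value is 0 (bias), while `knn_predict(xs, ys, 0.0, 1)` hits 0 exactly; shrinking or growing k trades the two error sources against each other.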
Figure 2.11. Read up to p. 61, mainly on linear methods for regression: first the basic regression problem, then multiple regression and multiple outputs, then subset selection and forward stepwise/stagewise selection (the difference being that the latter does not adjust the other variables when it updates).

3.4 Shrinkage methods add a regularizer to smooth the fit, since the results of subset selection are discrete. A squared penalty gives ridge regression; an absolute-value penalty gives the lasso. There is also a variant, least angle regression, which is closely related to the lasso; I'll look at it again tomorrow. That covers pp. 61-97.

Addendum: Section 3.3 analyzes the p-norm constraint in the linear regression problem (the book writes q for what I call p here). At p = 1.2 the constraint region looks very similar to the elastic-net penalty, but in fact the former is smooth while the latter is sharp, i.e., non-differentiable (differentiable here meaning infinitely differentiable).

3.4 Least Angle Regression (LAR) is almost identical to the lasso, except that when a nonzero coefficient crosses 0, the corresponding variable is removed from the active set and the direction is recomputed.

3.5 Discusses principal component regression and partial least squares, which can be understood as dimensionality reduction: map the original d-dimensional data down to m dimensions (m < d) and solve there.

3.6 Compares the selection and shrinkage methods; they seem to differ in the direction of optimization chosen.

3.7 Selection and shrinkage for multivariate outputs.

3.8 Further discussion of the lasso and path algorithms: the basic optimization form is loss + penalty, and different choices of loss and penalty account for much of the lasso literature. The section also mentions solving linear programs with the simplex method; I note it here in case I later need to study linear programming and have no idea where to start.

3.9 Analysis of computational cost.

Chapter 4: Linear Methods for Classification

4.1 Introduces methods whose decision boundaries are linear.

4.2 Linear regression of an indicator matrix.

4.3 LDA, linear discriminant analysis: each class is assumed to be multivariate Gaussian; when the class densities are plugged into the posterior class probabilities and the covariance matrices are all taken to be equal, LDA results. The chapter then discusses the variants of LDA and how to compute it.

4.4 p. 137. I'll go over it again tomorrow and finish Chapter 4.
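The ridge/lasso difference in 3.4 is clearest in the orthonormal-design case, where both estimators have closed forms: ridge shrinks the least-squares coefficient proportionally, while the lasso soft-thresholds it and can produce exact zeros, which is why the lasso also performs selection. A small sketch of these standard formulas (my own illustration, not the book's code):

```python
def ridge_1d(b_ols, lam):
    # Orthonormal-design ridge estimate: proportional shrinkage,
    # never exactly zero for a nonzero input.
    return b_ols / (1.0 + lam)

def lasso_1d(b_ols, lam):
    # Orthonormal-design lasso estimate: soft-thresholding,
    # exact zero whenever |b_ols| <= lam.
    if b_ols > lam:
        return b_ols - lam
    if b_ols < -lam:
        return b_ols + lam
    return 0.0

b, lam = 0.5, 1.0
r = ridge_1d(b, lam)   # shrunk but nonzero
l = lasso_1d(b, lam)   # thresholded away entirely
```

This also illustrates the "discrete" complaint about subset selection: best-subset in this setting is a hard threshold (keep or drop), while the lasso interpolates between dropping and keeping.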
Category: Research Notes | 6251 reads | 0 comments
Deep Learning
openmind 2010-7-28 14:56
A few days ago an R friend mentioned deep learning. Since I have long been following neural networks (NN), I googled it today. Has NN, or more precisely ANN, become hot again because of deep learning?

NN has been making waves in AI all along: from the Perceptron in 1960, to the BP algorithm in 1986 (a period during which SVM and other statistical learning theory (SLT) methods also flourished), to 2006, when Hinton and colleagues successfully applied deep learning in DBNs. Now NN is hot again. Without question, it is powerful computing capability that has made DL possible.

So what will NN's next hot topic be? And who will be the second Hinton? See whether you can give your NN a DL makeover too; then the grants, theses, and papers will follow. Heh.

More at: http://deeplearning.net/
Category: Popular Science | 536 reads | 0 comments
International Continuing Medical Education Resource: BMJ Learning
xupeiyang 2010-7-16 06:28
See: http://learning.bmj.com/learning/channel-home.html

Dear Prof Xu Peiyang,

For every course you complete on BMJ Learning you can print a certificate of completion as proof for accreditation. The following courses have been recommended by other primary care doctors on BMJ Learning. If you are not working in primary care, please update your details to ensure you only receive relevant communication.

Welcome to BMJ Learning. BMJ Learning is the world's largest and most trusted independent online learning service for medical professionals. We offer over 500 peer reviewed, evidence based learning modules, and our service is constantly updated. Train and test your knowledge and skills today.

Accreditation of BMJ Learning courses is provided by several international authorities, including DHA, HAAD, EBAC, MMA, CME, RNZCGP, KIMS, and others. Please contact your relevant College or Association for information, or to request that they accredit BMJ Learning if they do not already.
Category: Information Exchange | 3249 reads | 0 comments
[Repost] Unsupervised Feature Learning and Deep Learning
xrtang 2010-7-10 17:10
Browsing the EMNLP 2010 website, I came across the abstract of an upcoming invited talk by Andrew Ng, which discusses feature learning and deep learning. I have recently been worrying about the lack of knowledge bases. The abstract mentions several methods for learning features from unlabeled data; I record them here for later use.

1. Sparse coding. Link: http://www.scholarpedia.org/article/Sparse_coding
2. The ICA algorithm (independent component analysis)
3. Deep belief networks. Link: http://www.scholarpedia.org/article/Deep_belief_networks
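For item 1, the usual sparse-coding formulation finds a code a minimizing 0.5*||x - D*a||^2 + lam*||a||_1 over a fixed dictionary D; one standard solver for this L1 problem is iterative soft-thresholding (ISTA). A toy pure-Python sketch (my own illustration; the dictionary and all names are hypothetical, and learning D itself is a separate alternating step not shown):

```python
def soft(v, t):
    # Soft-thresholding: the proximal operator of t * |.|
    return v - t if v > t else v + t if v < -t else 0.0

def ista(D, x, lam, step=0.1, iters=500):
    """Sparse code for signal x over dictionary D (a list of atom
    columns): gradient step on the squared error, then soft-threshold
    each coefficient (the L1 proximal step)."""
    m, n = len(D), len(x)
    a = [0.0] * m
    for _ in range(iters):
        # Residual r = D a - x
        r = [sum(D[j][i] * a[j] for j in range(m)) - x[i]
             for i in range(n)]
        # Gradient of 0.5*||Da - x||^2 w.r.t. a_j is D_j . r
        g = [sum(D[j][i] * r[i] for i in range(n)) for j in range(m)]
        a = [soft(a[j] - step * g[j], step * lam) for j in range(m)]
    return a

# The signal equals the first atom, so the learned code should be
# sparse, concentrating weight on a[0] and zeroing the rest.
D = [[1.0, 0.0], [0.0, 1.0], [0.7071, 0.7071]]
x = [1.0, 0.0]
a = ista(D, x, lam=0.1)
```

The soft-thresholding step is what produces exact zeros, giving the sparse, feature-like codes the talk abstract is about.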
Category: Uncategorized | 6310 reads | 0 comments
[Repost] Change: March/April 2010: In This Issue
geneculture 2010-5-4 02:43
Frame of Reference: Open Access Starts with You, by Lori A. Goetsch

Federal legislation now requires the deposit of some taxpayer-funded research in open-access repositories, that is, sites where scholarship and research are made freely available over the Internet. The National Institutes of Health's open-access policy requires submission of NIH-funded research to PubMed Central, and there is proposed legislation, the Federal Research Public Access Act of 2009, that extends this requirement to research funded by 11 other federal agencies.

Academic Researchers Speak, by Inger Bergom, Jean Waltman, Louise August and Carol Hollenshead

Non-tenure-track (NTT) research faculty are perhaps the most under-recognized group of academic professionals on our campuses today, despite their increasingly important role within the expanding academic research enterprise. The American Association for the Advancement of Science reports that the amount of federal spending on R&D has more than doubled since 1976. The government now spends about $140 billion yearly on R&D, and approximately $30 billion of this amount goes to universities each year in the form of grants and contracts.

Taking Teaching to (Performance) Task: Linking Pedagogical and Assessment Practices, by Marc Chun

Imagine a typical student taking an average set of courses. She has to complete a laboratory write-up for chemistry, write a research paper for linguistics, finish a problem set for mathematics, cram for a pop quiz in religious studies, and write an essay for her composition class. Her professors almost exclusively lecture (which, it's been said, is a way for information to travel from an instructor's lecture notes to the student's notebook without engaging the brains of either). And somehow she is supposed to not only learn the course content but also develop the critical thinking skills her college touts as central to its mission.

Why Magic Bullets Don't Work, by David F. Feldon

We always tell our students that there are no shortcuts, that important ideas are nuanced, and that recognizing subtle distinctions is an essential critical-thinking skill. Mastery of a discipline, we know, requires careful study and necessarily slow, evolutionary changes in perspective. Then we look around for the latest promising trend in teaching and jump in with both feet, expecting it to transform our students, our courses, and our outcomes. Alternatively, we sniff disdainfully at the current educational fad and proudly stand by the instructional traditions of our disciplines or institutions, secure in our knowledge that the tried and true has a wisdom of its own. This reductive stance is a natural one. As university faculty who work within disciplines, we have each chosen a slice of human knowledge about which we are passionate, and we often settle on the most expedient (but sound) answer to the question of how to teach so that we can move on to the interesting issues and problems that led us to pursue academic careers in the first place. Further, the professional demands on us and the rewards for our work generally do not align with high levels of sustained effort invested in teaching. However, what we tell students about mastering our respective disciplines are the same truths that apply to finding effective instructional strategies: The devil is always in the details, and nuance is critical. Yet in our desire to do right by our students and still invest the bulk of our efforts in teaching content, we put our faith in over-simplified generalizations that never seem to realize the full benefits that they promise. There have been many sweeping statements made regarding the best ways to teach students in the 21st century. Two of the most au courant are "traditional lectures are ineffective" and "internet-based technologies help students learn." There is empirical evidence to support the truth in each of these statements, true, but only if they meet specific parameters, which rarely carry over from their origins in educational research to guide their implementation in practice.

Are lectures bad for learning? When we look beyond the rhetoric surrounding instructional practices to examine data, it turns out that bad lectures do limit students' learning and motivation. However, good lectures can be inspiring and have a positive, even transformative, impact on student outcomes. Given this unenlightening information, the real question becomes: what differentiates a good lecture from a bad one? Good lectures share a number of key properties with any type of effective instruction. They begin by establishing the relevance of the material for students through explicit connections with their goals or interests. They activate prior knowledge by connecting new content with what students already know and understand or problems with which they are currently grappling. They present information in a clear and straightforward manner that does not require disproportionate effort to translate into terms and concepts meaningful to students. They limit the information presented to a small number of core ideas that are thoroughly but not redundantly explained. Studies that systematically control the relevant features of lectures find significant learning benefits for students when these principles are implemented. However, the large-scale correlative studies of instructional format and student achievement that report negative outcomes for lectures do not control for or even ask about the presence or absence of these features. Thus it may be that the negative findings are a more accurate reflection of generally lackluster or ill-informed implementation of this teaching technique than a condemnation of the technique itself.
Of course, simply knowing or even applying these general principles for effective lecturing does not guarantee positive results. Students enter courses with differing backgrounds, levels of prior knowledge, goals, and interests. Given that each of the guidelines above explicitly frames practice in terms of characteristics that vary by learner, the underlying challenge is to find ways to connect with the broadest cross-section of students and find supplemental or alternate means of connecting with those who do not fit that mold. Many instructors succeed at this through the use of assignments that require students to grapple with problems prior to the lecture. Others use clickers to stimulate engagement and structure situations in which the information presented is salient. However, the effective use of such practices involves understanding the students at whom the course is targeted. Is technology good for learning? Both the definitions and the uses of instructional technology are highly varied, so conversations about its benefits and limitations also tend to rely on overly broad generalizations. The two major foci of these discussions currently are game/simulation-based learning and so-called Web 2.0 technologies that allow users to interact with each other via the internet and to contribute content of various types directly to websites. Advocates claim that these applications are important for improving student learning outcomes; they enhance relevance for students by engaging them through the generationally preferred medium of digital media and provide them with opportunities to actively engage with a course's content. While there are indeed instances where such benefits are realized, they are not reflected in comprehensive literature reviews or meta-analyses of the research. There is a simple explanation for this: not all uses of a technology are created equal. 
The key features that drive engagement and learning pertain to the designs that underlie the technology rather than to the technology itself. When games and other digital learning environments are developed in accordance with principles of effective instruction, they achieve positive results. But they do not yield better results than less sophisticated instructional delivery systems that use the same instructional designs. Why? Because the active ingredients that affect students' learning are the same in both cases. One of the most durable descriptions of this phenomenon is Richard E. Clark's grocery truck metaphor: "Media are mere vehicles that deliver instruction but do not influence student achievement any more than the truck that delivers our groceries causes changes in our nutrition" (Clark, 1983, p. 445). What the new media do offer are tools for interacting with instructors, peers, and content in ways that are not affordable or possible otherwise. When these interactions offer opportunities to observe or manipulate information and phenomena in meaningful ways, they can facilitate learning. Generally, the features that are most helpful for students include enabling the representation of concepts at multiple levels of abstraction (e.g., via concrete representation, abstract functional models or mathematical models), providing opportunities for more extensive practice than would otherwise be possible and offering immediate feedback to direct further learning efforts. While they are potentially valuable learning tools, such technologies need to be designed in such a way that they are not confusing or overwhelming for the students who will use them. With any software, there is a learning curve for mastering the interface used to interact with it. To the extent that the interface functions in a standard way, students will be able to draw on previous technology experiences in using it.
However, if it is significantly different from familiar interfaces, they will need to invest substantial effort in mastering its use before getting to content-related learning. The greater the departure from familiar software environments, the steeper the learning curve. Thus the technology itself can act as a learning impediment for students with limited technology backgrounds. It may be the case that the potential learning benefits offered outweigh the cognitive costs, but it should not be assumed without evidence that this will be the case.

The role of cognition

There are two threads linking effective lectures and effective technology use. The first is consideration of what students bring to the table in terms of goals, interests, and prior knowledge. The second is the deliberate management of the opportunities for students to engage with content in order to focus their investment of mental effort on key ideas. In educational research, a powerful framework for considering these factors jointly is cognitive load theory (CLT). CLT operates under the central premise that learners are only capable of attending to a finite amount of information at a given time due to the limited capacity of the working (short-term) memory system. So it is necessary to carefully manage the flow of information with which learners must grapple. It is likely that anyone who has taken an introductory course in educational or cognitive psychology will have heard of George Miller's (1956) magical number that people can only process seven information elements at a time, plus or minus two. However, what many people do not know is that this number is probably a substantial overestimate. Miller obtained his finding by asking people to listen to strings of random numbers and recite them back as accurately as possible. These numbers were not linked to any context, and he assumed that they were ubiquitous placeholders for any type of information that people might need to process. What did not occur to Miller is that people use strings of numbers for many everyday tasks and have developed memory strategies to retain them. Think, for example, of how you remember a telephone number or your social security number; most people group the digits into two or three chunks (e.g., XXX-XXXX or XXX-XX-XXXX). It is these chunks that occupy space in working memory and help to organize the information so that it does not get lost. Subsequent research holds that the upper limit of our short-term memory is actually closer to four information pieces or chunks. Given these tight bandwidth constraints, how do human beings handle any complex task, especially one that has more than four discrete elements? To simplify, we handle the task-relevant information much as we would a phone number: we divide it into meaningful units based on our knowledge of the content and task structure. The more knowledge we have about a task, situation, or content area, the more efficiently and adaptively we are able to map discrete pieces of information onto schemas. These schemas are the abstract representations of our knowledge that serve as integrated templates for rapidly organizing the relevant facets of a situation. With deeper, more meaningful, and more interconnected knowledge, our schemas become more refined, nuanced, and capable of encoding increasing amounts of incoming information as a single chunk. Information that would occupy only one chunk for an advanced learner might be viewed by a novice as several discrete pieces of information.
Cognitive load is conceptualized as the number of separate chunks (schemas) processed concurrently in working memory while learning or performing a task, plus the resources necessary to process the interactions between them. Therefore a given learning task may impose different levels of cognitive load for different individuals based on their levels of relevant prior knowledge. Cognitive load is experienced as mental effort; novices need to invest a great deal of effort to accomplish a task that an expert might be able to handle with virtually none, because they lack sufficiently complex schemas. When cognitive load (the information to be processed) exceeds working memory's capacity to process it, students have substantial difficulties. The most straightforward effect is that they are unable to learn or solve problems. However, other problematic outcomes can also occur. First, students may revert to using older or less effortful approaches to the problem that impose a lighter load on working memory. This means that previously held misconceptions or erroneous approaches may be brought to bear, reinforcing knowledge that is counter to the material they are trying to learn. Second, students may default to pursuing less effortful goals. In other words, they may procrastinate. In such situations, thinking about the whole of a complex task may be so overwhelming that students turn to more manageable activities: checking their email, cleaning their desks, or taking on whatever other chores do not exceed their processing ability. (Rumor has it that faculty have similar experiences.) For this reason, one of the strategies for overcoming procrastination is to reduce the magnitude of a goal by breaking a large task into its component parts and dealing with only one piece at a time. This limits the complexity of the task faced, which reduces the cognitive load it imposes to manageable levels.
Managing cognitive load in teaching

In order to optimize the benefits of instruction, CLT prioritizes available information according to the type of cognitive load it imposes. Intrinsic load represents the inherent complexity of the material to be learned. The higher the number of components and the more those components interact, the greater the intrinsic load of the content. Extraneous load represents information in the instructional environment that occupies working memory space without contributing to comprehension or the successful solving of the problem presented. Germane load is the effort invested in the necessary instructional scaffolding and in learning concepts that facilitate further content learning. In this context, scaffolding refers to the cognitive support of learning that is provided during instruction. Just as a physical scaffold provides temporary support to a building that is under construction, with the intent that it will be removed when the structure is able to support itself, an instructional scaffold provides necessary cognitive assistance for learners until they are able to practice the full task without help. Extensive instruction typically provides multiple levels of support that are removed gradually to facilitate the ongoing development of proficiency. Processing the information provided as scaffolding imposes cognitive load. However, to the extent that it prevents the cognitive overload that would otherwise result for a learner struggling with new material, it is cost beneficial.
Thus, the three driving principles of CLT are:

1) present content to students with appropriate prior knowledge so that the intrinsic load of the material to be learned does not occupy all the available working memory,
2) eliminate extraneous load, and
3) judiciously impose germane load to support learning.

For any instructional situation, the goal is to ensure that intrinsic, extraneous, and germane load combined do not exceed working memory capacity. But how can we manage this? Although we do not control the innate complexity of the material we teach, we can assess the prior knowledge of our students to ensure they understand prerequisite concepts. If they have schemas in place to facilitate the processing of the new concept, their intrinsic load is lower than if they need to grapple with every nuance of the material without the benefit of appropriate chunking strategies. This is an opportunity to effectively use technology. The use of clickers during lectures or short online assessments to be completed prior to attending class can provide a quick picture of which necessary elements students have in place before a new concept is introduced. If they lack the prerequisite knowledge, then the instructor should teach or provide that material first in order to prevent the advanced material from exceeding students' ability to process it. The good news about extraneous load is that it should be eliminated whenever possible rather than managed. In fact, there are a number of simple and straightforward principles for doing so in instructional materials as well as in the classroom. Some have to do with the information presented. For example, ancillary information that is not directly on point should be eliminated. This includes things like biographies of historic figures in science texts when the instructional objective is to teach a theory or procedure.
While it may be an interesting human-interest story to consider whether or not an apple really fell on Newton's head, processing that information detracts from the working memory available to understand gravitational theory or how to solve problems using the law of gravity. Other practices target the presentation of information. For example, it is better to integrate explanatory text into a diagram than to keep it separate, because the cognitive load of mentally integrating the information can be avoided when they are collocated. On the other hand, reading aloud the text that students are looking at forces redundant processing of the same information and impedes their ability to retain the material. Because sensory information enters working memory through modality-specific pathways, which themselves have limited bandwidth, it is helpful for information to be distributed across modalities wherever possible. It is also helpful for all necessary information to enter working memory at approximately the same time. Thus, the first example uses linguistic and visual information together, which distributes the information across modalities and avoids the unnecessary load of holding the information from the diagram in working memory while searching for the appropriate text or vice versa. In contrast, the second example overloads the pathway that handles verbal information because it simultaneously delivers read and spoken information. It also requires that information from the text be held in working memory while the speech is processed, because people typically read to themselves much more quickly than words are read aloud. Germane load is a highly complicated issue. Building scaffolds for learning imposes cognitive load. Novices being introduced to material for the first time need a great deal of explicit instruction, using very small chunks of information, to deeply process new information or problem-solving strategies. 
As they acquire more knowledge and skill, though, the external scaffolding which initially helped them becomes unnecessary and redundant. If such learning supports are not eliminated for those students, they cease to facilitate learning as germane load and begin to hinder it as extraneous load. This expertise reversal effect is the biggest challenge for developing effective instruction, because students do not all attain the same level of comprehension at the same time. What is germane and helpful load for one student may be extraneous and harmful for another.

Effective Practices

The keys to applying cognitive load theory effectively in a course are advance planning and the ongoing monitoring of students' progress. Because the central premise of CLT is to optimize the allocation of students' working memory resources for mastering particular information, it is vital to identify very specifically what the instructional objectives are for the course as a whole and for each class meeting or module. If we cannot be precise about what we want students to know and be able to do, we will not be able to structure their experiences to help them accomplish this. Next, we need to sequence the objectives so as to present material in the order in which it is needed. If some topics build on others in the course, the prerequisite pieces should be taught before they are needed. For example, we should teach processes and procedures in the same sequence that students will perform them, so that work products from preceding steps can be used in subsequent steps. If the concepts, knowledge, or skills being taught do not have an inherent sequence, then it is generally most effective to order them from simplest to most complex. Once we have figured out what content needs to be taught and the appropriate progression of topics, it is most helpful to students when we let them in on the secret. Trying to impose order on disconnected information is highly effortful.
If we simply turn students loose on the material without presenting clearly what they should be trying to get from it and how it fits into the larger picture of the course's content, much of their cognitive resources will be allocated to figuring out what information is important (extraneous load) rather than focusing on constructing the knowledge necessary to meet our learning objectives. Although the logic of the course content and sequence may be obvious to us as knowledgeable instructors and content experts, our students arrive without the benefit of the schemas we have developed. Regardless of their previous experiences (or perhaps because of them), they sincerely appreciate knowing up front what they will be learning, what is expected of them, how they will be assessed, and how all of these elements fit together. When these components of the course are unclear, students invest substantial effort in figuring them out. Further, they may reach incorrect conclusions, which leads to more extraneous effort as they work at cross purposes to the course. Having mapped out the information in the course, we also need to determine how well students comprehend any knowledge on which later course content depends. This does not mean that we must burden our students (and ourselves) with exams or large assignments every week. Instead, we can use lightweight, rapid assessments that are not formally graded but are attuned to the key concepts upon which the new material draws. These can include short online surveys on the content that must be submitted a few days before class, quick check-in conversations as class begins, or multiple-choice questions on key issues that students must respond to using personal response systems (clickers). These tools are most effective when students are accountable for submitting a response but not for the accuracy of their answers. 
The purpose is to inform the instruction we provide rather than to increase students' anxiety (i.e., emotionally invoked extraneous load) about not knowing a correct answer. If students generally have a strong grasp of the prerequisite material, the likelihood of cognitive overload will be small, less scaffolding will be needed, and they can move directly into problem-solving. But if their understanding is weak, it will be important to review the prior material in detail, structure the new content as much as possible, and move slowly through it. When introducing problem-solving procedures to novices, providing worked examples is a very helpful practice. This involves demonstrating and explaining the reasoning processes that are involved in solving a class of problems, using a representative example. This helps to manage cognitive load effectively in several ways. When a problem is taken on, there are two sources of potential load for a learner. The first is the need to structure the information provided to effectively frame and analyze the problem. The second is the application of appropriate problem-solving strategies. The worked example both demonstrates problem-framing and provides a concrete model of an appropriate problem-solving strategy. This reduces the degree of uncertainty under which the students are working on three fronts. First, it allows them to map concrete instances onto relevant schemas, facilitating effective chunking. Second, it reduces their reliance on highly effortful trial-and-error attempts to identify productive solutions, which substantially increase cognitive load and time spent without providing any learning advantage. Last, it breaks the procedure down into distinguishable steps that can be considered in smaller, more manageable chunks.
After walking through a full example, an excellent way to help students practice without getting overloaded is to provide a partially worked example and ask them to pick up where the completed part of the example leaves off. Having them practice the last steps first ensures that all aspects of the strategy to be learned are practiced. In complex, open-ended problems, students can get off track midway through an exercise and never have the opportunity to practice its later elements. As students become proficient in the later steps, they can be given problems with fewer steps completed for them. In this way, instructors can effectively control the overall level of cognitive load imposed by the problem and ramp up to full problems after students have developed effective schemas and chunking strategies.

Practice makes perfect

As students encounter repeated instances of problem types during their learning, their strategies become more nuanced (to accommodate small differences between the problems) and less effortful to execute. As they practice, their skills require less and less conscious monitoring, which reduces the level of cognitive load that problem-solving imposes. This lets them efficiently address problems of increasing complexity. Experts are able to solve problems beyond the scope of what laymen can handle precisely because their core problem-solving procedures impose virtually no load on working memory. Therefore, they can assimilate very subtle nuances and much more complex problem features with their extra cognitive capacity. The benefits of practice are just as powerful for teachers as they are for students. Teaching effectively and using cognitive load theory to guide practice is challenging. It requires the focused consideration of many details regarding our students, their knowledge, and our instructional goals.
But with sustained effort, careful observations of what seems to yield more efficient and effective learning, and a willingness to make changes as necessary, these practices become less effortful. This frees up our own working memory resources for addressing both the further complexities of our students' learning needs and the subtleties of our own disciplinary passions.

Resources

1. Bernard, R. M., Abrami, P. C., Lou, Y., Borokhovski, E., Wade, A., Wozney, L., Wallet, P. A., Fiset, M. and Huang, B. (2004) How does distance education compare with classroom instruction? A meta-analysis of the empirical literature. Review of Educational Research 74:3, pp. 379-439.
2. Bernard, R. M., Abrami, P. C., Borokhovski, E., Wade, C. A., Tamim, R. M., Surkes, M. A. and Bethel, E. C. (2009) A meta-analysis of three types of interaction treatments in distance education. Review of Educational Research 79:3, pp. 1243-1289.
3. Clark, R. C., Nguyen, F. and Sweller, J. (2005) Efficiency in learning: Evidence-based guidelines to manage cognitive load, John Wiley & Sons, San Francisco.
4. Clark, R. E. (2001) Learning from media: Arguments, analysis, and evidence, Information Age Publishing, Charlotte, NC.
5. Cowan, N. (2000) The magical number 4 in short-term memory: A reconsideration of mental storage capacity. Behavioral and Brain Sciences 24, pp. 87-185.
6. Feldon, D. F. (2007) Cognitive load in the classroom: The double-edged sword of automaticity. Educational Psychologist 42:3, pp. 123-137.
7. Kalyuga, S., Ayres, P., Chandler, P. and Sweller, J. (2003) The expertise reversal effect. Educational Psychologist 38:1, pp. 23-31.
8. Mayer, R. E. (2009) Multimedia learning, 2nd ed., Cambridge University Press, New York.
9. Miller, G. A. (1956) The magical number seven, plus or minus two: Some limits on our capacity for processing information. The Psychological Review 63, pp. 81-97.
10. Schwartz, D. L. and Bransford, J. D. (1998) A time for telling.
Cognition and Instruction 16:4, pp. 475-522.
11. van Merriënboer, J. J. G. and Sweller, J. (2005) Cognitive load theory and complex learning: Recent developments and future directions. Educational Psychology Review 17:2, pp. 147-177.

David Feldon is an assistant professor of STEM education and educational psychology at the University of Virginia. His research examines the development of expertise in science, technology, engineering, and mathematics through a cognitive lens. He also studies the effects of expertise on instructors' abilities to teach effectively within their disciplines. http://www.changemag.org/index.html

Editorial: Motivating Learning by Margaret A. Miller

"Knowing how students learn and solve problems informs us how we should organise their learning environment and without such knowledge, the effectiveness of instructional designs is likely to be random." John Sweller (Instructional Science 32: 9-31, 2004.)

I've written in the past about the things we want students to learn, how we help them learn, and about resistance (mine and virtually everyone else's) to change. In this issue, those concerns converge. Determining what we want students to learn is the amazingly difficult first step in developing assessments of that learning, as the article by Dary Erwin and Joe DeFillippo demonstrates. And Marc Chun talks about linking teaching, learning, assessment, and the ultimate use of higher-order thinking skills by both teaching and assessing those skills through tasks that mimic how they will be used in real life. But what particularly intrigues me is the connection between cognition and change. Educational psychologists have developed a number of constructs to explain how the mind works. In this issue, David Feldon suggests that a familiarity with cognitive load theory can be a big help in developing effective pedagogies, a framework we see invoked, for example, in Carl Wieman's attempts to improve science instruction.
But there is other knowledge about human cognitive architecture that can also be useful as we think about teaching and learning. For instance, the human cognitive default is to solve problems with as small a mental investment as possible; we typically retreat to earlier mental models and quicker and less effortful automated problem-solving strategies when new information threatens to overwhelm us. So as Feldon suggests, teachers need to find some way to keep the investment low enough and the cognitive load light enough that those mechanisms don't come into play. We can also exploit the fact that we're more likely to try to solve problems in areas that are important to us by showing students the relevance of what we're teaching to their lives and concerns. But given the fundamentally conservative nature of human cognition, perhaps the question should be, why doesn't the whole learning system grind to a halt? In a way, it's remarkable that we ever learn anything at all. I remember that when my son was about a year old, he developed the locomotive strategy of scooting around on his knees (it beat crawling, since he could carry things). Once he had built up calluses thick enough to protect those knees, it was a remarkably efficient way to get from point A to point B, and it halved the height from which he would fall if something went wrong. I remember thinking at the time, what will ever motivate him to get up on his hind legs and wobble around when a misstep would cause him to fall from twice the height? What will prompt him, in short, to face the perils of change when things work so well and comfortably for him as they are? Come to think of it, our bipedal walk is a great metaphor for our alternation between imbalance and stability. The act of walking, researchers have discovered, is a continual falling forward, regaining our balance, then falling forward again. 
Something impels us to lift that foot and risk the fall, then we consolidate our new position momentarily, then we lift that foot and fall again, and so on. At the species level, there are clearly advantages in the impulse to generate, test out, and practice both old and new survival strategies (e.g., bipedalism) that can give one an evolutionary edge. But what lies on top of that drive for individual students? How do we motivate them to lift one foot and put it down a little ahead, let us help them organize and consolidate their momentary new equilibrium, and then lift the other? I think the answer can be found by looking not at learning in school but at spontaneous learning, particularly during play. When they play, children seem to be motivated by several things. Curiosity, for one. Another stimulus is wanting to master the environment (a bone-deep tendency, crucial to the human race's survival, that is as dangerous as fire when out of control but just as life-giving when contained), which is why children need plenty of free play where they make up the rules (as opposed to playing board games or participating in sports). A third stimulus may be the desire to imitate and take one's place among trusted and admired others, either peers or adults. Those tendencies don't need to be lost as one ages, as the success of Elderhostel attests, although Gradgrindian schooling can certainly grind them down. So our job as teachers may be to stand in what Vygotsky called the zone of proximal development, the stage in their cognitive growth that students haven't quite gotten to yet, and beckon them forward into what for them is uncharted but possibly alluring territory (the ending of Huckleberry Finn floats into my mind, where Huck tells Jim that it's time to light out for the territories, or the song by Jacques Brel in which he mentions his childhood longing for le Far West).
We motivate students to make that leap by stimulating their curiosity about the subject; by showing our own passion for it; by lessening the dangers of the move as we, knowing what their current maps look like, show them the path from there to here and how to organize their understanding of the new landscape; and by giving them as much control as possible over the learning environment. But more: I point you to Matt Procino's account (in the Listening to Students in the previous issue) of taking over a class in child development. He modeled for students the very behavior he wanted them to exhibit in life as a result of what they learned in his class by soliciting sometimes uncomfortable feedback as he learned how to teach. Similarly, he had earlier let his Outward Bound students see that he too was afraid of the challenges he was asking them to take on but that they could summon the courage to do so because (see?) he was doing it. From the point of view of the students, an admired other gave them two things to imitate: not only how you scale a cliff but how you deal with the fear of scaling a cliff. People generally can't be dragged or whipped into forward movement; they'll run back to their earlier spot of equilibrium the minute the threat (of bad grades, for instance) stops. I know that I plant my feet stubbornly whenever I feel bullied (leading one professor, who tried to argue me into liking Wordsworth's Michael, a poem I detest to this day, to say to me in exasperation, "Miss Miller, why are you sometimes so dense?") But I'm apt to leap joyfully ahead when beckoned by someone I trust and admire into knowledge that he or she is passionate about. And I want to be among the people who inhabit that new zone. That's why, at the end of a successful dissertation defense, I always say to the newly minted PhD, "Welcome to the community of scholars."
个人分类: 高等教育学|149 次阅读|0 个评论
Figurative Memorization of English Words (2)
Bernie 2010-2-3 12:40
Figuration of English Words in Appearance and Sound

Introduction

The idea had been in my mind for more than 20 years, applied to example words, before I published it on my blog on 22nd Sept. 2007. The list of letter meanings is the central body of the idea. Each of the 26 letters has its own meaning, based either on its appearance or on a derivation summarized from thousands of words. However, the meaning of a letter is flexible: the letter O, for example, can look like a round, blurry puzzle, but also like a circular area, or like a ring that links two parts into a new word. We need this flexibility to figure out the meanings of millions of words individually. The list also gives meanings for only some two-letter groups, because the meanings of the other groups have already been defined in common dictionaries (as words) or in textbooks (as affixes). The meaning of each group is flexible in a typical way: one meaning comes from the list, and the other comes from the meanings of its constituent letters. "Cu", for example, means "cumulate" in the list; it can also mean "cut down", because C means "cut" and U means "down" in the list. It is generally not necessary to define meanings for letter groups of more than three letters, because you can find their meanings in a dictionary. The idea is that you hold a copy of the list in your hand and figure out the meaning of any word through your own story of figuration. After some days you will no longer need the list, apart from a few occasional checks, because it is easy to remember. You should know that the meanings of consonant letters play a more important role than those of vowels: a vowel usually acts as an emphasis on the meaning of the consonant or consonants in front of it, without much meaning of its own; i.e., a vowel with its preceding consonant or consonants usually carries only one meaning. A very important skill is breaking a word into letter groups.
A consonant with the vowel that follows it should form one group, but how to break up clusters of consonants and vowels, and even how to separate sub-word groups, is optional. There are no rules, only skills. Sometimes you need to add or remove a letter in a group to explain a word, because people did the same when they created words, for brevity or good looks. For example: dispirit = dis + spirit; distress = dis + stress; account = a + count; and applause = a + plause. Let us begin with the word "love". Why does "love" consist of these four specific letters? Why does love mean loading venom, or change, onto your heart or onto others? Is that really how the word "love" was invented many, many years ago? It does not matter, but the figuration may be interesting and help you remember the word. Do you believe the idea is a miracle? You will find that the idea and the list work well for thousands and thousands of words. For many words, your stories of figuration may even be exactly the same as those in textbooks of word origins, with prefixes, suffixes, and stems from Greek, Latin, and European vernaculars including Old English, or in a textbook of etymology. The principle is that every word came to its present meaning through some process of formation or figuration; shall we try to find it? History has lost the figuration; shall we discuss it and agree on a definition now? The idea is intended to give people a way to remember words easily and enjoyably. And who cares whether your stories of figuration are historically true, provided they help you remember words and are interesting? Your word stories may not be the same as mine, or you may be surprised to find that yours are exactly the same as mine. Most importantly, if you share them with people, you may find that you are special, even a genius.
If you compare a friend's stories of figuration with your own, you may find that your friend is special, and a friend of yours by nature and personality. Exchanging stories about a specific word or a group of words would be an interesting, confidential game within an intimate circle of friends. Through this learning and practice of figuration, you will remember words in the way that suits you best. A word will no longer look hard and tedious, but active, affecting, and interesting. Everyone can become a writer, authoring a personal word storybook like a diary, and it may be published someday. If you would like to join my work, especially if you are a native English speaker or a language specialist, and would like to help me or join me in publishing a book of these word stories in the form of a dictionary, please feel free to contact me by email at ypzong@mail.neu.edu.cn or leave feedback on my blog. Thank you!
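The basic grouping heuristic described above, where each consonant or consonant cluster attaches to the vowel that follows it, can be sketched in code. This is only a rough illustration of one possible segmentation under that single rule; the function name and its handling of leftover consonants are my own assumptions, since the post stresses that grouping is ultimately a matter of skill rather than fixed rules.

```python
def split_into_groups(word):
    """Split a word into letter groups: each run of consonants is
    attached to the vowel that immediately follows it, and any
    trailing consonants form a final group of their own.

    This is one mechanical approximation of the flexible,
    skill-based grouping the author describes."""
    vowels = set("aeiou")
    groups = []
    current = ""
    for ch in word.lower():
        current += ch
        if ch in vowels:        # a vowel closes the current group
            groups.append(current)
            current = ""
    if current:                 # leftover consonants at the end
        groups.append(current)
    return groups

# "love" breaks into "lo" + "ve", matching the l-o-v-e discussion above.
print(split_into_groups("love"))    # ['lo', 've']
print(split_into_groups("spirit"))  # ['spi', 'ri', 't']
```

Words such as "account", where the rule yields a lone vowel group, show why the author treats segmentation as optional and skill-based rather than mechanical.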
[Repost] Ontology Construction
timy 2010-2-3 00:59
From: http://cgi.cse.unsw.edu.au/~handbookofnlp/index.php?n=Chapter16.Chapter16 Ontology Construction Philipp Cimiano, Paul Buitelaar and Johanna Völker In this chapter we provide an overview of the current state-of-the-art in ontology construction with an emphasis on NLP-related issues such as text-driven ontology engineering and ontology learning methods. In order to put these methods into the broader perspective of knowledge engineering applications of this work, we also present a discussion of ontology research itself, in its philosophical origins and historic background as well as in terms of methodologies in ontology engineering. Bibtex Citation @incollection{Cimiano-handbook10, author = {Philipp Cimiano and Paul Buitelaar and Johanna V\"{o}lker}, title = {Ontology Construction}, booktitle = {Handbook of Natural Language Processing, Second Edition}, editor = {Nitin Indurkhya and Fred J. Damerau}, publisher = {CRC Press, Taylor and Francis Group}, address = {Boca Raton, FL}, year = {2010}, note = {ISBN 978-1420085921} } Online Resources Ontologies General and upper-level ontologies CYC http://www.opencyc.org DOLCE http://www.loa-cnr.it/DOLCE.html SUMO http://www.ontologyportal.org Linguistic ontologies OntoWordnet http://www.loa-cnr.it/Papers/ODBASE-WORDNET.pdf Swinto, LingInfo Domain-specific ontologies (publicly available) in some example domains Biomedical Foundational Model of Anatomy http://sig.biostr.washington.edu/projects/fm/AboutFM.html Gene Ontology http://www.geneontology.org Repository of biomedical ontologies http://bioportal.bioontology.org Business/Financial XBRL ontology http://xbrlontology.com Geography Geonames ontology http://www.geonames.org/ontology/ Ontology Repositories and Search Engines Swoogle http://swoogle.umbc.edu/ Watson http://watson.kmi.open.ac.uk OntoSelect http://olp.dfki.de/ontoselect/ Oyster Ontology Development Ontology Development 101 http://ksl.stanford.edu/people/dlm/papers/ontology-tutorial-noy-mcguinness-abstract.html
Ontology Editors Protégé http://protege.stanford.edu NeOn Toolkit http://www.neon-toolkit.org Swoop http://www.mindswap.org/2004/SWOOP/ TopBraid Composer Ontology Engineering Methodologies DILIGENT http://semanticweb.org/wiki/DILIGENT HCOME http://semanticweb.org/wiki/HCOME METHONTOLOGY http://semanticweb.org/wiki/METHONTOLOGY OTK methodology http://semanticweb.org/wiki/OTK_methodology Ontology Learning Tools Text2Onto http://www.neon-toolkit.org/wiki/index.php/Text2Onto OntoLearn http://lcl.di.uniroma1.it/tools.jsp OntoLT http://olp.dfki.de/OntoLT/OntoLT.htm
[Repost][JMIR] Learning in a Virtual World: Experience With Using Second Life for Medical Education
xupeiyang 2010-1-26 08:12
Journal of Medical Internet Research (JMIR), Volume 12 (2010)
Impact Factor (2008): 3.6 — ranked first (#1/20) in the Medical Informatics category and second (#2/62) in the Health Services Research category
http://www.jmir.org/2010
Content Alert, 25 Jan 2010

UPCOMING ISSUE: Volume 12, Issue 1 — http://www.jmir.org/2010/1
The following article has just been published in the upcoming JMIR issue (Volume 12 / Issue 1); articles are still being added for this issue.

Original Paper

Learning in a Virtual World: Experience With Using Second Life for Medical Education
John Wiecha, Robin Heyden, Elliot Sternthal, Mario Merialdi
J Med Internet Res 2010 (Jan 23); 12(1):e1
HTML (open access): http://www.jmir.org/2010/1/e1/
PDF (members only): http://www.jmir.org/2010/1/e1/PDF

Background: Virtual worlds are rapidly becoming part of the educational technology landscape, and Second Life (SL) is one of the best known of these environments. Although the potential of SL for health professions education has been noted, a search of the world's literature and of the World Wide Web revealed a limited number of formal applications of SL for this purpose and minimal evaluation of educational outcomes. Similarly, the use of virtual worlds for continuing health professional development appears to be largely unreported.

Objective: Our objectives were to (1) explore the potential of a virtual world for delivering continuing medical education (CME) designed for postgraduate physicians; (2) determine possible instructional designs for using SL for CME; (3) understand the limitations of SL for CME; (4) understand the barriers, solutions, and costs associated with using SL, including required training; and (5) measure participant learning outcomes and feedback.

Methods: We designed and delivered a pilot postgraduate medical education program in the virtual world Second Life. We trained and enrolled 14 primary care physicians in an hour-long, highly interactive event in SL on the topic of type 2 diabetes. Participants completed surveys to measure change in confidence and performance on test cases to assess learning. The post survey also assessed participants' attitudes toward the virtual learning environment.

Results: Of the 14 participating physicians, 12 rated the course experience, 10 completed the pre and post confidence surveys, and 10 completed both the pre and post case studies. On a seven-point Likert scale (1 = strongly disagree to 7 = strongly agree), participants' mean reported confidence increased from pre to post SL event with respect to selecting insulin for patients with type 2 diabetes (pre = 4.9 to post = 6.5, P = .002), initiating insulin (pre = 5.0 to post = 6.2, P = .02), and adjusting insulin dosing (pre = 5.2 to post = 6.2, P = .02). On test cases, the percentage of participants providing a correct insulin initiation plan increased from 60% (6 of 10) pre to 90% (9 of 10) post (P = .2), and the percentage providing correct initiation of mealtime insulin increased from 40% (4 of 10) pre to 80% (8 of 10) post (P = .09). All participants (12 of 12) agreed that this experience in SL was an effective method of medical education, that the virtual world approach to CME was superior to other methods of online CME, that they would enroll in another such event in SL, and that they would recommend that their colleagues participate in an SL CME course. Only 17% (2 of 12) disagreed with the statement that this method of CME in SL is superior to face-to-face CME.

Conclusions: The results of this pilot suggest that virtual worlds offer the potential of a new medical education pedagogy to enhance learning outcomes beyond that provided by more traditional online or face-to-face postgraduate professional development activities. Obvious potential exists for application of these methods at the medical school and residency levels as well.
List of Conferences and Workshops Where Transfer Learning Papers Appear
timy 2009-11-6 10:49
From: http://www.cse.ust.hk/~sinnopan/conferenceTL.htm

List of Conferences and Workshops Where Transfer Learning Papers Appear
This webpage will be updated regularly.

Main Conferences

Machine Learning and Artificial Intelligence Conferences

AAAI
2008
- Transfer Learning via Dimensionality Reduction
- Transferring Localization Models across Space
- Transferring Localization Models over Time
- Transferring Multi-device Localization Models using Latent Multi-task Learning
- Text Categorization with Knowledge Transfer from Heterogeneous Data Sources
- Zero-data Learning of New Tasks
2007
- Transferring Naive Bayes Classifiers for Text Classification
- Mapping and Revising Markov Logic Networks for Transfer Learning
- Measuring the Level of Transfer Learning by an AP Physics Problem-Solver
2006
- Using Homomorphisms to Transfer Options across Continuous Reinforcement Learning Domains
- Value-Function-Based Transfer for Reinforcement Learning Using Structure Mapping

IJCAI
2009
- Transfer Learning Using Task-Level Features with Application to Information Retrieval
- Transfer Learning from Minimal Target Data by Mapping across Relational Domains
- Domain Adaptation via Transfer Component Analysis
- Knowledge Transfer on Hybrid Graph
- Manifold Alignment without Correspondence
- Robust Distance Metric Learning with Auxiliary Knowledge
- Can Movies and Books Collaborate? Cross-Domain Collaborative Filtering for Sparsity Reduction
- Exponential Family Sparse Coding with Application to Self-taught Learning
2007
- Learning and Transferring Action Schemas
- General Game Learning Using Knowledge Transfer
- Building Portable Options: Skill Transfer in Reinforcement Learning
- Transfer Learning in Real-Time Strategy Games Using Hybrid CBR/RL
- An Experts Algorithm for Transfer Learning
- Transferring Learned Control-Knowledge between Planners
- Effective Control Knowledge Transfer through Learning Skill and Representation Hierarchies
- Efficient Bayesian Task-Level Transfer Learning

ICML
2009
- Deep Transfer via Second-Order Markov Logic
- Feature Hashing for Large Scale Multitask Learning
- A Convex Formulation for Learning Shared Structures from Multiple Tasks
- EigenTransfer: A Unified Framework for Transfer Learning
- Domain Adaptation from Multiple Sources via Auxiliary Classifiers
- Transfer Learning for Collaborative Filtering via a Rating-Matrix Generative Model
2008
- Bayesian Multiple Instance Learning: Automatic Feature Selection and Inductive Transfer
- Multi-Task Learning for HIV Therapy Screening
- Self-taught Clustering
- Manifold Alignment using Procrustes Analysis
- Automatic Discovery and Transfer of MAXQ Hierarchies
- Transfer of Samples in Batch Reinforcement Learning
- Hierarchical Kernel Stick-Breaking Process for Multi-Task Image Analysis
- Multi-Task Compressive Sensing with Dirichlet Process Priors
- A Unified Architecture for Natural Language Processing: Deep Neural Networks with Multitask Learning
2007
- Boosting for Transfer Learning
- Self-taught Learning: Transfer Learning from Unlabeled Data
- Robust Multi-Task Learning with t-Processes
- Multi-Task Learning for Sequential Data via iHMMs and the Nested Dirichlet Process
- Cross-Domain Transfer for Reinforcement Learning
- Learning a Meta-Level Prior for Feature Relevance from Multiple Related Tasks
- Multi-Task Reinforcement Learning: A Hierarchical Bayesian Approach
- The Matrix Stick-Breaking Process for Flexible Multi-Task Learning
- Asymptotic Bayesian Generalization Error When Training and Test Distributions Are Different
- Discriminative Learning for Differing Training and Test Distributions
2006
- Autonomous Shaping: Knowledge Transfer in Reinforcement Learning
- Constructing Informative Priors using Transfer Learning

NIPS
2008
- Clustered Multi-Task Learning: A Convex Formulation
- Multi-task Gaussian Process Learning of Robot Inverse Dynamics
- Transfer Learning by Distribution Matching for Targeted Advertising
- Translated Learning: Transfer Learning across Different Feature Spaces
- An Empirical Analysis of Domain Adaptation Algorithms for Genomic Sequence Analysis
- Domain Adaptation with Multiple Sources
2007
- Learning Bounds for Domain Adaptation
- Transfer Learning using Kolmogorov Complexity: Basic Theory and Empirical Evaluations
- A Spectral Regularization Framework for Multi-Task Structure Learning
- Multi-task Gaussian Process Prediction
- Semi-Supervised Multitask Learning
- Gaussian Process Models for Link Analysis and Transfer Learning
- Multi-Task Learning via Conic Programming
- Direct Importance Estimation with Model Selection and Its Application to Covariate Shift Adaptation
2006
- Correcting Sample Selection Bias by Unlabeled Data
- Dirichlet-Enhanced Spam Filtering based on Biased Samples
- Analysis of Representations for Domain Adaptation
- Multi-Task Feature Learning

AISTATS
2009
- A Hierarchical Nonparametric Bayesian Approach to Statistical Language Model Domain Adaptation
2007
- Kernel Multi-task Learning using Task-specific Features
- Inductive Transfer for Bayesian Network Structure Learning

ECML/PKDD
2009
- Relaxed Transfer of Different Classes via Spectral Partition
- Feature Selection by Transfer Learning with Linear Regularized Models
- Semi-Supervised Multi-Task Regression
2008
- Actively Transfer Domain Knowledge
- An Algorithm for Transfer Learning in a Heterogeneous Environment
- Transferred Dimensionality Reduction
- Modeling Transfer Relationships between Learning Tasks for Improved Inductive Transfer
- Kernel-Based Inductive Transfer
2007
- Graph-Based Domain Mapping for Transfer Learning in General Games
- Bridged Refinement for Transfer Learning
- Transfer Learning in Reinforcement Learning Problems Through Partial Policy Recycling
- Domain Adaptation of Conditional Probability Models via Feature Subsetting
2006
- Skill Acquisition via Transfer Learning and Advice Taking

COLT
2009
- Online Multi-task Learning with Hard Constraints
- Taking Advantage of Sparsity in Multi-Task Learning
- Domain Adaptation: Learning Bounds and Algorithms
2008
- Learning Coordinate Gradients with Multi-task Kernels
- Linear Algorithms for Online Multitask Classification
2007
- Multitask Learning with Expert Advice
2006
- Online Multitask Learning

UAI
2009
- Bayesian Multitask Learning with Latent Hierarchies
- Multi-Task Feature Learning Via Efficient L2,1-Norm Minimization
2008
- Convex Point Estimation using Undirected Bayesian Transfer Hierarchies

Data Mining Conferences

KDD
2009
- Cross Domain Distribution Adaptation via Kernel Mapping
- Extracting Discriminative Concepts for Domain Adaptation in Text Mining
2008
- Spectral Domain-Transfer Learning
- Knowledge Transfer via Multiple Model Local Structure Mapping
2007
- Co-clustering based Classification for Out-of-domain Documents
2006
- Reverse Testing: An Efficient Framework to Select Amongst Classifiers under Sample Selection Bias

ICDM
2008
- Unsupervised Cross-domain Learning by Interaction Information Co-clustering
- Using Wikipedia for Co-clustering Based Cross-domain Text Classification

SDM
2008
- Type-Independent Correction of Sample Selection Bias via Structural Discovery and Re-balancing
- Direct Density Ratio Estimation for Large-scale Covariate Shift Adaptation
2007
- On Sample Selection Bias and Its Efficient Correction via Model Averaging and Unlabeled Examples
- Probabilistic Joint Feature Selection for Multi-task Learning

Application Conferences

SIGIR
2009
- Mining Employment Market via Text Block Detection and Adaptive Cross-Domain Information Extraction
- Knowledge Transformation for Cross-Domain Sentiment Classification
2008
- Topic-bridged PLSA for Cross-Domain Text Classification
2007
- Cross-Lingual Query Suggestion Using Query Logs of Different Languages
2006
- Tackling Concept Drift by Temporal Inductive Transfer
- Constructing Informative Prior Distributions from Domain Knowledge in Text Classification
- Building Bridges for Web Query Classification

WWW
2009
- Latent Space Domain Transfer between High Dimensional Overlapping Distributions
2008
- Can Chinese Web Pages be Classified with English Data Source?

ACL
2009
- Transfer Learning, Feature Selection and Word Sense Disambiguation
- Graph Ranking for Sentiment Transfer
- Multi-Task Transfer Learning for Weakly-Supervised Relation Extraction
- Cross-Domain Dependency Parsing Using a Deep Linguistic Grammar
- Heterogeneous Transfer Learning for Image Clustering via the Social Web
2008
- Exploiting Feature Hierarchy for Transfer Learning in Named Entity Recognition
- Multi-domain Sentiment Classification
- Active Sample Selection for Named Entity Transliteration
- Mining Wiki Resources for Multilingual Named Entity Recognition
- Multi-Task Active Learning for Linguistic Annotations
2007
- Domain Adaptation with Active Learning for Word Sense Disambiguation
- Frustratingly Easy Domain Adaptation
- Instance Weighting for Domain Adaptation in NLP
- Biographies, Bollywood, Boom-boxes and Blenders: Domain Adaptation for Sentiment Classification
- Self-Training for Enhancement and Domain Adaptation of Statistical Parsers Trained on Small Datasets
2006
- Estimating Class Priors in Domain Adaptation for Word Sense Disambiguation
- Simultaneous English-Japanese Spoken Language Translation Based on Incremental Dependency Parsing and Transfer

CVPR
2009
- Domain Transfer SVM for Video Concept Detection
- Boosted Multi-Task Learning for Face Verification With Applications to Web Image and Video Search
2008
- Transfer Learning for Image Classification with Sparse Prototype Representations

Workshops
- NIPS 2005 Workshop - Inductive Transfer: 10 Years Later
- NIPS 2005 Workshop - Interclass Transfer
- NIPS 2006 Workshop - Learning when test and training inputs have different distributions
- AAAI 2008 Workshop - Transfer Learning for Complex Tasks
[Repost] Transfer Learning (迁移学习)
timy 2009-11-6 09:46
Reposted from: http://apex.sjtu.edu.cn/apex_wiki/Transfer%20Learning

Transfer Learning (迁移学习)
Gui-Rong Xue (薛贵荣)

In the traditional machine learning framework, the learning task is to learn a classification model from sufficient training data and then use that model to classify and make predictions on test documents. However, machine learning algorithms face a key problem in current Web mining research: large amounts of training data are very hard to obtain in newly emerging domains. Web applications evolve rapidly, and new domains keep appearing, from traditional news to web pages, to images, to blogs and podcasts, and so on. Traditional machine learning would require labeling large amounts of training data for every domain, at enormous cost in human effort; without such labeled data, much learning-related research and many applications cannot proceed. Moreover, traditional machine learning assumes that training and test data follow the same distribution. In many situations this assumption does not hold, for example when the training data become outdated. This often forces us to re-label large amounts of training data, which is very expensive. From another angle, if we already possess large amounts of training data drawn from different distributions, discarding them entirely is also very wasteful. How to exploit such data sensibly is the main problem transfer learning addresses: it transfers knowledge from existing data to help future learning. The goal of transfer learning is to apply knowledge learned in one environment to learning tasks in a new environment; transfer learning therefore does not make the identical-distribution assumption of traditional machine learning.

Our work on transfer learning currently comprises three parts: instance-based transfer learning in a homogeneous feature space, feature-based transfer learning in a homogeneous feature space, and transfer learning across heterogeneous feature spaces. Our research indicates that instance-based transfer offers stronger knowledge transfer, feature-based transfer offers broader knowledge transfer, and heterogeneous-space transfer offers wide learning and extension ability; each approach has its own strengths.

1. Instance-based transfer learning in a homogeneous space

The basic idea of instance-based transfer learning is that although the auxiliary training data differ more or less from the source training data, some portion of the auxiliary data should still be suitable for training an effective classification model that fits the test data. The goal is therefore to identify the auxiliary instances that fit the test data and transfer them into learning on the source training data. In this direction, we generalized the classical AdaBoost algorithm into a boosting algorithm with transfer ability, TrAdaBoost, which exploits the auxiliary training data as much as possible to help classification on the target. The key idea is to use boosting to filter out the auxiliary instances least similar to the source training data: boosting provides an automatic weight-adjustment mechanism under which important auxiliary instances gain weight and unimportant ones lose weight. After reweighting, the auxiliary data serve as additional training data and, together with the source training data, improve the reliability of the classification model.

Instance-based transfer works only when the source and auxiliary data are quite similar; when they differ substantially, instance-based algorithms often cannot find transferable knowledge. We found, however, that even when the source and target data share no common knowledge at the instance level, they may still intersect at the feature level. We therefore studied feature-based transfer learning, which asks how to learn from knowledge shared at the feature level.

2. Feature-based transfer learning in a homogeneous space

For feature-based transfer learning we proposed several algorithms, including the CoCC algorithm, the TPLSA algorithm, a spectral analysis algorithm, and a self-taught learning algorithm. One of them uses co-clustering to produce a common feature representation that helps the learning algorithm. The basic idea is to co-cluster the source and auxiliary data simultaneously, obtaining a shared feature representation that is better than a representation based on the source data alone; representing the source data in this new space realizes the transfer. Applying this idea, we proposed both feature-based supervised transfer learning and feature-based unsupervised transfer learning.

2.1 Feature-based supervised transfer learning

Our work on feature-based supervised transfer learning is cross-domain classification based on co-clustering. It considers the following problem: given a new, different domain in which labeled data are extremely scarce, how can the abundant labeled data in an existing domain be used for transfer learning? In this work we define a unified information-theoretic formulation of cross-domain classification, in which the co-clustering-based classification problem becomes the optimization of an objective function, defined as the loss of mutual information among the source instances, the common feature space, and the auxiliary instances.

2.2 Feature-based unsupervised transfer learning: self-taught clustering

Our self-taught clustering algorithm belongs to feature-based unsupervised transfer learning. The problem considered here is: in practice even labeled auxiliary data may be hard to obtain; how can large amounts of unlabeled auxiliary data support transfer learning? The basic idea of self-taught clustering is to cluster the source and auxiliary data simultaneously to obtain a common feature representation; because this new representation draws on a large amount of auxiliary data, it is better than a representation derived from the source data alone, and it therefore helps the clustering.

The two strategies above (feature-based supervised and unsupervised transfer learning) both address feature-based transfer where the source and auxiliary data lie in the same feature space. When the source and auxiliary data lie in different feature spaces, we also studied feature-based transfer across feature spaces, another member of the feature-based family.

3. Transfer learning across heterogeneous spaces: translated learning

Our proposed translated learning addresses the case where the source data and the test data belong to two different feature spaces. In that work, we used abundant, easily obtained labeled text data to help an image classification problem with only a few labeled images. Our method builds a bridge between the two feature spaces using data that carry both views. Although such two-view data may not be usable as training data for classification, they can be used to construct a translator. Through this translator, we combine a nearest-neighbor method with feature translation, translating the auxiliary data into the source feature space and performing learning and classification with a unified language model.

References:
[1] Wenyuan Dai, Yuqiang Chen, Gui-Rong Xue, Qiang Yang, and Yong Yu. Translated Learning: Transfer Learning across Different Feature Spaces. In Advances in Neural Information Processing Systems 21 (NIPS 2008), Vancouver, British Columbia, Canada, December 8-13, 2008.
[2] Xiao Ling, Wenyuan Dai, Gui-Rong Xue, Qiang Yang, and Yong Yu. Spectral Domain-Transfer Learning. In Proceedings of the Fourteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2008), pages 488-496, Las Vegas, Nevada, USA, August 24-27, 2008.
[3] Wenyuan Dai, Qiang Yang, Gui-Rong Xue, and Yong Yu. Self-taught Clustering. In Proceedings of the Twenty-Fifth International Conference on Machine Learning (ICML 2008), pages 200-207, Helsinki, Finland, July 5-9, 2008.
[4] Gui-Rong Xue, Wenyuan Dai, Qiang Yang, and Yong Yu. Topic-bridged PLSA for Cross-Domain Text Classification. In Proceedings of the Thirty-First International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2008), pages 627-634, Singapore, July 20-24, 2008.
[5] Xiao Ling, Gui-Rong Xue, Wenyuan Dai, Yun Jiang, Qiang Yang, and Yong Yu. Can Chinese Web Pages be Classified with English Data Source? In Proceedings of the Seventeenth International World Wide Web Conference (WWW 2008), pages 969-978, Beijing, China, April 21-25, 2008.
[6] Xiao Ling, Wenyuan Dai, Gui-Rong Xue, and Yong Yu. Knowledge Transferring via Implicit Link Analysis. In Proceedings of the Thirteenth International Conference on Database Systems for Advanced Applications (DASFAA 2008), pages 520-528, New Delhi, India, March 19-22, 2008.
[7] Wenyuan Dai, Gui-Rong Xue, Qiang Yang, and Yong Yu. Co-clustering based Classification for Out-of-domain Documents. In Proceedings of the Thirteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2007), pages 210-219, San Jose, California, USA, August 12-15, 2007.
[8] Wenyuan Dai, Gui-Rong Xue, Qiang Yang, and Yong Yu. Transferring Naive Bayes Classifiers for Text Classification. In Proceedings of the Twenty-Second National Conference on Artificial Intelligence (AAAI 2007), pages 540-545, Vancouver, British Columbia, Canada, July 22-26, 2007.
[9] Wenyuan Dai, Qiang Yang, Gui-Rong Xue, and Yong Yu. Boosting for Transfer Learning. In Proceedings of the Twenty-Fourth International Conference on Machine Learning (ICML 2007), pages 193-200, Corvallis, Oregon, USA, June 20-24, 2007.
[10] Dikan Xing, Wenyuan Dai, Gui-Rong Xue, and Yong Yu. Bridged Refinement for Transfer Learning. In Proceedings of the Eleventh European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD 2007), pages 324-335, Warsaw, Poland, September 17-21, 2007. (Best Student Paper Award)
[11] Xin Zhang, Wenyuan Dai, Gui-Rong Xue, and Yong Yu. Adaptive Email Spam Filtering based on Information Theory. In Proceedings of the Eighth International Conference on Web Information Systems Engineering (WISE 2007), pages 159-170, Nancy, France, December 3-7, 2007.

Transfer Learning (last edited 2009-10-29 03:03:46 by grxue)
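As an illustration of the instance-reweighting idea behind TrAdaBoost (Dai et al., ICML 2007): at each boosting round the weak learner's error is measured only on the small same-distribution (target) set; misclassified target instances are up-weighted as in AdaBoost, while misclassified auxiliary instances are down-weighted so that the least transferable auxiliary data gradually fade out. The following is a minimal NumPy sketch under simplifying assumptions (binary 0/1 labels, a brute-force decision stump as the weak learner); it is not the authors' implementation, and all function names are illustrative.

```python
import numpy as np

def fit_stump(X, y, w):
    """Fit a weighted decision stump by brute force over features and thresholds."""
    best_err, best = np.inf, None
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for pol in (1, -1):
                pred = (pol * (X[:, j] - thr) >= 0).astype(int)
                err = np.sum(w * (pred != y))
                if err < best_err:
                    best_err, best = err, (j, thr, pol)
    j, thr, pol = best
    return lambda Z: (pol * (Z[:, j] - thr) >= 0).astype(int)

def tradaboost(X_aux, y_aux, X_tgt, y_tgt, n_rounds=10):
    """TrAdaBoost-style training (sketch): auxiliary instances are down-weighted
    when misclassified; target instances are up-weighted as in AdaBoost."""
    n, m = len(X_aux), len(X_tgt)
    X = np.vstack([X_aux, X_tgt])
    y = np.concatenate([y_aux, y_tgt]).astype(int)
    w = np.ones(n + m)
    beta = 1.0 / (1.0 + np.sqrt(2.0 * np.log(n) / n_rounds))  # fixed aux rate
    learners, betas_t = [], []
    for _ in range(n_rounds):
        p = w / w.sum()
        h = fit_stump(X, y, p)
        miss = (h(X) != y).astype(float)
        # weighted error measured on the target (same-distribution) part only
        eps = np.clip(np.sum(p[n:] * miss[n:]) / p[n:].sum(), 1e-6, 0.499)
        beta_t = eps / (1.0 - eps)
        w[:n] *= beta ** miss[:n]        # shrink unhelpful auxiliary instances
        w[n:] *= beta_t ** (-miss[n:])   # boost misclassified target instances
        learners.append(h)
        betas_t.append(beta_t)
    return learners, betas_t

def tradaboost_predict(X, learners, betas_t):
    """Vote with the second half of the learners, as in the published algorithm."""
    half = len(learners) // 2
    score = sum(-np.log(b) * h(X) for h, b in zip(learners[half:], betas_t[half:]))
    thresh = sum(-0.5 * np.log(b) for b in betas_t[half:])
    return (score >= thresh).astype(int)
```

Note the asymmetry, which is the heart of the method: the auxiliary weights can only shrink (by the fixed factor `beta`), while the target weights follow the usual AdaBoost update, so boosting itself acts as the filter that removes auxiliary data least similar to the target distribution.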
