Source: http://www.wired.com/wiredenterprise/2013/05/hinton/

Computer Brain Escapes Google’s X Lab to Supercharge Search
BY ROBERT MCMILLAN 05.20.13 6:30 AM

Geoffrey Hinton (right), Alex Krizhevsky, and Ilya Sutskever (left) will do machine-learning work at Google. Photo: U of T

Two years ago Stanford professor Andrew Ng joined Google’s X Lab, the research group that’s given us Google Glass and the company’s driverless cars. His mission: to harness Google’s massive data centers and build artificial intelligence systems on an unprecedented scale. He ended up working with one of Google’s top engineers to build the world’s largest neural network: a kind of computer brain that can learn about reality in much the same way that the human brain learns new things.

Ng’s brain watched YouTube videos for a week and taught itself which ones were about cats. It did this by breaking down the videos into a billion different parameters and then teaching itself how all the pieces fit together. But there was more. Ng built models for processing the human voice and Google StreetView images. The company quickly recognized this work’s potential and shuffled it out of X Lab and into the Google Knowledge Team. Now this type of machine intelligence, called deep learning, could shake up everything from Google Glass, to Google Image Search, to the company’s flagship search engine.

It’s the kind of research that a Stanford academic like Ng could only get done at a company like Google, which spends billions of dollars on supercomputer-sized data centers each year. “At the time I joined Google, the biggest neural network in academia was about 1 million parameters,” remembers Ng. “At Google, we were able to build something one thousand times bigger.”

Ng stuck around until Google was well on its way to using his neural network models to improve a real-world product: its voice recognition software. But last summer, he invited an artificial intelligence pioneer named Geoffrey Hinton to spend a few months in Mountain View tinkering with the company’s algorithms. When Android’s Jelly Bean release came out last year, these algorithms cut its voice recognition error rate by a remarkable 25 percent. In March, Google acquired Hinton’s company.

Now Ng has moved on (he’s running an online education company called Coursera), but Hinton says he wants to take this deep learning work to the next level. A first step will be to build even larger neural networks than the billion-node networks he worked on last year. “I’d quite like to explore neural nets that are a thousand times bigger than that,” Hinton says. “When you get to a trillion, you’re getting to something that’s got a chance of really understanding some stuff.”

Hinton thinks that building neural network models about documents could boost Google Search in much the same way they helped voice recognition. “Being able to take a document and not just view it as, ‘It’s got these various words in it,’ but to actually understand what it’s about and what it means,” he says. “That’s most of AI, if you can solve that.”

Test images labeled by Hinton’s brain. Image: Geoff Hinton

Hinton already has something to build on: Google’s knowledge graph, a database of nearly 600 million entities. When you search for something like “The Empire State Building,” the knowledge graph pops up all of that information to the right of your search results. It tells you that the building is 1,454 feet tall and was designed by William F. Lamb. Google uses the knowledge graph to improve its search results, but Hinton says that neural networks could study the graph itself and then both cull out errors and improve other facts that could be included.

Image search is another promising area. “‘Find me an image with a cat wearing a hat.’ You should be able to do that fairly soon,” Hinton says.

Hinton is the right guy to take on this job. Back in the 1980s he developed the basic computer models used in neural networking. Just two months ago, Google paid an undisclosed sum to acquire Hinton’s artificial intelligence company, DNNresearch, and now he’s splitting his time between his University of Toronto teaching job and working for Jeff Dean on ways to make Google’s products smarter at the company’s Mountain View campus.

In the past five years, there’s been a mini-boom in neural networking as researchers have harnessed the power of graphics processors (GPUs) to build out ever-larger neural networks that can quickly learn from extremely large sets of data. “Until recently… if you wanted to learn to recognize a cat, you had to go and label tens of thousands of pictures of cats,” says Ng. “And it was just a pain to find so many pictures of cats and label them.”

Now with “unsupervised learning algorithms,” like the ones Ng used in his YouTube cat work, the machines can learn without the labeling. But to build the really large neural networks, Google had to first write code that would work on such a large number of machines, even when one of the systems in the network stopped working.

It typically takes a large number of computers sifting through a large amount of data to train the neural network model. The YouTube cat model, for example, was trained on 16,000 chip cores. But once that was hammered out, it took just 100 cores to be able to spot cats on YouTube.

Google’s data centers are based on Intel Xeon processors, but the company has started to tinker with GPUs because they are so much more efficient at this neural network processing work, Hinton says. Google is even testing out a D-Wave quantum computer, a system that Hinton hopes to try out in the future. But before then, he aims to test out his trillion-node neural network. “People high up in Google I think are very committed to getting big neural networks to work very well,” he says.
Reading list (updated)

There is too much literature for a reading group to cover everything, so we follow the most representative work of the one or two most important researchers in each area and have selected the papers below for close reading. The tag before each paper gives its type: NB = neurobiological finding, CM = computational model, ML = machine learning algorithm.

Overview: deep architecture in brain and machine (1 session)
Researcher to follow: James DiCarlo
[NB] Chris I. Baker (2004) Visual Processing in the Primate Brain. In Handbook of Psychology, Biological Psychology, Wiley.
[NB][CM] DiCarlo JJ, Zoccolan D, Rust NC. (2012) How does the brain solve visual object recognition? Neuron, 73(3):415-34.
[CM] Cadieu CF, et al. (2013) The Neural Representation Benchmark and its Evaluation on Brain and Machine. International Conference on Learning Representations (ICLR) 2013.

Early visual system (retinal ganglion cell, LGN, V1), canonical cortical circuits (1.5 sessions)
Researchers to follow: Matteo Carandini, Rodney Douglas
[NB] Matteo Carandini (2012) Area V1. Scholarpedia, 7(7):12105. http://www.scholarpedia.org/article/Area_V1
[NB][CM] Carandini M, et al. (2005) Do we know what the early visual system does? Journal of Neuroscience, 25:10577-10597.
[NB] Douglas, RJ and Martin, KAC (2007) Recurrent neuronal circuits in the neocortex. Current Biology, 17:496-500.
[NB] Douglas, RJ and Martin, KAC (2010) Canonical cortical circuits. Chapter 2 in Handbook of Brain Microcircuits, 15-21.
[ML] Kevin Jarrett, Koray Kavukcuoglu, Marc’Aurelio Ranzato, and Yann LeCun. (2009) What is the Best Multi-Stage Architecture for Object Recognition? In Proc. International Conference on Computer Vision (ICCV’09).

Learning features (selectivity): sparse coding, cortical maps (0.5 session)
Researcher to follow: Bruno Olshausen
[CM] Olshausen, B. A., Field, D. J. (1997) Sparse coding with an overcomplete basis set: a strategy employed by V1? Vision Research, 37(23), 3311-3325.
[CM] Bednar JA. (2012) Building a mechanistic model of the development and function of the primary visual cortex. Journal of Physiology (Paris), 106:194-211.

Learning transformations (invariance) (1 session)
Researchers to follow: Aapo Hyvarinen, Yan Karklin
[CM] Hyvarinen, A. and Hoyer, P. (2001) A two-layer sparse coding model learns simple and complex cell receptive fields and topography from natural images. Vision Research, 41(18):2413–2423.
[CM] Karklin, Y., Lewicki, M. S. (2009) Emergence of complex cell properties by learning to generalize in natural scenes. Nature, 457(7225), 83-85.
[CM] Adelson E.H. and Bergen J.R. (1985) Spatiotemporal energy models for the perception of motion. J. Opt. Soc. Am.
[ML] Ian J. Goodfellow, Quoc V. Le, Andrew M. Saxe, Honglak Lee, and Andrew Y. Ng. (2009) Measuring invariances in deep networks. Advances in Neural Information Processing Systems (NIPS).
Supplementary:
[ML] Q.V. Le, et al. Building high-level features using large scale unsupervised learning. ICML, 2012.

V2 (1 session)
[NB] Lawrence C. Sincich and Jonathan C. Horton (2005) The Circuitry of V1 and V2: Integration of Color, Form, and Motion. Annu. Rev. Neurosci. 28:303–26.
[NB] Roe AW, Lu HD, Chen G (2008) Functional architecture of area V2. Encyclopedia of Neuroscience (Squire L, ed.). Elsevier, Oxford, UK.
[CM] Cadieu C.F., Olshausen B.A. (2012) Learning Intermediate-Level Representations of Form and Motion from Natural Movies. Neural Computation.
[ML] Zou, W.Y., Zhu, S., Ng, A., and Yu, K. (2012) Deep learning of invariant features via simulated fixations in video. In Advances in Neural Information Processing Systems (NIPS).
[CM] Gutmann MU, Hyvarinen A (2013) A three-layer model of natural image statistics. Journal of Physiology-Paris.

Special topic discussion: learning mid-level features (format to be decided)
[ML] Memisevic, R., Exarchakis, G. (2013) Learning invariant features by harnessing the aperture problem. International Conference on Machine Learning (ICML).
[ML] Kihyuk Sohn, Guanyu Zhou, Chansoo Lee, and Honglak Lee. (2013) Learning and Selecting Features Jointly with Point-wise Gated Boltzmann Machines. Proceedings of the 30th International Conference on Machine Learning (ICML).
[ML] Roni Mittelman, Honglak Lee, Benjamin Kuipers, and Silvio Savarese. (2013) Weakly Supervised Learning of Mid-Level Features with Beta-Bernoulli Process Restricted Boltzmann Machines. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

V4, shape perception (1 session)
Researcher to follow: Charles E. Connor
[NB] Pasupathy, A., Connor, C.E. (2002) Population coding of shape in area V4. Nature Neuroscience 5: 1332-1338.
[NB] Connor, C.E. (2007) Transformation of shape information in the ventral pathway. Current Opinion in Neurobiology 17: 140-147.
[NB][CM] Roe AW, et al. (2012) Towards a unified theory of visual area V4. Neuron 74(2):12-29.
[NB][CM] Cadieu C, Kouh M, Pasupathy A, Connor CE, Riesenhuber M, Poggio T. (2007) A model of V4 shape selectivity and invariance. Journal of Neurophysiology, 98(3), 1733-50.

IT, object and face recognition (1 session)
Researchers to follow: Keiji Tanaka, Doris Tsao
[NB] Charles G. Gross (2008) Inferior temporal cortex. Scholarpedia, 3(12):7294. http://www.scholarpedia.org/article/Inferior_temporal_cortex
[NB] Tanaka, K. (1996) Inferotemporal cortex and object vision. Annual Review of Neuroscience, 19, 109–139.
[NB][CM] Tsao DY, Livingstone MS. (2008) Mechanisms for face perception. Annual Review of Neuroscience, 31: 411-438.
[NB][CM] Tsao D.Y., Cadieu C. and Livingstone M. (2010) Object Recognition: Physiological and Computational Insights. Chapter 24 in Primate Neuroethology. Edited by M. Platt and A. Ghazanfar. Oxford University Press.

Hippocampus, memory, sleep (1 session)
To be added.

Development and evolution of the visual system, vision in lower animals (1 session)
Researcher to follow: Jon Kaas
The Evolution of the Visual System in Primates
More to be added.

Neuronal oscillation and synchrony (1 session)
Researchers to follow: Christoph von der Malsburg, Markus Siegel
[NB] Siegel M., Donner T. H., Engel A. K. (2012) Spectral fingerprints of large-scale neuronal interactions. Nature Reviews Neuroscience 13:121-134.
[NB] Varela F (2001) The brainweb: Phase synchronization and large-scale integration. Nature Reviews Neuroscience 2, 229-239.
[CM] Donner T. H., Siegel M. (2011) A framework for local cortical oscillation patterns. Trends in Cognitive Sciences 15(5): 191-199.
[CM] von der Malsburg C. (1999) The What and Why of Binding: The Modeler’s Perspective. Neuron, Vol. 24, 95–104.
Source: http://www.mathworks.cn/matlabcentral/fileexchange/38310-deep-learning-toolboxwatching=38310

Description

PLEASE GO TO https://github.com/rasmusbergpalm/DeepLearnToolbox FOR NEWEST VERSION

DeepLearnToolbox

A Matlab toolbox for Deep Learning. Deep Learning is a new subfield of machine learning that focuses on learning deep hierarchical models of data. It is inspired by the human brain's apparent deep (layered, hierarchical) architecture. A good overview of the theory of Deep Learning is Learning Deep Architectures for AI.

For a more informal introduction, see the following videos by Geoffrey Hinton and Andrew Ng:
The Next Generation of Neural Networks (Hinton, 2007)
Recent Developments in Deep Learning (Hinton, 2010)
Unsupervised Feature Learning and Deep Learning (Ng, 2011)

If you use this toolbox in your research please cite: Prediction as a candidate for learning deep hierarchical models of data (Palm, 2012)

Directories included in the toolbox
NN/ - A library for Feedforward Backpropagation Neural Networks
CNN/ - A library for Convolutional Neural Networks
DBN/ - A library for Deep Belief Networks
SAE/ - A library for Stacked Auto-Encoders
CAE/ - A library for Convolutional Auto-Encoders
util/ - Utility functions used by the libraries
data/ - Data used by the examples
tests/ - Unit tests to verify the toolbox is working

For references on each library check REFS.md

Required Products: MATLAB
MATLAB release: MATLAB 7.11 (R2010b)
(1) Geoffrey Hinton: http://www.cs.toronto.edu/~hinton/ The father of the RBM; he is the one who made the RBM trainable in practice.
(2) Andrew Ng: http://ai.stanford.edu/~ang/ Great professor and great speaker. His student helped to popularize the deep belief network.
(3) Honglak Lee: http://web.eecs.umich.edu/~honglak/ He won the best application paper award at ICML 2009. Currently he works on how to model invariance using RBMs.
(4) Ruslan Salakhutdinov: http://www.utstat.toronto.edu/~rsalakhu/ He is a student of Prof. Hinton, and his major contribution is the introduction of the deep Boltzmann machine. Prof. Hinton coined the deep belief network. These two kinds of networks share some similarity; both belong to the deep architectures.
(5) Graham Taylor: http://www.uoguelph.ca/~gwtaylor/ He is also a student of Prof. Hinton, and his major contribution is the introduction of the gated Boltzmann machine, which makes it possible to generate grayscale images.
(6) Hugo Larochelle: http://www.dmi.usherb.ca/~larocheh/index_en.html Again he is Prof. Hinton's student, and his major contribution is applying the RBM to model attentional data.
(7) Marc'Aurelio Ranzato: http://www.cs.toronto.edu/~ranzato/ He finished his Ph.D. under Prof. Yann LeCun and spent two years as a postdoc under Prof. Hinton. His contribution is the introduction of one duplicate image to model covariance among neighboring pixels.
(8) Roland Memisevic: http://www.iro.umontreal.ca/~memisevr/ He modeled temporal data using RBMs. He now holds a faculty position at the University of Montreal.
(9) Yoshua Bengio: http://www.iro.umontreal.ca/~bengioy/yoshua_en/index.html Great professor. His work 'Learning Deep Architectures for AI' is a must-read.
(10) Yann LeCun: http://yann.lecun.com/ He is a legend. He has little regard for the CV guys. He is super smart, and his work may revolutionize object recognition.
(11) Rob Fergus: http://cs.nyu.edu/~fergus/ NYU guy, who rejected me when I applied to work with him. Anyway, a genius, I love him.
(12) Kai Yu: http://www.dbs.ifi.lmu.de/~yu_k/ He helped me understand why whitening doesn't make data independent. Sincere thanks to him.

These professors are the ones I am most familiar with. However, with the emergence of deep belief networks and deep Boltzmann machines, there are many other scholars working in this area. You can find a list at the 2012 UCLA Deep Learning Summer School: http://www.ipam.ucla.edu/programs/gss2012/
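Introduction:

Neural Networks and Support Vector Machines (SVM) are the representative methods of statistical learning. Both can be regarded as descendants of the Perceptron, a linear classification model invented by Rosenblatt in 1958. The perceptron works for linear classification, but real-world classification problems are usually nonlinear.

Neural networks and support vector machines (including kernel methods) are both nonlinear classification models. In 1986, Rumelhart, Hinton and Williams published the Back Propagation learning algorithm for neural networks (it also appeared in the PDP volumes edited by Rumelhart and McClelland). Later, in 1992, Vapnik and colleagues proposed the support vector machine. A neural network is a multi-layer (usually three-layer) nonlinear model, while a support vector machine uses the kernel trick to turn a nonlinear problem into a linear one.

Neural networks and support vector machines have always been in "competition". Schölkopf is Vapnik's foremost student and a leading figure in research on support vector machines and kernel methods. According to Schölkopf, Vapnik invented the support vector machine precisely to "kill" the neural network (He wanted to kill Neural Network). Support vector machines did prove very effective, and for a while the SVM camp had the upper hand.

In recent years Hinton, the master of the neural network camp, proposed the Deep Learning algorithm for neural networks (2006), which greatly increased their capability and made them a match for support vector machines. Deep Learning assumes the neural network has many layers: it first uses a Boltzmann Machine (unsupervised learning) to learn the structure of the network, and then uses Back Propagation (supervised learning) to learn the network weights.

About the name Deep Learning, Hinton once joked: I want to call SVM shallow learning. (Note: "shallow" also means superficial.) The name itself simply refers to learning with depth, because the neural network is assumed to have many layers.

In short, Deep Learning is a new statistical learning algorithm worth watching.

Deep Learning is a new area of Machine Learning research, introduced with the objective of moving Machine Learning closer to its original goal: AI. See a brief introduction to Machine Learning for AI and an introduction to Deep Learning algorithms.

Deep Learning is about learning multiple levels of representation and abstraction that help to make sense of data such as images, sound, and text. For more about deep learning algorithms, see:
The monograph or review paper Learning Deep Architectures for AI (Foundations & Trends in Machine Learning, 2009).
The ICML 2009 Workshop on Learning Feature Hierarchies webpage has a list of references.
The LISA public wiki has a reading list and a bibliography.
Geoff Hinton has readings from last year’s NIPS tutorial.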
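The introduction's point that the perceptron handles linear classification but not nonlinear problems can be checked on a toy case. The sketch below is only an illustration under stated assumptions: the four-point AND/XOR datasets, the learning rate, the epoch limit and the function names `train_perceptron` and `accuracy` are all chosen for this example and do not come from the posts collected here. AND is linearly separable, so the perceptron fits it; XOR is not, so no setting of the weights can classify all four points.

```python
import numpy as np

def train_perceptron(X, y, epochs=50, lr=1.0):
    """Rosenblatt-style perceptron rule; labels y are 0 or 1."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for xi, target in zip(X, y):
            pred = 1 if xi @ w + b > 0 else 0
            update = lr * (target - pred)
            if update != 0:
                # Move the decision boundary toward the misclassified point.
                w += update * xi
                b += update
                errors += 1
        if errors == 0:  # every point classified correctly: stop early
            break
    return w, b

def accuracy(X, y, w, b):
    preds = (X @ w + b > 0).astype(int)
    return float(np.mean(preds == y))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y_and = np.array([0, 0, 0, 1])  # linearly separable
y_xor = np.array([0, 1, 1, 0])  # not linearly separable

acc_and = accuracy(X, y_and, *train_perceptron(X, y_and))
acc_xor = accuracy(X, y_xor, *train_perceptron(X, y_xor))
print("AND accuracy:", acc_and)
print("XOR accuracy:", acc_xor)
```

A multi-layer network or a kernel SVM can separate XOR, which is the sense in which both are described above as nonlinear successors of the perceptron.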
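The tutorials presented here introduce some of the most important deep learning algorithms and show how to run them using Theano. Theano is a Python library that makes writing deep learning models easier and gives the option of training them on a GPU.

The algorithm tutorials have some prerequisites. You should know some Python and be familiar with numpy. Since the tutorials are about using Theano, you should read the Theano basic tutorial first. Once you have done that, read our Getting Started chapter: it introduces the notation, the datasets, and the way models are optimized by stochastic gradient descent.

The purely supervised learning algorithms are meant to be read in order:
Logistic Regression - using Theano for something simple
Multilayer perceptron - introduction to layers
Deep Convolutional Network - a simplified version of LeNet5

The unsupervised and semi-supervised learning algorithms can be read in any order (the auto-encoders can be read independently of the RBM/DBN thread):
Auto Encoders, Denoising Autoencoders - description of autoencoders
Stacked Denoising Auto-Encoders - easy steps into unsupervised pre-training for deep nets
Restricted Boltzmann Machines - single layer generative RBM model
Deep Belief Networks - unsupervised generative pre-training of stacked RBMs followed by supervised fine-tuning

Building towards the mcRBM model, there is also a new tutorial on sampling from energy models:
HMC Sampling - hybrid (aka Hamiltonian) Monte-Carlo sampling with scan()

The text above is translated from http://deeplearning.net/tutorial/

Deep learning is a new area of machine learning research. Its motivation is to build neural networks that simulate the way the human brain analyzes and learns; it imitates the mechanisms of the brain to interpret data such as images, sound, and text. It is closely associated with unsupervised feature learning, though it is not limited to it: the tutorial list above also includes purely supervised models.

The concept of deep learning comes from research on artificial neural networks. A multilayer perceptron with several hidden layers is one kind of deep learning structure. Deep learning combines low-level features into more abstract high-level representations (attribute categories or features) in order to discover distributed feature representations of data.

The concept of deep learning was proposed by Hinton and colleagues in 2006. Based on the deep belief network (DBN), they proposed an unsupervised greedy layer-wise training algorithm, which brought hope for solving the optimization difficulties of deep architectures; deep architectures built from multi-layer autoencoders were proposed afterwards. In addition, the convolutional neural network proposed by LeCun and colleagues was the first truly multi-layer learning algorithm; it uses spatial relationships to reduce the number of parameters and so improve training performance.

I. The past and present of Deep Learning

In his 1950 paper, Turing proposed the Turing test: in a conversation through a wall, you cannot tell whether you are talking to a person or a computer. This set a very high expectation for computers, and for artificial intelligence in particular. But half a century later, progress in AI was still far from the standard of the Turing test. This not only disheartened people who had waited eagerly for years, it also led some to regard AI as hype and the related fields as "pseudoscience".

In June 2008, Chris Anderson, editor-in-chief of Wired, published an article titled "The End of Theory: The Data Deluge Makes the Scientific Method Obsolete". The article quotes Peter Norvig, co-author of the classic "Artificial Intelligence: A Modern Approach" and then Google's research director: "All models are wrong, and increasingly you can succeed without them." The implication is that clever algorithms are pointless: faced with massive data, even simple algorithms give excellent results. Rather than studying algorithms, one might as well study cloud computing and process big data.

Had this been said before 2006, I might not have objected strongly. But since 2006 the field of machine learning has made breakthrough progress. The Turing test is, at least, no longer so far out of reach. The technical means depend not only on cloud computing's ability to process big data in parallel, but also on algorithms. That algorithm is Deep Learning. With Deep Learning, people have finally found a way to approach the age-old problem of handling "abstract concepts".

So academia has been busy recruiting the masters of the field. Alex Smola joining CMU is one episode against this background. The open question is which university the two heavyweights, Geoffrey Hinton and Yoshua Bengio, will end up joining. Geoffrey Hinton has worked at Cambridge and CMU and currently teaches at the University of Toronto; no doubt many top universities are trying to hire him. Yoshua Bengio's career is simpler: after receiving his PhD from McGill University, he did a postdoc at MIT with Mike Jordan. He currently teaches at the University of Montreal.

The revolution set off by Deep Learning is not only of great academic significance; it is also close to money, very close indeed. If the technical problems are a mountain, then on the other side of that mountain lies a huge open-pit gold mine. Once the technical problems are solved, what remains is to stake out territory with the force of capital and business.

So the big companies have massed their forces and are watching closely. Google has split its troops into two columns. The left column, led by Jeff Dean and Andrew Ng, focuses on breakthroughs in Deep Learning and related algorithms and applications. Jeff Dean ranks first among Google's Fellows; he co-designed core infrastructure such as MapReduce and BigTable. Andrew Ng was an undergraduate at CMU and later went to MIT to follow Mike Jordan. Mike Jordan was not popular at MIT and later left in anger for UC Berkeley. Andrew Ng followed his advisor to Berkeley without hesitation. After receiving his PhD he joined the Stanford faculty, where he is one of the standouts of the new generation of professors, and he also works part-time at Google.

Google's right column is led by Amit Singhal, and its goal is to build the Knowledge Graph infrastructure. After receiving his PhD from Cornell University in 1996, Amit Singhal worked at Bell Labs and joined Google in 2000. It is said that when he interviewed at Google he told Google founder Sergey Brin, “Your engine is excellent, but let me rewrite it!” Someone else might have been shown the door. But Sergey Brin was magnanimous: far from holding the young man's brashness against him, he really did let him work on the new generation of the ranking system. Amit Singhal is now a senior vice president at Google, in charge of Google's most central business, the search engine.

Google has bet its ace of aces on Deep Learning and the Knowledge Graph, aiming to seize the fruits of the big data revolution faster and on a larger scale.

Reference
Turing Test. http://en.wikipedia.org/wiki/Turing_test
The End of Theory: The Data Deluge Makes the Scientific Method Obsolete. http://www.wired.com/science/discoveries/magazine/16-07/pb_theory
Introduction to Deep Learning. http://en.wikipedia.org/wiki/Deep_learning
Interview with Amit Singhal, Google Fellow. http://searchengineland.com/interview-with-amit-singhal-google-fellow-121342

Original post: http://blog.sina.com.cn/s/blog_46d0a3930101fswl.html
Author's Weibo: http://weibo.com/kandeng#1360336038853

II. The basic idea and methods of Deep Learning

In practice, to solve a problem such as classifying objects (the objects may be documents, images, and so on), the first thing to decide is how to represent an object, that is, which features to extract to represent it. In text processing, a document is often represented as a set of words, or as a point in a vector space (the vector space model, VSM), and only then can different classification algorithms be applied. In image processing, an image can be represented as a set of pixels; later, new feature representations such as SIFT were proposed, and they perform very well in many image-processing applications. The quality of the chosen features has a very large effect on the final result, so choosing features is critical to solving a real problem.

However, selecting features by hand is laborious and heuristic; whether good ones are chosen depends largely on experience and luck. Since hand-picking features is unsatisfactory, can features be learned automatically? The answer is yes! That is what Deep Learning is for. Its alias, Unsupervised Feature Learning, says as much: "unsupervised" means that no human takes part in selecting the features. Methods that learn features automatically are therefore collectively referred to here as Deep Learning.

1) The basic idea of Deep Learning

Suppose we have a system S with n layers (S1, ..., Sn), input I and output O, pictured as: I => S1 => S2 => ... => Sn => O. If the output O equals the input I, the input has passed through the system without any loss of information. That means no information was lost at any layer Si, so every layer Si is another representation of the original information (the input I). Now back to Deep Learning: we want to learn features automatically. Suppose we have a collection of inputs I (say images or text) and design a system S with n layers. If we adjust the parameters of the system so that its output is still the input I, we automatically obtain a hierarchy of features of I, namely S1, ..., Sn.

Requiring the output to equal the input exactly is too strict. We can relax the constraint slightly, for example by only requiring the difference between input and output to be as small as possible; this relaxation leads to another family of Deep Learning methods. That is the basic idea of Deep Learning.

2) Common Deep Learning methods

a) AutoEncoder

The simplest approach exploits the fact that an artificial neural network (ANN) is itself a layered system. Given a neural network, we require its output to be the same as its input and train its parameters to obtain the weights of each layer. We then naturally obtain several different representations of the input I (one per layer), and these representations are the features. Research has found that adding these automatically learned features to the original features can greatly improve accuracy, in some classification problems even beating the best existing classification algorithms. This method is called the AutoEncoder. We can add further constraints to obtain new Deep Learning methods. For example, adding an L1 regularity constraint to the AutoEncoder (L1 forces most of the nodes in each layer to be 0 and only a few to be nonzero, which is where the name Sparse comes from) gives the Sparse AutoEncoder.

b) Sparse Coding

If we relax the requirement that the output equal the input, and use the notion of a basis from linear algebra, that is, O = W1*B1 + W2*B2 + ... + Wn*Bn where the Bi are basis vectors and the Wi are coefficients, we obtain the optimization problem

Min |I – O|

Solving it yields the coefficients Wi and the basis Bi. These coefficients and basis vectors are another, approximate, expression of the input, so they can be used as features to represent the input I, and this process too is learned automatically. If we add an L1 regularity constraint to the expression above, we get

Min |I – O| + u*(|W1| + |W2| + ... + |Wn|)

This method is called Sparse Coding.

c) Restricted Boltzmann Machine (RBM)

Suppose we have a bipartite graph with no links between the nodes within a layer. One layer is the visible layer, the input data layer (v), and the other is the hidden layer (h). If all nodes are binary variables (taking only the values 0 or 1) and the joint probability distribution p(v, h) is a Boltzmann distribution, the model is called a Restricted Boltzmann Machine (RBM).

Why is it a Deep Learning method? First, because the model is a bipartite graph, all hidden nodes are conditionally independent given v, that is, p(h|v) = p(h1|v) ... p(hn|v). Likewise, all visible nodes are conditionally independent given the hidden layer h. Since v and h together follow a Boltzmann distribution, given an input v we can obtain the hidden layer h through p(h|v), and from h we can obtain a visible layer again through p(v|h). We adjust the parameters so that the visible layer v1 obtained from the hidden layer is the same as the original visible layer v. The hidden layer is then another expression of the visible layer, so it can serve as features of the input data, and in that sense the RBM is a Deep Learning method.

If we increase the number of hidden layers, we get the Deep Boltzmann Machine (DBM). If we use a Bayesian belief network (a directed graphical model, still with no links between nodes within a layer) in the part close to the visible layer and a Restricted Boltzmann Machine in the part farthest from the visible layer, we get the Deep Belief Net (DBN).

There are other Deep Learning methods that are not covered here. In short, Deep Learning can automatically learn another representation of the data; this representation can be added as features to the feature set of the original problem to improve the learning method, and it is currently a hot research topic.

Original post: http://blog.csdn.net/xianlingmao/article/details/8478562

III. A brief introduction to Deep Learning algorithms

See the recent paper: Yoshua Bengio, Learning Deep Architectures for AI, Foundations and Trends in Machine Learning, 2(1), 2009

Depth

The computations involved in producing an output from an input can be represented by a flow graph: a graph representing a computation, in which each node represents an elementary computation and a value (the result of the computation, applied to the values at the children of that node). The set of computations allowed in each node, together with the possible graph structures, defines a family of functions. Input nodes have no children. Output nodes have no parents.

The flow graph for the expression sin(a^2 + b/a) can be represented by a graph with two input nodes a and b: one node for the division b/a, taking a and b as input (i.e. as children); one node for the square, taking only a as input; one node for the addition, taking the nodes a^2 and b/a as input (its value is a^2 + b/a); and finally one output node computing the sine, with a single input coming from the addition node.

A particular property of such flow graphs is depth: the length of the longest path from an input to an output.

Traditional feedforward neural networks can be considered to have depth equal to the number of layers (i.e. the number of hidden layers plus 1, for the output layer). SVMs have depth 2 (one level for the kernel outputs or the feature space, and one for the linear combination producing the output).

Motivations for deep architectures

The main motivations for studying learning algorithms for deep architectures are:
Insufficient depth can hurt;
The brain has a deep architecture;
Cognitive processes seem deep.

Insufficient depth can hurt

Depth 2 is enough in many cases (e.g. logical gates, formal neurons, sigmoid-neurons, Radial Basis Function units like in SVMs) to represent any function with a given target accuracy. But this comes at a price: the required number of nodes in the graph (i.e. computations, and also parameters) may grow very large. Theoretical results show that there exist function families for which the required number of nodes grows exponentially with the input size. This has been shown for logical gates, formal neurons, and RBF units. In the latter case Hastad showed families of functions that can be represented efficiently (compactly) with O(n) nodes (for n inputs) when the depth is d, but that need an exponential number of nodes, O(2^n), if the depth is restricted to d-1.

One can see a deep architecture as a kind of factorization. Most randomly chosen functions cannot be represented efficiently, whether with a deep or a shallow architecture. But many functions that can be represented efficiently with a deep architecture cannot be represented efficiently with a shallow one (see the polynomials example in the Bengio survey paper). The existence of a compact and deep representation indicates that some kind of structure exists in the function to be represented. If there were no structure at all, it would not be possible to generalize well.

The brain has a deep architecture

For example, the visual cortex is well studied and shows a sequence of areas, each of which contains a representation of the input, with signals flowing from one to the next (there are also skip connections and, at some levels, parallel paths, so the picture is more complex). Each level of this feature hierarchy represents the input at a different level of abstraction, with more abstract features further up in the hierarchy, defined in terms of the lower-level ones.

Note that representations in the brain lie between dense distributed and purely local: they are sparse, with about 1% of neurons active simultaneously. Given the huge number of neurons, this is still a very efficient (exponentially efficient) representation.

Cognitive processes seem deep

Humans organize their ideas and concepts hierarchically;
Humans first learn simpler concepts and then compose them to represent more abstract ones;
Engineers break up solutions into multiple levels of abstraction and processing.

It would be nice to learn or discover these concepts (did knowledge engineering fail because of poor introspection?). Introspection on linguistically expressible concepts also suggests a sparse representation: only a small fraction of all possible words or concepts are applicable to a particular input (say, a visual scene).

Breakthrough in learning deep architectures

Before 2006, attempts at training deep architectures failed: training a deep supervised feedforward neural network tends to yield worse results (in both training and test error) than shallow ones (with 1 or 2 hidden layers).

Three papers changed that in 2006, spearheaded by Hinton's revolutionary work on Deep Belief Networks (DBNs):
Hinton, G. E., Osindero, S. and Teh, Y., A fast learning algorithm for deep belief nets. Neural Computation 18:1527-1554, 2006
Yoshua Bengio, Pascal Lamblin, Dan Popovici and Hugo Larochelle, Greedy Layer-Wise Training of Deep Networks, in J. Platt et al. (Eds), Advances in Neural Information Processing Systems 19 (NIPS 2006), pp. 153-160, MIT Press, 2007
Marc’Aurelio Ranzato, Christopher Poultney, Sumit Chopra and Yann LeCun, Efficient Learning of Sparse Representations with an Energy-Based Model, in J. Platt et al. (Eds), Advances in Neural Information Processing Systems (NIPS 2006), MIT Press, 2007

The following key principles are found in all three papers:
Unsupervised learning of representations is used to (pre-)train each layer;
Unsupervised training proceeds one layer at a time, on top of the previously trained layers, and the representation learned at each level is the input for the next layer;
Supervised training is used to fine-tune all the layers (in addition to one or more additional layers dedicated to producing predictions).

The DBNs use RBMs for unsupervised learning of the representation at each layer. The Bengio et al paper explores and compares RBMs and auto-encoders (neural networks that predict their input through a bottleneck internal layer of representation). The Ranzato et al paper uses sparse auto-encoders (similar to sparse coding) in the context of a convolutional architecture. Auto-encoders and convolutional architectures will be covered later in the course.

Since 2006, a large number of papers on deep learning have been published, some of them exploiting other principles to guide the training of intermediate representations; see Learning Deep Architectures for AI.

Original post: http://www.cnblogs.com/ysjxw/archive/2011/10/08/2201782.html

IV. Recommended further study

Classic Deep Learning reading material:
The monograph or review paper Learning Deep Architectures for AI (Foundations & Trends in Machine Learning, 2009).
The ICML 2009 Workshop on Learning Feature Hierarchies webpage has a list of references.
The LISA public wiki has a reading list and a bibliography.
Geoff Hinton has readings from last year’s NIPS tutorial.

Deep Learning tool: Theano
Theano is a Python library for deep learning. It requires familiarity with Python and numpy. Readers are advised to go through the Theano basic tutorial first, and then follow Getting Started to download the data and learn with gradient descent.

After learning the basics of Theano, you can practice writing the following algorithms.

Supervised learning:
Logistic Regression - using Theano for something simple
Multilayer perceptron - introduction to layers
Deep Convolutional Network - a simplified version of LeNet5

Unsupervised learning:
Auto Encoders, Denoising Autoencoders - description of autoencoders
Stacked Denoising Auto-Encoders - easy steps into unsupervised pre-training for deep nets
Restricted Boltzmann Machines - single layer generative RBM model
Deep Belief Networks - unsupervised generative pre-training of stacked RBMs followed by supervised fine-tuning

Finally, a few basic ML books to recommend:
Chris Bishop, “Pattern Recognition and Machine Learning”, 2007
Simon Haykin, “Neural Networks: a Comprehensive Foundation”, 2009 (3rd edition)
Richard O. Duda, Peter E. Hart and David G. Stork, “Pattern Classification”, 2001 (2nd edition)

Original post: http://blog.csdn.net/abcjennifer/article/details/7826917

V. Application examples

1. Computer vision
ImageNet Classification with Deep Convolutional Neural Networks, Alex Krizhevsky, Ilya Sutskever, Geoffrey E Hinton, NIPS 2012.
Learning Hierarchical Features for Scene Labeling, Clement Farabet, Camille Couprie, Laurent Najman and Yann LeCun, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013.
Learning Convolutional Feature Hierarchies for Visual Recognition, Koray Kavukcuoglu, Pierre Sermanet, Y-Lan Boureau, Karol Gregor, Michaël Mathieu and Yann LeCun, Advances in Neural Information Processing Systems (NIPS 2010), 23, 2010.

2. Speech recognition
Microsoft researchers, working with Hinton, were the first to introduce RBMs and DBNs into acoustic model training for speech recognition, with great success on large-vocabulary speech recognition systems: the error rate dropped by a relative 30%. However, DNNs still lack an efficient parallel training algorithm, and at present many research groups improve the training efficiency of DNN acoustic models by using large-scale corpora on GPU platforms.
Internationally, companies such as IBM and Google have quickly taken up DNN-based speech recognition research and are moving fast.
In China, companies and research institutes such as iFlytek, Baidu, and the Institute of Automation of the Chinese Academy of Sciences are also studying deep learning for speech recognition.

3. Natural language processing and other areas
Many institutions are doing research, but so far deep learning has not produced a systematic breakthrough in natural language processing.

VI. Reference links:
1. http://baike.baidu.com/view/9964 ... enter=deep+learning
2. http://www.cnblogs.com/ysjxw/archive/2011/10/08/2201819.html
3. http://blog.csdn.net/abcjennifer/article/details/7826917

Reposted from http://elevencitys.com/?p=1854

Stanford's Deep Learning tutorial: http://ufldl.stanford.edu/wiki/index.php/UFLDL_Tutorial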
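The AutoEncoder idea described earlier (train a layered network so that its output reproduces its input, then read the hidden layer off as learned features) can be sketched in a few lines of NumPy. This is a minimal illustration, not code from any of the posts collected above: the toy data, the single tanh hidden layer, the squared-error loss, the learning rate and every variable name are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy input I: 200 samples in 8 dimensions that lie on a 3-dimensional subspace,
# rescaled to a standard deviation of 0.5 so the tanh units do not saturate.
latent = rng.normal(size=(200, 3))
X = latent @ rng.normal(size=(3, 8))
X = 0.5 * X / X.std()

n_in, n_hidden = X.shape[1], 3
W1 = rng.normal(scale=0.1, size=(n_in, n_hidden))  # encoder weights
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.1, size=(n_hidden, n_in))  # decoder weights
b2 = np.zeros(n_in)

def forward(X):
    H = np.tanh(X @ W1 + b1)  # hidden layer: the learned representation
    O = H @ W2 + b2           # output layer: reconstruction of the input
    return H, O

def reconstruction_error(X):
    _, O = forward(X)
    return float(np.mean((O - X) ** 2))

initial_error = reconstruction_error(X)

lr = 0.1
for _ in range(3000):
    H, O = forward(X)
    dO = 2.0 * (O - X) / X.size        # gradient of the mean squared error w.r.t. O
    dZ = (dO @ W2.T) * (1.0 - H ** 2)  # backpropagate through tanh
    W2 -= lr * (H.T @ dO)
    b2 -= lr * dO.sum(axis=0)
    W1 -= lr * (X.T @ dZ)
    b1 -= lr * dZ.sum(axis=0)

final_error = reconstruction_error(X)
features, _ = forward(X)  # one 3-dimensional feature vector per input sample

print("reconstruction error before training:", initial_error)
print("reconstruction error after training: ", final_error)
print("feature matrix shape:", features.shape)
```

Adding a penalty on the hidden activations `H` to the loss would turn this into the sparse variant mentioned in the text; stacking several such layers, each trained on the previous layer's features, gives the layer-wise scheme described in section III.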
ICLR 2013 : International Conference on Learning Representations

Link: https://sites.google.com/site/representationlearning2013/

When: May 2, 2013 - May 4, 2013
Where: Scottsdale, Arizona
Submission Deadline: Jan 15, 2013
Notification Due: Mar 15, 2013

Call For Papers

Overview
-------------
It is well understood that the performance of machine learning methods is heavily dependent on the choice of data representation (or features) on which they are applied. The rapidly developing field of representation learning is concerned with questions surrounding how we can best learn meaningful and useful representations of data. We take a broad view of the field, and include in it topics such as deep learning and feature learning, metric learning, kernel learning, compositional models, non-linear structured prediction, and issues regarding non-convex optimization.

Despite the importance of representation learning to machine learning and to application areas such as vision, speech, audio and NLP, there is currently no common venue for researchers who share a common interest in this topic. The goal of ICLR is to help fill this void.

A non-exhaustive list of relevant topics:
- unsupervised representation learning
- supervised representation learning
- metric learning and kernel learning
- dimensionality expansion, sparse modeling
- hierarchical models
- optimization for representation learning
- implementation issues, parallelization, software platforms, hardware
- applications in vision, audio, speech, and natural language processing, robotics and neuroscience
- other applications

Submission Process
------------------------------
ICLR2013 will use a novel publication model that will proceed as follows:
- Authors post their submissions on arXiv and send us a link to the paper. A separate, permanent website will be set up to handle the reviewing process, to publish the reviews and comments, and to maintain links to the papers.
- The ICLR program committee designates anonymous reviewers as usual.
- The submitted reviews are published without the name of the reviewer, but with an indication that they are the designated reviews. Anyone can write and publish comments on the paper (non anonymously). Anyone can ask the program chairs for permission to become an anonymous designated reviewer (open bidding). The program chairs have ultimate control over the publication of each anonymous review. Open commenters will have to use their real name, linked with their Google Scholar profile.
- Authors can post comments in response to reviews and comments. They can revise the paper as many times as they want, possibly citing some of the reviews.
- On March 15th 2013, the ICLR program committee will consider all submitted papers, comments, and reviews and will decide which papers are to be presented at the conference as oral or poster. Although papers can be modified after that date, there is no guarantee that the modifications will be taken into account by the committee.
- The best of the accepted papers (the top 25%-50%) will be given oral presentations at the conference. We have made arrangements for revised versions of selected papers from the conference to be published in a JMLR special topic issue.
- The other papers will be considered non-archival (like workshop presentations), and could be submitted elsewhere (modified or not), although the ICLR site will maintain the reviews, the comments, and the links to the arXiv versions.

Invited Speakers
------------------------
Jeff Bilmes (U. Washington)
Jason Eisner (JHU)
Geoffrey Hinton (U. Toronto)
Ruslan Salakhutdinov (U. Toronto)
Max Welling (U. Amsterdam)
Alan Yuille (UCLA)

General Chairs
---------------------
Yoshua Bengio, Université de Montreal
Yann LeCun, New York University

Program Chairs
-----------------------
Aaron Courville, Université de Montreal
Rob Fergus, New York University
Chris Manning, Stanford University

Contact
-----------
The organizers can be contacted at: iclr2013.programchairs@gmail.com

Related CFPs
ECML-PKDD 2013: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases
ICML 2013: The 30th International Conference on Machine Learning
MLDM 2013: International Conference on Machine Learning and Data Mining
ICMLC 2013: 2013 5th International Conference on Machine Learning and Computing
UAI 2013: 29th Conference on Uncertainty in Artificial Intelligence
CIKM 2013: ACM Conference of Information and Knowledge Management
If you cannot watch it now, please save the link for later: http://www.flixxy.com/hubble-ultra-deep-field-3d.htm

A blog post by 王元君 (posted 2009-7-15 18:23:39) has a few pictures from the above link, along with his thoughts: http://www.sciencenet.cn/m/user_content.aspx?id=243699