科学网


Tag: computer programs (计算机程序)


Related blog posts

A program is a way of expressing a function
Popularity 5 accsys 2015-6-21 06:23
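A program is a way of expressing a function. Jiang Yongjiang (姜咏江). The widespread use of computers has been an eye-opener. Using a computer means writing programs, but what is a program from a mathematical point of view? In a word, a program is a function. Mathematics abstracts the cause-and-effect relationships of the world into functions. Classical secondary-school mathematics recognizes three basic ways to represent a function: a table, a graph, and an analytic formula. The program should now be added as a fourth. In practice we have already expressed all kinds of classical functions as computer programs, so stressing that a program is the fourth way of representing a function matters a great deal for developing students' thinking. A function expressed as a program is more precise, more complete, and more general than the classical representations. The program's set of inputs is the function's domain, its set of outputs is the range, and the program itself is the functional relation. I left secondary-school mathematics teaching many years ago; I wonder whether today's curriculum says anything similar. 2015-6-21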
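To make the four representations concrete, here is a minimal Python sketch. The choice of f(x) = x^2, the small domain, and all names (`DOMAIN`, `table`, `graph`, `formula`, `square`) are assumptions made for this illustration; they do not come from the post.

```python
# Four ways to express the same function, f(x) = x^2, on a small domain.
# The function and the domain are illustrative choices, not from the post.
DOMAIN = [-2, -1, 0, 1, 2]

# 1. Table: explicit input -> output pairs.
table = {-2: 4, -1: 1, 0: 0, 1: 1, 2: 4}

# 2. Graph: the same pairs seen as points in the plane.
graph = [(x, table[x]) for x in DOMAIN]

# 3. Analytic formula, written as a single expression.
formula = lambda x: x ** 2


# 4. Program: a procedure that computes the value step by step
#    (here by repeated addition, to stress that it is an algorithm).
def square(x):
    result = 0
    for _ in range(abs(x)):
        result += abs(x)
    return result


# The program's inputs form the domain, its outputs form the range.
value_range = {square(x) for x in DOMAIN}
```

All four agree on every input in `DOMAIN`; the program is the only one of them that also spells out how the value is obtained.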
Category: Essays | 3353 reads | 13 comments
[Repost] The DNA book of life through the eyes of a computer programmer
jianhuihong 2015-5-10 11:39
http://ds9a.nl/amazing-dna/

DNA seen through the eyes of a coder, or: If you are a hammer, everything looks like a nail

`This is one of the coolest things I've read in a while.' -- jwz

This is just some rambling by a computer programmer about DNA. I'm not a molecular geneticist. If you spot the inevitable mistakes, please mail me (bert hubert) at ahu@ds9a.nl. I'm not trying to force my view onto the DNA - each observation here is quite 'uncramped'. To see where I got all this from, head to the bibliography.

Quick links: The source code, Position Independent Code, Conditional compilation, Epigenetics, Dead code, bloat, comments ('junk dna'), fork() and fork bombs ('tumors'), Mirroring, failover, Cluttered APIs, dependency hell, Viruses, worms, Central Dogma, Binary patching aka 'Gene therapy', Bug Regression, Reed-Solomon codes: 'Forward Error Correction', Holy Code, Framing errors: start and stop bits, Massive multiprocessing: each cell is a universe, Self hosting bootstrapping, The Makefile, Further reading.

Updates

24th of February 2013: Added a bit on epigenetics, updated the font, small updates here and there.
3rd of January 2008: A lot of updates are arriving since this page was linked on Reddit.com, I'm currently evaluating and merging the suggested changes. Please do keep sending updates!
23rd of September 2006: Small update on the number of genes. Some other updates have been sent to me over the past four years, and I'll try to work them into the page.
16th of June 2002: Added tiny piece on the halting problem and cancer. I think this is a new insight, but I'm not sure. On the todo list: Code reuse through alternative splicing.
18th of May 2002: In the meantime some people who *are* geneticists have read this and have spotted and fixed some, but not many, mistakes. I recently added information on the cell as a state machine and on forking and forkbombs.
24th of May 2002: Some clarifications from the great people on #bioinformatics on OPN. Added a bunch of pictures to lighten up the page. Added piece on the Central Dogma.

The source code

Is here. This is not a joke. We can wonder about the license though. Maybe we should ask the walking product of this source: Craig Venter. The source can be viewed via a wonderful set of perl scripts called 'Ensembl'. The human genome is about 3 gigabases long, which at 2 bits per base boils down to 750 megabytes. Depressingly enough, this is only 2.8 Mozilla browsers.

DNA is not like C source but more like byte-compiled code for a virtual machine called 'the nucleus'. It is very doubtful that there is a source to this byte compilation - what you see is all you get.

The language of DNA is digital, but not binary. Where binary encoding has 0 and 1 to work with (2 - hence the 'bi'nary), DNA has 4 positions, T, C, G and A. Whereas a digital byte is mostly 8 binary digits, a DNA 'byte' (called a 'codon') has three digits. Because each digit can have 4 values instead of 2, a DNA codon has 4 x 4 x 4 = 64 possible values, compared to a binary byte which has 256.

A typical example of a DNA codon is 'GCC', which encodes the amino acid Alanine. A larger number of these amino acids combined is called a 'polypeptide' or 'protein', and these are chemically active in making a living being. See also http://www.ultranet.com/~jkimball/BiologyPages/C/Codons.html .

Position Independent Code

Dynamically linked libraries (.so under Unix, .dll on Microsoft) cannot use static addresses internally because the code may appear in different places in memory in different situations. DNA has this too, where it is called 'transposing code':

Nearly half of the human genome is composed of transposable elements or jumping DNA. First recognized in the 1940s by Dr.
Barbara McClintock in studies of peculiar inheritance patterns found in the colors of Indian corn, jumping DNA refers to the idea that some stretches of DNA are unstable and transposable, i.e., they can move around -- on and between chromosomes.
http://www.ornl.gov/hgmis/resource/people.html

Conditional compilation

Of the 20,000 to 30,000 genes now thought to be in the human genome, most cells express only a very small part - which makes sense, a liver cell has little need for the DNA code that makes neurons. But as almost all cells carry around a full copy ('distribution') of the genome, a system is needed to #ifdef out stuff not needed. And that is just how it works. The genetic code is full of #if/#endif statements.

This is why 'stem cells' are so hot right now - these cells have the ability to differentiate into everything. The code hasn't been #ifdeffed out yet, so to speak. Stated more exactly, stem cells do not have everything turned on - they are not at once liver cells and neurons.

Cells can be likened to state machines, starting out as a stem cell. Over the lifetime of the cell, during which time it may clone ('fork()') many times, it specializes. Each specialization can be regarded as choosing a branch in a tree. Each cell can make (or be induced to make) decisions about its future, each of which makes it more specialized. These decisions persist across cloning through transcription factors and through changes to the way DNA is stored spatially ('steric effects').

A liver cell, although it carries the genes to do so, will generally not be able to function as a skin cell. There are some indications out there that it is possible to 'breed' cells 'upwards' into the hierarchy, making them pluripotent. See also this article.
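The state-machine picture above can be sketched in a few lines of Python. This is only a toy under loud assumptions: the gene names, the cell types, and the `Cell` class are invented for the illustration, and real gene regulation is nothing like a lookup table.

```python
# Toy model: every cell carries the full "genome", but only expresses the
# genes whose guard includes its current cell type (the '#ifdef' analogy).
# Gene names and cell types are hypothetical placeholders.
GENOME = {
    "housekeeping_gene": {"stem", "liver", "neuron"},
    "liver_gene": {"liver"},
    "neuron_gene": {"neuron"},
}

# Allowed specializations: a one-way tree rooted at the stem cell.
TRANSITIONS = {"stem": {"liver", "neuron"}, "liver": set(), "neuron": set()}


class Cell:
    def __init__(self, cell_type="stem"):
        self.cell_type = cell_type

    def expressed_genes(self):
        """Genes that are not '#ifdeffed out' for this cell type."""
        return sorted(g for g, types in GENOME.items() if self.cell_type in types)

    def specialize(self, new_type):
        """Choose a branch in the tree; there is no way back up in this toy."""
        if new_type not in TRANSITIONS[self.cell_type]:
            raise ValueError(f"{self.cell_type} cell cannot become {new_type}")
        self.cell_type = new_type

    def fork(self):
        """Cloning keeps the decisions made so far."""
        return Cell(self.cell_type)


cell = Cell()
cell.specialize("liver")
child = cell.fork()
print(child.expressed_genes())  # ['housekeeping_gene', 'liver_gene']
```

The one-way `TRANSITIONS` table is a deliberate simplification; the text itself notes that cells can sometimes be pushed back 'upwards' towards pluripotency.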
Epigenetics & imprinting: runtime binary patching

Although the actual relevant changes in the DNA of an organism rarely occur within a generation, substantial tinkering goes on by activating or deactivating parts of our genome, without altering the actual code.

This can be compared to the Linux kernel, which at boot time discovers what CPU it is running on, and actually disables parts of its binary code in case (for example) it is running on a single CPU system. This goes beyond something like if(numcpus > 1), it is the actual nopping out of locking. Crucially, this nopping occurs in memory and not on the disk based image.

Similarly, as an embryo develops in the mother's womb, its DNA is edited substantially to reduce its growth rate, and the size of the placenta. In such a way, the competing interests of the father ('large strong children') and the mother ('survive pregnancy') are balanced. Such 'imprinting' can only happen within the mother, since the father's genome doesn't know anything about the size of the mother.

Recently, it is also becoming clear that the metabolic status of the parents influences the chances of long life, cancer and diabetes in their grandchildren. This also makes sense, as surviving in a food-poor climate may require a different metabolic strategy than in one where food is abundantly available.

Mechanisms behind epigenetics and imprinting are 'methylation', which attaches methyl groups to DNA to 'flip' their activation status, but also histone modification, which can curl up DNA so it is not activated.

Some of these DNA edits are heritable and passed on to children, other forms may only impact one animal. This field is still developing rapidly, and it may be that our DNA is much more dynamic than originally thought.

Dead code, bloat, comments ('junk dna')

The genome is littered with old copies of genes and experiments that went wrong somewhere in the recent past - say, the last half a million years. This code is there but inactive.
These are called the 'pseudo genes'.

Furthermore, 97% of your DNA is commented out. DNA is linear and read from start to end. The parts that should not be decoded are marked very clearly, much like C comments. The 3% that is used directly forms the so-called 'exons'. The comments that come in between are called 'introns'.

These comments are fascinating in their own right. Like C comments they have a start marker, like /*, and a stop marker, like */. But they have some more structure. Remember that DNA is like a tape - the comments need to be snipped out physically! The start of a comment is almost always indicated by the letters 'GT', which thus corresponds to /*, the end is signalled by 'AG', which is then like */.

However because of the snipping, some glue is needed to connect the code before the comment to the code after, which makes the comments more like html comments, which are longer: '<!--' signifies the start, '-->' the end.

So an actual stretch of DNA with exons and introns might look like this:

    ACTUAL CODE<!-- blah blah blah blah ---- blah -->ACTUAL CODE
         |       |          |             |        |       |
      exon 1   donor     intron 1      branch  acceptor  exon 2
         (start of comment)                (end of comment)

The start of the comment is clear, which is then followed by a lot of non-coding DNA. Somewhere very near the end of the comment there is a 'branch site', which indicates that the comment will end soon. Then some more comment follows, and then the actual terminator.

The actual cutting of the comments happens after the DNA has been transcribed into RNA and is performed by looping the comment and bringing the pieces of actual code close together. Then the RNA is cut at the 'branch site' near the end of the comment, after which the 'donor' (comment start) and 'acceptor' (comment end) are connected to each other.

Now, what are these comments good for? That discussion is part of a holy war that can rival the vi/emacs one.
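As a rough illustration of the comment-stripping analogy above, here is a minimal Python sketch. It assumes the intron positions are already known and only checks the 'GT' ... 'AG' markers mentioned in the text; the toy sequence and the `splice` helper are made up for this example, and the real splicing machinery (branch site, looping, and so on) is far more involved.

```python
def splice(sequence, introns):
    """Join the exons of `sequence`, dropping each (start, end) intron.

    `introns` holds half-open index ranges, sorted and non-overlapping.
    Each intron must start with the donor 'GT' and end with the acceptor 'AG'.
    """
    exons = []
    position = 0
    for start, end in introns:
        intron = sequence[start:end]
        if not (intron.startswith("GT") and intron.endswith("AG")):
            raise ValueError(f"intron at {start}:{end} lacks GT...AG markers")
        exons.append(sequence[position:start])  # code before the comment
        position = end                          # skip the comment itself
    exons.append(sequence[position:])           # code after the last comment
    return "".join(exons)


# Hypothetical toy sequence: exon 1 + intron 1 + exon 2.
exon1, intron1, exon2 = "ATGGCC", "GTAAGTCCTAACAG", "TTTTAA"
transcript = exon1 + intron1 + exon2
mature = splice(transcript, [(len(exon1), len(exon1) + len(intron1))])
print(mature)  # ATGGCCTTTTAA
```

Passing the intron coordinates in explicitly is a deliberate choice: simply searching for 'GT' and 'AG' would misfire, since those letter pairs also occur inside ordinary code.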
When comparing different species, we know that some introns show fewer code changes than the neighboring exons. This suggests that the comments are doing something important.

There are lots of possible explanations for the massive amount of non-coding DNA - one of the most appealing (to a coder) has to do with 'folding propensity'. DNA needs to be stored in a highly coiled form, but not all DNA codes lend themselves well to this.

This may remind you of RLL or MFM coding. On a hard disk, a bit is encoded by a polarity transition or the lack thereof. A naive encoding would encode a 0 as 'no transition' and 1 as 'a transition'.

Encoding 000000 is easy - just keep the magnetic phase unchanged for a few micrometers. However, when decoding, uncertainty creeps in - how many micrometers did we read? Does this correspond to 6 zeroes or 5? To prevent this problem, data is treated such that these long stretches of no transitions do not occur.

If we see 'no transition, no transition, transition, transition' on disk, we can be sure that this corresponds to '0011' - it is exceedingly unlikely that our reading process is so imprecise that this might correspond to '00011' or '00111'. So we need to insert spacers so as to prevent too few transitions. This is called 'Run Length Limiting' on magnetic media.

The thing to note is that sometimes, transitions need to be inserted to make sure that the data can be stored reliably. Introns may do much the same thing by making sure that the resulting code can be coiled properly.

However, this area of molecular biology is a minefield! Huge diatribes rage about variants with exciting names like 'introns early' or 'introns late', and massive words like 'folding propensity' and 'stem-loop potential'. I think it best to let this discussion rage on a bit.

A fascinating link of uncertain scientific value is http://post.queensu.ca/~forsdyke/introns.htm .

2013 Update: ten years on, the debate still hasn't settled!
It is very clear that 'junk dna' is a misnomer, but as to its immediate function, there is no consensus. Check out Fighting about ENCODE and junk for a discussion of where we stand.

fork() and fork bombs ('tumors')

Like with unix, cells are not 'spawned' - they are forked. All cells started out from your ovum, which has forked itself many times since. Like processes, both halves of the fork() are (mostly) identical to begin with, but they may from then on decide to do different things.

As with unix, great problems arise when cells keep on forking. They quickly exhaust resources, sometimes leading to death. This is called a tumor. The cell is riddled with 'ulimits' and 'watchdogs' to prevent this sort of thing from happening. The number of divisions is limited by Telomere shortening, for example.

A cell cannot clone unless very stringent conditions are met - a 'secure by default' configuration. It is only when these safeguards fail that tumors can grow. Like with computer security, it is hard to strike a balance between security ('no cells can divide') and usability.

Compare this to the well known Halting Problem, first described by the founder of Computer Science, Alan Turing. Perhaps it is as impossible to predict if a program will ever finish as it is to create a functional genome that cannot get cancer?

Mirroring, failover

Each DNA helix is redundant in itself - you can see the genome as a twisted ladder whereby each rung contains two bases - hence the word 'basepair'. If one of these bases is missing, it can be derived from the one on the other side. T always binds to A, C always to G.

So, we can state that the genome is mirrored within the helix. 'RAID-1' so to speak.

Furthermore, there are two copies of each chromosome present - one from each parent, with the notable exception of the Y chromosome, which is only present in males. The actual details are complicated - but most genes are thus present twice.
In case one is broken or uselessly mutated, the other independent copy is still there. This is what we would normally call 'failover'.

Cluttered APIs, dependency hell

As proteins interact in the cell, they rely on each other's characteristics. It has just been shown that proteins that interact with a lot of other proteins cannot evolve, or at least, only do so at a very slow rate. See Nature, 28 June 2001, and M. Kimura, T. Ohta, Science, 26 April 2002.

They propose that this is because of great internal dependencies which inhibit the changing of the 'contract' of the protein. It is also noted that evolution does take place, but very slowly as both parts of the dependency need to evolve in a compatible way at the same time.

Viruses, worms

Somebody recently proposed in a discussion that it would be really cool to hack the genome and compromise it so as to insert code that would copy itself to other genomes, using the host-body as its vehicle. 'Just like the nimda worm!'

He shortly thereafter realised that this is exactly what biological viruses have been doing for millions of years. And they are exceedingly good at it.

A lot of these viruses have become a fixed part of our genome and hitch a ride with all of us. To do so, they have to hide from the virus scanner which tries to detect foreign code and prevent it from getting into the DNA.

The Central Dogma: .c -> .o -> a.out/.exe

When scientists were still discovering the basics of genetics they were faced with lots of different chemicals but the correlation was unclear. When it became clear what comes from what it was hailed as a great triumph and called 'The Central Dogma'.

This dogma tells us that DNA is used to make RNA and that RNA is used to make proteins, which is like saying that from a .c file comes a .o object file, which can be linked into an executable (a.out/exe). It also tells us that this is the only order in which information flows.

Now, the Central Dogma has recently been tarnished somewhat.
Like any billion year old coding project, a lot of hacking has been going on, and sometimes information flows the other way. Sometimes RNA patches the DNA and at other times, the DNA is modified by proteins created earlier. But generally, the dependencies are clear, so the Central Dogma remains important.

Binary patching aka 'Gene therapy'

We can fiddle easily enough with DNA. There are companies to which you can send an ASCII file with DNA characters, and they will synthesise the corresponding 'output' for you. We can also splice DNA into developing animals and plants.

It is far harder to 'patch the running executable', as any programmer can attest. It is just like that with the genome. To change a running copy ('a human'), you need to edit each and every relevant copy of the gene you want to patch.

For many years, medical science has tried to patch people with SCID, or 'Severe Combined Immunodeficiency', which is a very nasty disease which in effect disables the immune system - leading to very ill patients. It has been clear for quite a while now which letters in the DNA need to be fixed in order to cure these people.

Many attempts were made to patch running people, using viruses that insert new DNA into living organisms, but this proved to be very hard. The genome is guarded far too well for such a simple approach to work - cells guard their code better than Microsoft!

However, recently the right virus was found which was able to breach the protection of the genome and fix the broken characters, leading to apparently healthy people.

Bug Regression

When fixing a bug in a computer program, we often introduce new bugs in the course of doing so. The genome is rife with this sort of thing. A lot of African Americans are resistant to malaria but instead risk sickle cell anemia:

In tropical regions of the world where the parasite-borne disease malaria is prevalent, people with a single copy of a particular genetic mutation have a survival advantage. ...
While inheriting one copy of the mutation confers a benefit, inheriting two copies is a tragedy. Children born with two copies of the genetic mutation have sickle cell anemia, a painful disease that affects the red blood cells.
http://www.fda.gov/fdac/features/496_sick.html

There are quite a few examples of this happening. See also the wonderful book 'Genome' by Matt Ridley.

Reed-Solomon codes: 'Forward Error Correction'

Like computer storage, DNA (and its intermediate 'RNA') can get corrupted. To protect against common 'single bit errors', the encoding from individual DNA letters to proteins is degenerate.

There are 4 RNA characters, U, C, G and A - in other words, each character is 2 bits long. Three characters correspond to an amino acid. 6 bits could conceivably map to 64 amino acids, yet there are only 20 in use.

For example, UCU, UCC, UCA and UCG all encode for 'Serine', whereas only UGG maps to 'Tryptophan'. Now, it turns out that some likely 'typos' (UCU -> UCC) in the encoding lead to an identical amino acid being expressed.

For more about this fascinating phenomenon, read 'Metamagical Themas' by Douglas Hofstadter.

Holy Code: /* you are not expected to understand this */

Some code is sacred. We may not remember who wrote it, or why - we just know that it works. The guy who thought it up may have left the company already. Such code is not to be tinkered with.

DNA knows the concept of the 'molecular clock'. Some parts of the genome are actively changing and some parts are sacrosanct. A good example of the latter are the Histone genes H3 and H4. These genes are fundamental to the actual storage of the genome and are thus of paramount importance. Any failure in this code rapidly leads to a non-functioning organism.

So it is to be expected that this code isn't tinkered with and that turns out to be the case. The H3 and H4 genes have a *zero* effective mutation rate in humans. But it goes far beyond that.
You share almost the exact same code with anything from chickens to grass or moulds.

RATES OF NUCLEOTIDE SUBSTITUTION PER SITE PER 1000 MILLION YEARS BETWEEN VARIOUS HUMAN AND RODENT PROTEIN-CODING GENES WITH DIVERGENCE SET AT 80 MILLION YEARS BASED ON FOSSIL EVIDENCE:

    gene               Number of codons   Effective rate
    histone 3          135                0.00
    histone 4          101                0.00
    insulin            51                 0.13
    gamma interferon   136                2.79

http://www.staffs.ac.uk/schools/sciences/biology/Handbooks/evolseqphylo.htm

Now, it does appear that there are two ways the genome can make sure that code does not mutate. The first way is described above: use amino acids whose encoding is highly degenerate, making sure that those typos that DO occur result in the same output.

Furthermore, genes can be copied earlier or later in the cell's reproductive process, leading to more or less favourable copying conditions. Many more such conditions apply.

It appears as if H3 and H4 were authored very carefully as they do have a lot of 'synonymous changes', which through the clever techniques described above do not lead to changes in the output.

Framing errors: start and stop bits

    ...0 0000 0001 0000 0010 0000 0011 0...

This clearly describes the 8 bit values 1, 2 and 3. The spaces I added make it clear where a byte starts and stops. Many serial devices employ stop and start bits to encode where you start reading. If we shift this sequence slightly:

    ...00 0000 0010 0000 0100 0000 0110 ...

It suddenly reads 2, 4, 6! To prevent this from happening in DNA there are elaborate signals that tell the cell where to start reading.

Interestingly, there are pieces of genome that can be read from multiple starting points, and produce useful (but different) results either way. That is what I call a cool hack! Each way a strand of DNA can be read is called an Open Reading Frame and there are generally 6, 3 each way.

Massive multiprocessing: each cell is a universe

Now, DNA is not like a computer programming language. It really isn't.
But there are some whopping analogies. We can view each cell as a CPU, running its own kernel. Each cell has a copy of the entire kernel, but chooses to activate only the relevant parts. Which modules or drivers it loads, so to speak.

If a cell needs to do something ('call a function'), it whips up the right piece of the genome and transcribes it into RNA. The RNA is then translated into a sequence of amino acids, which together make up the protein the DNA coded for. Now for the really cool bit :-)

This protein is tagged with a shipping address. This is a marker consisting of several amino acids which tell the rest of the cell where this protein needs to go. There is machinery which acts on these instructions and delivers the protein, potentially to the outside of the cell.

The delivery instruction is then stripped off and several post processing steps may be performed, possibly activating the protein - which is good, because you may not want to transport an active protein through places where it should not do work.

(Figure: A Cell)

Self hosting & bootstrapping

If we were to destroy all existing C compilers on the planet and leave only the code for one, we would be in great trouble. Yes, we have the C code to a C compiler, but we need a C compiler to compile it!

In actual fact, this was solved by not writing the first C compiler in C (duh), but in a language that was available already: B. See here for details about 'bootstrapping'.

The same holds for the genome. To create a new 'binary' of a specimen, a *living* copy is required. The genome needs an elaborate toolchain in order to deliver a living thing. The code itself is impotent. This toolchain is commonly called 'your parents'.

Update: Recently, it has become possible to 'bootstrap' life with very little actually living source material. The dictum 'every cell comes from a cell' is becoming less true. See for example Mycoplasma laboratorium.
It appears that RNA, which is an intermediate code between DNA and a protein, may have been the 'B' for DNA. Which raises the question of where RNA came from. It is very interesting to note that extra-terrestrial objects often contain amino acids! See http://www.google.com/search?hl=en&q=amino+acids+meteorites

The Makefile

Organisms typically start out as a single cell, which as said before contains two entire copies of the genome. The big tarfile so to speak, with all files extracted, ready to go. Now what?

Enter the Homeobox genes. Cells must be copied and assigned a purpose. The Homeobox genes start out by laying a 'top to bottom' dependency which reads 'start with the head'. In order to make this happen, a chemical gradient is created by which cells can sense where they are, and decide if they need to do things useful for building a head, or for building a primordial notochord.

Only discovered in 1983, the Homeobox genes are a very exciting area of research right now. It is interesting to note that like a Makefile, 'HOX' genes only trigger things in other genes and don't materially build things themselves.

The homeobox 'syntax' appears to be very 'holy' in the sense described above. Here is what happens if you copy-paste the 'legs selector' part of a mouse HOX gene into the fruitfly Homeobox:

'In fact, when the mouse Hox-B6 gene is inserted in Drosophila, it can substitute for Antennapedia and produce legs in place of antennae'
http://www.ultranet.com/~jkimball/BiologyPages/H/HomeoboxGenes.html

The fruitfly and human genomes did not branch just millions of years ago but hundreds of millions of years ago. And you can copy-paste parts ('Selectors' in the genetic language) of the Makefile and it still clicks. Please note that the 'build a leg' routine in a fruitfly is of course radically different from that in a mouse, but the 'selector' correctly triggers the right instructions.
Further reading

Genome, by Matt Ridley
An amazing account of the effect each chromosome has on our lives. Very readable yet strict in not 'dumbing down' the theory. Contains an impressive set of references. Source of many of the more impressive examples found on this page. And to help Matt along in the quest he clearly sets out in his book, I would like to state quite clearly: Genes are not there to cause diseases.

Human Molecular Genetics, second edition, by Tom Strachan and Andrew P. Read
Neatly fills the gap between 'primary literature' (i.e., peer reviewed academic magazines and papers) and introductory textbooks. I'm literally dragging myself through this book, constantly looking things up in order to understand everything. If you really want to know the details about introns, exons, RNA in all its variants, how genes cause and prevent diseases, this is the book.

The Selfish Gene, by Richard Dawkins
Richard Dawkins is the Richard Stevens of evolution theory. Both have contributed practical work but are most famous for their crystal clear expositions of existing theory, opening up the world they describe to an audience of millions. In this book, Dawkins explains evolution from a 'gene' standpoint rather than from a 'species' standpoint. It turns out to make a lot more sense this way and helps understand how genes power you, and not the other way around. It is not that genes help you do what you want to do, you ARE the genes. Also explains a lot about how genes work along the way.

The Blind Watchmaker: Why the Evidence of Evolution Reveals a Universe Without Design, by Richard Dawkins
Again a book by Dawkins. More about evolution than about genes but clearly explains how evolution can be responsible for the intricate design found in many living things. Again very readable and fascinating on every level.

Metamagical Themas, by Douglas Hofstadter
This is an 'idea' book. It is filled to the brim with ideas, they simply ooze out of the pages.
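Many of these ideas are about information theory, genetics, life, intelligence, music, mathematics and people. Clearly not a genetics textbook but it has been influential in imbuing enthusiasm for all things genetic in many people. Can often be found dirt cheap in second hand bookstores. Recommended.
Category: Science | 1384 reads | 0 comments
Do we need axioms of comparison?
Popularity 1 accsys 2014-10-20 06:50
Axioms of comparison are needed. Jiang Yongjiang (姜咏江). I propose the following axioms of comparison because I have found that many claims about the time properties of programs are proved from the structure of the program itself. That violates the principles of comparison and therefore yields no result. Right or wrong? I hope fellow bloggers will discuss it. [Axioms of comparison] Every comparison of things obeys the following: 1. Nothing needs to be compared with itself. 2. A comparison can only be made through a definite attribute of the things compared. 3. Attributes of different kinds cannot be compared. 4. Every comparison requires a standard unit 1, called the ruler. 5. The result of a comparison is expressed as a rank or a number. These five statements are called axioms because they cannot be proved by logic. Here are a few practical examples of comparison. (1) The masses of objects can be compared through the attribute of weight, not through other attributes such as volume or material. (2) People can be compared by height, by weight, and so on, but one person's height cannot be compared with another person's weight. (3) To judge how well a program is designed, the number of statements can be used to compare how concise programs are, and running time can be used to compare how well they handle a problem. But the structural form of a program cannot be used to compare how well programs handle a problem, because there is no standard unit 1 for the comparison. 2014-10-20
Category: Research discussion | 3074 reads | 5 comments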
Why is vision so hard to understand: too clever for our own good?
lizhaoping 2014-9-5 05:16
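This post is my own translation of my article on the Oxford University Press blog; the English original is at http://blog.oup.com/2014/08/sight-brain-cognitive-neuroscience/ . The title of the translation is by Liang Ming (梁鸣), a graduate student at Tsinghua University; the text below is my own translation.

About half a century ago, a professor at MIT set a summer research project for undergraduates: write a computer program that recognizes the objects in an input image. Why not? Isn't object recognition just the output of an algorithm applied to the input image data? It seemed a perfectly good exercise for those bright students. Decades later, the goal of that summer project still cannot be met well, but it gave birth to the worldwide field of computer vision.

What we call being smart includes the kind of intelligence needed to do advanced mathematics or write complex computer programs. Astonishingly, that intelligence is not enough to write a program that recognizes the object in, say, the following image.

(Figure 5.51A from the book "Understanding Vision: Theory, Models, and Data")

Can you write a program that recognizes the apple from the pixel data of this image? A preschool child (with her eyes as the camera and her brain as the processor) recognizes the apple effortlessly, without ever having studied advanced mathematics or programming. One of the difficulties is a chicken-and-egg problem: finding the pixels that belong to the apple helps us recognize the apple, while recognizing the apple helps us find the pixels that belong to it.

More recently came another unexpected and astonishing discovery about vision: we are almost blind to most of the objects in front of our eyes. "What?! But I see everything in front of me perfectly clearly," you may protest. Well then, can you see at once the difference between the two pictures below?

(Figure 1.6 from the book "Understanding Vision: Theory, Models, and Data")

Most people cannot see the (large) difference between the two pictures within a few seconds. Why? The intuition our brain gives us is "I see everything clearly", and that intuition goes hand in hand with our ignorance of the fact that we are almost blind. So we cannot see our own not-seeing! How do we get by in this complex world when we are so blind with our eyes open? That is a long story without an ending yet, and its cast includes the brain's mechanisms of attention.

Being "smart" includes being able to reason consciously from familiar rules and past experience. But what if most of the brain's visual mechanisms are subconscious and very different from our conscious experience? Indeed, most of the brain areas that control vision lie farthest from the frontal lobe, which governs conscious thought. No wonder the two visual phenomena above go so much against our intuition! No wonder that, although we have been exploring vision in the brain for centuries, only in the last 20 years did we discover that we are almost blind.

Another astonishing visual phenomenon was discovered only six years ago: our eyes or attention are automatically attracted by visual signals that we cannot see. In everyday experience, the objects in the visual field that strongly and automatically attract our attention all look very distinctive. For example, a single spot of red among a mass of green makes us look at it involuntarily, unless we are color-blind. The lesson drawn from this experience, that "only objects that are visibly distinctive attract our attention strongly and automatically", is wrong again! In the figure below, the image we see is a superposition of two images; with the stereo glasses used for 3D films, one image can be shown to the left eye while the other is shown to the right eye.

(Figure 5.9 from the book "Understanding Vision: Theory, Models, and Data")

Our percept is as if the image we see (which contains only the bars, not the arrows) were shown to both eyes together. The one bar with a unique orientation (tilted to the right) looks the most distinctive. The other special bar, the only one shown to the left eye, looks no different from most of the bars; that is, we cannot see what makes it distinctive: the eye it is shown to. Yet this bar with the unique eye of origin often attracts attention more strongly than the bar with the unique orientation, so that our first glance tends to go to it rather than to the uniquely oriented bar. The tendency persists even when we are asked to find the uniquely oriented bar quickly and ignore the others. It is as if the bar with the unique eye of origin had a "spot of red among the green" distinctiveness that pulls our attention involuntarily, except that our conscious ability to recognize things is "color-blind" to this kind of distinctiveness. Many vision scientists cannot believe this phenomenon until they experience it themselves.

For our "smart", conscious brain, are these counter-intuitive visual phenomena too strange to comprehend? Is studying vision as hard as it would be for Earthlings to understand Martians? Landing on Mars would help those Earthlings reach their goal. But is our conscious brain too "smart", too biased, to "dumb itself down" enough to understand those subconscious visual mechanisms?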
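Is it because we are visual animals that are too "smart", with too many conscious preconceptions, that we cannot understand our own visual abilities? (At least if we were to study how electric fish use their electric sense to perceive the objects around them, we would have no preconceptions.) Recognizing our difficulty is the first step towards overcoming it; only then can we become truly smart, instead of drowning in our own half-knowledge.

Related material: see the textbook "Understanding Vision: Theory, Models, and Data", Oxford University Press, 2014.
3796 reads | 0 comments
In a computer program, there are two ways to carry out a repeated task: recursion and iteration (loops).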
charlesqwu 2013-10-31 01:37
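In a computer program, there are two ways to carry out a repeated task: recursion and iteration (loops).

An example of recursion:
Once upon a time there was a mountain; on the mountain there was a temple; in the temple an old monk was telling a young monk a story, and the story went: "Once upon a time there was a mountain; on the mountain there was a temple; in the temple an old monk was telling a young monk a story, and the story went: "Once upon a time there was a mountain; on the mountain there was a temple; in the temple an old monk was telling a young monk a story, and the story went: " ......

An example of a loop:
There are 99 pots of soup on the stove; I accidentally drank one, and 98 pots of soup are left on the stove;
There are 98 pots of soup on the stove; I accidentally drank one, and 97 pots of soup are left on the stove;
There are 97 pots of soup on the stove; I accidentally drank one, and 96 pots of soup are left on the stove;
......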
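The two examples above can be written as a short Python sketch. The function names `story` and `soup`, the shortened story text, and the `depth` and `stop` parameters are assumptions added for this illustration; in particular, the base case in `story` is something the endless tale itself does not have.

```python
def story(depth):
    """Recursion: the story contains a smaller copy of itself."""
    if depth == 0:
        # Base case added for the sketch; the original tale never stops.
        return "......"
    opening = "Once upon a time there was a mountain ... and the story went: "
    return opening + '"' + story(depth - 1) + '"'


def soup(pots, stop):
    """Iteration: the same step repeated while a counter changes."""
    lines = []
    while pots > stop:
        lines.append(
            f"There are {pots} pots of soup on the stove; "
            f"I accidentally drank one, and {pots - 1} pots of soup are left."
        )
        pots -= 1
    return lines


print(story(2))
for line in soup(99, 96):
    print(line)
```

Calling `soup(99, 96)` produces the three lines quoted in the post, while `story(2)` nests the opening twice before giving up.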
Category: Web | 5949 reads | 0 comments
Structure and Interpretation of Computer Programs (《计算机程序的构造和解释》) (1)
businessman 2013-4-4 11:25
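April 3, 2013
________________________________________________________________________________________
Preface: The palest ink is better than the best memory! Reading often brings insights, and these insights are usually worth more than the content of the book itself, because they are your own thinking as you read and represent your personal understanding of the material. Writing them down makes them easy to refer to later.

April 3, 2013
________________________________________________________________________________________
(1) Abelson et al., Structure and Interpretation of Computer Programs, 2nd edition (《计算机程序的构造和解释:原书第2版》), translated by Qiu Zongyan (裘宗燕), China Machine Press (机械工业出版社), 2004.
(2) The book's website has the source code and other teaching materials: www-mitpress.mit.edu/sicp/

May 7, 2013 (A word from the publisher)
________________________________________________________________________________________
Reposted from: [ http://book.51cto.com/art/201304/390027.htm ]

Since the Renaissance, a long-standing scientific spirit and gradually established academic norms have given Western countries a commanding lead in every field of natural science; the same tradition has let the United States produce one master after another and lead the field throughout the sixty-odd years of information technology. As commercialization has advanced, American industry and education have become ever more closely linked, and many leading figures in computing stand at the forefront of both research and teaching. The classic scientific works that have resulted not only map out the scope of research but also reveal how the scholarship evolved; they follow academic norms while keeping each scholar's individuality, and their value does not fade with the years.

In recent years, driven by the global tide of informatization, China's computer industry has grown rapidly and its need for trained professionals has become ever more pressing. For computer education and publishing this is both an opportunity and a challenge, and the development of professional textbooks carries great weight in educational strategy. Given the short history of information technology in China, the classic textbooks that the United States and other developed countries have accumulated over decades of computer science still have much to offer. Introducing a set of outstanding foreign computer textbooks will therefore actively promote computer education in China, and is a necessary step towards aligning with the world and building truly world-class universities.

Huazhang (华章), a company of China Machine Press, realized early on that "publishing must serve education". Since 1998 we have focused on selecting and translating outstanding foreign textbooks. After years of sustained effort we have built good working relationships with world-famous publishers such as Pearson, McGraw-Hill, Elsevier, MIT, John Wiley & Sons, and Cengage, and from their hundreds of existing textbooks we have selected a set of classic works by Andrew S. Tanenbaum, Bjarne Stroustrup, Brian W. Kernighan, Dennis Ritchie, Jim Gray, Alfred V. Aho, John E. Hopcroft, Jeffrey D. Ullman, Abraham Silberschatz, William Stallings, Donald E. Knuth, John L. Hennessy, Larry L.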
Peterson 等大师名家的一批经典作品,以“计算机科学丛书”为总称出版,供读者学习、研究及珍藏。大理石纹理的封面,也正体现了这套丛书的品位和格调。 “计算机科学丛书”的出版工作得到了国内外学者的鼎力襄助,国内的专家不仅提供了中肯的选题指导,还不辞劳苦地担任了翻译和审校的工作;而原书的作者也相当关注其作品在中国的传播,有的还专程为其书的中译本作序。迄今,“计算机科学丛书”已经出版了近两百个品种,这些书籍在读者中树立了良好的口碑,并被许多高校采用为正式教材和参考书籍。其影印版“经典原版书库”作为姊妹篇也被越来越多实施双语教学的学校所采用。 权威的作者、经典的教材、一流的译者、严格的审校、精细的编辑,这些因素使我们的图书有了质量的保证。随着计算机科学与技术专业学科建设的不断完善和教材改革的逐渐深化,教育界对国外计算机教材的需求和应用都将步入一个新的阶段,我们的目标是尽善尽美,而反馈的意见正是我们达到这一终极目标的重要帮助。华章公司欢迎老师和读者对我们的工作提出建议或给予指正,我们的联系方法如下: 华章网站:www?hzbook?com 电子邮件: hzjsj@hzbook ?com 联系电话:(010)88379604 联系地址:北京市西城区百万庄南街1号 华章科技图书出版中心 2013 年 5 月 7 日 ________________________________________________________________________________________ (1)每一位严肃的计算机科学家都应该阅读这本书。由于本书清晰、简洁和富于才智,我们强烈推荐本书,它适合所有希望深刻理解计算机科学的人们。—— Mitchell Wand 《美国科学家》杂志 (2)本书 1984 年出版,成型于美国麻省理工学院(MIT)多年使用的一本教材,1996 年修订为第 2 版。在过去的二十多年里,本书对于计算机科学的教育计划产生了深刻的影响。 (3)这本书再次印证了很多优秀的教材源于教学的原始材料,并再次证明好东西经常是在已有的基础上积累而出的,并不是一躇而就的。要善于利用自己手上已有的资料或材料。 (4) 带着崇敬和赞美,将本书献给活在计算机里的神灵。 “我认为,在计算机科学中保持计算中的趣味性是特别重要的事情。这一学科在起步时饱含着趣味性。当然,那些付钱的客户们时常觉得受了骗。一段时间之后,我们开始严肃地看待他们的抱怨。我们开始感觉到,自己真的像是要负起成功地、无差错地、完美地使用这些机器的责任。我不认为我们可以做到这些。我认为我们的责任是去拓展这一领域,将其发展到新的方向,并在自己的家里保持趣味性。我希望计算机科学的领域绝不要丧失其趣味性意识。最重要的是,我希望我们不要变成传道士,不要认为你是兜售圣经的人,世界上这种人已经太多了。你所知道的有关计算的东西,其他人都能学到。绝不要认为似乎成功计算的钥匙就掌握在你的手里。你所掌握的,也是我认为并希望的,也就是智慧:那种看到这一机器比你第一次站在它面前时能做得更多的能力,这样你才能将它向前推进。” Alan J. 
Perlis(1922 年 4 月 1 日 ~ 1990 年 1 月 7 日) 2013 年 5 月 7 日 (序) ________________________________________________________________________________________ 教育者、将军、减肥专家、心理学家和父母做规划(program),而军人、学生和另一些社会阶层则被人规划(as programmed)。解决大规模问题需要经过一系列规划,其中的大部分东西只有在工作进程中才能做出来,这些规划中充满着与手头问题的特殊性相关的情况。如果想要把做规划这件事情本身作为一种智力活动来欣赏,你就必须转到计算机的程序设计(programming),你需要读或者写计算机程序——而且要大量地做。有关这些程序具体是关于什么的、服务于哪类应用等等的情况常常并不重要,重要的是它们的性能如何,在用于构造更大的程序时能否与其他程序平滑衔接。程序员们必须同时追求具体部分的完美和汇合的适宜性。在这部书里使用“程序设计”一词时,所关注的是程序的创建、执行和研究,这些程序是用一种 Lisp 方言(Scheme)书写的,为了在数字计算机上执行。采用 Lisp 并没有对我们可以编程的范围施以任何约束或者限制,而只不过确定了程序描述的记法形式。 Lisp 是一个幸存者,已经使用了四分之一个世纪。在现存的活语言里,只有 Fortran 比它的寿命更长些。这两种语言都支持着一些重要领域中的程序设计需要,Fortan 用于科学与工程计算,Lisp 用于人工智能。这两个领域现在仍然很重要,它们的程序员都如此倾心于这两种语言,因此,Lisp 和 Fortran 都还可能继续生存至少四分之一个世纪。 Alan J. Perlis 纽黑文,康涅狄格 2013 年 5 月 7 日 (第 2 版前言) ________________________________________________________________________________________ 软件很可能确实与其他任何东西都不同,它的本意就是被抛弃:这一观点的全部就是总将它看作一个肥皂泡吗?—— Alan J. Perlis 自 1980 年以来,本书的材料就一直在 MIT 作为计算机科学学科入门课程的基础。在本书第 1 版出版之前,我们已经用这一材料教了 4 年课,而到这个第 2 版出版,时间又过去了 12 年。我们非常高兴地看到这一工作被广泛接受,并被结合到其他一些教材中。我们已经看到自己的学生掌握了本书中的思想和程序,并将它们构筑到新的计算机系统或者语言的核心里。这就在文字上实现了一个古犹太教法典的双关语,我们的学生已经变成了我们的创造者。我们非常幸运能有如此有能力的学生和如此有建树的创造者。 …… 这一版本中强调了几个新问题,其中最重要的是有关在不同的途径中,计算模型里对于时间的处理所起的中心作用:带有状态的对象、并发程序设计、函数式程序设计、惰性求值和非确定性程序设计。 2013 年 5 月 7 日 (第 1 版前言) ________________________________________________________________________________________ 一台计算机就像是一把小提琴。你可以想象一个新手试了一个音符并丢掉了它。后来他说,听起来真难听。我们已经从大众和我们的大部分计算机科学家那里反复听到这种说法。他们说,计算机程序对个别具体用途而言确实是好东西,但它们未缺乏弹性。一把小提琴或者一台打字机也同样缺乏弹性,那是你学会了如何去使用它们之前。—— Marvin Minsky,“为什么说程序设计很容易成为一种媒介,用于表述理解浮浅、草率而就的思想” …… 我们所设计的这门计算机科学导引课程反映了两方面的主要考虑。首先,我们希望建立起一种看法:一个计算机语言并不仅仅是让计算机去执行操作的一种方式,更重要的,它是一种表述有关方法学的思想的新颖的形式化媒介。因此,程序必须写得能够供人们阅读,偶尔地去供计算机执行。其次,我们相信,在这一层次的课程里,最基本的材料并不是特定程序设计语言的方法,不是有效计算某种功能的巧妙算法,也不是算法的数学分析或者计算的本质基础,而是一些能够用于控制大型软件系统的智力复杂性的技术。 我们的目标是,使完成了这一科目的学生能对程序的风格要素和审美观有一种很好的感觉。他们应该掌握了控制大型系统中的复杂性的主要技术。他们应该能够去读 50 
页长的程序,只要该程序是以一种值得模仿的形式写出来的。他们应该知道在什么时候哪些东西不需要去读,哪些东西不需要去理解。他们应该很有把握地去修改一个程序,同时又能保持原来作者的精神和风格。 …… 设计这门课程的基础是我们的一种信念,“计算机科学”并不是一种科学,而且其重要性也与计算机本身并无太大关系。计算机革命是有关我们如何去思考的方式,以及我们如何去表达自己的思考的一个革命。在这个变化里最基本的东西,就是出现了这样一种或许最好是称为过程性认识论的现象——这就是如何从一种命令式的观点去研究知识的结构,这一观点是与经典数学领域中所采用的更具说明性的观点完全不同的。数学为精确处理“是什么”提供了一种框架,而计算则为精确处理“怎样做”的概念提供了一种框架。 在教授这里的材料时,我们采用的是 Lisp 语言的一种方言。我们绝没有形式化地教授这一语言,因为完全不必那样做。我们只是使用它,学生可以在几天之内就学会它。这也是类 Lisp 语言的重要优点:它们只有不多几种构造复合表达式的方式,几乎没有语法结构。所有的形式化性质都可以在一个小时里讲完,就像下象棋的规则似的。在很短的时间之后,我们就可以不再去管语言的语法细节(因为这里根本就没有),而进入真正的问题——弄清楚我们需要去计算什么,怎么将问题分解为一组可以控制的部分,如何对这样的部分开展工作。Lisp 的另一优势在于,与我们所知的任何其他语言相比,它可以支持(但并不是强制性的)更多通过高阶函数抓住公共的使用模式,可以用赋值和数据操作去模拟局部状态,可以利用流和延时求值连接起一个程序里的各个部分,可以很容易地实现嵌入性语言。所有这些都融合在一个交互式的环境里,带有对递增式程序设计、构造、测试和排除错误的绝佳支持功能。我们要感谢一代又一代的 Lisp 大师,从 John McCarthy 开始,是他们铸造起了这样一个具有空前威力的如此优美的工具。 作为我们所用的 Lisp 方言,Scheme 试图将 Lisp 和 Algol 的威力和优雅集成到一起。我们从 Lisp 那里取来了元语言的威力,它来自简单的语法形式,程序与数据对象的统一表示,以及带有废料收集的堆分配数据。我们从 Algol 那里取来了词法作用域和块结构,这是当年参加 Algol 委员会的那些程序设计语言先驱者们的礼物。我们想特别提出 John Reynolds 和 Peter Landin,为了他们对丘奇的 lambda 演算与程序设计语言的结构之间关系的真知灼见。我们也认识到应该感谢那些数学家们,他们在计算机出现之前,就已经在这一领域中探索了许多年。这些先驱者包括丘奇(Alonzo Church)、罗塞尔(Barkley Rosser)、克里尼(Stephen Kleene)和库里(Haskell Curry)。 2013 年 5 月 7 日 ________________________________________________________________________________________ (1)最近想同时看十来本书,但大都困于前言或序言这样的总结性材料中,因为如果找不到网上已经存在的文字,则我需要把其中绝大部分给我带来思想启发以及历史客观事实和专家评论等文字抄录到科学网上。这耗费了我大量的时间,但我还是要坚持这一做法,因为或许当你看完厚厚的一本书之后,其实你最终会发现你只需要一句话就能够概括那本书的所有内容。而显然困住我的这些部分正是具备这些性质的内容。 (2)正所谓英雄所见略同,这些书中有很多意见是一致的。感觉还是蛮有收获的,至少自己的思维有所拓展,心态有所宁静。希望这些能够鼓励和激励我继续坚持下去,……
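The first-edition preface quoted above names three techniques: higher-order functions that capture common patterns, local state modelled with assignment, and streams with delayed evaluation. The sketch below illustrates each in Python rather than in the book's Scheme, so it is an analogy and not code from the book. The names `summation`, `make_counter`, `integers_from`, and `take` are chosen here for illustration.

```python
from itertools import islice


def summation(term, a, next_value, b):
    """Higher-order function: add term(x) for x = a, next_value(a), ... up to b."""
    total = 0
    while a <= b:
        total += term(a)
        a = next_value(a)
    return total


def make_counter():
    """Local state through assignment: each counter keeps its own private count."""
    count = 0

    def increment():
        nonlocal count
        count += 1
        return count

    return increment


def integers_from(n):
    """Delayed evaluation: a generator yields the next integer only on demand."""
    while True:
        yield n
        n += 1


def take(stream, k):
    """Force only the first k elements of a (possibly infinite) stream."""
    return list(islice(stream, k))


if __name__ == "__main__":
    # Sum of cubes 1^3 + ... + 10^3 with the general summation pattern.
    print(summation(lambda x: x ** 3, 1, lambda x: x + 1, 10))
    counter = make_counter()
    print(counter(), counter())
    print(take(integers_from(5), 3))
```

Passing a different `term` or `next_value` reuses the same summation pattern for other series, which is the kind of reuse the preface attributes to higher-order functions.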
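Category: Study notes | 1758 views | 0 comments
Some geochemistry software and accompanying lab exercises
Popularity 1 wenrongc 2013-1-6 14:55
In spring 2012, as teaching assistant for the metamorphic and igneous petrology courses, I rewrote the lab manual for igneous petrology under the professor's guidance. The rewrite substantially updated the use of computer programs in igneous petrology: essentially every lab introduces one petrology or geochemistry program and comes with exercises. The exercises are fairly elementary, but they may serve undergraduates as a spark of interest and a starting point for further study. I am sharing the lab manual here; please do point out any shortcomings or errors. If you repost it, please credit the source. Thank you!

The software covered:
(1) GCDkit (Lab 8). Geochemical plotting software. Free Windows program; download: http://www.gcdkit.org/ . Used to practice basic geochemical plotting. LAB 8 TEXTURAL AND GEOCHEMICAL IGNEOUS CLASSIFICATIONS.pdf
(2) PhasePlot (Lab 9). Uses intuitive pie charts to show the mineral and geochemical composition at phase equilibrium for magmas of different compositions under different temperature and pressure conditions. Free Mac OS program; requires the Lion release. Download: https://itunes.apple.com/us/app/phaseplot/id469767419?mt=12 . In the lab this software is used to model partial melting of the mantle and the origin of magma. LAB 9 Ultramafic rocks, cumulates and melt sources.pdf
(3) MELTS (Lab 10). Thermodynamic modeling software for geochemistry, also used to model the equilibrium phase assemblage of melt and minerals under different magmatic processes. Free for Mac OS and Linux; download: http://melts.ofm-research.org/ . A basic web version also runs directly in the browser: http://melts.ofm-research.org/applet.html . This software is used to practice fractional crystallization of magma. LAB 10 MAFIC-FELSIC PLUTONIC ROCKS.pdf
(4) Excel (Lab 11). A simple Excel spreadsheet is used to practice magma mixing. LAB 11 MAFIC-FELSIC VOLCANIC ROCKS.pdf

Other:
(5) Some online databases from which practice data can be downloaded: http://earthref.org/GERM/ http://www.navdat.org/
There is other geochemical software as well, such as Perple_X (http://www.perplex.ethz.ch/), used to construct P-T grid diagrams. I am still learning it, so it is not in the lab manual. I would welcome exchanges with anyone who shares the interest!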
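13585 views | 1 comment
On "Can a 'pure' computer program be patented after all?"
Popularity 1 liwei999 2012-10-1 18:22
On "Can a 'pure' computer program be patented after all?"
Author: mirror (*) Date: 10/01/2012 05:06:29

"Can a 'pure' computer program be patented after all?" does indeed contain some puzzling parts. That post spends a great deal of space describing this state of affairs.

Quote: Is dividing computer programs into two types, results that "solve a technical problem" and "rules and methods of purely human mental activity", and excluding the latter from patent protection, a sound approach from a logical or technical standpoint?

The author of the post thinks this treatment is inappropriate, and also holds:

Quote: Case: a Chinese character encoding method and a computer Chinese character input method. The case states that a pure Chinese character encoding method cannot obtain patent protection. But "if the Chinese character encoding method is combined with the specific keyboard with which the encoding method can be used", it can constitute a patentable invention that solves a specific technical problem. I suspect this case is a self-deceiving explanation that was devised, for the sake of consistency, from the successful case of the notorious "Wubi Zixing patent".

As a first approximation, the patent system protects the interests of the inventors (owners) of technical inventions. As a second approximation, one must distinguish two classes of technology: technology that "solves a specific problem", and technology that consists of "advanced, purely human mental methods and rules". The former is easy to protect because its claims are very specific. The latter may have very broad coverage, and it can be protected only if it is shown to be a first for humankind. For example, the technique of using a certain mathematical function for encryption can perfectly well be protected by the patent system. Not only is it patentable; the related program is also protected by copyright.

A patent system can truly protect meaningful technology only if it can also properly protect "rubbish" claims. One should not imagine the patent system to be so wonderful.

By the same token, the same problem exists for democratic systems. A probability theory that never wrongs a good person and never lets a bad person go does not exist.

----------
Judge the matter by what "is"; judge what is by the "matter"; judge the "matter" by the "matter".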
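Category: 镜子大全 (Mirror Collection) | 2126 views | 1 comment
Can a "pure" computer program be patented after all?
Popularity 1 seawan 2012-10-1 16:28
Background:
A so-called "pure" computer program means "a computer program merely recorded on a medium". Under current patent rules, such a result cannot be patented. See: 专利审查指南 2010.pdf (Guidelines for Patent Examination 2010), p. 259.

The Guidelines also give concrete examples:
Example 1: a method of computing pi with a computer program (p. 261): not patentable.
Example 2: a method of removing image noise (p. 265): patentable.

The second case is patentable because the patent solves a technical problem: "The invention patent application is a solution that processes external technical data by executing a computer program. It is a technical solution as defined in Article 2, paragraph 2 of the Patent Law and is patentable subject matter."

Now my question is: is dividing computer programs into two types, results that "solve a technical problem" and "rules and methods of purely human mental activity", and excluding the latter from patent protection, a sound approach from a logical or technical standpoint?

I think not.

First, the division shows that the people who wrote the patent rules assume that "rules and methods of purely human mental activity" cannot possibly solve a "technical problem". This is plainly a false premise.

Second, under the current rules we can easily convert an invention that does not solve a "technical problem" into one that "solves a technical problem". Take the "method of computing pi" case and reword it: "A certain technical activity requires pi to be computed within n nanoseconds in order to meet such-and-such a control requirement. Existing algorithms cannot meet this target on such-and-such a CPU platform, so we invented a new algorithm for computing pi, and the computation time is within n/2 nanoseconds." Whether or not the statement is true, the result is that an application that did not comply with the patent rules becomes one that does.

In fact, p. 270 of the Guidelines already contains a ready-made example that can serve as a model for this kind of opportunism:
Case: a Chinese character encoding method and a computer Chinese character input method
The case states that a pure Chinese character encoding method cannot obtain patent protection. But "if the Chinese character encoding method is combined with the specific keyboard with which the encoding method can be used", it can constitute a patentable invention that solves a specific technical problem.
I suspect this case is a self-deceiving explanation that was devised, for the sake of consistency, from the successful case of the "famous" Wubi Zixing patent.

The two examples on pp. 268 and 269 are even less able to justify themselves. For example, the rejection of Example 8 on p. 268 gives this reason: "But the game device is a well-known game device, and the control of the game process neither improves the internal performance of the game device, such as data transmission or internal resource management, nor brings any technical change to the structure or function of the game device."
Note the emphasized part of that passage ("nor brings any technical change to the structure or function of the game device"): a change to the "rules of the game" is itself a change to the function of the game, and therefore of the game machine! We cannot hold that a "chess game" and a "boxing game" are games "with no change in function at all"!

In addition, excluding "methods and rules" entirely from the scope of the concept of "technology" is arbitrary and one-sided.

In short, the current patent rules have defects in the protection of computer program inventions that cannot be ignored. Combined with the fact that organizations now prize invention patents and neglect software copyright, this is bound to harm the healthy development of the domestic software industry.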
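Category: Official document reference | 7860 views | 2 comments
[Repost] 专利审查指南 2010.pdf (Guidelines for Patent Examination 2010)
seawan 2012-9-29 21:31
专利审查指南 2010.pdf
Excerpt:
2. Examination criteria for invention patent applications involving computer programs
Examination shall be directed at the solution for which protection is sought, that is, the solution defined by each claim.
Under Article 25, paragraph 1, item (2) of the Patent Law, no patent right is granted for rules and methods of mental activity. Where an invention patent application involving a computer program falls under the circumstances described in Section 4.2 of Chapter 1 of this Part, it is examined according to the principles of that section:
(1) If a claim relates only to an algorithm or a mathematical calculation rule, or to a computer program per se or a computer program merely recorded on a medium (such as magnetic tape, magnetic disk, optical disc, magneto-optical disk, ROM, PROM, VCD, DVD, or another computer-readable medium), or to the rules and methods of a game, then the claim belongs to rules and methods of mental activity and is not patentable subject matter.
If, apart from its subject title, everything that defines a claim relates only to an algorithm or a mathematical calculation rule, or a program per se, or the rules and methods of a game, then the claim in substance relates only to rules and methods of mental activity and is not patentable subject matter.
For example, a computer-readable storage medium or a computer program product defined solely by the recorded program, or a computer game device defined solely by game rules and including no technical features (for instance, no features of any physical entity), in substance relates only to rules and methods of mental activity and is therefore not patentable subject matter. However, if the medium claimed in the patent application involves an improvement of its physical properties, such as its laminated structure, track pitch, or material, it does not fall under this exclusion.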
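Category: Official document reference | 3444 views | 0 comments
[Repost] Computer viruses may borrow the peacock's mate-choice strategy when reproducing
crossludo 2012-8-31 09:34
Computer viruses may borrow the peacock's mate-choice strategy when reproducing
Modifying the code may curb the evolution of the viruses' offspring

On networks, computer viruses can replicate continually and cause serious damage, but how would they find mates in order to reproduce? In the latest issue of the journal Evolution, Chris Chandler, a postdoctoral researcher at Michigan State University's BEACON Center for the Study of Evolution in Action, reports that by creating a digital environment he found that computer viruses can pair up and reproduce through computer programs.

Chandler says that mate pairing in computer viruses is in fact still a matter of much debate. People have proposed some good ideas for studying the question, but the ideas are hard to test. Working with researchers including Ian Dworkin, assistant professor of zoology, and Charles Ofria, associate professor of computer science and engineering, he found a different approach.

Their new approach involves placing a variety of programs in a virtual world called Avida. Special computer programs in the Avida software environment can compete and reproduce. Ofria, the creator of Avida, holds that because the "Avida viruses" (Avidians) mutate when they self-replicate, these digital organisms evolve just as living organisms do.

Among peafowl, the male's beautiful tail feathers are an important trait for attracting female mates. Avidians have the ability to produce sexual displays comparable to a peacock's tail, and they can also choose mates at random. As the researchers predicted, Avidians usually chose the showiest mates.

To keep computer viruses from breeding stronger viruses, the researchers modified the Avidians' genetic code so that they could grow exaggerated sexual displays. Since even the weakest Avidian can now grow a handsome digital tail, the researchers expect that the modified genetic code will reduce the evolutionary gain Avidians obtain by choosing showy mates.

[Editor's comment]
A perfect computer system does not exist, and its vulnerabilities give rise to mysterious computer viruses. Think about it: a mere piece of information survives, self-replicates, reproduces, and even evolves within its living space (the network environment), just like life. It only lacks a body of its own and has to exert its power through someone else's. Now American scholars have used biological mate choice as an analogy for the pairing of viruses, which gives this code a touch of birth, aging, sickness, and death; all it lacks is joy, anger, sorrow, and happiness. But who can really say? Take nature itself: how do you know it is not one giant computer?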
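Category: Genetics and evolution | 1092 views | 0 comments
[Technology & Business] Computer program helps Chinese students understand English accents
lihujun 2012-7-8 22:16
Researchers at the University of Nottingham recently invented a unique computer program to help Asian students improve their ability to understand accented English in noisy environments.

Researchers from the university's School of Psychology, School of Education, and School of English found that some Asian students have difficulty understanding certain accented English, especially when distinguishing final sounds (such as rope and robe) and initial sounds (such as tin and thin). This can make it hard for students to follow what is said next, because mishearing one word can affect the understanding of the whole sentence. The difficulty can also be made worse by ambient noise, for example on the telephone or in a shopping mall.

To address this problem, the researchers developed a computer program called Spoken English Discrimination (SED) training. It trains Chinese speakers to detect differences between speech sounds under non-ideal conditions, such as accented English or noisy environments.

The research team has received funding in turn from a European-funded innovation fellowship (ERDF) and from the University of Nottingham's Hermes Fellowship scheme. The funds will support the team in continuing to develop new products, assess market demand, and create new commercial partnerships.

The University of Nottingham is working to integrate the program into existing English language teaching, because it covers specific cultural factors, accents, and different noise environments. These factors are usually not taken into account in language teaching.

The research team is led by Nicola Pitchford and Walter van Heuven of the School of Psychology. Nicola Pitchford said: "Our findings show that this training programme plays a pivotal role in helping students from Asian countries recognize accents. Many government agencies and a large Chinese telecom operator hope to use the technology to develop educational software for mobile phones. In China alone, more than 300 million people are learning English, so we have high hopes for the future of this project."

Source: University of Nottingham, EurekAlert! Chinese edition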
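11029 views | 0 comments
Topic study: System simulation (20110723)
mafeicheng 2011-10-15 12:52
Presented by: Li Yating

I. System simulation
System simulation means building, according to the purpose of the system analysis and on the basis of analyzing the nature of the system's elements and their interrelations, a simulation model that describes the system's structure or behavior and has definite logical or quantitative relations. The model is then used for experiments or quantitative analysis to obtain the information needed for sound decisions.
Modeling with computer programs: good at simulating processes and handling nonlinear problems; gives deeper insight into social phenomena; results can be compared with social observation.

II. System dynamics
① System dynamics studies both living and non-living systems as information feedback systems and holds that an information feedback mechanism exists in every system. Its theoretical basis is cybernetics.
② System dynamics divides the object of study into several subsystems and builds a network of causal relations among them. It studies the whole and the relations within the whole, replacing the traditional element-centered view with a holistic one.
③ The method of system dynamics is to build a computer simulation model (a flow diagram plus constructed equations), run computer simulation experiments, and verify the model's validity, providing a basis for strategy and decision making.

1. State variables
Also called stocks; drawn as a rectangle.
They change under the influence of rate variables.
State equations can be built from the relevant basic laws, such as the continuity principle or the conservation of energy and mass. A state equation has three basic forms: differential, difference, and integral. Under certain conditions the three forms can be converted into one another.

2. Rate variables
Also called flows; drawn as a valve.
They control the change of the state variables, and the rate equation specifies the manner and strength of that control.
In general, a rate equation can be an algebraic combination of state variables, auxiliary variables, exogenous variables, and so on. Note in particular, however, that a state variable does not act on a rate variable through the direct transfer of material; it acts through the transmission of information about the state variable's change.

3. Auxiliary and exogenous variables
Auxiliary variables are drawn as circles. Because determining the functional form of a rate equation is fairly difficult, auxiliary variables are introduced to break the rate equation into parts and make the modeling logic clearer. Auxiliary variables are information feedback variables introduced artificially for modeling convenience; they are functions of the state information variables.
Exogenous variables are drawn as two concentric circles. They are environmental factors outside the system boundary that act on or influence the system. An exogenous variable can also be a policy variable.

4. Constants and table functions
In the special case where an exogenous variable stays fixed, it degenerates into a constant. The flow-diagram symbol for a constant is a bar with a small circle on it.
Relations between variables can be expressed not only by algebraic functions but also by charts and tables. Such functions are called table functions; their flow-diagram symbol is a circle with two horizontal lines inside. A table function captures a particular nonlinear relation between two variables.

5. Flow lines and delays
Flow lines (channels) in a flow diagram are of two kinds, material flow lines and information flow lines, drawn as solid and dashed arrows respectively. The direction of the arrow shows the direction in which the material or information moves.
Delays often occur on flow lines. For example, a factory's products must be transported before they reach the warehouse, and a letter takes some time to arrive after it is sent. A delay on a material flow is called a material delay; a delay on an information flow is called an information delay. Their flow-diagram symbol is a rectangle labeled with the type of delay and the name of the delay variable.

6. Sources and sinks of material
In a flow diagram both sources and sinks are drawn with a pool symbol.
The material flow arrows connected to a source all point outward; those connected to a sink all point inward. Sources and sinks appear in pairs and are sometimes connected to other variables by a double-headed flow line.

7. Conserved subsystems
All material is drawn from a source and collected in a sink, but material from different sources can only collect in a sink of the same kind. This is the concept of a conserved subsystem.
A conserved subsystem is a local system centered on one state variable. From the properties of sources and sinks, two results follow:
a. Conserved subsystems of the same kind can be connected by material flow lines, which express a forward causal relation, or by information flow lines, which express a feedback causal relation.
b. Conserved subsystems of different kinds can only be connected by information flow lines, not by material flow lines.

III. Agent-based modeling
An agent is an independent entity or program that adapts to its environment under given conditions and, through adaptive learning, carries out a series of preset computations.
Autonomy: an agent runs without outside intervention or control and has autonomous control over its own behavior and internal state.
Social ability: an agent interacts with other agents through some means of communication.
Reactivity: an agent can perceive its environment and responds and adapts to it through its behavior.
Pro-activeness: an agent's behavior is proactive; it does not merely respond to the environment but can take goal-directed action on its own initiative.

1) Characteristics
A. Determinism combined with randomness
Agent-based modeling builds the simulation model from the bottom up, and each individual consists of relatively simple deterministic rules. The modeling view holds that an individual's motion and change do not come from outside the system but result from the interaction of factors inside the system under given conditions. Monte Carlo methods are used to simulate random states. Each individual, following its own rules, produces complex phenomena that approximate random behavior, reflecting the idea that "adaptation builds complexity".
B. Dynamic simulation
Fluctuation and disequilibrium are the normal state of a complex system, and the system itself is constantly moving and changing. Because of nonlinear interactions among individuals, the macroscopic state of the whole system is not a simple superposition of the microscopic individuals, and the properties of the whole have no necessary causal relation to the properties of the subsystems. This is beyond the reach of traditional mathematical modeling and of other analytical methods that study the macro and micro levels separately. Agent-based computational models therefore have stronger descriptive and expressive power and are closer to the real world.
C. Micro and macro
In any complex system the dynamic behavior of large numbers of individuals is the basis of the system's evolution. The mutual influence and interaction between individuals and the environment, and among individuals, form the main driving force of that evolution. Agent-based modeling treats macroscopic change in the system as the result of microscopic change. Giving the macro model a virtual micro foundation makes it easier to explore how the interaction of many individuals at the micro level leads the whole system to show some dynamic evolutionary trend, that is, the patterns or regularities seen at the macro level.

IV. System dynamics versus agent-based modeling
System dynamics is mostly used for long-term strategic models and models highly aggregated objects: in a system dynamics model, people, products, events, and other discrete items are all represented in aggregate. They therefore lose all individual characteristics, history, and dynamics.
For problems at the abstract level (aggregate interaction), system dynamics is a good choice.
For individual detail (individual interaction), use agent-based modeling to re-conceptualize all or part of the model.

V. Tools for system simulation
System dynamics modeling tool: Vensim
Agent-based modeling tools: Swarm, Repast, Ascape
AnyLogic is a professional virtual prototyping environment for designing complex systems with discrete, continuous, and hybrid behavior. It supports object-oriented modeling with the UML language as well as modeling in Java.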
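The stock-and-flow vocabulary from the system dynamics section (state variable, rate variable, auxiliary variable, constant, state equation in difference form) can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration: the single stock, the crowding feedback, the parameter names, and the fixed-step update are not taken from the talk, and dedicated tools such as Vensim provide their own integration methods.

```python
def simulate_stock(initial_stock, growth_fraction, capacity, dt, steps):
    """Simulate one stock fed by one inflow, using a difference-form state equation.

    State equation: stock(t + dt) = stock(t) + dt * inflow(t)
    """
    stock = initial_stock            # state variable (rectangle in a flow diagram)
    history = [stock]
    for _ in range(steps):
        # Auxiliary variable: information about the stock that is fed back
        # to the rate. No material moves along this link.
        crowding = 1.0 - stock / capacity
        # Rate variable (valve): an algebraic combination of the state
        # variable, the auxiliary variable, and a constant.
        inflow = growth_fraction * stock * crowding
        # The rate changes the stock over one time step.
        stock += dt * inflow
        history.append(stock)
    return history


if __name__ == "__main__":
    run = simulate_stock(initial_stock=10.0, growth_fraction=0.5,
                         capacity=100.0, dt=0.25, steps=200)
    # Print every 40th value to show the S-shaped approach to the capacity.
    for step in range(0, len(run), 40):
        print(step, round(run[step], 3))
```

An agent-based version of the same question would instead keep a list of individual objects, each with its own rules and history, which is the trade-off described in section IV.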
Category: Reading group topic introductions | 4326 views | 0 comments
