科学网


Tag: Words


Related blog posts

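Choices
Popularity 4 wangdh 2015-2-27 12:54
Choices (Wang Dehua)

I saw this sentence in a friend's WeChat post: "You make your choices and you live with them, and in the end you are those choices." In other words, life often requires you to choose, and sometimes forces you to. Once you have chosen, you have to carry those choices with you as you go on living. In the end, your life is those choices; or, put another way, your choices make up your life.

I also came across another line: "We can't always choose our circumstances, but we can choose how we handle them." That is, we are not always lucky enough to choose the circumstances we find ourselves in, but we are able to choose how we face them.

Research in animal behavior shows that behavioral decision-making has a biological basis and an evolutionary basis. The decisions humans make under various conditions presumably have a biological and evolutionary basis too. The environmental changes and hardships our ancestors lived through leave traces in our blood. A meager life experience does little for survival. Rich experience, adventure, and the challenge of extreme environments all build survival skills, provided one lives through them.

This reminds me of a similar line in the film The Words (Chinese title 《妙笔生花》). An old man, destitute after losing the manuscript he cherished, his daughter, and his wife and family, says it to a young man who is at the height of his fame for having plagiarized that manuscript: "We all make choices in life. The hard thing is to live with them."

Being able to face your own choices and take responsibility for them does take some courage; it is what we mean by daring to act and daring to answer for it. To be able always to face one's choices calmly would count as a complete life. There is no success or failure here, only the difference between a life that is rich and fulfilled and one that is full of regret and pity.

There is no doubt that a life contains many choices. At different stages, in different circumstances, in different moods, and with different horizons, we make different choices for different purposes. Choosing among many options is hard and can be agonizing. Often, having no options at all is even more painful, and facing that takes real psychological strength and ability. Once you have been through the extremes, everything else is ordinary.

People cannot choose when they come into this world, or what kind of environment they arrive in. That is their parents' business and their parents' choice. Before we can fend for ourselves we cannot decide much either: entering kindergarten, shouldering a schoolbag and starting school are mostly decided and chosen for us by parents and teachers. That is why children long to grow up quickly and look forward to the day they can choose for themselves.

At last you begin to make your own choices. You chose graduate school. You chose business. You chose science. You chose your partner. You chose your friends of every kind. You chose when your children would come into the world.

In nature, the fate of an organism is constrained by natural selection. Darwin's natural selection, put simply, is selection by nature: survival of the fittest. Survival takes skills, and skills are partly inherited and partly learned.

To quote the old man in The Words once more: "We all make choices in life. The hard thing is to live with them." This is true of an individual. It is also true of a group, and even of a country.
Category: Personal reflections | 10634 views | 13 comments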
Online: Encoding Words into Cloud Models, KNOWL-BASED SYST
xiaojunyang 2013-10-22 23:39
Encoding Words into Cloud Models from Interval-valued Data via Fuzzy Statistics and Membership Function Fitting

Xiaojun Yang, Liaoliao Yan, Hui Peng, Xiangdong Gao

Abstract

When constructing the model of a word by collecting interval-valued data from a group of individuals, both interpersonal and intrapersonal uncertainties coexist. Similar to the interval type-2 fuzzy set (IT2 FS) used in the enhanced interval approach (EIA), the Cloud model, characterized by only three parameters, can manage both uncertainties. Thus, based on the Cloud model, this paper proposes a new representation model for a word from interval-valued data. In our proposed method, firstly, the collected data intervals are preprocessed to remove the bad ones. Secondly, the fuzzy statistical method is used to compute the histogram of the surviving intervals. Then, the generated histogram is fitted by a Gaussian curve function. Finally, the fitted results are mapped into the parameters of a Cloud model to obtain the parametric model for a word. Compared with the eight or nine parameters needed by an IT2 FS, only three parameters are needed to represent a Cloud model. Therefore, we develop a much more parsimonious parametric model for a word based on the Cloud model. Generally, a simpler representation model with fewer parameters means less computation and lower memory requirements in applications. Moreover, the comparison experiments with the recent EIA show that our proposed method can not only obtain much thinner footprints of uncertainty (FOUs) but also capture sufficient uncertainties of words.

Keywords

Computing with words; Cloud model; Enhanced interval approach; Fuzzy statistics; Membership function fitting; Histogram

Framework of the Proposed Method

Step 1: Data collection. The datasets introduced in D. Wu, J. M. Mendel, and S. Coupland, "Enhanced Interval Approach for Encoding Words Into Interval Type-2 Fuzzy Sets and Its Convergence Analysis," IEEE Transactions on Fuzzy Systems, vol. 20, no. 3, pp. 499-513, Jun. 2012, are used in our research.

Step 2: Data preprocessing.
1) Bad data processing
2) Outlier processing
3) Tolerance limit processing
4) Reasonable-interval processing

Step 3: Fuzzy statistics of data intervals. The fuzzy statistics step computes the histogram of the m intervals (Fig. 4, red solid curve).

Fig. 4. The fuzzy statistics histogram and fitting curve of the word "Some".

Step 4: MF fitting using a Gaussian curve function. The fuzzy statistical histogram obtained in Step 3 can be considered as the membership function (MF) of the fuzzy opinions of a word. Nevertheless, in some situations, such as applications of computing with words (CWW), the obtained fuzzy opinions may later be manipulated in arithmetic operations. In these cases, mathematically explicit and continuous forms of MFs may be a necessity. In order to develop a more parsimonious parametric model for a word, we fit the fuzzy statistical histogram data (x_s, m(x_s)) obtained in Step 3 using a Gaussian curve function (Fig. 4, black dashed curve).

Step 5: Representation of a word by a Cloud model. To handle the uncertainties of a word, we choose the Cloud model with the minimum number of parameters that best approximates the data to represent a word, so as to obtain a more parsimonious parametric model for a word (Fig. 5).

Fig. 5. The representation of the word "Some" by a Cloud model.

Experimental Results

Fig. 6. The fuzzy statistics histograms (red curves) and fitted curves (black curves) of all 32 words.
Fig. 7. The resulting Cloud models of all 32 words.
Fig. 10. The area (light-colored shaded area) and error (dark shaded areas) of the word "A lot". The sum of dark shaded areas is defined as the error of an FOU. (a) The result of EIA, (b) the result of our proposed method.
Fig. 11. The areas and errors of the IT2 FS FOUs for all 32 words obtained by the EIA.
Fig. 12. The areas and errors of the Cloud model FOUs for all 32 words obtained by our proposed method.
Fig. 13. The visualized comparisons of the areas and errors between the EIA and our proposed method for all 32 words. (a) The areas, (b) the errors.

Discussions and Conclusion

Based on the Cloud model, we have developed a more parsimonious parametric model for a word from interval data. From the experimental results, we have observed that the proposed method can result in much thinner FOUs.

1) The representation of a Cloud model FOU is simpler than that of an IT2 FS FOU. Only three parameters are needed to define a Cloud model FOU, whereas an IT2 FS FOU needs eight or nine parameters. "Generally a MF shape with simpler representation is preferred, especially when the parameters of the MF need to be optimized, because simpler representation usually means faster convergence". Therefore, we should obtain a parsimonious parametric model with the minimum number of parameters that best approximates the data for a word.

2) The Cloud model FOUs are much thinner than the IT2 FS FOUs. Thinner FOUs may represent a more desirable tradeoff between uncertainty and accuracy, and thinner FOUs can be used to better distinguish close words. Experimental results show that the Cloud model FOUs for words are not only much thinner but can also capture sufficient uncertainties. The reason may be that we approximate the fuzzy statistical histograms of the words using the Cloud models directly in our proposed method, while in the IA and EIA each person's data interval is assumed to be uniformly distributed, but later a symmetrical triangle T1 MF, a left-shoulder T1 MF, or a right-shoulder T1 MF is used to approximate a uniformly distributed data interval.

Constructing the models of words is the first step in the CWW paradigm. In this paper we have proposed a method of encoding words into Cloud models from interval-valued data using fuzzy statistics and MF fitting. Similar to the IT2 FS, the Cloud model is able to represent both interpersonal and intrapersonal uncertainties that arise when collecting word data from a group of individuals. Based on the Cloud model, we have developed a more parsimonious parametric model for a word. The experimental results show that our proposed method can not only result in much thinner FOUs but also capture sufficient uncertainties of words. Thus, the Cloud model may be a suitable model for a word, and this may open up another efficient approach to CWW and to the representation of human knowledge and perceptions. Future research will be concerned with constructing a CWW system using Cloud models.

Access this article: Knowledge-Based Systems, Available online 19 October 2013
http://www.sciencedirect.com/science/article/pii/S0950705113003250#f0020
http://dx.doi.org/10.1016/j.knosys.2013.10.014
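As a rough illustration of Steps 3 and 4 only, the sketch below builds a histogram from a handful of intervals and fits a Gaussian-shaped membership function to it. Several things here are assumptions rather than the paper's method: the histogram is taken to be the fraction of intervals covering each grid point, the fit uses simple moment matching instead of whatever curve-fitting routine the authors used, the interval data are made up, and the function names are hypothetical. The mapping from the fitted curve to the three Cloud model parameters (Step 5) is not described in this summary, so it is not attempted.

```python
import math


def fuzzy_histogram(intervals, grid):
    """Assumed definition: fraction of intervals [a, b] covering each grid point."""
    m = len(intervals)
    return [sum(1 for a, b in intervals if a <= x <= b) / m for x in grid]


def fit_gaussian_by_moments(grid, heights):
    """Moment-matching estimate of (mu, sigma); a stand-in for a least-squares fit."""
    total = sum(heights)
    mu = sum(x * h for x, h in zip(grid, heights)) / total
    var = sum(h * (x - mu) ** 2 for x, h in zip(grid, heights)) / total
    return mu, math.sqrt(var)


def gaussian_mf(x, mu, sigma):
    """Gaussian-shaped membership function with peak value 1 at x = mu."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2))


# Hypothetical interval endpoints on a 0-10 scale (not taken from the paper's datasets).
intervals = [(2.0, 5.0), (3.0, 6.0), (2.5, 5.5), (3.5, 6.5), (3.0, 5.0)]
grid = [i / 10 for i in range(101)]

hist = fuzzy_histogram(intervals, grid)
mu, sigma = fit_gaussian_by_moments(grid, hist)
print(f"mu = {mu:.3f}, sigma = {sigma:.3f}")
```

Moment matching keeps the sketch dependency-free; a nonlinear least-squares fit would usually track the histogram shape more closely.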
Category: Paper summaries | 8137 views | 0 comments
[Repost] Sharing: Proofs without words
xuhy07 2013-3-1 15:06
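Sharing: Proofs without words
(Note: all images are from the web, most of them from http://mathoverflow.net/)

1. The Pythagorean theorem

$a^2 + b^2 = c^2$

2. Fibonacci identities

The Fibonacci sequence: 1, 1, 2, 3, 5, 8, 13, 21, ...
Recurrence: $F_{n+1}=F_n+F_{n-1}$. Closed form (with $F_0=F_1=1$): $F_n=\frac{1}{\sqrt 5}\left[\left(\frac{1+\sqrt 5}{2}\right)^{n+1}-\left(\frac{1-\sqrt 5}{2}\right)^{n+1}\right]$.
The Fibonacci sequence satisfies a beautiful identity: $F_0^2 + F_1^2 + \cdots + F_n^2 = F_nF_{n+1}, \quad F_0 = 1$.

3. Three summation formulas

(A) $1+2+\cdots+(n-1)=C_n^2\equiv\binom{n}{2}$
(B) $1^2+2^2+\cdots+n^2=\frac{1}{3}n(n+1)(n+\frac{1}{2})$
(C) $1^3+2^3+\cdots+n^3=(1+2+\cdots+n)^2$

4. Integration by parts

5. Mathematicians' favorite proof without words

The intuitive pictures behind some theorems contain no logic at all and do not count as mathematical proofs, yet these ingenious and delightful points of view still enchant mathematicians. The figure below is a regular hexagonal board made up of small triangles. Tile the whole board with the three differently oriented rhombi shown on the right (only part of the board is tiled in the figure), and prove that once the board is completely tiled, you must have used the same number of each kind of rhombus.

The end of the article gives a very elegant "proof". Color each kind of rhombus with its own color and the figure instantly looks three-dimensional, like cubes stacked in the corner of a room. The three kinds of rhombi are the faces you see when you view the stack of cubes from the left, from the right, and from above, so their numbers must obviously be equal.

Strictly speaking, this does not count as a mathematical proof. But it ties a purely combinatorial problem to a solid figure in space, which is truly remarkable. The problem and its uncanny "proof" have circulated widely and are much loved by mathematicians.

Cover of Mathematical Puzzles: A Connoisseur's Collection (Chinese edition: 《最迷人的数学趣题:一位数学名家精彩的趣题珍集》)

=====================================================

Attached e-book: Proofs Without Words: Exercises in Visual Thinking (prfwithout.djvu)
Category: Notes to share | 6948 views | 0 comments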
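The identities in items 2 and 3 can also be spot-checked numerically. The short sketch below is only a sanity check for small n, not a proof; it assumes the post's indexing convention F_0 = F_1 = 1, and the helper names are made up for illustration.

```python
from math import comb


def fib_list(n):
    """Return [F_0, ..., F_n] using the convention F_0 = F_1 = 1."""
    f = [1, 1]
    while len(f) <= n:
        f.append(f[-1] + f[-2])
    return f[: n + 1]


def check(n):
    """Check the Fibonacci identity and formulas (A), (B), (C) for one value of n."""
    f = fib_list(n + 1)
    # F_0^2 + ... + F_n^2 = F_n * F_{n+1}
    assert sum(x * x for x in f[: n + 1]) == f[n] * f[n + 1]
    # (A) 1 + 2 + ... + (n-1) = C(n, 2)
    assert sum(range(1, n)) == comb(n, 2)
    # (B) written with integers: 6 * (1^2 + ... + n^2) = n (n+1) (2n+1)
    assert 6 * sum(k * k for k in range(1, n + 1)) == n * (n + 1) * (2 * n + 1)
    # (C) 1^3 + ... + n^3 = (1 + ... + n)^2
    assert sum(k ** 3 for k in range(1, n + 1)) == sum(range(1, n + 1)) ** 2
    return True


print(all(check(n) for n in range(1, 50)))
```

Formula (B) is multiplied through by 6 so that the check stays in exact integer arithmetic.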
Democracy is not a given, even in the US (revised)
Popularity 1 zuojun 2013-1-21 14:30
Yes, the US is a democratic country. You can criticize the US President, but you'd better be careful when you go against your immediate boss. Don't believe me? Then try it yourself.

This is a real story of how ordinary Americans fight for their rights at the lowest level, namely, for their rights as condo owners. I live in a condo in Honolulu, where owning a 2-bedroom condo can cost as much as owning a mansion in New Orleans. (How do I know this? Well, a taxi driver in New Orleans, who used to live in Hawaii, told me so.) There are more than 230 units in our two buildings. The cheapest is a 2-bedroom unit with two bathrooms (and a full kitchen and a lanai). The cost? I am sure you are curious. It sells for around USD 420k these days.

In the US, each condo has an association, run by owners with the assistance of a paid professional management company. Serving on the Board of the Association is a thankless job, but someone has to do it. I was told to become a board member shortly after I moved in, around 2004. I thought it might be interesting to learn something, like how Americans run their condos. I did learn a lot, and even made quite a few friends. I resigned two years ago, because I didn't feel I had time to attend the bi-monthly meetings when I planned to travel more (after my son went to college). Coincidentally, the new board president took matters into his own hands, and things started to get worse and worse.

About a year ago, a few of us, three old board members plus a new owner, decided "enough is enough." We started to meet every month or so, to plan for change. The final straw came just before Christmas 2012: a notice about taking out a $16 million loan for repairs. We, KT HUI, started to act. We meet every weekend to plan each move thoughtfully. Our first letter to owners called for "not sending in the consent form" for the loan. Our second letter raised 13 questions about the loan. The letter I am preparing to send out tomorrow is Letter #3, which calls on owners to vote in person at the upcoming Annual Meeting in order to unseat the current board president.

I have always enjoyed the democracy in this country, but only lately realized that one has to fight for one's rights even in a democratic country. It's pure labor to get these letters ready. Luckily, every time we have sent out a letter, we have received some kind words from our fellow owners. Here is the latest: "Mahalo for your hard work, persistence, diligence, etc. I really appreciate it!" Me, too.
Category: Uniquely Hawaii | 2639 views | 1 comment
A few words: to those walking alone
Jack1259 2012-10-13 18:37
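1. I want to get married.
2. There really are people you go on missing even after years without contact.
3. It turns out to be true that if you can't be lovers, you become strangers.
4. I have started to grow old, yet I am still childish.
5. People who invest too much in their feelings often come to a miserable end.
6. Fewer and fewer people are willing to help others. Enthusiasm slowly fades, and inertia slowly grows.
7. People who help others on their own initiative hardly exist any more. It is already a blessing if others do not scheme against you.
8. I really am someone who imagines affection that isn't there. I never counted for anything in the first place.
9. I talk less and less, and I smile less and less too.
10. My parents are getting older and older.
11. Given enough time, any feeling changes. A light, undemanding friendship actually lasts longer.
12. Trust is a rather absurd kind of goodwill: I seek it, but I cannot get it.
13. I can see through other people's scheming, but I am no longer willing to play the fool.
14. Nobody is willing to keep you company through a grand, passionate love.
15. The points of balance in my heart are gradually dwindling.
16. The person who could let me be a happy fool is gone.
17. Living earnestly, attentively, and authentically is tiring and painful.
18. Thinking too much and weighing too much is a mistake.
19. "Ignore your elders' advice and you will soon suffer for it" is true.
20. I want less and less to shoulder all the pressure alone, and I have begun to like avoiding problems.
21. Some cracks cannot be erased even by time.
22. Life can make a person numb.
23. It turns out you really can miss someone to the point of weeping bitterly.
24. The older I get, the larger the dark side of my heart.
25. I have lost myself. When I lost myself, where, and how to get myself back, I have no idea.
26. I am getting worse and worse at pretending to be strong.
27. Repaying ill will with kindness gets you used. Insincerity is everywhere.
28. "Illness enters by the mouth, and trouble comes out of it" is an unchanging truth.
29. I really can be naive at times.
30. When others need you they will come and find you; when they don't, they leave you hanging.
31. I am someone who lacks perseverance.
32. I am very attached to home.
33. What I went through were wounds, yet what I remember is love.
34. Some things I thought would be fine once they were talked through; in fact the result hurts just the same.
2218 views | 0 comments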
words fail me
carldy 2012-7-8 21:12
words fail me: I cannot find the proper words to describe my thoughts and feelings; I am unable to express my reaction, often because the situation is shocking, surprising, or unusual. We can also say "I'm speechless."
1. Words fail me when I think of what he has done!
2. I would like to tell you about the difficulties in this African country but I can't. Words just fail me!
3. Words fail me when I try to describe how beautiful my girlfriend was in her new dress.
P.S. Every wall is a door (每一堵墙都是一扇门).
Category: Reading notes (Harvest) | 3948 views | 0 comments
[Repost] Social Approaches 7: Words and Pictures in a Biology Textbook
carldy 2012-2-26 11:04
http://eca.state.gov/education/engteaching/pubs/BR/functionalsec3_7.htm

Social Approaches 7
Words and Pictures in a Biology Textbook
Greg Myers

Most genre analysis in ESP has focused on the verbal texts of common genres faced by university students. The assumption seems to be that the pictures, graphs, and tables do not provide such a barrier. But in many fields the visual elements are a crucial part of learning, and they can be just as conventionalized and discipline-specific as the verbal texts. In this paper I draw on current approaches to the relations of words and pictures to consider the range of illustrations in one commonly used molecular genetics textbook.

Introduction

If you compare any current science textbook to its predecessor from thirty or forty years ago, you are likely to be struck by the vast increase in the number of illustrations. Not only are there more of them, there is a wider range of types, from photographs to diagrams to graphs and tables. This is true even at the university level, where one might expect the students to get along without the pictures, or at least without the colors. We may take the development of pictures in science texts as another example of the growing dominance of the visual over the verbal in our culture as a whole, a vast shift in our systems of representation that applies to ads, product instructions, and journalism as well as to education. Major textbooks may have as much as a fifth of their space taken up by pictures. Clearly these pictures are doing more than just illustrating, supplementing, and breaking up the dense blocks of text and attracting the attention of any reluctant readers. Learning to read them is a part of learning scientific discourse.

One educational danger is that students may think that, however hard they have to work at the written text, what the pictures say is obvious. This may be a particular problem for the many students for whom English is not their first language, but who must at some stage use textbooks in English. They may turn to the pictures as a shortcut to the meaning of the text, represented in a universal visual language. But this visual set of conventions is no more universal than the English of the written text. It is important that we as teachers stress:
- the complex interrelation of words and pictures in these texts
- the possibility of multiple readings of images
- the different ways the images represent meanings
- the different ways the images signal degrees of reality
- the ways images change over time

Science students need to learn to be as critical in their reading of the pictures as they would be, ideally, in their reading of the words, recognizing the forms of persuasion and the assumptions that support them.

In this paper I will take my examples from Benjamin Lewin's Genes V, a major textbook in molecular genetics. Of course I could have chosen physics textbooks with more mathematical formulae, or engineering textbooks with more graphs, or textbooks in zoology or botany that had more photographs and maps to focus on organisms and environments; genetics cannot stand for any of these fields. But molecular genetics is a good field in which to seek examples, because it is rapidly developing new visual conventions, yet the textbooks are still fairly accessible to those of us with no training in the field. The first edition of Genes was published in 1983, five years into a huge revolution in the study of genes of higher animals and plants. The version I have, from ten years later, is already the fifth edition, and is enormously changed from the first edition--that is an indication of the pace of change in this field.
One reason I chose this book is that the author, Benjamin Lewin, is also the editor of the major journal in the field, and is thus familiar with a wide range of the latest images in research publications. I have written elsewhere how a discovery in this field was popularized, from the scientific articles, to reviews, to textbooks and to articles for general readers in Scientific American and in newspapers. I will focus here on the chapter that deals with that discovery, that eukaryotic genes may be interrupted. The textbook is a key part of this process of establishing the discovery, conveying some of the basic concepts, language, and imagery of the new research, but eliminating most of the detailed arguments, the evidence, and the names in the original scientific papers. The success of this particular textbook (and others like it) may someday be seen as marking the end of a genetics that focused on whole organisms and their inherited characteristics, and the triumph of a genetics based firmly on the level of molecules. There have been two main approaches to analysis of pictures in texts; one that treats the pictures as utterances, and the other that treats the utterances as pictures. The approach that treats the pictures as utterances then analyzes them in terms of linguistic pragmatics, such as principles of relevance or cooperation. A picture of a ribosome is the equivalent of This is a ribosome. In the approach that treats utterances as pictures, both words and pictures are kinds of signs, to be analyzed in terms of the same general semiotic processes. So the letters r-i-b-o-s-o-m-e and a picture of a big and a small flattened oval, one on top of the other, can both be conventional representations of an entity in the cell. I will draw on both pragmatic and semiotic approaches, and give references for those who want to pursue either of them further. 
From semiotics I borrow some analyses of the relation of words and pictures, and the different ways words and pictures can represent. From pragmatics I borrow the concept of modality, as applied to both utterances and pictures. But rather than give separate reviews of these approaches (which can be found in Hodge and Kress, 1988; Kress and van Leeuwen, 1988; and Bastide, 1985/1992), I will go over some questions I think teachers and students should ask about the pictures in textbooks.

What Is The Picture Doing Here?

The first question we need to ask is why the picture is there at all. As anyone who writes textbooks in any field knows, pictures take up space, cost money, and are an incredible hassle at every stage of production. So each picture must be there for a reason. In textbooks, as in articles, there is nearly always an explicit statement of what that reason is, in two places: in the caption above or below the picture, and in the text before a reference to the figure number. These textual references become stereotyped, using just a few verbs for instance, so we need to look more closely at what they direct us to do with the picture. Some examples are in my Table 1. The verbs in the text and caption suggest several uses for the pictures. The reference "Figure 23.1 shows that" treats the picture as a specific example of the results. In the third example, "Figure 23.19 compares," the picture is taken as doing the comparison. In this chapter, the most common verb in the text reference is "summarizes"; the pictures are treated as economical concentrations of large amounts of data, enabling the text to refer to general rules and not to specific species or genes.

[Figure 1. Reprinted by permission of the author.]

If we move from the texts to the captions, we see three ways the words direct our reading of the pictures: by providing a gloss for decoding specific elements or the whole picture, an interpretation of the meaning of the picture in disciplinary terms, or essential background information assumed by members of the field. The caption to Figure 23.1 is a gloss; it has three clauses corresponding to the four steps in the diagram; then it has a final detail labeling part of the diagram. The caption to 23.15 is an interpretation that tells us how to interpret these blurry lines as evidence of cross-hybridization; that is, the visual signs are given a meaning in terms of the world of nature. The first sentence of the caption to Figure 23.19 is also an interpretation that puts the message of the diagram in terms of nature; it is followed by a gloss that tells us how to read the different colored sections of the strands. This limited set of uses of pictures is more striking when we consider what pictures do not do in this textbook. They never provide proof; the "show" that is so common here always means "illustrate," not "demonstrate."

Why are glosses, interpretations, and background statements needed at all? Each of these pictures has, by itself, many potential meanings. Figure 23.1 could be read as being about straightening out the helix; Figure 23.15 could be read as showing differences between X and Y chromosomes; Figure 23.19 could be read as comparing the different sizes of the two chains. We are used to thinking of language as having multiple meanings, but pictures do too. The textual references and captions constrain our readings, trying to get us to choose one of the many possible interpretations. Roland Barthes (1964/1976) described anchorage of the meanings of the text in the picture.
But here we have pictures with many possible meanings, doubly anchored by a caption that acts as a verbal gloss, and a pointer in the text that tells us what kind of statement the text is making. Or perhaps what we have here is more like Barthes' relay, the back and forth relation of text and pictures as in a comic book. Here, the text directs us to the picture, which leads us back to the caption, which leads to the picture, which leads back to the text. The kinds of verbs used in the text suggest that the picture always contains the same information as the text, but in different forms. To read these different forms, students must learn visual conventions of representations and certainty to go with their learning of linguistic conventions. How Does The Picture Refer? Genes V seems to have a narrow range of illustrations, compared to popularizations or even to other textbooks. There are few photographs, no cartoons, no maps, no portraits of scientists or pictures of equipment. But even within the narrow range of graphs and diagrams that it develops, we can find examples of quite different kinds of signs; the different pictures refer to entities in the world in quite different ways. We can look to semiotics for ways of defining these differences: indexical references based on a link to the referent, iconic references based on resemblance, and symbolic references based on arbitrary conventions. Indexical signs are linked directly to the thing referred to. The lip-shaped red print on a cheek or sheet of paper is taken as a sign of a kiss because it is supposed to be the mark lipstick leaves behind. Figure 23.15, of Figure 1, is an autoradiogram of the results of an experiment in which one bit of the human genome was used as a probe to pick out bits from the DNA of various animals. Though there are a number of steps in this procedure, a biologist thinks of the fragments as making this picture themselves--that is why it is an auto radiogram. 
The fragments have migrated in the gel according to their size, which is why there can be the scale of kilobases on the right. Then the radioactively labeled fragment binds with some sizes of bits on the gel and not others. These labeled bits then show up on the photographic paper put over it. Even the slight blurriness that accompanies this method testifies to the direct link of sign and referent. Such images were very effective in scientific articles when the discovery was first announced, because they demonstrated the results as well as reporting them. They are relatively uncommon in textbooks, where such demonstration is not always needed, and a more stylized representation may convey the information more simply. Another way a picture can refer to something is by resembling it, as a photograph resembles a person; these are referred to as iconic signs. In Figure 1, the drawing of the chromosome with bands in 23.16 says "chromosome" because of the resemblance, however stylized. Throughout the book there are standard icons of DNA as helix, of mouse, fly, tRNA molecule; they are part of what assures the student of the book's accessibility. Once the resemblance is established, the image can be conventionalized and will still be read unambiguously. In Figure 1, the pictures at the top and bottom of 23.17 come without the blurriness, and without the scales of sizes. We do not read them for particular information; we may read them as standing for "do an autoradiogram at this stage," not for the results of a particular experiment. Finally, a symbolic sign can refer to a referent because of some arbitrary convention, treated as an agreement between people. The usual examples are from language, where words generally have an arbitrary relation to things. The letters used to symbolize the bases of DNA in the middle of 23.17 are purely conventional signs: G is not in any way shaped like Guanine, nor is C like Cytosine, nor T like Thymine. 
The names were given for quite logical reasons, I suppose, but not for visual resemblance. The letters are just the remnants of those names. A map showing a sequence as a string of letters is entirely arbitrary; there is no resemblance between the twisted coils of DNA and this neat linear and measured arrangement. The graphs through the chapter are also arbitrary--the length of the purple rectangle correlates with, say, the number of exons of a given length, but it does not resemble the exons, nor is it directly associated with them. These three ways of referring--as an index, an icon, or a symbol--offer pretty good categories for most purposes. But in any real example one usually sees some blurring of the distinctions. For instance, the use of CATG for the bases is arbitrary, and so is the convention that a single strand of DNA is shown as a horizontal line with the 5' end at the left. But a biologist would argue that the linear nature of DNA is related to an important natural feature of the molecule, abstracted here but nonetheless natural--it can be seen as a sequence rather than as a ratio or a shape. The conventional way to read CATG is from left to right, horizontally, but that it is read is, for a biologist, a natural fact. On the other hand, an iconic or indexical sign can become conventionalized as a symbol. For instance, the same drawings of a mouse and of a fly appear rubber-stamped throughout all the diagrams of the book whenever these standard organisms for genetic experiments are referred to; when they need to show that the result of an experiment is a dead mouse, the same rubber-stamp is used upside down. This is no longer being read as a picture resembling a mouse, but as a conventionalized sign. Signs, therefore, can be on the border between two kinds of representation, and can be conventionalized so that they are read as symbolic, not iconic. Also, any one figure can incorporate examples of different categories of signs. 
For instance Figure 1 includes indexical, iconic, and symbolic signs, without the reader feeling that it is incomprehensibly heterogeneous. If the distinctions between indexical, iconic, and symbolic signs blur, why insist on them? Usually my purpose is to make readers more critical of the indexical and iconic end of the scale. The power of these signs comes from their assumed naturalness, but they are only natural if one ignores the complex means used to produce them, whether by gel electrophoresis or by photography. But I think with textbooks the danger is somewhat different. Students may focus on the symbolic function of signs, missing the ways symbols are created and used, in the discipline and in their own learning. Broadly, we can see a movement in the scientific literature from indexical signs (in the first reports) to iconic (in popularizations and textbooks) to conventionalized symbols. And it may be that we see a movement in each major student's career, from icons to symbols. How Real Is It? After these comments on the processes of representation with all pictures treated as signs, it may seem naive to ask how real the various images claim to be. But images do come with marks of greater and less certainty. Hodge and Kress (1988), and Kress and van Leeuwen (1988), have developed the idea of a visual modality, like our use of might or could or probably in language. So, for instance, the image in 23.15 with or without its caption, stakes a claim to being a statement of evidence, quite separate from the experimenter's interpretation, and is thus more certain. The drawings of an autoradiogram in 23.17, without all the grainy detail and without the scale, make no such claim. No one would consider it cause for complaint if the bands in 23.17 were not just as depicted, while we would complain if the bands in 23.15 were moved. Gilbert and Mulkay (1984) have pointed out how hard it is to suggest indefiniteness and uncertainty in pictures. 
One way is to reduce the images to cartoon simplicity, cutting out the details of a photograph, autoradiograph, or plot of data points. Another element in this modality is time. The most certain images are also those tied to a moment in the past. A photograph or an autoradiogram is a unique result of a single experiment (thus the textbook's copyright acknowledgment to the researcher seems appropriate). The diagrams, on the other hand, suggest that what is shown is a timeless, general molecule and process. Figure 23.15 shows an actual series of experiments in one image. The arrows in 23.17 show an idealized research strategy that can be repeated over and over with different molecules into the future. In these terms, there is a very strong tendency in the textbook towards the conventionalized and timeless images. Popularizations, on the other hand, stress the unique event, the lucky moment at which nature was revealed, and they may value the first images for their news value. This takes us back to the relation of pictures to words. The words seem invariably to tie the picture to the present tense of idealized scientific fact. This is true even for historic photographs. The questions of the production of these images, which once filled paragraphs of a Methods section, are now forgotten. Images that usually have different kinds of modality are composed together on the example page into a montage in which such distinctions are lost. Changes In Conventions of Representation One reason to make these distinctions between types of textual links to pictures, types of referring, and levels of certainty is that they enable us to compare different ways of representing knowledge. We can see this in two kinds of comparisons, from genre to genre as the fact develops, and within one genre over time. In each case, there is a development from one end of the scale to the other. 
One key change in the development between genres--research article to review article to textbook and popularization--is the change in the status of claims as facts. This change also affects the relation of words and pictures. Latour and Woolgar (1979/1986) present a scale of facticity, in which competing researchers try to push a claim up and down, especially through the use of attribution: "Watson and Crick claimed that DNA is a double helix" vs. "DNA is a double helix" vs. "the strands unwind..." (in this last instance, the point is that one does not even have to mention the helical structure; it is so much a part of the discipline). A scientific claim develops from the weakened, contingent form of its first statement and debate around it to the unmodified certainties of fact, or it gets pushed back to being a mere claim once made by someone. I have written elsewhere about the stages of popularization of the discovery of the split gene. Here I need only note that from article to textbook, we move from pictures that demonstrate (providing evidence), to pictures that illustrate (showing, summarizing, defining). We move from indexical pictures, like the autoradiogram, to iconic pictures, showing transcription at work, to more symbolic figures in which the process is reduced to the intersection of strings of letters. This visual process is a complement to the linguistic production of a fact. Even if the popularizations sometimes revive the original images produced by the thing, they now have a different meaning: they are not read in detail, but have the meaning of a historic image of split genes along with portraits of the scientist, part of the textual museum of science. The textbook, in comparison to both the first research articles and the popularizations, excludes most of the specific, local, contingent images in favor of generalized diagrams of processes and summaries of comparisons. Another kind of development is in the book Genes itself.
Its editions have changed in size and in the use of color. But the editions have also changed as what was a hot research area, for example split genes, becomes a taken-for-granted fact to be passed on so that readers can understand the latest work. Subsequent editions include more or less the same chapter, but succeeding chapters play a different role in the knowledge of the field. We can see this in comparing the types of figures used in the corresponding chapters, from electron micrographs and autoradiographs at the iconic/indexical end, to bar graphs and tables at the symbolic end. Most of the images in both editions are conventionalized diagrams of sequences. But as the field develops, the imagery becomes more and more conventionalized, less based on direct images and resemblance (which make up 16 of the 26 figures in Genes I), more based on purely quantitative and linear representations of disciplinary categories (which make up 14 of the 23 figures in Genes V). The pedagogical effect of these changes can be seen in comparing figures with similar functions in the first and fifth editions. Both editions begin with the discovery of split genes. In Genes I, the background section is complemented by a reproduction of one of the original electron micrographs, and then by a diagram telling us with letters and arrows how to read the split genes in the wiggles. The caption says "one of the original electron micrographs" and thanks Philip Sharp. The drama of discovery is still part of the fact. In Genes V, the discovery is still mentioned. But there are no images at all of what split genes might look like. Instead the first figure in the chapter, as I have noted, is the conventionalized linear representation of the DNA. The only suggestion that this diagram is to be taken as looking at all like something in nature is the irregularity of the squiggles in the introns.
As a fact is incorporated into the field, the circumstances of the discovery, the name of the discoverer, and just what the thing looked like, all become submerged. A student coming to these more conventionalized representations must work that much harder to learn the visual conventions of the discipline and to attach them to the things the discipline studies (here, transcription and translation of DNA) and the practices through which it studies these things (here, restriction enzyme fragmentation, probes, autoradiographs, sequencing). In simple terms, a student is less and less likely to have come across these conventions in secondary school, or in other science subjects; they are learned along with the written language of the specific discipline.

Implications For Teaching

Why is this important to our students? The reader may or may not share my view that science students should be taught about the social processes underlying science. But even if only interested in facilitating students' entry into the field, teachers may want to make explicit to students some of the ways of interpreting pictures. Students should always ask why the picture is there, how it refers, and how certain it is. Teachers can bring this out without doing all the semiotic and linguistic analysis I have done here. Rather than adding Roland Barthes' terminology to their troubles, teachers can just ask some simple questions. What would the book lose if this figure were deleted? What information does it add? Why choose this form of presentation, say, a diagram rather than a photograph? What different captions are possible? Why is the visual placed in that particular portion of the text? What conventions do readers use to interpret it? Students might try explaining it to someone outside the discipline. Or students may just try turning it upside down--it is usually only the conventions of reading that might make this disorienting.
(It is disorienting to turn a map or globe upside down because we have become familiar with the convention of North = up, and the geo-political perspective that goes with this orientation). The aim of such questioning is not to make it more difficult for students to learn conventions of a new discipline, but to raise their consciousness as to how the conventions work. Here I have focused on just one book from one discipline. Different disciplines have different initiations into the iconography; for instance, geology starts its initiation earlier, and makes it more explicit, perhaps because geologists realize the importance of the visual in their discourse. Chemistry presents some visual conventions, such as the periodic table and the chemical bond, at the earliest level. Even within molecular genetics, there have been several different approaches to introducing the field, and each approach has a different place for the visual. This is the same problem that teachers of languages for academic purposes always have; conventions vary across disciplines. With the visual language of a discipline, as with its written language, we can start by making students aware of the differences and make explicit how they must deal with these differences. Greg Myers is a lecturer at Lancaster University, where he teaches on the Culture and Communication Program. He is a graduate of Pomona College and Columbia University, and has taught previously at the University of Texas and at the University of Bradford. He is author of Writing Biology and Words in Ads and of articles in numerous scholarly journals.
Category: Thesis writing skills (skills for graduate thesis) | 1770 views | 0 comments
[Repost] Matlab source code for Bag of Words/Bag of Features, compiled by the author
lipiji1986 2010-11-29 19:34
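Original post: http://www.zhizhihu.com/html/y2011/3536.html I am about to graduate, so I am tidying up some material. I started with a simple image classification demo. It is meant to let beginners in image classification observe and work through, on a real example, the stages of image classification, the steps within each stage and how much each one matters, and which stages are the essence of the problem. The image features are Dense SIFT, described through the Bag of Words (BoW) model. The dictionary is normally built from the training set, since at that point we do not have a test set yet. The test set is what you test with, but in a real application nobody knows what the test images will be, so here I also build the BoW dictionary from the training set only. The idea behind BoW is actually simple. Many people have asked me about it, but it is enough to understand how the dictionary is built and how an image is mapped onto the dictionary dimensions. I am often asked this question in interviews as well; how would you describe it in vivid, concrete terms? Once the images have been described with BoW, meaning that both the training and the test images have been encoded with the BoW model, an SVM can be trained as the classification model and used to classify them. Besides the SVM's RBF kernel, I also defined a custom kernel here: the histogram intersection kernel. Many papers report that this kernel works well, and the experimental results show it clearly. Can this be proved theoretically? Defining the kernel yourself also shows how to use a custom kernel for SVM classification. Code download link: http://www.zhizhihu.com/html/y2011/3536.html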
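The two ideas the post singles out, mapping an image onto the dictionary dimensions and the histogram intersection kernel, can be sketched in a few lines. This is a minimal Python illustration, not the author's Matlab code: the 2-D points stand in for Dense SIFT descriptors, the three-word codebook is written by hand rather than learned from training descriptors, and every function and variable name is hypothetical.

```python
# Sketch only: toy 2-D "descriptors" stand in for Dense SIFT, and the
# codebook is fixed by hand instead of being learned from the training set.
# All names here are hypothetical, not taken from the original Matlab code.

def nearest_word(descriptor, codebook):
    """Return the index of the codeword closest to the descriptor
    (squared Euclidean distance)."""
    def sq_dist(word):
        return sum((d - w) ** 2 for d, w in zip(descriptor, word))
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))

def bow_histogram(descriptors, codebook):
    """Map one image onto the dictionary: count how many descriptors fall
    on each codeword, then L1-normalise the counts."""
    counts = [0] * len(codebook)
    for d in descriptors:
        counts[nearest_word(d, codebook)] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(codebook)
    return [c / total for c in counts]

def histogram_intersection(h1, h2):
    """Histogram intersection kernel: sum of bin-wise minima."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def gram_matrix(rows, cols):
    """Kernel values between every histogram in rows and every one in cols."""
    return [[histogram_intersection(r, c) for c in cols] for r in rows]

# Toy data: a 3-word dictionary and two "images" with 4 descriptors each.
codebook = [(0.0, 0.0), (1.0, 1.0), (4.0, 0.0)]
image_a = [(0.1, 0.0), (0.9, 1.1), (1.2, 0.8), (3.9, 0.2)]
image_b = [(0.0, 0.2), (0.1, -0.1), (4.1, 0.0), (3.8, -0.3)]

hist_a = bow_histogram(image_a, codebook)
hist_b = bow_histogram(image_b, codebook)
K = gram_matrix([hist_a, hist_b], [hist_a, hist_b])
```

A Gram matrix like `K`, computed between training histograms, is the form that SVM implementations with precomputed-kernel support expect, for example scikit-learn's `SVC(kernel="precomputed")`; at test time the matrix is computed between test histograms (rows) and training histograms (columns).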
Category: Research techniques | 18804 views | 0 comments


Copyright © 2007- China Science Daily (中国科学报社)
