ScienceNet (科学网) - Tag: Statistics


WileyChina 2016-6-30 11:28

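We are honored to have appointed Professor Zhangsheng Yu (俞章盛) as an Associate Editor of the Wiley journal Statistics in Medicine. Professor Yu is Deputy Director of the SJTU-Yale Joint Center for Biostatistics and a tenured professor in the Department of Bioinformatics and Biostatistics and the Department of Mathematics at Shanghai Jiao Tong University. Mr. Hu Changjie (胡昌杰), Associate Director of Editorial Development at Wiley, presented the letter of appointment to Professor Yu on behalf of the journal's co-editors-in-chief Ralph D'Agostino, Simon Day, Joel Greenhouse, and Louise Ryan.

Statistics in Medicine (print ISSN: 0277-6715, online ISSN: 1097-0258) was founded in 1982, publishes 24 issues a year, and had a 2015 SCI impact factor of 1.533. It publishes research papers on the application of statistics and other quantitative methods to medicine (including the collection, analysis, presentation, and interpretation of medical data), covering medical data, clinical trials, diagnosis, dose control, epidemiology, and health care. The journal aims to influence medical practice and its associated sciences through the papers it publishes on statistical and other quantitative methods. The main criteria for publication are that the statistical methods are appropriate to a practical medical problem and that the exposition is clear. The journal seeks to improve communication between statisticians, clinicians, and medical researchers.

About Professor Zhangsheng Yu

Professor Yu received his bachelor's degree in statistics from East China Normal University, then master's degrees in demography and in statistics from East China Normal University and Bowling Green State University (USA), followed by a PhD in biostatistics from the University of Michigan. He was a visiting scholar at Harvard University in 2005-2006 and then taught in the biostatistics departments of The Ohio State University and the Indiana University School of Medicine. In 2014 he was appointed Associate Professor (tenured) in the Department of Biostatistics at the Indiana University School of Medicine. From 2009 to 2013 he served successively as vice president and president of the Central Indiana statistical association. In 2014 he was named a Shanghai "Eastern Scholar" Distinguished Professor. As an expert in biostatistics (clinical medical statistics), Professor Yu collaborates widely with clinical specialists and has made outstanding contributions to the efficiency of medical research, the suitability of research methods, and the correct interpretation of research results. He also serves as Associate Editor of the Journal of Systems Science and Complexity, Statistical Editor of Heart Rhythm, and editorial board member of Pediatric Pulmonology.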

Category: Life Science | 6098 views | 0 comments

[Repost] Statistics, Syntax, Semantics, Pragmatics, Apobetics

Popularity 1 geneculture 2015-12-22 23:38

The 5-level structure of information, which includes Statistics, Syntax, Semantics, Pragmatics, and Apobetics. This review is from: In the Beginning Was Information In this fascinating book, the author Werner Gitt explains in detail the principle of information theory, namely defining the characteristics of information and all the observational evidence we have for the origin and formation of information. He carefully and clearly delineates what is considered information for the purposes of the theory, and the 5-level structure of information, which includes Statistics, Syntax, Semantics, Pragmatics, and Apobetics. It is shown that the well-known theory of information given by Shannon is an important contribution, but can only describe the lowest (statistical) level of information, while ignoring the most crucial aspects of its higher level definition. All information, as defined by the book, has these higher level aspects, which include the structure and code (syntax); the meaning (semantics); the intended action (pragmatics); and purpose or goal (apobetics). Of course that is an oversimplification of the concept, but Gitt does a fine job of explaining it with numerous fascinating examples both from the biological and technical realm. Gitt shows how all attempts to generate (or simulate the generation of) information apart from a mental process have failed. This is the most fundamental hurdle that the theory of evolution must overcome in order to claim validity as a complete explanation of the origin of life apart from the Creator or a mental source. DNA is undeniably information, and it is coded in such an efficient and marvelous way, that it is utterly unmatched by the greatest technological advancements of today. 
Even an experiment to show the formation of meaningful DNA from materialistic processes, in sufficient quantity to produce life, would still fall far short of proving this necessary step for evolution, since apart from a meaningful context of proteins and RNA to participate in the replication, transcription, and translation of the information in DNA, DNA is useless. And as is well known in biology, the paradox goes deeper: the proteins that are required for replication, etc., are coded for BY the DNA! The challenge of information theory to evolution cannot be brushed aside, and this book does an excellent job of laying out the theory in a detailed yet understandable and compelling manner. Gitt's book offers a fresh look at the creation and evolution debate by presenting a robust positive case for creation on the basis of the theorems and natural laws encompassed by information theory and the countless observations that have affirmed this theory. He discusses numerous examples that have been proposed as counterexamples to it, and how they have failed to falsify the theory. Gitt devotes limited time to discounting evolution, but makes reference to other writings of his that deal with it more specifically. The purpose of the book is not so much to deconstruct evolutionary theory as to establish by scientific theorems that all known information has a mental source, and this has yet to be disproven. He is also unabashedly a Christian and a believer in special creation, which comes across clearly in his book, yet he rightly admits that the existence of God cannot be proved. However, he points out the consistency of the inference of a Creator with all other observations about information. In the Beginning Was Information will be a very informative book not only for creationists, but for evolutionists as well, due to its thorough explanation of information.
If you read this book, by all means read the appendix at the end; it contains some of the most intriguing examples in the whole book!

Category: Basic Research in Informatics | 733 views | 2 comments

Mixed-design analysis of variance (Mixed-design ANOVA)

Popularity 1 Liz0109 2015-12-16 06:56

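I have been dizzy from all the different kinds of ANOVA lately, so here is a brief summary.

1. What is mixed-design ANOVA, and when should it be used?

To understand mixed-design ANOVA, you first need to understand ordinary analysis of variance (ANOVA) and repeated measures ANOVA. The difference between the two is that the former requires the samples to be independent of one another, while the latter requires that they are not. Mixed-design ANOVA, as the name suggests, combines the two: it contains both independent and non-independent samples.

What are independent and non-independent samples? Independence is built into the experimental design and depends on how you obtain the dependent variable (DV, i.e. the experimental data). In an ordinary ANOVA design, each sample is placed under only one condition and is measured only once. In other words, all replicates, both across treatments and within the same treatment, must be mutually independent individuals. So measuring the DV under different treatments requires different samples, and repeating the measurement of the DV under one treatment also requires different samples (replicates). For example, a design with 2 treatments, 3 levels, and 3 replicates needs 2*3*3 = 18 samples and yields 18 DV values. Conversely, if the same individual is placed under different experimental conditions and measured repeatedly, the samples are non-independent. For example, to observe the change in blood pressure of a group of 30 patients at 3 different time points after taking an antihypertensive drug, only 30 samples are needed, but 3*30 = 90 DV values are obtained. In short, ordinary ANOVA always measures a between-subject DV, while repeated measures ANOVA measures a within-subject DV.

With independent and non-independent samples understood, we can look at which ANOVA to use in which situation. Suppose the experiment has two factors (independent variables, IVs):

1) When both IVs are between-subject factors, use two-way ANOVA.
2) When both IVs are within-subject factors, use two-way repeated-measures ANOVA.
3) When one IV is a between-subject factor and the other is a within-subject factor, use two-factor mixed-design ANOVA.

2. An example design for mixed-design ANOVA (two-factor)

Two IVs:
1) Within-subject factor = Time (4 levels)
2) Between-subject factor = Treatment (groups 1 and 2)

Results:

Subject  Group  Time 1  Time 2  Time 3  Time 4
1        1      3       4       7       3
2        1      6       8       12      9
3        1      7       13      11      11
4        1      0       3       6       6
5        2      5       6       11      7
6        2      10      12      18      15
7        2      10      15      15      14
8        2      5       7       11      9

3. Formulas for mixed-design ANOVA

(Image source: Wikipedia; https://en.wikipedia.org/wiki/Mixed-design_analysis_of_variance)

The degrees of freedom are calculated as:

1) df_BS = R - 1
2) df_BS(Error) = Nk - R
3) df_WS = C - 1
4) df_BSxWS = (R - 1)(C - 1)
5) df_WS(Error) = (Nk - R)(C - 1)

where R is the number of levels of the between-subject factor, Nk is the number of participating subjects, and C is the number of within-subject measurements. In the example above, R = 2, Nk = 8, and C = 4, so:

df_BS = R - 1 = 2 - 1 = 1
df_BS(Error) = Nk - R = 8 - 2 = 6
df_WS = C - 1 = 4 - 1 = 3
df_BSxWS = (R - 1)(C - 1) = (2 - 1)*(4 - 1) = 3
df_WS(Error) = (Nk - R)(C - 1) = (8 - 2)*(4 - 1) = 18

4. Running mixed-design ANOVA in software

Steps in SPSS:
1) Open the dialog: Analyze - General Linear Model - Repeated Measures (image source: http://wwwstage.valpo.edu/other/dabook/ch14/c14-1.htm ; same below)
2) Define the within-subject factor (name and number of levels)
3) Add the within-subject variables and define the between-subject factors
4) Choose the output you want
5) Click OK (for how to use the data, see the link.)

Steps in SAS: I could not follow the code; see http://www.ats.ucla.edu/stat/sas/faq/anovmix1.htm

Reference links:
1. Mixed-design analysis of variance https://en.wikipedia.org/wiki/Mixed-design_analysis_of_variance
2. Mixed ANOVA using SPSS Statistics https://statistics.laerd.com/spss-tutorials/mixed-anova-using-spss-statistics.php
3. Mixed-Model Factorial ANOVA: Combining Independent and Correlated Group Factors http://wwwstage.valpo.edu/other/dabook/ch14/c14-1.htm
4. How can I perform a repeated measures ANOVA with proc mixed? http://www.ats.ucla.edu/stat/sas/faq/anovmix1.htm
5. Two-way ANOVA or Mixed ANOVA https://www.researchgate.net/post/Two-Way_ANOVA_or_Mixed_ANOVA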
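The degrees-of-freedom arithmetic above can be checked with a few lines of Python. This is only a sketch: the function names `mixed_anova_df` and `cell_means` and the `SCORES` layout are made up for illustration, and the data are simply the eight subjects from the example table. It computes the five df terms and the group-by-time cell means; it does not perform the full ANOVA.

```python
# Example data from the post: subject -> (group, [Time 1 .. Time 4]).
SCORES = {
    1: (1, [3, 4, 7, 3]),
    2: (1, [6, 8, 12, 9]),
    3: (1, [7, 13, 11, 11]),
    4: (1, [0, 3, 6, 6]),
    5: (2, [5, 6, 11, 7]),
    6: (2, [10, 12, 18, 15]),
    7: (2, [10, 15, 15, 14]),
    8: (2, [5, 7, 11, 9]),
}


def mixed_anova_df(n_groups, n_subjects, n_times):
    """Degrees of freedom for a two-factor mixed design.

    n_groups   -- R, levels of the between-subject factor
    n_subjects -- Nk, number of participating subjects
    n_times    -- C, number of within-subject measurements
    """
    df = {
        "BS": n_groups - 1,
        "BS_error": n_subjects - n_groups,
        "WS": n_times - 1,
        "BSxWS": (n_groups - 1) * (n_times - 1),
        "WS_error": (n_subjects - n_groups) * (n_times - 1),
    }
    # The five parts must add up to (number of observations - 1).
    assert sum(df.values()) == n_subjects * n_times - 1
    return df


def cell_means(scores):
    """Mean score for every group at every time point."""
    by_group = {}
    for group, values in scores.values():
        by_group.setdefault(group, []).append(values)
    return {g: [sum(col) / len(col) for col in zip(*rows)]
            for g, rows in by_group.items()}


if __name__ == "__main__":
    print(mixed_anova_df(n_groups=2, n_subjects=len(SCORES), n_times=4))
    print(cell_means(SCORES))
```

The internal check that the parts sum to Nk*C - 1 (here 31) is a quick way to catch a miscounted factor level before running the analysis in SPSS or SAS.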

39463 views | 1 comment

[Repost] Data: WTO & UNSD

lixujeremy 2015-4-5 15:00

WTO: International trade and market access data
UNSD: Statistical Databases

Category: Data | 946 views | 0 comments

[Repost] Workflow for statistical analysis

zhangdong 2013-5-7 16:19

I generally break my projects into 4 pieces:

1. load.R
2. clean.R
3. func.R
4. do.R

load.R: Takes care of loading in all the data required. Typically this is a short file, reading in data from files, URLs and/or ODBC. Depending on the project, at this point I'll either write out the workspace using save() or just keep things in memory for the next step.

clean.R: This is where all the ugly stuff lives: taking care of missing values, merging data frames, handling outliers.

func.R: Contains all of the functions needed to perform the actual analysis. source()'ing this file should have no side effects other than loading up the function definitions. This means that you can modify this file and reload it without having to go back and repeat steps 1 & 2, which can take a long time to run for large data sets.

do.R: Calls the functions defined in func.R to perform the analysis and produce charts and tables.

The main motivation for this setup is working with large data, where you don't want to have to reload the data each time you make a change to a subsequent step. Also, keeping my code compartmentalized like this means I can come back to a long-forgotten project, quickly read load.R to work out what data I need to update, and then look at do.R to work out what analysis was performed.

source: http://stackoverflow.com/questions/1429907/workflow-for-statistical-analysis-and-report-writing/
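The same four-stage split carries over to other languages. Below is a rough Python analogue, not the author's R setup: the function names, the inlined toy data, and the cache file name are all hypothetical. The point it illustrates is the one made above: the slow load and clean stages are run once and cached, while the analysis functions stay free of side effects and can be edited and re-run cheaply.

```python
import pickle
import tempfile
from pathlib import Path
from statistics import mean

# Hypothetical cache location, playing the role of save() after clean.R.
CACHE = Path(tempfile.gettempdir()) / "workflow_demo_cache.pkl"


def load():
    """load stage: read raw data (inlined here; normally files, URLs, ODBC)."""
    return [
        {"id": 1, "value": 3.0},
        {"id": 2, "value": None},  # a missing value for clean() to handle
        {"id": 3, "value": 5.0},
    ]


def clean(rows):
    """clean stage: drop records with missing values."""
    return [r for r in rows if r["value"] is not None]


def cached_clean_data(refresh=False):
    """Run load + clean once, then reuse the cached result on later runs."""
    if CACHE.exists() and not refresh:
        return pickle.loads(CACHE.read_bytes())
    data = clean(load())
    CACHE.write_bytes(pickle.dumps(data))
    return data


def mean_value(rows):
    """func stage: a pure analysis function with no side effects."""
    return mean(r["value"] for r in rows)


if __name__ == "__main__":
    # do stage: call the analysis functions on the prepared data.
    print(mean_value(cached_clean_data()))
```

Passing `refresh=True` forces the slow stages to run again, which corresponds to re-running load.R and clean.R after the underlying data change.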

1706 views | 0 comments

The Clumsy Birds of the Lab (0): Preface

Popularity 14 fs007 2012-4-29 02:17

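Preface: Explaining the Title

Xun Zheng (寻正)

[Author's note: All rights to this article are reserved. No media outlet, including conventional publishers, online media, and blogs, may reproduce it elsewhere without authorization. Until this book is finished, my ScienceNet blog is the only place where I publish this series.]

Because my work constantly leaves me struggling with the data that researchers collect, I have often wanted to write a small booklet called Research Methods for Dummy, offering clinical and basic researchers some short guidance on the statistical design of their studies and sparing everyone a good deal of grief. In Western culture, "for Dummy" technical guides written for general or professional readers are genuine popular science, and in some ways go beyond it; they are very well received. In Eastern culture, however, people seem to have little patience for the word "Dummy". Translated literally, my booklet would be called "Research Methodology for Blockheads". With that title, Chinese readers would probably avoid it like the plague; nobody wants to be taken for a blockhead. So naming the booklet became a challenge, and having chosen a suitable title, I need to explain it.

The "clumsy bird" comes from a saying every Chinese person knows, "the clumsy bird flies first" (笨鸟先飞), which first appears in the play 《状元堂陈母教子》, said to be by Guan Hanqing. The third son of the Chen family, his mother's favorite, sees his eldest and second brothers come first in the imperial examinations and needles his second brother: well, brother, you can swagger in front of other people only because this clever bird has not yet come out, so you clumsy birds got to fly out first and show off. [Original text: 二哥，你得了官也。我和你有个比喻：我似那灵鸟在后，你这等笨鸟先飞。] Since then, "clumsy bird" has gradually become one of the ways Chinese people express modesty: oh, I came first in Fujian Province only because the clumsy bird flew first; plenty of people are better than me, it's nothing special, nothing special...

Although such modesty is not necessarily sincere, everyone is a clumsy bird at some point. The Chinese education system is particularly restrictive of children's nature: obedient children are good students and disobedient ones are disliked by teachers, while the quick-witted, high-IQ children, having finished the same schoolwork, always have more time than the average clumsy bird to get into mischief and more chances to become "bad students". Once labelled a bad student, thanks to the Pygmalion effect [the psychology version of "if you are told you can, you can"; it works by psychological suggestion], these students are actually more likely to be weeded out by the Chinese education system. Facing childhood friends who left school early yet showed enormous creativity and quickness in making their way in society, I always feel that I am a clumsy bird through and through. In China in the 1980s, in the early days of reform and opening-up, those clever birds invigorated the economy, whereupon China's intellectual class cried unfairness and complained daily that "those who sell talk earn less than those who sell tea eggs". China has a clumsy-bird culture, so when the clever birds really did fly ahead, people were not used to it.

It is because of this cultural mindset that I have named this booklet "The Clumsy Birds of the Lab", to avoid any suspicion of writing technical guidance specially for blockheads. As the saying goes, "a foot may prove too short and an inch may prove long enough" (尺有所短，寸有所长): we are all making our own contributions to "the cause of science", and each of us inevitably accumulates some experience in one direction while being unable to be clear about every problem that arises in research. To play the clumsy bird at some point, fly first, give it a try, and perhaps get the job done: being a suitable clumsy bird may well be a blessing for science.

This booklet is not a professional guide to statistics or to data analysis and management; the market has long been overflowing with such books, sometimes leaving readers at a loss because there is too much to choose from. Abstruse statistics textbooks discourage many a clumsy bird: please, dear author, we have no interest in becoming statisticians and no intention of competing with you in the job market once we have learned it; could you just tell me how to get this simple analysis done? Knowing not only that something is so but also why it is so is an admirable scientific spirit, but in reality it carries a cost that is hard to bear. In my work I have met countless researchers, all of whom received formal statistics courses during their training, some even holding graduate degrees in related fields, yet they make the most basic mistakes in statistical analysis, mistakes that a beginning statistics student could spot at a glance. I have also seen people with doctorates in statistics make statistical errors. In East and West alike, it seems that scientistic teaching, bent on making students fully understand the inner workings of statistics, instead leads students to forget everything they learned when it comes to practical application, and wastes untold time at work.

I have no degree in statistics, my statistical training is no greater than that of the typical researcher who asks me for help, and for most of the statistical tools I can use proficiently I do not know why they work. To varying degrees, critics could say that I am the blind leading the blind, a severely near-sighted man without his glasses showing the blind the way. Sometimes, out of my own arrogance and ignorance, I may truly fall into that state, and readers should treat my work and related advice with caution. I came up with the idea of writing this booklet because my work gave me compelling reasons: in manuscripts I have reviewed, authors took the odds ratios obtained from logistic regression as the weights of the model's variables to produce a new composite score, and I have repeatedly had to redo work because of researchers' inappropriate data handling. These reasons convince me that in statistics teaching, instructors concentrate too much on dissecting the mechanisms and mechanical calculations of statistical analysis, and students end up picking up the sesame seeds and losing the watermelon. This booklet is about giving the watermelon back to the clumsy birds of the lab.

If you identify with my clumsy-bird idea, struggle with statistical analysis in your research, and find yourself worrying about the structure and accuracy of your data as you collect and organize it, you may benefit from this book. In writing it I need a reference reader who, if not a layperson in science, is at least a newcomer to statistics. So this book is also written for the clumsy birds flying about outside the lab, giving general readers a chance to glimpse the whole leopard through one spot and to understand, by reading it, how scientific research gets done. I hope my readers benefit from it, and I also hope to receive fair criticism so that the booklet can be further improved.

2012.04.28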

Category: Clumsy Birds Fly First | 6478 views | 19 comments

Some good books on Bayesian statistics

Popularity 2 jianfengmao 2010-7-30 16:46

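I am not a statistician myself, but I am quite interested in the subject. I have only recently started paying attention to Bayesian methods. So far I have come across several books in this area. They differ from one another, but all are worth consulting:

1. Introductory books suitable for non-statisticians

1.1 Introduction to WinBUGS for Ecologists
A book that introduces Bayesian modeling to ecologists. Easy to follow. Unfortunately, the book's companion website has never opened for me, so I have not yet run its examples. If you know only the most basic statistical regression, you can get started on Bayesian methods with this book.

1.2 A First Course in Bayesian Statistical Methods
Similar to the previous book. It has somewhat more formulas, but is entirely suitable for self-study by non-specialists. The author writes in the preface: "My experience has been that once a student understands the basic idea of posterior sampling, their data analyses quickly become more creative and meaningful, using relevant posterior predictive distributions and interesting functions of parameters." So Bayesian methods appear to be not only useful but also fairly easy.

2. Advanced books

2.1 Bayesian Data Analysis, by Gelman, Carlin, Stern, and Rubin (1995, 2004)
2.2 Data Analysis Using Regression and Multilevel/Hierarchical Models, by Gelman and Hill (2007)
I have only leafed through these two. Following the examples in the books is not hard, but truly understanding them seems like a pipe dream. From an applied point of view, though, they are must-reads.

3. Level unknown

3.1 Bayes and Empirical Bayes Methods for Data Analysis
3.2 Bayesian Analysis for Population Ecology_R
3.3 Bayesian Analysis of Gene Expression Data
3.4 Bayesian Biostatistics
3.5 Bayesian Computation With R-2ed
3.6 Bayesian Disease Mapping Hierarchical Modeling in Spatial Epidemiology
3.7 Bayesian Methods for Ecology
3.8 Bayesian Modeling Using WinBUGS
3.9 Bayesian Statistical Modelling_R
3.10 Bayesian_core_a_practical_approach_to_computational
3.11 Introduction to Bayesian Scientific Computing-
3.12 Introducing Monte Carlo Methods with R
3.13 Introduction to Probability Simulation and Gibbs Sampling with R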

Category: R and Statistics | 11165 views | 3 comments

The difference between Statistics and Probability

agri521 2010-7-21 20:46

Category: Statistical Computing | 7534 views | 0 comments

Grid computing of spatial statistics: using the TeraGrid for Gi*(d) analysis

guodanhuai 2010-3-24 15:56

Wang, S., M. K. Cowles, et al. (2008). Grid computing of spatial statistics: using the TeraGrid for Gi*(d) analysis. Concurrency and Computation: Practice and Experience 20(14): 1697-1720.

The massive quantities of geographic information that are collected by modern sensing technologies are difficult to use and understand without data reduction methods that summarize distributions and report salient trends. Statistical analyses, therefore, are increasingly being used to analyze large geographic data sets over a broad spectrum of spatial and temporal scales. Computational Grids coordinate the use of distributed computational resources to form a large virtual supercomputer that can be applied to solve computationally intensive problems in science, engineering, and commerce. This paper presents a solution to computing a spatial statistic, Gi*(d), using Grids. Our approach is based on a quadtree-based domain decomposition that uses task-scheduling algorithms based on GridShell and Condor. Computational experiments carried out on the TeraGrid were designed to evaluate the performance of solution processes. The Grid-based approach to computing values for Gi*(d) shows improved performance over the sequential algorithm while also solving larger problem sizes. The solution demonstrated not only advances knowledge about the application of the Grid in spatial statistics applications but also provides insights into the design of Grid middleware for other computationally intensive applications. Copyright 2008 John Wiley & Sons, Ltd.

Category: GIsystem & GIscience | 3840 views | 0 comments

R Code for CRW simulation

entomology 2008-7-28 22:59

R Code for CRW simulation

# Copy and paste the following code into R
# to simulate a Correlated Random Walk (CRW) in an open space
# Original code by Xiaohua Dai

# Required libraries
require(circular)
require(CircStats)

## CRW initial parameters
# Step length ~ gamma distribution (sh, sc)
# For a gamma distribution gamma(shape, scale):
#   mean = shape*scale
#   variance = shape*scale*scale
# Then scale = variance/mean and shape = mean/scale
# Shape parameter:
sh = 0.285
# Scale parameter:
sc = 362
# Turning angle ~ wrapped Cauchy distribution (m, rh, s)
# Mean turning angle in radians:
m = 0.145
# Mean resultant length rho:
rh = 0.356

# Squared displacements (one row per simulation, one column per step)
R = matrix(0,1000,25)
# x,y coordinates
x = matrix(0,1000,25)
y = matrix(0,1000,25)
# Turning angles
the = matrix(0,1000,25)
# Lower 2.5% CI of R
r25 = matrix(0,25)
# Mean of R
rm = matrix(0,25)
# Upper 2.5% CI of R
r975 = matrix(0,25)

# Start simulation; sim = index of the simulation run
for(sim in 1:1000){
  for(step in 2:25){
    l <- rgamma(1,shape=sh,scale=sc)
    ta <- rwrappedcauchy(1,mu=m,rho=rh)
    the[sim,step] = the[sim,step-1]+ta
    x[sim,step] = x[sim,step-1]+l*cos(the[sim,step])
    y[sim,step] = y[sim,step-1]+l*sin(the[sim,step])
    R[sim,step] = x[sim,step]^2+y[sim,step]^2
  }
}

# Per-step summary over the 1000 runs
for(step in 1:25){
  r25[step] = sort(R[,step])[25]
  rm[step] = mean(R[,step])
  r975[step] = sort(R[,step])[975]
}

# Output
write.table(data.frame(r25,rm,r975),"CRWoutput.txt")
write.csv(data.frame(r25,rm,r975),"CRWoutput.csv")

Originally posted: Wednesday July 5, 2006 - 11:15am (EEST)

Category: R Statistics | 2916 views | 0 comments
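For readers without R at hand, here is a minimal standard-library Python sketch of the same idea. It is not a port of the script above: the function names are invented for illustration, and the wrapped Cauchy draw is produced by wrapping an ordinary Cauchy variate with scale -ln(rho) onto the circle, which is a standard construction but an assumption about how one would sample it here. It shows the two pieces of arithmetic the comments describe, the moment relations for the gamma step length and the accumulation of heading, position, and squared displacement.

```python
import math
import random


def gamma_params(mean, variance):
    """Moment relations from the post: scale = variance/mean, shape = mean/scale."""
    scale = variance / mean
    shape = mean / scale
    return shape, scale


def wrapped_cauchy(mu, rho, rng):
    """One turning angle: a Cauchy(mu, -ln(rho)) draw wrapped onto [-pi, pi]."""
    gamma = -math.log(rho)
    angle = mu + gamma * math.tan(math.pi * (rng.random() - 0.5))
    return math.atan2(math.sin(angle), math.cos(angle))


def simulate_crw(n_steps, shape, scale, mu, rho, rng):
    """Squared displacement from the origin after each step of one walk."""
    x = y = heading = 0.0
    r2 = [0.0]  # the walk starts at the origin
    for _ in range(n_steps - 1):
        step = rng.gammavariate(shape, scale)    # step length
        heading += wrapped_cauchy(mu, rho, rng)  # correlated heading
        x += step * math.cos(heading)
        y += step * math.sin(heading)
        r2.append(x * x + y * y)
    return r2


if __name__ == "__main__":
    rng = random.Random(1)
    walk = simulate_crw(25, 0.285, 362, 0.145, 0.356, rng)
    print(walk[-1])
```

Repeating `simulate_crw` many times and taking per-step percentiles of the results would give the kind of envelope that the R script writes to its output files.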

R code for grid-based movement simulation

entomology 2008-7-28 22:58

R code for grid-based movement simulation

Grid size: 1 km x 1 km square
Initial agent: individual animal
Local movements: habitat selection index H_i (according to the percentage levels of utilization distribution, UD_i, in cell i)
## H could also be determined according to habitat quality, prey density, etc.
Time step: 0.5 hr

At time step t: agent at cell m (center coordinate = (x_t, y_t))
At t+1 the agent moves to (or stays at) one of the nine cells (n = m-4, ..., m+4) as follows:

(x_t - 1, y_t - 1)   (x_t, y_t - 1)   (x_t + 1, y_t - 1)
(x_t - 1, y_t)       (x_t, y_t)       (x_t + 1, y_t)
(x_t - 1, y_t + 1)   (x_t, y_t + 1)   (x_t + 1, y_t + 1)

The probability (p) of moving to (or staying at) cell n is P_n = H_n / SUM(H_i), i from m-4 to m+4.

##### Here's the R script to simulate animal movement #####
# Original code by Xiaohua Dai

# Required R packages
require(adehabitat)
require(car)
require(spdep)

## Initial parameters
# Location time series (x,y)
# time = number of time steps
time <- 15000
x <- array(0,time)
y <- array(0,time)

# Number of animal occurrences at location x,y: location
# Grid map of Kruger
# (NOTE: zero-value grids buffer around its border:
#  1. to make the grid contain NRow * NCol cells
#  2. to ensure each cell in Kruger has 8 neighbouring cells)
location <- image.asc(Kruger)

# The values of the habitat selection index H decrease as the utilization level increases
# H = 0 when the cells are not in the home range, therefore elephants won't move to those cells
H <- location
UD <- image.asc(KrugerUD)
H <- round(100/UD)
BB <- array(H)
neigh <- cell2nb(NRow,NCol,torus=FALSE,type="queen") # Generate 8 neighbours for each cell
image(as.asc(H)) # Display the grid space of habitats

# Location coordinates (lx, ly)
# Use lxy to combine lx and ly together as a data frame
lx <- rep(1:NRow, NCol) # e.g. 123412341234
ly <- rep(1:NCol, each=NRow) # e.g. 111122223333
lxy <- data.frame(lx,ly)

# Initial location of animal
loc <- round(runif(1,min=1,max=length(lx)))

## Movement simulation
for(t in 1:time){
  # Record location time series
  x[t] <- lxy$lx[loc]
  y[t] <- lxy$ly[loc]
  # Draw location point
  points(lxy$lx[loc],lxy$ly[loc], col = round(runif(1, max=10)), pch = 19)
  # 9-cell neighbourhood matrix of habitat selection
  # Repeat the number of k according to its selection level BB[k]
  # The previous cell is also included since the animal has a certain probability of staying in it.
  cxy <- rep(loc,BB[loc])
  for(i in 1:8) {
    k <- neigh[[loc]][i] # 8 neighbouring cells
    cxy <- c(cxy, rep(k,BB[k]))
  }
  # Sample one value in the selection array cxy
  # The larger BB[k] is, the higher the probability for the animal to move to cell k
  # Move to the new location and add 1 to the number of animal occurrences at loc
  loc <- some(cxy,1)
  location[loc] <- location[loc]+1
} # Simulate the next move

Originally posted: Wednesday July 5, 2006 - 11:22am (EEST)

Category: R Statistics | 2836 views | 0 comments
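The movement rule P_n = H_n / SUM(H_i) can be tried out on a toy grid. The following Python sketch is an illustration only: the 5 x 5 grid `H`, its values, and the function names are hypothetical and not taken from the Kruger data. Instead of building a repeated-value array as the R script does, it passes the weights directly to `random.choices`, which gives the same selection probabilities. It assumes the current cell is an interior cell (the zero-valued border in the post serves that purpose) and that at least one cell in the 3 x 3 block has H > 0.

```python
import random

# Hypothetical habitat selection index with a zero-valued border.
H = [
    [0, 0, 0, 0, 0],
    [0, 1, 2, 1, 0],
    [0, 2, 4, 2, 0],
    [0, 1, 2, 1, 0],
    [0, 0, 0, 0, 0],
]


def move_probabilities(h, row, col):
    """P_n = H_n / sum(H_i) over the 3 x 3 block centred on (row, col)."""
    cells = [(row + dr, col + dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
    total = sum(h[r][c] for r, c in cells)
    return {(r, c): h[r][c] / total for r, c in cells}


def next_cell(h, row, col, rng):
    """Draw the next cell (possibly the current one) with those probabilities."""
    probs = move_probabilities(h, row, col)
    cells = list(probs)
    weights = [probs[cell] for cell in cells]
    return rng.choices(cells, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    position = (2, 2)
    for _ in range(5):
        position = next_cell(H, position[0], position[1], rng)
        print(position)
```

Because border cells have H = 0, they receive probability zero and the agent never steps onto them, which matches the comment in the script that animals will not move to cells outside the home range.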

R code to simulate animal movement in a torus

entomology 2008-7-28 22:57

R code to simulate animal movement in a torus

# Original code by Xiaohua Dai

# Required R packages
require(adehabitat)
require(car)
require(spdep)

## Initial parameters
# Location time series (x,y)
# time = number of time steps
time <- 15000
x <- array(0,time)
y <- array(0,time)

# Number of animal occurrences at location x,y: location
# location <- round(runif(length(HB),min=1,max=3))
BB <- array(HB)
neigh <- cell2nb(CellN,CellN,torus=TRUE,type="queen") # Generate 8 neighbours for each cell
image(as.asc(HB)) # Display the grid space of habitats

# Location coordinates (lx, ly)
# Use lxy to combine lx and ly together as a data frame
lx <- rep(1:CellN, CellN)
ly <- rep(1:CellN, each=CellN)
lxy <- data.frame(lx,ly)

# Initial location of animal
loc <- round(runif(1,min=1,max=length(lx)))

## Movement simulation
for(t in 1:time){
  # Record location time series
  x[t] <- lxy$lx[loc]
  y[t] <- lxy$ly[loc]
  # Draw location point
  points(lxy$lx[loc],lxy$ly[loc], col = round(runif(1, max=10)), pch = 19)
  # 9-cell neighbourhood matrix of habitat selection
  cxy <- loc
  for(i in 1:8) {
    k <- neigh[[loc]][i] # 8 neighbouring cells in a torus
    # Repeat the number of k according to its preference degree BB[k]
    # The previous cell is also included since the animal has a certain probability of staying in it.
    cxy <- c(cxy, rep(k,BB[k]))
  }
  # Sample one value in the selection array cxy
  # The larger BB[k] is, the higher the probability for the animal to move to cell k
  # Move to the new location and add 1 to the number of animal occurrences at loc
  loc <- some(cxy,1)
  location[loc] <- location[loc]+1
} # Simulate the next move

## Estimation of kernel home range at the 25%, 50% and 95% levels
# for home range contour estimation
xy <- data.frame(x,y)
ud <- kernelUD(xy)
ver <- getverticeshr(ud, 95)
plot(ver, add=TRUE)
ver <- getverticeshr(ud, 50)
plot(ver, add=TRUE)
ver <- getverticeshr(ud, 25)
plot(ver, add=TRUE)

Originally posted: Wednesday July 5, 2006 - 11:23am (EEST)

Category: R Statistics | 2393 views | 0 comments
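What makes the grid a torus is only the way neighbours are looked up: indices that run off one edge re-enter from the opposite edge. A small Python sketch of that lookup follows; the function name is made up, and it illustrates the wrap-around idea rather than reproducing how the neighbour list in the script is built.

```python
def torus_neighbours(row, col, n_rows, n_cols):
    """The 8 queen-move neighbours of a cell on a grid whose edges wrap around."""
    return [
        ((row + dr) % n_rows, (col + dc) % n_cols)
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)  # skip the cell itself
    ]


if __name__ == "__main__":
    # A corner cell of a 4 x 4 torus still has 8 distinct neighbours.
    print(torus_neighbours(0, 0, 4, 4))
```

With wrap-around there is no border, so no zero-valued buffer is needed and every cell has exactly eight neighbours (provided the grid is at least 3 cells wide in each direction).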

R code to generate convex hulls around point clusters

entomology 2008-7-28 22:56

R code to generate convex hulls around point clusters

# Original code by Roger Bivand
# Modified by Xiaohua Dai

require(maptools)
require(sp)
require(amap)
require(shapefiles)

# Reading the point shapefile
foodloc <- readShapePoints("foodtree.shp")
# yourloc <- readShapePoints("yourshape.shp")
xy <- coordinates(foodloc)

xy_clusts <- hcluster(xy, method="euclidean", link="complete")
# hcluster uses half as much memory, as it doesn't store the distance matrix
# complete linkage hierarchical clustering
plot(xy_clusts) # shows the clustering tree
cl <- cutree(xy_clusts, 200) # 200 is the number of clusters

which_cl <- tapply(1:nrow(xy), cl, function(i) xy[i,])
chulls_cl <- lapply(which_cl, function(x) x[chull(x),])
plot(xy)
res <- lapply(chulls_cl, polygon)

n <- length(chulls_cl)
polygons <- lapply(1:n, function(i) {
  chulls_cl[[i]] <- rbind(chulls_cl[[i]], chulls_cl[[i]][1,])
  # the convex hulls do not join first and last points, so we copy here
  Polygons(list(Polygon(coords=chulls_cl[[i]])), ID=i)
})
out <- SpatialPolygonsDataFrame(SpatialPolygons(polygons), data=data.frame(ID=1:n))
plot(out)
# note standard-violating intersecting polygons!

tempfile <- tempfile()
writePolyShape(out, tempfile)
in_again <- readShapePoly(tempfile)
plot(in_again, border="blue", add=TRUE)

# Output
test <- read.shapefile(tempfile)
write.shapefile(test,"ptcluster")

# Refer to:
# http://www.google.com/search?hl=zh-CN&q=%22outline+polygons+of+point+clumps%22+r-project&btnG=Google+%E6%90%9C%E7%B4%A2&lr=

Originally posted: Wednesday July 5, 2006 - 12:34pm (EEST)

Category: R Statistics | 2339 views | 0 comments
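To see what the hull step does for a single cluster, here is a self-contained Python sketch. It uses Andrew's monotone chain algorithm, which is a swapped-in technique chosen for brevity and not what R's chull() is documented to use; the function names are hypothetical. The `closed_ring` helper mirrors the remark in the script that the hull does not repeat its first point, so the first vertex is copied to the end before building a polygon.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Hull vertices in counter-clockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        # Drop points that would make a clockwise or straight turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def closed_ring(hull):
    """Repeat the first vertex at the end so the outline closes."""
    return hull + hull[:1]


if __name__ == "__main__":
    cluster = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)]
    print(closed_ring(convex_hull(cluster)))
```

Running the hull on each cluster's points separately, as the tapply/lapply pair does in the script, yields one outline per cluster; outlines of neighbouring clusters may still intersect, as the script's comment warns.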

C code: Nearest neighbor contingency table analysis

entomology 2008-7-28 22:55

C code: Nearest neighbor contingency table analysis // 最近邻体列联表分析 Nearest neighbor contingency table analysis // Original code by Xiaohua Dai void CExcellentView::OnNearestNeighbor() { // TODO: Add your command handler code here COleVariant VOptional((long)DISP_E_PARAMNOTFOUND,VT_ERROR); _Worksheet ws1,ws2,ws3,ws4; Range rg1, rg2, rg3, rg4, cols, rows; VARIANT ret; wss.AttachDispatch(wb.GetWorksheets(),true); ws1.AttachDispatch(wss.GetItem(_variant_t(sheet1)),true); rg1.AttachDispatch(ws1.GetUsedRange(),true); ret = rg1.GetValue(); cols.AttachDispatch(rg1.GetColumns(),true); rows.AttachDispatch(rg1.GetRows(),true); long lNumRows = rows.GetCount(); long lNumCols = cols.GetCount(); COleSafeArray sa1(ret),sa2(ret),sa3(ret),sa4(ret); long index ; VARIANT val; int r,c; double n ; //double s , t ; double ss ,tt ,uu ; for (r=2;r=lNumRows;r++) { for (c=2;c=lNumCols;c++) { index = r; index = c; sa1.GetElement(index,val); val.dblVal += 0.5; n = val.dblVal; } } for(r=1;rlNumRows-1;r++) { //s = SegregationIndex(n ,n -n ,n -n ,n -n -n +n ); //t = Likelihood(n ,n -n ,n -n ,n -n -n +n ); for(c=1;clNumCols-1;c++) { ss = OddsRatio(n ,n ,n ,n ); tt = RevisedTest(n ,n ,n ,n ); uu = Likelihood(n ,n ,n ,n ); } } for (r=2;rlNumRows;r++) { for (c=2;clNumCols;c++) { VARIANT v1,v2,v3; v1 = _variant_t(ss ); v2 = _variant_t(tt ); v3 = _variant_t(uu ); index = r; index = c; sa2.PutElement(index,v1); sa3.PutElement(index,v2); sa4.PutElement(index,v3); } } ws2.AttachDispatch(wss.GetItem(_variant_t(sheet2)),true); ws2.SetVisible(TRUE); ws2.Activate(); rg2 = ws2.GetRange(COleVariant(A1),COleVariant(A1)); rg2 = rg2.GetResize(COleVariant(lNumRows),COleVariant(lNumCols)); rg2.SetValue(COleVariant(sa2)); rg2.AttachDispatch(ws2.GetCells(),true); /*rg2.SetItem(_variant_t(long(lNumRows+1)),_variant_t(long(1)),_variant_t(SegIndex)); rg2.SetItem(_variant_t(long(lNumRows+2)),_variant_t(long(1)),_variant_t(ChiSquare)); for (c=2;clNumCols;c++) { 
rg2.SetItem(_variant_t(long(lNumRows+1)),_variant_t(long(c)),_variant_t(s )); rg2.SetItem(_variant_t(long(lNumRows+2)),_variant_t(long(c)),_variant_t(t )); }*/ ws3.AttachDispatch(wss.GetItem(_variant_t(sheet3)),true); ws3.SetVisible(TRUE); ws3.Activate(); rg3 = ws3.GetRange(COleVariant(A1),COleVariant(A1)); rg3 = rg3.GetResize(COleVariant(lNumRows),COleVariant(lNumCols)); rg3.SetValue(COleVariant(sa3)); ws4.AttachDispatch(wss.GetItem(_variant_t(sheet4)),true); ws4.SetVisible(TRUE); ws4.Activate(); rg4 = ws4.GetRange(COleVariant(A1),COleVariant(A1)); rg4 = rg4.GetResize(COleVariant(lNumRows),COleVariant(lNumCols)); rg4.SetValue(COleVariant(sa4)); sa4.Detach(); sa3.Detach(); sa2.Detach(); sa1.Detach(); ExcelApp.SetVisible(TRUE); ExcelApp.SetUserControl(TRUE); } Monday August 7, 2006 - 05:33pm (EEST) Permanent Link | 0 Comments

Category: R Statistics | 2627 views | 0 comments

C code: Interspecific correlation

entomology 2008-7-28 22:54

C code: Interspecific correlation // 种间相关 Interspecific correlation // Original code by Xiaohua Dai void CExcellentView::OnCorrelation() { // TODO: Add your command handler code here COleVariant VOptional((long)DISP_E_PARAMNOTFOUND,VT_ERROR); _Worksheet ws1,ws2; Range rg1, rg2, cols, rows; VARIANT ret1,ret2; wss.AttachDispatch(wb.GetWorksheets(),true); ws1.AttachDispatch(wss.GetItem(_variant_t(sheet1)),true); rg1.AttachDispatch(ws1.GetUsedRange(),true); ret1 = rg1.GetValue(); cols.AttachDispatch(rg1.GetColumns(),true); rows.AttachDispatch(rg1.GetRows(),true); long lNumRows = rows.GetCount(); long lNumCols = cols.GetCount(); COleSafeArray sa1(ret1); long index ; VARIANT val; int r,c,i,j,k; double n ; double RR ; for (r=2;r=lNumRows;r++) { for (c=2;c=lNumCols;c++) { index = r; index = c; sa1.GetElement(index,val); n = val.dblVal; } } double SS , S ; for(i=1;ilNumRows;i++) { S = n ; for(j=1;jlNumRows;j++) { SS = 0.0; for (k=1;klNumCols-2;k++) { SS = SS + n *n ; } } } for(i=1;ilNumRows;i++) { for(j=1;jlNumRows;j++) { RR = (SS -S *S /double(lNumCols-3))/sqrt((SS -S *S /double(lNumCols-3))*(SS -S *S /double(lNumCols-3))); } } ws2.AttachDispatch(wss.GetItem(_variant_t(sheet2)),true); ws2.SetVisible(TRUE); ws2.Activate(); rg2 = ws2.GetRange(COleVariant(A1),COleVariant(A1)); rg2 = rg2.GetResize(COleVariant(lNumRows),COleVariant(lNumRows)); ret2 = rg2.GetValue(); COleSafeArray sa2(ret2); for (r=2;r=lNumRows;r++) { for (c=2;c=lNumRows;c++) { VARIANT v1; v1 = _variant_t(RR ); index = r; index = c; sa2.PutElement(index,v1); } } rg2.SetValue(COleVariant(sa2)); sa2.Detach(); sa1.Detach(); ExcelApp.SetVisible(TRUE); ExcelApp.SetUserControl(TRUE); } Monday August 7, 2006 - 05:34pm (EEST) Permanent Link | 0 Comments
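The expression assigned to RR has the shape of the computational formula for Pearson's correlation coefficient: (sum of products - product of sums / n) divided by the square root of the two corresponding sums-of-squares terms. Since the array subscripts in the listing did not survive, the Python sketch below is a reading of that formula under this assumption, with made-up function names and toy abundance data, not a reconstruction of the original routine.

```python
import math


def pearson_r(a, b):
    """Pearson correlation of two species' abundances via raw sums."""
    n = len(a)
    s_a, s_b = sum(a), sum(b)
    ss_ab = sum(x * y for x, y in zip(a, b))  # sum of cross products
    ss_aa = sum(x * x for x in a)
    ss_bb = sum(y * y for y in b)
    numerator = ss_ab - s_a * s_b / n
    denominator = math.sqrt((ss_aa - s_a * s_a / n) * (ss_bb - s_b * s_b / n))
    return numerator / denominator


def correlation_matrix(species_rows):
    """Pairwise correlations; each row holds one species' values across samples."""
    return [[pearson_r(a, b) for b in species_rows] for a in species_rows]


if __name__ == "__main__":
    abundances = [
        [1, 2, 3, 4],  # hypothetical species A
        [2, 4, 6, 8],  # hypothetical species B
        [4, 3, 2, 1],  # hypothetical species C
    ]
    for row in correlation_matrix(abundances):
        print([round(r, 3) for r in row])
```

A species that has the same value in every sample has zero variance, so the denominator is zero and the coefficient is undefined; real data would need a guard for that case.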

Category: R Statistics | 2012 views | 0 comments

C code: Interspecific association

entomology 2008-7-28 22:54

C code: Interspecific association // 种间联结 Interspecific association // Original code by Xiaohua Dai void CExcellentView::OnAssociation() { // TODO: Add your command handler code here COleVariant VOptional((long)DISP_E_PARAMNOTFOUND,VT_ERROR); _Worksheet ws1, ws2, ws3, ws4, ws5; Range rg1, rg2, rg3, rg4, rg5, cols, rows; VARIANT ret; wss.AttachDispatch(wb.GetWorksheets(),true); ws1.AttachDispatch(wss.GetItem(_variant_t(sheet1)),true); rg1.AttachDispatch(ws1.GetUsedRange(),true); ret = rg1.GetValue(); cols.AttachDispatch(rg1.GetColumns(),true); rows.AttachDispatch(rg1.GetRows(),true); long lNumRows = rows.GetCount(); long lNumCols = cols.GetCount(); COleSafeArray sa1(ret),sa2(ret),sa3(ret), sa4(ret), sa5(ret); long index ; VARIANT val; int r,c,s; double n ; double z ,g ,x ,p ; for (r=2;r=lNumRows;r++) { for (c=2;c=lNumCols;c++) { index = r; index = c; sa1.GetElement(index,val); n = val.dblVal; } } double aa, bb, cc, dd; for(r=1;rlNumRows;r++) { for(s=1;slNumRows;s++) { aa = bb = cc = dd = 0.0; for(c=1;clNumCols;c++) { if ((n 0.0)(n 0.0)) aa += 1.0; else if ((n 0.0)(n ==0.0)) bb += 1.0; else if ((n ==0.0)(n 0.0)) cc += 1.0; else if ((n ==0.0)(n ==0.0)) dd += 1.0; } /*aa += 0.5; bb += 0.5; cc += 0.5; dd += 0.5;*/ x = RevisedTest(aa,bb,cc,dd); //g = Likelihood(aa,bb,cc,dd); g = AC(aa,bb,cc,dd); z = PC(aa,bb,cc,dd); p = PCC(aa,bb,cc,dd); //z = OddsRatio(aa,bb,cc,dd); } } for (r=2;r=lNumRows;r++) { for (c=2;c=lNumCols;c++) { VARIANT v1,v2,v3,v4; v1 = _variant_t(x ); v2 = _variant_t(g ); v3 = _variant_t(z ); v4 = _variant_t(p ); index = r; index = c; sa2.PutElement(index,v1); sa3.PutElement(index,v2); sa4.PutElement(index,v3); sa5.PutElement(index,v4); } } ws2.AttachDispatch(wss.GetItem(_variant_t(sheet2)),true); ws2.SetVisible(TRUE); ws2.Activate(); rg2 = ws2.GetRange(COleVariant(A1),COleVariant(A1)); rg2 = rg2.GetResize(COleVariant(lNumRows),COleVariant(lNumRows)); rg2.SetValue(COleVariant(sa2)); ws3.AttachDispatch(wss.GetItem(_variant_t(sheet3)),true); 
ws3.SetVisible(TRUE); ws3.Activate(); rg3 = ws3.GetRange(COleVariant(A1),COleVariant(A1)); rg3 = rg3.GetResize(COleVariant(lNumRows),COleVariant(lNumRows)); rg3.SetValue(COleVariant(sa3)); ws4.AttachDispatch(wss.GetItem(_variant_t(sheet4)),true); ws4.SetVisible(TRUE); ws4.Activate(); rg4 = ws4.GetRange(COleVariant(A1),COleVariant(A1)); rg4 = rg4.GetResize(COleVariant(lNumRows),COleVariant(lNumRows)); rg4.SetValue(COleVariant(sa4)); ws5.AttachDispatch(wss.GetItem(_variant_t(sheet5)),true); ws5.SetVisible(TRUE); ws5.Activate(); rg5 = ws5.GetRange(COleVariant(A1),COleVariant(A1)); rg5 = rg5.GetResize(COleVariant(lNumRows),COleVariant(lNumRows)); rg5.SetValue(COleVariant(sa5)); sa5.Detach(); sa4.Detach(); sa3.Detach(); sa2.Detach(); sa1.Detach(); ExcelApp.SetVisible(TRUE); ExcelApp.SetUserControl(TRUE); } Monday August 7, 2006 - 05:35pm (EEST) Permanent Link | 0 Comments
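The core of the association routine is the 2 x 2 presence/absence table built in the inner loop: aa counts samples where both species occur, bb and cc where only one occurs, and dd where neither does. The Python sketch below shows that counting step. The helper functions called in the listing (RevisedTest, AC, PC, PCC, OddsRatio) are not shown in the post, so the `odds_ratio` function here is a generic textbook definition with an optional 0.5 continuity correction (echoing the commented-out `+= 0.5` lines), offered as an assumption rather than the author's implementation.

```python
def association_table(a, b):
    """Counts (aa, bb, cc, dd) of joint presence/absence across samples.

    a, b -- abundances of two species in the same sampling units;
            a value > 0 means the species is present.
    """
    aa = bb = cc = dd = 0
    for x, y in zip(a, b):
        if x > 0 and y > 0:
            aa += 1  # both present
        elif x > 0:
            bb += 1  # only the first species present
        elif y > 0:
            cc += 1  # only the second species present
        else:
            dd += 1  # both absent
    return aa, bb, cc, dd


def odds_ratio(aa, bb, cc, dd, correction=0.5):
    """(aa*dd)/(bb*cc), adding `correction` to every cell to avoid zeros."""
    aa, bb, cc, dd = (v + correction for v in (aa, bb, cc, dd))
    return (aa * dd) / (bb * cc)


if __name__ == "__main__":
    species_1 = [1, 0, 2, 0, 3]  # hypothetical abundances
    species_2 = [1, 1, 0, 0, 2]
    table = association_table(species_1, species_2)
    print(table, odds_ratio(*table))
```

An odds ratio above 1 suggests the two species tend to occur together and below 1 that they tend to avoid each other; with as few samples as in this toy example the value is of course not meaningful.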

Category: R Statistics | 2072 views | 0 comments

Let's R

entomology 2008-7-26 21:39

Let's R
Posted by entomology on 2005-6-16 17:27:00

What is R?
* R is not only a programming language; it is also a graphical statistical environment with plenty of easily loaded packages. (I like this; it is just like the easy-to-use extensions for ArcView.)

How to R?
* You can write your own scripts, and you can also call a large number of powerful functions.

Why R?
* You can run R on UNIX, Windows and Mac OS.
* R is free: free of charge and free to use.
* R combines functional programming and object-oriented programming.
* You need not be a programmer, and you can quickly become one.
* Many R users and big-name statisticians around the world will answer your questions on the mailing lists.

Where is R?
* Home page (and many mirrors): http://www.R-project.org/
* Useful mini-course for beginners: http://life.bio.sunysb.edu/~dstoebel/R/
* R introduction in Chinese: http://www.biosino.org/pages/newhtm/r/schtml/
* R resources for ecologists: http://cran.r-project.org/web/views/Environmetrics.html

Last update 2000.06.16, Xiaohua Dai @ ecoinformatics.blog.edu.cn
Search-engine keywords: R statistical software, R in Chinese, Chinese R, R language

Category: R Statistics | 2070 views | 0 comments

GIS-related packages in R

entomology 2008-7-26 21:36

GIS-related packages in R
Posted by entomology on 2005-7-8 20:39:00

ade4 -- Analysis of Environmental Data: Exploratory and Euclidean methods in Environmental sciences
adehabitat -- Analysis of habitat selection by animals
fields -- Tools for spatial data
GRASS -- Interface between GRASS 5.0 geographical information system and R
mapdata -- Extra Map Databases
mapproj -- Map Projections
maps -- Draw Geographical Maps
maptools -- Tools for reading and handling shapefiles
maptree -- Mapping, pruning, and graphing tree models
PBSmapping -- PBS Mapping 2
shapefiles -- Read and Write ESRI Shapefiles
sp -- Classes and methods for spatial data
spatial -- Functions for Kriging and Point Pattern Analysis
spatstat -- Spatial Point Pattern analysis, model-fitting and simulation
spdep -- Spatial dependence: weighting schemes, statistics and models
etc.

Category: R Statistics | 2111 views | 0 comments

R Tools and Websites

entomology 2008-7-26 21:35

Common R tools and websites
Posted by entomology on 2008-7-26 20:02:00

These are the tools and websites I have found most useful in my several years of learning R. I am writing down some of them now and will add more as I think of them.

1 R Task Views: to install the set of packages for a particular field of study: http://cran.r-project.org/web/views/
For example, for ecology: http://cran.r-project.org/web/views/Environmetrics.html

2 R Reference Card: a printed guide to keep at hand. It is only a few pages long (the shortest version is a single page) but has many useful hints, and can be printed out for quick reference:
(1) One-page version, English: http://cran.r-project.org/doc/contrib/Short-refcard.pdf
(2) Multi-page version, English: http://cran.r-project.org/doc/contrib/refcard.pdf
(3) Multi-page version, Chinese: http://cran.r-project.org/doc/contrib/Liu-R-refcard.pdf

3 Tinn-R: an R editor with a graphical interface that makes R easier to use: http://sourceforge.net/projects/tinn-r

4 Rcmdr: a GUI for R:
http://cran.r-project.org/web/packages/Rcmdr/index.html
http://socserv.mcmaster.ca/jfox/Misc/Rcmdr/

5 When updating packages you can choose the Korean mirror; it is fast, and it is updated much sooner than the mirrors in China.

Category: R Statistics | 2990 views | 1 comment
