Examples of Fortran MEX-Files exsample 2================================================================================ C This is an adaptation of matsq from the API Guide. C C This file has been written to demonstrate the use C of two functions so that the size of the input C matrices must not be known at compilation. C C As an m-file, this would be written C function z=multscalar(x,y) C % x is a matrix of any dimension C % y is a scalar C z=x*y; C C Written by Matthew C Roberts, matthewcroberts@hotmail.com C C This is the gateway function subroutine mexFunction(nlhs,plhs,nrhs,prhs) implicit none integer nlhs, nrhs, mxGetM, mxGetN integer m(nrhs), n(nrhs), mxIsNumeric C Change this to integer*8 for 64 bit matchines integer plhs(nlhs), prhs(nrhs) C Check the proper number of input and output arguments if (nlhs .gt. 1) then call mexErrMsgTxt('No more than one output argument') end if if (nrhs .ne. 2) then call mexErrMsgTxt('Must be exactly two input arguments') end if C Check to make sure that the arguments are non-numeric if (mxIsNumeric(prhs(1)).eq.0) then call mexErrMsgTxt('Non-numeric Data Present') end if if (mxIsNumeric(prhs(2)).eq.0) then call mexErrMsgTxt('Non-numeric Data Present') end if C Get size of input arguments m(1)=mxGetM(prhs(1)) m(2)=mxGetM(prhs(2)) n(1)=mxGetN(prhs(1)) n(2)=mxGetN(prhs(2)) call matmult_1(nlhs,plhs,nrhs,prhs,m,n) end C ================================================================= C This is the computational function subroutine matmult_1(nlhs,plhs,nrhs,prhs,m,n) implicit none integer nrhs, nlhs integer m(nrhs), n(nrhs) C These are the input argument double precision x(m(1),n(1)), y, z(m(1),n(1)) C Change these to integer*8 for 64 bit machines integer plhs(nlhs), prhs(nrhs), zpr integer mxCreateFull, mxGetPr external mxGetPr C Copy the inputs into FORTRAN call mxCopyPtrtoReal8(mxGetPr(prhs(1)),x,m(1)*n(1)) call mxCopyPtrtoReal8(mxGetPr(prhs(2)),y,1) C Now, do the math: call matmult_2(x,y,z,m,n) C Send the answer home plhs(1)=mxCreateFull(m(1),n(1),0) zpr=mxGetPr(plhs(1)) call mxCopyReal8toPtr(z,zpr,m(1)*n(1)) end C Computational Subroutine? subroutine matmult_2(x,y,z,m,n) implicit none integer m(2), n(2) double precision x(m(1),n(1)), y, z(m(1),n(1)) z = x*y end exsample 3========================= dgemv.f ===================================== Here is an example of a fortran routine for multiplying a vector by a matrix. Of course, Matlab can do this already. This example was chosen because it can so easily be verified to work, and also because it contains several examples of important issues in creating a gateway file. In particular, variables of all types (integer, float, character) are passed, and in the case of float, we use scalar, vector, and matrix forms. Also addressed the issue of when an input parameter in fortran is modified and is output as well. This is distinct from functions in Matlab, where input and output to and from a function are distinct. This example contains most of the commands that would be needed by someone using numerical algorithms. dgemv.f is a fortran subroutine for multiplying a vector by a matrix. It is from the Blas set of routines, downloaded from Netlib. xerbla.f and lsame.f are called by dgemv.f. dgemvg.f is the gateway routine that links the fortran code to Matlab. mvmults.m is a Matlab script that defines the matrix and vector we wish to multiply, and calls the routine for doing the multiplication. --------------------------------dgemv.f-------------------------- subroutine mexfunction(nlhs,plhs,nrhs,prhs) integer*4 plhs(*),prhs(*) integer nlhs,nrhs C Define types for mx functions. Recall that pointers used C in mex files must be of type integer*4 integer*4 mxgetpr, mxcreatefull, mxgetstring real*8 mxgetscalar integer*4 status C Variables (and pointers to variables) used by the Fortran subroutine. C The floating point variables will be referred to by pointers. C For the integer and character variables, we will use the actual C values, so we defing the variables accordingly. integer*4 a,x,y,alpha,beta character*1 trans if (nrhs .ne. 11) then call mexerrmsgtxt('Wrong number of inputs in sgemv') end if if (nlhs .ne. 1) then call mexerrmsgtxt('Wrong number of outputs in sgemv') end if C Retrieve the dimensions of the input matrix and vector. This is not C necessary, but it gives us an opportunity to check for errors. C Moreover, this will help us determine how long to make the output C vector. ma = mxgetm(prhs(4)) na = mxgetn(prhs(4)) mx = mxgetm(prhs(6)) nx = mxgetn(prhs(6)) status = mxGetString(prhs(1),trans,1) m = int(mxgetscalar(prhs(2))) n = int(mxgetscalar(prhs(3))) lda = int(mxgetscalar(prhs(6))) incx = int(mxgetscalar(prhs(8))) incy = int(mxgetscalar(prhs(11))) C Create the output variable. plhs(1) = mxcreatefull(na,1,0) alpha = mxgetpr(prhs(4)) a = mxgetpr(prhs(5)) x = mxgetpr(prhs(7)) beta = mxgetpr(prhs(9)) y = mxgetpr(prhs(10)) call dgemv(trans,m,n,%val(alpha),%val(a),lda, %val(x),incx,%val(beta), %val(y),incy) plhs(1) = prhs(10) return end --------------------- xerbla.f ------------------------------- SUBROUTINE XERBLA( SRNAME, INFO ) * * -- LAPACK auxiliary routine (preliminary version) -- * Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., * Courant Institute, Argonne National Lab, and Rice University * February 29, 1992 * * .. Scalar Arguments .. CHARACTER*6 SRNAME INTEGER INFO * .. * * Purpose * ======= * * XERBLA is an error handler for the LAPACK routines. * It is called by an LAPACK routine if an input parameter has an * invalid value. A message is printed and execution stops. * * Installers may consider modifying the STOP statement in order to * call system-specific exception-handling facilities. * * Arguments * ========= * * SRNAME (input) CHARACTER*6 * The name of the routine which called XERBLA. * * INFO (input) INTEGER * The position of the invalid parameter in the parameter list * of the calling routine. * * WRITE( *, FMT = 9999 )SRNAME, INFO * STOP * 9999 FORMAT( ' ** On entry to ', A6, ' parameter number ', I2, ' had ', $ 'an illegal value' ) * * End of XERBLA * END -------------------- lsame.f ----------------------------------- LOGICAL FUNCTION LSAME( CA, CB ) * * -- LAPACK auxiliary routine (version 2.0) -- * Univ. of Tennessee, Univ. of California Berkeley, NAG Ltd., * Courant Institute, Argonne National Lab, and Rice University * January 31, 1994 * * .. Scalar Arguments .. CHARACTER CA, CB * .. * * Purpose * ======= * * LSAME returns .TRUE. if CA is the same letter as CB regardless of * case. * * Arguments * ========= * * CA (input) CHARACTER*1 * CB (input) CHARACTER*1 * CA and CB specify the single characters to be compared. * * ===================================================================== * * .. Intrinsic Functions .. INTRINSIC ICHAR * .. * .. Local Scalars .. INTEGER INTA, INTB, ZCODE * .. * .. Executable Statements .. * * Test if the characters are equal * LSAME = CA.EQ.CB IF( LSAME ) $ RETURN * * Now test for equivalence if both characters are alphabetic. * ZCODE = ICHAR( 'Z' ) * * Use 'Z' rather than 'A' so that ASCII can be detected on Prime * machines, on which ICHAR returns a value with bit 8 set. * ICHAR('A') on Prime machines returns 193 which is the same as * ICHAR('A') on an EBCDIC machine. * INTA = ICHAR( CA ) INTB = ICHAR( CB ) * IF( ZCODE.EQ.90 .OR. ZCODE.EQ.122 ) THEN * * ASCII is assumed - ZCODE is the ASCII code of either lower or * upper case 'Z'. * IF( INTA.GE.97 .AND. INTA.LE.122 ) INTA = INTA - 32 IF( INTB.GE.97 .AND. INTB.LE.122 ) INTB = INTB - 32 * ELSE IF( ZCODE.EQ.233 .OR. ZCODE.EQ.169 ) THEN * * EBCDIC is assumed - ZCODE is the EBCDIC code of either lower or * upper case 'Z'. * IF( INTA.GE.129 .AND. INTA.LE.137 .OR. $ INTA.GE.145 .AND. INTA.LE.153 .OR. $ INTA.GE.162 .AND. INTA.LE.169 ) INTA = INTA + 64 IF( INTB.GE.129 .AND. INTB.LE.137 .OR. $ INTB.GE.145 .AND. INTB.LE.153 .OR. $ INTB.GE.162 .AND. INTB.LE.169 ) INTB = INTB + 64 * ELSE IF( ZCODE.EQ.218 .OR. ZCODE.EQ.250 ) THEN * * ASCII is assumed, on Prime machines - ZCODE is the ASCII code * plus 128 of either lower or upper case 'Z'. * IF( INTA.GE.225 .AND. INTA.LE.250 ) INTA = INTA - 32 IF( INTB.GE.225 .AND. INTB.LE.250 ) INTB = INTB - 32 END IF LSAME = INTA.EQ.INTB * * RETURN * * End of LSAME * END ---------------------------- mvmults.m ------------------------------------- mvmults.m is a Matlab script that defines the matrix and vector we wish to multiply, and calls the routine for doing the multiplication. clear rows = 100; for i = 1:rows, for j = 1:rows, amat(i,j) = (i*j-1)/rows; end xvec(i) = i/rows; end xvec = xvec'; yin = zeros(rows,1); yvec = dgemv('N',rows,rows,1.0,amat,rows,xvec,1,0.0,yin,1);
cudaThreadSynchronize() is a _host_ function that waits for all previous async operations (i.e. kernel calls, async memory copies) to complete. __synchtreads() is a _device_ function that acts as a thread barrier. All threads in a block must reach the barrier before any can continue execution. It is only of use when you need to avoid race conditions when threads in a block access shared memory.
2008年5月15日,四川省北川县一所被地震震塌的教室的废墟下,一只蝴蝶在一位已经死亡的学生脚旁飞舞。Jason Lee 摄 2008的那场地震。 蝴蝶落在姑娘的花布鞋上。 这到底伤心了谁? 摄影师的蝴蝶比摄影师更令人难过, 科学家却比科学家的文章更加忧伤。 Pnas说,伤及大脑,日后难以忘记。 Plosone说,他们不难过,伤心的是你们。 都是真相吧,区别就在一年以后。 担忧的事很多,担忧的人却很少, 昨晚版纳的地震就够人难以入睡的。 更难以入梦的却不是摇晃,而是摇晃的房子。 和摇晃过后,我们该如何摇晃我们的房子。 参考文献: Su Luia, Qiyong Gonga,2009 High-field MRI reveals an acute impact on brain function in survivors of the magnitude 8.0 earthquake in China .Pnas.106(36)15412–15417.doi: 10.1073/pnas.0812751106 Li S, Rao L-L, Bai X-W, Zheng R, Ren X-P, et al. (2010) Progression of the “Psychological Typhoon Eye” and Variations Since the Wenchuan Earthquake . PLoS ONE 5(3): e9727. doi:10.1371/journal.pone.0009727 附: 如果西双版纳是这样的楼房,再大的地震我们都不会害怕。
http://pre.aps.org/abstract/PRE/v81/i6/e066202 Computation of the drift velocity of spiral waves using response functions Abstract: Rotating spiral waves are a form of self-organization observed in spatially extended systems of chysical,chemical, and biological nature. In the presence of a small perturbation, the spiral waves center of rotation and fiducial phase may change over time, i.e., the spiral wave drifts. In linear approximation, the velocity of the drift is proportional to the convolution of the perturbation with the spirals response functions, which are the eigenfunctions of the adjoint linearized operator corresponding to the critical eigenvalues =0, +-omega. Here, we demonstrate that the response functions give quantitatively accurate prediction of the drift velocities due to a variety of perturbations: a time dependent, periodic perturbation inducing resonant drift; a rotational symmetry-breaking perturbation inducing electrophoretic drift; and a translational symmetry-breaking perturbation inhomogeneity induced drift including drift due to a gradient, stepwise, and localized inhomogeneity. We predict the drift velocities using the response functions in FitzHugh-Nagumo and Barkley models, and compare them with the velocities obtained in direct numerical simulations. In all cases good quantitative agreement is demonstrated. 这是作者所在研究组一系列文章的一个总结类文献,理论基石仍然是第二作者,D. Barkley在上世纪九十年代发展的稳定性分析。 如其所言,计算response 函数的复杂性让一般,习惯于直接模拟的研究者有点发怵,别人我不敢说,对于我是这样的。 我感兴趣的是其中的resonant drift.
There are several method for curve fitting should be summarized for latter use. 1. "Solver" in Excel Step1 Define the objective parameters; Step2 Calculate the objective function with initial estimation of parameters and the observation data; Step3 Calculate and sum " { y_obs -y_pred }^2 "; Step4 Click "Solver" in the "Tool" tab; Step5 Select the "objective cell" and let "Solver" guess the "changed cell", and press "Solve" Advantage: Easy to implement without coding; Disadvantage: Time-consuming in estimating the reasonable initial value of parameters without priori-knowledge. 2. "lsqcurvefit" and "nlinfit"function F=fun(x,xdata)F=(x(1)./(xdata-x(2))).^(x(3)+1).*exp(-x(1)./(xdata-x(2)))/(x(1)*gamma(x(3))); xdata= ; ydata= ;x0= ; =nlinfit(xdata,ydata,@fun,x0) or =lsqcurvefit(@fun,x0,xdata,ydata) MaxValue=max(xdata); MinValue=min(xdata); x1= ; y1=(x(1)./(xdata-x(2))).^(x(3)+1).*exp(-x(1)./(xdata-x(2)))/(x(1)*gamma(x(3))); loglog(xdata, ydata, 'ko',xdata,y1,'k-') Summary: The "Isqcurvefit" doesn't need very accurate initial estimation of parameters,however, the result is much worse than solver or good initial estimation of parameters. The "nlinfit" needs good estimation of initial parameters to get results. 3. Maximum Likelihood Estimation The advantage of this method is the loose need for the accurate initial estimation of the parameters. And, it is super-good method to do the curve fitting for the data set span several orders of magnitude. Open the M file of “Inverse_Gamma” Step1. Prepare Input Data Copy the bin of landslide area into “A_L” at line 5; Copy the calculated probability into “y” at line 11; Copy the frequency of landslides into “y_freq” at line 15; Step2. Change function ‘mlecustom’ in MATLAB Open ‘mlecustom’ with the path of “InstallationDrive:\Matlab\toolbox\stats\private\mlecustom.m”; Or type ‘mlecustom’ in MATLAB Editor, then, select the string, right-click the string, and select ‘Open Selection’ in the popup list; Right after the function head, type in “global y_freq”; At line 150, replace “uncensFreq=freq(c);” with “uncensFreq =y_freq;” Step3. Output Run the script (Inverse_Gamma.m) in MATLAB Editor; The optimized parameters and the coefficient of determination will be showed in the command window; The fitting curve is showed in the figure window;
Xuzhang Shen Sex: Male Born: 7 /29/1976 Email: shenxzh@gmail.com Office Address: Lanzhou Institute of Seismology, China Earthquake Administration (CEA). 450 Donggang West Road, Lanzhou, Gansu Province, 730000, China Tel: +86-09318277661; Cell phone: +86-15009313127 Education PH.D. Earth College of Graduate University of Chininese Academy of Science (GUCAS). Beijing, China. 2008 M.S. Lanzhou seismology institute of China Earthquake Administration (CEA). Lanzhou, China. 2003 Undergraduate. Physical department of Gansu educational institute. Lanzhou, China. 2000 Junior college. Physical department of Zhangye teacher’s college. Zhangye, China. 1996 Professional Experience 11/2012-present. Researcher, LIS. 11/2008-11/2012. Associate researcher, LIS. 01/2005-12/2008. Assistant researcher , LIS. 05/2013-06/2013, Visit researcher in GFZ, hosted by Dr. Xiaohui Yuan 04/2012-05/2012, Visit researcher in GFZ, hosted by Dr. Xiaohui Yuan 09/2011-10/2011, Visit researcher in GFZ, hosted by Dr. Xiaohui Yuan 12/2009-12/2010, Visit researcher in ERI, hosted by Prof. Hitoshi Kawakatsu 11/2007-12/2007, Foreign research fellow in ERI, hosted by Prof. Hitoshi Kawakatsu 02/2007-06/2007, Foreign research fellow in ERI of Tokyo University, hosted by Prof. Hitoshi Kawakatsu 07/2003-12/2004. Research Assistant, Lanzhou institute of seismology (LIS), Chinese Earthquake Administration (CEA). List of academic publications 1. SHEN Xu-Zhang . An analysis of the deformation of the crust and LAB beneath the Lushan and Wenchuan earthquakes in Sichuan province (In Chinese with English abstract) . Chinese Journal Geophysics, 2013,56(6): 1895-1903,doi: 10.6038/cjg20130612 2. SHEN Xu-Zhang . Imaging structures of crust and upper mantle beneath the source of the 14 April 2010 Yushu, Qinghai earthquake using P- and S- wave receiver functions (In Chinese with English abstract) . Chinese Journal Geophysics, 2013,56(2): 493-503,doi: 10.6038/cjg20130213 3. Shen Xuzhang , Zhou Huilan, Receiver functions of CCDSN and crustal structure of China continent, Earthquake Science, 2012, 25:3-16, doi:10.1007/s11589- 012-0826-6 4. Shen Xuzhang , Zhou Huilan, Peeling linear inversion of upper mantle velocity structure with receiver functions, Earthquake Science, 2012, 25:65-74, doi:10.1007/s11589-012-0832-8 5. Xuzhang Shen , Xiuping Mei and Yuansheng Zhang. The crustal and upper mantle structures beneath the northeastern margin of Tibet, Bull. Seismol. Soc. Am.,2011,101,6, 2782–2795, doi: 10.1785/0120100112 6. SHEN Xu-Zhang , QIN Man-Zhong, ZHANG Yuan-Sheng et al. Seismological evidence of ultra-crust faults in northeastern margin of Qinghai-Tibet Plateau (In Chinese with English abstract). Chinese J. Geot. Eng., 2011, 33(Suppl.1): 255-260. 7. Shen Xuzhang , Study on the low-velocity discontinuity at the depth of 170 km beneath SSE station in Shanghai, China. Chinese J. Geophys.2011. 8. SHEN Xu-zhang , MEI Xiu-ping, YANG Hui. Study on the crustal structures beneath Wenchuan earthquake rupture zone. Progress in Geophysics (In Chinese with English abstract), 2011, 26(2): 477-488, doi: 10.3969/j.issn.1004-2903.2011.02.012 9. Shen Xuzhang , Zhang Shuzhen, Zheng zhong and Hao Chunyue, Study on the small-scale heterogeneities below the Hailaer CTBTO seismic array, Chinese J. Geophys (In Chinese with English abstract). 2010, 53(5):1158-1166. DOI:10.3969/j.issn. 0001-5733.2010.05.107 10. Shen Xuzhang and Mei Xiuping, The Array Response Function of Lanzhou Seismic Array and Results of FK analysis for Small earthquakes of Different Azimuths(In Chinese with English abstract). 2010, Northwestern Seismological Journal, 32 ( 1 ): 59-64 11. Xuzhang Shen and Joachim R. R. Ritter, Small-scale heterogeneities below the Lanzhou CTBTO seismic array, from seismic wavefield fluctuations, Journal of Seismo, 2009, 10.1007/s10950-009-9177-8. 12. Shen Xuzhang and Zhou Huilan, Locating seismic scatterers at the base of the mantle beneath eastern Tibet with PKIKP precursors, Chinese Science Bulletin, 2009, 10.1007/s11434-009-0382-1 13. Shen Xuzhang and Zhou Huilan, The low-velocity layer at the depth of 620 km beneath Northeast China, Chinese Science Bulletin, 2009, DOI 10.1007/s11434-008-0559-z 14. Xuzhang Shen , Huilan Zhou, and Hitoshi Kawakatsu, Mapping the upper mantle discontinuities beneath China with teleseismic receiver functions, Earth Planets Space, 2008 , 60 (7):713-719 15. Shen Xuzhang and Mei Xiuping, Temporal and spatial evolution of water-tube tiltmeter record of China deformation stations and its implication before and after Ms8.1 earthquake west to the Kunlun Mountain pass (In Chinese with English abstract). Northwestern Seismological Journal, 2009,31(1):57-60 16. Shen Xuzhang and Zhou Huilan Inversion of velocity structure under HLR seismic station with receiver function and NA method(In Chinese with English abstract), Journal of the Graduate School of the Chinese Academy of Sciences, 2005, 22(3 ) :322-328 17. Shen Xuzhang , Chang qianjun and Mei Xiuping, Analysis on precursor effect to earthquake of FSQ water tube tiltmeter in Lanzhou deformation station (In Chinese with English abstract). Northwestern Seismological Journal, 2004, 26(4): 368-370 18. Xiaoqing Xu , Xuzhang Shen * , Chang Ming , et al., Preliminary analysis of teleseismic receiver functions of the Ningxia and its adjacent area, Earthquake Science, 2012, 25:47-53. doi:10.1007/s11589-012-0830-x 19. Tang Jiuan, Shen Xuzhang *, and Gao Antai et al., Exprolation of estimation of observations of confined water level(In Chinese with English abstract). Journal of Geodesy and Geodynamics,2012,32(4):37-40 Meeting Abstracts Xuzhang Shen , Teh-Ru Alex Song, Determinating the lithosphere/asthenosphere boundary (LAB) with multiples on P-receiver functions, “Symposium III: Craton vs. Orogen”at theInternational Conference on Craton Formation and Destruction, 2011, Beijing, 25-29 April 2011. Shen Xuzhang , Origin of the geothermal in Tianshui region in the northeastern margin of Tibet and its implication, JPGU, 2010.6 Tokyo, Japan (Talk) Shen Xuzhang , Zhou Huilan. Peeling hybrid inversion of upper mantle structure with receiver functions. Poster, WPGM, 2006, Beijing, China. Shen Xuzhang , Zhou Huilan. The low velocity zone influence of receiver functions and the treat with it in theinversion, China Earthquake Annual, 2005 393 (Talk) Shen Xuzhang , Zhou Huilan. Parallel NA Inversion of Receiver Functions China Earthquake Annual, 2004 370 (Talk) Fu Guanyu, Shen Xuzhang , Tang, Jiuan, Fukuda,Y, Coseismic Strain Steps of the 2008 Wenchuan Earthquake Indicate EW Extension of Tibetan Plateau and Increased Hazard South to Epicenter, American Geophysical Union, Fall Meeting 2008 Research grant record Project Manager National Science Foundation of China. Study on the lithospheric anisotropy and deformation of eastern margin of Tibet and its adjacent area, 2013-2016 National Science Foundation of China. Study of lithosphere and deep discontinuities of upper mantle beneath northeast and east margin Tibet with P and S receiver functions. 2010---2012. Basic RD fund of Institute of Earthquake Prediction, CEA, Full wavefield simulation in 3-D media in the northeastern margin of Tibet. 2009-2010. China Earthquake Administration (CEA) associate fund. Study of the detailed structure of lithosphere and upper mantle under Lanzhou Dajianshan array. 2005---2007. Project Participants China Earthquake Administration (CEA) associated fund. Relationship of seismocity and spatio-temporal distribution of China deformation field. 2005---2007 National Science Foundation of China. Study of peeling inversion method of receiver functions and detection of deep discontinuities of upper mantle under China. 2003---2006. Special basic funding of Ministry of Science and Technology of the P.R.C. Construction of China tidal database. 2004---2006 Skills Operation system: Proficient in Windows, Linux Programming languages: Fortran Seismic software: SAC, Matseis, SH Script languages: Be familiar with Matlab, Perl, Shell, Java, Python Other software applications: Word, Powerpoint, Excel, GMT, TauP, Photoshop Excellent communication skills including written,verbal and interpersonal.
Biodiversity in a complex world: consolidation and progress in functional biodiversity research 从 Ecology Letters 作者: Helmut Hillebrand, Birte Matthiessen The global decline of biodiversity caused by human domination of ecosystems worldwide is supposed to alter important process rates and state variables in these ecosystems. However, there is considerable debate on the prevalence and importance of biodiversity effects on ecosystem function (BDEF). Here, we argue that much of the debate stems from two major shortcomings. First, most studies do not directly link the traits leading to increased or decreased function to the traits needed for species coexistence and dominance. We argue that implementing a trait-based approach and broadening the perception of diversity to include trait dissimilarity or trait divergence will result in more realistic predictions on the consequences of altered biodiversity. Second, the empirical and theoretical studies do not reflect the complexity of natural ecosystems, which makes it difficult to transfer the results to natural situations of species loss. We review how different aspects of complexity (trophic structure, multifunctionality, spatial or temporal heterogeneity, and spatial population dynamics) alter our perception of BDEF. We propose future research avenues concisely testing whether acknowledging this complexity will strengthen the observed biodiversity effects. Finally, we propose that a major future task is to disentangle biodiversity effects on ecosystem function from direct changes in function due to human alterations of abiotic constraints. Source: http://www3.interscience.wiley.com/journal/122659286/abstract?CRETRY=1SRETRY=0
差异基因筛选方法 和 时间序列表达数据分析工具 有多种,结果多样,就会导致下游的功能分析大相径庭。 下面这两段话似乎解决了问题? Hosack , showed that prevalent biological themes within the set of differentially expressed transcripts derived from the same experiment, but using different transcript selection methods, are a stable representation of the biology underlying the experiment . Therefore, even though differentially expressed transcript lists have only partial overlap they all represent subsets of transcripts associated to a specific biological event (Figure 2). The integration of transcription profiles with biology knowledge bases (e.g. GO, KEGG, PUBMED, etc.) is another way of mapping differentially expressed transcripts in specific biology knowledge domains. A coordinated change among many gene products can produce potent biological effects, while the effect of each individual transcript can be subtle. The identification of pathways distinctively enriched within a set of differentially expressed transcripts can also be subsequently used to check if more subtle transcriptional variations, not considered in the stringent differential expression analysis , could also be used to strengthen the biological mean of the identified pathway. Another possible application could be the link of alternative splicing events, detected with the new exon-oriented Affymetrix microarray platform, to functional pathways depicted by conventional differential expression analysis. GO is however marked by flaws of certain characteristic types, due to a failure to address basic ontological principles . 做过GO注释的都会遇到这个头疼的问题,就是每个基因都会有好几个GO号,但不一定所有基因包含同一层次的GO号(本来就是有向无环图,没法确定层次), 因此,拿哪些GO号来分类?真懵!幸好已经有工具解决了这个问题。 工具: Class 1: Singular enrichment analysis (SEA) The most traditional strategy for enrichment analysis is to 1. take the users preselected (e.g. differentially expressed genes selected between experimental versus control samples by t-test with a P-value 0.05 and fold change 1.5) interesting genes, and then iteratively 2. test the enrichment of each annotation term one-by-one in a linear mode. Thereafter, the individual, enriched annotation terms passing the enrichment P-value threshold are reported in a tabular format ordered by the enrichment probability (enrichment P-value). The enrichment P-value calculation, i.e. number of genes in the list that hit a given biology class as compared to pure random chance, can be performed with the aid of some common and well-known 3. statistical methods (11,12,76), including Chi-square, Fishers exact test, Binomial probability and Hypergeometric distribution , etc. (Table 1). More discussion regarding the enrichment P-value can be found 4. in a later section of this paper. (概括得太好了,我已经写完的那篇论文就是这个路子,看来已经过时了) Even though the strategy and output format of SEA are simple, SEA is indeed a very efficient way to extract the major biological meaning behind large gene lists, which may be generated from any type of high-throughput genomic studies or bioinformatics software packages. Most of the earlier tools (such as GoMiner, Onto-Express, DAVID and EASE ) and a lot of the recently released tools (such as GOEAST and GFinder ), adopted this strategy and demonstrated significant success in many genomic studies. However, the common weakness of tools in this class is that the linear output of terms can be very large and overwhelming (from hundreds to thousands). Therefore, the data analysts focus and interrelationships of relevant terms can be diluted . For example, relevant GO terms like apoptosis, programmed cell death, induction of apoptosis, anti-apoptosis, regulation of apoptosis, etc., are spread out at different positions in a large linear output. It is difficult to focus on interrelationships of relevant biology terms among hundreds or thousands of other terms. In addition, the quality of pre-selected gene lists could largely impact the enrichment analysis , which makes SEA analysis unstable to a certain degree when using different statistical methods or cutoff thresholds .(这就回到了本文开头的那种情况) Class 2: Gene set enrichment analysis (GSEA) GSEA carries the core spirit of SEA, but with a distinct algorithm to calculate enrichment P-values as compared to SEA (35). People in the field give great attention and expectation to the GSEA strategy. The unique idea of GSEA is its no-cutoff strategy that takes all genes from a microarray experiment without selecting significant genes (e.g. genes with P-value 0.05 and fold change 1.5). This strategy benefits the enrichment analysis in two aspects: 1) it reduces the arbitrary factors in the typical gene selection step that could impact the traditional enrichment analysis; and 2) it uses all information obtained from microarray experiments by allowing the minimally changing genes, which cannot pass the selection threshold, to contribute to the enrichment analysis in differing degrees. The maximum enrichment score (MES) is calculated from the rank order of all gene members in the annotation category. Thereafter, enrichment P-values can be obtained by matching the MES to randomly shuffled MES distributions (a KolmogorovSmirnov-like statistic) (35). Other enrichment tools in the GSEA class using the no-cutoff strategy, such as ErmineJ (31), FatiScan (55), MEGO (36), PAGE (29), MetaGF, Go-Mapper (22) and ADGO (45) , etc., employ parametric statistical approaches such as z-score, t-test, permutation analysis , etc. These approaches directly take experimental values (e.g. fold change) of all genes into the calculation for each annotation term. Collectively, recent GSEA tools which integrate the total experimental values into the functional data mining are an interesting trend with a lot of potential as a complement to traditional SEA (47,7779). However, tools in the GSEA class are also associated with some common limitations. First, the no-cutoff strategy is the key advantage of GSEA, but is also becoming its major limitation in many biological studies. The GSEA method requires a summarized biological value (e.g. fold change) for each of the genome-wide genes as input. Sometimes, it is a difficult task to summarize many biological aspects of a gene into one meaningful value when the biological study and genomic platform are complex . For example, each gene derived from a SNP microarray could associate with a set of SNPs, which vary in size, P-values, physical distances, disease regions, LD (Linkage Disequilibrium) strength and SNP-gene locations (e.g. in exon, or in intron) from gene to gene. It is still a very experimental procedure to summarize such diverse aspects of biology into one comprehensive value. Similar challenges may be found in many of the emerging genomic platforms (e.g. SNP, Exon, Promoter microarray). The situations in the examples fully or partially fail in the GSEA-required input data structure requirement. For another example, many clinical microarray studies involve multiple factors/variants simultaneously, such as disease/normal, ages, sex, drug treatment/control, reagent batch effects, animal batch effect, etc. In such complex situations, sophisticated statistical methods, like ANOVA, time series analysis, survival analysis, etc., will be more powerful to handle multi-variances, multiple time points and batch effects, etc. simultaneously for datamining interesting gene lists. In many similar cases, the upstream data processing and comprehensive gene selection statistics cannot be simply avoided or replaced by GSEA. Moreover, the genes ranked in higher positions (usually with higher differences, e.g. fold change) are the major force driving (highly weighted) the enrichment Pvalues in GSEA. Thus, the underlying assumption is that the genes with large regulations (e.g. fold changes) are contributing more to the biology. Obviously, this is not always true in real biology . Biologists know that small changes of some signal transduction genes can result in larger downstream biological consequences. In contrast, some big changes in metabolic genes may be just a consequence of other small, but important, signal regulation events. Depending on the questions that the researcher is asking, the mildly changed signal transduction genes may be more interesting/important than those largely regulated genes. The GSEA and SEA methods have been available in the community for many years. Surprisingly, no comprehensive and systematic side-by-side comparisons are available yet. A recent study ran the same datasets with DAVID methods (a SEA/MEA method) versus ErmineJ (a GSEA method) (60). As expected, the results from both methods were highly consistent with each other. The consistency makes sense because the major driving force of the enrichment calculation in GSEA is the largely changing genes. In addition, those genes most likely have better chances to be selected in the traditional gene selection procedures, thus resulting in very similar results between the SEA and GSEA methods . (共同的缺点却导致了结果的一致) Class 3: Modular enrichment analysis (MEA) MEA inherits the basic enrichment calculation found in SEA and incorporates extra network discovery algorithms by considering the term-to-term relationships. Recent tools, such as Ontologizer (69), topGO (41), GENECODIS (59), ADGO (45) and ProfCom (68), claimed to improve discovery sensitivity and specificity by considering inter-relationships of GO terms in the enrichment calculations, i.e. using genes of composite (joint) annotation terms as a reference background. The key advantage of this approach is that the researcher can take advantage of termterm relationships, in which joint terms may contain unique biological meaning for a given study, not held by individual terms . Moreover, when using heterogeneous annotation content, the annotation terms are highly redundant, and also have strong interrelationships regarding different aspects for the same biological process. Building such relationships is one step closer to the true nature of biology during data mining. GoToolBox (18) developed functions to cluster related GO terms or genes, which provides the gene functional annotation in a network context. However, the functions only work for a small scope and only for GO terms. DAVID (60,61) recently provided a new tool that is able to organize and condense a wide range of heterogeneous annotation content, such as GO terms, protein domains, pathways and so on, into term or gene classes. This organization is accomplished by using Kappa statistics to mine the complex biological co-occurrences found in multiple heterogeneous annotation content. Combined with traditional enrichment P-value calculations, the new approach allows the enrichment analysis to progress from termcentric or gene-centric to biological module-centric analysis . These methods take into account the redundant and networked nature of biological annotation content in order to concentrate on building the larger biological picture rather than focusing on an individual term or gene . Such data-mining logic seems closer to the nature of biology in that a biological process works in a network manner. However, the obvious limitation of MEA is that orphan terms or genes (without strong relationships to neighbor terms/genes) could be left out from the analysis . Thus, it is important to examine those terms or genes that are left out during analysis when using MEA (60). In addition, the quality of the pre-selected gene list impacts the analytic results , just as it does in SEA analysis. Class1 举例: GOEAST is web based software toolkit providing easy to use, visualizable, comprehensive and unbiased Gene Ontology (GO) analysis for high-throughput experimental results, especially for results from microarray hybridization experiments. The main function of GOEAST is to identify significantly enriched GO terms among give lists of genes using accurate statistical methods. Class2 举例: topGO (topology-based Gene Ontology scoring) More recently, Alexa proposed a conditional hypergeometric test that computes the significance of a GO term based on its neighbourhood. Using the classical approach in which each node is scored independently, only few true significant nodes remain undiscovered . However, the dependencies between top scoring nodes yield a high false-positive rate . Alexa introduced the possibility of weighting genes annotated to a GO term based on the scores of neighboring GO terms or iteratively removing genes mapped to significant GO terms from more general (higher level) GO terms. The conditional hypergeometric test based on GO terms weightings reduces the false-positive rate, while not missing many true enriched nodes . The other conditional test is more efficient in finding the important areas in the GO graph, it also further reduces the false-positive rate, but with a higher risk of discarding relevant nodes. 总结一下, 无非是从两种角度出发: 1.从差异基因出发,先筛基因,再分析功能;受到基因筛选方法和参数的限制。 2.从功能出发,再看每个功能对应的基因如何;倒向基因变化幅度大的功能,是否可以通过降低基因变化的权重来改进一下? 功能分析的检验 Routinely, both over- and under-representation of ontology terms can be detected using the standard hypergeometric test . In probability theory and statistics, the hypergeometric distribution is a discrete probability distribution that describes the number of successes in a sequence of n draws from a finite population without replacement. The test based on the hypergeometric distribution is identical to the corresponding one-tailed version of Fishers exact test(看不同) . Reciprocally, the P-value of a two- sided Fishers exact test(看大小) can be calculated as the sum of two appropriate hypergeometric tests. Even though ontology enrichment approaches are widely used, only the most significant portion of the gene list is used to compute their statistic. Furthermore, the order of genes on the significant gene list is not taken into consideration. As a result simply counting the number of gene set members contained in the short list leads to loss of information, especially if the list is long and the difference between the more significant and the less significant is substantial. Finally, the correlation structure of gene sets is not considered at all . Two of the most used statistics to evaluate the association between functional pathways and differential expression are the one-tailed Fisher exact test, (FET) and Gene Set Enrichment Analysis (GSEA)(TM4 Mev) . FET is a statistical significance test used in the analysis of categorical data where sample sizes are small. The test is used to examine the significance of the association between two variables in a 22 contingency table. GSEA on the other hand evaluates microarray data at the level of genesets. The gene sets are defined based on prior biological knowledge , e.g. published information about biochemical pathways or co-expression in previous experiments. The goal of GSEA is to determine whether members of a gene set S tend to occur toward the top (or bottom) of the list L, in which case the gene set is correlated with the phenotypic class distinction. GSEA acts through three steps: (i) Calculation of an enrichment score. (ii) Estimation of significance level of enrichment score. (iii) Adjustment for multiple hypothesis testing. Since an accurate and rapid identification of perturbed pathways through the analysis of genomewide expression profiles facilitates the generation of biological hypotheses, Tian proposed a statistical framework for determining whether a specified group of genes for a pathway has a coordinated association with a phenotype of interest. In this framework, the overall objective of the analysis is to test whether a group of genes has a coordinated association with a phenotype of interest evaluating the following two null hypothesis: (i) The genes in a gene set show the same pattern of associations with the phenotype compared with the rest of the genes. (ii) The gene set does not contain any genes whose expression levels are associated with the phenotype of interest. After the test statistics are computed for testing the two hypotheses gene sets are then ranked in order of their significance and a control for the inflated Type I error due to multiple comparisons of gene sets is also applied. The authors claimed that their approach has more statistical power than currently available methods and can result in the discovery of statistically significant pathways that are not depicted by other methods . 牛! Markowetz proposed an algorithm to infer nontranscriptional pathway features based on differential gene expression in silencing assays . The authors idea is that cellular signalling pathways, which are not modulated on a transcriptional level, cannot be directly deduced from expression profiling experiments. However, when external interventions occur i.e. RNA interference or gene knock-outs, even if the expression of the signalling genes is not changed, secondary effects in downstream genes shed light on the pathway, and allow partial reconstruction of its topology. The core of Markowetz approach is the definition of a scoring function, which measures how well hypotheses about pathway topology are supported by experimental data. 参考文献: Microarray data analysis and mining approaches Francesca Cordero , Marco Botta and Raffaele A. Calogero Corresponding author: Raffaele A. Calogero, Department of Clinical and Biological Sciences, University of Torino, Italy. Briefings in Functional Genomics and Proteomics 2008 6(4):265-281; doi:10.1093/bfgp/elm034 Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists Da Wei Huang , Brad T. Sherman and Richard A. Lempicki * Laboratory of Immunopathogenesis and Bioinformatics, Clinical Services Program, SAIC-Frederick, Inc., National Cancer Institute at Frederick, Frederick, MD 21702, USA Nucleic Acids Research 2009 37(1):1-13; doi:10.1093/nar/gkn923