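Trend-fitting models analyze the tendency of a time series to change over time, and the significance level of that tendency. Taking the continuous annual precipitation record for 1894-2010 from a weather station in the United States as an example, this post walks through the calculation for two trend-fitting models. Here t is the year, Y is the precipitation, and a_i (i = 0, 1, 2, ..., n) and b denote fitted coefficients.

1. Linear trend model (the most commonly used)

$Y = a_0 + a_1 t$

The rate of increase in this model is the constant $a_1$. For the calculation and the significance level, see "SIGNIFICANT Test for the Linear Regression Equation".

2. Exponential trend model

$Y = a b^t$

The growth rate of this model is the constant $b - 1$. To fit it, take the natural logarithm of both sides, $\ln Y = \ln a + t \ln b$, which turns it into a linear trend model that can be solved for a and b. The significance level is computed in the same way.

Figure 1

As Figure 1 shows, a and b are 775.147 and 1.0002. $R^2$ is very small, meaning the fit carries little information, and p is above the 0.05 significance threshold, so the fit does not pass the significance test.

Attachment: TrendFitting.rar

Reference
Pan Hongyu. Time Series Analysis. Beijing: University of International Business and Economics Press, 2006.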
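The log-transform route for the exponential model can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `fit_linear` and `fit_exponential` are made up for this sketch, and the series is synthetic and noise-free (it is not the station record discussed in the post, and no significance test is included).

```python
import math


def fit_linear(t, y):
    """Ordinary least squares for y = a0 + a1 * t; returns (a0, a1)."""
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    a1 = sxy / sxx
    a0 = y_mean - a1 * t_mean
    return a0, a1


def fit_exponential(t, y):
    """Fit Y = a * b**t by regressing ln Y on t (requires every y > 0)."""
    ln_a, ln_b = fit_linear(t, [math.log(yi) for yi in y])
    return math.exp(ln_a), math.exp(ln_b)


# Synthetic, noise-free series used only to check the algebra.
years = list(range(50))
precip = [775.0 * 1.0002 ** year for year in years]

a, b = fit_exponential(years, precip)
print(f"a = {a:.3f}, b = {b:.6f}, growth rate b - 1 = {b - 1:.6f}")
```

With real, noisy data the recovered a and b would only approximate the underlying values, and the fit would still need the significance check described above.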
How to add non-linear trend line to a scatter plot in R?

Answers

Let's create some data.

    n <- 100
    x <- seq(n)
    y <- rnorm(n, 50 + 30 * x^(-0.2), 1)
    Data <- data.frame(x, y)

The following shows how you can fit a loess line or the fit of a non-linear regression.

    plot(y ~ x, Data)

    # fit a loess line
    loess_fit <- loess(y ~ x, Data)
    lines(Data$x, predict(loess_fit), col = "blue")

    # fit a non-linear regression
    nls_fit <- nls(y ~ a + b * x^(-c), Data, start = list(a = 80, b = 20, c = 0.2))
    lines(Data$x, predict(nls_fit), col = "red")

First, it's possible that your data describe some process which you reasonably believe is non-linear. For instance, if you're trying to do regression on the distance for a car to stop with sudden braking vs the speed of the car, physics tells us that the energy of the vehicle is proportional to the square of the velocity, not the velocity itself. So you might want to try polynomial regression in this case, and (in R) you could do something like

    model <- lm(d ~ poly(v, 2), data = dataset)

There's a lot of documentation on how to get various non-linearities into the regression model.

On the other hand, if you've got a line which is "wobbly" and you don't know why it's wobbly, then a good starting point would probably be locally weighted regression, or loess in R. This does linear regression on a small region, as opposed to the whole dataset. It's easiest to imagine a "k nearest-neighbour" version, where to calculate the value of the curve at any point, you find the k points nearest to the point of interest, and average them. Loess is just like that but uses regression instead of a straight average. For this, use

    model <- loess(y ~ x, data = dataset, span = ...)

where the span argument controls the degree of smoothing.

On the third hand (running out of hands): you're talking about trends? Is this a temporal problem? If it is, be a little cautious about over-interpreting trend lines and statistical significance. Trends in time series can appear in "autoregressive" processes, and for these processes the randomness of the process can occasionally construct trends out of random noise, and the wrong statistical significance test can tell you it's significant when it's not!

If you use ggplot2 (the third plotting system in R, after base R and lattice), this becomes:

    library(ggplot2)
    ggplot(Data, aes(x, y)) + geom_point() + geom_smooth()

Without knowing exactly what you are looking for, using the lattice package you can easily add a loess curve with type="smooth"; e.g.,

    library(lattice)
    x <- rnorm(100)
    y <- rnorm(100)
    xyplot(y ~ x, type = c("smooth", "p"))

See help("panel.loess") for arguments that can be passed to the loess fitting routine in order to change, for instance, the degree of the polynomial to use.

Original URL: http://stats.stackexchange.com/questions/30975/how-to-add-non-linear-trend-line-to-a-scatter-plot-in-r
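The "k nearest-neighbour" picture of loess given in the answers above can be made concrete with a small sketch. It is written in plain Python rather than R so that it has no dependencies, and it implements only the simplified averaging version, not loess itself (which fits a weighted local regression). The function name `knn_smooth` is hypothetical.

```python
def knn_smooth(x, y, k):
    """For each x[i], return the mean of y over the k points nearest to x[i]."""
    if not 1 <= k <= len(x):
        raise ValueError("k must be between 1 and the number of points")
    smoothed = []
    for xi in x:
        # Indices of the k nearest neighbours of xi (the point itself included).
        nearest = sorted(range(len(x)), key=lambda j: abs(x[j] - xi))[:k]
        smoothed.append(sum(y[j] for j in nearest) / k)
    return smoothed


x = [1, 2, 3, 4, 5, 6, 7]
y = [10, 12, 9, 13, 11, 15, 12]
print(knn_smooth(x, y, k=3))
```

A larger k plays the same role as a larger span in loess: more neighbours are pooled, so the curve is smoother.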
That is the question I need an answer for. A few days ago, I was convinced by a Nature paper that the tropical Indian Ocean has undergone changes that are consistent with what scientists expected from a warming climate. Today, I found two papers that emphasize decadal variability, at least at 32S. (Why 30S? That's a survey line for which we have some data to work on.) I need to know the answer quickly, because I have a proposal deadline. So I emailed the first author in the UK. I hope she will answer me soon...
MDSM Communication Vol.2-No.12 Dec. 26, 1995

WHAT CAN WE DISCOVER FROM 1,2,3 TO 2,3,4?
- An Introduction to vegetation dynamic analysis -
Revised June 3, 1997 for home page

Let's consider a situation where the vegetation is recovering from a fire event, such as in Yellowstone National Park. The data collected in the first year, say 1994, were 1, 2, 3, and in the second year, 1995, were 2, 3, 4. These measurements represent the abundance of trees, shrubs, and grasses, respectively. The data will look like this:

        Tree   Shrub  Grass  (Total)
1994    1      2      3      (6)
1995    2      3      4      (9)

What can we discover from 1-2-3 to 2-3-4?

1). All of us can see that the vegetation changed from 1994 to 1995.

2). Most of us will agree that the vegetation has increased, but different species may increase at different speeds. This may be considered a temporal dynamic of vegetation.

3). Some of us would analyze the data and find that, as a ratio of 1995 to 1994, trees reached 2/1=200%, shrubs 3/2=150%, grasses 4/3=133%, and the total 9/6=150%. The data will look like this:

                  Tree   Shrub  Grass  (Total)
1994              1      2      3      (6)
1995              2      3      4      (9)
Increasing ratio  200%   150%   133%   (150%)

4A). However, not many of us may agree that the grasses are relatively decreasing while all the figures appear to be increasing. To discover the instantaneous changing trends of the vegetation in 1995, we use the total increasing ratio to adjust each single increasing ratio and name the result Trend Values, which is short for Multivariate Instantaneous Successional Trends for 1995 (Bai, 1996). The total increasing ratio is 9/6=150%. After using this total ratio to adjust the ratio of each species, the Trend Values for each species are: T(t)=200%/150%=1.33, T(s)=150%/150%=1.0, and T(g)=133%/150%=0.89, respectively.
The above analysis is expressed in the following table:

                 Tree   Shrub  Grass  (Total)
1994             1      2      3      (6)
1995             2      3      4      (9)
Increasing rate  200%   150%   133%   (150%)
Trend Value      1.33   1.00   0.89

The Trend Value is the increasing ratio of a species adjusted by that of the total. The Trend Value for each species can also be obtained by another means:

4B). The proportions of the tree, shrub, and grass for 1994 were: 1/6=17%, 2/6=33%, and 3/6=50%, respectively. The proportions for 1995 were: 2/9=22%, 3/9=33%, and 4/9=44%, respectively. The ratios of these proportions, named trends, for the three species are (computed from the unrounded proportions): T(t)=22%/17%=1.33, T(s)=33%/33%=1.00, and T(g)=44%/50%=0.89, respectively. The above trend analysis is shown in the following table:

              Tree  Tree%  Shrub  Shrub%  Grass  Grass%  Total
1994          1     17     2      33      3      50      6
1995          2     22     3      33      4      44      9
Trend Value         1.33          1.00           0.89

The percent is the ratio of the value of the species over the total, and the Trend Value is the ratio of the 1995 percent over the 1994 percent.

5). We can sort the species by their Trend Values:

Tree   1.33
Shrub  1.00
Grass  0.89

We see that the trend for trees is increasing, as the Trend Value of the trees is greater than one, while the trend for grasses is decreasing, as the Trend Value of grasses is smaller than one.

6). Without the concept of multivariate instantaneous trend, it is very difficult to convince people that in ten years this vegetation will be dominated by trees (1.33^10=17) and most of the grasses will be gone (0.89^10=0.31), provided the trend remains the same in the next ten years. The above trend analysis and prediction is shown in the following table:

                 Tree   Shrub  Grass  Total
1994             1      2      3      6
1995             2      3      4      9
Increasing rate  200%   150%   133%   150%
Trend Value      1.33   1      0.89
(T-Value)^10     17.31  1      0.31

7). Conclusion: We can define the adjusted increasing percentage as a vegetation successional trend, or instantaneous trend, for species i at time k. In a simplified form, $T(i,k) = Y'(i,k) / Y'(i,k-1)$, where Y' is the relative abundance expressed as a percentage: $Y'(i) = Y(i) / \sum_i Y(i)$, i=1,2,3.
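The Trend Value calculation above can be checked with a short sketch. It follows route 4B (share of the total now, divided by the share before), which is algebraically the same as route 4A. The function name `trend_values` is an assumption made for this illustration. Because the code keeps unrounded ratios, the ten-step figure for trees comes out near 17.8 rather than the 17.31 obtained from the rounded 1.33.

```python
def trend_values(previous, present):
    """Trend value per component: share of the total now / share before."""
    total_prev = sum(previous)
    total_now = sum(present)
    return [
        (now / total_now) / (prev / total_prev)
        for prev, now in zip(previous, present)
    ]


abundance_1994 = [1, 2, 3]  # tree, shrub, grass
abundance_1995 = [2, 3, 4]

trends = trend_values(abundance_1994, abundance_1995)
for name, t in zip(["tree", "shrub", "grass"], trends):
    # t ** 10 assumes the same trend holds for ten more yearly steps.
    print(f"{name}: trend = {t:.2f}, trend ** 10 = {t ** 10:.2f}")
```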
After the trend analysis we discovered that trees are increasing the most, the shrubs are stable, but grasses are decreasing. Furthermore, we may make a best projection based on existing information that the vegetation will be dominated by trees in ten years, instead of the present grass community.

8). Discussion: In vegetation dynamic analysis, we assume that the change of a species can only be discovered by comparing the present with the past of that species, unless it is proved otherwise. In other words, it is our presumption that all species in the vegetation are mathematically independent, as every species is important and should be kept in the temporal dynamics (successional trend) analysis. The instantaneous trend is different from the long-term trend. Its value is a function of time, instead of a constant. This makes the instantaneous trend analysis more flexible. The relation between the instantaneous trend and the long-term trend will be discussed in another essay (MDSM Communication 4-2, 1997).

9). FURTHER DISCUSSION: First, we apply our discussion to two-dimensional space (2-space). Please recall two great ancient mathematicians: Shang Gao (?) from China and Pythagoras (c. 570-495 BC) from Greece. They discovered the relationship between the length of a vector and its components in two-dimensional space: 3^2+4^2=5^2. In other words, in 2-space the vector sum of perpendicular components 3 and 4 has length 5, instead of 7. While in 2-space the vector sum is interpreted as the vector length, the percentage is interpreted as cosine values: 3/5=0.6 and 4/5=0.8. The two cosine values determine the direction of the two-component vector (2-vector). When the cosine values of 0.6 and 0.8 change, the 2-vector will change its direction on the plane, and vice versa. This can further be extended to multi-dimensional space (m-space): the vector length of Y is $L = \sqrt{\sum_i Y(i)^2}$, i=1,2,..m. In our case, the 3-vector length of (1,2,3) is sqrt(14)=3.74, and the 3-vector length of (2,3,4) is sqrt(29)=5.39.
We can use these values to replace the one-dimensional sum and begin our discussion all over again. This time, our discussion is in three-dimensional space, and the trees, shrubs, and grasses are represented by three axes. We can imagine the 3-vector representing the vegetation rotating in 3-space, from (1/3.74, 2/3.74, 3/3.74) in 1994 to (2/5.39, 3/5.39, 4/5.39) in 1995. This supplies a new tool that we may use to investigate the rotation of a multi-component vector in multi-dimensional space and to monitor the vegetation changes over time.

SUMMARY OF 9)

Using a vector sum instead of a scalar sum in m-space, the change in vegetation can be seen as the m-vector rotating in m-space. Thus, people can monitor the vegetation dynamics by tracing the movement of the m-vector on the unit hypersphere. This new method of performing Multivariate-Instantaneous-Trend analysis and system monitoring was named the Multi-Dimensional Sphere Model (MDSM).

In the MDSM, an observation is expressed as a point or an m-vector in m-space. A community is the centroid m-vector of 'n' observations, where 'n' is the number of observations. The state of the vegetation is then a standardized (normalized) centroid m-vector, i.e., the center of the projections of the 'n' observations on the unit hypersphere.

In the MDSM, the distance expresses the quantity while the direction expresses the quality. In other words, the length of a vector relates to the production of the vegetation, while the direction of the vector contains the composition information of the vegetation. MDSM considers vegetation a=(1, 2, 3) equal to vegetation a'=(10, 20, 30), as the two are in the same direction (collinear). However, vegetation a=(1, 2, 3) is different from vegetation b=(3, 2, 1), although the two have the same vector length (norm).

Furthermore, the trend may be likened to the slope of the trace, or the tangent vector on the hypersphere, which indicates the reasons that have caused the changes.

T. Jay Bai, Ph.D.
MDSM Research
P.O.Box 272628
Fort Collins, CO 80527 USA
970/495-9716, 970/581-0253

P.S. All the vectors should have been set in bold.

The MDSM supposes the trend remains the same in a neighborhood, and makes a prediction for the next time interval: P(k+1)=Y(k)*T(k). This assumption of locally exponential change introduces prediction error. In the new development of MDSM, the prediction is adjusted and corrected with actually sampled data (D) from the next time interval (Jameson, 1986; Bai, 1996): $E = P(1-\alpha) + D\alpha$ and $R = E - P$, where E, P, D, and R are m-vectors of expectation, prediction, sampled data, and error, respectively.

In the above example, as the 1995 data were (2, 3, 4) and the trend was (1.33, 1, 0.89), the prediction for 1996 based on the existing information is P96=(2.66, 3, 3.56). If the sampled data for 1996 were (3, 4, 5), then with alpha=0.5 the expectation for the true value of 1996 would be E=0.5P+0.5D=(2.83, 3.5, 4.28), and the prediction error would be R=(0.17, 0.5, 0.72). This prediction error is less than the difference between the two observations: 96-95=(1, 1, 1). The entire procedure of vegetation dynamic analysis is shown in the following table:

                 Tree   Shrub  Grass  Total
1994             1      2      3      6
1995             2      3      4      9
Increasing Rate  200%   150%   133%   150%
Trend Value-95   1.33   1.00   0.89
Projection-96    2.66   3      3.56
Observation-96   3      4      5      12
Expectation-96   2.83   3.5    4.28
Error            0.17   0.5    0.72

Interested readers, please come to session 155 of the ESA meeting on Aug. 14, 1996.

-End-

Dear friends,

The above is a short essay discussing multivariate-instantaneous-trend analysis. Since it was posted on MDSM, SinoEco, and a local net last December, I have received quite a few comments. Based on these comments, I revised it and posted it here, hoping to get more comments.

T. Jay Bai

If you are interested in MDSM research and discussion, please sign on: MDSM@gpsrv1.gpsr.colostate.edu or contact me at: JBAI@LAMAR.COLOSTATE.EDU (now updated to mdsm95bai@yahoo.com)
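The prediction and correction step from the postscript can be reproduced with a minimal sketch. It assumes the error is R = E - P, which matches the tabulated numbers, and the function names `project` and `correct` are hypothetical labels for this illustration.

```python
def project(state, trend):
    """P(k+1) = Y(k) * T(k), applied component by component."""
    return [y * t for y, t in zip(state, trend)]


def correct(prediction, data, alpha):
    """Return E = P*(1 - alpha) + D*alpha and the error R = E - P."""
    expectation = [
        p * (1 - alpha) + d * alpha for p, d in zip(prediction, data)
    ]
    error = [e - p for e, p in zip(expectation, prediction)]
    return expectation, error


# 1995 state and rounded trend values for tree, shrub, grass.
p96 = project([2, 3, 4], [1.33, 1.00, 0.89])
e96, r96 = correct(p96, [3, 4, 5], alpha=0.5)

print("projection :", [round(v, 2) for v in p96])
print("expectation:", [round(v, 2) for v in e96])
print("error      :", [round(v, 2) for v in r96])
```

Setting alpha closer to 1 trusts the new sample more, and closer to 0 trusts the projection more; 0.5 is simply the value used in the example.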
Multi-Dimensional Sphere Model and Vegetation Instantaneous Trend Analysis

T. JAY BAI (1), TOM COTTRELL (2), DUN-YUAN HAO (3), TALA TE (4), and ROBERT J. BROZKA (5)

(Ecological Modelling, 97/1-2, revised by author, for blog)

(1) MDSM MT Research LLC, PO Box 80524-3864, Fort Collins, CO 80524, USA; (2) Dept. of Biology, Luther College, Decorah, IA 52101, USA; (3) Dept. of Mathematics, University of Inner Mongolia, Huhhot, P.R. China; (4) Dept. of Physics, Normal University of Inner Mongolia, Huhhot, P.R. China; (5) Research Scientist, Center for Ecological Management of Military Lands, Dept. of Forest Sciences, Colorado State University, Fort Collins, CO 80523, USA.

Telephone, fax, and email address of corresponding author, T. Jay Bai: 970/495-9716, tjbmdsm@yahoo.com, mdsm95bai@gmail.com
Permanent mailing address: T. Jay Bai, 615 JoAnne St., Fort Collins, CO 80524, USA.

Received 25 August 1995; accepted 15 August 1996

Abstract

The Multi-Dimensional Sphere Model (MDSM), a new method for multivariate instantaneous trend analysis, is introduced. The model handles three-subscript data, Z(i,j,k); e.g., for vegetation analysis, i, j, and k are species, quadrats, and time, respectively. The MDSM uses species, or species groups, as dimensions of a multi-dimensional space, and uses quadrats as points (vectors) in the m-space. The quadrats are standardized to 1.0 by division by their vector length, i.e., the square root of the sum of the squares of the components of a quadrat, $q'(i) = q(i) / \sqrt{\sum_i q(i)^2}$, i=1,2,..m. All quadrats are projected onto the unit hypersphere. This maintains the composition information of each species for every quadrat in the data set, and makes all quadrats comparable because their vector lengths equal 1.0. The MDSM groups and combines quadrats on the hypersphere based on the cosine values between the m-vectors to form a state vector, z', representing the vegetation, $z'(i) = \sum_j q'(i,j)$, i=1,2,..m, j=1,2,..n.
When performing trend analysis, the MDSM defines the quotient of components of the present (k) and previous (k-1) state vectors as an instantaneous trend at a given time. This is referred to as a trend vector, and describes vegetation composition change over time, t(k) = z'(k) / z'(k-1). The components of a trend vector (here called the t-value of the species) carry information from both previous and present states for species and community. This trend can then be extended to project the future states of the vegetation, p(k+1) = z(k) * t(k). The MDSM combines correlation analysis, cluster analysis, trend analysis, and prediction of future vegetation states, making it a powerful and promising multivariate analysis method. The model was tested with data from the Land Condition Trend Analysis program at Fort Carson in southeastern Colorado. The model shows promising results for vegetation trend analysis; however, the geometric meaning of the vector quotient is not yet clear. To improve our understanding, comparison with an additive model and a validation analysis are needed.

Key Words

Multi-dimensional space, m-space, vector analysis, hypersphere, multivariate time series, centralization, centroid vector, vector inverse, standardization, state vector, Importance Value, multivariate instantaneous trend, trend analysis, trend vector, trend value, similarity coefficient.

Introduction

With ecosystems under increasing stress worldwide, numerous efforts have been undertaken to monitor changes in the biota. An example of such an effort is the U.S. Army Land Condition Trend Analysis (LCTA) program (Tazik et al. 1992). From a network of thousands of permanent plots on military installations across the United States and Germany, data are collected periodically to characterize the natural resources and to track changes over time.
Fort Carson, a 55,600-hectare training installation in the eastern foothills of the Rocky Mountains in central Colorado, has used LCTA to monitor land condition since 1986. Data from Fort Carson were used to test a new multivariate analysis model, the Multi-Dimensional Sphere Model (MDSM). The model analyzes multivariate instantaneous trend in a vector form to express the magnitude, direction, and rate of instantaneous change in vegetation composition at a given time. In addition to trend analysis, the MDSM is used for correlation analysis, vegetation classification, and system monitoring (Bai et al. 1996a).

The MDSM was designed for multivariate time series, i.e., three-subscript data z(i,j,k), or three-way data (Gauch 1982). When used for vegetation analysis, i, j, and k represent species, quadrats, and time, respectively. MDSM fixes i as the dimension of the multivariate space (m-space); groups and combines j to eliminate it from the analysis; and performs trend analysis over time.

The principal advantage of the MDSM over other methods of multivariate analysis is its use of the direction of multivariate vectors (m-vectors) to express the relation between vegetation states. Using m-vectors, represented here by bold letters, MDSM can simultaneously analyze multiple species independently to reflect composition change. Other methods of multivariate analysis, such as least squares regression expressed by the formula Y = a(1)x(1) + a(2)x(2) + ... + a(m)x(m) + r, are strongly influenced by a few dominant species. They only reflect changes in these dominant species, and are not capable of showing all vegetation composition changes.

A second advantage of MDSM is that it works with species auto-correlation, instead of interrelation. MDSM emphasizes the study of vegetation state changes over time.
When performing trend analysis, MDSM defines trend (t) as present over past, t(i,k) = z'(i,k) / z'(i,k-1), and extends this trend to the next time interval to make a projection (p) based on existing information (Bai et al. 1995): p(k+1) = z(k) * t(k).

A third advantage of the MDSM is that it uses instantaneous trend, instead of average trend (Bai et al. 1996a). The time interval is more flexible, and the results more precisely reflect the vegetation changes.

Simplicity of calculation is another advantage of the MDSM. All calculations can be done on a handheld calculator. The MDSM can be described simply as an extension of division and percentage calculations from a scalar to a vector (Bai et al. 1996b).

The MDSM is demonstrated in this paper with a sample data set of two species and three quadrats (Table 1). The multi-dimensional space, m-space, built from these sample data is a 2-dimensional surface defined by the orthogonal axes X and Y, with three points A, B, and D. The quadrats A, B, and D are represented as points and defined by the 2-vectors a(40,80), b(64,77), and d(117,53) in Figure 1. (Graphics are omitted from blog.)

Table 1 Simplified quadrat-species matrix on which the MDSM is based. Species data are abundance values (e.g., percent cover, density, or frequency)

            Quadrat A  Quadrat B  Quadrat D
Species x   40         64         117
Species y   80         77         53

Methods

Application of the Multi-Dimensional Sphere Model to vegetation analysis and synthesis involves four phases: division of quadrat data by their vector length (standardization); determination of quadrat similarities and clustering of quadrats based on those similarities (centralization); trend analysis; and projection. The steps are outlined below.

Step 1: Standardization of Data by Quadrat

Standardization of quadrat data projects the quadrat points from an m-space to the unit hypersphere by dividing each element of an m-vector by the vector's length.
$Q' = Q / L(q)$

The length of a vector is the square root of the sum of the squares of the elements of a quadrat,

$L(q) = \sqrt{\sum_i Q(i)^2}$, i=1,2,..m,

therefore,

$Q'(i) = Q(i) / |Q| = Q(i) / \sqrt{\sum_i Q(i)^2}$

where Q is the quadrat, Q(i) is the ith species of the quadrat, L(q) is the vector length of the quadrat, also written |Q|, Q' is the standardized quadrat, or the projection of the quadrat on the hypersphere, and Q'(i) is the ith species of the standardized quadrat. In this paper Q'(i) is referred to as the Importance Value (IV) of the ith species. It can also be interpreted as the cosine of the angle between the standardized vector and the ith axis. This process standardizes the different vector lengths to unity, while retaining their composition ratio.

The standardization procedure is illustrated in Figure 1. The two-dimensional space defined by the orthogonal axes X and Y contains three points, A, B, and D, represented as squares. The lengths of these vectors are:

$|A| = OA = \sqrt{40^2 + 80^2} = \sqrt{8000} \approx 89.4$
$|B| = OB = \sqrt{64^2 + 77^2} = \sqrt{10025} \approx 100.1$
$|D| = OD = \sqrt{117^2 + 53^2} = \sqrt{16498} \approx 128.4$

Therefore, the standardized vectors a', b', and d', shown as circles, are:

A' = A(i)/|A| = (40/89.4, 80/89.4) = (0.45, 0.89)
B' = B(i)/|B| = (64/100.1, 77/100.1) = (0.64, 0.77)
D' = D(i)/|D| = (117/128.4, 53/128.4) = (0.91, 0.41)

After standardization, the lengths of the standardized vectors are 1 (up to rounding), and the end points of these vectors fall on the unit hypersphere:

$OA' = \sqrt{0.45^2 + 0.89^2} \approx 1$
$OB' = \sqrt{0.64^2 + 0.77^2} \approx 1$
$OD' = \sqrt{0.91^2 + 0.41^2} \approx 1$

The standardized m-vectors retain the same ratio of composition: A'(X:Y) = 0.45:0.89 = (40/89.4):(80/89.4) = 40:80 = A(X:Y) for m-vector A, and D'(X:Y) = 0.91:0.41 = (117/128.4):(53/128.4) = 117:53 = D(X:Y) for m-vector D.

Step 2: Similarity Coefficient

The standardized m-vectors a'(0.45, 0.89), b'(0.64, 0.77), and d'(0.91, 0.41) are plotted again in Figure 2. Based on their components, quadrat B'(0.64, 0.77) is located between A'(0.45, 0.89) and D'(0.91, 0.41), and closer to A' than to D'.
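The Step 1 standardization can be sketched as follows, using the three sample quadrats from Table 1. The helper names `vector_length` and `standardize` are assumptions for this illustration, not names from the paper.

```python
import math


def vector_length(q):
    """Euclidean length: square root of the sum of squared components."""
    return math.sqrt(sum(v * v for v in q))


def standardize(q):
    """Project a quadrat onto the unit hypersphere by dividing by its length."""
    length = vector_length(q)
    if length == 0:
        raise ValueError("cannot standardize a zero vector")
    return [v / length for v in q]


quadrats = {"A": [40, 80], "B": [64, 77], "D": [117, 53]}
for name, q in quadrats.items():
    unit = standardize(q)
    print(name, round(vector_length(q), 1), [round(v, 2) for v in unit])
```

Each standardized vector has length 1, and the ratio between its components is the same as in the raw quadrat.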
When there are more than two dimensions, or the quadrats are numerous, it is impossible to visualize the entire data structure. Consequently, a similarity coefficient is needed to determine the relative position of the quadrats in m-space. The similarity coefficient between two m-vectors (quadrats) is defined by MDSM as the cosine of the angle between the two m-vectors (Bai 1982; Gauch 1982; Ludwig and Reynolds 1988; Orloci 1967):

$SC(a,b) = \cos\angle AOB = \cos\angle BOA = \sum_i A(i) B(i) / (|A| |B|) = \sum_i A'(i) B'(i)$, i=1,2,..m

where SC is the Similarity Coefficient, and AOB is the angle between vectors a and b. Because the lengths of the standardized vectors are one, vector lengths are omitted in the following calculation:

SC(a,b) = COS AOB = COS BOA = OF = (0.45*0.64 + 0.89*0.77) = 0.97
ARCCOS 0.97 = 13° = AOB

The angle between a and b is therefore about 13 degrees.

SC(a,d) = COS AOD = COS DOA = OE = (0.45*0.91 + 0.89*0.41) = 0.78 (using unrounded components)
ARCCOS 0.78 = 39° = AOD

The angle between a and d is therefore about 39 degrees. The cosine value calculated by MDSM shows that point B' is closer to point A' than to point D' in the 2-space. Therefore, quadrat B is more similar to quadrat A than to quadrat D in species composition.

Step 3: Clustering Quadrats

In trend analysis, MDSM analyzes vegetation state over time, as represented by state vectors. To determine a state vector of vegetation, the model groups quadrats based on their coefficients of similarity to generate a centroid m-vector representing the combined quadrats. We assume similar vegetation types will react similarly to environmental stressors and will have the same trend.

A result of quadrat standardization, or projection onto the hypersphere, is the formation of clusters of quadrats. As an example, Figure 3 shows the projected locations of three standardized quadrats A'(0.45, 0.89), B'(0.64, 0.77), and D'(0.91, 0.41). B' is closer to A' than to D'.
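The similarity coefficient from Step 2 is the ordinary cosine between two vectors, so it can be computed directly from the raw quadrats. A minimal sketch follows; the function names `similarity` and `angle_degrees` are assumptions made for this illustration.

```python
import math


def similarity(a, b):
    """Cosine of the angle between two m-vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def angle_degrees(a, b):
    """Angle between two m-vectors, in degrees."""
    # Clamp to [-1, 1] in case rounding pushes the cosine just outside.
    cosine = max(-1.0, min(1.0, similarity(a, b)))
    return math.degrees(math.acos(cosine))


A, B, D = [40, 80], [64, 77], [117, 53]
print("SC(a,b) =", round(similarity(A, B), 2), "angle =", round(angle_degrees(A, B)))
print("SC(a,d) =", round(similarity(A, D), 2), "angle =", round(angle_degrees(A, D)))
```

Because the cosine depends only on direction, the result is the same whether the raw or the standardized quadrats are passed in.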
If A' and B' are combined to form a cluster G, the coordinates of G would be the average of A' and B':

A' = (0.45, 0.89) and B' = (0.64, 0.77)
G = (A' + B')/2 = ((0.45+0.64)/2, (0.89+0.77)/2) = (0.54, 0.83)

G must then be standardized in the same manner as described previously to give G'. Figure 3 shows that G is the center of chord A'B', and G' is the center of arc A'B'.

Step 4: Trend Vector

Discussion to this point has been based on quadrats sampled in the same time period but from different locations. However, for trend analysis, the MDSM is applied to quadrats sampled in different time periods but from the same location. We can use the quadrats A and B and treat them as a time series, 1989 to 1993 for example, where 1989 = 89' = A'(0.45, 0.89) and 1993 = 93' = B'(0.64, 0.77). The 2-vectors 89' and 93' are defined as state vectors and represent vegetation conditions in 1989 and 1993 at the same location (Figure 4). If the coordinates of the two states are not the same, then some change in composition has occurred during the sampling period. For natural resource monitoring and trend analysis, the following questions become important:

1) What is the magnitude and direction of change in species composition, t, during the sampling period?
2) If the trend remains the same, how long, k, could this system remain in a steady state?
3) If the vegetation does change after a time k, then what composition, z(k), would it change to?

Even when reasons for the vegetation change are not fully known, the MDSM assumes that people can make the best projection based on present conditions (Legendre 1983). This situation can be expressed as:

89' * t = 93'

where 89' and 93' are state 2-vectors representing the condition of vegetation in 1989 and 1993, respectively, and t is the trend 2-vector representing the combination of the unknown factors that caused the vegetation to shift from 89' to 93'.
A basic premise of the MDSM is an alternative definition of multiplication and division of vectors, which has been used in spreadsheet software (QPRO 5.0, 1993):

Current definition:

                 Vector                          Matrix
ADDITION:        c(i) = a(i) + b(i)              C(i,j) = A(i,j) + B(i,j)
SUBTRACTION:     c(i) = a(i) - b(i)              C(i,j) = A(i,j) - B(i,j)
MULTIPLICATION:  c = sum over i of a(i)*b(i)     C(i,j) = sum over k of A(i,k)*B(k,j)
                 or C(i,j) = a(i)*b(j)
DIVISION:        No definition.                  Exists only for some square matrices

Alternative definition:

MULTIPLICATION: c(i) = a(i) * b(i)
The product of two m-vectors is an m-vector in the same m-space. The elements of the resultant m-vector are the products of the corresponding elements of the two m-vectors.

SQUARE: a^2 = a*a, with elements a(i)^2

DIVISION: c(i) = a(i) / b(i)
The quotient of two m-vectors is an m-vector in the same m-space. The elements of the quotient are the quotients of the corresponding elements of the two m-vectors.

INVERSION: a^-1, with elements 1/a(i)
The inverse of an m-vector is an m-vector in the same m-space, whose elements are the inverses of the elements of the m-vector.

With this alternative definition of multiplication and division of vectors, the trend vector, t, can be expressed as the quotient of 93' and 89':

if 89' * t = 93' then t = 93' / 89'
t(i) = 93'(i) / 89'(i) = (93(i)/|93|) * (|89|/89(i))

The t-value also permits a projection of the next and future values of the state vectors:

B' * t + r(1) = P(1),
B' * t^2 + r(2) = P(2),
...
B' * t^k + r(k) = P(k),

where P(k) is the projected future value of a state vector k time intervals later, and r is the estimated error. For example, the time interval for the LCTA data at Fort Carson is four years, and the given sample year was 1993. The prediction formula for the LCTA data is therefore:

93' * t + r(1) = 1997,
93' * t^2 + r(2) = 2001,
93' * t^k + r(k) = 1993+4k
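The element-wise quotient and the k-step projection defined above can be sketched as follows. The sketch starts from the rounded state vectors 89' = (0.45, 0.89) and 93' = (0.64, 0.77), drops the error term r (which is unknown until new data arrive), and uses the hypothetical names `trend_vector` and `project`. Small differences in the last digit from hand calculations with rounded intermediate values are to be expected.

```python
import math


def standardize(q):
    """Divide a vector by its Euclidean length."""
    length = math.sqrt(sum(v * v for v in q))
    return [v / length for v in q]


def trend_vector(previous, present):
    """t(i) = present(i) / previous(i), element by element."""
    return [now / prev for prev, now in zip(previous, present)]


def project(state, trend, k=1):
    """Return state * trend**k (error term omitted) and its standardized form."""
    raw = [s * t ** k for s, t in zip(state, trend)]
    return raw, standardize(raw)


state_89 = [0.45, 0.89]
state_93 = [0.64, 0.77]

t = trend_vector(state_89, state_93)
raw_97, unit_97 = project(state_93, t, k=1)

print("t     =", [round(v, 2) for v in t])
print("1997  =", [round(v, 2) for v in raw_97])
print("1997' =", [round(v, 2) for v in unit_97])
```

Calling `project(state_93, t, k=2)` gives the corresponding projection two intervals ahead, i.e. for 2001.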
The line connecting 89' and 93' in Figure 4 is indicative of the difference between the two states. The slope of the line indicates the direction and rate of change. The trend vector then operates on the state vector to project the state at the next time interval and beyond. The trend vector t in Figure 4 is calculated as follows:

t = 93' / 89'
t(i) = 93'(i) / 89'(i) = (0.64/0.45, 0.77/0.89) = (1.42, 0.86)

This trend vector shows that between the sampling dates, species X increased from 0.45 to 0.64, at a rate of 1.42, and species Y decreased from 0.89 to 0.77, at a rate of 0.86. Figure 4 also shows the dotted line 89'93' extended to P(1), where P(1) = 1997 = B' * t. The values of P(1) are:

P(1) = B' * t = (0.64*1.42, 0.77*0.86) = (0.91, 0.67)

This vector is then standardized, giving P(1)':

P(1)' = 97' = P(1) / |97|

$P'(i,1) = P(i,1) / \sqrt{\sum_i P(i,1)^2} = (0.91/\sqrt{0.91^2 + 0.67^2},\ 0.67/\sqrt{0.91^2 + 0.67^2}) = (0.81, 0.59)$

The MDSM treats each species separately, and all species are included in the calculations, even those with comparatively small importance values. When performing instantaneous trend analysis, the model compares the present importance value of each species with its previous value to generate t-values. By comparing all t-values, a complete picture of the vegetation community trend at a given time can be obtained.

A trial trend analysis

The MDSM was applied to a trend analysis using Land Condition Trend Analysis data from Fort Carson. Results are presented in Table 2. For this analysis, 35 plant species were selected by correlation analysis and field experience. The correlation analysis was used to ensure that the selected species were not highly correlated. Species having a correlation of 0.97 or greater were combined to form a composite species. Species whose values fluctuate dramatically (according to field observations) were not included in this trial analysis, to avoid introducing too much noise.
We did, however, keep a few widely fluctuating species, such as Melilotus officinalis, to show their behavior in the MDSM. Data were obtained from 199 permanent quadrats sampled in 1989 and 1993. To emphasize the trend analysis, all quadrats were combined and analyzed as one community representing Fort Carson as a whole.

Table 2 Results of a trial MDSM multivariate instantaneous trend analysis

Species                   1989'    t-value  1993'    1997     1997'    2001     2001'
Melilotus officinalis     0.004    10.57    0.044
Bromus japonicus          0.010    6.102    0.061
Sitanion hystrix          0.020    3.567    0.072
Oryzopsis hymenoides      0.032    2.909    0.093    0.273    0.223    0.794    0.140
Bromus inermis            0.016    2.044    0.032    0.066    0.054    0.136    0.024
Juncus sp.                0.030    1.823    0.054    0.099    0.081    0.182    0.032
Salix sp.                 0.014    1.499    0.021    0.032    0.026    0.048    0.008
Atriplex confertifolia    0.008    1.449    0.011    0.017    0.014    0.025    0.004
Sarcobatus vermiculatus   0.003    1.443    0.004    0.006    0.005    0.009    0.001
Atriplex canescens        0.009    1.380    0.013    0.018    0.015    0.025    0.004
Frankenia jamesii         0.010    1.368    0.014    0.020    0.016    0.027    0.004
Chenopodium incanum       0.011    1.340    0.015    0.020    0.016    0.027    0.004
Cercocarpus montanus      0.056    1.187    0.067    0.080    0.065    0.095    0.016
Quercus gambelii          0.067    1.168    0.078    0.091    0.074    0.107    0.018
Pascopyrum smithii        0.201    1.161    0.234    0.272    0.222    0.316    0.055
Oryzopsis micrantha       0.015    1.145    0.018    0.020    0.017    0.023    0.004
Yucca glauca              0.040    1.108    0.045    0.049    0.040    0.055    0.009
Pinus edulis              0.299    1.078    0.322    0.347    0.248    0.374    0.066
Krascheninnikovia lanata  0.014    1.065    0.014    0.015    0.013    0.016    0.003
Aristida purpurea         0.116    1.058    0.123    0.130    0.107    0.138    0.024
Juniperus monosperma      0.343    1.045    0.359    0.376    0.307    0.393    0.069
Rhus trilobata            0.027    1.034    0.028    0.029    0.024    0.030    0.005
Pinus ponderosa           0.047    1.023    0.048    0.050    0.040    0.051    0.009
Kochia scoparia           0.036    0.995    0.036    0.036    0.029    0.036    0.006
Stipa sp.                 0.096    0.980    0.094    0.092    0.075    0.090    0.016
Bouteloua gracilis        0.823    0.965    0.795    0.767    0.627    0.740    0.131
Andropogon gerardii       0.027    0.961    0.026    0.025    0.020    0.024    0.004
Bouteloua curtipendula    0.068    0.943    0.065    0.061    0.050    0.058    0.010
Opuntia imbricata         0.006    0.889    0.005    0.005    0.004    0.004    0.000
Opuntia polyacantha       0.024    0.848    0.020    0.017    0.014    0.015    0.002
Helianthus petiolaris     0.005    0.763    0.004    0.003    0.002    0.002    0.000
Hilaria jamesii           0.050    0.662    0.033    0.022    0.018    0.014    0.002
Gutierrezia sarothrae     0.033    0.576    0.019
Sporobolus sp.            0.133    0.528    0.070
Salsola kali              0.067    0.280    0.018
Length                    1        1        1        1.223    1        5.651    1
Trend Index

The second and fourth columns in Table 2 are the state 35-vectors for 1989 and 1993, shown as 1989' and 1993', respectively. These values are the standardized averages for each species over all 199 quadrats. The third column is the trend 35-vector. The elements of the trend vector, t(i), are the quotients of 1993'(i) and 1989'(i) and give the ratio of change for a given species during this period. The expectation for the t-value is one. If the t-value for a species is equal to or greater than 1.0, the species' relative importance remained constant or increased over time. Conversely, if the t-value is less than 1.0, the species' importance value decreased over time. The MDSM assumes that most species usually behave linearly in a neighborhood (Jameson 1986; Bai et al. 1995). (Author's note, April 25, 1998: the presumption as later developed would be multivariate exponential progress instead of linear.)

Data in Table 2 are sorted by t-value in descending order. The top and bottom 10 percent of the t-values, those farthest from one, were excluded from projection to avoid skewing it; this is why the first three and last three species have no projected values. Examples of calculated species trends are Oryzopsis hymenoides (t(4) = 2.9095), which increased by a factor of nearly three from 1989 to 1993, and Bouteloua gracilis (t(26) = 0.9652), which decreased slightly during the same period.
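The quotient, projection, and restandardization steps behind the trend column can be sketched with the two-species values quoted for Figure 4. This is a minimal illustration, not the authors' implementation; the function names are hypothetical, and the code keeps full precision, so the second t-value comes out near 0.865 where the text reports 0.86.

```python
import math

def standardize(vec):
    """Scale a vector to unit Euclidean length."""
    length = math.sqrt(sum(x * x for x in vec))
    return [x / length for x in vec]

def trend_vector(past, present):
    """Element-wise quotient present/past; every past element must be nonzero."""
    return [b / a for a, b in zip(past, present)]

def project(present, t):
    """Apply the trend vector once; return the raw and the standardized projection."""
    raw = [b * ti for b, ti in zip(present, t)]
    return raw, standardize(raw)

# Two-species state vectors quoted for Figure 4 (species X, species Y).
state_89 = [0.45, 0.89]
state_93 = [0.64, 0.77]

t = trend_vector(state_89, state_93)
raw_97, std_97 = project(state_93, t)

print("t      =", [round(x, 3) for x in t])
print("P(1)   =", [round(x, 2) for x in raw_97])
print("P(1)'  =", [round(x, 2) for x in std_97])
```

The standardized projection lands close to the (0.81, 0.59) given in the text. Applying the same three functions to a 35-element vector would follow the procedure described for Table 2, with the extreme t-values filtered out beforehand.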
The fifth column in Table 2 is the projected state 35-vector for 1997, and the sixth column is the same vector after standardization. The rows in Table 2 show the changes for each species over time. For example, the t-value for Pascopyrum smithii, t(15) = 1.1615, indicates a slight increase over four years. Its importance value increased from 0.2018 in 1989 to 0.2344 by 1993, while the projected value for 2001 after standardization decreased to 0.0559. This combination of a rising projected value and a falling importance value for Pascopyrum smithii is a distinctive feature of the MDSM and can be explained by reference to the other species in the vector. Trend analysis by the MDSM is based on standardized vectors that carry information from both the original vector lengths and their elements. The MDSM uses not only the ratios of species' values but also the ratio of the lengths of the m-vectors, an indication of the overall vegetation condition. Although the instantaneous trend t(15) is greater than one, the importance value (IV) of Pascopyrum smithii is decreasing relative to all other species. (The average trend value, or pseudo trend index, is about 1.2; see below.)

The increase of Pascopyrum smithii and the decrease of Bouteloua gracilis, two important species in the shortgrass prairie at Fort Carson, may indicate a slight vegetation shift from Bouteloua gracilis to Pascopyrum smithii. This shift may be caused by management practices, such as removal of grazing pressure, or by damage to the soil surface from military vehicles. Further analysis of the trend values of Bouteloua gracilis supplies more information. Since t(26) = 0.9652 and 0.9652^20 = 0.4924, the MDSM shows that if the decreasing trend continues for another 20 four-year intervals (4*20 = 80 years), the projected value of Bouteloua gracilis falls to about half its present level, and this important species may no longer be dominant in this ecosystem.
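The two pieces of arithmetic used here, the compounding of a t-value over repeated intervals and the averaging of t-values into a pseudo trend index, can be checked with a short sketch. The helper names are hypothetical. The list holds the rounded t-values of the 29 species in Table 2 that have projected values (rows 4 to 32), so the mean is only expected to agree with the reported index of about 1.2 to a few decimal places.

```python
def compound(t_value, intervals):
    """Multiplicative change after repeating the same t-value over several intervals."""
    return t_value ** intervals

def pseudo_trend_index(t_values):
    """Arithmetic mean of the retained t-values."""
    return sum(t_values) / len(t_values)

# Rounded t-values for rows 4 to 32 of Table 2 (species retained for projection).
retained_t = [
    2.909, 2.044, 1.823, 1.499, 1.449, 1.443, 1.380, 1.368, 1.340, 1.187,
    1.168, 1.161, 1.145, 1.108, 1.078, 1.065, 1.058, 1.045, 1.034, 1.023,
    0.995, 0.980, 0.965, 0.961, 0.943, 0.889, 0.848, 0.763, 0.662,
]

# Bouteloua gracilis: 20 four-year intervals at t(26) = 0.9652.
print(round(compound(0.9652, 20), 4))

# Average t-value over the retained species.
print(round(pseudo_trend_index(retained_t), 3))
```

A t-value only slightly below one still roughly halves a species' projected value over 20 intervals, which is why small departures from one matter for long projections.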
The average t-value of 1.2188 for the 4th to 32nd species, the pseudo trend index (Bai et al. 1995), can be used as a relative indicator of overall condition to compare different sites. We used this pseudo trend index to label a map of Fort Carson with colors corresponding to levels of the index. The resulting map matched well with both the experience of the range manager and a usage map of Fort Carson (Bai and Linn 1994). In another study applying the MDSM to an eleven-year data set, the predicted values were highly correlated with interpolated values and actual values (r = 0.94) (Bai et al. 1996a).

Conclusions and Discussion

The Multi-Dimensional Sphere Model constructs a unit hypersphere by using species as dimensions and quadrats as standardized m-vectors whose endpoints lie on the hypersphere. The MDSM uses the cosines of the angles between m-vectors as similarity coefficients, indicating the relative positions of the quadrats on the hypersphere. It also uses cosine values to establish the direction of the m-vector, and it describes vegetation change by comparing the previous and present cosine values. In other words, the ratio of the present m-state-vector to the previous one is interpreted as the instantaneous trend. The MDSM extends this trend to project the next and later states.

In the Fort Carson data, the number of quadrats (j) sampled is 199, so there are 199 values for each plant species. The MDSM used the centroid vector, made of averages, instead of a matrix to characterize the vegetation. This is important because trend analysis assumes a homogeneous community, so that there is only one trend for a community at a given time. The values of the centroid vector and state vector fluctuate less as the number of quadrats increases. The MDSM holds that the direction of a vector, rather than distance, carries the composition information. This is reflected in data standardization. Standardization also shows that the MDSM excludes any specific interaction between particular pairs of species.
Each plant species interacts with every other species: Q(i) reacts with the sum of Q(j), j = 1, 2, ..., i, ..., m, including i itself. This one-to-all interrelation is reflected in the standardization:

q'(i) = IV(i) = q(i) / |q|, i = 1, 2, ..., m

This describes a basic interrelationship of species within a community.

If we carry Brockwell's time series over to vegetation, the three parts of a time series can be expressed as:

Z(k) = t(k) + u(k) + v(k)

where t(k) is the trend component, u(k) is a function with known period referred to as a seasonal component, and v(k) is a random noise component (Brockwell and Davis 1991). The trend component remains in the data and can be analyzed after the MDSM filters out the random noise component (sampling error) by centralization and the seasonal component (annual component) by standardization.

The MDSM expresses the trend as an instantaneous trend, the ratio of present to past, as an extension of the average changing trend. The instantaneous trend can be calculated for any time interval supported by the data, which makes the trend analysis more flexible and more accurate. In the LCTA example, the time interval for sampling was four years, and the model projected vegetation condition at subsequent four-year intervals. However, the time interval, k, can be set at any fraction:

B' * t^k + r^k

For the LCTA data, when k takes the value of 0.25, the t-values are

t = (b/a)^(1/4)

With these t-values, the MDSM may be used to project vegetation condition in consecutive years instead of at four-year intervals. It can also be used to interpolate values for the years 1990, 1991, and 1992.

The MDSM allows the investigator to use several scales in data collection (e.g., cover and dominance for different species in the same study) without affecting the trend results. Such scale changes are a transformation. Transformations of scale can be useful in classification interpretation because dimensions of an ordination can be expanded to increase resolution.
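The fractional-interval idea, yearly t-values obtained as the fourth root of the four-year ratio, can be sketched as follows. This is an illustration under stated assumptions: a and b in t = (b/a)^(1/4) are taken to be the earlier and later state values, the r term of the projection formula is omitted because the text does not define it, the function name `state_after` is hypothetical, and the two-species vectors are the ones quoted for Figure 4.

```python
import math

def standardize(vec):
    """Scale a vector to unit Euclidean length."""
    length = math.sqrt(sum(x * x for x in vec))
    return [x / length for x in vec]

def state_after(past, present, years, interval=4):
    """State `years` after the `past` sample, assuming a constant multiplicative trend.

    k = years / interval may be fractional: k < 1 interpolates between the two
    samples, k = 1 reproduces `present`, and k > 1 projects beyond it.
    """
    k = years / interval
    raw = [a * (b / a) ** k for a, b in zip(past, present)]
    return standardize(raw)

state_89 = [0.45, 0.89]   # 1989 sample (species X, species Y)
state_93 = [0.64, 0.77]   # 1993 sample

# Interpolated states for 1990-1992, the 1993 sample, and yearly projections to 1997.
for years in range(1, 9):
    state = state_after(state_89, state_93, years)
    print(1989 + years, [round(x, 3) for x in state])
```

With `years=8` the function reproduces the one-step projection for 1997, since a * (b/a)^2 equals b * (b/a). Restandardizing at every step keeps each interpolated or projected vector on the unit hypersphere.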
There are limitations to the application of the MDSM. Because t-values are derived from the division of state vectors, the elements of the state vector cannot be zero (Bai 1984). It may be acceptable in some cases, however, to omit species that have zero values from the analysis. Another limitation is that the prediction error grows with increasing time intervals (Figure 4). All state vectors have a sampling error, which can be passed on to the t-values. When t is raised to the power of k for a projection, the error is also raised to the power of k, so prediction accuracy decreases as k increases. To increase the accuracy of prediction, data must be collected in subsequent years to adjust the projection; in other words, the k value should be kept small (Jameson 1986).

The MDSM projection of future states does not fall on a straight line, because the t-values are raised to the power of k. The projection line will approximate a straight line only if all t-values and k are very close to 1.0. This suggests that the MDSM trend analysis described here should not be used to monitor ecosystems undergoing dramatic changes. When the MDSM is used on a rapidly changing ecosystem and greater accuracy of prediction is required, an additive (linear) model may be more appropriate. The MDSM, a multiplicative model, was compared with interpolation, an additive model, using an eleven-year data set. Results will be reported in another paper (Bai et al. 1996a).

The MDSM can be used to analyze vegetation trend, or change, over time. Preliminary tests of the MDSM using LCTA vegetation data from Fort Carson in Colorado show the method to be very promising. Results from the MDSM can be interpreted as a measure of successional change in vegetation. Linking the MDSM to succession theory will be useful in understanding the model. Tilman (1988) developed a theory of vegetation change that suggests secondary succession may be due to the transient dynamics of competitive displacement.
A major feature of Tilman's theory is that plant life histories are a trade-off between growth rate and the ability to acquire resources. In early succession, rapid growth rates are advantageous for acquisition of territory. In later successional species, interactions become more important, and only strong competitors can increase in size. This theory suggests that in different series, species' growth rates differ on average, and that the growth rate of a single species can vary through time in response to competition. These changes in growth rates are of value when interpreting succession, and may be approximated (relative to the community) by the species transition vectors generated by the MDSM. In other words, the concept of a fixed intrinsic growth rate might be replaced by an empirical growth rate. Greater significance can be attached to the results from the MDSM when they are viewed from the standpoint of this theory of succession.

Postscript

This research was conducted in 1994 but published in 1997. There have been some new developments that were not included in the published paper, so the author has posted a revised version on this web page. The important concepts are m-space, m-vectors, and the m-exponential equation. There are differences between the slope and the ratio, and between projection and prediction, but these will be discussed in another paper.

Acknowledgments

Initial guidance from Academician Bo Li at the Chinese Academy of Sciences (CAS) and Professor Jie Chen at Inner Mongolian University, P.R. China, was essential to the early formation of the model. Though not always in agreement, the first author's discussions with D. Jameson, Department of Rangeland Ecosystem Science, Colorado State University (C.S.U.), were extremely helpful. D. Anderson, Fort Carson LCTA coordinator, supplied the data. G. Gertner, Department of Forestry, University of Illinois, F. Smith, Department of Earth Resources, and J.
She, Department of Physics, C.S.U., reviewed earlier versions of this manuscript. CEMML staff at C.S.U. assisted with technical aspects of the manuscript preparation. Discussions of mathematics with P.D. Chen, Institute of Applied Mathematics, CAS, Y.L. Shi, Department of Mathematics, C.S.U., and D. Levinson, Department of Mathematics, Colorado College, were very helpful.

References

Bai, T.J., 1982. Exploration of numerical classification of form Leymus chinensis in the Silingolo River Basin. Inner Mongolia University, Vol. 4. Hohhot, China (in Chinese with English abstract).
Bai, T.J., 1984. Numerical prediction of grassland succession trend. Grassland Research Institute of Chinese Academy of Agricultural Sciences (GRIC), Hohhot, China (in Chinese).
Bai, T.J., and Linn, J., 1994. Range Condition Trend Analysis for Fort Carson by Multidimensional Space Model. Presentation at Land Rehabilitation and Maintenance 3rd Conference, Aberdeen, MD, USA.
Bai, T.J., Hao, D.Y., and Te, T., 1995. An application of Multi-Dimensional Sphere Model on vegetation trend analysis for Ft. Carson. In: Bo Li (Editor), International Symposium for Modern Ecology. Science Press, Beijing (in Chinese with English abstract).
Bai, T.J., Hao, D.Y., Cottrell, T., and Te, T., 1996b. Multi-Dimensional Sphere Model and vegetation succession trend analysis. Grasslands of China, 2:35-46 (in Chinese with English abstract).
Bai, T.J., Brozka, R.J., and She, J., 1996a. Range Trend Analysis for Pinon Canyon Maneuver Site, CO (1985-1995) using Multi-Dimensional Sphere Model: Comparison of MDSM with Interpolation. Abstract, presented at the 49th Annual Meeting of the Society for Range Management, Wichita, KS, USA.
Brockwell, P.J., and Davis, R.A., 1991. Time Series: Theory and Methods. Second edition. Springer-Verlag. 577 p.
Gauch, H.G., 1982. Multivariate Analysis in Community Ecology. Cambridge Univ. Press.
Jameson, D.A., 1986. Sampling intensity for monitoring of environmental systems. Applied Mathematics and Computation, 18:71-76.
Jameson, D.A., 1986. Natural resource monitoring and adaptive management. Colorado State University.
Legendre, L., and Legendre, P., 1983. Numerical Ecology. Elsevier Scientific Publishing Company.
Ludwig, J.A., and Reynolds, J.F., 1988. Statistical Ecology. John Wiley & Sons, N.Y.
Orloci, L., 1967. An agglomerative method for classification of plant communities. J. Ecol., 55:193-206.
Quattro Pro 5.0 manual. 1993.
Tazik, D.J., Warren, S.D., Diersing, V.E., Shaw, R.B., Brozka, R.J., Bagley, C.F., and Whitworth, W.R., 1992. U.S. Army Land Condition-Trend Analysis (LCTA) Plot Inventory Field Methods. USACERL Tech. Rep. N-92/03. 62 p.
Tilman, D., 1988. Plant Strategies and the Dynamics and Structure of Plant Communities. Princeton Univ. Press, Princeton, New Jersey.
JGR Editors' Highlight

Trend discrepancies among three best track data sets of western North Pacific tropical cyclones

Jin-Jie Song, School of Atmospheric Sciences and Key Laboratory of Mesoscale Severe Weather/Ministry of Education, Nanjing University, Nanjing, Jiangsu, China
Yuan Wang, School of Atmospheric Sciences and Key Laboratory of Mesoscale Severe Weather/Ministry of Education, Nanjing University, Nanjing, Jiangsu, China
Liguang Wu, Key Laboratory of Meteorological Disaster of the Ministry of Education, Nanjing University of Information Science and Technology, Nanjing, Jiangsu, China

The hot debate over the influence of global warming on tropical cyclone (TC) activity in the western North Pacific over the past several decades is partly due to the diversity of TC data sets used in recent publications. This study investigates differences of track, intensity, frequency, and the associated long-term trends for those TCs that were simultaneously recorded by the best track data sets of the Joint Typhoon Warning Center (JTWC), the Regional Specialized Meteorological Center (RSMC) Tokyo, and the Shanghai Typhoon Institute (STI). Though the differences in TC tracks among these data sets are negligibly small, the JTWC data set tends to classify TCs of category 2-3 as category 4-5, leading to an upward trend in the annual frequency of category 4-5 TCs and the annual accumulated power dissipation index, as reported by Webster et al. (2005) and Emanuel (2005). This trend and potential destructiveness over the period 1977-2007 are found only with the JTWC data set, but downward trends are apparent in the RSMC and STI data sets. It is concluded that the different algorithms used in determining TC intensity may cause the trend discrepancies of TC activity in the western North Pacific.

Received 22 August 2009; accepted 29 January 2010; published 30 June 2010.

Citation: Song, J.-J., Y. Wang, and L. Wu (2010), Trend discrepancies among three best track data sets of western North Pacific tropical cyclones, J. Geophys. Res., 115, D12128, doi:10.1029/2009JD013058.

If you are interested in the trend, check out at least Figure 1 of the paper.