Int J Adv Manuf Technol (2013) 68:187-196. DOI 10.1007/s00170-012-4718-7
Machining deformation prediction for frame components considering multifactor coupling effects
Z. T. Tang, T. Yu, L. Q. Xu, Zhanqiang Liu
Abstract: A machining deformation prediction model was developed considering multifactor coupling effects, including original residual stresses, clamping loads, milling mechanical loads, milling thermal loads, and machining-induced residual stresses. The machining deformation of a real frame monolithic component was predicted by this model. To validate the accuracy of the prediction model, deformations were also measured on a coordinate measuring machine. The deformations predicted by the model show good agreement with the experimental results. The deformation prediction model can provide an effective way to study further control strategies for machining deformations of monolithic components.
Keywords: Machining deformation prediction; Monolithic components; Multifactor coupling; Aluminum alloy
2013-Machining deformation prediction for frame components.pdf
It was bad enough when some researchers started to use "data" to describe "(numerical) model output" a few years ago; now some scientists are saying "(numerical) projections are becoming observations..." Thank goodness, I don't have to listen to such abuse of English words at a conference anymore!
Human-Caused Global Warming Behind Record-Hot Australian Summer
A new study links the 2012 heat waves Down Under to the greenhouse gas emissions causing climate change. By Stephanie Paige Ogburn and ClimateWire. "The study shows without a doubt that the projections of climate scientists a few decades ago that the risk of extreme heat would increase are now becoming the reality. The projections are becoming observations," he said. Reprinted from ClimateWire with permission from Environment & Energy Publishing, LLC. www.eenews.net, 202-628-6500
http://www.scientificamerican.com/article.cfm?id=human-caused-global-warming-behind-record-hot-australian-summer
According to the definition given by the Editor-in-Chief of BOEG (from his 2013 editorial): The IAEG has defined engineering geology as the science devoted to the investigation, study and solution of engineering and environmental problems that may arise as the result of the interaction between geology and the works or activities of humanity, as well as of the prediction of and development of measures for the prevention or remediation of geological hazards. So, engineering geology is much more than simply the application of geology to civil engineering. Further to this, much work has been done by Robert Tepel to answer the question 'What is engineering geology really all about?' On page 12 of the December 2012 edition of AEG News, he restated his conclusion that "Engineering geology benefits humanity by discovering, defining, and analyzing geologically-sourced risks or conditions that impact, or might impact, humans as they utilize and interact with their built and natural environments." In summary, engineering geologists "help people recognize and manage, and make informed decisions about, geologically-sourced risks." It is in this exciting and broad field that the Bulletin seeks to publish.
Coincidentally, Academician Wang Sijing also gave an overall review of the development of the discipline in "The Century-long Evolution and Prospects of Engineering Geology," his 2013 editorial for the Journal of Engineering Geology (China). He points out: "The term engineering geology itself indicates that it is an emerging interdisciplinary subject. At a high level, the cross-fertilization of earth science (geology) with the engineering disciplines (civil engineering) built the discipline of engineering geology, which destines it to a path of pluralistic, integrated development." "The challenge facing engineering geology in China lies in the multidisciplinary integration of the discipline's core values. Only by daring to meet the challenge and distilling those core values can the discipline broaden its reach; and broadening its reach in turn requires strengthening and refining its core content. This is the way forward for the engineering geology community." "That engineering geology is strongly applied goes without saying, but its applied goals rest on an understanding of the Earth's geological laws, which gives it the character of a natural science. At the same time, reliable decisions on engineering problems can only be made through in-depth study of the physico-chemical mechanisms and dynamic processes of engineering-geological action."
It is thus clear that engineering geology is by no means merely the application of geology to civil engineering; it has a core value of its own, namely the dynamic processes of the engineering geosphere under human engineering activity. Engineering-geological research can be extremely microscopic, down to the nano scale, or extremely macroscopic, up to the scale of the geosphere; this is also what clearly distinguishes engineering geology from its neighboring disciplines of rock and soil mechanics and geotechnical engineering.
Breiman and Spector (1992) demonstrated that leave-one-out cross-validation has high variance if the prediction rule is unstable, because the leave-one-out training sets are too similar to the full data set. 5-fold or 10-fold cross-validation displayed lower variance. Efron and Tibshirani (1997) proposed the 0.632+ bootstrap method, which is a bootstrap smoothing version of cross-validation and has less variation. In short, LOOCV has relatively large error variance, while the bootstrap's is relatively small. A common mistake is to use all of the data to select the genes, with cross-validation then used only for fitting the parameters of the model; in high-dimensional settings it is easy to misuse cross-validation in exactly this way. In a pilot study, we generated 100 samples with 1000 features for each sample, all drawn from the Gaussian distribution N(0,1). We randomly assigned the samples into two classes ("fake classes"). Since the data set is totally non-informative, the faithful CV error should be around 50% no matter what method is used. But with the CV1 scheme we could achieve a CV error as low as 0.025 after recursive feature selection, which shows that the bias caused by an improper cross-validation scheme can be surprisingly large. A more proper approach is to include the feature selection procedure in the cross-validation, i.e., to leave the test sample(s) out of the training set before undergoing any feature selection. In this way, not only the classification algorithm but also the feature selection method is validated. We call this scheme CV2 and use it in all of our investigations throughout. For the above fake-class data, the error rate evaluated by CV2 was always around 50% regardless of the specific method used for feature selection and classification.
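The fake-class experiment above is easy to reproduce. Below is a minimal numpy sketch (the nearest-centroid classifier, the random seed and the number of selected features are my own illustrative choices, not those of the cited study): selecting features on the full data before cross-validating (CV1) gives a wildly optimistic error on pure noise, while selecting inside each fold (CV2) stays near 50%.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, k = 100, 1000, 20                      # samples, features, features kept
X = rng.standard_normal((n, p))              # pure noise
y = rng.integers(0, 2, n)                    # random "fake class" labels

def top_features(X, y, k):
    # score each feature by |difference of class means| (a simple t-like statistic)
    d = np.abs(X[y == 0].mean(0) - X[y == 1].mean(0))
    return np.argsort(d)[-k:]

def nearest_centroid_error(Xtr, ytr, Xte, yte):
    c0, c1 = Xtr[ytr == 0].mean(0), Xtr[ytr == 1].mean(0)
    pred = (np.linalg.norm(Xte - c1, axis=1) < np.linalg.norm(Xte - c0, axis=1)).astype(int)
    return np.mean(pred != yte)

def cv_error(X, y, folds=5, select_inside=True):
    idx = np.arange(len(X)); errs = []
    for f in range(folds):
        te = idx[f::folds]; tr = np.setdiff1d(idx, te)
        # CV2 selects features on the training fold only; CV1 peeks at everything
        feats = top_features(X[tr], y[tr], k) if select_inside else top_features(X, y, k)
        errs.append(nearest_centroid_error(X[tr][:, feats], y[tr], X[te][:, feats], y[te]))
    return float(np.mean(errs))

cv1 = cv_error(X, y, select_inside=False)    # optimistic: selection leaked test info
cv2 = cv_error(X, y, select_inside=True)     # honest: error stays near chance
```

On random labels CV1 reports an error well below chance, while CV2 stays close to 0.5, which is the whole point of the pilot study.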
In what follows, cross-validation is abbreviated CV. CV is a statistical method for assessing classifier performance. The basic idea is to partition the original dataset, in some sense, into a training set and a validation set: the classifier is first trained on the training set, and the resulting model is then tested on the validation set; this serves as the performance index of the classifier. Common CV methods are as follows:
1) Hold-Out Method. Randomly split the original data into two groups, one as the training set and one as the validation set; train the classifier on the training set, validate the model on the validation set, and record the final classification accuracy as the performance index under the hold-out method. The advantage is simplicity: one only needs to split the data randomly into two groups. Strictly speaking, however, hold-out does not really count as CV, since it involves no "crossing"; and because the split is random, the resulting validation accuracy depends heavily on how the data happened to be partitioned, so the result is not very convincing.
2) K-fold Cross-Validation (K-CV). Split the original data into K (usually equal-sized) groups. Each subset serves once as the validation set, with the remaining K-1 subsets as the training set, yielding K models; the average validation accuracy over the K models is the performance index under K-CV. K must be at least 2; in practice one usually starts from K = 3, and K = 2 is tried only when the original dataset is very small. K-CV effectively avoids both overfitting and underfitting, and the result is fairly convincing.
3) Leave-One-Out Cross-Validation (LOO-CV). If the original data contain N samples, LOO-CV is simply N-CV: each sample serves alone as the validation set, with the remaining N-1 samples as the training set, so LOO-CV yields N models; the average validation accuracy over the N models is the performance index under LOO-CV. Compared with K-CV, LOO-CV has two clear advantages: (i) in each round nearly all samples are used to train the model, so the training set is closest to the original sample distribution and the resulting estimate is more reliable; (ii) no random factor affects the experiment, ensuring that it can be replicated. The drawback of LOO-CV is its computational cost: the number of models to build equals the number of samples, so when the dataset is large LOO-CV becomes impractical, unless each model can be trained quickly or parallel computation can reduce the time required.
The test-set estimator of performance has high variance. CV is involved in model evaluation and selection: whichever model gives the best CV score, train it with all data, and that is the predictive model you will use. Related criteria include AIC (Akaike information criterion) and BIC (Bayesian information criterion). How can we use cross-validation to find a useful feature subset? Beware: intensive use of cross-validation can itself overfit, so hold out an additional test set before doing any model selection.
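The closing advice, pick the model with the best CV score, refit it on all the selection data, and keep a separate test set untouched during selection, can be sketched as follows. The toy regression task and the choice of polynomial degree as the "model" being selected are my own illustrative assumptions, not from the notes above.

```python
import numpy as np

rng = np.random.default_rng(1)
# toy task: y = x^2 + noise; "model selection" = choosing the polynomial degree
x = rng.uniform(-2, 2, 120)
y = x**2 + rng.normal(0, 0.3, 120)

# hold out a final test set BEFORE any model selection
x_sel, y_sel = x[:100], y[:100]
x_test, y_test = x[100:], y[100:]

def cv_mse(x, y, degree, folds=5):
    idx = np.arange(len(x)); errs = []
    for f in range(folds):
        te = idx[f::folds]; tr = np.setdiff1d(idx, te)
        coef = np.polyfit(x[tr], y[tr], degree)
        errs.append(np.mean((np.polyval(coef, x[te]) - y[te]) ** 2))
    return float(np.mean(errs))

# choose the degree with the best CV score...
scores = {d: cv_mse(x_sel, y_sel, d) for d in range(1, 8)}
best_d = min(scores, key=scores.get)
# ...then refit on ALL selection data, and touch the test set exactly once
coef = np.polyfit(x_sel, y_sel, best_d)
test_mse = float(np.mean((np.polyval(coef, x_test) - y_test) ** 2))
```

Because the test set played no role in choosing best_d, test_mse is an honest estimate of generalization error, unlike the CV score of the winning degree, which is mildly optimistic after the search.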
In pattern recognition and machine learning research, a dataset is typically divided into two subsets, training and test: the former is used to build the model, the latter to evaluate how accurately the model predicts unseen samples, formally called its generalization ability. Before going further, one crucial point must be made: only the training data may be used during model training; the test data must be used only after the model is complete, as the basis for judging its quality.
How to split a complete dataset into training and test sets is itself an art, with two requirements: (1) the training set must contain enough samples, generally more than 50% of the total; (2) both subsets must be sampled uniformly from the complete set. Point (2) is especially important: uniform sampling aims to reduce the bias between the training/test sets and the complete set, but it is not easy to achieve. The usual approach is random sampling, which approximates uniform sampling when the sample size is large enough. Yet randomness is also the blind spot of this approach, and often the place where the numbers can be gamed: for example, when the recognition rate is unsatisfactory, one resamples a new training/test split until the test accuracy looks good. Strictly speaking, that is cheating.
Cross-validation is an experimental method designed precisely to estimate the generalization error effectively; it comes in three variants: double cross-validation, k-fold cross-validation and leave-one-out cross-validation. Double cross-validation, also called 2-fold cross-validation (2-CV), splits the dataset into two equal-sized subsets and trains the classifier in two rounds: in the first round one subset is the training set and the other the test set; in the second round the roles are swapped and the classifier is trained again; what we care about is the recognition rate on the two test sets. In practice 2-CV is rarely used, mainly because the training sets are too small, usually not enough to represent the population distribution, so the test-stage recognition rate can show marked gaps. Moreover, the subset split in 2-CV has high variability, often failing the requirement that "the experiment must be reproducible."
K-fold cross-validation (k-CV) extends double cross-validation: the dataset is cut into k equal-sized subsets, each serving once as the test set with the remaining samples as the training set, so one run of k-CV builds k models and averages the k test-set recognition rates. In practice, k must be large enough that each round's training set contains enough samples; generally k = 10 is quite adequate.
Finally, leave-one-out cross-validation (LOOCV): with n samples in the dataset, LOOCV is simply n-CV, meaning each sample serves alone as a test set with the remaining n-1 samples as the training set, so one run of LOOCV builds n models. Compared with k-CV, LOOCV has two clear advantages: (1) in each round nearly all samples are used to train the model, so the training set is closest to the population distribution and the estimated generalization error is more reliable; (2) no random factor affects the experiment, ensuring that it can be replicated. Its drawback is computational cost: the number of models equals the total number of samples, so LOOCV becomes impractical when the dataset is large, unless each model trains quickly or parallel computation can reduce the time required.
Common mistakes when using cross-validation. Many studies in our laboratory use evolutionary algorithms (EA) together with classifiers, with fitness functions that usually involve the classifier's recognition rate, and quite a few of them misuse cross-validation. As noted above, only training data may be used to build the model, so only the training-set recognition rate may appear in the fitness function. The EA is a method for tuning the model's optimal parameters during training, so the test data may be used only after the EA has finished evolving and the model parameters are fixed.
How, then, should EA be paired with cross-validation? The essence of cross-validation is to estimate the generalization error of a given classification method on a dataset; it is not a method for designing classifiers. So cross-validation cannot be used inside the EA's fitness function: any sample involved in the fitness function belongs to the training set, so which samples would then be the test set? If a fitness function uses the training or test recognition rate from cross-validation, the experiment can no longer be called cross-validation.
The correct way to pair EA with k-CV is to split the dataset into k equal subsets; each time, one subset is taken as the test set and the remaining k-1 as the training set, and that training set is fed into the EA's fitness computation (how the training set is further used internally is unrestricted). A correct k-CV therefore runs the EA evolution k times in total and builds k classifiers, and the k-CV test recognition rate is the average of the recognition rates of those k EA-trained classifiers on their corresponding test sets.
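The correct EA/k-CV pairing described above can be sketched in code. The "EA" here is a deliberately toy elitist bit-flip search over feature masks standing in for a real evolutionary algorithm, and the data, classifier and all parameters are illustrative assumptions. The point being demonstrated is structural: the fitness function sees only the current fold's training set, and each test fold is touched exactly once, after its EA run has finished.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 90, 30
X = rng.standard_normal((n, p))
y = (X[:, 0] + X[:, 1] - X[:, 2] > 0).astype(int)   # only features 0-2 carry signal

def nc_accuracy(Xtr, ytr, Xte, yte):
    # nearest-centroid classifier accuracy
    c0, c1 = Xtr[ytr == 0].mean(0), Xtr[ytr == 1].mean(0)
    pred = (np.linalg.norm(Xte - c1, axis=1) < np.linalg.norm(Xte - c0, axis=1)).astype(int)
    return float(np.mean(pred == yte))

def fitness(mask, Xtr, ytr):
    # fitness may only look at TRAINING data (here: training accuracy)
    if mask.sum() == 0:
        return 0.0
    return nc_accuracy(Xtr[:, mask], ytr, Xtr[:, mask], ytr)

def toy_ea(Xtr, ytr, gens=40, pop=24):
    # stand-in "EA": keep the fittest half, mutate it, repeat
    masks = rng.random((pop, p)) < 0.2
    for _ in range(gens):
        scores = np.array([fitness(m, Xtr, ytr) for m in masks])
        parents = masks[np.argsort(scores)[-pop // 2:]]
        children = parents ^ (rng.random(parents.shape) < 0.05)   # bit-flip mutation
        masks = np.vstack([parents, children])
    scores = np.array([fitness(m, Xtr, ytr) for m in masks])
    return masks[np.argmax(scores)]

# correct pairing: one EA run per fold; the test fold is unseen until evaluation
idx = np.arange(n); accs = []
for f in range(5):
    te = idx[f::5]; tr = np.setdiff1d(idx, te)
    best_mask = toy_ea(X[tr], y[tr])
    accs.append(nc_accuracy(X[tr][:, best_mask], y[tr], X[te][:, best_mask], y[te]))
```

The reported k-CV performance is the mean of accs over the five folds, i.e., the average of five classifiers each evolved without ever seeing its own test fold.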
Problem: predict whether a residue is a DNA-binding residue or not.
Features: the information for each residue in the sliding window is constructed from evolutionary information, the torsion angles in the PBS, and the solvent accessible surface (Li and Li, 2012). These features and the encoding scheme are described in Part 1 of Supplementary Data S2.
Classifier: the encoded features are then used as the input parameters of the SVM.
Database: the PDNA-62 dataset contains 1215 DNA-binding residues and 6948 non-binding residues; the PDNA-224 dataset contains 3778 interacting residues and 53,570 non-interacting residues.
Evaluation: in predicting DNA-binding sites, the 5-fold cross-validation test is often used to examine the effectiveness of a predictor (Wang and Brown, 2006; Wang et al., 2009, 2010; Wu et al., 2009). The performance of our predictor was also assessed by the 5-fold cross-validation test: the dataset is randomly divided into five non-overlapping sets, four of which are used for training the predictor while accuracy is assessed on the remaining set; this process is repeated five times.
Performance measures: the predictive capability of the method was evaluated by sensitivity (Sn), specificity (Sp), Matthews correlation coefficient (MCC), overall prediction accuracy (Acc), strength (Str) and false-positive rate (FPR).
Results: Table 1, the test results for the PDNA-62 dataset with respect to different window sizes based on the 5-fold cross-validation test. Table 2, the prediction performances for the PDNA-62 dataset based on various features in the 5-fold cross-validation test. Fig. 3, ROC curves for DNA-binding site prediction in the PDNA-62 dataset by the combined SVM predictor using different parameters.
prediction of DNA-binding sites:DNA结合位点预测 RED.doc
prediction of DNA-binding sites:DNA结合位点预测.pdf
The following English content is quoted from the NASA website. The first question addresses the many questions circulating on the Internet about a 2012 doomsday; the second part explains the origin of the 2012 doomsday claim.
Question (Q): Are there any threats to the Earth in 2012? Many Internet websites say the world will end in December 2012.
Answer (A): The world will not end in 2012. Our planet has been getting along just fine for more than 4 billion years, and credible scientists worldwide know of no threat associated with 2012.
Q: What is the origin of the prediction that the world will end in 2012?
A: The story started with claims that Nibiru, a supposed planet discovered by the Sumerians, is headed toward Earth. This catastrophe was initially predicted for May 2003, but when nothing happened the doomsday date was moved forward to December 2012 and linked to the end of one of the cycles in the ancient Mayan calendar at the winter solstice in 2012 -- hence the predicted doomsday date of December 21, 2012.
So the collision of the planet Nibiru with Earth was originally predicted for 2003 (not 2004, as often said in China) and later postponed to December 21, 2012, a date that happens to coincide with the end of a cycle of the Mayan calendar. It seems the end of the world is really quite unlikely.
How to evaluate performance of prediction methods? Measures and their interpretation in variation effect analysis. Mauno Vihinen, BMC Genomics 2012, 13(Suppl 4):S2, doi:10.1186/1471-2164-13-S4-S2, http://www.biomedcentral.com/1471-2164/13/S4/S2
This checklist is provided to help when comparing and measuring the performance of predictors and when selecting a suitable one. These are items that method developers should include in articles, or as supplements to articles, as they enable effective comparison and evaluation of predictor performance. Items to check when estimating method performance and comparing the performance of different methods:
- Is the method described in detail?
- Have the developers used established databases and benchmarks for training and testing (if available)?
- If not, are the datasets available?
- Is the version of the method mentioned (if several versions exist)?
- Is the contingency table available?
- Have the developers reported all six performance measures: sensitivity, specificity, positive predictive value, negative predictive value, accuracy and Matthews correlation coefficient? If not, can they be calculated from figures provided by the developers?
- Has cross-validation or some other partitioning method been used in method testing?
- Are the training and test sets disjoint?
- Are the results in balance, e.g. between sensitivity and specificity?
- Has the ROC curve been drawn based on the entire test set?
- Inspect the ROC curve and AUC.
- How does the method compare to others in all the measures?
- Does the method provide probabilities for predictions?
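All six measures in the checklist derive from the 2x2 contingency table, so if the table is published they can always be recomputed. A small self-contained helper (the function name and the example counts are mine, for illustration only):

```python
import math

def classification_measures(tp, fn, fp, tn):
    """Compute the six standard measures from a 2x2 contingency table."""
    sens = tp / (tp + fn)                       # sensitivity (true positive rate)
    spec = tn / (tn + fp)                       # specificity (true negative rate)
    ppv = tp / (tp + fp)                        # positive predictive value
    npv = tn / (tn + fn)                        # negative predictive value
    acc = (tp + tn) / (tp + fn + fp + tn)       # overall accuracy
    # Matthews correlation coefficient: balanced even for skewed class sizes
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return dict(sensitivity=sens, specificity=spec, ppv=ppv,
                npv=npv, accuracy=acc, mcc=mcc)

m = classification_measures(tp=40, fn=10, fp=5, tn=45)
```

Note that accuracy alone can look good on imbalanced data while MCC stays low, which is exactly why the checklist asks for all six measures rather than any single one.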
4th WGNE workshop on systematic errors in weather and climate models
The JSC/CAS Working Group on Numerical Experimentation (WGNE) is organising a workshop on systematic errors in weather and climate models, to be hosted at the Met Office, Exeter, UK, during 15-19 April 2013. The principal goal will be to increase understanding of the nature and cause of errors in models used for weather and climate prediction (including intra-seasonal to inter-annual). It is anticipated that the focus will be on General Circulation Models (GCMs) such as those used in CMIP5, TIGGE, operational NWP, etc., including atmosphere-only, coupled atmosphere-ocean and earth system models. Biases in the atmosphere, land surface, ocean and cryosphere are all of interest. A wide variety of diagnostic techniques will be discussed, including traditional analysis methods applied to global models, process studies, the use of diagnostic and process models (e.g. single-column, cloud-resolving), and simplified experiments (e.g. aqua-planet). Of special interest will be studies that consider errors found in multiple models and errors which are present across timescales. Diagnostics and metrics that utilize novel or multi-variate observational resources and constraints to identify and characterize systematic errors are welcomed, together with studies which infer the amount of systematic error in predicted extremes from systematic errors in non-extreme situations. Alongside WGNE, the following groups will contribute to the coordination of the workshop: the Working Group on Coupled Models (WGCM), the Working Group on Seasonal to Inter-annual Prediction (WGSIP), the Working Group on Ocean Model Development (WGOMD), Stratospheric Processes And their Role in Climate (SPARC), the Global Energy and Water Cycle Experiment (GEWEX), the Joint Working Group on Forecast Verification Research (JWGFVR), and the Year Of Tropical Convection (YOTC) project.
For details, see: http://www.metoffice.gov.uk/conference/wgne2013
2011JD017069.pdf
Citation: Yong, B., Y. Hong, L. L. Ren, J. J. Gourley, G. J. Huffman, X. Chen, W. Wang, and S. I. Khan (2012), Assessment of evolving TRMM-based multi-satellite real-time precipitation estimation methods and their impacts on hydrologic prediction in a high latitude basin, Journal of Geophysical Research: Atmospheres, 117, D09108, doi:10.1029/2011JD017069.
Abstract: Short-term earthquake prediction has always been a very difficult problem in geology. This article takes pre-displacement and pre-fracture as the basis for a theoretical treatment of short-term earthquake prediction; if the theory is confirmed in practice, it will be a theoretical breakthrough in short-term earthquake prediction.
Key words: mechanics; earthquake; short-term forecasting; pre-displacement; pre-fracture
预位移预断裂短期地震预报数学方法探析.pdf
基于网络的预测13347.full.pdf
Network-based prediction for sources of transcriptional dysregulation using latent pathway identification analysis
Lisa Pham, Lisa Christadore, Scott Schaus, and Eric D. Kolaczyk
Understanding the systemic biological pathways and the key cellular mechanisms that dictate disease states, drug response, and altered cellular function poses a significant challenge. Although high-throughput measurement techniques, such as transcriptional profiling, give some insight into the altered state of a cell, they fall far short of providing by themselves a complete picture. Some improvement can be made by using enrichment-based methods to, for example, organize biological data of this sort into collections of dysregulated pathways. However, such methods arguably are still limited to primarily a transcriptional view of the cell. Augmenting these methods still further with networks and additional -omics data has been found to yield pathways that play more fundamental roles. We propose a previously undescribed method for identification of such pathways that takes a more direct approach to the problem than any published to date. Our method, called latent pathway identification analysis (LPIA), looks for statistically significant evidence of dysregulation in a network of pathways constructed in a manner that implicitly links pathways through their common function in the cell. We describe the LPIA methodology and illustrate its effectiveness through analysis of data on (i) metastatic cancer progression, (ii) drug treatment in human lung carcinoma cells, and (iii) diagnosis of type 2 diabetes. With these analyses, we show that LPIA can successfully identify pathways whose perturbations have latent influences on the transcriptionally altered genes.
Why? Because the price of houses will go up eventually!
Gary Shilling: 20% Drop in Housing to Cause Recession in 2012
Gary Shilling, President of A. Gary Shilling & Co. and author of The Age of Deleveraging, says another recession is brewing, no matter what action the Fed takes. Shilling says the shock to trigger the next recession is "another big leg-down in housing" (an asset class the Fed has not been able to reflate). As those familiar with Shilling know, his forecasts are generally bearish. However, in his defense, Shilling was one of the few economists who correctly predicted the dangers of the subprime mortgage market and its impact on the broader economy. The problem with the real estate market remains excess inventory. Based on Shilling's research, there are 2 million to 2.5 million excess homes in the country, a supply that will take 4-5 years to work off. The result: housing prices will fall another 20% and underwater mortgages will balloon from 23% to 40%, he says. With housing slumping again, Shilling says recession is coming to a town near you in 2012.
http://finance.yahoo.com/blogs/daily-ticker/20-drop-housing-cause-recession-2012-says-gary-161445494.html
Isaac Newton Institute for Mathematical Sciences Mathematical and Statistical Approaches to Climate Modelling and Prediction 11 August - 22 December 2010 Here is the link: http://www.newton.ac.uk/programmes/CLP/index.html
Many complex systems can be well described by networks, where nodes represent individuals or agents and links denote the relations or interactions between nodes. Recently, link prediction in complex networks has attracted more and more attention from computer scientists and physicists. Link prediction aims at estimating the likelihood of the existence of a link between two nodes, based on the observed links and the attributes of the nodes. For example, classical information retrieval can be viewed as predicting missing links between words and documents, and the process of recommending items to a user can be considered as a link prediction problem in the user-item bipartite network. Attached please find two newly published papers on the problem of link prediction. One (EPJB) discusses missing-link prediction via local information. The other (PRE) introduces an efficient and effective similarity index, called the Local Path index, for link prediction.
PRE_80_046122 EPJB_71_623
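For readers unfamiliar with the Local Path index mentioned above: it scores a node pair by counting 2-step paths plus a small contribution from 3-step paths, S = A^2 + eps * A^3, where A is the adjacency matrix. A minimal numpy sketch on a hypothetical 4-node graph (the graph and the value of eps are illustrative choices of mine):

```python
import numpy as np

def local_path_scores(A, eps=0.01):
    """Local Path index: S = A^2 + eps * A^3 (2-step walks plus damped 3-step walks)."""
    A2 = A @ A
    return A2 + eps * (A2 @ A)

# toy undirected graph with edges 0-1, 1-2, 2-3, 0-2
A = np.zeros((4, 4))
for i, j in [(0, 1), (1, 2), (2, 3), (0, 2)]:
    A[i, j] = A[j, i] = 1.0

S = local_path_scores(A)
```

The unobserved pair (0, 3) receives score 1 + eps: one 2-step path (0-2-3) plus one 3-step path (0-1-2-3). Ranking all non-linked pairs by such scores is the link-prediction recipe; the damping factor eps keeps the cheap 3-step information from dominating the more reliable 2-step (common-neighbor) signal.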