Tag: prediction

Related posts

[Repost] Development and validation of an epitope prediction tool for: ericmapes 2017-4-18 12:04; Development and validation of an epitope prediction tool for swine (PigMatrix) based on the pocket profile method https://apps.webofknowledge.com/full_record.do?product=UAsearch_mode=GeneralSearchqid=6SID=N24GNuxxiY5tgpYFdKQpage=1doc=1
Authors: Gutierrez, AH (Gutierrez, Andres H.); Martin, WD (Martin, William D.); Bailey-Kellogg, C (Bailey-Kellogg, Chris); Terry, F (Terry, Frances); Moise, L (Moise, Leonard); De Groot, AS (De Groot, Anne S.)
ORCID: De Groot, Annie http://orcid.org/0000-0001-5911-1459
Journal: BMC BIOINFORMATICS, volume 16, article number 290, DOI: 10.1186/s12859-015-0724-8, published SEP 15 2015
Impact factor: 2.435 (2015), 3.435 (5-year)
JCR category, rank in category, JCR quartile (data from the 2015 edition of Journal Citation Reports):
BIOCHEMICAL RESEARCH METHODS, 39/77, Q3
BIOTECHNOLOGY & APPLIED MICROBIOLOGY, 64/161, Q2
MATHEMATICAL & COMPUTATIONAL BIOLOGY, 10/56, Q1
Publisher: BIOMED CENTRAL LTD, 236 GRAYS INN RD, FLOOR 6, LONDON WC1X 8HL, ENGLAND. ISSN: 1471-2105
Research areas: Biochemistry & Molecular Biology; Biotechnology & Applied Microbiology; Mathematical & Computational Biology
Abstract
Background: T cell epitope prediction tools and associated vaccine design algorithms have accelerated the development of vaccines for humans. Predictive tools for swine and other food animals are not as well developed, primarily because the data required to develop the tools are lacking. Here, we overcome a lack of T cell epitope data to construct swine epitope predictors by systematically leveraging available human information. Applying the pocket profile method, we use sequence and structural similarities in the binding pockets of human and swine major histocompatibility complex proteins to infer Swine Leukocyte Antigen (SLA) peptide binding preferences. We developed epitope-prediction matrices (PigMatrices), for three SLA class I alleles (SLA-1*0401, 2*0401 and 3*0401) and one class II allele (SLA-DRB1*0201), based on the binding preferences of the best-matched Human Leukocyte Antigen (HLA) pocket for each SLA pocket. The contact residues involved in the binding pockets were defined for class I based on crystal structures of either SLA (SLA-specific contacts, Ssc) or HLA supertype alleles (HLA contacts, Hc); for class II, only Hc was possible. Different substitution matrices were evaluated (PAM and BLOSUM) for scoring pocket similarity and identifying the best human match. The accuracy of the PigMatrices was compared to available online swine epitope prediction tools such as PickPocket and NetMHCpan.
Results: PigMatrices that used Ssc to define the pocket sequences and PAM30 to score pocket similarity demonstrated the best predictive performance and were able to accurately separate binders from random peptides. For SLA-1*0401 and 2*0401, PigMatrix achieved area under the receiver operating characteristic curves (AUC) of 0.78 and 0.73, respectively, which were equivalent or better than PickPocket (0.76 and 0.54) and NetMHCpan version 2.4 (0.41 and 0.51) and version 2.8 (0.72 and 0.71). In addition, we developed the first predictive SLA class II matrix, obtaining an AUC of 0.73 for existing SLA-DRB1*0201 epitopes. Notably, PigMatrix achieved this level of predictive power without training on SLA binding data.
Conclusion: Overall, the pocket profile method combined with binding preferences from HLA binding data shows significant promise for developing T cell epitope prediction tools for pigs. When combined with existing vaccine design algorithms, PigMatrix will be useful for developing genome-derived vaccines for a range of pig pathogens for which no effective vaccines currently exist (e.g. porcine reproductive and respiratory syndrome, influenza and porcine epidemic diarrhea).
Author keywords: PigMatrix; EpiMatrix; Computational vaccinology; Epitope prediction; HLA; SLA; MHC; Class I; Class II; Porcine; PRRSV; Influenza; Genome-derived vaccine; T cell epitope
KeyWords Plus: T-CELL EPITOPES; RESPIRATORY SYNDROME VIRUS; MOUTH-DISEASE-VIRUS; PEPTIDE-BINDING SPECIFICITIES; MHC CLASS-I; OUTBRED PIG-POPULATIONS; HLA-DR; NONSTRUCTURAL PROTEINS; MOLECULAR CHARACTERIZATION; CRYSTAL-STRUCTURE
Corresponding author: De Groot, AS, Univ Rhode Isl, CMB CELS, Inst Immunol & Informat, Providence, RI 02903 USA (University of Rhode Island)
Addresses: Univ Rhode Isl, CMB CELS, Inst Immunol & Informat, Providence, RI 02903 USA (University of Rhode Island); EpiVax Inc, Pawtucket, RI 02860 USA; Dartmouth Coll, Dept Comp Sci, Hanover, NH 03755 USA (Dartmouth College)
E-mail: dr.annie.degroot@gmail.com
Funding: Institute for Immunology and Informatics; National Pork Board, grant 12-121
The authors acknowledge the support of Jacob Tivin in the implementation of the matrices evaluated in this paper. Partial support for the development of PigMatrix was provided by the Institute for Immunology and Informatics and the National Pork Board (Project #12-121).
Web of Science categories: Biochemical Research Methods; Biotechnology & Applied Microbiology; Mathematical & Computational Biology
Document type: Article. Language: English. Accession number: WOS:000361095000002. PubMed ID: 26370412. IDS number: CR1OP. Cited references in Web of Science Core Collection: 58. Times cited in Web of Science Core Collection: 2; Category: Commentary on current social issues | 258 views | 0 comments
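
The abstract above compares tools by AUC on binders versus random peptides. As a rough illustration of what that number measures, here is a minimal sketch that computes AUC as the probability that a randomly chosen binder scores higher than a randomly chosen non-binder. The function name and the score lists are hypothetical and are not taken from the paper or from PigMatrix.

```python
def auc_from_scores(binder_scores, random_scores):
    """AUC as the fraction of (binder, non-binder) pairs ranked correctly.

    Ties count as half a correct pair. Both lists are hypothetical
    predictor scores; higher means "more likely to bind".
    """
    if not binder_scores or not random_scores:
        raise ValueError("both score lists must be non-empty")
    correct = 0.0
    for b in binder_scores:
        for r in random_scores:
            if b > r:
                correct += 1.0
            elif b == r:
                correct += 0.5
    return correct / (len(binder_scores) * len(random_scores))


if __name__ == "__main__":
    # Made-up scores, only to show the call.
    print(auc_from_scores([0.9, 0.8, 0.3], [0.5, 0.2]))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the reported values of 0.41 to 0.78 should be read.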

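More on brainsight: Popularity 2 张能立 2016-4-2 11:18; In the post "Brainsight and seeing" I coined a new concept, brainsight, and defined it as follows. From the standpoint of informatics, brainsight is the brain's deep processing of the external information received by the senses. From the standpoint of thinking, it means inducing a generalization or common base from many specializations, and then deducing or predicting particular specializations from that generalization or common base. This post continues the discussion using a fourth-grade arithmetic problem. Fourth-grade mathematics covers decimals, and one exercise asks the pupil to fill in the integers adjacent to a given decimal, as shown in Figure 1. [Figure 1: A primary-school arithmetic problem] For pupils and adults alike the exercise is trivial. The question is whether, beyond immediately writing 1 < 1.6 < 2, one can picture other objects or scenes related to it. A natural association is a ruler, as in Figure 2. [Figure 2: A ruler] Beyond the ruler, can the inequality in Figure 1 be found in daily life or at work? One morning, while jogging at the sports ground of the neighbouring university (华师大), I saw a PE teacher with a stopwatch recording the 800 m times of more than ten students, and a question struck me: how does a single stopwatch record the times of several runners? [Figure 3: University students running] Over dinner I put the question to my child, who is in fourth grade. She thought for a moment and answered: "Some stopwatches can record several times. You press the button each time a student crosses the finish line." I said: "The stopwatch in this problem cannot do that. How is the problem solved then?" She said: "Get several people to record." I went on: "There is only one PE teacher and one stopwatch. How is it solved then?" She could not connect the inequality in Figure 1 with the timing scene. So I rolled a few napkins from the table into paper balls to stand for students, used the table as the track, and explained one solution: when the first student crosses the finish line, glance at the stopwatch and memorize the reading, and do the same when the last student crosses. The times of all the other students lie between those of the first and the last. This gives the correspondence "1": the first student; "2": the last student; "1.6": the other students. Unpacking the seemingly trivial problem in Figure 1 shows that brainsight is the ability to abstract a common pattern from different scenes, for example abstracting the inequality in Figure 1 from both the ruler and the race timing. On another occasion I bought a piece of winter melon at the market. As I was about to cut off the rind, a teaching opportunity arose. I showed my child two different ways of cutting and asked: "How do cut 1 and cut 2 differ?" She answered at once: "With cut 1 we get to eat more melon." [Figure 4: Winter melon and mathematics] At present the child only has the ability to see, so she cannot perceive that the more sides a polygon has, the closer it comes to a circle. The melon example shows that brainsight is an ability to think in terms of limits and extrapolation. Another time we talked about the cross-section of an onion. I asked: "Look, the cross-section of the onion is made up of many circles. What do these circles have in common?" [Figure 5: Onion and mathematics] She answered: "The circles get smaller and smaller." Her answer shows that here too she is only seeing and does not yet have brainsight. She did not notice that the circles share one centre and are therefore concentric, or that however their size (circumference) varies, the ratio of circumference to diameter stays constant (π). The onion example shows that brainsight is the ability to discern what stays invariant among many changing objects. In summary, this post extends the meaning of brainsight given in the previous one: brainsight is the ability to see the same pattern hidden in different scenes, the ability to think in terms of limits and extrapolation, and the ability to discern the invariant among many changing objects. Index of posts on the laws of thinking; Category: Education | 1943 views | 2 comments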

Receiver operating characteristic (ROC) curve: itso310 2015-10-22 18:05; Category: Data mining | 4 views | 0 comments

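GWAS and genomic prediction: concepts, principles and applications: j314159 2014-11-25 17:05; Genome-wide association study (GWAS). Basic concept: using molecular markers distributed across the whole genome, and relying on their linkage disequilibrium with the trait under analysis, apply statistical methods to identify candidate genes or genomic regions associated with the trait. Basic principle (taking SNP markers as an example): 1. Select a case group and a control group from a population (for a quantitative trait, a population with a continuous trait distribution can be used) and compare the allele or genotype frequencies of every SNP across the genome between cases and controls. If an allele or genotype at some SNP is clearly more frequent or less frequent in cases than in controls, the locus is considered associated with the disease. 2. Then infer possible disease susceptibility genes from the position of the locus in the genome and its linkage disequilibrium relationships.
Genomic prediction. Basic concept: Genomic prediction exploits historical genotypic and phenotypic data to predict the performance of selection candidates based only on their genotypes, attempting to predict phenotypic variation from genomic information. Basic principle: 1. First, establish a reference population, phenotype and genotype all of its individuals genome-wide, and estimate the effect of each marker (marker effect) by association analysis. 2. Then use the marker effects from the previous step to estimate directly the genomic breeding value (GBV) of the individuals in the inference population, which have genotypes but no phenotype records.
● Sample selection: in principle the more the better, at least a thousand or more.
● SNP acquisition: arrays or sequencing.
● Data quality control. SNP level: remove SNPs with MAF < 0.01 (or 0.05); remove SNPs that deviate from Hardy-Weinberg equilibrium; remove SNPs with call rate < 90% (or 95%). Individual level: remove individuals with more than 10% (or 5%, 15%, 20%) missing genotypes.
1. Association models. General linear model: y = Xα + Zβ + e. Mixed linear model: y = Xα + Zβ + Wμ + e. Here y is the phenotypic trait under study; Xα is the fixed effect, i.e. other factors influencing y such as population structure, sex and age; Zβ is the marker effect; Wμ is the random effect, which here usually means the kinship among individuals.
2. Statistical methods for association analysis.
● Bayes: Bayes A, Bayes B, Bayes C, Bayes Cπ. Software: GenSel, GenABEL (R package).
● CMLM (Compressed Mixed Linear Model). Software: GAPIT, TASSEL.
● EMMAX (Efficient Mixed Model Association eXpedited). Software: emmax.
● GBLUP (Genomic Best Linear Unbiased Prediction): used specifically for genomic prediction. Software: ASReml.
3. Correcting for population stratification in association analysis. Correction methods: ● genomic control ● structured association ● principal component analysis. Test for population stratification: Q-Q plot.
2.3 Multiple testing correction in GWAS. Bonferroni correction: multiply the P value obtained for each locus in a single test by the number of tests performed simultaneously in the study (i.e. the number of genetic markers); if the corrected P value is still below 0.05, the association between the locus and the disease is judged significant. Step-down adjustment: multiply the smallest P value by the number of loci m, the second smallest by (m-1), the following ones by (m-2), (m-3) and so on, down to the largest P value, which is multiplied by 1; loci with corrected P < 0.05 are considered significantly associated with the disease. False discovery rate (FDR) control: sort the uncorrected P values in ascending order, leave the largest unchanged, and multiply each of the others by the coefficient (total number of loci / rank of that P value); loci with corrected P < 0.05 are considered significantly associated with the disease.
2.4 Validating the genomic prediction equation. Cross-validation: using the jackknife, each time hold out a number of individuals (one or more) as validation individuals and use the rest as the reference population to build a new equation that predicts the genomic breeding values of the validation individuals. Independent validation: for a population unrelated to the reference population, compute genomic breeding values with the prediction equation obtained from the reference population. Prediction accuracy: by linear regression; the larger the R², the higher the accuracy. [Figure omitted]
2.5 Factors affecting the accuracy of GWAS and genomic prediction: 1. sample size; 2. marker type (e.g. SNP or haplotype); 3. extent of linkage disequilibrium; 4. choice of statistical method; Category: Genome-wide association study (GWAS) | 18812 views | 0 comments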
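
The three multiple-testing corrections described in section 2.3 can be sketched in a few lines. This is a minimal illustration, not code from the post: the function names are hypothetical, adjusted values are capped at 1, and the monotonicity step in the step-down and FDR functions is a standard refinement (as in the Holm and Benjamini-Hochberg procedures) that the post does not mention.

```python
def bonferroni(pvals):
    """Multiply every P value by the number of tests m (capped at 1)."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def step_down(pvals):
    """Step-down adjustment: smallest P times m, next times m-1, ..., largest times 1."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):          # rank 0 is the smallest P value
        value = min(1.0, pvals[i] * (m - rank))
        running_max = max(running_max, value)  # keep adjusted values non-decreasing
        adjusted[i] = running_max
    return adjusted


def fdr(pvals):
    """FDR control: P times (m / rank), with the largest P value left unchanged."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):               # walk from the largest P value down
        i = order[rank - 1]
        value = min(1.0, pvals[i] * m / rank)
        running_min = min(running_min, value)  # keep adjusted values non-decreasing
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    p = [0.01, 0.04, 0.03, 0.005]              # made-up P values for four loci
    print(bonferroni(p))
    print(step_down(p))
    print(fdr(p))
```

On the same input the three methods become progressively less conservative, which is why FDR control is usually preferred when the number of markers is very large.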

GWAS and genomic prediction: concepts, principles and applications: j314159 2014-11-25 17:04; Category: Genome-wide association study (GWAS) | 0 comments

[J-2013] Machining deformation prediction for frame componen: melius 2013-11-6 21:25; Int J Adv Manuf Technol (2013) 68:187–196 DOI 10.1007/s00170-012-4718-7 Machining deformation prediction for frame components considering multifactor coupling effects. Z. T. Tang, T. Yu, L. Q. Xu, Zhanqiang Liu. Abstract: The machining deformation prediction model was developed considering multifactor coupling effects including original residual stresses, clamping loads, milling mechanical loads, milling thermal loads, and machining-induced residual stresses. The machining deformation of a true frame monolithic component was predicted by this model. To validate the accuracy of prediction model, deformations also were measured on a coordinate measuring machine. The deformations predicted by the model show a good agreement with the experiment's results. The deformation prediction model can provide an effective way to study further control strategies of machining deformations for monolithic component. Keywords: Machining deformation prediction; Monolithic components; Multifactor coupling; Aluminum alloy. 2013-Machining deformation prediction for frame components.pdf; Category: [Publications] Full-text papers | 3229 views | 0 comments

[Repost] Soil water prediction based on scale-specific control: 实话难说 2013-10-16 07:14; Soil water prediction based on its scale-specific control using multivariate empirical mode decomposition; Category: Research | 1845 views | 0 comments

Some climate scientists are too bold...: Popularity 1 zuojun 2013-6-30 10:10; It was bad enough when some researchers started to use “data” to describe “(numerical) model output” a few years ago; now some scientists are saying “(numerical) projections are becoming observations…” Thank goodness, I don’t have to listen to such abuse of English words at a conference anymore! Human-Caused Global Warming Behind Record Hot Australian Summer. A new study links the 2012 heat waves Down Under to the greenhouse gas emissions causing climate change. By Stephanie Paige Ogburn and ClimateWire. The study shows without a doubt that the projections of climate scientists a few decades ago that the risk of extreme heat would increase are now becoming the reality. The projections are becoming observations, he said. Reprinted from Climatewire with permission from Environment & Energy Publishing, LLC. www.eenews.net, 202-628-6500 http://www.scientificamerican.com/article.cfm?id=human-caused-global-warming-behind-record-hot-australian-summer; Category: Thoughts of Mine | 2401 views | 1 comment

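2013-3-24 Structured prediction: dudong 2013-3-24 11:59; This week I finished the code that applies structured SVM to dependency parsing and am ready to start experiments. I did not read any papers this week; 3033 views | 0 comments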

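2013-3-9 Structured prediction: dudong 2013-3-9 21:20; After the experiments over the holiday failed, two approaches remain. The first is to stay with probabilistic graphical models and read more machine learning papers in search of ways to approximate the partition function. The second is structured prediction. Below I introduce the concept of structured prediction and the common parameter learning methods.
1. Concept: structured prediction grew out of SVM. SVM is a max-margin method that is best at binary classification and was later applied to multi-class problems as well. Its strength is a sound theoretical basis, which gives it strong generalization ability. Its weaknesses are 1) high training complexity and 2) it cannot predict structured outputs. Structured problems are common: given a sentence, find its dependency tree; segment an image (image segmentation); and so on. Structured prediction modifies the constraints and the objective function of SVM so that it extends from binary classification to structured outputs. A common formulation is as follows: [formula image not preserved]. The constraint means that, for every training example, the annotated output in the dataset must score better than any output the model could predict.
2. Parameter learning methods. There are many:
structured perceptron (Collins, 2002)
stochastic subgradient (Ratliff, 2007)
extra-gradient (Taskar, 2006)
cutting-plane algorithms (Joachims, 2009)
dual decomposition (Meshi, 2010)
Reading list on structured prediction:
integer linear programming inference for conditional random fields.pdf
learning and inference over constrained output.pdf
competitive generative models with structure learning for NLP classifiction tasks.pdf
Pegasos-primal estimated subgradient solver for svm.pdf
subgradient methods for structured prediction.pdf
training structural svms when exact inference is intractable.pdf
strctured learning with approximate inference.pdf
polyhedral outer approximations with application to natural language parsing.pdf
piecewise training for structured prediction.pdf
learning effieicently with approximate inference via dual losses.pdf
efficient decomposed learning for structured prediction.pdf
Next steps: 1. Summarize the parameter learning methods for structured prediction and compare their differences, connections, strengths and weaknesses carefully. 2. Apply structured prediction to dependency parsing and run an experiment. 3. I now have a small idea about the structured prediction model: its constraints could be changed slightly, with experiments building on step 2; 9897 views | 0 comments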
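
The formulation referred to under "1. Concept" did not survive in the post. One widely used version is the margin-rescaling structural SVM. It is given here as an assumption about what the author had in mind, not as the post's own formula. The symbols $\Psi$ (joint feature map of input and output), $\Delta$ (loss between two outputs), $\xi_i$ (slack variables) and $C$ (regularization constant) are introduced only for this sketch:

$$
\min_{w,\ \xi \ge 0}\ \frac{1}{2}\lVert w\rVert^2 + \frac{C}{n}\sum_{i=1}^{n}\xi_i
\quad \text{s.t.}\quad
w^\top \Psi(x_i, y_i) - w^\top \Psi(x_i, y) \ \ge\ \Delta(y_i, y) - \xi_i
\qquad \forall i,\ \forall y \ne y_i
$$

Read this way, the constraint says that the annotated output $y_i$ must outscore every other candidate output $y$ by a margin that grows with how wrong $y$ is, which matches the verbal description in the post. The number of constraints is exponential in the size of the output, which is why the learning methods listed under "2." (subgradient, cutting-plane, dual decomposition) are needed.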

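What is engineering geology?: Popularity 5 qishengwen 2013-3-1 15:01; The editor-in-chief of BOEG gives the following definition (quoted from the journal's 2013 opening editorial): The IAEG has defined engineering geology as the science devoted to the investigation, study and solution of engineering and environmental problems that may arise as the result of the interaction between geology and the works or activities of humanity, as well as of the prediction of and development of measures for the prevention or remediation of geological hazards. So, engineering geology is much more than simply the application of geology to civil engineering. Further to this, much work has been done by Robert Tepel to answer the question ‘What is engineering geology really all about?’ On page 12 of the December 2012 edition of AEG News, he restated his conclusion that “Engineering geology benefits humanity by discovering, defining, and analyzing geologically-sourced risks or conditions that impact, or might impact, humans as they utilize and interact with their built and natural environments.” In summary, engineering geologists “help people recognize and manage, and make informed decisions about, geologically-sourced risks.” It is in this exciting and broad field that the Bulletin seeks to publish.
Coincidentally, Academician Wang Sijing (王思敬) also reviewed the overall development of the discipline in a 2013 opening editorial on engineering geology in China, "The century-long evolution and prospects of engineering geology". He writes: "The very term engineering geology shows that it is an emerging interdisciplinary field. At a high level, the intersection and fusion of earth science (geology) with engineering disciplines (civil engineering) built the discipline of engineering geology. This destines it to develop through the integration of many fields." "The challenge facing engineering geology in China lies in integrating multiple disciplines around the core value of the field. Only by daring to meet the challenge and distilling that core value can the field extend outward, and skill at extending outward in turn requires strengthening the core. This is the way forward for the engineering geology community." "That engineering geology is strongly applied goes without saying, but its applied goals rest on knowledge of the geological laws of the Earth, which gives it the character of a natural science. At the same time, reliable decisions on engineering problems require in-depth study of the physical and chemical mechanisms and the dynamic processes of engineering geological actions."
Engineering geology is therefore by no means merely the application of geology to civil engineering. It has a core value of its own: "the dynamic processes of the engineering geosphere under human engineering activity". Research in engineering geology can be as microscopic as the nano scale and as macroscopic as the geosphere scale, which is also what clearly distinguishes it from the neighbouring disciplines of rock and soil mechanics and geotechnical engineering.; 7909 views | 9 comments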

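[Repost] Cross-validation: Popularity 1 zt2730 2013-3-1 08:07; Breiman and Spector (1992) demonstrated that leave-one-out cross-validation has high variance if the prediction rule is unstable, because the leave-one-out training sets are too similar to the full data set. 5-fold or 10-fold cross-validation displayed lower variance. Efron and Tibshirani (1997) proposed a 0.632+ bootstrap method, which is a bootstrap smoothing version of cross-validation and has less variation. In short, LOOCV has relatively high variance, while the bootstrap has relatively low variance. In high-dimensional settings it is easy to misuse cross-validation. A typical mistake (the scheme called CV1 below) is to use all of the data to select the genes and to use cross-validation only for fitting the parameters of the model. In a pilot study, we generated 100 samples with 1000 features for each sample, all coming invariably from the Gaussian distribution N(0,1). We randomly assigned the samples into two classes ("fake-classes"). Since the data set is totally non-informative, the faithful CV error should be around 50% no matter what method is used. But by CV1 scheme we could achieve a CV error as low as 0.025 after recursive feature selection, which shows that the bias caused by the improper cross-validation scheme can be surprisingly large. A more proper approach is to include the feature selection procedure in the cross validation, i.e., to leave the test sample(s) out from the training set before undergoing any feature selection. In this way, not only the classification algorithm, but also the feature selection method is validated. We call this scheme CV2 and use it in all of our investigations throughout. For the above "fake-class" data, the error rate evaluated by CV2 was always around 50% regardless of the specific method used for feature selection and classification.
Below, cross-validation is abbreviated as CV. CV is a statistical method for assessing the performance of a classifier. The basic idea is to partition the original dataset in some way into a training set and a validation set: the classifier is first trained on the training set, and the resulting model is then tested on the validation set, and the result serves as the performance measure of the classifier. The common CV methods are:
1) Hold-out method. Randomly split the original data into two groups, one as the training set and one as the validation set, train the classifier on the training set, validate the model on the validation set, and record the final classification accuracy as the performance measure under the hold-out method. The advantage is simplicity, since only a random split is needed. Strictly speaking, however, the hold-out method is not CV, because nothing is crossed over. Because the split is random, the validation accuracy depends heavily on the particular split, so the result is not very convincing.
2) K-fold cross-validation (K-CV). Split the original data into K groups (usually of equal size), use each subset once as the validation set and the remaining K-1 subsets as the training set. This yields K models, and the mean of their K validation accuracies is the performance measure under K-CV. K is at least 2; in practice one usually starts from 3 and tries 2 only when the dataset is small. K-CV helps guard against overfitting and underfitting, and its results are more convincing.
3) Leave-one-out cross-validation (LOO-CV). If the original data has N samples, LOO-CV is N-CV: each sample serves alone as the validation set and the remaining N-1 samples as the training set, so LOO-CV yields N models, and the mean of their N validation accuracies is the performance measure under LOO-CV. Compared with K-CV, LOO-CV has two clear advantages: ① in each round almost all samples are used to train the model, so the training set is closest to the distribution of the original sample and the resulting estimate is more reliable; ② no random factor affects the experiment, so the procedure is reproducible. The drawback of LOO-CV is its computational cost, because the number of models to build equals the number of samples. When the dataset is large, LOO-CV is hard to carry out and almost unrealistic, unless each model trains quickly or parallel computing can cut the time required.
Test-set estimator of performance has high variance. CV involves both model evaluation and model selection: once the best model has been found, train it on all observations and use that as the predictive model. Whichever model gives the best CV score: train it with all data, and that's the predictive model you'll use. Related criteria: AIC (Akaike information criterion), BIC (Bayesian information criterion). How can we use cross-validation to find a useful subset? Intensive use of cross-validation can overfit. Hold out an additional test set before doing any model selection.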
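In pattern recognition and machine learning research, a dataset is commonly split into two subsets, training and test. The former is used to build the model, the latter to evaluate how accurately the model predicts unseen samples, formally its generalization ability. Before going further, one crucial principle must be stated: only training data may be used while training the model, and test data may be used only after the model is complete, as the basis for judging it. How to split the full dataset into a training set and a test set is itself a matter of skill, and two points must be observed: 1. the training set must contain enough samples, generally more than 50% of the total; 2. both subsets must be sampled uniformly from the full set. The second point is especially important. Uniform sampling aims to reduce the bias between the training and test sets and the full set, but it is hard to achieve. The usual practice is random sampling, which approximates uniform sampling when there are enough samples. Randomness is also the blind spot of this practice, and a place where results are often manipulated. For example, when the recognition rate is unsatisfactory, one redraws the training set and test set until the recognition rate on the test set is satisfactory. Strictly speaking, this is cheating.
Cross-validation is an experimental method designed to estimate the generalization error effectively. It can be subdivided into double cross-validation, k-fold cross-validation and leave-one-out cross-validation. Double cross-validation, also called 2-fold cross-validation (2-CV), splits the dataset into two subsets of equal size and trains the classifier in two rounds. In the first round one subset is the training set and the other the test set; in the second round the two are swapped and the classifier is trained again. What matters is the recognition rate on the two test sets. In practice 2-CV is seldom used, mainly because the training set has too few samples to represent the population distribution, so the recognition rate in the test phase tends to vary markedly. In addition, the way the subsets are split in 2-CV varies a great deal, so it often fails the requirement that the experiment be reproducible. K-fold cross-validation (k-CV) extends double cross-validation: the dataset is cut into k subsets of equal size, each subset serves once as the test set and the remaining samples as the training set, so one k-CV experiment builds k models and averages the recognition rate over the k test sets. In practice k must be large enough for the training set in each round to contain enough samples; k=10 is generally quite sufficient. Finally, in leave-one-out cross-validation (LOOCV), if the dataset has n samples, LOOCV is n-CV: each sample serves alone as the test set once and the remaining n-1 samples as the training set, so one LOOCV builds n models. Compared with k-CV, LOOCV has two clear advantages: 1. in each round almost all samples are used to train the model, so the training set is closest to the population distribution and the estimated generalization error is more reliable; 2. no random factor affects the experiment, so the procedure is reproducible. The drawback of LOOCV is its computational cost, because the number of models equals the total number of samples. When the dataset is large, LOOCV is hard to carry out unless each model trains quickly or parallel computing can cut the time required.
Common mistakes when using cross-validation. Many studies in our lab use evolutionary algorithms (EA) together with classifiers, and the fitness function usually involves the recognition rate of the classifier. Cases of misused cross-validation are not rare. As stated above, only training data may be used to construct the model, so only the recognition rate on training data may be used in the fitness function. The EA is the method that tunes the model parameters during training, so test data may be used only after the EA has finished evolving and the model parameters are fixed. How, then, should EA and cross-validation be combined? Cross-validation is by nature a way to estimate the generalization error of a classification method on a dataset, not a way to design a classifier, so it cannot be used inside the fitness function of the EA: every sample involved in the fitness function belongs to the training set, so which samples would be left as the test set? If a fitness function uses the training or test recognition rate from cross-validation, the procedure can no longer be called cross-validation. The correct way to combine EA and k-CV is to split the dataset into k equal subsets, take 1 subset as the test set and the other k-1 as the training set each time, and feed that training set into the fitness computation of the EA (how the training set is used further is not restricted). A correct k-CV therefore runs the EA k times and builds k classifiers. The test recognition rate of k-CV is the mean recognition rate of the k classifiers obtained by the EA on their corresponding k test sets.; Category: Statistics | 19959 views | 1 comment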
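
The difference between the improper scheme (feature selection on all data, called CV1 above) and the proper one (feature selection inside each fold, CV2) can be sketched as follows. This is a minimal sketch under stated assumptions and is not the code behind the quoted pilot study: the mean-difference feature ranking, the nearest-centroid classifier and all function names are hypothetical stand-ins chosen to keep the example short and dependency-free.

```python
import random


def k_fold_indices(n, k, rng):
    """Shuffle 0..n-1 and deal the indices into k folds of near-equal size."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def select_features(X, y, rows, top):
    """Rank features by |difference of class means|, using only the given rows."""
    def score(j):
        a = [X[i][j] for i in rows if y[i] == 0]
        b = [X[i][j] for i in rows if y[i] == 1]
        if not a or not b:
            return 0.0
        return abs(sum(a) / len(a) - sum(b) / len(b))
    return sorted(range(len(X[0])), key=score, reverse=True)[:top]


def predict(X, y, train, feats, i):
    """Nearest-centroid prediction for sample i on the selected features."""
    best_label, best_dist = None, float("inf")
    for label in (0, 1):
        rows = [r for r in train if y[r] == label]
        if not rows:
            continue
        dist = 0.0
        for j in feats:
            centroid = sum(X[r][j] for r in rows) / len(rows)
            dist += (X[i][j] - centroid) ** 2
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def cv_error(X, y, k, top, select_inside, rng):
    """k-fold CV error. select_inside=False mimics CV1, True mimics CV2."""
    n = len(X)
    all_rows = list(range(n))
    # CV1: features chosen once on ALL samples, so test samples leak in.
    leaked_feats = select_features(X, y, all_rows, top)
    errors = 0
    for fold in k_fold_indices(n, k, rng):
        held_out = set(fold)
        train = [i for i in all_rows if i not in held_out]
        # CV2: features chosen again in every fold, from training rows only.
        feats = select_features(X, y, train, top) if select_inside else leaked_feats
        errors += sum(predict(X, y, train, feats, i) != y[i] for i in fold)
    return errors / n


if __name__ == "__main__":
    data_rng = random.Random(0)
    # Pure-noise data with arbitrary labels, in the spirit of the "fake-class" study.
    X = [[data_rng.gauss(0, 1) for _ in range(200)] for _ in range(40)]
    y = [i % 2 for i in range(40)]
    print("CV1 error:", cv_error(X, y, 5, 10, False, random.Random(1)))
    print("CV2 error:", cv_error(X, y, 5, 10, True, random.Random(1)))
```

On noise data one would expect the CV1 estimate to come out optimistically low and the CV2 estimate to stay near 50%, but the exact numbers depend on the seed and on the stand-in classifier. The same structure applies to the EA case above: whatever tunes the model (feature selection or an EA) belongs inside the loop over folds.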

Prediction of DNA-binding sites: jingyanwang 2013-2-10 22:03; Problem: predict whether a residue is a DNA-binding residue or not. Features: The information of each residue in the sliding window is constructed using evolutionary information, the torsion angles in the PBS and the solvent accessible surface (Li and Li, 2012). These features and the encoding scheme are described in Part 1 of the Supplementary Data S2. Classifier: Then, the encoded features are selected as the input parameters of the SVM. Database: PDNA-62: PDNA-62 dataset contained 1215 DNA-binding residues and 6948 non-binding residues. PDNA-224: 3778 interacting residues and 53 570 non-interacting residues were projected to be present in the PDNA-224 dataset. Evaluation: In predicting DNA-binding sites, the 5-fold cross-validation test is often used to examine the effectiveness of a predictor (Wang and Brown, 2006; Wang et al., 2009, 2010; Wu et al., 2009). The performance of our predictor was also assessed by the 5-fold cross-validation test. During this test, a dataset is randomly divided into five non-overlapping sets, four of which are used for training the predictor and the accuracy of the predictor is assessed on the remaining set. This process is repeated five times. Performance measure: The predictive capability of our method was evaluated by the sensitivity (Sn), specificity (Sp), Matthews correlation coefficient (MCC), overall prediction accuracy (Acc), strength (Str) and false-positive rate (FPR). Results: Table 1. The test results for the PDNA-62 dataset with respect to different window sizes based on the 5-fold cross-validation test. Table 2. The prediction performances for the PDNA-62 dataset based on various features in the 5-fold cross-validation test. Fig. 3. ROC curves for the DNA-binding sites prediction in PDNA-62 dataset by combining SVM predictor using different parameters. Attachments: prediction of DNA-binding sites：DNA结合位点预测 RED.doc, prediction of DNA-binding sites：DNA结合位点预测.pdf; Category: APP | 4 views | 0 comments

NASA's explanation of the 2012 "doomsday": Popularity 1 duke01361 2012-12-17 20:04; The English text below is quoted from the NASA website. The first question answers the many claims circulating on the Internet about the end of the world in 2012; the second explains where the 2012 doomsday story came from. Question (Q): Are there any threats to the Earth in 2012? Many Internet websites say the world will end in December 2012. Answer (A): The world will not end in 2012. Our planet has been getting along just fine for more than 4 billion years, and credible scientists worldwide know of no threat associated with 2012. Q: What is the origin of the prediction that the world will end in 2012? A: The story started with claims that Nibiru, a supposed planet discovered by the Sumerians, is headed toward Earth. This catastrophe was initially predicted for May 2003, but when nothing happened the doomsday date was moved forward to December 2012 and linked to the end of one of the cycles in the ancient Mayan calendar at the winter solstice in 2012 -- hence the predicted doomsday date of December 21, 2012. So the collision of the planet Nibiru with Earth was first predicted for 2003 (not 2004, as is often said in China) and later moved to December 21, 2012, a date that was then matched to the "end of the calendar" of the Maya. It seems the end of the world is really quite unlikely.; Category: The sages are idle too | 2212 views | 2 comments

[Repost] Checklist for computational programs in bioinformatics: chuangma2006 2012-11-28 13:57; How to evaluate performance of prediction methods? Measures and their interpretation in variation effect analysis. Mauno Vihinen, BMC Genomics 2012, 13 (Suppl 4):S2 doi:10.1186/1471-2164-13-S4-S2 http://www.biomedcentral.com/1471-2164/13/S4/S2
This checklist is provided to help when comparing and measuring performance of predictors and when selecting a suitable one. These are items that method developers should include in articles, or as supplement to articles, as they enable effective comparison and evaluation of the performance of predictors. Items to check when estimating method performance and comparing performance of different methods:
- Is the method described in detail?
- Have the developers used established databases and benchmarks for training and testing (if available)?
- If not, are the datasets available?
- Is the version of the method mentioned (if several versions exist)?
- Is the contingency table available?
- Have the developers reported all the six performance measures: sensitivity, specificity, positive predictive value, negative predictive value, accuracy and Matthews correlation coefficient? If not, can they be calculated from figures provided by developers?
- Has cross validation or some other partitioning method been used in method testing?
- Are the training and test sets disjoint?
- Are the results in balance e.g. between sensitivity and specificity?
- Has the ROC curve been drawn based on the entire test set?
- Inspect the ROC curve and AUC.
- How does the method compare to others in all the measures?
- Does the method provide probabilities for predictions?; Category: Research | 1754 views | 0 comments
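
The checklist asks whether all six measures are reported or can be recomputed from the contingency table. A minimal sketch of that recomputation follows, using the standard textbook definitions. The function name and the example counts are hypothetical, and undefined ratios are returned as NaN rather than raising an error.

```python
import math


def six_measures(tp, fn, tn, fp):
    """Compute the six checklist measures from a 2x2 contingency table.

    tp/fn: positives predicted correctly/incorrectly;
    tn/fp: negatives predicted correctly/incorrectly.
    """
    def ratio(num, den):
        return num / den if den else float("nan")

    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),           # positive predictive value
        "npv": ratio(tn, tn + fn),           # negative predictive value
        "accuracy": ratio(tp + tn, tp + fn + tn + fp),
        "mcc": ratio(tp * tn - fp * fn, mcc_den),
    }


if __name__ == "__main__":
    # Made-up counts, only to show the call.
    for name, value in six_measures(tp=40, fn=10, tn=45, fp=5).items():
        print(f"{name}: {value:.3f}")
```

Reporting the four counts themselves is the most useful option for readers, since every measure above (and others such as F1) can then be derived.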

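Evaluating binary defect prediction: lzhx171 2012-11-10 21:49; With exams to revise for I have read few new papers lately. The main one was a 2010 ASE paper by Tim Menzies et al. on defect prediction from static code attributes (Defect prediction from static code features: current results, limitations, new approaches), and I am taking the opportunity to start summarizing the literature I have read. This week I summarized how earlier papers evaluate the binary classification problem.
First, a brief note on why prediction from static code uses binary classification. When a defect is found, it is hard to know how harmful it is. In other words, static code alone can hardly reveal which defect should be fixed first, and earlier papers have noted that "harm is defined very broadly and there is no unified standard", so a 0/1 classification is more clear-cut. Some methods predict the number of defects in a module, but mining the NASA datasets shows that most modules contain only 1 defect, so predicting the count within a module adds little (this refers to individual modules, not to a global prediction), and binary classification is sufficient.
Most papers evaluate binary classification with AUC, the area under the ROC curve, which is drawn from the relation between pd and pf. pd is the recall, the proportion of all defects that are detected. pf is the false alarm rate, the proportion of defect-free modules that are wrongly predicted as defective. In 2007 Tim Menzies et al. observed that AUC does indicate model quality to some degree, but because a ROC curve contains many (pd, pf) points, one best point has to be chosen to score the model. To choose that point they introduced the balance measure, defined as follows: [formula image not preserved]. The best case is clearly (pd, pf) = (1, 0), and the formula is in fact based on the Euclidean distance from (pd, pf) to this ideal point. It weights pd and pf equally, but its drawback is that several different (pd, pf) points can receive the same score.
Another common measure is the F1-measure, defined as F1 = 2rp / (r + p), where r is the recall above and p is the precision (the proportion of modules predicted defective that really are defective). Many papers use it, and its main merit is that it emphasizes the prediction of defective modules.
In 2010 Tim Menzies et al. pointed out that binary defect classification should not be assessed with information-retrieval measures alone but should also take software testing concerns into account. For example, if 100% of the modules are predicted defective, QA has to spend a great deal on inspection, whereas predicting 10% of the modules as defective keeps the testing cost relatively low. The aim of defect prediction is to find the most defects while inspecting the fewest modules. They therefore introduced an effort parameter, the proportion of modules flagged as defective, as a measure of testing cost, and added it to the ROC curve to form a three-dimensional coordinate system, as shown in the figure: [Figure: modified ROC plot with axes pd, pf and effort]. The balance measure is extended accordingly: [formula image not preserved]. $\alpha$ and the other parameters are weights that can be set according to the situation. The formula is the weighted distance to the ideal point (pd, pf, effort) = (1, 0, 0). There are many ways to evaluate binary defect classification, and I will update this post when I come across new ones.; 130 views | 0 comments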
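
The two formulas that did not survive in the post can be sketched from its verbal description. The two-dimensional `balance` below uses the normalized form commonly attributed to Menzies et al. (2007). The three-dimensional `weighted_balance` is only an assumption built from the phrase "weighted distance to (1, 0, 0)": its weights `alpha`, `beta`, `gamma` and its normalization are hypothetical, not the paper's definition. `f1_measure` follows the formula given in the post.

```python
import math


def balance(pd, pf):
    """1 minus the normalized Euclidean distance from (pd, pf) to the ideal (1, 0)."""
    return 1.0 - math.sqrt((0.0 - pf) ** 2 + (1.0 - pd) ** 2) / math.sqrt(2.0)


def f1_measure(recall, precision):
    """F1 = 2rp / (r + p); defined as 0 when both are 0."""
    if recall + precision == 0:
        return 0.0
    return 2.0 * recall * precision / (recall + precision)


def weighted_balance(pd, pf, effort, alpha=1.0, beta=1.0, gamma=1.0):
    """Assumed 3-D variant: weighted distance from (pd, pf, effort) to (1, 0, 0)."""
    dist = math.sqrt(alpha * (1.0 - pd) ** 2 + beta * pf ** 2 + gamma * effort ** 2)
    return 1.0 - dist / math.sqrt(alpha + beta + gamma)


if __name__ == "__main__":
    # Made-up operating point: 80% recall, 20% false alarms, 30% of modules inspected.
    print(balance(0.8, 0.2))
    print(f1_measure(0.8, 0.5))
    print(weighted_balance(0.8, 0.2, 0.3))
```

All three scores lie between 0 and 1 with higher being better. As the post notes, different (pd, pf) points can give the same balance: (0.8, 0.2) and (0.9, 0.3) do not, but any two points on the same circle around (1, 0) do.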

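Sample-based defect prediction: lzhx171 2012-10-14 18:25; This week I read a paper published in ASE in 2012, "Sample-based software defect prediction with active and semi-supervised learning". The authors describe three approaches: first, training a conventional machine learner on a random sample; second, training a semi-supervised learner on a random sample; third, training an active semi-supervised learner on the result of active sampling, for which they propose an algorithm called ACoForest. The core of the paper is active learning plus semi-supervised learning and the ACoForest algorithm that combines the two. To describe it, we first describe the semi-supervised algorithm used with ordinary sampling (CoForest, proposed by the same authors in 2007 and applied to medical diagnosis). Given a labeled set L and an unlabeled set U, first initialize N random trees from the labeled training set. In each iteration, for each tree, use the ensemble of the other N-1 random trees to predict labels for the data in U, add the instances labeled with high confidence to a training set L', and randomly subsample L' so that it satisfies a certain condition (covered in an earlier report). Then refine the tree using the labeled set L together with the newly labeled set L', and repeat until no random tree changes during an iteration. That is the CoForest method. When refining the random trees, one should pick the data that helps the refinement most, which shrinks the training set while raising accuracy. So before each refinement step, ACoForest selects the top M data items on which the N random trees disagree most (disagreement indicates that an item carries more information), and then proceeds as before. The algorithm exploits the strengths of both active learning and semi-supervised learning, so that each random tree converges faster. In the experiments the authors compare ACoForest with CoForest, and its F1 is better on almost all datasets. The novelty of the algorithm lies in how the training set is obtained. In my view the point is to find the data carrying the most useful information: using the disagreement-based semi-supervised learning method the authors proposed earlier, the degree to which the classifiers disagree on an item indicates whether it is worth adding to the training set. The paper can be seen as an applied extension of the authors' earlier research.; 201 views | 0 comments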

Conference announcement: 4th WGNE workshop: jiati0214 2012-9-27 11:22; 4th WGNE workshop on systematic errors in weather and climate models. The JSC/CAS Working Group on Numerical Experimentation (WGNE) is organising a workshop on systematic errors in weather and climate models to be hosted at the Met Office, Exeter, UK, during 15-19 April 2013. The principal goal will be to increase understanding of the nature and cause of errors in models used for weather and climate prediction (including intra-seasonal to inter-annual). It is anticipated that the focus will be on General Circulation Models (GCMs) such as those used in CMIP5, TIGGE, operational NWP, etc., including atmosphere-only, coupled atmosphere-ocean and earth system models. Biases in the atmosphere, land surface, ocean and cryosphere are all of interest. A wide variety of diagnostic techniques will be discussed, including traditional analysis methods applied to global models, process studies, the use of diagnostic and process models (e.g. single-column, cloud-resolving), and simplified experiments (e.g. aqua-planet). Of special interest will be studies that consider errors found in multiple models and errors which are present across timescales. Diagnostics and metrics that utilize novel or multi-variate observational resources and constraints to identify and characterize systematic errors are welcomed, together with studies which infer the amount of systematic error in predicted extremes from systematic errors in non-extreme situations.
Alongside WGNE, the following groups will contribute to the coordination of the workshop: The Working Group on Coupled Models (WGCM), the Working Group on Seasonal to Inter-annual Prediction (WGSIP), the Working Group on Ocean Model Development (WGOMD), Stratospheric Processes And their Role in Climate (SPARC), Global Energy and Water Cycle Experiment (GEWEX), the Joint Working Group on Forecast Verification Research (JWGFVR), and the Year Of Tropical Convection (YOTC) project. Details: http://www.metoffice.gov.uk/conference/wgne2013; 2315 views | 0 comments

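The role of prediction in pedestrian counter flow (cover article): Popularity 3 majian 2012-7-24 22:33; Effect of prediction on the self-organization of pedestrian counter flow http://iopscience.iop.org/1751-8121/45/30/305004/ I have just been notified that a recent paper written with Dr. Wang Ziyang of Beijing Jiaotong University was selected as a cover article of Journal of Physics A (2012, volume 45, issue 30), as shown in the figure below. The paper considers how the prediction mechanism that people use while moving affects self-organized behavior, which shows up mainly as changes in lane formation in counter flow. To quantify this effect, we built on the method from our Physica A paper (2010, 389:2101-2117); the cover figure shows directly how the number and shape of the lanes change. The address of the Physica A paper is attached as well: k-Nearest-Neighbor interaction induced self-organized pedestrian counter flow http://www.sciencedirect.com/science/article/pii/S0378437110000464; Category: Complex Systems|4773 views|5 comments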

Fifth first-author paper: Journal of Geophysical Research: yongbin 2012-5-11 14:39; 2011JD017069.pdf Citation: Yong, B., Y. Hong, L. L. Ren, J. J. Gourley, G. J. Huffman, X. Chen, W. Wang, and S. I. Khan (2012), Assessment of evolving TRMM-based multi-satellite real-time precipitation estimation methods and their impacts on hydrologic prediction in a high latitude basin, Journal of Geophysical Research: Atmospheres, 117, D09108, doi: 10.1029/2011JD017069.; Category: Scientific Research|6247 views|0 comments

2004-01 Achieving real-time pulse-to-pulse PRI prediction: lcj2212916 2012-1-23 14:53; 9 pages. Free cloud-drive download link: http://www.ctdisk.com/file/4322860 Forum download link: http://radarew.5d6d.com/thread-582-1-1.html; 1868 views|0 comments

2005-10 Enabling technology radar PRI and RF prediction: lcj2212916 2012-1-23 11:31; 23 pages. Free cloud-drive download link: http://www.ctdisk.com/file/4322102 Forum download link: http://radarew.5d6d.com/thread-580-1-1.html; 1855 views|0 comments

2004-09 Using PRI prediction to improve ECM effectiveness: lcj2212916 2012-1-22 18:55; 54 pages. Free cloud-drive download link: http://www.ctdisk.com/file/4319476; 1697 views|0 comments

2005 PRI and RF prediction enabling technology: lcj2212916 2012-1-22 18:34; Free cloud-drive download link: http://www.ctdisk.com/file/4319421; 1793 views|0 comments

2012-01 Achieving real-time pulse-to-pulse PRI prediction: lcj2212916 2012-1-22 17:40; 28 pages. Free cloud-drive download link: http://www.ctdisk.com/file/4316990 Forum download link: http://radarew.5d6d.com/thread-577-1-1.html; 1712 views|0 comments

2002-02 PRI prediction enhance EW training: lcj2212916 2012-1-22 15:26; Free cloud-drive download link: http://www.ctdisk.com/file/4314691 Forum download link: http://radarew.5d6d.com/thread-576-1-1.html; 2008 views|0 comments

Application of number theory to short-term earthquake prediction: Popularity 1 sfw111 2011-11-29 12:41; Abstract: Short-term earthquake prediction has always been a very difficult problem in geology. Taking pre-displacement and pre-fracture as its basis, this article offers a theoretical treatment of short-term earthquake prediction. Once the theory is confirmed in practice, it will be a theoretical breakthrough in short-term earthquake prediction. Key words: solid mechanics; earthquake; short-term forecasting; pre-displacement; pre-fracture Attachment: 预位移预断裂短期地震预报数学方法探析.pdf; Category: Scientific Research|366 views|1 comment

[Repost] Network-based prediction: Network-based prediction for sources .: Fangjinqin 2011-8-12 09:33; 基于网络的预测13347.full.pdf Network-based prediction for sources of transcriptional dysregulation using latent pathway identification analysis Lisa Pham (a), Lisa Christadore (b), Scott Schaus (b), and Eric D. Kolaczyk (c). (a) Program in Bioinformatics. Understanding the systemic biological pathways and the key cellular mechanisms that dictate disease states, drug response, and altered cellular function poses a significant challenge. Although high-throughput measurement techniques, such as transcriptional profiling, give some insight into the altered state of a cell, they fall far short of providing by themselves a complete picture. Some improvement can be made by using enrichment-based methods to, for example, organize biological data of this sort into collections of dysregulated pathways. However, such methods arguably are still limited to primarily a transcriptional view of the cell. Augmenting these methods still further with networks and additional -omics data has been found to yield pathways that play more fundamental roles. We propose a previously undescribed method for identification of such pathways that takes a more direct approach to the problem than any published to date. Our method, called latent pathway identification analysis (LPIA), looks for statistically significant evidence of dysregulation in a network of pathways constructed in a manner that implicitly links pathways through their common function in the cell. We describe the LPIA methodology and illustrate its effectiveness through analysis of data on (i) metastatic cancer progression, (ii) drug treatment in human lung carcinoma cells, and (iii) diagnosis of type 2 diabetes. With these analyses, we show that LPIA can successfully identify pathways whose perturbations have latent influences on the transcriptionally altered genes.; Category: Academic Articles|2284 views|0 comments

[Repost] Buy America (I mean American houses!): zuojun 2011-7-14 06:52; Why? Because the price for houses will go up eventually! Gary Shilling: 20% Drop in Housing to Cause Recession in 2012 Gary Shilling, President of A. Gary Shilling & Co. and author of The Age of Deleveraging, says another recession is brewing, no matter what action the Fed takes. Shilling says the shock to trigger the next recession is "another big leg-down in housing." (An asset class the Fed has not been able to reflate.) As those familiar with Shilling know, his forecasts are generally bearish. However, in his defense, Shilling was one of the few economists who correctly predicted the dangers of the subprime mortgage market and its impact on the broader economy. The problem with the real estate market remains excess inventory. Based on Shilling's research, there are 2 million to 2.5 million excess homes in the country, a supply that will take 4-5 years to work off. The result: Housing prices will fall another 20% and underwater mortgages will balloon from 23% to 40%, he says. With housing slumping again, Shilling says recession is coming to a town near you in 2012. http://finance.yahoo.com/blogs/daily-ticker/20-drop-housing-cause-recession-2012-says-gary-161445494.html; Category: From the U.S.|1586 views|0 comments

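[Repost] CAFA: Critical Assessment of Function Annotation: lry198010 2010-8-5 19:11; With advances in sequencing technology, obtaining genes, genomes, and related sequences is no longer a major problem. For a long time to come, annotating gene function effectively is likely to be one of the pressing problems in biology. The number of genes of unknown function is now so large that improving function prediction by even one percentage point would save a considerable amount of resources and labor. Annotating that many unknown genes purely by experiment will remain impossible for a long time, so prediction software will be the method of last resort for assigning functions to them. So many gene function prediction programs exist, however, that evaluating the reliability and accuracy of their results, and comparing the results of different programs, remains an open problem. CAFA is an attempt to address it. Starting now, CAFA will provide 50,000 protein sequences of unknown function. Every function prediction program taking part in the project annotates these sequences and submits its predictions to CAFA by next January, and the predictions will be evaluated in May at a dedicated meeting at ISMB 2011. This is a very interesting initiative! It is expected to have a large influence on gene function prediction software: it may lead to new standards for evaluating accuracy, and the 50,000 protein sequences will become a benchmark for protein function prediction. CAFA website: CAFA experiment website; Category: genetic association breeding|3353 views|0 comments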

[Repost] Mathematical and Statistical Approaches to Climate Modelling and Prediction: zuojun 2010-7-9 07:11; Isaac Newton Institute for Mathematical Sciences Mathematical and Statistical Approaches to Climate Modelling and Prediction 11 August - 22 December 2010 Here is the link: http://www.newton.ac.uk/programmes/CLP/index.html; Category: My Research Interests|2279 views|0 comments

Two newly published papers on Link Prediction: babyann519 2009-10-28 02:53; Many complex systems can be well described by networks where nodes represent individuals or agents, and links denote the relations or interactions between nodes. Recently, link prediction in complex networks has attracted more and more attention from computer scientists and physicists. Link prediction aims at estimating the likelihood of the existence of a link between two nodes, based on the observed links and the attributes of the nodes. For example, classical information retrieval can be viewed as predicting missing links between words and documents, and the process of recommending items to a user can be considered as a link prediction problem in the user-item bipartite network. Attached please find two newly published papers about the problem of link prediction. One (EPJB) discusses missing-link prediction via local information. The other (PRE) introduces an efficient and effective similarity index, called the Local Path index, for link prediction. PRE_80_046122 EPJB_71_623; Category: Uncategorized|13483 views|8 comments
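As a rough illustration of link prediction from local information, here is a minimal Python sketch. It is not taken from the attached papers. It scores each unconnected node pair by its number of common neighbors, and by a local-path style score that is assumed here to take the commonly cited form S = A^2 + epsilon * A^3, where A is the adjacency matrix. The toy graph, the function names, and the value epsilon = 0.01 are all hypothetical choices made for this example.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set of neighbors}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def common_neighbors(adj, x, y):
    """Number of nodes adjacent to both x and y, i.e. (A^2)[x][y]."""
    return len(adj[x] & adj[y])


def walks_of_length_3(adj, x, y):
    """Number of walks x-a-b-y of length 3, i.e. (A^3)[x][y]."""
    return sum(len(adj[a] & adj[y]) for a in adj[x])


def local_path_score(adj, x, y, epsilon=0.01):
    """Assumed local-path form: (A^2)[x][y] + epsilon * (A^3)[x][y]."""
    return common_neighbors(adj, x, y) + epsilon * walks_of_length_3(adj, x, y)


def rank_missing_links(adj, score):
    """Rank currently unconnected pairs, most likely link first."""
    candidates = [
        (x, y) for x, y in combinations(sorted(adj), 2) if y not in adj[x]
    ]
    return sorted(candidates, key=lambda pair: score(adj, *pair), reverse=True)


# Hypothetical toy network, used only to exercise the functions above.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
adj = build_adjacency(edges)
ranking = rank_missing_links(adj, local_path_score)
print(ranking)
```

The epsilon term only breaks ties and separates pairs that share no common neighbor but are still three steps apart. With epsilon set to 0 the ranking reduces to plain common neighbors.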
