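There is a pressing need to improve the ways in which the output of scientific research is evaluated by funding agencies, academic institutions, and other parties. To address this issue, a group of editors and publishers of scholarly journals met on December 16, 2012, during the Annual Meeting of the American Society for Cell Biology (ASCB), and developed a set of recommendations. These became the San Francisco Declaration on Research Assessment: Putting science into the assessment of research. Excerpts (Chinese translation by Li Hong and Wang Jianfang) follow:

"The Journal Impact Factor is frequently used as the primary parameter with which to compare the scientific output of individuals and institutions. The Journal Impact Factor, as calculated by Thomson Reuters, was originally created as a tool to help librarians identify journals to purchase, not as a measure of the scientific quality of research in an article. With that in mind, it is critical to understand the well-documented deficiencies of the Journal Impact Factor as a tool for research assessment. These limitations include: (A) citation distributions within journals are highly skewed; (B) the properties of the Journal Impact Factor are field-specific: it is a composite of multiple, highly diverse article types, including primary research papers and reviews; (C) Journal Impact Factors can be manipulated (or 'gamed') by editorial policy; and (D) data used to calculate the Journal Impact Factors are neither transparent nor openly available to the public.

We make a number of recommendations for improving the way in which research output is evaluated. Outputs other than research articles will grow in importance in assessing research effectiveness in the future, but the peer-reviewed research paper will remain a central research output that informs research assessment. Our recommendations therefore focus primarily on practices relating to research articles published in peer-reviewed journals, but can and should be extended to other products recognised as important research outputs, such as datasets. The recommendations are aimed at funding agencies, academic institutions, journals, organizations that supply metrics, and individual researchers.

A number of themes run through these recommendations:

- the need to eliminate the use of journal-based metrics, such as Journal Impact Factors, in funding, appointment, and promotion considerations;
- the need to assess research on its own merits rather than on the basis of the journal in which the research is published;
- the need to capitalize on the opportunities provided by online publication (such as relaxing unnecessary limits on the number of words, figures, and references in articles, and exploring new indicators of significance and impact)."

......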
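Limitation (A) above can be made concrete with a small worked example. The sketch below uses made-up citation counts for ten articles in a single imaginary journal (the numbers are purely illustrative assumptions, not data from any real journal) and compares a journal-level average, which is the kind of figure an impact-factor-style metric reports, with what a typical article in that journal actually receives.

```python
from statistics import mean, median

# Hypothetical citation counts for ten articles in one journal.
# These values are invented for illustration only.
citations = [0, 0, 1, 1, 2, 2, 3, 4, 7, 80]

# Journal-level average: the kind of number a journal-based metric reports.
journal_average = mean(citations)

# Median: what a typical article in the journal receives.
typical_article = median(citations)

# Fraction of articles cited less often than the journal-level average.
share_below_average = sum(c < journal_average for c in citations) / len(citations)

print("journal-level average:", journal_average)
print("median article:", typical_article)
print("share of articles below the average:", share_below_average)
```

With these invented numbers a single highly cited article pulls the average up to 10 citations, while the median article has 2 and nine of the ten articles sit below the average. This is why a journal-level average says little about any individual paper published in that journal.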
San Francisco Declaration on Research Assessment

Translation: Qiu Dunlian

Summary: Research output can take many forms, including published papers, patent applications and software copyrights, other intellectual property, and the training of young scientists. How should the value of research output be assessed? The value of scientific research should not be judged solely by the journal in which a paper appears or by that journal's impact factor, but by the merits of the research itself.

Putting science into the assessment of research

There is a pressing need to improve the ways in which the output of scientific research is evaluated by funding agencies, academic institutions, and other parties.

To address this issue, the group of editors and publishers of scholarly journals listed below met during the Annual Meeting of The American Society for Cell Biology (ASCB) in San Francisco, CA, on December 16, 2012. The group developed a set of recommendations, referred to as the San Francisco Declaration on Research Assessment. We invite interested parties to indicate their support by adding their names to this declaration.

The outputs from scientific research are many and varied, including: research articles reporting new knowledge, data, reagents, and software; intellectual property; and highly trained young scientists. Funding agencies, institutions that employ scientists, and scientists themselves all have a desire, and need, to assess the quality and impact of scientific outputs. It is imperative that scientific output be measured accurately, evaluated wisely, and used thoughtfully.

The Journal Impact Factor is frequently used as the primary parameter with which to measure the scientific output of individuals and institutions. The Journal Impact Factor, as calculated by Thomson Reuters, was originally created as a tool to help librarians identify journals to purchase, not as a measure of the scientific quality of research in an article.
With that in mind, it is critical to understand that the Journal Impact Factor has a number of well-documented deficiencies as a tool for research assessment. These limitations include: A) citation distributions within journals are highly skewed; B) the properties of the Journal Impact Factor are field-specific; it is a composite of multiple, highly diverse article types, including primary research papers and reviews; C) Impact Factors can be manipulated (or "gamed") by editorial policy; and D) data used to calculate the Journal Impact Factors are neither transparent nor openly available to the public.

Below we make a number of recommendations for improving the way in which the actual quality of research output is evaluated. Outputs other than research articles will grow in importance in assessing research effectiveness in the future, but the peer-reviewed research paper will remain a central research output that informs research assessment. Our recommendations therefore focus primarily on practices relating to research articles published in peer-reviewed journals, but can and should be extended by recognising additional products, such as datasets, as important research outputs. The recommendations are aimed at funding agencies, academic institutions, journals, organizations that supply metrics, and individual researchers.

A number of themes run through these recommendations:

- the need to eliminate use of journal-based metrics, such as impact factors, in funding, appointment and promotion considerations;
- the need to assess research on its own merits rather than on the basis of the journal in which the research is published; and
- the need to capitalize on the opportunities provided by online publication (such as relaxing unnecessary limits on the number of words, figures, and references in articles, and exploring new indicators of significance and impact).

We recognize that many funding agencies, institutions, publishers, and researchers are already encouraging improved practices in research assessment. Such steps are beginning to increase the momentum toward more sophisticated and meaningful approaches to research evaluation that can now be built upon and adopted by all of the key constituencies involved.

The signatories of the San Francisco Declaration support the adoption of the following practices in research assessment.

General Recommendation

1. Do not use journal-based metrics, such as journal impact factors, as a surrogate measure of the quality of individual research articles, to assess an individual scientist's contributions, or in hiring, promotion or funding decisions.

For funding agencies

2.
Be explicit about the criteria used in evaluating the scientific productivity of grant applicants and clearly highlight, especially for early-stage investigators, that the scientific content of a paper is much more important than publication metrics or the identity of the journal in which it was published.

3. For the purposes of research assessment, consider the value and impact of all research outputs (including datasets and software) in addition to research publications, and consider a broad range of impact measures including qualitative indicators of research impact, such as influence on policy and practice.

For institutions

4. Be explicit about the criteria used to reach hiring, tenure, and promotion decisions, clearly highlighting, especially for early-stage investigators, that the scientific content of a paper is much more important than publication metrics or the identity of the journal in which it was published.

5. For the purposes of research assessment, consider the value and impact of all research outputs (including datasets and software) in addition to research publications, and consider a broad range of impact measures including qualitative indicators of research impact, such as influence on policy and practice.

For publishers

6. Greatly reduce emphasis on the journal impact factor as a promotional tool, ideally by ceasing to promote the impact factor or by presenting the metric in the context of a variety of journal-based metrics (e.g., 5-year impact factor, EigenFactor, SCImago, editorial and publication times, etc.) that provide a richer view of journal performance.

7. Make available a range of article-level metrics to encourage a shift toward assessment based on the scientific content of an article rather than publication metrics of the journal in which it was published.

8. Encourage responsible authorship practices and the provision of information about the specific contributions of each author.

9. Whether a journal is open-access or subscription-based, remove all reuse limitations on reference lists in research articles and make them available under the Creative Commons Public Domain Dedication. (See reference 8.)

10. Remove or reduce the constraints on the number of references in research articles, and, where appropriate, mandate the citation of primary literature in favor of reviews in order to give credit to the group(s) who first reported a finding.

For organizations that supply metrics

11. Be open and transparent by providing data and methods used to calculate all metrics.

12. Provide the data under a licence that allows unrestricted reuse, and provide computational access to data.

13. Be clear that inappropriate manipulation of metrics will not be tolerated; be explicit about what constitutes inappropriate manipulation and what measures will be taken to combat this.

14. Account for the variation in article types (e.g., reviews versus research articles), and in different subject areas when metrics are used, aggregated, or compared.

For researchers

15.
When involved in committees making decisions about funding, hiring, tenure, or promotion, make assessments based on scientific content rather than publication metrics.

16. Wherever appropriate, cite primary literature in which observations are first reported rather than reviews in order to give credit where credit is due.

17. Use a range of article metrics and indicators on personal/supporting statements, as evidence of the impact of individual published articles and other research outputs.

18. Challenge research assessment practices that rely inappropriately on Journal Impact Factors and promote best practice that focuses on the value and influence of specific research outputs.

References

Editorial (2005). Not so deep impact. Nature 435, 1003–1004.
Rossner, M., Van Epps, H., and Hill, E. (2007). Show me the data. J. Cell Biol. 179, 1091–1092. www.jcb.org/cgi/content/full/179/6/1091.
The PLoS Medicine Editors (2006). The impact factor game. PLoS Med 3(6): e291. doi:10.1371/journal.pmed.0030291.
Rossner, M., Van Epps, H., and Hill, E. (2008). Irreproducible results: a response to Thomson Scientific. J. Cell Biol. 180, 254–255. http://jcb.rupress.org/content/180/2/254.full.
Adler, R., Ewing, J., and Taylor, P. (2008). Citation statistics. A report from the International Mathematical Union. www.mathunion.org/publications/report/citationstatistics0.
http://www.eigenfactor.org/
http://www.scimagojr.com/
http://opencitations.wordpress.com/2013/01/03/open-letter-to-publishers
http://altmetrics.org/tools/

Participants in declaration drafting

Sharon Ahmad, Journal of Cell Science
Bruce Alberts, Science
Stefano Bertuzzi, American Society for Cell Biology
Ana-Maria Cuervo, Aging Cell
Tracey dePellegrin, Genetics
David Drubin, Molecular Biology of the Cell
Martha Fedor, Journal of Biological Chemistry
Petra Gross, Journal of Cell Science
Lisa Hannan, Traffic
Mark Johnston, Genetics
W. Mark Leader, Molecular Biology of the Cell
Michael Marks, Traffic
Mark Marsh, Traffic
Tom Misteli, Journal of Cell Biology
Mark Patterson, eLife
Bernd Pulverer, EMBO Journal
Annalisa VanHook, Science Signalling
Brian Ray, Science
Michael Rossner, Rockefeller University Press
Randy Schekman, eLife
Sandra Schmid, former editor, MBoC and Traffic
Michael Way, Journal of Cell Science
Liz Williams, Journal of Cell Biology

Related reports:

1. Stop using paper citation counts to judge scientists' individual achievements
http://cssci.nju.edu.cn/news_show.asp?Articleid=534

2. San Francisco Declaration on Research Assessment released: scientists say "no" to the impact factor
http://www.sinori.cn/jsp/archives/archivesViewDt!archivesViewDt.action?modelId=1archivesId=7841

3. Scientists say "no" to the impact factor: scientific journals have turned from the servants of science into its masters
http://blog.sciencenet.cn/blog-2277-692028.html
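I read a report saying that a Chinese scientist who cannot become a member of the Chinese Academy of Sciences can still become a member of the US National Academy of Sciences, and it brought the title above to mind. I do not care about that matter, so the first half of the title is really superfluous; this essay is about the second half. So much by way of preface.

Can a scholar's academic contribution be assessed accurately? At a time when academic evaluation is more varied and more feverish than academic research itself, asking this question is admittedly a little tactless. But academics all understand, and even if they do not, both the wisdom of the sages and the practice of ordinary people show, that genuine scholarship cannot be assessed accurately.

Accurately assessing scholarship, or any intellectual contribution, is comparable to precise measurement in physics. It is highly skilled work, and in most cases even the best technique cannot measure a physical quantity exactly. On this point, the British economist Charles Goodhart put forward in 1975 the law that now bears his name, Goodhart's law: once a measure is chosen for making policy decisions, it begins to lose value as a measure. Goodhart's law has been applied to policymaking in banking and other fields, and practice has shown that evaluation not only corrupts the evaluation process but also distorts our understanding of what is being evaluated.

In this respect Goodhart's law resembles the uncertainty principle in quantum mechanics. Scholarship is like a quantum system, so it too "cannot be measured precisely"; academic evaluation is like a quantum measurement. Any operation that measures a quantum system changes its state. Likewise, any academic evaluation changes the character of both the scholarship and the evaluation itself, and not for the better: the more scholarship is evaluated, the more it is cheapened.

Recently Robert Crease, professor of philosophy and history at Stony Brook University (State University of New York), published an article titled "Measuring culture". It gives two examples that we Chinese will find especially convincing as evidence that scholarship "cannot be measured precisely". First, when you use standardized tests to evaluate intelligence, schools start teaching to the test, and you come to regard intelligence as the test-taking ability that schools teach children. Second, if you evaluate researchers' academic contributions by the number of papers, researchers will promptly churn out piles of meaningless, low-quality papers, so the evaluation will not only fail to identify genuine scholarship but will also cause many outstanding scientific achievements to be undervalued. To make up for the shortcomings of evaluating researchers purely by paper count, people abroad came up with the so-called h-index, which tries to combine paper count and impact and is now widely used. But as acceptance of the h-index has grown, tricks for gaming it have multiplied, and one can expect that this measure too will eventually render evaluation meaningless. These examples suggest that academic evaluation cannot escape a curse familiar to every Chinese: those above have their policies, and those below have their countermeasures.

If a scholar's contribution not only cannot be assessed accurately, but is assessed less accurately the more it is evaluated, to the detriment of scholarship itself, why do we still evaluate so frequently and in so many forms? Leave aside the dazzling array of so-called "high-end talent" titles, from professional ranks to "leading", "top-notch", "general" and "marshal" talents. Even someone with no desire for "hats" (titles) or "seats" (positions) cannot escape having every step of academic performance meticulously converted into "work points". On reflection it is not hard to see that the purpose of academic evaluation turns out to be making scholarship utilitarian. Only a utilitarian evaluation system or set of indicators can "precisely" assess a scholar's contribution, and only such a system has the power to steer large numbers of scholars to work toward utilitarian goals by following the indicators. Yet it is exactly because of this power that the more precise the evaluation, the further it ultimately leads scholarship away from its essence.

The essence of academic research is the pursuit of truth, which to a large extent is what the ancient Chinese sages called the "Dao". Laozi said: "The Dao that can be spoken is not the eternal Dao." Laozi seems to have warned posterity thousands of years ago that genuine scholarship cannot be evaluated, and that scholarship which can be accurately evaluated is not genuine scholarship. A remark often attributed to Einstein runs: "If there is any religion that could cope with modern scientific needs and coexist with science, it would be Buddhism." Scholarship, it seems, is like the truth of the Buddhist dharma: it "cannot be spoken". This holds especially for profound scholarship, which loses its true face once it is "spoken". The driving force of genuine scholarship is curiosity and interest, not utility. We "speak" of scholarship and "talk" of scholarship because we are not yet people "in the temple", and for that reason no one should expect so-called masters to emerge from among us. At most we are contributors to indicators akin to GDP (jokingly read in Chinese as "ji de pi", "chicken's fart", which can both bluff and lay eggs).

(Published in China Education Daily, May 8, 2013, page 3, under the title "Producing academicians does not necessarily produce masters")

"Academic evaluation" series:

- Evaluating and being evaluated are both a mirror (published in China Education Daily)
- Academic evaluation should return to scholarship itself
- Academic impurity begins with project review (published in China Education Daily)
- What does academic independence mean for young scholars? (published in China Science Daily)
- Should academic evaluation take a historical perspective into account?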
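
For readers unfamiliar with the h-index mentioned in the essay above, here is a minimal sketch of how it is usually defined: a researcher has index h if h of their papers have each been cited at least h times. The function name `h_index` and all citation counts below are illustrative assumptions, not taken from the essay or from any real researcher.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)  # most-cited paper first
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank      # the top `rank` papers all have >= `rank` citations
        else:
            break         # later papers are cited even less, so stop here
    return h


# Hypothetical publication records (invented numbers).
few_strong_papers = [50, 40, 30]          # three papers, all highly cited
many_weak_papers = [1] * 30               # thirty papers, one citation each
steady_record = [10, 8, 5, 4, 3, 0]       # a mixed record

print(h_index(few_strong_papers))
print(h_index(many_weak_papers))
print(h_index(steady_record))
```

In this sketch the three highly cited papers give an index of 3, the thirty barely cited papers give only 1, and the mixed record gives 4. Paper count alone cannot raise the index, which is the weakness of pure paper counting that it was meant to address, but, as the essay notes, the measure can still be gamed once it becomes a target.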