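Since 2010, the Journal of Mountain Science (the English-language edition of 《山地科学学报》) has used CrossCheck, the detection service powered by iThenticate, to guard against academic misconduct. The detection system is embedded in our ScholarOne online manuscript system. Once a manuscript (a new submission or a revision) enters the system, the administrator sends its main document to the detection system for checking. In addition, the editorial office checks every accepted manuscript once more before formal publication. The checks serve two purposes: 1) to prevent duplicate submission or duplicate publication; 2) to detect plagiarism and excessive borrowing.

The General Guidelines section of the Journal of Mountain Science author guide (http://jms.imde.ac.cn/for-authors) contains the following statement: “JMS accepts original papers and invited review that have never been published in English in any form. All manuscripts (new submission, revision, and before formal publication) will be subjected to a plagiarism checking by CrossChecking Software iThenticate to prevent plagiarism and inappropriate citation.”

After checking, manuscripts are generally rejected on the following criteria:
1. Similarity with a single article reaches about 20%;
2. Total similarity reaches about 40%;
3. The abstract or conclusions closely resemble previously published articles (by others or by the authors themselves);
4. Most figures and tables are identical to those in the authors' previously published articles, or a figure or table is essentially identical to one published by others without the source being cited;
5. The main results (data) are the same as in previously published articles.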
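The two numeric criteria (1 and 2) can be sketched as a simple pre-screening check. This is only an illustration: the `SimilarityReport` class and the function name are hypothetical, the cut-offs are the approximate "about 20%" and "about 40%" figures quoted above, and criteria 3 to 5 still require an editor to read the report.

```python
from dataclasses import dataclass
from typing import List

# Approximate cut-offs taken from criteria 1 and 2 ("about 20%" and "about 40%").
SINGLE_SOURCE_LIMIT = 20.0
TOTAL_LIMIT = 40.0


@dataclass
class SimilarityReport:
    """Hypothetical summary of one similarity report (percentages, 0-100)."""
    total: float
    largest_single_source: float


def triggered_criteria(report: SimilarityReport) -> List[str]:
    """Return the numeric criteria that the report meets or exceeds."""
    flags = []
    if report.largest_single_source >= SINGLE_SOURCE_LIMIT:
        flags.append("single-source similarity about 20% or more")
    if report.total >= TOTAL_LIMIT:
        flags.append("total similarity about 40% or more")
    return flags
```

For example, `triggered_criteria(SimilarityReport(total=35.0, largest_single_source=22.0))` flags only the single-source criterion. An empty list does not clear a manuscript, since the remaining criteria are judged by hand.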
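If source code could also be checked for duplication
Some crocodiles would surely wake in fright at midnight
Suppose source code could be checked for duplication
Some high mountains would surely collapse
Some people who strut about arrogantly
Would surely turn meek and humble
If source code could be checked for duplication
Some seas would surely run dry
If source code could be checked for duplication
Many fox spirits would surely show their true form
Many demons and monsters would surely have nowhere to hide
Check, check, check, check
Code must be checked too, code must be checked too
Even the sly fox has a tail
Even the stealthy weasel leaves footprints
Exposed, exposed, exposed, exposed
(If this software is under development, its author is a Bill Gates kind of person)

An amusing Google translation (Chinese to English):
Some people arrogance
Three times lower bound gas
Assume that the source code can be re-check
Some crocodiles will definitely scared midnight
Assume that the source code can be re-check
Some mountain must fall jump
If the source code can be re-check
Certainly some waters will dry up
If the source code can be re-check
There must be a lot of fox Xianchuyuanxing
There must be a lot of demons Unable to hide
check check check check
Borrow borrow borrow borrow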
Understanding the Similarity Score

The similarity score is the first thing you see when a document is processed and, because it’s easy to focus on this number as signifying a problem, a common question new users of the system ask is ‘what level of similarity score indicates a problem?’

The answer to this question is there is no such thing as a ‘magic number’ that will tell you whether a document contains problematic content. The similarity score gives you a rough ‘headline’ that ensures heavily duplicated papers are brought straight to your attention and allows you to quickly disregard papers with hardly any matches. Beyond that, the score itself doesn’t give you definitive answers and definitely cannot tell you whether you have a case of plagiarism.

Why is this? Well, there are a number of factors that need to be taken into account when assessing a paper’s overall similarity score.

Firstly, it’s important to note the similarity score is telling you the total amount of matching text. This is probably going to be made up of a number of smaller matches.
It is possible a 30% score will turn out to be a 30% match to one source, but it’s much more likely that when you look at the reports you’ll find the 30% is made up of a number of smaller matches, the largest of which might be just 4 or 5%. Of course, a paper with six separate matches of 5% could well be as problematic as one that has copied 30% of its content from a single source, but it’s impossible to tell whether this is the case without looking at the reports.

Secondly, where the match appears can sometimes be more important than how big the match is. For example, editors in certain subject areas may be less concerned about sizable matches in methods sections, where there are only so many ways to describe a certain process. A match in the discussion or conclusions with no appropriate citation, on the other hand, could set alarm bells ringing even though it only accounts for a small percentage of the manuscript.

Similarly, acceptable thresholds for one type of article may not be appropriate for another: review articles could be expected to have a higher overall similarity score than original research articles.

It is also important to bear in mind there could be simple errors in the unedited manuscript that mean matches are picked up incorrectly. The exclude bibliography feature of the software relies on the reference section having a title on its own line within the document. If this is omitted from the manuscript, the references will not be excluded. Similarly, the exclude quotes feature looks for quotation marks. If the author has not used quotation marks or missed one at the start or end of the passage, the system will not recognize it as a quote, even though it might be apparent to the editor due to its layout and reference.

For all of these reasons it’s important to look at the reports rather than rely on the similarity score alone.
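The two exclusion features described above can be illustrated with a rough sketch. It is not how any particular checker is implemented: the heading names, the regular expressions and the function `strip_excluded` are assumptions chosen only to show why a missing heading or an unpaired quotation mark leaves text in the comparison.

```python
import re

# Assumed heading names; a real tool may recognise a different set.
HEADING = re.compile(r"^\s*(references|bibliography)\s*$",
                     re.IGNORECASE | re.MULTILINE)
# Only matched pairs of straight or curly double quotes count as a quotation.
QUOTED = re.compile(r'"[^"]*"|“[^”]*”')


def strip_excluded(text: str) -> str:
    """Drop the bibliography (if a heading line is found) and paired quotations."""
    heading = HEADING.search(text)
    if heading:
        # Everything from the heading line onward is treated as the reference list.
        text = text[:heading.start()]
    # A passage with a missing opening or closing mark is not matched and stays in.
    return QUOTED.sub("", text)
```

With a "References" line present, the entries after it are dropped; without it, they stay in the text and may be reported as matches. Likewise, `'He wrote "copied words" here.'` loses the quoted passage, while `'He wrote "copied words here.'` keeps it because the closing mark is missing.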
(When reposting, please credit the LetPub Chinese website: www.letpub.com.cn/index.php?page=sci_writing_23 )
Design and Application of Data Fusion Software on Papers Indexed By SCI and EI

Abstract: A software tool is designed to perform duplicate checking and data fusion on papers indexed by SCI and EI. It helps analysts obtain a fused dataset in a uniform format and better meets the need of micro-level subject information analysis for flexibly building journal-literature datasets from multiple sources. Two automatic algorithms and one semi-automatic algorithm are used to achieve accurate duplicate checking of SCI and EI records. Data fusion is based on a detailed analysis of the text features of the full-record fields of both databases. The tool automatically marks records duplicated between SCI and EI and generates a de-duplicated fused data sheet. It effectively solves the problem of building a unified analysis dataset from two different journal-literature sources, and its design ideas can also be applied to other databases.

Key words: Duplicate checking, Data fusion, EI, SCI, Software design

Funding: This work is one of the outcomes of the Young Talent Frontier Project of the National Science Library, Chinese Academy of Sciences, "Optimized Design of Auxiliary Tools for Subject Knowledge Services" (project No. 青1209).

Corresponding author: Yu Jian (于健), E-mail: yuj@mail.las.ac.cn

Full-text PDF download link: http://www.infotech.ac.cn/CN/abstract/abstract3977.shtml
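The abstract does not describe the three algorithms, so the following is only a guess at the general shape of such a check, not the authors' method. It assumes each record is a dict with hypothetical `title` and `year` fields, treats an exact match on the normalized title as an automatic duplicate, and sends near matches to a person for review (the semi-automatic step). The 0.9 threshold is arbitrary.

```python
import re
from difflib import SequenceMatcher


def normalize(title: str) -> str:
    """Lower-case a title and collapse punctuation and spacing."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def classify_pair(sci: dict, ei: dict, review_threshold: float = 0.9) -> str:
    """Return 'duplicate', 'review', or 'distinct' for one SCI/EI record pair."""
    if sci["year"] != ei["year"]:
        return "distinct"  # assumption: both databases report the same year
    a, b = normalize(sci["title"]), normalize(ei["title"])
    if a == b:
        return "duplicate"  # exact match after normalization: decided automatically
    ratio = SequenceMatcher(None, a, b).ratio()
    # Near matches are left for a human to confirm.
    return "review" if ratio >= review_threshold else "distinct"
```

Records classified as `duplicate` would be marked and merged into one row of the fused sheet, while `review` pairs would be listed for manual confirmation. Matching on a DOI, where both databases supply one, would be a more reliable key than the title.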
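Searching Google with the keywords “Humans as the World's Greatest Evolutionary Force” turned up two identical articles: one published in Science in 2001, the other (or the same one) published in Urban Ecology in 2008. Naturally, everyone cites the version published in Science. This is a reminder that Google is an excellent tool for checking papers for duplication, and I recommend it to journal editorial offices as a gatekeeping aid when accepting papers.

Using Google for research is already commonplace. Google not only returns the results you care about most quickly, it also shows the latest research progress. In my experience, Google helps beginners get up to speed and saves a great deal of time. But Google's sharp eyes go further than that: the example above is a finding made by using Google cleverly.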