Distinguishing sleeping beauties in science
Jiang Li, Fred Y. Ye

Abstract: Three kinds of criteria have been advanced for distinguishing sleeping beauties in science: average-based criteria, quartile-based criteria and parameter-free criteria. On this basis, four rules are proposed that should be adhered to in distinguishing sleeping beauties: (1) early citations should be penalized; (2) the whole citation history should be taken into account; (3) the awakening time of a sleeping beauty should not vary over time; and (4) arbitrary thresholds on sleeping period or awakening intensity should be avoided.

Keywords: Sleeping beauty; Beauty coefficient; Delayed recognition

(In short: three kinds of criteria, namely average-based, quartile-based and parameter-free; and four rules: penalize early citations; consider the whole citation history; keep the awakening time of a sleeping beauty stable over time; avoid arbitrary thresholds for the sleeping period or the awakening intensity.)

Scientometrics (2016) 108:821–828. DOI 10.1007/s11192-016-1977-3
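The keywords mention a beauty coefficient, but this post does not define it. The following is a minimal sketch, assuming the parameter-free definition usually attributed to Ke et al. (2015): draw a straight line from the citations in the publication year to the citations in the peak year, then sum the gaps between that line and the actual curve, each divided by max(1, c_t). The function name and the sample series are hypothetical and are not taken from the paper above.

```python
from math import hypot


def beauty_coefficient(citations):
    """Return (B, t_a) for a list of yearly citation counts.

    citations[t] is the number of citations received t years after
    publication. Assumed definition (Ke et al., 2015): B sums the
    normalized gaps between the reference line and the curve up to the
    peak year t_m; t_a is the year at or before t_m that is farthest
    from the reference line.
    """
    if not citations:
        raise ValueError("citations must not be empty")

    # Peak year: first year with the maximum number of citations.
    t_m = max(range(len(citations)), key=lambda t: citations[t])
    if t_m == 0:
        # Peak in the publication year: B is 0 by convention.
        return 0.0, 0

    c0, cm = citations[0], citations[t_m]
    slope = (cm - c0) / t_m

    # Sum of (line value - actual citations) / max(1, actual citations).
    b = sum(
        (slope * t + c0 - citations[t]) / max(1, citations[t])
        for t in range(t_m + 1)
    )

    # Distance of each point (t, c_t) from the reference line.
    norm = hypot(cm - c0, t_m)
    distance = [
        abs((cm - c0) * t - t_m * citations[t] + t_m * c0) / norm
        for t in range(t_m + 1)
    ]
    t_a = max(range(t_m + 1), key=lambda t: distance[t])
    return b, t_a


if __name__ == "__main__":
    # Hypothetical paper: four silent years, then a burst in year 4.
    print(beauty_coefficient([0, 0, 0, 0, 10]))
```

In this sketch no threshold on sleeping length or awakening intensity is needed, which is the sense of "parameter-free" used in rule (4).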
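Sleeping beauties appear with higher probability among the papers of Nobel laureates and often contain striking discoveries. So which papers are more likely to become sleeping beauties? In fact, most papers are asleep when first published, and a considerable share are never cited at all; even among the papers of Nobel laureates, about 10% have never been cited. The citation counts of a paper in its "sleeping" state can be distributed in many ways. If we can compute how often sleeping beauties emerged in the past from each kind of distribution, we can use that to infer the probability that a paper currently asleep will become a sleeping beauty (that is, "be awakened") in the future.

Following this logic, our research group proposed the "heartbeat spectrum" of a paper's sleeping period and used the Gini coefficient to measure how evenly the citations are spread across the spectrum. We found that the heartbeat spectra most likely to yield a sleeping beauty have evenly spaced heartbeats with the center of gravity toward the end, for example (0,2,0,2,0,2). By comparison, (2,0,2,0,2,0) and (1,1,1,1,1,1) are less likely to become sleeping beauties. The hardest case is "zero citations", that is, no heartbeat at all. When a paper has had no heartbeat for five consecutive years, its probability of being awakened falls to 0.2%; after ten consecutive years it falls to 0.05%; and after twenty consecutive years without a heartbeat it is essentially in a vegetative state.

The paper was published in Journal of Informetrics (SSCI-indexed); its title and abstract follow. Full text: http://www.sciencedirect.com/science/article/pii/S1751157714000418

A study of the "heartbeat spectra" for "sleeping beauties"

Abstract: We first introduced interesting definitions of "heartbeat" and "heartbeat spectrum" for "sleeping beauties", based on van Raan's variables. Then, we investigated 58,963 papers of Nobel laureates during 1900–2000 and found 758 sleeping beauties. By proposing and using the G_s index, an adjustment of the Gini coefficient, to measure the inequality of the "heartbeat spectrum", we observed that publications which possess "late heartbeats" (most citations were received in the second half of the sleeping period) have a higher awakening probability than those that have "early heartbeats" (most citations were received in the first half of the sleeping period). The awakening probability appears to be highest if an article's G_s index lies in the interval [0.2, 0.6).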
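To make the two properties of a heartbeat spectrum concrete, here is a minimal sketch. It assumes that a spectrum is simply the list of yearly citation counts during the sleeping period, as the examples such as (0,2,0,2,0,2) suggest. It computes the plain Gini coefficient, not the adjusted G_s index of the paper, whose exact form is not given here. The helper `late_share` (the fraction of citations falling in the second half) is a hypothetical stand-in for "center of gravity toward the end".

```python
def gini(spectrum):
    """Plain Gini coefficient of a list of non-negative yearly citations.

    Returns None when there are no citations at all (no heartbeat),
    because inequality is undefined in that case.
    """
    n = len(spectrum)
    total = sum(spectrum)
    if n == 0 or total == 0:
        return None
    # Mean absolute difference over all ordered pairs, normalized by
    # twice the mean: sum|xi - xj| / (2 * n^2 * mean) = sum / (2 * n * total).
    diff_sum = sum(abs(a - b) for a in spectrum for b in spectrum)
    return diff_sum / (2 * n * total)


def late_share(spectrum):
    """Fraction of citations received in the second half of the period.

    For an odd number of years the middle year counts as second half.
    Returns None when there are no citations at all.
    """
    total = sum(spectrum)
    if total == 0:
        return None
    half = len(spectrum) // 2
    return sum(spectrum[half:]) / total


if __name__ == "__main__":
    for s in [(0, 2, 0, 2, 0, 2), (2, 0, 2, 0, 2, 0), (1, 1, 1, 1, 1, 1)]:
        print(s, gini(s), late_share(s))
```

Under these assumptions, (0,2,0,2,0,2) and (2,0,2,0,2,0) have the same plain Gini coefficient of 0.5 and differ only in late share (2/3 versus 1/3), while (1,1,1,1,1,1) has a Gini coefficient of 0. This is why a measure of evenness alone cannot separate "late" from "early" heartbeats.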
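The opposite of the sleeping beauty's citation curve is the "flash in the pan": a paper that is heavily cited as soon as it is published, but only briefly, and is then gradually forgotten. The flash in the pan reflects rapid turnover in research; the sleeping beauty reflects delayed recognition of research results. Both are rare. A flash in the pan can arise for many reasons, for example: one scientific hot topic is replaced by another, a hot research question is solved, or one solution is superseded by a better one. In short, these patterns fit the way research works in the natural sciences, so the phenomenon is far more common there than in the social sciences.

There is also a highly distinctive kind of citation curve (see the attached figure) whose early part looks like a flash in the pan and whose later part looks like a sleeping beauty. That is, a result attracts a great deal of attention and becomes a hot topic as soon as it is published, is forgotten for a number of years, and is then rediscovered, triggering a second wave of attention and becoming a hot topic for the second time. This phenomenon is even rarer. Among the 12,862 papers published by all Nobel science laureates of 1900–2012 and indexed in Web of Science, we found two cases, and we wrote them up in a paper published in Scientometrics (SSCI-indexed): http://link.springer.com/article/10.1007/s11192-013-1217-z . The paper also discusses in detail the happy ending of the sleeping beauty and the prince.

The title and abstract of the paper follow.

Citation Curves of All-elements-sleeping-beauties: "Flash in the Pan" first and then "Delayed Recognition"

Abstract: "Delayed recognition" refers to the phenomenon where papers did not achieve any sort of recognition until some years after their original publication. A paper with delayed recognition was termed a "sleeping beauty": a princess sleeps (goes unnoticed) for a long time and then, almost suddenly, is awakened (receives a lot of citations) by a prince (another article). There are a sleeping period and an awakening period in the definition of a "sleeping beauty". Apart from and prior to the two periods, an awaking period was found in citation curves of some publications; "sleeping beauties" was hence expanded to "all-elements-sleeping-beauties". The opposite effect of "delayed recognition" was described as "flash in the pan": documents that were noticed immediately after publication but did not seem to have a lasting impact. In this work, we briefly discussed the citation curves of two remarkable "all-elements-sleeping-beauties". We found they appeared "flash in the pan" first and then "delayed recognition". We also found happy endings of sleeping beauties and princes, and hence suggest the citation curve of an "all-elements-sleeping-beauty" include an awaking period, a sleeping period, an awakening period and a happy ending.

[Figure attached; see the original paper for a detailed explanation.]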
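Some papers attract attention gradually after publication and reach their citation peak after about two years (this is the rationale for the two-year window in the journal impact factor). Some papers attract a great deal of attention immediately and reach their citation peak in the year of publication, but the sensation fades quickly; this phenomenon is called a "flash in the pan". Other papers go unnoticed for many years, with very few citations, and then suddenly attract a great deal of attention, with citations soaring. This resembles the fairy tale: the "princess" is at first "asleep", but after the "prince" appears and kisses her, she is awakened. In 2004, van Raan (a recipient of the Price award) therefore named this phenomenon the "sleeping beauty".

Large-scale statistics show that sleeping beauties occur with a probability of about 1%. Our research group selected a set of Nobel Prize-winning papers and found that the probability of a sleeping beauty among them is far higher than 1%. In other words, high-quality research often struggles to win recognition from contemporary scientists, and peers come to understand it only years later. The most typical example is the pea hybridization experiments of Mendel (a scientist in genetics and breeding): the results were published in 1866 but only began to be recognized in 1901, and eventually became a classic. We also unexpectedly found another phenomenon: some papers do not fall asleep the moment they appear, but only after pricking a finger on the "spindle". That is, the princess has an active period before she falls asleep. Following the complete fairy tale, we named this phenomenon the "all-elements-sleeping-beauty".

The paper was published in Scientometrics (SSCI-indexed), http://link.springer.com/article/10.1007/s11192-012-0643-7 ; its title and abstract follow.

The phenomenon of all-elements-sleeping-beauties in scientific literature

Abstract: The phenomenon of all-elements-sleeping-beauties in science is revealed by four special cases. The 'sleeping beauties' prick their fingers on the 'spindles' so that they fall into sleep, then are awakened by their 'princes'. The authors speculate that the phenomenon could happen in scientific literature of high quality.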