Abstract: OBJECTIVE To establish an agile discovery method of drugs or natural products for epidemics (aCODE), a heuristic approach intended for the development of anti-infectious disease drugs.
METHODS Five infectious diseases (HIV infection, human influenza, Paramyxoviridae infections, bacterial infections and whooping cough), each with at least 40 drugs approved by the United States Food and Drug Administration (FDA), were selected. An experimental group and two negative control groups (A and B) were set up for each disease. In the experimental group, M anti-infective drugs whose FDA-approved indication was the disease were randomly drawn (500 times) as seed drugs. Negative control group A replaced the seed drugs with all FDA-approved anti-infective drugs indicated for other diseases, and negative control group B replaced them with all drugs indicated for non-infectious diseases. M ranged from 2 to 20. The target gene information of the seed drugs was taken as input, the feature vector of the seed drug set was calculated, and candidate compounds were predicted by a similarity search over drug feature vectors. To validate the method, the size of the intersection between the predicted drugs and the positive set of FDA-approved drugs for the disease was calculated, together with the significance of that intersection. After the aCODE method was established, four drugs (lopinavir, ribavirin, ritonavir and chloroquine phosphate) were selected as seed drugs for coronavirus disease 2019 (COVID-19) to make predictions over natural product constituents. Natural products reported in the literature to have anti-coronavirus activity served as the validation set, and the significance of the prediction results was calculated.
RESULTS For the five infectious diseases, with randomly drawn seed drugs as input, the proportion of positive drugs among the predictions of the experimental group increased with the number of seed drugs, while the positive rate of the two negative control groups stayed roughly flat or declined slightly. When applied to COVID-19 drug screening, the aCODE method effectively predicted drugs with potential anti-SARS-CoV-2 activity (P=0.0046).
CONCLUSION With the aCODE method, the more seed drugs there are, the more accurate the disease-related gene module features calculated from them, and the higher the proportion of positive drugs in the prediction results. This method may contribute to the discovery of drugs for COVID-19.
Key words: COVID-19; drug repositioning; natural products; network pharmacology
Gao Min, Xu Ruifeng, Quan Yuan, Liang Fengji, Zhu Yuexing, Xiong Jianghui. A heuristic discovery method for anti-infectious disease drugs and a preliminary exploration of its application to drug discovery for COVID-19 (一种抗感染性疾病药物的启发式发现方法及其在治疗新型冠状病毒肺炎药物发现中的应用初探). Chinese Journal of Pharmacology and Toxicology (中国药理学与毒理学杂志), 2020, 34(6): 408-417. http://202.38.153.236:81/Jweb_cjpt/CN/abstract/abstract4602.shtml
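The abstract outlines a pipeline (seed drugs, a feature vector for the seed set, similarity search, significance of the overlap with known positives) but does not say how each step is computed. The following is only a minimal sketch under stated assumptions, not the authors' implementation: it assumes binary target-gene vectors, averaging as the way to combine seed drugs, cosine similarity for the search, and a one-sided hypergeometric test for the overlap. All function names, drug names and gene names are hypothetical.

```python
from math import comb, sqrt


def seed_profile(seed_targets, genes):
    """Average the binary target-gene vectors of the seed drugs.

    Averaging is an assumption; the abstract only says a feature
    vector is calculated for the seed drug set.
    """
    n = len(seed_targets)
    return [sum(g in targets for targets in seed_targets) / n for g in genes]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_candidates(seed_targets, candidates, genes, top_k):
    """Return the top_k candidate names most similar to the seed profile."""
    profile = seed_profile(seed_targets, genes)
    scores = {
        name: cosine(profile, [float(g in targets) for g in genes])
        for name, targets in candidates.items()
    }
    # Sort by descending score; break ties by name for a stable order.
    return sorted(scores, key=lambda name: (-scores[name], name))[:top_k]


def overlap_p_value(universe, n_positive, n_predicted, n_overlap):
    """One-sided hypergeometric tail: P(overlap >= n_overlap).

    universe    : number of candidate compounds screened
    n_positive  : known positives among them
    n_predicted : size of the predicted set
    n_overlap   : predicted compounds that are known positives
    """
    upper = min(n_positive, n_predicted)
    tail = sum(
        comb(n_positive, k) * comb(universe - n_positive, n_predicted - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / comb(universe, n_predicted)


# Toy data, purely illustrative.
toy_genes = ["G1", "G2", "G3", "G4"]
toy_seeds = [{"G1", "G2"}, {"G1"}]
toy_candidates = {"cand_a": {"G1", "G2"}, "cand_b": {"G3", "G4"}, "cand_c": {"G1"}}
print(rank_candidates(toy_seeds, toy_candidates, toy_genes, top_k=2))
```

Under these assumptions, adding seed drugs that share targets makes the averaged profile concentrate on the shared genes, which is one plausible reading of why the positive rate rises with the number of seeds.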
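The latest issue of Nature Medicine retracts a paper by oncologists at Duke University in the United States, "Genomic signatures to guide the use of chemotherapeutics". http://www.nature.com/nm/journal/v17/n1/full/nm0111-135.html The retraction notice gives the reason: "We wish to retract this article because we have been unable to reproduce certain crucial experiments showing validation of signatures for predicting response to chemotherapies, including docetaxel and topotecan". The original paper claimed that sensitivity to chemotherapy drugs could be predicted from gene expression profile data. Because that conclusion would matter a great deal for personalized medicine, it drew wide attention, and it went on to provoke a great deal of discussion. We once tried to reproduce the method ourselves and could not reproduce the results no matter what we did. Our main doubts at the time were: (1) how could a signature derived from cell lines (in vitro) work so well when applied to tumor patients (in vivo)? (2) the method used to mine the signature was remarkably simple. Of course, none of this undermines the feasibility of the general idea of predicting or analyzing drug sensitivity on the basis of genomics.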
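The latest issue of Nature Chemical Biology carries a series of Commentary articles. One of them, "The challenges of integrating multi-omic data sets", states plainly that integrating and mining multi-omic data may require more resources than collecting the data, which is surely a great encouragement to those working on computational biology modeling: "The capability to generate multi-omic data sets raises the issue of resource allocation for data generation versus data curation and integration. The initial experience of researchers shows that the effort required for the latter can be much greater than that for the former." As the number of omics data types grows, for example in an integrated analysis of mRNA and microRNA data, this trend becomes even more pronounced. "Resources" here can presumably be read as including personnel, materials and funding. The authors also point out: (1) part of the reason this shift has not yet happened is that it is hard to find suitable people to process and integrate multiple kinds of omics data, because the work demands deep technical knowledge of how the data are generated; (2) advance planning at the experimental design stage also matters a great deal. Omics data are often managed by people with a technical background, who tend to try to measure everything (every parameter), and this is where a deep life-science background becomes especially important. In short, a multidisciplinary background spanning biomedicine and information science is an important quality for tackling multi-omic data integration, because behind every data integration scheme lies your own understanding of the problem, that is, modeling. Original link: http://www.nature.com/nchembio/journal/v6/n11/full/nchembio.462.html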