Xiaoke Robot

Scalable generalized linear mixed model enables analysis of large datasets
2020-05-20 23:30

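Seunggeun Lee, Wei Zhou, and colleagues at the University of Michigan have developed a scalable generalized linear mixed model for analyzing large datasets. The work was published online in Nature Genetics on May 18, 2020.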

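The researchers propose SAIGE-GENE, a scalable generalized mixed-model region-based association test. It is applicable to exome-wide and genome-wide region-based analysis of hundreds of thousands of samples, and it can account for unbalanced case-control ratios for binary traits.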
 
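Through extensive simulation studies and analyses of the HUNT study, with 69,716 Norwegian samples, and the UK Biobank data, with 408,910 White British samples, the researchers show that SAIGE-GENE can efficiently analyze large-sample data (N > 400,000) while keeping type I error rates well controlled.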
 
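By way of background, the very large sample sizes of biobanks offer a rare opportunity to identify the genetic components of complex traits. To analyze rare variants, region-based multiple-variant aggregate tests are commonly used to increase the power of association tests. However, because of the substantial computational cost, existing region-based tests cannot analyze hundreds of thousands of samples while also accounting for confounders such as population stratification and sample relatedness.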
 
Appendix: original English text

Title: Scalable generalized linear mixed model for region-based association tests in large biobanks and cohorts

Author: Wei Zhou, Zhangchen Zhao, Jonas B. Nielsen, Lars G. Fritsche, Jonathon LeFaive, Sarah A. Gagliano Taliun, Wenjian Bi, Maiken E. Gabrielsen, Mark J. Daly, Benjamin M. Neale, Kristian Hveem, Goncalo R. Abecasis, Cristen J. Willer, Seunggeun Lee

Published online: 2020-05-18

Abstract: With very large sample sizes, biobanks provide an exciting opportunity to identify genetic components of complex traits. To analyze rare variants, region-based multiple-variant aggregate tests are commonly used to increase power for association tests. However, because of the substantial computational cost, existing region-based tests cannot analyze hundreds of thousands of samples while accounting for confounders such as population stratification and sample relatedness. Here we propose a scalable generalized mixed-model region-based association test, SAIGE-GENE, that is applicable to exome-wide and genome-wide region-based analysis for hundreds of thousands of samples and can account for unbalanced case–control ratios for binary traits. Through extensive simulation studies and analysis of the HUNT study with 69,716 Norwegian samples and the UK Biobank data with 408,910 White British samples, we show that SAIGE-GENE can efficiently analyze large-sample data (N>400,000) with type I error rates well controlled.

DOI: 10.1038/s41588-020-0621-6

Source: https://www.nature.com/articles/s41588-020-0621-6

Nature Genetics: founded in 1992 and published by Springer Nature. Latest IF: 41.307
Official website: https://www.nature.com/ng/
Submission link: https://mts-ng.nature.com/cgi-bin/main.plex


This article: Nature Genetics, published online
