

Monographs on high-dimensional sparse statistical inference and penalized estimation, and highly cited papers on Lasso variable selection

Posted 2015-8-12 18:25 | Research notes

      In recent years, high-dimensional statistical inference related to variable selection has been a very hot topic, especially the lasso method based on the L1 penalty and its many extensions. For background, see the following posts and resources online:


Lasso思想及算法 (the ideas and algorithms behind the Lasso)

统计学习那些事 (notes on statistical learning)

The Lasso

Video lectures: http://videolectures.net/site/search/?q=LASSO and http://videolectures.net/site/search/?q=+High-Dimensional+Data

高维模型选择方法综述 (a survey of high-dimensional model selection methods), 数理统计与管理, 2012(4)


Variable selection, an important branch of modern mathematical statistics, has developed rapidly and finds wide application in biology, medicine, networks, economics and finance, image processing, and other fields. The leading researchers in this area (their names appear throughout the list of highly cited Lasso papers below; citation counts are as of 2015-8-12) publish frequently in the four flagship statistics journals

Journal of the Royal Statistical Society Series B (Statistical Methodology), Annals of Statistics, Biometrika, Journal of the American Statistical Association,

and there are also many very good papers in the six next-tier journals

Bernoulli, Statistica Sinica, Scandinavian Journal of Statistics, Electronic Journal of Statistics, Statistical Science, Technometrics.

       As science and technology advance, the dimension of the data we collect keeps growing, so how to mine useful information effectively from massive data has attracted great attention. High-dimensional statistical modeling is without doubt one of the most effective tools for this problem. When building low-dimensional models, people traditionally include as many covariates as possible in order to reduce the model bias caused by omitting important variables. In high-dimensional modeling, however, because of the curse of dimensionality (see Chapter 1 of Introduction to High-Dimensional Statistics by Christophe Giraud for a detailed account), keeping all the variables is simply unrealistic. We therefore need to select a subset of variables so as to improve both the interpretability and the prediction accuracy of the model. Variable selection also follows the spirit of Occam's Razor, the principle attributed to the 14th-century logician and Franciscan friar William of Occam (c. 1285-1349), who wrote in his Commentary on the Sentences (Book 2, Question 15): "It is futile to do with more things that which can be done with fewer."

       Occam’s Razor is a well known principle of “parsimony of explanations” which is influential in scientific thinking in general and in problems of statistical inference in particular. (Rasmussen)


     Studying high-dimensional statistics is not easy; the following foundational courses are needed as prerequisites:
mathematical statistics (classical statistical inference), advanced probability (limit theorems and large-sample theory), linear and generalized linear models (matrix algebra, the classical linear model), and statistical computing (optimization methods).
For a reading list, see the earlier post 概率统计金融数学计量精算一些内容利于自学,新而全的教科书 (new and comprehensive textbooks on probability, statistics, financial mathematics, econometrics, and actuarial science, suitable for self-study).

      The following books are devoted to (or contain chapters on) high-dimensional statistics, sparse inference, and penalized estimation, arranged roughly in chronological order.

More theoretical books on high-dimensional statistics (sparse inference, penalized estimation):
2002,Subset selection in regression 2ed by Miller, A.  
2005,The concentration of measure phenomenon by Ledoux, M.
2007,Introduction to Clustering Large and High-Dimensional Data by Jacob Kogan
2007,Concentration inequalities and model selection by Massart, P.
2008,Modern multivariate statistical techniques by Izenman, A. J.

2008,High-Dimensional Data Analysis in Cancer Research by Xiaochun Li and Ronghui Xu
2009,Spectral Analysis of Large Dimensional Random Matrices by Zhidong Bai and Jack W. Silverstein
2010,High-dimensional Data Analysis by Tony Cai and Xiaotong Shen
2010,Statistics for High-Dimensional Data: Methods, Theory and Applications by Peter Bühlmann and Sara van de Geer
2010,Large-scale inference: empirical Bayes methods for estimation, testing, and prediction by Efron, B.
2011,Oracle Inequalities in Empirical Risk Minimization and Sparse Recovery Problems by Koltchinskii, V.
2012,大维统计分析 (Large Dimensional Statistical Analysis, in Chinese) by 白志东 (Zhidong Bai)
2013,Multivariate statistical analysis: A high-dimensional approach by Serdobolskii, V. I.
2013,High Dimensional Probability VI: The Banff Volume
2013,High-Dimensional Covariance Estimation: With High-Dimensional Data by Mohsen Pourahmadi

2013,Penalty, Shrinkage and Pretest Strategies: Variable Selection and Estimation by S. Ejaz Ahmed

2014, Superconcentration and Related topics, Sourav Chatterjee

2014,Multivariate Statistics: High-Dimensional and Large-Sample Approximations by Fujikoshi
2014,Introduction to High-Dimensional Statistics by Christophe Giraud
2014,An Introduction to Sparse Stochastic Processes by M Unser, PD Tafti
2015,Statistical Learning for High-Dimensional Data by Jianqing Fan and Runze Li

2015,Multivariate Density Estimation: Theory, Practice, and Visualization 2ed by David W. Scott (Chapter 7)
2015,Applied multivariate statistical analysis 4ed by Härdle, W., & Simar, L.
2015,Statistical Learning with Sparsity: The Lasso and Generalizations by Hastie, T., Tibshirani, R., & Wainwright, M.
2015,Modeling and Stochastic Learning for Forecasting in High Dimensions by Antoniadis, A., Poggi, J. M., & Brossat, X.
2015,Large Sample Covariance Matrices and High-Dimensional Data Analysis by Jianfeng Yao and Shurong Zheng
2015,Regression Modeling With Many Correlated Predictors: High Dimensional Data Analysis in Practice by Jay Magidson
2015,Mathematical Foundations of Infinite-Dimensional Statistical Models by Evarist Giné and Richard Nickl
More applied statistical-learning books:
1998,Statistical learning theory by Vapnik  
2004,All of Statistics: A Concise Course in Statistical Inference by Larry Wasserman (the second half of the book is essentially statistical learning: the bootstrap, graphical models, causal inference, classification, and nonparametric methods are all covered)
2008,Statistical Learning from a Regression Perspective by Richard A. Berk  
2009,The Elements of Statistical Learning: Data Mining, Inference, and Prediction by Trevor Hastie, Robert Tibshirani, and Jerome Friedman. The authors are among the most active researchers on boosting and variable selection; their gradient boosting offered a new way to understand boosting and greatly widened its range of applications. The book gives a fairly comprehensive and deep treatment of the currently most popular methods, so it is perhaps of even greater reference value to practitioners. It not only summarizes techniques that have matured but also gives concise accounts of topics still under development, which lets readers feel that machine learning remains a very active research field; even academic researchers should find something new each time they read it.
2009,Algebraic Geometry and Statistical Learning Theory by Sumio Watanabe                                                                
2012,统计学习方法 (Statistical Learning Methods) by Hang Li (the author is one of the leading figures in machine learning in China; he was a senior researcher at MSRA and is now at Huawei's Noah's Ark Lab. The book covers ten algorithms, each introduced crisply and going straight to the formulas; it is a thoroughly "no-fluff" book. The references at the end of each chapter also make it easy for readers who want a deeper understanding of an algorithm to go straight to the classic papers.)
2012,Machine Learning: A Probabilistic Perspective by Kevin P. Murphy
2013,Machine Learning with R by Brett Lantz
2013,Probability for Statistics and Machine Learning by Anirban DasGupta (covers just about all the probability theory needed for statistical learning)
2013,An Introduction to Statistical Learning: with Applications in R by Gareth James
2014,Applied Linear Regression, 4th Edition by Sanford Weisberg (Chapter 10)


      Below is a list of highly cited papers related to Lasso variable selection (Google Scholar citation count above 100):
       The Lasso was proposed by Tibshirani in a 1996 JRSS-B paper:
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society. Series B (Methodological), 267-288. 被引用次数:13667
     

Lasso already is in the statistical mainstream. Look it up on Google Scholar and the 1996 lasso paper has over 13000 citations, with dozens of other papers on lasso having thousands of citations each. Lasso is huge.


   The full name of the Lasso is the Least Absolute Shrinkage and Selection Operator. Its idea can be expressed as the following optimization problem:


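In the usual notation, with response vector y = (y_1, ..., y_n) and standardized predictors x_{ij}, the Lasso solves

\hat{\beta}^{\text{lasso}} = \arg\min_{\beta} \sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \Big)^2 \quad \text{subject to} \quad \sum_{j=1}^{p} |\beta_j| \le t,

or equivalently, in Lagrangian form,

\hat{\beta}^{\text{lasso}}(\lambda) = \arg\min_{\beta} \Big\{ \tfrac{1}{2n} \| y - X\beta \|_2^2 + \lambda \| \beta \|_1 \Big\}.

The L1 constraint both shrinks the coefficients and sets some of them exactly to zero, which is what turns the Lasso into a variable-selection method: the smaller t (the larger lambda), the fewer variables are kept.

As a quick illustration (a minimal sketch in Python with scikit-learn, on simulated data rather than anything from the papers below; the variable names here are only for the example), cross-validated Lasso recovers a sparse coefficient vector even when p is larger than n:

import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(0)
n, p, s = 100, 500, 5                      # n samples, p predictors, s truly active
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:s] = [3.0, -2.0, 1.5, -1.0, 0.5]     # sparse true coefficient vector
y = X @ beta + rng.standard_normal(n)      # noisy linear response

# LassoCV chooses the penalty level (alpha, i.e. lambda above) by 5-fold cross-validation
model = LassoCV(cv=5).fit(X, y)
selected = np.flatnonzero(model.coef_)     # indices with nonzero estimated coefficients

print("chosen penalty:", model.alpha_)
print("selected variables:", selected)

With a signal this strong, the selected set typically contains the five true variables plus at most a handful of false positives; methods such as the adaptive lasso, SCAD, and MCP in the list below were proposed precisely to reduce such false selections and the bias of the plain L1 penalty.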
Highly cited variable-selection papers before Tibshirani (1996) proposed the Lasso
Akaike, H. (1973), "Information theory and an extension of the maximum likelihood principle", in Petrov, B.N.; Csáki, F., 2nd International Symposium on Information Theory, Tsahkadsor, Armenia, USSR, September 2-8, 1971, Budapest: Akadémiai Kiadó, p. 267-281.(AIC准则) 被引用次数:14906
Mallows, C. L. (1973). Some comments on Cp. Technometrics, 15(4), 661-675. (MallowsCp)被引用次数:3336

Schwarz, Gideon E. (1978), Estimating the dimension of a model, Annals of Statistics 6 (2): 461–464 (BIC准则) 被引用次数:24512
Frank, L. E., & Friedman, J. H. (1993). A statistical view of some chemometrics regression tools. Technometrics, 35(2), 109-135. (桥估计)被引用次数:1630

Breiman, L. (1995). Better subset regression using the nonnegative garrote. Technometrics, 37(4), 373-384.被引用次数:737
Mallows, C. L. (1995). More comments on Cp. Technometrics, 37(4), 362-372.被引用次数:127

Highly cited papers after Tibshirani (1996) proposed the Lasso
1-10
Efron, B., Hastie, T., Johnstone, I., & Tibshirani, R. (2004). Least angle regression. The Annals of statistics, 32(2), 407-499.(提出最小角回归方法) 被引用次数:5125
Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67(2), 301-320. (提出 elastic net)被引用次数:3872
Fan, J., & Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American statistical Association, 96(456), 1348-1360. (提出SCAD)被引用次数:2888
Yuan, M., & Lin, Y. (2006). Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 68(1), 49-67. (提出Group lasso) 被引用次数:2686
Zou, H. (2006). The adaptive lasso and its oracle properties. Journal of the American statistical association, 101(476), 1418-1429. (提出adaptive lasso ) 被引用次数:2303
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization paths for generalized linear models via coordinate descent. Journal of statistical software, 33(1), 1. 被引用次数:2207
Candes, E., & Tao, T. (2007). The Dantzig selector: statistical estimation when p is much larger than n. The Annals of Statistics, 2313-2351. (Dantzig selector) 被引用次数:1893
Meinshausen, N., & Bühlmann, P. (2006). High-dimensional graphs and variable selection with the lasso. The Annals of Statistics, 1436-1462.(lasso in graphs model) 被引用次数:1489
Zhao, P., & Yu, B. (2006). On model selection consistency of Lasso. The Journal of Machine Learning Research, 7, 2541-2563. (consistency of Lasso) 被引用次数:1241
Zou, H., Hastie, T., & Tibshirani, R. (2006). Sparse principal component analysis. Journal of computational and graphical statistics, 15(2), 265-286. (稀疏主成分分析) 被引用次数:1176
11-20
Friedman, J., Hastie, T., Höfling, H., & Tibshirani, R. (2007). Pathwise coordinate optimization. The Annals of Applied Statistics, 1(2), 302-332. 被引用次数:1024
Bickel, P. J., Ritov, Y. A., & Tsybakov, A. B. (2009). Simultaneous analysis of Lasso and Dantzig selector. The Annals of Statistics, 1705-1732. 被引用次数:970
Tibshirani, R., Saunders, M., Rosset, S., Zhu, J., & Knight, K. (2005). Sparsity and smoothness via the fused lasso. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67(1), 91-108.(提出Fused LASSO) 被引用次数:935

Park, T., & Casella, G. (2008). The bayesian lasso. Journal of the American Statistical Association, 103(482), 681-686. (贝叶斯lasso)被引用次数:886
Knight, K., & Fu, W. (2000). Asymptotics for lasso-type estimators. Annals of statistics, 1356-1378.(lasso渐进性质的必读论文) 被引用次数:774
Breiman, L. (1995). Better subset regression using the nonnegative garrote. Technometrics, 37(4), 373-384. 被引用次数:737
Fu, W. J. (1998). Penalized regressions: the bridge versus the lasso. Journal of computational and graphical statistics, 7(3), 397-416. 被引用次数:703
Meier, L., Van De Geer, S., & Bühlmann, P. (2008). The group lasso for logistic regression. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 70(1), 53-71. 被引用次数:709
Fan, J., & Lv, J. (2008). Sure independence screening for ultrahigh dimensional feature space. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 70(5), 849-911.(SIS方法) 被引用次数:713
Wainwright, M. J. (2009). Sharp thresholds for high-dimensional and noisy sparsity recovery using ℓ1-constrained quadratic programming (Lasso). Information Theory, IEEE Transactions on, 55(5), 2183-2202. 被引用次数:686
21-30

Schäfer, J., & Strimmer, K. (2005). A shrinkage approach to large-scale covariance matrix estimation and implications for functional genomics. Statistical applications in genetics and molecular biology, 4(1).被引用次数:678
Tibshirani, R. (1997). The lasso method for variable selection in the Cox model. Statistics in medicine, 16(4), 385-395.被引用次数:680
Zhu, J., Rosset, S., Hastie, T., & Tibshirani, R. (2004). 1-norm support vector machines. Advances in neural information processing systems, 16(1), 49-56.被引用次数:635
Meinshausen, N., & Bühlmann, P. (2010). Stability selection. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 72(4), 417-473.被引用次数:672
Arlot, S., & Celisse, A. (2010). A survey of cross-validation procedures for model selection. Statistics surveys, 4, 40-79.被引用次数:647
Yuan, M., & Lin, Y. (2007). Model selection and estimation in the Gaussian graphical model. Biometrika, 94(1), 19-35.被引用次数:608
Zhang, C. H. (2010). Nearly unbiased variable selection under minimax concave penalty. The Annals of Statistics, 894-942(提出MCP).被引用次数:588
Zou, H., & Li, R. (2008). One-step sparse estimates in nonconcave penalized likelihood models. Annals of statistics, 36(4), 1509.被引用次数:565
Park, M. Y., & Hastie, T. (2007). L1‐regularization path algorithm for generalized linear models. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 69(4), 659-677.被引用次数:554
Osborne, M. R., Presnell, B., & Turlach, B. A. (2000). On the lasso and its dual. Journal of Computational and Graphical statistics, 9(2), 319-337.被引用次数:545
Hastie, T., Rosset, S., Tibshirani, R., & Zhu, J. (2004). The entire regularization path for the support vector machine. The Journal of Machine Learning Research, 5, 1391-1415.被引用次数:520
31-40
Koenker, R. (2004). Quantile regression for longitudinal data. Journal of Multivariate Analysis, 91(1), 74-89.被引用次数:502
Meinshausen, N., & Yu, B. (2009). Lasso-type recovery of sparse representations for high-dimensional data. The Annals of Statistics, 246-270. 被引用次数:485
Fan, J., & Peng, H. (2004). Nonconcave penalized likelihood with a diverging number of parameters. The Annals of Statistics, 32(3), 928-961.被引用次数:483
Zou, H., Hastie, T., & Tibshirani, R. (2007). On the “degrees of freedom” of the lasso. The Annals of Statistics, 35(5), 2173-2192.被引用次数:479
Bach, F. R. (2008). Consistency of the group lasso and multiple kernel learning. The Journal of Machine Learning Research, 9, 1179-1225.被引用次数:476
Blei, D. M., & Lafferty, J. D. (2007). A correlated topic model of science. The Annals of Applied Statistics, 17-35. 被引用次数:477
Antoniadis, A., & Fan, J. (2011). Regularization of wavelet approximations. Journal of the American Statistical Association. 被引用次数:467
Genkin, A., Lewis, D. D., & Madigan, D. (2007). Large-scale Bayesian logistic regression for text categorization. Technometrics, 49(3), 291-304. 被引用次数:455
Hofmann, T., Schölkopf, B., & Smola, A. J. (2008). Kernel methods in machine learning. The annals of statistics, 1171-1220.被引用次数:451
Portnoy, S., & Koenker, R. (1997). The Gaussian hare and the Laplacian tortoise: computability of squared-error versus absolute-error estimators. Statistical Science, 12(4), 279-300.被引用次数:415
41-50
Bühlmann, P., & Hothorn, T. (2007). Boosting algorithms: Regularization, prediction and model fitting. Statistical Science, 477-505.被引用次数:407
Zhang, C. H., & Huang, J. (2008). The sparsity and bias of the lasso selection in high-dimensional linear regression. The Annals of Statistics, 1567-1594.被引用次数:407

Jacob, L., Obozinski, G., & Vert, J. P. (2009, June). Group lasso with overlap and graph lasso. In Proceedings of the 26th annual international conference on machine learning (pp. 433-440). ACM.被引用次数:405
Koh, K., Kim, S. J., & Boyd, S. P. (2007). An Interior-Point Method for Large-Scale l1-Regularized Logistic Regression. Journal of Machine learning research, 8(8), 1519-1555.被引用次数:420
Wu, T. T., & Lange, K. (2008). Coordinate descent algorithms for lasso penalized regression. The Annals of Applied Statistics, 224-244.被引用次数:385
Jolliffe, I. T., Trendafilov, N. T., & Uddin, M. (2003). A modified principal component technique based on the LASSO. Journal of computational and Graphical Statistics, 12(3), 531-547.
Van de Geer, S. A. (2008). High-dimensional generalized linear models and the lasso. The Annals of Statistics, 614-645.被引用次数:357
Bair, E., Hastie, T., Paul, D., & Tibshirani, R. (2006). Prediction by supervised principal components. Journal of the American Statistical Association, 101(473).被引用次数:356
Candès, E. J., & Plan, Y. (2009). Near-ideal model selection by ℓ1 minimization. The Annals of Statistics, 37(5A), 2145-2177.被引用次数:345
Ramsay, J. O., Hooker, G., Campbell, D., & Cao, J. (2007). Parameter estimation for differential equations: a generalized smoothing approach. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 69(5), 741-796.被引用次数:341
51-60
Wang, H., Li, R., & Tsai, C. L. (2007). Tuning parameter selectors for the smoothly clipped absolute deviation method. Biometrika, 94(3), 553-568.被引用次数:329
Rosset, S., & Zhu, J. (2007). Piecewise linear regularized solution paths. The Annals of Statistics, 1012-1030.被引用次数:328
Zhao, P., Rocha, G., & Yu, B. (2009). The composite absolute penalties family for grouped and hierarchical variable selection. The Annals of Statistics, 3468-3497.(提出CAP方法)被引用次数:331
Bunea, F., Tsybakov, A., & Wegkamp, M. (2007). Sparsity oracle inequalities for the Lasso. Electronic Journal of Statistics, 1, 169-194.被引用次数:321
Wu, T. T., Chen, Y. F., Hastie, T., Sobel, E., & Lange, K. (2009). Genome-wide association analysis by lasso penalized logistic regression. Bioinformatics, 25(6), 714-721.被引用次数:321
Kuo, L., & Mallick, B. (1998). Variable selection for regression models. Sankhyā: The Indian Journal of Statistics, Series B, 65-81.被引用次数:314
Leeb, H., & Pötscher, B. M. (2005). Model selection and inference: Facts and fiction. Econometric Theory, 21(01), 21-59.被引用次数:298
Fan, J., & Li, R. (2002). Variable selection for Cox's proportional hazards model and frailty model. Annals of Statistics, 74-99.被引用次数:303
Negahban, S., Yu, B., Wainwright, M. J., & Ravikumar, P. K. (2009). A unified framework for high-dimensional analysis of M-estimators with decomposable regularizers. In Advances in Neural Information Processing Systems (pp. 1348-1356).被引用次数:319
Jenatton, R., Audibert, J. Y., & Bach, F. (2011). Structured variable selection with sparsity-inducing norms. The Journal of Machine Learning Research, 12, 2777-2824.被引用次数:301
61-70
Chen, J., & Chen, Z. (2008). Extended Bayesian information criteria for model selection with large model spaces. Biometrika, 95(3), 759-771.被引用次数:312
Fan, J., & Li, R. (2004). New estimation and model selection procedures for semiparametric modeling in longitudinal data analysis.Journal of the American Statistical Association, 99(467), 710-723.被引用次数:291
Fan, J., & Lv, J. (2010). A selective overview of variable selection in high dimensional feature space. Statistica Sinica, 20(1), 101.被引用次数:294
Sauerbrei, W., & Royston, P. (1999). Building multivariable prognostic and diagnostic models: transformation of the predictors by using fractional polynomials. Journal of the Royal Statistical Society. Series A (Statistics in Society), 71-94.被引用次数:286
Guo, Y., Hastie, T., & Tibshirani, R. (2007). Regularized linear discriminant analysis and its application in microarrays. Biostatistics, 8(1), 86-100.被引用次数:283
Wainwright, M. J. (2009). Information-theoretic limits on sparsity recovery in the high-dimensional and noisy setting. Information Theory, IEEE Transactions on, 55(12), 5728-5741.被引用次数:285
George, E. I. (2000). The variable selection problem. Journal of the American Statistical Association, 95(452), 1304-1308.被引用次数:272
Huang, J., Ma, S., & Zhang, C. H. (2008). Adaptive Lasso for sparse high-dimensional regression models. Statistica Sinica, 18(4), 1603.被引用次数:275
Shen, H., & Huang, J. Z. (2008). Sparse principal component analysis via regularized low rank matrix approximation. Journal of multivariate analysis, 99(6), 1015-1034.被引用次数:280
Friedman, J. H., & Popescu, B. E. (2008). Predictive learning via rule ensembles. The Annals of Applied Statistics, 916-954.被引用次数:265
71-80
Shevade, S. K., & Keerthi, S. S. (2003). A simple and efficient algorithm for gene selection using sparse logistic regression. Bioinformatics, 19(17), 2246-2253.被引用次数:265
Buehlmann, P. (2006). Boosting for high-dimensional linear models. The Annals of Statistics, 559-583.被引用次数:264
Hunter, D. R., & Li, R. (2005). Variable selection using MM algorithms. Annals of statistics, 33(4), 1617.被引用次数:262
Ravikumar, P., Lafferty, J., Liu, H., & Wasserman, L. (2009). Sparse additive models. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 71(5), 1009-1030.被引用次数:260
Zhang, H. H., & Lu, W. (2007). Adaptive Lasso for Cox's proportional hazards model. Biometrika, 94(3), 691-703.被引用次数:255
Huang, J. Z., Liu, N., Pourahmadi, M., & Liu, L. (2006). Covariance matrix selection and estimation via penalised normal likelihood. Biometrika, 93(1), 85-98.被引用次数:255
Lange, N., & Zeger, S. L. (1997). Non‐linear Fourier Time Series Analysis for Human Brain Mapping by Functional Magnetic Resonance Imaging. Journal of the Royal Statistical Society: Series C (Applied Statistics), 46(1), 1-29.被引用次数:252

Obozinski, G., Taskar, B., & Jordan, M. I. (2010). Joint covariate selection and joint subspace selection for multiple classification problems. Statistics and Computing, 20(2), 231-252.被引用次数:255
Greenshtein, E., & Ritov, Y. A. (2004). Persistence in high-dimensional linear predictor selection and the virtue of overparametrization. Bernoulli, 10(6), 971-988.被引用次数:252
Huang, J., Horowitz, J. L., & Ma, S. (2008). Asymptotic properties of bridge estimators in sparse high-dimensional regression models. The Annals of Statistics, 587-613.被引用次数:253
81-90
Bunea, F., Tsybakov, A. B., & Wegkamp, M. H. (2007). Aggregation for Gaussian regression. The Annals of Statistics, 35(4), 1674-1697.被引用次数:243
Wand, M. P. (2003). Smoothing and mixed models. Computational statistics, 18(2), 223-249.被引用次数:238
Bai, J., & Ng, S. (2008). Forecasting economic time series using targeted predictors. Journal of Econometrics, 146(2), 304-317.被引用次数:238
Peng, J., Wang, P., Zhou, N., & Zhu, J. (2009). Partial correlation estimation by joint sparse regression models. Journal of the American Statistical Association, 104(486).被引用次数:243
Huang, J., Zhang, T., & Metaxas, D. (2011). Learning with structured sparsity. The Journal of Machine Learning Research, 12, 3371-3412.被引用次数:243
Mazumder, R., Hastie, T., & Tibshirani, R. (2010). Spectral regularization algorithms for learning large incomplete matrices. The Journal of Machine Learning Research, 11, 2287-2322.被引用次数:241
Li, C., & Li, H. (2008). Network-constrained regularization and variable selection for analysis of genomic data. Bioinformatics, 24(9), 1175-1182.被引用次数:239
Zou, H., & Zhang, H. H. (2009). On the adaptive elastic-net with a diverging number of parameters. Annals of statistics, 37(4), 1733.被引用次数:239
Meinshausen, N. (2007). Relaxed lasso. Computational Statistics & Data Analysis, 52(1), 374-393.(提出Relaxed LASSO)被引用次数:238
De Mol, C., Giannone, D., & Reichlin, L. (2008). Forecasting using a large number of predictors: Is Bayesian shrinkage a valid alternative to principal components?. Journal of Econometrics, 146(2), 318-328.被引用次数:235
91-100
O'Hara, R. B., & Sillanpää, M. J. (2009). A review of Bayesian variable selection methods: what, how and which. Bayesian analysis, 4(1), 85-117.被引用次数:234
Turlach, B. A., Venables, W. N., & Wright, S. J. (2005). Simultaneous variable selection. Technometrics, 47(3), 349-363.被引用次数:233
Yuan, M., & Lin, Y. (2007). On the non‐negative garrotte estimator. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 69(2), 143-161.被引用次数:218
Gui, J., & Li, H. (2005). Penalized Cox regression analysis in the high-dimensional and low-sample size settings, with applications to microarray gene expression data. Bioinformatics, 21(13), 3001-3008.被引用次数:216
Goeman, J. J. (2010). L1 penalized estimation in the cox proportional hazards model. Biometrical Journal, 52(1), 70-84.被引用次数:223
Steyerberg, E. W., Borsboom, G. J., van Houwelingen, H. C., Eijkemans, M. J., & Habbema, J. D. F. (2004). Validation and updating of predictive logistic regression models: a study on sample size and shrinkage. Statistics in medicine, 23(16), 2567-2586.被引用次数:217
Kadane, J. B., & Lazar, N. A. (2004). Methods and criteria for model selection. Journal of the American statistical Association, 99(465), 279-290.被引用次数:218
Aliferis, C. F., Statnikov, A., Tsamardinos, I., Mani, S., & Koutsoukos, X. D. (2010). Local causal and markov blanket induction for causal discovery and feature selection for classification part i: Algorithms and empirical evaluation. The Journal of Machine Learning Research, 11, 171-234.被引用次数:214
Bondell, H. D., & Reich, B. J. (2008). Simultaneous regression shrinkage, variable selection, and supervised clustering of predictors with OSCAR. Biometrics, 64(1), 115-123.被引用次数:209
Fan, J., & Li, R. (2006). Statistical challenges with high dimensionality: feature selection in knowledge discovery. In Proceedings oh the International Congress of Mathematicians: Madrid, August 22-30, 2006: invited lectures (pp. 595-622).被引用次数:208

101-110
Journée, M., Nesterov, Y., Richtárik, P., & Sepulchre, R. (2010). Generalized power method for sparse principal component analysis. The Journal of Machine Learning Research, 11, 517-553.被引用次数:206
Chun, H., & Keleş, S. (2010). Sparse partial least squares regression for simultaneous dimension reduction and variable selection. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 72(1), 3-25.被引用次数:206
Sardy, S., Bruce, A. G., & Tseng, P. (2000). Block coordinate relaxation methods for nonparametric wavelet denoising. Journal of computational and graphical statistics, 9(2), 361-379.被引用次数:203
Seeger, M. W. (2008). Bayesian inference and optimal design for the sparse linear model. The Journal of Machine Learning Research, 9, 759-813.被引用次数:203
Tibshirani, R., & Wang, P. (2008). Spatial smoothing and hot spot detection for CGH data using the fused lasso. Biostatistics, 9(1), 18-29.被引用次数:204
Leng, C., Lin, Y., & Wahba, G. (2006). A note on the lasso and related procedures in model selection. Statistica Sinica, 16(4), 1273.被引用次数:196
Wasserman, L., & Roeder, K. (2009). High dimensional variable selection.Annals of statistics, 37(5A), 2178.被引用次数:193
Meier, L., Van de Geer, S., & Bühlmann, P. (2009). High-dimensional additive modeling. The Annals of Statistics, 37(6B), 3779-3821.被引用次数:192
Li, R., & Liang, H. (2008). Variable selection in semiparametric regression modeling. Annals of Statistics, 36(1), 261.被引用次数:191
Zhang, H. H., Ahn, J., Lin, X., & Park, C. (2006). Gene selection using support vector machines with non-convex penalty. Bioinformatics, 22(1), 88-95.被引用次数:186
111-120
Lin, Y., & Zhang, H. H. (2006). Component selection and smoothing in multivariate nonparametric regression. The Annals of Statistics, 34(5), 2272-2297.被引用次数:185
d'Aspremont, A., Bach, F., & Ghaoui, L. E. (2008). Optimal solutions for sparse principal component analysis. The Journal of Machine Learning Research, 9, 1269-1294.被引用次数:187
Clemmensen, L., Hastie, T., Witten, D., & Ersbøll, B. (2011). Sparse discriminant analysis. Technometrics, 53(4).被引用次数:186
Wang, H., Li, G., & Jiang, G. (2007). Robust regression shrinkage and consistent variable selection through the LAD-Lasso. Journal of Business & Economic Statistics, 25(3), 347-355.被引用次数:181
Sauerbrei, W. (1999). The use of resampling methods to simplify regression models in medical statistics. Journal of the Royal Statistical Society: Series C (Applied Statistics), 48(3), 313-329.被引用次数:181
Carvalho, C. M., Polson, N. G., & Scott, J. G. (2010). The horseshoe estimator for sparse signals. Biometrika, asq017.被引用次数:179
Barron, A. R., Cohen, A., Dahmen, W., & DeVore, R. A. (2008). Approximation and learning by greedy algorithms. The annals of statistics, 64-94.被引用次数:177
Breheny, P., & Huang, J. (2011). Coordinate descent algorithms for nonconvex penalized regression, with applications to biological feature selection. The annals of applied statistics, 5(1), 232.被引用次数:179
Liu, H., Lafferty, J., & Wasserman, L. (2009). The nonparanormal: Semiparametric estimation of high dimensional undirected graphs. The Journal of Machine Learning Research, 10, 2295-2328.被引用次数:172
Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. (2014). Dropout: A simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research, 15(1), 1929-1958.被引用次数:173
121-130
Wang, H., & Leng, C. (2007). Unified LASSO estimation by least squares approximation. Journal of the American Statistical Association, 102(479).被引用次数:168
Huang, J., Horowitz, J. L., & Wei, F. (2010). Variable selection in nonparametric additive models. Annals of statistics, 38(4), 2282.被引用次数:166
Raskutti, G., Wainwright, M. J., & Yu, B. (2011). Minimax rates of estimation for high-dimensional linear regression over ℓq-balls. Information Theory, IEEE Transactions on, 57(10), 6976-6994.被引用次数:162
Kyung, M., Gill, J., Ghosh, M., & Casella, G. (2010). Penalized regression, standard errors, and Bayesian lassos. Bayesian Analysis, 5(2), 369-411.被引用次数:159
Teo, C. H., Vishwanthan, S. V. N., Smola, A. J., & Le, Q. V. (2010). Bundle methods for regularized risk minimization. The Journal of Machine Learning Research, 11, 311-365.被引用次数:157
Zou, H., & Yuan, M. (2008). Composite quantile regression and the oracle model selection theory. The Annals of Statistics, 1108-1126.被引用次数:157
Hesterberg, T., Choi, N. H., Meier, L., & Fraley, C. (2008). Least angle and ℓ1 penalized regression: A review. Statistics Surveys, 2, 61-93.被引用次数:154
Hans, C. (2009). Bayesian lasso regression. Biometrika, 96(4), 835-845.被引用次数:156
Negahban, S., & Wainwright, M. J. (2011). Estimation of (near) low-rank matrices with noise and high-dimensional scaling. The Annals of Statistics, 1069-1097.被引用次数:156
Fukumizu K, Bach F R, Jordan M I. Kernel dimension reduction in regression[J]. The Annals of Statistics, 2009: 1871-1905.被引用次数:152
131-140
Shalev-Shwartz, S., & Tewari, A. (2011). Stochastic methods for l 1-regularized loss minimization. The Journal of Machine Learning Research, 12, 1865-1892.被引用次数:152
Fan, J., & Song, R. (2010). Sure independence screening in generalized linear models with NP-dimensionality. The Annals of Statistics, 38(6), 3567-3604.被引用次数:152
Rothman, A. J., Levina, E., & Zhu, J. (2009). Generalized thresholding of large covariance matrices. Journal of the American Statistical Association, 104(485), 177-186.被引用次数:151
Lounici, K. (2008). Sup-norm convergence rate and sign concentration property of Lasso and Dantzig estimators. Electronic Journal of statistics, 2, 90-102.被引用次数:150
Mazumder, R., Friedman, J. H., & Hastie, T. (2011). SparseNet: Coordinate descent with nonconvex penalties. Journal of the American Statistical Association, 106(495).被引用次数:149
Fan, J. (1997). Comments on «Wavelets in statistics: A review» by A. Antoniadis. Journal of the Italian Statistical Society, 6(2), 131-138.被引用次数:149
Chen, Z., & Dunson, D. B. (2003). Random effects selection in linear mixed models. Biometrics, 59(4), 762-769.被引用次数:143
Wu, Y., & Liu, Y. (2009). Variable selection in quantile regression. Statistica Sinica, 19(2), 801.被引用次数:148
Wang, L., Li, H., & Huang, J. Z. (2008). Variable selection in nonparametric varying-coefficient models for analysis of repeated measurements. Journal of the American Statistical Association, 103(484), 1556-1569.被引用次数:139
Yuan, M., & Lin, Y. (2005). Efficient empirical Bayes variable selection and estimation in linear models. Journal of the American Statistical Association, 100(472).被引用次数:138
141-150
Lin, Y., & Zhang, H. H. (2006). Component selection and smoothing in smoothing spline analysis of variance models. Annals of Statistics, 34(5), 2272-2297.被引用次数:135
Pan, W., & Shen, X. (2007). Penalized model-based clustering with application to variable selection. The Journal of Machine Learning Research, 8, 1145-1164.被引用次数:136
Fan, J., Samworth, R., & Wu, Y. (2009). Ultrahigh dimensional feature selection: beyond the linear model. The Journal of Machine Learning Research, 10, 2013-2038.被引用次数:137
Belloni, A., & Chernozhukov, V. (2011). ℓ1-penalized quantile regression in high-dimensional sparse models. The Annals of Statistics, 39(1), 82-130.被引用次数:142
Fan, J., & Lv, J. (2011). Nonconcave penalized likelihood with NP-dimensionality. Information Theory, IEEE Transactions on, 57(8), 5467-5484.被引用次数:142
Lv, J., & Fan, Y. (2009). A unified approach to model selection and sparse recovery using regularized least squares. The Annals of Statistics, 3498-3528.被引用次数:138
Wang, H., Li, G., & Tsai, C. L. (2007). Regression coefficient and autoregressive order shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 69(1), 63-78.被引用次数:136
Kim, Y., Kim, J., & Kim, Y. (2006). Blockwise sparse regression. Statistica Sinica, 16(2), 375.被引用次数:131
Wang, L., Zhu, J., & Zou, H. (2006). The doubly regularized support vector machine. Statistica Sinica, 16(2), 589.被引用次数:133
Sauerbrei, W., Royston, P., & Binder, H. (2007). Selection of important variables and determination of functional form for continuous predictors in multivariable model building. Statistics in medicine, 26(30), 5512-5528.被引用次数:130
151-160
Chen, X., Lin, Q., Kim, S., Carbonell, J. G., & Xing, E. P. (2012). Smoothing proximal gradient method for general structured sparse regression. The Annals of Applied Statistics, 6(2), 719-752.被引用次数:136
Bach, F. R. (2008). Consistency of trace norm minimization. The Journal of Machine Learning Research, 9, 1019-1048.被引用次数:129
Fan, J., Feng, Y., & Wu, Y. (2009). Network exploration via the adaptive LASSO and SCAD penalties. The annals of applied statistics, 3(2), 521.被引用次数:131
Zhang, N. R., & Siegmund, D. O. (2007). A modified Bayes information criterion with applications to the analysis of comparative genomic hybridization data. Biometrics, 63(1), 22-32.被引用次数:130
Li, H., & Gui, J. (2006). Gradient directed regularization for sparse Gaussian concentration graphs, with applications to inference of genetic networks. Biostatistics, 7(2), 302-317.被引用次数:128
Biau, G. (2012). Analysis of a random forests model. The Journal of Machine Learning Research, 13(1), 1063-1095.被引用次数:134
Zhao, P., Rocha, G., & Yu, B. (2006). Grouped and hierarchical model selection through composite absolute penalties. Department of Statistics, UC Berkeley, Tech. Rep, 703.被引用次数:127
Huang, J., Ma, S., Xie, H., & Zhang, C. H. (2009). A group bridge approach for variable selection. Biometrika, 96(2), 339-355.被引用次数:127
Wang, H., Li, B., & Leng, C. (2009). Shrinkage tuning parameter selection with a diverging number of parameters. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 71(3), 671-683.被引用次数:130
Levina, E., Rothman, A., & Zhu, J. (2008). Sparse estimation of large covariance matrices via a nested Lasso penalty. The Annals of Applied Statistics, 2(1), 245-263.被引用次数:126
161-170
Xu, S. (2007). An empirical Bayes method for estimating epistatic effects of quantitative trait loci. Biometrics, 63(2), 513-521.被引用次数:125
Kanamori, T., Hido, S., & Sugiyama, M. (2009). A least-squares approach to direct importance estimation. The Journal of Machine Learning Research, 10, 1391-1445.被引用次数:133
Xu, Z., Zhang, H., Wang, Y., Chang, X., & Liang, Y. (2010). L1/2 regularization. Science China Information Sciences, 53(6), 1159-1169.被引用次数:127
James, G. M., Radchenko, P., & Lv, J. (2009). DASSO: connections between the Dantzig selector and lasso. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 71(1), 127-142.被引用次数:127
Griffin, J. E., & Brown, P. J. (2010). Inference with normal-gamma prior distributions in regression problems. Bayesian Analysis, 5(1), 171-188.被引用次数:127
Simon, N., Friedman, J., Hastie, T., & Tibshirani, R. (2013). A sparse-group lasso. Journal of Computational and Graphical Statistics, 22(2), 231-245.被引用次数:130
Yuan, G. X., Chang, K. W., Hsieh, C. J., & Lin, C. J. (2010). A comparison of optimization methods and software for large-scale l1-regularized linear classification. The Journal of Machine Learning Research, 11, 3183-3234.被引用次数:124
Wang, L., Zhu, J., & Zou, H. (2008). Hybrid huberized support vector machines for microarray classification and gene selection. Bioinformatics, 24(3), 412-419.被引用次数:124
Wang, H., & Xia, Y. (2009). Shrinkage estimation of the varying coefficient model. Journal of the American Statistical Association, 104(486).被引用次数:124
Simon, N., Friedman, J., Hastie, T., & Tibshirani, R. (2011). Regularization paths for Cox’s proportional hazards model via coordinate descent. Journal of statistical software, 39(5), 1-13.被引用次数:126
171-180
Yuan, M., Ekici, A., Lu, Z., & Monteiro, R. (2007). Dimension reduction and coefficient estimation in multivariate linear regression. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 69(3), 329-346.被引用次数:127
Belloni, A., Chernozhukov, V., & Wang, L. (2011). Square-root lasso: pivotal recovery of sparse signals via conic programming. Biometrika, 98(4), 791-806.被引用次数:124
Raskutti, G., Wainwright, M. J., & Yu, B. (2010). Restricted eigenvalue properties for correlated Gaussian designs. The Journal of Machine Learning Research, 11, 2241-2259.被引用次数:121
van Houwelingen, H. C., Bruinsma, T., Hart, A. A., van't Veer, L. J., & Wessels, L. F. (2006). Cross‐validated Cox regression on microarray gene expression data. Statistics in medicine, 25(18), 3201-3216.被引用次数:119
Cai, T. T., Xu, G., & Zhang, J. (2009). On recovery of sparse signals via ℓ1 minimization. Information Theory, IEEE Transactions on, 55(7), 3388-3397.被引用次数:119
Donoho, D. L., Maleki, A., & Montanari, A. (2011). The noise-sensitivity phase transition in compressed sensing. Information Theory, IEEE Transactions on, 57(10), 6920-6941.被引用次数:119
Hansen, M. H., & Kooperberg, C. (2002). Spline Adaptation in Extended Linear Models (with comments and a rejoinder by the authors). Statistical Science, 17(1), 2-51.被引用次数:117
Witten, D. M., & Tibshirani, R. (2009). Covariance‐regularized regression and classification for high dimensional problems. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 71(3), 615-636.被引用次数:117
Peng, J., Zhu, J., Bergamaschi, A., Han, W., Noh, D. Y., Pollack, J. R., & Wang, P. (2010). Regularized multivariate regression for identifying master predictors with application to integrative genomics study of breast cancer. The annals of applied statistics, 4(1), 53.被引用次数:115
Li, Y., & Zhu, J. (2008). L1-norm quantile regression. Journal of Computational and Graphical Statistics, 17(1).被引用次数:115
181-190
Belloni, A., Chen, D., Chernozhukov, V., & Hansen, C. (2012). Sparse models and methods for optimal instruments with an application to eminent domain. Econometrica, 80(6), 2369-2429.被引用次数:117
Yuan, M. (2010). High dimensional inverse covariance matrix estimation via linear programming. The Journal of Machine Learning Research, 11, 2261-2286.被引用次数:116
Goeman, J. J., Van De Geer, S. A., & Van Houwelingen, H. C. (2006). Testing against a high dimensional alternative. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 68(3), 477-493.被引用次数:114
Lockhart, R., Taylor, J., Tibshirani, R. J., & Tibshirani, R. (2014). A significance test for the lasso. Annals of statistics, 42(2), 413.被引用次数:113
Hastie, T., Taylor, J., Tibshirani, R., & Walther, G. (2007). Forward stagewise regression and the monotone lasso. Electronic Journal of Statistics, 1, 1-29.被引用次数:113
Koltchinskii, V. (2009). Sparsity in penalized empirical risk minimization. Annales de l’Institut Henri Poincaré - Probabilités et Statistiques (Vol. 45, No. 1, pp. 7-57).被引用次数:112
Fan, J., Feng, Y., & Song, R. (2011). Nonparametric independence screening in sparse ultra-high-dimensional additive models. Journal of the American Statistical Association, 106(494).被引用次数:112
Witten, D. M., & Tibshirani, R. (2011). Penalized classification using Fisher's linear discriminant. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 73(5), 753-772.被引用次数:112
Dudík, M., Phillips, S. J., & Schapire, R. E. (2007). Maximum entropy density estimation with generalized regularization and an application to species distribution modeling. Journal of Machine Learning Research, 8(6).被引用次数:111
Tsiatis, A. A., Davidian, M., Zhang, M., & Lu, X. (2008). Covariate adjustment for two‐sample treatment comparisons in randomized clinical trials: A principled yet flexible approach. Statistics in medicine, 27(23), 4658-4677.被引用次数:110
191-200
Seaman, S. R., & White, I. R. (2013). Review of inverse probability weighting for dealing with missing data. Statistical methods in medical research, 22(3), 278-295.被引用次数:109
Parkhomenko, E., Tritchler, D., & Beyene, J. (2009). Sparse canonical correlation analysis with application to genomic data integration. Statistical Applications in Genetics and Molecular Biology, 8(1), 1-34.被引用次数:107
Kim, Y., Choi, H., & Oh, H. S. (2008). Smoothly clipped absolute deviation on high dimensions. Journal of the American Statistical Association, 103(484), 1665-1673.被引用次数:108
Bickel, P. J., Li, B., Tsybakov, A. B., van de Geer, S. A., Yu, B., Valdés, T., ... & van der Vaart, A. (2006). Regularization in statistics. Test, 15(2), 271-344.被引用次数:107
Liu, H., Han, F., Yuan, M., Lafferty, J., & Wasserman, L. (2012). High-dimensional semiparametric Gaussian copula graphical models. The Annals of Statistics, 40(4), 2293-2326.被引用次数:104
Lamarche, C. (2010). Robust penalized quantile regression estimation for panel data. Journal of Econometrics, 157(2), 396-408.被引用次数:104
Li, R., & Sudjianto, A. (2012). Analysis of computer experiments using penalized likelihood in gaussian kriging models. Technometrics.被引用次数:103
Inoue, A., & Kilian, L. (2008). How useful is bagging in forecasting economic time series? A case study of US consumer price inflation. Journal of the American Statistical Association, 103(482), 511-522.被引用次数:102
Leeb, H., & Pötscher, B. M. (2008). Sparse estimators and the oracle property, or the return of Hodges’ estimator. Journal of Econometrics, 142(1), 201-211.被引用次数:101

The papers ranked beyond 200 can be found in the follow-up post linked below.



https://m.sciencenet.cn/blog-752541-912555.html

Previous post: List of newly elected 2015 IMS (Institute of Mathematical Statistics) Fellows, including five Chinese professors
Next post: Highly cited statistics papers related to Lasso variable selection (continued)

