Abstract:

Random forests are a statistical learning method that uses bootstrap resampling to draw multiple sample sets from the training data, grows a decision tree on each bootstrap sample, and combines the tree predictors by majority voting. The method is widely applied in medicine, bioinformatics, economics and other fields because of its high prediction accuracy and good tolerance of noisy data, and because, by the law of large numbers, it does not overfit as more trees are added. In this paper we first introduce the concept of random forests and the latest research, then describe some important applications in economics, and give a summary in the final section.
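
The procedure described in the abstract (bootstrap resampling, one tree per bootstrap sample, majority voting) can be illustrated with a minimal Python sketch. This is not the authors' implementation: to stay short it assumes one-split "stumps" on a randomly chosen feature in place of fully grown trees, and the names `bootstrap_sample`, `fit_stump`, `fit_forest` and `predict` are hypothetical helpers introduced only for this illustration.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement: one bootstrap training set."""
    return [rng.choice(rows) for _ in rows]


def majority(labels):
    """Return the most frequent label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


def fit_stump(rows, rng):
    """Fit a one-split tree (an assumed stand-in for a full decision tree).

    rows is a list of (features, label) pairs. The split feature is picked
    at random, and the threshold is the candidate with the fewest
    training errors on this bootstrap sample.
    """
    f = rng.randrange(len(rows[0][0]))  # random feature selection
    best = None
    for x, _ in rows:
        t = x[f]
        left = [y for xi, y in rows if xi[f] <= t]   # never empty
        right = [y for xi, y in rows if xi[f] > t]
        l_lab = majority(left)
        r_lab = majority(right) if right else l_lab
        errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
        if best is None or errors < best[0]:
            best = (errors, t, l_lab, r_lab)
    _, t, l_lab, r_lab = best
    return lambda x: l_lab if x[f] <= t else r_lab


def fit_forest(rows, n_trees=25, seed=0):
    """Grow each tree on its own bootstrap sample of the training rows."""
    rng = random.Random(seed)
    return [fit_stump(bootstrap_sample(rows, rng), rng) for _ in range(n_trees)]


def predict(forest, x):
    """Combine the tree predictors by majority voting."""
    return majority([tree(x) for tree in forest])


if __name__ == "__main__":
    # Toy data: class 0 near the origin, class 1 near (10, 10).
    rows = [((0, 1), 0), ((1, 0), 0), ((1, 1), 0),
            ((9, 10), 1), ((10, 9), 1), ((10, 10), 1)]
    forest = fit_forest(rows)
    print(predict(forest, (0, 0)), predict(forest, (10, 10)))
```

An odd number of trees is used so that a two-class vote cannot tie. A practical random forest would grow deep trees and sample a feature subset at every split rather than once per tree.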

Basic Information:

China Classification Code: O212.2

Citation Information:

[1] FANG Kuang-nan (a, b), WU Jian-bin (a), ZHU Jian-ping (a, b), SHIA Bang-chang (a, b) (a. Department of Statistics, School of Economics; b. Data Mining Center, Xiamen University, Xiamen 361005, China). A Review of Technologies on Random Forests[J]. Journal of Statistics and Information, 2011, 26(03): 32-38.

Fund Information:

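Fundamental Research Funds for the Central Universities, "Research on Data Quality Management Based on Data Mining" (2010221040); Key Project of the National Bureau of Statistics, "Statistical Methods in Financial Risk" (2009LZ045)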
