Research on Soybean Pre-Micro RNA Prediction Model Based on Recursive Feature Elimination and Random Forest Fusion Algorithm

Soybean Science (《大豆科学》) [ISSN: 1000-9841 / CN: 23-1227/S]

Issue:
2020, No. 03
Page:
401-405
Research Field:
Publishing date:

Info

Title:
Research on Soybean Pre-Micro RNA Prediction Model Based on Recursive Feature Elimination and Random Forest Fusion Algorithm
Author(s):
AN Yu, CHEN Gui-fen, LI Jing
(College of Information Technology, Jilin Agricultural University, Changchun 130118, China)
Keywords:
Soybean; Pre-MicroRNA; Recursive Feature Elimination (RFE); Random Forest (RF); Prediction model
PACS:
-
DOI:
10.11861/j.issn.1000-9841.2020.03.0401
Abstract:
As research into the biological regulatory effects of small genes in soybean deepens, using data mining to predict soybean pre-microRNAs effectively has become an important direction in this field. To address the low recognition accuracy of the conventional Random Forest (RF) algorithm in pre-microRNA prediction models, this study proposed and constructed a soybean pre-microRNA prediction model based on a fusion of Recursive Feature Elimination (RFE) and RF. First, we used RFE to select the optimal feature subset of soybean pre-microRNA sequences. Then, we constructed a soybean pre-microRNA prediction model with the RF algorithm. Finally, we compared the predictions of the RFE-RF fusion model with those of a standalone RF model and a Support Vector Machine (SVM) classification model. The accuracy of the fused model improved significantly, reaching 84.62%, which is 17.02% higher than the model built with the SVM algorithm and 14.58% higher than the model built with the RF algorithm alone. This method offers a new approach to predicting pre-microRNA genes in soybean.
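The three steps described in the abstract (RFE feature selection, an RF model, and a comparison against standalone RF and SVM) can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' implementation: the synthetic data from `make_classification`, the helper name `compare_models`, the choice of RF as the ranking estimator inside RFE, and every parameter value (number of trees, number of features kept, split ratio) are assumptions made for the example, so the printed accuracies have no relation to the figures reported above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def compare_models(X, y, n_features=10, seed=0):
    """Hypothetical helper: fit RFE-RF, plain RF, and SVM on one split.

    Returns a dict of test accuracies and the boolean mask of kept features.
    """
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed
    )

    # Step 1: RFE repeatedly fits the estimator, ranks features by
    # importance, and drops the weakest one until n_features remain.
    selector = RFE(
        estimator=RandomForestClassifier(n_estimators=50, random_state=seed),
        n_features_to_select=n_features,
        step=1,
    )
    selector.fit(X_tr, y_tr)

    # Step 2: train the final RF on the selected feature subset only.
    rfe_rf = RandomForestClassifier(n_estimators=100, random_state=seed)
    rfe_rf.fit(selector.transform(X_tr), y_tr)

    # Step 3: baselines trained on all features for comparison.
    rf = RandomForestClassifier(n_estimators=100, random_state=seed)
    rf.fit(X_tr, y_tr)
    svm = make_pipeline(StandardScaler(), SVC())
    svm.fit(X_tr, y_tr)

    scores = {
        "RFE-RF": accuracy_score(y_te, rfe_rf.predict(selector.transform(X_te))),
        "RF": accuracy_score(y_te, rf.predict(X_te)),
        "SVM": accuracy_score(y_te, svm.predict(X_te)),
    }
    return scores, selector.support_


# Synthetic stand-in for a pre-microRNA feature matrix (assumption:
# rows are candidate sequences, columns are numeric sequence features).
X, y = make_classification(
    n_samples=300, n_features=30, n_informative=8, n_redundant=4, random_state=0
)
scores, support = compare_models(X, y, n_features=10)
for name, acc in scores.items():
    print(f"{name}: {acc:.4f}")
```

Fitting the selector on the training split only, and then applying the same mask to the test split, keeps the feature selection from seeing the held-out samples.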


Memo

Last Update: 2020-07-14