Full Length Research Paper
Abstract
MicroRNAs (miRNAs) are a class of non-coding RNAs produced from miRNA precursors (pre-miRNAs) that have a stem-loop structure. Developing computational approaches for pre-miRNA identification remains a challenging task, in which feature selection plays an important role. Here, we first extracted feature subsets from 124 sequence and secondary structure features using a hybrid of a genetic algorithm (GA) and a support vector machine (SVM). Next, based on the high-frequency features in these subsets, we proposed a novel stepwise SVM method to identify the optimal feature combinations. We found a cooperative effect among different features. Finally, we obtained 10 feature combinations with a strong combined effect and high classification performance for predicting pre-miRNAs. In external validation, all 10 combinations correctly predicted more than 13 of the 16 newly confirmed human pre-miRNAs in miRBase 14.0. The best combination correctly predicted 15 (93.75%), significantly outperforming triplet-SVM (13, 81.25%).
Key words: MicroRNA precursor, feature selection, genetic algorithm, support vector machine.
Abbreviations
miRNAs, microRNAs; pre-miRNAs, miRNA precursors; SVM, support vector machine; GA, genetic algorithm.
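The abstract names a stepwise SVM method for building feature combinations from high-frequency features but does not give its details. As a rough illustration only, the sketch below shows a generic greedy forward (stepwise) selection loop. It is not the authors' procedure. The function `stepwise_select`, the scoring function `toy_score`, and the feature names `mfe`, `gc_content`, and `loop_len` are all hypothetical; in practice the score would presumably be something like the cross-validated accuracy of an SVM trained on the candidate subset.

```python
def stepwise_select(candidates, score, min_gain=0.0):
    """Greedy forward selection (hypothetical sketch).

    At each step, add the single feature that raises `score` the most,
    and stop once no addition improves the score by more than `min_gain`.
    """
    selected = []
    best = float("-inf")
    remaining = list(candidates)
    while remaining:
        # Score every one-feature extension of the current subset.
        trials = [(score(selected + [f]), f) for f in remaining]
        trial_score, feature = max(trials, key=lambda t: t[0])
        # Always accept the first feature; afterwards require a real gain.
        if selected and trial_score - best <= min_gain:
            break
        selected.append(feature)
        remaining.remove(feature)
        best = trial_score
    return selected, best


def toy_score(subset):
    """Stand-in for cross-validated SVM accuracy (made-up numbers).

    Two features are weak alone but strong together, which mimics the
    kind of cooperative effect the abstract reports.
    """
    s = set(subset)
    value = 0.50
    if "mfe" in s:
        value += 0.10
    if "gc_content" in s:
        value += 0.05
    if "mfe" in s and "gc_content" in s:
        value += 0.20  # cooperative bonus
    if "loop_len" in s:
        value -= 0.01  # uninformative feature slightly hurts
    return value
```

Calling `stepwise_select(["mfe", "gc_content", "loop_len"], toy_score)` walks through the candidates one step at a time. Replacing `toy_score` with a real evaluation function is the only change needed to apply the same loop to actual feature data.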
Copyright © 2025 Author(s) retain the copyright of this article.
This article is published under the terms of the Creative Commons Attribution License 4.0