Fig. 1
From: Using amino acid features to identify the pathogenicity of influenza B virus

Flowchart of pathogenicity identification of IBV. After the data were downloaded and cleaned, 40 signature positions were first screened based on entropy. Six amino acid encoding methods with adjustable parameters were then used to extract features. Next, 67 descriptors were proposed, and two types of informative outputs from the RF method were obtained and further optimized with the mRMR algorithm and the SFS strategy. Each strain was finally represented by two optimized, low-dimensional informative features, ‘class’ and ‘prob’. These optimal subsets were used to construct predictive models.
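
The entropy-based screening step in the flowchart can be illustrated with a minimal sketch. The caption says only that signature positions were screened "based on entropy". The use of Shannon entropy in bits, the top-k selection rule, the function names `position_entropy` and `top_entropy_positions`, and the toy alignment are all assumptions made here for illustration, not the authors' implementation.

```python
import math
from collections import Counter


def position_entropy(column):
    """Shannon entropy (in bits) of the residues in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def top_entropy_positions(aligned_seqs, k):
    """Return the 0-based indices of the k highest-entropy columns, in ascending order."""
    if not aligned_seqs:
        return []
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")
    entropies = [
        position_entropy([s[i] for s in aligned_seqs]) for i in range(length)
    ]
    # Rank columns from most to least variable, then keep the first k.
    ranked = sorted(range(length), key=lambda i: entropies[i], reverse=True)
    return sorted(ranked[:k])


# Toy, made-up alignment (hypothetical; not real IBV sequences).
toy_alignment = ["MKAIL", "MKTIL", "MRAIL", "MKSIL"]
signature_positions = top_entropy_positions(toy_alignment, k=2)
```

In this toy alignment only the second and third columns vary, so they are the two positions selected. In the flowchart the same kind of ranking would yield 40 positions, but how the authors set the cutoff is not stated in the caption.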