The relevant parameters in the experiments are set as follows: the fuzzification parameter m is set to 2, which is the most frequently used value; the threshold value is ε = 10^-6; and the number of nearest neighbors q is set to 8.
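To illustrate where the fuzzification parameter and the threshold enter, the following is a minimal sketch of standard fuzzy c-means on complete data. It is not the method evaluated in this paper. The function name `fcm`, the arguments `max_iter` and `seed`, and the choice of stopping when the change in cluster prototypes falls below the threshold are all assumptions made for illustration. The neighbor count q is not used here.

```python
import numpy as np

def fcm(data, n_clusters, m=2.0, eps=1e-6, max_iter=300, seed=0):
    """Plain fuzzy c-means sketch (hypothetical helper, complete data only).

    m   : fuzzification parameter (2 in the experiments)
    eps : stopping threshold (1e-6 in the experiments); applying it to the
          change in prototypes is an assumption of this sketch
    """
    rng = np.random.default_rng(seed)
    n_samples = data.shape[0]

    # Random initial partition matrix; each column sums to 1.
    u = rng.random((n_clusters, n_samples))
    u /= u.sum(axis=0, keepdims=True)

    centers = None
    for _ in range(max_iter):
        # Prototype update: membership-weighted means with weights u^m.
        um = u ** m
        new_centers = um @ data / um.sum(axis=1, keepdims=True)

        # Squared Euclidean distances, shape (n_clusters, n_samples).
        d2 = ((data[None, :, :] - new_centers[:, None, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)  # guard against division by zero

        # Membership update: u_ik proportional to d_ik^(-2/(m-1)).
        inv = d2 ** (-1.0 / (m - 1.0))
        u = inv / inv.sum(axis=0, keepdims=True)

        converged = (centers is not None
                     and np.linalg.norm(new_centers - centers) < eps)
        centers = new_centers
        if converged:
            break
    return centers, u
```

With m = 2 the membership update reduces to normalized inverse squared distances, which is one reason this value is so commonly chosen.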
5.1. Simple tests of missing single attribute value
This group of tests is designed to examine whether there exist missing values with the following property: when prior distribution information about the missing values is introduced through the maximum likelihood criterion, appropriate guidance is also provided for the OCS strategy. We use three real-world data sets, namely Wholesale Customers, Seeds and Wine [44,45], to carry out this group of tests. Incomplete data sets are artificially generated from the above complete data sets by randomly selecting one component to be designated as missing.
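The generation of incomplete data sets described above can be sketched as follows. This is a minimal illustration, assuming that "one component" means a single entry of the whole data matrix and that the selection is uniform over all entries. The helper name `make_incomplete`, the use of NaN as the missing-value marker, and the random placeholder array standing in for a real data set are assumptions, not details from the paper.

```python
import numpy as np

def make_incomplete(data, rng):
    """Return a copy of `data` with one uniformly chosen entry set to NaN.

    Hypothetical helper: also returns the (row, column) index of the
    removed component so the true value can be looked up later.
    """
    incomplete = np.array(data, dtype=float, copy=True)
    row = int(rng.integers(incomplete.shape[0]))
    col = int(rng.integers(incomplete.shape[1]))
    incomplete[row, col] = np.nan
    return incomplete, (row, col)

# Placeholder array with the shape of a 440 x 6 data set; a real
# experiment would load the actual complete data here instead.
rng = np.random.default_rng(0)
complete = rng.random((440, 6))

# Ten incomplete copies, one per experiment, each missing one component.
incomplete_sets = [make_incomplete(complete, rng) for _ in range(10)]
```

Keeping the index of the removed component alongside each copy makes it straightforward to compare an estimated value with the true one after clustering.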
5.1.1. Wholesale Customers data set
The Wholesale Customers data set contains 440 six-dimensional vectors, each describing a client of a wholesale distributor by its annual spending, in monetary units, on diverse product categories. The data set can be divided into 2 categories according to the Channel index: 298 samples belong to the first category and the remaining 142 samples belong to the second. We perform 10 experiments on 10 incomplete Wholesale Customers data sets, and the averaged experimental results are listed in Table 1.