Article details
Title: Extreme Data Mining: Inference from Small Datasets
Author(s): Răzvan Andonie
CITE THIS PAPER AS:
Andonie R., Extreme Data Mining: Inference from Small Datasets, International Journal of Computers Communications & Control, ISSN 1841-9836, 5(3):280-291, 2010.
Abstract:  
Neural networks have been applied successfully in many fields. However, satisfactory results can only be obtained under large sample conditions. When it comes to small training sets, the performance may be poor, or the learning task may not even be accomplished. This deficiency severely limits the applications of neural networks. The main reason why small datasets cannot provide enough information is that there are gaps between samples; even the domain of the samples cannot be ensured. Several computational intelligence techniques have been proposed to overcome the limits of learning from small datasets. We have the following goals: i. to discuss the meaning of "small" in the context of inferring from small datasets; ii. to overview computational intelligence solutions to this problem; iii. to illustrate the introduced concepts with a real-life application.
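One family of techniques surveyed for the small-dataset problem is artificial sample generation: expanding the training set with perturbed copies of the available samples so the gaps between them are partially filled. As a minimal sketch only (the function name `expand_with_noise` and all parameter values are illustrative assumptions, not the paper's method), additive Gaussian noise on the inputs could look like this:

```python
import random

def expand_with_noise(samples, copies=10, sigma=0.05, seed=0):
    """Augment a small labeled dataset by appending Gaussian-perturbed
    copies of each sample; labels are kept unchanged."""
    rng = random.Random(seed)
    expanded = list(samples)
    for x, y in samples:
        for _ in range(copies):
            x_noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
            expanded.append((x_noisy, y))
    return expanded

# A toy 4-sample, 2-feature dataset with two classes:
small = [([0.1, 0.9], 0), ([0.2, 0.8], 0), ([0.9, 0.1], 1), ([0.8, 0.2], 1)]
big = expand_with_noise(small, copies=10)
print(len(big))  # 4 original + 4*10 synthetic = 44
```

The noise scale `sigma` is the critical choice: too small and the synthetic points add no information, too large and they cross class boundaries; the additive-noise training literature cited below ([32]-[35]) studies exactly this trade-off.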
References:  
[1] R. Andonie, L. Fabry-Asztalos, S. Abdul-Wahid, C. Collar, and N. Salim, "An integrated soft computing approach for predicting biological activity of potential HIV-1 protease inhibitors," in Proceedings of the IEEE International Joint Conference on Neural Networks (IJCNN 2006), Vancouver, BC, Canada, July 16-21, 2006, pp. 7495–7502.
[2] L. Fabry-Asztalos, R. Andonie, C. Collar, S. Abdul-Wahid, and N. Salim, "A genetic algorithm optimized fuzzy neural network analysis of the affinity of inhibitors for HIV-1 protease," Bioorganic and Medicinal Chemistry, vol. 16, pp. 2903–2911, 2008.
[3] R. Andonie, L. Fabry-Asztalos, C. B. Abdul-Wahid, S. Abdul-Wahid, G. I. Barker, and L. C. Magill, "Fuzzy ARTMAP prediction of biological activities for potential HIV-1 protease inhibitors using a small molecular dataset," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 99, no. PrePrints, 2009.
[4] R. Andonie and L. Sasu, "Fuzzy ARTMAP with input relevances," IEEE Transactions on Neural Networks, vol. 17, pp. 929–941, 2006.
[5] G. A. Carpenter, S. Grossberg, N. Markuzon, J. H. Reynolds, and D. B. Rosen, "Fuzzy ARTMAP: A neural network architecture for incremental supervised learning of analog multidimensional maps," IEEE Transactions on Neural Networks, vol. 3, no. 5, pp. 698–713, 1992.
[6] S. Verzi, G. Heileman, M. Georgiopoulos, and G. Anagnostopoulos, "Universal approximation with Fuzzy ART and Fuzzy ARTMAP," in Proceedings of the IEEE International Joint Conference on Neural Networks (IJCNN '03), vol. 3, Portland, Oregon, 20-24 July 2003, pp. 1987–1992.
[7] R. Andonie, L. Fabry-Asztalos, C. Collar, S. Abdul-Wahid, and N. Salim, "Neuro-fuzzy prediction of biological activity and rule extraction for HIV-1 protease inhibitors," in Proceedings of the IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB'05), 2005, pp. 113–120.
[8] R. Andonie, L. Fabry-Asztalos, L. Magill, and S. Abdul-Wahid, "A new Fuzzy ARTMAP approach for predicting biological activity of potential HIV-1 protease inhibitors," in Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine (BIBM 2007), IEEE Computer Society Press, San Jose, CA, 2007, pp. 56–61.
[9] R. Andonie, "Inference from small training sets - a computational intelligence perspective," University of Ulster, Jordanstown, Northern Ireland, United Kingdom, invited talk, June 2008.
[10] R. Andonie, L. Fabry-Asztalos, B. Crivat, S. Abdul-Wahid, and B. Abdul-Wahid, "Fuzzy ARTMAP rule extraction in computational chemistry," in IJCNN'09: Proceedings of the 2009 International Joint Conference on Neural Networks. IEEE, 2009, pp. 2961–2967.
[11] R. Andonie, "Extreme data mining: Inference from small datasets," National University of Ireland, Maynooth, Ireland, invited talk, June 2008.
[12] ——, "How to learn from small training sets," Dalle Molle Institute for Artificial Intelligence (IDSIA), Manno-Lugano, Switzerland, invited talk, September 2009.
[13] V. Vapnik, Statistical Learning Theory. New York: Wiley, 2000.
[14] J. L. Balcázar and R. V. Book, "Sets with small generalized Kolmogorov complexity," Acta Inf., vol. 23, no. 6, pp. 679–688, 1986.
[15] A. Ambainis, "Application of Kolmogorov complexity to inductive inference with limited memory," in ALT '95: Proceedings of the 6th International Conference on Algorithmic Learning Theory. London, UK: Springer-Verlag, 1995, pp. 313–318.
[16] A. Ambainis, K. Apsitis, C. Calude, R. Freivalds, M. Karpinski, T. Larfeldt, I. Sala, and J. Smotrovs, "Effects of Kolmogorov complexity present in inductive inference as well," in ALT '97: Proceedings of the 8th International Conference on Algorithmic Learning Theory. London, UK: Springer-Verlag, 1997, pp. 244–259.
[17] J.-L. Yuan and T. Fine, "Neural-network design for small training sets of high dimension," IEEE Transactions on Neural Networks, vol. 9, pp. 266–280, 1998.
[18] J.-L. Yuan, "Bootstrapping nonparametric feature selection algorithms for mining small data sets," in Proceedings of the International Joint Conference on Neural Networks (IJCNN), 1999, pp. 2526–2529.
[19] C. Huang and C. Moraga, "A diffusion-neural-network for learning from small samples," International Journal of Approximate Reasoning, vol. 35, pp. 137–161, 2004.
[20] R. Mao, H. Zhu, L. Zhang, and A. Chen, "A new method to assist small data set neural network learning," in Proceedings of the Sixth International Conference on Intelligent Systems Design and Applications (ISDA'06), 2006, pp. 17–22.
[21] D.-C. Li, C.-S. Wu, T.-I. Tsai, and Y.-S. Lin, "Using mega-trend-diffusion and artificial samples in small data set learning for early flexible manufacturing system scheduling knowledge," Computers and Operations Research, vol. 34, pp. 966–982, 2007.
[22] D.-C. Li, C.-W. Yeh, T.-I. Tsai, Y.-H. Fang, and S. Hu, "Acquiring knowledge with limited experience," Expert Systems, vol. 24, pp. 162–170, 2007.
[23] D.-C. Li, C.-S. Wu, T.-I. Tsai, and F. M. Chang, "Using mega-fuzzification and data trend estimation in small data set learning for early FMS scheduling knowledge," Comput. Oper. Res., vol. 33, no. 6, pp. 1857–1869, 2006.
[24] T.-I. Tsai and D.-C. Li, "Approximate modeling for high order non-linear functions using small sample sets," Expert Syst. Appl., vol. 34, no. 1, pp. 564–569, 2008.
[25] D.-C. Li and C.-W. Yeh, "A non-parametric learning algorithm for small manufacturing data sets," Expert Syst. Appl., vol. 34, no. 1, pp. 391–398, 2008.
[26] D.-C. Li and C.-W. Liu, "A neural network weight determination model designed uniquely for small data set learning," Expert Syst. Appl., vol. 36, no. 6, pp. 9853–9858, 2009.
[27] I. V. Tetko, A. I. Luik, and G. I. Poda, "Application of neural networks in structure-activity relationships of a small number of molecules," J. Med. Chem., vol. 36, pp. 811–814, 1993.
[28] D. Hecht and G. Fogel, "High-throughput ligand screening via preclustering and evolved neural networks," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 4, pp. 476–484, 2007.
[29] M. Cheung, S. Johnson, D. Hecht, and G. Fogel, "Quantitative structure-property relationships for drug solubility prediction using evolved neural networks," in Proceedings of the IEEE World Congress on Computational Intelligence, 2008, pp. 688–693.
[30] S. L. Lohr, Sampling: Design and Analysis. Duxbury Press, 1999.
[31] J. Gamez, F. Modave, and O. Kosheleva, "Selecting the most representative sample is NP-hard: Need for expert (fuzzy) knowledge," in Proceedings of the IEEE International Conference on Fuzzy Systems (FUZZ-IEEE 2008), IEEE World Congress on Computational Intelligence, June 2008, pp. 1069–1074.
[32] L. Holmstrom and P. Koistinen, "Using additive noise in backpropagation training," IEEE Transactions on Neural Networks, vol. 3, pp. 24–38, 1992.
[33] C. Wang and J. C. Principe, "Training neural networks with additive noise in the desired signal," IEEE Transactions on Neural Networks, vol. 10, pp. 1511–1517, 1999.
[34] K. Wang, J. Yang, G. Shi, and Q. Wang, "An expanded training set based validation method to avoid overfitting for neural network classifier," International Conference on Natural Computation, vol. 3, pp. 83–87, 2008.
[35] G. N. Karystinos and D. A. Pados, "On overfitting, generalization, and randomly expanded training sets," IEEE Transactions on Neural Networks, vol. 11, no. 5, pp. 1050–1057, 2000.
[36] Y. Liu, J. A. Starzyk, and Z. Zhu, "Optimized approximation algorithm in neural networks without overfitting," IEEE Transactions on Neural Networks, vol. 19, no. 6, pp. 983–995, 2008.
[37] S. Bos and E. Chug, "Using weight decay to optimize the generalization ability of a perceptron," in Proceedings of the 1996 International Conference on Neural Networks. IEEE, 1996, pp. 241–246.
[38] K. Mahdaviani, H. Mazyar, S. Majidi, and M. H. Saraee, "A method to resolve the overfitting problem in recurrent neural networks for prediction of complex systems' behavior," in IJCNN'08: Proceedings of the 2008 International Joint Conference on Neural Networks, 2008, pp. 3723–3728.
[39] R. Reed, "Pruning algorithms - a survey," IEEE Transactions on Neural Networks, vol. 4, pp. 740–747, 1993.
[40] T.-Y. Kwok and D.-Y. Yeung, "Constructive algorithms for structure learning in feedforward neural networks for regression problems," IEEE Transactions on Neural Networks, vol. 8, pp. 630–645, 1997.
[41] L. Prechelt, "Automatic early stopping using cross validation: Quantifying the criteria," Neural Networks, vol. 11, pp. 761–767, 1998.
[42] I. Dagher, M. Georgiopoulos, G. Heileman, and G. Bebis, "Ordered Fuzzy ARTMAP: a Fuzzy ARTMAP algorithm with a fixed order of pattern presentation," in Proceedings of the IEEE International Joint Conference on Neural Networks (IJCNN 1998), IEEE World Congress on Computational Intelligence, Anchorage, Alaska, 1998, pp. 1717–1722.
[43] I. Dagher, M. Georgiopoulos, G. L. Heileman, and G. Bebis, "An ordering algorithm for pattern presentation in Fuzzy ARTMAP that tends to improve generalization performance," IEEE Transactions on Neural Networks, vol. 10, pp. 768–778, 1999.
[44] S. Tan, M. Rao, and C. P. Lim, "A hybrid neural network classifier combining ordered Fuzzy ARTMAP and the dynamic decay adjustment algorithm," Soft Computing, vol. 12, pp. 765–775, 2008.
[45] J. Tou and R. Gonzalez, Pattern Recognition Principles. Reading, MA: Addison-Wesley, 1976.