DEVELOPMENT OF COST EFFECTIVE MODELS FOR ENHANCING INTEROPERABILITY AND BUILDING PRODUCTIVE MODELS FOR CLASS DETERMINATION IN DATA MINING

Prof. Dr. G. Manoj Someswar, Mukiri Ratna Raju

Abstract


Dimensionality reduction through the selection of a relevant attribute (feature) subset can bring several benefits to the actual data mining step: performance improvement, by alleviating the curse of dimensionality and enhancing generalization capabilities; speed-up, by reducing the computational effort; better model interpretability; and lower costs, by avoiding "expensive" features. These goals are not fully compatible with each other. Consequently, several distinct feature selection problems exist, depending on the specific goal pursued. In this paper, feature selection problems are divided into two main categories: finding the optimal predictive features (for building efficient prediction models) and finding all the features relevant to the class attribute.
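
To make the two categories concrete, the fragment below is a minimal, illustrative sketch (not taken from the paper) that contrasts a wrapper-style search for a compact predictive subset with a filter-style ranking of all class-relevant features. It assumes Python with scikit-learn; the synthetic dataset, learner and parameter values are placeholders.

    # Illustrative sketch only: contrasts the two feature selection goals above.
    # Assumes scikit-learn is available; data and parameters are placeholders.
    from sklearn.datasets import make_classification
    from sklearn.feature_selection import SequentialFeatureSelector, mutual_info_classif
    from sklearn.neighbors import KNeighborsClassifier

    # Synthetic data: 10 features, of which only 4 carry class information.
    X, y = make_classification(n_samples=500, n_features=10, n_informative=4,
                               n_redundant=2, random_state=0)

    # Goal 1 (optimal predictive subset): wrapper-style forward selection
    # tied to a specific learner.
    selector = SequentialFeatureSelector(KNeighborsClassifier(), n_features_to_select=4)
    selector.fit(X, y)
    print("Predictive subset:", selector.get_support(indices=True))

    # Goal 2 (all relevant features): filter-style relevance ranking against
    # the class attribute, approximated by an empirical mutual information estimate.
    relevance = mutual_info_classif(X, y, random_state=0)
    print("Relevance scores:", relevance.round(3))

The wrapper result depends on the chosen learner, whereas the filter ranking does not, which is one reason the two problem formulations can yield different subsets.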

 

From a purely theoretical perspective, the selection of a particular attribute subset is not of interest, since the Bayes optimal prediction rule is monotonic: adding more features cannot decrease accuracy [Koh97]. In practice, however, this is precisely the goal of feature selection: choosing the best possible attribute subset given the characteristics of the data and of the learning algorithm (such as its biases and heuristics). Even if certain associations exist between the attributes in the subset returned by some methods and the theoretically relevant attributes, they cannot be generalized into a practical technique applicable to any learning algorithm and dataset. This is because the information needed to compute the degree of relevance of an attribute (i.e. the true distribution) is generally not available in practical settings.
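
The practical consequence of the last observation can be shown with a small experiment (again an assumption-laden sketch, not part of the paper): since the true distribution is unknown, relevance can only be estimated from a finite sample, and small samples may mis-rank the features.

    # Illustrative sketch only: relevance must be estimated from data because the
    # true class distribution is unknown; estimates (and rankings) vary with sample size.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.feature_selection import mutual_info_classif

    X, y = make_classification(n_samples=5000, n_features=6, n_informative=3,
                               n_redundant=0, random_state=1)

    for n in (50, 500, 5000):
        scores = mutual_info_classif(X[:n], y[:n], random_state=1)
        print(f"n={n:5d}  estimated relevance ranking:", np.argsort(scores)[::-1])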


References


[Fan00] Fan W., Stolfo S., Zhang J. and Chan P. (2000). AdaCost: Misclassification cost-sensitive boosting. Proceedings of the 16th International Conference on Machine Learning, pp. 97–105.

[Far09] Farid D.M., Darmont J., Harbi N., Hoa N.H. and Rahman M.Z. (2009). Adaptive Network Intrusion Detection Learning: Attribute Selection and Classification. World Academy of Science, Engineering and Technology, 60.

[Faw97] Fawcett T. and Provost F.J. (1997). Adaptive Fraud Detection. Data Mining and Knowledge Discovery, 1(3), pp. 291-316.

[Faw06] Fawcett T. (2006). An introduction to ROC analysis. Pattern Recognition Letters, 27, pp. 861–874.

[Fay96] Fayyad U.M., Piatetsky-Shapiro G. and Smyth P. (1996). From Data Mining to Knowledge Discovery in Databases. Artificial Intelligence Magazine, 17(3), pp. 37-54.

[Fei11] Feier M., Lemnaru C. and Potolea R. (2011). Solving NP-Complete Problems on the CUDA Architecture using Genetic Algorithms. Proceedings of ISPDC 2011, pp. 278-

[Fir09] Firte A.A., Vidrighin B.C. and Cenan C. (2009). Intelligent component for adaptive E-learning systems. Proceedings of the IEEE 5th International Conference on Intelligent Computer Communication and Processing, 27-29 August 2009, Cluj-Napoca, Romania, pp. 35-38.

[Fir10] Firte L., Lemnaru C. and Potolea R. (2010). Spam detection filter using KNN algorithm and resampling. Proceedings of the 2010 IEEE 6th International Conference on Intelligent Computer Communication and Processing, pp. 27-33.

[Fre97] Freund Y. and Schapire R. (1997). A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 55(1), pp. 119–139.

[Gar09] García S. and Herrera F. (2009). Evolutionary Undersampling for Classification with Imbalanced Datasets: Proposals and Taxonomy. Evolutionary Computation, 17(3), pp. 275-306.

[Ged03] Gediga G. and Duntsch I. (2003). Maximum consistency of incomplete data via noninvasive imputation. Artificial Intelligence Review, 19, pp. 93-107.

[Gen89] Gennari J.H., Langley P. and Fisher D. (1989). Models of incremental concept formation. Artificial Intelligence, 40, pp. 11-61.

[Gog10] Gogoi P., Borah B. and Bhattacharyya D.K. (2010). Anomaly Detection Analysis of Intrusion Data using Supervised & Unsupervised Approach. Journal of Convergence Information Technology, 5(1), pp. 95-110.

[Gre86] Grefenstette J.J. (1986). Optimization of control parameters for genetic algorithms. IEEE Transactions on Systems, Man, and Cybernetics, 16, pp. 122-128.

[Grz02] Grzymala-Busse J.W., Grzymala-Busse W.J. and Goodwin L.K. (2002). A comparison of three closest fit approaches to missing attribute values in preterm birth data. International Journal of Intelligent Systems, 17, pp. 125-134.

[Grz05] Grzymala-Busse J.W., Stefanowski J. and Wilk S. (2005). A comparison of two approaches to data mining from imbalanced data. Journal of Intelligent Manufacturing, 16, pp. 565–573.

