Education and Research Organization for Genome Information Science  Abstract
Date Feb 24, 2017
Speaker Prof. Einoshin Suzuki, Graduate School of Information Science and Electrical Engineering, Kyushu University, Japan
Title Recovering a Partial Decision List from Noisy Data and an Approximate Theory Based on Minimum Encoding
Abstract A partial decision list is a partial classifier that consists of ordered classification rules. The problem of recovering a partial decision list from noisy data and an approximate theory represents a realistic setting for knowledge discovery from data, especially the discovery of mutually related rules. To measure how well a method performs on a specific problem with known ground truth, we define discovery accuracy as the average ratio of successful recoveries over data sets. The extended MDL principle proposed by Tangkitvanich and Shimura has proven useful for the similar problem of classification from noisy data and an approximate theory, but it has several flaws that prevent it from being extended directly to our problem. In this talk, I will explain our solution based on minimum encoding, which corresponds to the maximum a posteriori hypothesis when the initial theory and the data are statistically independent given the output hypothesis. Experiments with synthetic data show that our method, CLARDEM, by far outperforms both its simplification that neglects the initial theory and another method based on information compression in terms of discovery accuracy, especially in the presence of 10 to 20% class noise. Experiments using UCI ML Repository data and C4.5Rules show that CLARDEM almost always outperforms or ties with the other two methods in recall and precision and often runs much faster.
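To make the two definitions in the abstract concrete, the following is a minimal sketch and not the CLARDEM implementation. It assumes first-match semantics for the ordered rules and treats a recovery as a simple success/failure flag per data set. The names `classify` and `discovery_accuracy` and the attributes `fever` and `cough` are hypothetical, chosen only for illustration.

```python
from typing import Callable, Dict, List, Optional, Tuple

Example = Dict[str, bool]
# A rule is a (condition, predicted class) pair; a list of rules is ordered.
Rule = Tuple[Callable[[Example], bool], str]


def classify(rules: List[Rule], x: Example) -> Optional[str]:
    """Return the class of the first rule whose condition holds (assumed semantics)."""
    for condition, label in rules:
        if condition(x):
            return label
    # Partial classifier: if no rule fires, no prediction is made.
    return None


def discovery_accuracy(successes: List[bool]) -> float:
    """Fraction of data sets on which the ground-truth list was recovered."""
    return sum(successes) / len(successes)


# Hypothetical two-rule list over made-up attributes.
rules: List[Rule] = [
    (lambda x: x["fever"] and x["cough"], "flu"),
    (lambda x: x["fever"], "other_infection"),
]

if __name__ == "__main__":
    print(classify(rules, {"fever": True, "cough": True}))
    print(classify(rules, {"fever": False, "cough": True}))
    print(discovery_accuracy([True, False, True, True]))
```

Because the rules are ordered, the second rule only applies to examples that the first rule does not cover, which is one way the rules in a decision list are mutually related.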