Université François Rabelais, Tours
(Blois campus)
The University of Tours (French: Université de Tours), formerly François Rabelais University of Tours (French: Université François Rabelais), is a public university in Tours, France.
Full scholarship from the European Commission, covering 100% of tuition, living costs, and medical insurance
Only 1% of the students who applied received the scholarship
Finished in the top 3 of the master's class of 20 students
Interactive pattern mining is a two-way learning problem between the user and the system. In the “User to System” direction, the system learns the user's preferences from their feedback. Meanwhile, in the “System to User” direction, the user learns new knowledge from the database through the patterns the system provides.
This problem differs from the traditional active learning problem in three ways. First, the queries presented to the user are patterns, not transactions.
Second, the system has to mine patterns that are interesting to the user in order to keep the user engaged.
Third, the suggested queries must be computed quickly enough to keep the interaction satisfactory.
Exhaustively enumerating the interesting patterns over the search space is time-consuming.
In this work, we sample patterns exactly and directly, and evaluate the algorithm based on user interestingness.
We also define measures to evaluate the pattern mining algorithm in our context. Some limitations of the algorithm under evaluation were observed during experimentation.
Finally, we performed experiments with our proposed solution, applying these evaluation measures to a user-based movie rating data set.
In this work, we evaluated a pattern mining algorithm that samples exactly and directly and that also takes user feedback into account. For the evaluation, we proposed using a real user-based movie ratings data set. The data set was transformed for the evaluation, and the algorithm did not work in all cases. This can be attributed to two technical limitations: first, the allocation of data weights; second, transactions of high length were always interesting.
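To make the second limitation concrete, here is a minimal sketch of one well-known direct sampling scheme: two-step sampling of itemsets proportional to their support. The text does not name the exact algorithm used in the project, so this scheme is an assumption chosen for illustration. The function names and the toy data are hypothetical too. In this scheme a transaction of length n is drawn with weight 2^n, so a single long transaction can dominate the sample.

```python
import random
from collections import Counter


def transaction_weights(transactions):
    """Weight of each transaction: the number of its subsets, 2^|t|."""
    return [2 ** len(t) for t in transactions]


def sample_pattern(transactions, rng):
    """Draw one itemset with probability proportional to its support.

    Assumed scheme (not confirmed by the source): two-step direct sampling.
    """
    # Step 1: pick a transaction with probability proportional to 2^|t|.
    weights = transaction_weights(transactions)
    t = rng.choices(transactions, weights=weights, k=1)[0]
    # Step 2: pick a uniform random subset of t by keeping each item
    # independently with probability 1/2.
    return frozenset(item for item in t if rng.random() < 0.5)


# Hypothetical toy data: each transaction is the set of movies one user liked.
transactions = [
    frozenset({"A", "B"}),
    frozenset({"A", "B", "C", "D", "E", "F"}),
    frozenset({"A", "C"}),
]

rng = random.Random(0)
draws = Counter(sample_pattern(transactions, rng) for _ in range(10_000))

# Share of step-1 draws expected to come from the long transaction.
weights = transaction_weights(transactions)
long_share = weights[1] / sum(weights)
```

With these toy transactions the weights are 4, 64, and 4, so about 89% of the draws start from the six-item transaction. Rebalancing such weights is one way to read "changing the data weights", although the project's actual adaptation may differ.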
Based on the observed drawbacks, we proposed two approaches: first, transforming the data set to obtain a reasonable length distribution; second, adapting the algorithm by changing the data weights and the sampling method. Since we are evaluating the algorithm with real user data, modifying the data set to suit the algorithm is not desirable. Hence, we used the second approach.
In general, very few evaluation measures exist for pattern mining algorithms. To evaluate the algorithm, we therefore defined three measures: diversity, interestingness, and recall. Using these measures, we ran experiments on the data set to obtain the results.
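The text names the three measures but does not give their formulas. The sketch below shows one plausible way to instantiate them, and every definition in it is an assumption rather than the project's own. Diversity is taken as the mean pairwise Jaccard distance between distinct sampled patterns. Interestingness is the fraction of sampled patterns that a (simulated) user likes. Recall is the fraction of the user's liked patterns that appear in the sample.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|; defined as 0.0 when both sets are empty."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def diversity(patterns):
    """Assumed definition: mean pairwise Jaccard distance of distinct patterns."""
    pairs = list(combinations(set(patterns), 2))
    if not pairs:
        return 0.0
    return sum(jaccard_distance(a, b) for a, b in pairs) / len(pairs)


def interestingness(patterns, liked):
    """Assumed definition: fraction of sampled patterns the user likes."""
    if not patterns:
        return 0.0
    return sum(p in liked for p in patterns) / len(patterns)


def recall(patterns, liked):
    """Assumed definition: fraction of liked patterns found in the sample."""
    if not liked:
        return 0.0
    return len(set(patterns) & liked) / len(liked)


# Hypothetical sample and hypothetical set of patterns the user likes.
sampled = [frozenset({"A", "B"}), frozenset({"A"}), frozenset({"C", "D"})]
liked = {frozenset({"A", "B"}), frozenset({"C"})}

scores = {
    "diversity": diversity(sampled),
    "interestingness": interestingness(sampled, liked),
    "recall": recall(sampled, liked),
}
```

Keeping the three scores separate makes trade-offs visible. For example, a sampler that returns only the user's single favourite pattern would score high on interestingness but low on diversity and recall.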
Paper
Presentation of the project & results