Paper Type

Master's Thesis


College of Computing, Engineering & Construction

Degree Name

Master of Science (MS)



NACO controlled Corporate Body

University of North Florida. School of Computing

First Advisor

Dr. Xudong Liu

Second Advisor

Dr. Ayan Dutta

Third Advisor

Dr. Sandeep Reddivari

Department Chair

Dr. Sherif Elfayoumy

College Dean

Dr. William F. Klostermeyer


Preferences play a central role in research fields such as decision making, recommender systems and marketing. The focus of this thesis is on preferences over combinatorial domains, which are domains of objects configured with categorical attributes. For example, the domain of cars includes car objects that are constructed with values for attributes such as ‘make’, ‘year’, ‘model’, ‘color’, ‘body type’ and ‘transmission’. Different values can instantiate an attribute. For instance, values for attribute ‘make’ can be Honda, Toyota, Tesla or BMW, and attribute ‘transmission’ can be automatic or manual. To this end, this thesis studies problems of preference visualization and learning for lexicographic preference trees, graphical preference models that are often compact over complex domains of objects built of categorical attributes. Visualizing preferences is essential to give users insight into the process of decision making, while learning preferences from data is practically important, since eliciting preference models directly from users is ineffective.
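To make the car example concrete, the following is a minimal Python sketch of how a lexicographic model with one fixed importance order over attributes and one fixed preference order per attribute could compare two objects. The function name `prefers`, the chosen importance order and the value rankings are illustrative assumptions, not taken from the thesis.

```python
from typing import Dict, List

Car = Dict[str, str]


def prefers(importance: List[str],
            value_order: Dict[str, List[str]],
            a: Car, b: Car) -> bool:
    """Return True if `a` is strictly preferred to `b`.

    Attributes are scanned from most to least important; the first
    attribute on which the two objects differ decides the comparison.
    """
    for attr in importance:
        if a[attr] != b[attr]:
            ranking = value_order[attr]  # earlier in the list = more preferred
            return ranking.index(a[attr]) < ranking.index(b[attr])
    return False  # identical on every attribute considered


# Hypothetical preferences: 'make' matters more than 'transmission'.
importance = ["make", "transmission"]
value_order = {
    "make": ["Tesla", "Honda", "Toyota", "BMW"],
    "transmission": ["automatic", "manual"],
}

honda_manual = {"make": "Honda", "transmission": "manual"}
honda_auto = {"make": "Honda", "transmission": "automatic"}
tesla_manual = {"make": "Tesla", "transmission": "manual"}

print(prefers(importance, value_order, honda_auto, honda_manual))
print(prefers(importance, value_order, tesla_manual, honda_auto))
```

In this sketch the more important attribute always dominates: a Tesla with a manual transmission beats a Honda with an automatic one, because ‘make’ is consulted before ‘transmission’.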

The results of this thesis fall into two parts: 1) for preference visualization, a web-based system is created that visualizes various types of lexicographic preference tree models learned by a greedy learning algorithm; 2) for preference learning, a genetic algorithm, called GA, is designed and implemented that learns a restricted type of lexicographic preference tree, called the unconditional importance and unconditional preference tree, or UIUP tree for short. Experiments show that GA achieves higher accuracy than the greedy algorithm at the cost of more computational time. Moreover, a Dynamic Programming Algorithm (DPA) was devised and implemented that computes an optimal UIUP tree model, in the sense that it satisfies as many examples as possible in the dataset. This novel exact algorithm was used to evaluate the quality of models computed by GA, and it reduces the factorial time complexity of the brute-force algorithm to exponential. The major contribution of this thesis to the field of machine learning and data mining is DPA, a novel exact learning algorithm. DPA searches the huge space of UIUP trees and finds the model that correctly classifies the largest number of examples in the training dataset; such a model is referred to as the optimal model in this thesis. Finally, using datasets produced from randomly generated UIUP trees, this thesis presents experimental results on the performance (e.g., accuracy and computational time) of GA compared to the existing greedy algorithm and DPA.
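The abstract does not spell out the recurrence used by DPA, so the following Python sketch only illustrates one way a dynamic program over attribute subsets can replace a search over all attribute orderings. It assumes binary attributes encoded as 0/1 and examples given as (preferred, other) pairs; the function names `gain`, `brute_force` and `dp_optimal` are hypothetical and should not be read as the thesis's implementation.

```python
from itertools import permutations
from typing import List, Tuple

# An example is a pair (preferred object, other object) of 0/1 tuples.
Example = Tuple[Tuple[int, ...], Tuple[int, ...]]


def gain(a: int, placed: int, examples: List[Example]) -> int:
    """Examples newly satisfied by placing attribute `a` right after the
    attributes in bitmask `placed`, using the better of a's two value orders."""
    ones = zeros = 0
    for win, lose in examples:
        # Skip examples already decided by a more important attribute.
        if any(placed >> i & 1 and win[i] != lose[i] for i in range(len(win))):
            continue
        if win[a] != lose[a]:
            if win[a] == 1:
                ones += 1
            else:
                zeros += 1
    return max(ones, zeros)


def brute_force(n: int, examples: List[Example]) -> int:
    """Try every importance order: n! candidates."""
    best = 0
    for order in permutations(range(n)):
        placed = total = 0
        for a in order:
            total += gain(a, placed, examples)
            placed |= 1 << a
        best = max(best, total)
    return best


def dp_optimal(n: int, examples: List[Example]) -> int:
    """Subset DP: best[mask] is the maximum number of examples satisfied when
    the attributes in `mask` occupy the top positions, in the best order."""
    best = [0] * (1 << n)
    for mask in range(1, 1 << n):
        for a in range(n):
            if mask >> a & 1:
                prev = mask ^ (1 << a)  # attributes placed before `a`
                best[mask] = max(best[mask], best[prev] + gain(a, prev, examples))
    return best[(1 << n) - 1]


examples = [
    ((1, 0, 0), (0, 1, 1)),
    ((0, 1, 0), (1, 0, 0)),
    ((1, 1, 0), (1, 0, 1)),
    ((0, 0, 1), (0, 0, 0)),
]
print(brute_force(3, examples), dp_optimal(3, examples))
```

The sketch relies on the observation that the examples decided by an attribute depend only on which attributes precede it, not on their order, so there are 2^n subsets to evaluate instead of n! orderings. Both functions return only the count of satisfied examples; recovering the tree itself would require recording the best choice for each subset.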