Fixing the ï in Naïve Bayes! References #11
unintendedbear committed Aug 11, 2017
1 parent 1cbe02a commit 2e10916
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions Chapters/03-softc.tex
@@ -35,10 +35,10 @@ \subsection{Classification in the data mining process}

Inside KDD, the process of classification, or the application of classifying algorithms, helps to build a model of the data set and to understand the relationships therein. As previously said, the data coming from BYOD practices is usually neither only numerical nor only nominal; thus, only classification algorithms that support both types of data can be considered. Weka \cite{weka:site} is a collection of state-of-the-art machine learning algorithms and data preprocessing tools that are key for data mining processes \cite{witten2016data}. For our purposes, it is important to focus on rule-based and decision-tree-based algorithms. A decision-tree algorithm organises a group of conditions in a top-down recursive manner, so that a class is assigned by following a path of conditions from the root of the tree to one of its leaves. Generally speaking, the possible classes to choose from are mutually exclusive. These algorithms are also called ``divide-and-conquer'' algorithms. In contrast, the ``separate-and-conquer'' algorithms create rules one at a time: the instances covered by each new rule are removed, and the next rule is generated from the remaining instances. The most important characteristic of the latter algorithms is that the model built from the dataset is expressed as a set of rules (the Weka sketch after the diff gives a minimal example of both families).

-Among the rule-based and decision-tree-based algorithms there is a great number of possibilities to work with, so we conducted a preselection phase to choose those which would yield the best results in the experiments. A reference to each Weka classifier can be found in \cite{witten2016data}. The top five techniques, obtained from the best results of the experiments done in this stage, are described below, along with more specific bibliography. The Naïve Bayes method \cite{Bayesian_Classifier_97}, normally used in text categorization problems, has been included as a baseline. According to the results, the five selected classifiers perform substantially better than this baseline.
+Among the rule-based and decision-tree-based algorithms there is a great number of possibilities to work with, so we conducted a preselection phase to choose those which would yield the best results in the experiments. A reference to each Weka classifier can be found in \cite{witten2016data}. The top five techniques, obtained from the best results of the experiments done in this stage, are described below, along with more specific bibliography. The Na\"{i}ve Bayes method \cite{Bayesian_Classifier_97}, normally used in text categorization problems, has been included as a baseline. According to the results, the five selected classifiers perform substantially better than this baseline.

\begin{description}
-\item[Naïve Bayes] This is the classification technique that we have used as a reference, for both its simplicity and its ease of understanding. It relies on Bayes' Theorem and on the possibility of representing the relationship between two random variables as a Bayesian network \cite{rish2001empirical}. Then, by assigning values to the variables' probabilities, the probabilities of the occurrences between them can be obtained. Thus, assuming that the attributes are independent of one another, and applying Bayes' Theorem, patterns can be classified without building trees or rules, just by calculating probabilities (see the decision-rule sketch after the diff).
+\item[Na\"{i}ve Bayes] This is the classification technique that we have used as a reference, for both its simplicity and its ease of understanding. It relies on Bayes' Theorem and on the possibility of representing the relationship between two random variables as a Bayesian network \cite{rish2001empirical}. Then, by assigning values to the variables' probabilities, the probabilities of the occurrences between them can be obtained. Thus, assuming that the attributes are independent of one another, and applying Bayes' Theorem, patterns can be classified without building trees or rules, just by calculating probabilities (see the decision-rule sketch after the diff).
\item[J48] This classifier generates a pruned or unpruned C4.5 decision tree. First described in 1993 by \cite{Quinlan1993}, this machine learning method builds a decision tree by selecting, for each node, the best attribute for splitting, and creating the next nodes from it. An attribute is selected as `the best' by evaluating the difference in entropy (information gain) resulting from choosing that attribute for splitting the data (the information-gain sketch after the diff makes this precise). In this way, the tree continues to grow until no attributes remain for further splitting, meaning that the resulting nodes contain instances of single classes.
\item[Random Forest] This manner of building a decision tree can be seen as a randomization of the previous C4.5 process. It was introduced by \cite{Breiman2001} and consists of the following: instead of choosing `the best' attribute, the algorithm randomly chooses one from a group of the top-ranked attributes. The size of this group is customizable in Weka.
\item[REP Tree] This is another kind of decision tree; the name stands for Reduced Error Pruning Tree. Originally proposed by \cite{Quinlan1987}, this method builds a decision tree using information gain, like C4.5, and then prunes it using reduced-error pruning. This means that the training dataset is divided into two parts: one devoted to growing the tree and another to pruning it. Every subtree (i.e., every node that is not a leaf) is a candidate to be replaced by the best possible leaf, and the pruning set is then used to test whether the replacement has improved the results. A deep analysis of this technique and its variants can be found in \cite{Elomaa2001}.
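
To ground the divide-and-conquer vs.\ separate-and-conquer distinction drawn in the chapter, here is a minimal Java sketch against Weka's standard API. The file name byod.arff and the choice of JRip as the separate-and-conquer example are illustrative assumptions, not taken from the chapter.

import weka.classifiers.Classifier;
import weka.classifiers.rules.JRip;   // separate-and-conquer rule learner
import weka.classifiers.trees.J48;    // divide-and-conquer decision tree
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class ClassifierSketch {
    public static void main(String[] args) throws Exception {
        // Load an ARFF dataset mixing numeric and nominal attributes;
        // "byod.arff" is a placeholder file name, not from the original text.
        Instances data = new DataSource("byod.arff").getDataSet();
        data.setClassIndex(data.numAttributes() - 1); // class = last attribute

        // Tree-based ("divide-and-conquer") model.
        Classifier tree = new J48();
        tree.buildClassifier(data);

        // Rule-based ("separate-and-conquer") model: rules are learned one at a
        // time and the instances they cover are removed before the next rule.
        Classifier rules = new JRip();
        rules.buildClassifier(data);

        System.out.println(tree);   // prints the induced tree
        System.out.println(rules);  // prints the induced rule set
    }
}

Both learners implement the same Classifier interface, which is what makes a preselection phase over many Weka algorithms cheap to run.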
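
As a worked sketch of the Naïve Bayes decision rule referenced in the list: the notation ($x_i$ for attribute values, $C$ for the class set) is ours, not the chapter's, and \texttt{amsmath} is assumed.

% The evidence P(x_1,...,x_n) is constant across classes, so it can be
% dropped from the arg max; the second equality uses the independence
% assumption described in the list item above.
\begin{equation*}
  \hat{c} = \operatorname*{arg\,max}_{c \in C} P(c \mid x_1,\dots,x_n)
          = \operatorname*{arg\,max}_{c \in C} P(c) \prod_{i=1}^{n} P(x_i \mid c)
\end{equation*}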
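
Similarly, the `best attribute' criterion used by J48 can be stated precisely; below is a sketch of the standard entropy and information-gain definitions, again in our own notation rather than the chapter's.

% S is a dataset with classes c_1,...,c_k; p_j is the fraction of S in
% class c_j; S_v is the subset of S whose value of attribute A is v.
\begin{align*}
  H(S) &= -\sum_{j=1}^{k} p_j \log_2 p_j\\
  \mathit{Gain}(S,A) &= H(S) - \sum_{v \in \mathrm{values}(A)} \frac{|S_v|}{|S|}\, H(S_v)
\end{align*}

J48 splits on the attribute with the largest gain; note that C4.5 by default normalises this into a gain ratio to avoid favouring many-valued attributes.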
