is built greedily to avoid a combinatorial explosion in the number of possible trees. Which attribute to pick first? A good attribute splits the data into subsets that are (ideally) all positive or all negative. Each split compares a feature value against a threshold; on either side of it (greater or smaller), the predicted probability is the relative proportion of the classes among the training samples that land there. Keep splitting the resulting subsets as you go down the tree, node by node. With many trees combined, it is a good classifier. It can run distributed. [Each tree can start from a subset of the total training set.] It is a supervised classifier. |
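
The greedy choice of a single split can be sketched as follows. This is a minimal illustration and not a full tree builder: the notes do not say which purity measure is used, so Gini impurity is an assumption here, and the names `gini`, `class_proportions`, and `best_split` are hypothetical helpers made up for the example.

```python
from collections import Counter

def gini(labels):
    """Gini impurity (assumed purity measure): 0.0 when all labels agree."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def class_proportions(labels):
    """Relative class populations, used as the predicted probabilities."""
    n = len(labels)
    return {cls: c / n for cls, c in Counter(labels).items()}

def best_split(rows, labels):
    """Greedy step: try every (feature, threshold) pair, keep the purest split."""
    best = None
    n = len(rows)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue  # a split must send samples to both sides
            # Weighted average impurity of the two subsets
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    return best

# Toy data: feature 0 separates the classes, feature 1 does not.
rows = [[2.0, 7.0], [3.0, 1.0], [6.0, 8.0], [7.0, 2.0]]
labels = ["neg", "neg", "pos", "pos"]

score, feature, threshold, left, right = best_split(rows, labels)
print(feature, threshold, class_proportions(left), class_proportions(right))
```

Building a whole tree would apply `best_split` again to each resulting subset until the subsets are pure or some stopping rule is reached; an ensemble would repeat this on different subsets of the training set.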