Decision stream is a statistics-based supervised learning technique that generates a deep directed acyclic graph of decision rules to solve classification and regression tasks. This decision-tree-based method avoids data exhaustion in terminal nodes by merging leaves from the same or different levels of the predictive model.
Decision stream provides:
– High accuracy, due to precise splitting of data with unpaired two-sample test statistics.
– Reduced overfitting, because data is partitioned only into statistically representative groups.
– Reduced complexity at every level of the predictive model.
– Self-regulated depth of the predictive model.
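The splitting criterion named in the list above can be illustrated with a minimal sketch. It assumes a single numeric feature, a numeric target, and Welch's t statistic as the unpaired two-sample test; the source does not say which test is used, so that choice is an assumption. The names `welch_t`, `best_split`, `t_crit` and `min_size` are hypothetical and not taken from any decision stream implementation.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Absolute Welch t statistic for two samples with at least 2 values each."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0.0:
        # Both samples are constant: identical means give 0, otherwise infinity.
        return 0.0 if mean(a) == mean(b) else float("inf")
    return abs(mean(a) - mean(b)) / se


def best_split(xs, ys, t_crit=2.0, min_size=2):
    """Return (threshold, t) for the most significant split, or None.

    t_crit is an illustrative cut-off (an assumption), not a calibrated
    critical value for a chosen significance level.
    """
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(min_size, len(pairs) - min_size + 1):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between equal feature values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        t = welch_t(left, right)
        # Keep only splits whose two sides differ significantly.
        if t >= t_crit and (best is None or t > best[1]):
            best = ((pairs[i - 1][0] + pairs[i][0]) / 2, t)
    return best


xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.1, 0.9, 5.0, 5.1, 4.9]
result = best_split(xs, ys)
print(result[0] if result else "no significant split")
```

In this sketch a node is split only when the two resulting groups differ significantly, so a node whose target values look homogeneous is left unsplit. That is one way to read the "statistically representative groups" item.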
Unlike the classical decision tree approach, this method builds a predictive model with a high degree of connectivity by merging statistically indistinguishable nodes at each iteration. The key advantage of decision stream is the efficient use of every node, taking into account all fruitful feature splits. With the same number of nodes, it reaches greater depth than a decision tree, because it splits and merges the data multiple times with different features. The predictive model grows until no further improvement is achievable, considering different recombinations of the data. The result is a deep directed acyclic graph in which decision branches are loosely split and merged like natural streams of a waterfall. Decision stream supports the generation of extremely deep graphs that can consist of hundreds of levels.
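The merging step described above can be sketched as follows. This is an illustration under stated assumptions, not the published algorithm: each leaf is represented only by its list of numeric target values, similarity is measured with Welch's t statistic (an assumed choice of test), and the closest pair is merged greedily. The names `welch_t`, `merge_similar` and `t_crit` are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Absolute Welch t statistic for two samples with at least 2 values each."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0.0:
        return 0.0 if mean(a) == mean(b) else float("inf")
    return abs(mean(a) - mean(b)) / se


def merge_similar(leaves, t_crit=2.0):
    """Greedily merge leaves whose target samples are not significantly different.

    t_crit is an illustrative cut-off (an assumption), not a calibrated
    critical value.
    """
    groups = [list(leaf) for leaf in leaves]
    while len(groups) > 1:
        # Find the pair of groups with the smallest test statistic.
        t, i, j = min(
            (welch_t(groups[i], groups[j]), i, j)
            for i in range(len(groups))
            for j in range(i + 1, len(groups))
        )
        if t >= t_crit:
            break  # every remaining pair is statistically distinguishable
        groups[i] = groups[i] + groups[j]
        del groups[j]
    return groups


leaves = [[1.0, 1.1, 0.9], [5.0, 5.2, 4.8], [1.05, 0.95, 1.0]]
merged = merge_similar(leaves)
print(len(merged))  # number of groups left after merging
```

Merging gives the combined node more samples than either leaf had alone, which is how the method counters data exhaustion in deep levels. In a full model the merged node would also keep edges from all of its former parents, and that is what turns the tree into a directed acyclic graph.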
- Open-source implementation of decision stream for classification and regression
- Rule based decision stream in Neo4j
- IEEE ICTAI 2017 presentation