🖍️
gitbook_docs
  • Introduction
  • Machine Learning
    • Recommended Courses
      • For Undergrad Research
      • Math for Machine Learning
    • ML Notes
      • Covariance Correlation
      • Feature Selection
      • Linear Regression
      • Entropy, Cross-Entropy, KL Divergence
      • Bayesian Classifier
        • Terminology Review
        • Bayesian Classifier for Normally Distributed classes
      • Linear Discriminant Analysis
      • Logistic Regression
        • Logistic Regression Math
      • Logistic Regression-MaximumLikelihood
      • SVM
        • SVM concept
        • SVM math
      • Cross Validation
      • Parameter, Density Estimation
        • MAP, MLE
        • Gaussian Mixture Model
      • E-M
      • Density Estimation(non-parametric)
      • Unsupervised Learning
      • Clustering
      • kNN
      • WaveletTransform
      • Decision Tree
    • Probability and Statistics for Machine Learning
      • Introduction
      • Basics of Data Analysis
      • Probability for Discrete Random Variable
      • Poisson Distribution
      • Chi-Square Distribution
      • P-value and Statistical Hypothesis
      • Power and Sample Size
      • Hypothesis Test Old
      • Hypothesis Test
      • Multi Armed Bandit
      • Bayesian Inference
      • Bayesian Updating with Continuous Priors
      • Discrete Distribution
      • Comparison of Bayesian and frequentist inference
      • Confidence Intervals for Normal Data
      • Frequenist Methods
      • Null Hypothesis Significance Testing
      • Confidence Intervals: Three Views
      • Confidence Intervals for the Mean of Non-normal Data
      • Probabilistic Prediction
  • Industrial AI
    • PHM Dataset
    • BearingFault_Journal
      • Support Vector Machine based
      • Autoregressive(AR) model based
      • Envelope Extraction based
      • Wavelet Decomposition based
      • Prediction of RUL with Deep Convolution Nueral Network
      • Prediction of RUL with Information Entropy
      • Feature Model and Feature Selection
    • TempCore Journal
      • Machine learning of mechanical properties of steels
      • Online prediction of mechanical properties of hot rolled steel plate using machine learning
      • Prediction and Analysis of Tensile Properties of Austenitic Stainless Steel Using Artificial Neural
      • Tempcore, new process for the production of high quality reinforcing
      • TEMPCORE, the most convenient process to produce low cost high strength rebars from 8 to 75 mm
      • Experimental investigation and simulation of structure and tensile properties of Tempcore treated re
    • Notes
  • LiDAR
    • Processing of Point Cloud
    • Intro. 3D Object Detection
    • PointNet
    • PointNet++
    • Frustrum-PointNet
    • VoxelNet
    • Point RCNN
    • PointPillars
    • LaserNet
  • Simulator
    • Simulator List
    • CARLA
    • Airsim
      • Setup
      • Tutorial
        • T#1
        • T#2
        • T#3: Opencv CPP
        • T#4: Opencv Py
        • Untitled
        • T#5: End2End Driving
  • Resources
    • Useful Resources
    • Github
    • Jekyll
  • Reinforcement Learning
    • RL Overview
      • RL Bootcamp
      • MIT Deep RL
    • Textbook
    • Basics
    • Continuous Space RL
  • Unsupervised Learning
    • Introduction
  • Unclassified
    • Ethics
    • Conference Guideline
  • FPGA
    • Untitled
  • Numerical Method
    • NM API reference
Powered by GitBook
On this page
  • Decision Tree
  • Example (Process of Decsion Tree)
  • Entropy, Impurity
  • Impurity Indicator
  • Rish of Overfitting
  • Pruning
  • Summary
  • References

Was this helpful?

  1. Machine Learning
  2. ML Notes

Decision Tree

PreviousWaveletTransformNextProbability and Statistics for Machine Learning

Last updated 3 years ago

Was this helpful?

Decision Tree

  • 데이터를 분석하여 이들 사이에 존재하는 패턴을 예측 가능한 규칙들의 조합으로 나타냄.

  • 분류(Classification)와 회귀(Regression) 문제에 적용 가능.

  • 계산복잡성 대비 높은 예측 성능을 지님.

Example (Process of Decsion Tree)

  • Find the feature and criteria that best differentiate the data.

  • Q) How to findd the criteria that best differentiate the data?

    • -> Maximize the information gain. (Minimize the Impurity (Entropy))

Entropy, Impurity

  • 불순도 (Impruty)

    • 해당 범주 안에 서로 다른 데이터가 얼마나 섞여 있는가?

  • Entropy

    • 불순도를 수치적으로 나타낸 척도

Entropy=−∑i(pi)log⁡2(pi)Entropy = - \sum\nolimits_i {({p_i}){{\log }_2}({p_i})}Entropy=−∑i​(pi​)log2​(pi​)

Entropy Example

Impurity Indicator

  • Entropy Entropy(A)=∑i=1dRi(−∑k=1mpklog⁡2(pk))Entropy(A) = \sum\limits_{i = 1}^d {{R_i}\left( { - \sum\limits_{k = 1}^m {{p_k}{{\log }_2}({p_k})} } \right)}Entropy(A)=i=1∑d​Ri​(−k=1∑m​pk​log2​(pk​))

  • Gini-Index G.I(A)=∑i=1d(Ri(1−∑k=1mpik2))G.I(A) = \sum\limits_{i = 1}^d {\left( {{R_i}\left( {1 - \sum\limits_{k = 1}^m {{p_{ik}}^2} } \right)} \right)}G.I(A)=i=1∑d​(Ri​(1−k=1∑m​pik​2))

Rish of Overfitting

  • Decision trees tend to overfit on data with a large number of features. Getting the right ratio of samples to number of features is important, since a tree with few samples in high dimensional space is very likely to overfit.

  • We have to limit the maximum depth, the maximum number of nodes, minimum number of data for one node to split.

Pruning

Cost Function of Pruning

  • 𝐶𝐶(𝑇)=𝐸𝑟𝑟(𝑇)+𝛼×𝐿(𝑇)

    • 𝐶𝐶(𝑇): 의사결정나무의 비용 복잡도 (=오류가 적으면서 terminal node 수가 적은 단순한 모델일수록 작은 값)

    • 𝐸𝑅𝑅(𝑇): 검증 데이터에 대한 오분류율

    • 𝐿(𝑇): terminal node의 수 (구조 복잡도)

    • 𝐴𝑙𝑝ℎ𝑎: 𝐸𝑅𝑅(𝑇)와 𝐿(𝑇)를 결합하는 가중치 (사용자에 의해 부여, 보통 0.01~0.1의 값을 사용)

Summary

  • 의사결정나무는 한번에 하나씩의 설명변수를 사용하여 예측 가능한 규칙들의 집합을 생성하는 알고리즘.

    • 한번 분기 때마다 변수 영역을 두개로 구분한 뒤 각 영역의 순도가 증가(엔트로피가 감소)하는 방향으로 학습을 진행함.

    • 입력 변수 영역을 두 개로 구분하는 **재귀적 분기(recursive partioning)**와 구분된 영역을 통합하는 **가지치기(pruning)**의 두가지 과정으로 나뉨.

  • 의사결정나무는 계산복잡성 대비 높은 예측 성능을 내는 장점을 지님.

  • 하지만, 결정경계(decision boundary)가 데이터 축에 수직이어서 특정 데이터에만 잘 작동하는 한계가 있음.

    • 이러한 문제를 극복하기 위해 랜덤포레스트와 같은 기법을 사용.

    • 랜덤포레스트란?

      • 같은 데이터에 대해 의사결정나무를 여러 개 만들어서 그 결과를 조합해서 예측 성능을 높이는 기법

      • 동일한 데이터로부터 복원추출을 통해 30개 이상의 데이터셋을 만들어서 각각에 의사결정나무를 적욯안 뒤 학습 결과를 취합하는 방식으로 동작

References

  • Theoretical Explanation

    • https://bkshin.tistory.com/entry/%EB%A8%B8%EC%8B%A0%EB%9F%AC%EB%8B%9D-4-%EA%B2%B0%EC%A0%95-%ED%8A%B8%EB%A6%ACDecision-Tree

    • https://ratsgo.github.io/machine%20learning/2017/03/26/tree/

  • Python Implementation of Decision Tree

    • https://towardsdatascience.com/construct-a-decision-tree-and-how-to-deal-with-overfitting-f907efc1492d

  • Scikit-learn documentation

    • https://scikit-learn.org/stable/modules/tree.html

  • MATHLAB Example

    • fitctree & fitrtree (instruction & algorithm)

      • https://kr.mathworks.com/help/stats/fitctree.html

      • https://kr.mathworks.com/help/stats/fitrtree.html

    • View Decision Tree

      • https://kr.mathworks.com/help/stats/view-decision-tree.html

    • Prediction Using Classification and Regression Trees

      • https://kr.mathworks.com/help/stats/prediction-using-classification-and-regression-trees.html

Prevent overfitting using pruning.