Decision Tree

  • Analyzes data and represents the patterns among observations as a combination of predictable rules.

  • Applicable to both classification and regression problems.

  • Delivers high predictive performance relative to its computational complexity.

Example (Process of a Decision Tree)

  • Find the feature and criteria that best differentiate the data.

  • Q) How do we find the criterion that best differentiates the data?

    • -> Maximize the information gain (equivalently, minimize the impurity, measured by entropy).
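The split-selection step above can be sketched in Python. This is a minimal illustration, not a library's internals; `entropy` and `information_gain` are helper names introduced here:

```python
import numpy as np

def entropy(labels):
    """Shannon entropy of a label array, in bits."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

def information_gain(labels, left_mask):
    """Entropy reduction achieved by splitting `labels` with a boolean mask."""
    n = len(labels)
    left, right = labels[left_mask], labels[~left_mask]
    child_entropy = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - child_entropy

# Toy data: the threshold x < 3 perfectly separates the two classes,
# so the gain equals the parent entropy (1 bit) and impurity drops to 0.
x = np.array([1, 2, 2, 4, 5, 6])
y = np.array([0, 0, 0, 1, 1, 1])
print(information_gain(y, x < 3))  # -> 1.0
```

A tree learner evaluates this gain for every candidate feature and threshold, and splits on the one with the highest value.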

Entropy, Impurity

  • Impurity

    • How mixed are the different classes within a given region?

  • Entropy

    • A numerical measure of impurity.

Entropy = - \sum\nolimits_i p_i \log_2(p_i)

Entropy Example
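The original example figure is not reproduced here; as an illustrative stand-in, evaluating the formula above at two extreme class distributions:

```latex
% Evenly mixed node, p_1 = p_2 = 0.5 (maximum impurity for two classes):
Entropy = -0.5\log_2(0.5) - 0.5\log_2(0.5) = 1

% Pure node, p_1 = 1 (no impurity):
Entropy = -1\log_2(1) = 0
```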

Impurity Indicator

  • Entropy: Entropy(A) = \sum\limits_{i = 1}^{d} R_i \left( - \sum\limits_{k = 1}^{m} p_{ik} \log_2(p_{ik}) \right)

  • Gini index: G.I(A) = \sum\limits_{i = 1}^{d} R_i \left( 1 - \sum\limits_{k = 1}^{m} p_{ik}^2 \right)

    • R_i: fraction of the samples that fall into region i of the split; p_{ik}: proportion of class k within region i.
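Both indicators can be evaluated directly from the formulas above. The sketch below (helper names are ours, not from a library) computes the weighted impurity over the regions of a split:

```python
import numpy as np

def node_entropy(labels):
    """Entropy of one region: -sum_k p_k log2(p_k)."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

def node_gini(labels):
    """Gini impurity of one region: 1 - sum_k p_k^2."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(1.0 - np.sum(p ** 2))

def split_impurity(regions, measure):
    """Weighted impurity over the d regions of a split; the weight R_i
    is the fraction of all samples that fall into region i."""
    n = sum(len(r) for r in regions)
    return sum(len(r) / n * measure(r) for r in regions)

# A split into one pure region and one evenly mixed region:
regions = [np.array([0, 0, 0, 0]), np.array([0, 0, 1, 1])]
print(split_impurity(regions, node_entropy))  # 0.5 * 0 + 0.5 * 1 = 0.5
print(split_impurity(regions, node_gini))     # 0.5 * 0 + 0.5 * 0.5 = 0.25
```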

Risk of Overfitting

  • Decision trees tend to overfit on data with a large number of features. Getting the right ratio of samples to number of features is important, since a tree with few samples in high dimensional space is very likely to overfit.

  • We therefore limit the maximum depth, the maximum number of nodes, and the minimum number of samples a node must contain before it can split.
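In scikit-learn (linked in the references below), these limits correspond to the `max_depth`, `max_leaf_nodes`, and `min_samples_split` parameters of `DecisionTreeClassifier`. A minimal sketch on the Iris data:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# An unconstrained tree keeps splitting until every leaf is pure (prone to
# overfit); capping depth and node counts regularizes it.
full = DecisionTreeClassifier(random_state=0).fit(X, y)
limited = DecisionTreeClassifier(
    max_depth=3,           # maximum depth of the tree
    max_leaf_nodes=8,      # maximum number of terminal nodes
    min_samples_split=10,  # minimum samples a node needs before it may split
    random_state=0,
).fit(X, y)

print(full.get_depth(), limited.get_depth())  # the limited tree is shallower
```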

Pruning

  • Pruning prevents overfitting by cutting back branches after the tree has been grown.

Cost Function of Pruning

  • ๐ถ๐ถ(๐‘‡)=๐ธ๐‘Ÿ๐‘Ÿ(๐‘‡)+๐›ผร—๐ฟ(๐‘‡)

    • ๐ถ๐ถ(๐‘‡): ์˜์‚ฌ๊ฒฐ์ •๋‚˜๋ฌด์˜ ๋น„์šฉ ๋ณต์žก๋„ (=์˜ค๋ฅ˜๊ฐ€ ์ ์œผ๋ฉด์„œ terminal node ์ˆ˜๊ฐ€ ์ ์€ ๋‹จ์ˆœํ•œ ๋ชจ๋ธ์ผ์ˆ˜๋ก ์ž‘์€ ๊ฐ’)

    • ๐ธ๐‘…๐‘…(๐‘‡): ๊ฒ€์ฆ ๋ฐ์ดํ„ฐ์— ๋Œ€ํ•œ ์˜ค๋ถ„๋ฅ˜์œจ

    • ๐ฟ(๐‘‡): terminal node์˜ ์ˆ˜ (๊ตฌ์กฐ ๋ณต์žก๋„)

    • ๐ด๐‘™๐‘โ„Ž๐‘Ž: ๐ธ๐‘…๐‘…(๐‘‡)์™€ ๐ฟ(๐‘‡)๋ฅผ ๊ฒฐํ•ฉํ•˜๋Š” ๊ฐ€์ค‘์น˜ (์‚ฌ์šฉ์ž์— ์˜ํ•ด ๋ถ€์—ฌ, ๋ณดํ†ต 0.01~0.1์˜ ๊ฐ’์„ ์‚ฌ์šฉ)

Summary

  • A decision tree is an algorithm that generates a set of predictable rules by using one explanatory variable at a time.

    • At each split, a variable's region is divided in two, and learning proceeds in the direction that increases the purity (decreases the entropy) of each region.

    • The procedure consists of two stages: **recursive partitioning**, which splits the input variable space in two, and **pruning**, which merges split regions back together.

  • Decision trees offer high predictive performance relative to their computational complexity.

  • However, because the decision boundary is always perpendicular to a data axis, they work well only on certain kinds of data.

    • Techniques such as random forests are used to overcome this limitation.

    • What is a random forest?

      • A technique that builds multiple decision trees on the same data and combines their outputs to improve predictive performance.

      • It draws 30 or more bootstrap (sampling-with-replacement) datasets from the original data, fits a decision tree to each, and aggregates the results.

References

  • Theoretical Explanation

    • https://bkshin.tistory.com/entry/%EB%A8%B8%EC%8B%A0%EB%9F%AC%EB%8B%9D-4-%EA%B2%B0%EC%A0%95-%ED%8A%B8%EB%A6%ACDecision-Tree

    • https://ratsgo.github.io/machine%20learning/2017/03/26/tree/

  • Python Implementation of Decision Tree

    • https://towardsdatascience.com/construct-a-decision-tree-and-how-to-deal-with-overfitting-f907efc1492d

  • Scikit-learn documentation

    • https://scikit-learn.org/stable/modules/tree.html

  • MATLAB Example

    • fitctree & fitrtree (instruction & algorithm)

      • https://kr.mathworks.com/help/stats/fitctree.html

      • https://kr.mathworks.com/help/stats/fitrtree.html

    • View Decision Tree

      • https://kr.mathworks.com/help/stats/view-decision-tree.html

    • Prediction Using Classification and Regression Trees

      • https://kr.mathworks.com/help/stats/prediction-using-classification-and-regression-trees.html
