How is root node value chosen in regression decision tree?

Question

Christiana Sasser on 8 Sep 2020


Direct link to this question

https://se.mathworks.com/matlabcentral/answers/590758-how-is-root-node-value-chosen-in-regression-decision-tree

Answered: Ayush Aniket on 4 Jun 2025

I understand the criteria for node splitting and how the root node variable is chosen, but I do not understand how the actual value for the inequality at the root node is chosen. Is it just local optimization of the numbers? For example, I have a variety of whole number values ranging from 3 to 25 and the root node is choosing 9.5. This is not the median or mean, so why is this number chosen? Is it because the decision tree analyzed all potential values to see what had the lowest MSE to start with? If so, why did it choose a decimal number when all my data points are whole numbers?

Thank you for your help!


Answer 1

Ayush Aniket on 4 Jun 2025


Direct link to this answer

https://se.mathworks.com/matlabcentral/answers/590758-how-is-root-node-value-chosen-in-regression-decision-tree#answer_1565883


The split value at the root node in a decision tree is chosen by optimizing the split criterion, so it is not necessarily the median or mean. Decision trees aim to minimize impurity (for classification) or reduce variance/MSE (for regression). The algorithm evaluates the candidate split points and selects the one that maximizes information gain or, for a regression tree, minimizes the error.

Why a Decimal Value Instead of Whole Numbers?

Even if your dataset contains only whole numbers, the tree considers midpoints between consecutive distinct values as potential split points.
For example, if your sorted values are {3, 5, 7, 9, 11, 13, ...}, the tree might evaluate splits at {4, 6, 8, 10, 12, ...}.
The split at 9.5 means the algorithm found that separating values below 9.5 from those above 9.5 resulted in the best reduction in impurity or error. Since 9.5 is the midpoint of 9 and 10, this suggests that both 9 and 10 occur in your data and that the best partition falls between them.
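The search described above can be sketched in a few lines of plain MATLAB. This is only an illustration of the idea under stated assumptions, not the toolbox's actual implementation: the data in x and y are made up, and the variable names (candidates, sse, bestCut) are hypothetical.

```matlab
% Toy data, made up for illustration: whole-number predictor, numeric response.
x = [3 5 7 9 10 12 15 20 25]';
y = [1.0 1.2 0.9 1.1 3.0 3.2 2.9 3.1 3.3]';

% Candidate thresholds: midpoints between consecutive distinct values of x.
u = unique(x);                              % sorted distinct values
candidates = (u(1:end-1) + u(2:end)) / 2;   % 4, 6, 8, 9.5, 11, ...

% Score each candidate by the total squared error around the two child means.
sse = zeros(size(candidates));
for k = 1:numel(candidates)
    left  = y(x <  candidates(k));
    right = y(x >= candidates(k));
    sse(k) = sum((left - mean(left)).^2) + sum((right - mean(right)).^2);
end

% The chosen threshold is the candidate with the smallest error.
[~, best] = min(sse);
bestCut = candidates(best);
disp(bestCut)   % displays 9.5 for this toy data
```

Here every x value is a whole number, yet the winning threshold is 9.5, because the response jumps between x = 9 and x = 10 and the candidate between them is their midpoint.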

In MATLAB, you can visualize the tree using:

view(SVModelTree, 'Mode', 'graph');

Refer to the following documentation to learn more about the viewing options: https://www.mathworks.com/help/stats/view-decision-tree.html
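If you would rather read the root threshold as a number than from the graph, the fitted tree object exposes it. The following is a minimal sketch, assuming the Statistics and Machine Learning Toolbox is installed; the toy data and the names tree, rootCut and rootVar are made up for illustration. MinParentSize is lowered only because the toy data set is too small to split under the default setting.

```matlab
% Toy data, made up for illustration.
x = [3 5 7 9 10 12 15 20 25]';
y = [1.0 1.2 0.9 1.1 3.0 3.2 2.9 3.1 3.3]';

% Fit a regression tree restricted to a single split (the root node only).
tree = fitrtree(x, y, 'PredictorNames', {'x'}, ...
    'MinParentSize', 2, 'MaxNumSplits', 1);

% Node 1 is the root: CutPredictor holds the variable, CutPoint the threshold.
rootVar = tree.CutPredictor{1};
rootCut = tree.CutPoint(1);
fprintf('Root node splits on %s at %g\n', rootVar, rootCut);
```

Running the same two property lookups on your own tree (SVModelTree in the call above) should return the 9.5 you see in the graph.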
