chaidnode properties
Last updated: Oct 09, 2024
The CHAID node generates decision trees using chi-square statistics to identify
optimal splits. Unlike the C&R Tree and Quest nodes, CHAID can generate non-binary trees,
meaning that some splits have more than two branches. Target and input fields can be numeric range
(continuous) or categorical. Exhaustive CHAID is a modification of CHAID that does a more thorough
job of examining all possible splits but takes longer to compute.
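As a rough illustration of the statistics involved, the sketch below computes the Pearson and likelihood-ratio chi-square statistics for a small contingency table of counts (rows are candidate categories of an input field, columns are target categories). It is a plain-Python teaching example, not the node's internal implementation; the function names and the sample table are made up for illustration, and it assumes every row and column total is greater than zero.

```python
import math


def expected_counts(table):
    """Expected cell counts under independence of rows and columns."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = float(sum(row_totals))
    return [[r * c / total for c in col_totals] for r in row_totals]


def pearson_chi_square(table):
    """Pearson statistic: sum of (observed - expected)^2 / expected."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


def likelihood_ratio_chi_square(table):
    """Likelihood-ratio statistic: 2 * sum of observed * ln(observed / expected)."""
    expected = expected_counts(table)
    return 2.0 * sum(
        o * math.log(o / e)
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
        if o > 0  # cells with zero observations contribute nothing
    )


# Hypothetical counts: two input categories (rows) by two target categories (columns).
sample_table = [[30, 10], [10, 30]]
pearson = pearson_chi_square(sample_table)
likelihood_ratio = likelihood_ratio_chi_square(sample_table)
```

A larger statistic indicates a stronger association between the candidate split and the target; a table whose rows have identical proportions yields a statistic of zero.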
Example
stream = modeler.script.stream()
sourcenode = stream.findByID("id46WRP1285C")
node = stream.createAt("chaid", "My node", 200, 100)
stream.link(sourcenode, node)
node.setPropertyValue("custom_fields", True)
node.setPropertyValue("target", "Drug")
node.setPropertyValue("inputs", ["Age", "Na", "K", "Cholesterol", "BP"])
node.setPropertyValue("use_model_name", True)
node.setPropertyValue("model_name", "CHAID")
node.setPropertyValue("method", "Chaid")
node.setPropertyValue("model_output_type", "InteractiveBuilder")
node.setPropertyValue("use_tree_directives", True)
node.setPropertyValue("tree_directives", "Test")
node.setPropertyValue("split_alpha", 0.03)
node.setPropertyValue("merge_alpha", 0.04)
node.setPropertyValue("chi_square", "Pearson")
node.setPropertyValue("use_percentage", False)
node.setPropertyValue("min_parent_records_abs", 40)
node.setPropertyValue("min_child_records_abs", 30)
node.setPropertyValue("epsilon", 0.003)
node.setPropertyValue("max_iterations", 75)
node.setPropertyValue("split_merged_categories", True)
node.setPropertyValue("bonferroni_adjustment", True)
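When many properties are set at once, it can be convenient to keep the settings in one place and apply them in a loop. The following is a minimal sketch of that pattern. The helper `apply_properties` and the `FakeNode` class are hypothetical and exist only so the snippet runs outside SPSS Modeler; inside a Modeler script you would pass the node returned by `stream.createAt(...)` instead. The property names and values are taken from the example above.

```python
def apply_properties(node, settings):
    """Call node.setPropertyValue for each (name, value) pair, in order."""
    for name, value in settings:
        node.setPropertyValue(name, value)


class FakeNode(object):
    """Hypothetical stand-in for a Modeler node; it only records the values set."""

    def __init__(self):
        self.values = {}

    def setPropertyValue(self, name, value):
        self.values[name] = value


# A list of pairs keeps the order explicit, so a flag such as
# use_tree_directives is set before the property that depends on it.
chaid_settings = [
    ("method", "Chaid"),
    ("use_tree_directives", True),
    ("tree_directives", "Test"),
    ("split_alpha", 0.03),
    ("merge_alpha", 0.04),
    ("chi_square", "Pearson"),
    ("bonferroni_adjustment", True),
]

demo_node = FakeNode()
apply_properties(demo_node, chaid_settings)
```

Keeping the settings in a list also makes it easy to reuse the same configuration for several CHAID nodes in one script.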
Properties | Values | Property description |
---|---|---|
target | field | CHAID models require a single target and one or more input fields. You can also specify a frequency. See Common modeling node properties for more information. |
 | flag | |
 | | is used for very large datasets, and requires a server connection. |
 | | |
 | flag | |
 | string | |
 | | |
 | | |
 | integer | Maximum tree depth, from 0 to 1000. Used only if . |
 | flag | |
 | number | |
 | number | |
 | number | |
 | number | |
 | flag | |
 | structured | Structured property. |
 | number | Number of component models for boosting or bagging. |
 | | Default combining rule for categorical targets. |
 | | Default combining rule for continuous targets. |
 | flag | Apply boosting to very large data sets. |
split_alpha | number | Significance level for splitting. |
merge_alpha | number | Significance level for merging. |
bonferroni_adjustment | flag | Adjust significance values using Bonferroni method. |
split_merged_categories | flag | Allow resplitting of merged categories. |
chi_square | | Method used to calculate the chi-square statistic: Pearson or Likelihood Ratio. |
epsilon | number | Minimum change in expected cell frequencies. |
max_iterations | number | Maximum iterations for convergence. |
 | integer | |
 | number | |
 | flag | |
 | flag | |
 | flag | |
 | | |
 | integer | |
 | double | The algorithm internally separates records into a model building set and an overfit prevention set, which is an independent set of data records used to track errors during training in order to prevent the method from modeling chance variation in the data. Specify a percentage of records. The default is . |
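The idea of holding back a percentage of records as an overfit prevention set can be sketched as follows. This is only an illustration of the concept, not how the node performs the split internally; the function name, the seed argument, and the 30 percent figure used in the call are assumptions chosen for the example.

```python
import random


def split_for_overfit_prevention(records, overfit_pct, seed=None):
    """Return (model_building_set, overfit_prevention_set).

    overfit_pct is the percentage of records held back (hypothetical parameter).
    """
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_overfit = int(len(shuffled) * overfit_pct / 100.0)
    return shuffled[n_overfit:], shuffled[:n_overfit]


# Hypothetical data: ten record IDs, with 30 percent held back.
building_set, overfit_set = split_for_overfit_prevention(range(10), 30, seed=1)
```

Errors measured on the held-back records give an estimate that is independent of the records used to grow the tree.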