Remove or replace trees from a TreeBagger ensemble

Artik Crazy on 9 Mar 2012
Hello,
There is a method, TreeBagger.combine, that combines two independently trained TreeBagger classifiers.
My problem is that I need to combine two ensembles while randomly replacing 25% of the trees in the first ensemble with trees from the second ensemble.
I can think of three possible solutions:
1. A method to remove specified trees from an ensemble. I would then remove a random 25% of the trees from the first ensemble and combine the result with the second ensemble.
2. A method to construct a new TreeBagger from a collection of trees, for example something like this:
TreeBagger=Construct(Tree1, Tree2, ...);
3. A method to replace trees in a TreeBagger ensemble. For example, I tried assigning to the tree list as I would to a field of a structure:
ensemble.Trees{i}=new_tree;
but that didn't work, because the property is private.
Please help, because I have made no progress with any of the three.

Accepted Answer

Ilya on 9 Mar 2012
Combining two objects would be hard. You can work around this by growing one big ensemble and treating parts of it as separate ensembles. Take a look at the 'trees' argument for PREDICT and similar methods.
This recipe is justified because every tree in a TreeBagger ensemble is grown independently of others. For prediction, trees are combined by simple averaging. This approach would not work for ensembles of other types.
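The averaging argument can be checked numerically. The following is a minimal sketch, not part of the original answer: it assumes equal tree weights (the default) and that the second output of predict holds the per-class scores. The variable names (sAll, sFirst, sSecond, maxDiff) and the ensemble size of 50 are illustrative only.

```matlab
% Sketch: with equal tree weights, averaging the class scores of two
% equal-sized halves should reproduce the scores of the full ensemble.
load ionosphere;                 % provides X (predictors) and Y (labels)
rng(1);                          % fix the seed so the run is repeatable
b = TreeBagger(50,X,Y);          % one ensemble of 50 trees

[~,sAll]    = predict(b,X);                 % scores from all 50 trees
[~,sFirst]  = predict(b,X,'trees',1:25);    % scores from trees 1..25
[~,sSecond] = predict(b,X,'trees',26:50);   % scores from trees 26..50

% Largest discrepancy between the full-ensemble scores and the mean of
% the two half-ensemble scores; it should be at floating-point level.
maxDiff = max(max(abs(sAll - (sFirst + sSecond)/2)));
```

The same identity is not expected to hold for oobPredict, because each observation is out-of-bag for a different number of trees in each half.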
Here is how I effectively grow two TreeBagger ensembles and then combine them:
% Load data
load ionosphere;
% Grow one ensemble of 200 trees and treat it as two ensembles of 100 trees each
b = TreeBagger(200,X,Y,'oobpred','on');
% Out-of-bag prediction from 1st ensemble
Yoob1 = oobPredict(b,'trees',1:100);
% Out-of-bag prediction from 2nd ensemble
Yoob2 = oobPredict(b,'trees',101:200);
% Randomly drop 25 trees from the 1st ensemble and combine the two
% ensembles.
idx1 = datasample(1:100,75,'replace',false);
Yoob3 = oobPredict(b,'trees',[idx1 101:200]);
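The same index trick can cover the case in the question, where 25% of the trees in the first ensemble are replaced by trees from the second and the result is applied to new data. This is a sketch under assumptions: Xnew is a stand-in for unlabeled data (here simply the first ten rows of X), and idxKeep, idxNew and idxMix are illustrative names.

```matlab
% Sketch: keep 75 random trees from the "first" half, add 25 random
% trees from the "second" half, then predict on new data.
load ionosphere;                     % provides X and Y
rng(1);                              % repeatable random selection
b = TreeBagger(200,X,Y);             % trees 1..100 and 101..200 play the two ensembles

idxKeep = randperm(100,75);          % 75 distinct trees out of 1..100
idxNew  = 100 + randperm(100,25);    % 25 distinct trees out of 101..200
idxMix  = [idxKeep idxNew];          % 100 trees in the mixed ensemble

Xnew = X(1:10,:);                    % ASSUMPTION: stand-in for unlabeled data
Ynew = predict(b,Xnew,'trees',idxMix);
```

All 200 trees still live in one object, so this changes which trees vote but not how much memory the ensemble occupies.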
1 Comment
Artik Crazy on 10 Mar 2012
Hi,
Thank you for your help. But my goal was to explicitly remove some trees from one ensemble and combine the rest with another ensemble. The 'Trees' argument works fine, but you still have to keep all the trees in memory.


More Answers (1)

Artik Crazy on 10 Mar 2012
I figured out that the solution is quite simple. A TreeBagger ensemble is actually a collection of regular trees, so I went to the TreeBagger class declaration and removed the 'NTrees' and 'Trees' properties from the list of private properties.
This way I can now work with the trees of an ensemble as a regular cell array. For example, to keep only the first 100 trees of an ensemble I write:
PrunedEnsemble=OriginalEnsemble;
PrunedEnsemble.Trees(101:end)=[];
PrunedEnsemble.NTrees=100;
NewResponse=predict(PrunedEnsemble, Predictors);
I tested this, and it gives the same predictions as calling:
NewResponse=predict(OriginalEnsemble, Predictors, 'Trees', 1:100);
It is important, however, to adjust the NTrees property accordingly.
I wonder why these properties were made private in the first place. Is there an issue I have not covered?
2 Comments
Ilya on 10 Mar 2012
Various properties of your PrunedEnsemble are now broken. For example, OOBIndices are the indices of observations that are out-of-bag for the grown trees. Tree 101 in OriginalEnsemble must use column 101 of OOBIndices. In PrunedEnsemble, this tree has index 1 and will use column 1 of OOBIndices instead. Because the tree list and OOBIndices no longer match, you cannot use oobPredict anymore. This is just one example; there are other broken properties.
If you must go down this road, I would recommend modifying CompactTreeBagger in a similar fashion. You should then execute COMPACT on a TreeBagger object before removing trees. The compact class is simpler, and there is less chance you will screw something up by these changes. Certainly, the PREDICT method will work fine, but I cannot vouch for everything else.
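For separately trained ensembles, a route that avoids editing the class files is to compact both ensembles, merge them with the combine method of the compact class, and then select trees with the 'trees' argument. This is a sketch only: it ASSUMES that combine places the trees of its first argument ahead of those of the second, which is worth verifying on your release, and the names b1, b2, c, idx and nTrees are illustrative.

```matlab
% Sketch: two independently trained ensembles, compacted and combined,
% then used for prediction with a chosen subset of trees.
load ionosphere;                          % provides X and Y
rng(1);                                   % repeatable training and selection
b1 = TreeBagger(100,X,Y);                 % first ensemble
b2 = TreeBagger(100,X,Y);                 % second ensemble

c = combine(compact(b1),compact(b2));     % compact ensemble holding all trees
nTrees = numel(c.Trees);                  % total number of trees after combining

% ASSUMPTION: trees 1..100 come from b1 and trees 101..200 from b2.
idx  = [randperm(100,75) 101:200];        % drop 25 random trees of b1, keep all of b2
Ynew = predict(c,X(1:10,:),'trees',idx);  % predict on a stand-in for new data
```

The compact objects do not carry the training data or out-of-bag information, so only PREDICT-style use is covered here, and the dropped trees are still stored in the combined object.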
Artik Crazy on 10 Mar 2012
Thanks for clearing this up for me!
Your answers are very helpful.
Actually, I'm not interested in the OOB functions. Once I've trained each ensemble separately, I only need the PREDICT method.
I use the combined ensemble on unlabeled data.
