GPU computing for machine learning (bagging / ensemble)

Question

0 votes

Hi, is bagging / ensemble learning supported by GPU computing in MATLAB?

I need to create a random forest that requires a lot of processing time, and I was wondering whether I can accelerate it using GPU computing.

Thanks


Answer 1

Chetan Rawal, 21 Aug 2015

1 vote

Yes, TreeBagger is supported and has built-in GPU support. Give it a try: http://www.mathworks.com/products/parallel-computing/builtin-parallel-support.html

You should be able to grow the trees on the GPU. You might then also gain further performance by aggregating your ensemble on multiple cores of the CPU using Parallel Computing Toolbox. I'd suggest profiling your code first to see whether this second step will help.

Chetan

1 comment

Sean de Wolski, 21 Aug 2015

It has built-in parallel support, not GPU support.

Answer 2

Ilya, 21 Aug 2015

0 votes

There is no GPU support for decision trees or their ensembles. If you work in a sufficiently recent release, decision trees are multithreaded. In addition, TreeBagger, as noted, has parallel support through Parallel Computing Toolbox.
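A minimal sketch of the parallel route mentioned above: the snippet grows a small bagged ensemble with `UseParallel` switched on through `statset`. The synthetic data, the tree count, and all variable names are illustrative assumptions, not taken from this thread, and the workers are only used when Parallel Computing Toolbox is installed.

```matlab
% Sketch: grow a bagged classification ensemble with tree growth
% distributed over parallel workers. All sizes are illustrative.
rng(1);                                        % reproducible synthetic data
nObs = 2000;
X = randn(nObs, 10);                           % 10 numeric predictors
Y = double(X(:,1) + 0.5*randn(nObs,1) > 0);    % binary labels, noisy signal

opts = statset('UseParallel', true);           % ask for parallel tree growth
numTrees = 50;                                 % assumed ensemble size
forest = TreeBagger(numTrees, X, Y, ...
    'Method', 'classification', ...
    'Options', opts);

% Predict a few rows; for classification the labels come back as a cell array.
labels = predict(forest, X(1:5, :));
disp(labels);
```

Timing this once with `'UseParallel'` set to `true` and once with `false` on a subsample of the real data is probably the quickest way to see whether the workers help.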

Can you tell us about your data size (number of observations and predictors) and time requirements? Have you already tried fitting a random forest and concluded it is too slow for your case?


Answer 3

Joe, 22 Aug 2015

0 votes

Hi,

Thanks for the comments. I am already using parallel computing with my 6 cores.

I have around 10 million observations, around 100 predictors, and very weak signals. I fitted a tree and it took 20 minutes. However, this is one piece of a big optimization, and I need to do a few thousand of them.

Any help appreciated!

1 comment

Ilya, 23 Aug 2015

Edited: Ilya, 23 Aug 2015

Try boosting. You don't say much about your data, so I can't recommend a specific boosting algorithm. Use at least a few dozen trees and play with the minimal leaf size ('minleaf') to obtain the optimal accuracy on an independent test set.

If you insist on using TreeBagger, likewise play with the minimal leaf size. By default, trees for random forest are split to a very fine level (minleaf=1 for classification and 5 for regression). You likely don't need such deep trees for 10M observations. Increasing the leaf size would speed up training and reduce the memory footprint of the ensemble a lot.
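To see the effect of the leaf size, a rough sketch on synthetic data could compare training time and average tree size for a few settings. The data, the candidate leaf sizes, and the tree count are assumptions chosen only to keep the run short. Recent releases spell the option `'MinLeafSize'`; older releases may expect the shorter spelling used in this thread.

```matlab
% Sketch: larger minimum leaf sizes give smaller trees and shorter training.
rng(2);
nObs = 20000;
X = randn(nObs, 20);
Y = X(:,1) + 2*randn(nObs, 1);        % regression target with a weak signal

leafSizes = [5 50 500];               % 5 is the regression default noted above
avgNodes  = zeros(size(leafSizes));
elapsed   = zeros(size(leafSizes));

for k = 1:numel(leafSizes)
    tic;
    forest = TreeBagger(20, X, Y, ...
        'Method', 'regression', ...
        'MinLeafSize', leafSizes(k));
    elapsed(k) = toc;
    % Each element of forest.Trees is a compact tree with a NumNodes property.
    avgNodes(k) = mean(cellfun(@(t) t.NumNodes, forest.Trees));
    fprintf('leaf size %4d: %6.2f s, %8.0f nodes per tree on average\n', ...
        leafSizes(k), elapsed(k), avgNodes(k));
end
```

The node count is a reasonable proxy for the memory footprint of the stored ensemble, since every node has to be kept for prediction.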

Boosting typically outperforms random forest in accuracy on large datasets. Boosting trees is also faster because you can keep the trees fairly shallow. The disadvantage is that you need to spend more time searching for optimal values of the boosting parameters, such as the minimal leaf size and, for some algorithms, the learning rate.
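The search described above could look roughly like the following sketch, which tries a few leaf sizes and scores each boosted ensemble on a held-out test set. The choice of LogitBoost, the learning rate of 0.1, the candidate leaf sizes, and the synthetic binary data are all assumptions made for illustration; none of them is a recommendation for the data in this thread. Newer releases offer `fitcensemble` for the same purpose.

```matlab
% Sketch: pick the minimal leaf size for boosted trees on an independent test set.
rng(3);
nObs = 5000;
X = randn(nObs, 10);
Y = double(X(:,1) - X(:,2) + 2*randn(nObs,1) > 0);   % binary labels, weak signal

cv = cvpartition(nObs, 'HoldOut', 0.3);              % 70% train, 30% test
Xtrain = X(training(cv), :);   Ytrain = Y(training(cv));
Xtest  = X(test(cv), :);       Ytest  = Y(test(cv));

leafSizes = [10 100 500];                            % assumed candidates
testErr   = zeros(size(leafSizes));

for k = 1:numel(leafSizes)
    t = templateTree('MinLeafSize', leafSizes(k));   % weak learner template
    mdl = fitensemble(Xtrain, Ytrain, 'LogitBoost', 50, t, ...
        'LearnRate', 0.1);                           % 50 boosting iterations
    testErr(k) = loss(mdl, Xtest, Ytest);            % misclassification rate
end

[~, best] = min(testErr);
fprintf('Lowest test error %.3f at leaf size %d\n', testErr(best), leafSizes(best));
```

A fuller search would loop over the learning rate and the number of trees as well, which is where the extra tuning time goes.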

If you are using R2015a or later, trees are multithreaded. Using parfor across local cores is not going to help you much and can even lead to a slowdown, because you would be consuming quite a bit more memory.
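A back-of-the-envelope calculation illustrates the memory point. It uses the figures quoted earlier in the thread (10 million observations, 100 predictors, 6 cores) and assumes that the predictors are stored as double precision and that every worker ends up holding its own full copy of the data.

```matlab
% Rough memory estimate for copying the data to each parallel worker.
nObs          = 10e6;     % observations (from the thread)
nPred         = 100;      % predictors (from the thread)
bytesPerValue = 8;        % assumption: double precision storage
nWorkers      = 6;        % one worker per core (from the thread)

dataGB  = nObs * nPred * bytesPerValue / 1e9;   % one copy of the predictor matrix
totalGB = dataGB * nWorkers;                    % one copy per worker

fprintf('One copy: %.1f GB, %d worker copies: %.1f GB\n', ...
    dataGB, nWorkers, totalGB);
% prints: One copy: 8.0 GB, 6 worker copies: 48.0 GB
```

Under these assumptions the worker copies alone exceed the memory of many workstations, before counting the trees themselves.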


Answer 4

Joe, 24 Aug 2015

0 votes

Thanks a lot Ilya!

What do you need to know about the data to be able to recommend a specific boosting algorithm?

1 comment

Ilya, 25 Aug 2015

Would you consider reading the doc, in particular this section, and then asking a specific question?
