Which toolboxes do I need to integrate MATLAB with Hadoop?

Adam Neuf on 10 Aug 2015
Edited: Adam Neuf on 18 Nov 2015
Hi, I am looking into integrating MATLAB with a Hadoop cluster. I have searched the website, but it isn't clear which toolboxes are actually necessary to do this. I know that MATLAB Compiler, Parallel Computing Toolbox, and MATLAB Distributed Computing Server (MDCS) are related, but I can't tell whether all, some, or none of them are required. Thanks.

Accepted Answer

Esther on 18 Nov 2015
Hi Adam,
To integrate MATLAB with a cluster (whether a Hadoop cluster or some other generic cluster), you need MATLAB Distributed Computing Server (MDCS).
Then to send mapreduce jobs to that Hadoop cluster from MATLAB, you'll need at minimum Parallel Computing Toolbox.
MATLAB Compiler is only required if you wish to package MapReduce-based algorithms for deployment to production Hadoop systems.
Required:
  • MATLAB, MDCS, Parallel Computing Toolbox
Optional:
  • MATLAB Compiler
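As a rough illustration of the "Required" setup, an interactive session that sends a mapreduce job to a Hadoop cluster might look like the sketch below. It assumes Parallel Computing Toolbox on the client and MDCS on the cluster. The Hadoop install folder, the HDFS input and output locations, and the `countRowsMapper` / `countRowsReducer` function names are placeholders chosen for this example, not values from this thread.

```matlab
% Sketch only: every path below is a hypothetical placeholder.
setenv('HADOOP_HOME', '/usr/lib/hadoop');   % assumed Hadoop install folder

% Cluster object for Hadoop (provided by Parallel Computing Toolbox).
cluster = parallel.cluster.Hadoop;

% Route subsequent mapreduce calls to the Hadoop cluster.
mr = mapreducer(cluster);

% Datastore over CSV files stored in HDFS (hypothetical location).
ds = datastore('hdfs:///data/input/*.csv');

% Run the job; results are written to the (hypothetical) HDFS output folder.
outds = mapreduce(ds, @countRowsMapper, @countRowsReducer, mr, ...
    'OutputFolder', 'hdfs:///data/output/rowcount');
result = readall(outds);

% --- Local functions (placed at the end of the script file) ---
function countRowsMapper(data, ~, intermKVStore)
    % Emit the number of rows in this chunk of the datastore.
    add(intermKVStore, 'RowCount', height(data));
end

function countRowsReducer(key, intermValIter, outKVStore)
    % Sum the per-chunk counts into a single total.
    total = 0;
    while hasnext(intermValIter)
        total = total + getnext(intermValIter);
    end
    add(outKVStore, key, total);
end
```

This cannot run without a reachable Hadoop cluster and a matching MDCS installation, so treat it as an outline of the moving parts rather than a tested configuration.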
  1 Comment
Adam Neufeldt on 18 Nov 2015
I ended up contacting MathWorks and had a phone call with one of their engineers. Here are my notes from that call:
There are two methods:
  • Method 1: Parallel Computing Toolbox (installed locally on each of our machines) together with MATLAB Distributed Computing Server (installed on the Hadoop cluster)
- This runs interactively in a live session. You can write and test code and run it immediately. It is almost identical to how you normally use MATLAB, except that you have the additional computing power of all of the cluster's cores and you write your algorithms as MapReduce.
  • Method 2: MATLAB Compiler
- This compiles analytics into a Hadoop-specific executable, which then runs on the cluster (so it is not interactive). Even with no toolboxes at all, you can still download data from the Hadoop cluster and write and test MapReduce algorithms on a small section of the cluster.
You can of course combine the two methods: test and debug your code interactively on the entire cluster using MDCS and Parallel Computing Toolbox, and then compile the code.
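To make the "write and test with no toolboxes" point concrete, here is a minimal sketch of prototyping a MapReduce algorithm in a plain local MATLAB session. The sample file, the `ArrDelay` column, and the `maxDelayMapper` / `maxDelayReducer` names are assumptions made up for this example. `mapreducer(0)` asks `mapreduce` to run serially in the local session.

```matlab
% Sketch: prototype a MapReduce algorithm locally on a small sample.
% File name, column name, and function names are illustrative assumptions.

% Write a tiny CSV so the example is self-contained.
csvFile = fullfile(tempdir, 'delays_sample.csv');
T = table([5; 12; -3; 20], 'VariableNames', {'ArrDelay'});
writetable(T, csvFile);

ds = datastore(csvFile, 'SelectedVariableNames', 'ArrDelay');

% Run mapreduce serially in the local MATLAB session.
mr = mapreducer(0);
outds = mapreduce(ds, @maxDelayMapper, @maxDelayReducer, mr, 'Display', 'off');

result = readall(outds);        % table with Key and Value columns
maxDelay = result.Value{1};     % largest delay in the sample

% --- Local functions (placed at the end of the script file) ---
function maxDelayMapper(data, ~, intermKVStore)
    % Emit the largest delay seen in this chunk of the datastore.
    add(intermKVStore, 'MaxArrDelay', max(data.ArrDelay));
end

function maxDelayReducer(key, intermValIter, outKVStore)
    % Combine the per-chunk maxima into a single maximum.
    best = -Inf;
    while hasnext(intermValIter)
        best = max(best, getnext(intermValIter));
    end
    add(outKVStore, key, best);
end
```

Once the mapper and reducer behave as expected on the sample, the same functions should be reusable on the cluster by swapping `mapreducer(0)` for a `mapreducer` built from a Hadoop cluster object and pointing the datastore at HDFS, which is the combined workflow described above.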
