How to use kmeans function on data stored by datastore function?

Ahmed Hamed, 29 April 2016
Edited: Josh Meyer, 17 July 2017
I'm trying to cluster big data using kmeans. I found some code that does something similar:
Mu = bsxfun(@times,ones(20,30),(1:20)'); % Gaussian mixture mean
rn30 = randn(30,30);
Sigma = rn30'*rn30; % Symmetric and positive-definite covariance
Mdl = gmdistribution(Mu,Sigma);
rng(1); % For reproducibility
X = random(Mdl,10000);
pool = parpool; % Invokes workers
stream = RandStream('mlfg6331_64'); % Random number stream
options = statset('UseParallel',1,'UseSubstreams',1,...
'Streams',stream);
tic; % Start stopwatch timer
[idx,C,sumd,D] = kmeans(X,20,'Options',options,'MaxIter',10000,...
'Display','final','Replicates',10);
toc % Terminate stopwatch timer
But as you can see, X is an in-memory double matrix.
My problem is that I have a file named HIS.csv, and I used the datastore function to read it as follows:
ds = datastore('HIS_all.csv', 'DatastoreType', 'tabulartext','TreatAsMissing', 'NA');
When I tried
[idx,C,sumd,D] = kmeans(ds,20,'Options',options,'MaxIter',10000, 'Display','final','Replicates',10);
I got the following error:
Undefined function 'isnan' for input arguments of type 'matlab.io.datastore.TabularTextDatastore'.
Error in kmeans (line 158)
wasnan = any(isnan(X),2);
Any suggestions?

Answers (1)

Josh Meyer, 15 July 2017
Edited: Josh Meyer, 17 July 2017
A datastore is just a framework for reading data in small chunks, so you can't call general-purpose functions such as kmeans directly on the datastore object. Instead, try converting the datastore into a tall array first:
T = tall(ds);
The kmeans function supports tall arrays, so once the data is in this form you can pass it to kmeans. Note that kmeans has some limitations when used on a tall array, so some of the name-value pairs you specified might not work. The limitations are outlined in the kmeans documentation, under its notes on tall arrays.
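Putting the question and this answer together, the workflow might look roughly like the sketch below. It is not a tested solution for HIS_all.csv: the file demo_his.csv, its three numeric columns, and k = 5 are hypothetical stand-ins so the example can run on its own. Two further assumptions are worth checking against the documentation for your release: that tall(ds) on a tabular text datastore yields a tall table that has to be converted to a numeric matrix (done here with table2array), and that the tall version of kmeans returns at most three outputs and does not accept the parallel 'Options' fields from the question.

```matlab
% Hypothetical stand-in for the real CSV file: 200 rows, 3 numeric columns.
rng(1);                                   % for reproducibility
writetable(array2table(randn(200, 3)), 'demo_his.csv');

% Same datastore call as in the question, pointed at the demo file.
ds = datastore('demo_his.csv', 'DatastoreType', 'tabulartext', ...
    'TreatAsMissing', 'NA');

T = tall(ds);            % tall table backed by the datastore
X = table2array(T);      % tall numeric matrix (assumes all columns are numeric)

% Assumption: only three outputs are requested and the parallel
% 'Options' structure is left out, since the tall version of kmeans
% is more restricted than the in-memory one.
k = 5;                   % hypothetical cluster count for the demo data
[idx, C, sumd] = kmeans(X, k, 'MaxIter', 10000, ...
    'Display', 'final', 'Replicates', 2);

% idx is tall (one entry per row of the file); gather brings it into memory.
idx = gather(idx);
```

For the real file, replace 'demo_his.csv' with 'HIS_all.csv', drop the writetable line, and set k to 20 as in the question. If some columns are not numeric, select only the numeric variables before converting the table.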
