# Is there a way to identify which dataset a value belongs to for overlapping datasets?

UH on 25 Mar 2024
Commented: UH on 25 Mar 2024
I have three datasets. Visually, `data a` has lower values than `data b` and `data c`. I used a box plot to compare them; it shows that the datasets differ, but they overlap. I demonstrate this in the code below:
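```matlab
clc; clear all; close all
% a, b and c are the three data vectors (the load step is not shown in the post)
figure
hold on
xlabel('index of points')
ylabel('data value')
plot(a,'.',DisplayName='data1')
plot(b,'.',DisplayName='data2')
plot(c,'.',DisplayName='data3')
legend show   % display the DisplayName entries set above
figure;
boxplot([a b c],'Notch','on','Labels',{'data1','data2','data3'})
grid on
```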
Given these datasets, I have a set of values, say [4 7 40 8 4], and I want to predict which dataset these values may belong to. Is there a way to do that? With only a very basic knowledge of statistics, I cannot come up with a solution. I found one solution in which a kernel density estimate (kde) was used for the comparison, but the data in that example were clearly separable. In my case the datasets overlap much more. Is there a way to make a prediction in this case? Please forgive my very basic knowledge; I would appreciate any suggestions.
```matlab
figure
hold on
% Kernel density estimate of each dataset, plotted on common axes
[fn,xfn,bwn] = kde(a);
plot(xfn,fn)
[fn,xfn,bwn] = kde(b);
plot(xfn,fn)
[fn,xfn,bwn] = kde(c);
plot(xfn,fn)
```
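One way to use kernel density estimates even when the datasets overlap is to evaluate each dataset's estimated density at the query values and pick the dataset whose density is highest there. The sketch below is only an illustration under stated assumptions: `a`, `b` and `c` are randomly generated stand-ins (not the data from this post), and `ksdensity` from the Statistics and Machine Learning Toolbox is swapped in for `kde` because it evaluates the density directly at given points.

```matlab
% Hypothetical stand-in data: three overlapping 90x1 samples (NOT the poster's data)
rng(0);                          % reproducible random numbers
a = 5  + 3*randn(90,1);
b = 12 + 6*randn(90,1);
c = 15 + 9*randn(90,1);

x = [4 7 40 8 4]';               % query values from the question

% Evaluate each dataset's kernel density estimate at the query values
dens = zeros(numel(x), 3);
dens(:,1) = ksdensity(a, x);
dens(:,2) = ksdensity(b, x);
dens(:,3) = ksdensity(c, x);

% The datasets have equal sizes, so the largest density is the most plausible class
[~, predictedClass] = max(dens, [], 2);
disp(table(x, predictedClass, dens))
```

Where the densities in a row are close to each other, the value lies in the overlap region and the assignment is correspondingly uncertain.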
##### 2 Comments
Jeff Miller on 25 Mar 2024
You might look into logistic regression and discriminant function analysis. These are both techniques for predicting category membership.
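As a rough illustration of the discriminant-analysis route, the sketch below fits a quadratic discriminant with `fitcdiscr` (Statistics and Machine Learning Toolbox) and reads off the posterior probability of each class for the query values. The vectors `a`, `b` and `c` are randomly generated stand-ins, not the data from this thread, and the quadratic type is an assumption chosen because the groups may differ in spread.

```matlab
% Hypothetical stand-in data: three overlapping 90x1 samples (NOT the poster's data)
rng(0);                          % reproducible random numbers
a = 5  + 3*randn(90,1);
b = 12 + 6*randn(90,1);
c = 15 + 9*randn(90,1);

data  = [a; b; c];
label = [ones(size(a)); 2*ones(size(b)); 3*ones(size(c))];   % 1 = a, 2 = b, 3 = c

% Quadratic discriminant: one Gaussian per class, each with its own variance
Mdl = fitcdiscr(data, label, 'DiscrimType', 'quadratic');

x = [4 7 40 8 4]';               % query values from the question
[predictedClass, posterior] = predict(Mdl, x);

% Each row of posterior holds the estimated P(class | value) for classes 1..3
disp(table(x, predictedClass, posterior))
```

With overlapping groups, the posterior columns are often more informative than the hard label, because they show how close the decision was.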
UH on 25 Mar 2024
Thank you for the idea. I am looking into these.


### Accepted Answer

Chunru on 25 Mar 2024
```
ans = '/users/mss.system.pbnsl/dataset.mat'
```

```matlab
figure
hold on
xlabel('index of points')
ylabel('data value')
plot(a,'.',DisplayName='data1')
plot(b,'.',DisplayName='data2')
plot(c,'.',DisplayName='data3')
whos
```

```
  Name              Size            Bytes  Class             Attributes

  a                90x1               720  double
  ans               1x35               70  char
  b                90x1               720  double
  c                90x1               720  double
  cmdout            1x33               66  char
  gdsCacheDir       1x14               28  char
  gdsCacheFlag      1x1                 8  double
  i                 0x0                 0  double
  managers          1x0                 0  cell
  managersMap       0x1                 8  containers.Map
```

```matlab
figure;
boxplot([a b c],'Notch','on','Labels',{'data1','data2','data3'})
grid on
```
```matlab
x = [4 7 40 8 4]';   % query values to classify

% K-nearest-neighbour (KNN) classification
data = [a; b; c];
label = [ones(size(a)); 2*ones(size(b)); 3*ones(size(c))];   % 1 = a, 2 = b, 3 = c
Mdl = fitcknn(data, label, "NumNeighbors", 80);   % larger number of neighbours
predictedClass = predict(Mdl, x)   % predicted class
```

```
predictedClass = 5x1
     1
     1
     3
     2
     1
```
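Because the datasets overlap, it may help to look at how decisive each KNN prediction is and how well the classifier performs overall. The sketch below is a hedged illustration, not part of the original answer: it uses randomly generated stand-in vectors for `a`, `b` and `c`, requests the second output of `predict` (the per-class scores), and estimates the misclassification rate with 10-fold cross-validation. The fold count is an arbitrary choice.

```matlab
% Hypothetical stand-in data: three overlapping 90x1 samples (NOT the poster's data)
rng(0);                          % reproducible random numbers
a = 5  + 3*randn(90,1);
b = 12 + 6*randn(90,1);
c = 15 + 9*randn(90,1);

data  = [a; b; c];
label = [ones(size(a)); 2*ones(size(b)); 3*ones(size(c))];   % 1 = a, 2 = b, 3 = c
Mdl   = fitcknn(data, label, 'NumNeighbors', 80);

x = [4 7 40 8 4]';               % query values from the question
% score(i,j) is the estimated posterior probability that x(i) belongs to class j
[predictedClass, score] = predict(Mdl, x);
disp(table(x, predictedClass, score))

% 10-fold cross-validated misclassification rate of the classifier
CVMdl   = crossval(Mdl, 'KFold', 10);
cvError = kfoldLoss(CVMdl);
fprintf('Cross-validated error: %.2f\n', cvError);
```

Repeating the cross-validation for several values of `NumNeighbors` is one way to check whether 80 neighbours is a reasonable setting for a given dataset.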
##### 1 Comment
UH on 25 Mar 2024
Thank you for your answer. I will check further whether the most frequently occurring prediction can be taken as the predicted class, or whether I have to perform some additional analysis. This probably works. Thank you. Good day.
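If all five values are known to come from the same dataset (an assumption; the thread does not say so), taking the most frequent per-value prediction amounts to a simple majority vote. A minimal sketch using the predictions reported in the answer:

```matlab
% Per-value predictions reported in the accepted answer
predictedClass = [1 1 3 2 1]';

% mode returns the most frequent value and how often it occurs
[groupClass, nVotes] = mode(predictedClass);
fprintf('Majority class: %d (%d of %d votes)\n', ...
        groupClass, nVotes, numel(predictedClass));
% prints: Majority class: 1 (3 of 5 votes)
```

A majority vote ignores how confident each individual prediction was, so a narrow win such as 3 of 5 is only weak evidence for the chosen class.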


R2024a
