How to remove multiple outlier data in a rectangular maze?

Atanu on 10 Jun 2022
Commented: Atanu on 16 Jun 2022
I have rodent x,y coordinates as shown in the attached picture. There are some outliers in the data that I need to remove. I have to dynamically define a zone based on the extreme coordinates in the quadrants.
For example, let's say I need to remove the outlier data circled in red. The data point is in Maze4. I have attached the data for Maze4. I want to remove the bins where histcounts2 is < 2. I also need the 'xcoordinates2' and 'ycoordinates2' arrays after cleaning the outliers. This is what I have tried so far:
h4 = histogram2(Maze4.xcoordinates2, Maze4.ycoordinates2, ...
nbins,'DisplayStyle','tile','ShowEmptyBins','on');
counts4 = histcounts2(Maze4.xcoordinates2, Maze4.ycoordinates2, 25);
index4 = h4.Values(counts4<2);
But index4 gives me a 1-D array. How do I solve this?

Accepted Answer

Image Analyst on 10 Jun 2022
First try to avoid the problem by not letting your rodents escape from your maze! 🐭🐹🐁🐀🤣
Then if you know your x coordinates must be between -80 and 80, use masking to extract only those good indexes:
% Find out which indexes are inside the -80 to 80 range.
mask = abs(x) <= 80;
% Extract only those good indexes.
x = x(mask);
y = y(mask);
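If the criterion is the bin count from the question (drop points that fall in bins with fewer than 2 counts) rather than a fixed x range, the extra outputs of histcounts2 give each point's bin, and that maps the 2-D counts back to a per-point mask. The following is only a minimal sketch on synthetic data: the x and y vectors here are made-up stand-ins for Maze4.xcoordinates2 and Maze4.ycoordinates2, and minCount is a hypothetical name for the threshold.

```matlab
% Synthetic stand-in for the maze data (assumption: the real x and y are
% numeric column vectors of equal length).
rng(0);
x = [10*rand(500,1); 50];    % 500 in-maze points plus one stray point
y = [10*rand(500,1); 50];

nbins = 25;                  % same bin count as in the question
minCount = 2;                % points in bins with fewer counts are dropped

% binX and binY give, for every point, the row and column of its bin in counts.
[counts, ~, ~, binX, binY] = histcounts2(x, y, nbins);

% Points that fall in no bin (for example NaN) get bin index 0; drop them too.
inBin = binX > 0 & binY > 0;
keep = false(size(x));
linearIdx = sub2ind(size(counts), binX(inBin), binY(inBin));
keep(inBin) = counts(linearIdx) >= minCount;

% Cleaned coordinate arrays, still paired point by point.
xCleaned = x(keep);
yCleaned = y(keep);
```

Because keep is a logical vector with one element per data point, it can index both coordinate arrays, which avoids the 1-D result obtained by indexing the counts matrix itself.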
  5 Comments
Image Analyst on 14 Jun 2022
How about this, where you take all x data between the 5% and 95% points?
s = load('maze4.mat');
% Extract the table
t = s.Maze4;
xOriginal = t.xcoordinates2;
yOriginal = t.ycoordinates2;
subplot(2, 2, 1)
plot(xOriginal, yOriginal, 'r.', 'MarkerSize', 8)
grid on
title('Original Data')
subplot(2, 2, 2)
[counts, edges] = histcounts(xOriginal, 100)
bar(edges(1:end-1), counts, 1)
grid on;
title('Histogram of X values')
% Cumulative distribution in percent (0 to 100).
theCDF = 100 * cumsum(counts) / sum(counts);
subplot(2, 2, 4)
plot(edges(1:end-1), theCDF, 'b-', 'LineWidth', 2)
title('CDF of Histogram')
grid on;
% Find the index of the 5% point
index1 = find(theCDF > 5, 1, 'first')
x1 = edges(index1)
xline(edges(index1), 'LineWidth', 2, 'Color', 'r')
% Find the index of the 95% point
index2 = find(theCDF > 95, 1, 'first')
x2 = edges(index2)
xline(edges(index2), 'LineWidth', 2, 'Color', 'r')
% Get a mask for the x values we want to keep
indexesToKeep = t.xcoordinates2 >= x1 & t.xcoordinates2 <= x2;
xCleaned = xOriginal(indexesToKeep);
yCleaned = yOriginal(indexesToKeep);
subplot(2, 2, 3)
plot(xCleaned, yCleaned, 'r.', 'MarkerSize', 8)
grid on
title('Cleaned Data')
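For comparison, the same kind of 5% to 95% cut on x can be written more compactly with isoutlier and its 'percentiles' method. This is a sketch under assumptions, not a replacement for the code above: the data is synthetic, and the cut points come from the sample percentiles rather than from a 100-bin histogram, so the thresholds will differ slightly.

```matlab
% Synthetic stand-in for the maze data (assumed, not the real Maze4 table).
rng(0);
x = [10*rand(500,1); 50; -40];   % two stray x values far outside the maze
y = [10*rand(500,1); 5; 5];

% Flag x values below the 5th or above the 95th percentile.
isOut = isoutlier(x, 'percentiles', [5 95]);

% Keep the x and y values of the points that were not flagged.
xCleaned = x(~isOut);
yCleaned = y(~isOut);
```

Like the histogram version, a percentile cut always discards roughly 10% of the points, whether or not they are true outliers, so the percentages may need to be narrowed for real data.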
Or else you can use a clustering algorithm like dbscan. Demo attached.
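The attached demo is not reproduced here. A minimal sketch of the dbscan idea might look like the following, assuming the Statistics and Machine Learning Toolbox is available. The data is synthetic, and the epsilon and minPts values are assumptions that would need tuning for the real maze coordinates.

```matlab
% Requires Statistics and Machine Learning Toolbox (dbscan).
% Synthetic stand-in for the maze data (assumed, not the real Maze4 table).
rng(0);
x = [10*rand(500,1); 50; -40];   % dense in-maze points plus two stray points
y = [10*rand(500,1); 50; 30];

epsilon = 2;   % neighborhood radius in coordinate units (assumed value)
minPts = 5;    % minimum neighbors for a core point (assumed value)

% dbscan labels each point with a cluster number; noise points get -1.
labels = dbscan([x, y], epsilon, minPts);

% Keep only the points that belong to some cluster.
keep = labels ~= -1;
xCleaned = x(keep);
yCleaned = y(keep);
```

Unlike the percentile cut, this removes only points that are isolated from the dense regions, so it does not trim a fixed fraction of the data.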
Atanu on 16 Jun 2022
Very elegant! Thank you so much!

Release

R2022a