Plot a curve that splits data into two sets

RPatel on 4 Aug 2017
Commented: Image Analyst on 14 Aug 2017
Hello,
I have data points which represent 2 classes (collisions avoided and probable collisions). My goal is to plot a curve (a polynomial equation) that would split the data points in a chosen ratio (say, 90% collisions avoided to 10% probable collisions). Note that the data points of the two classes lie very close together.
I have tried using the 'fit' function in MATLAB, and for a polynomial of degree 8, here is what I get (see image). But it doesn't split the data as required.
I am looking at Support Vector Machines for Binary Classification (I am not an expert in this domain), but I am not sure whether it would help. How can I get the data segregation I want?
Best,
Raj

Answers (3)

Greg Heath on 4 Aug 2017
Your data is extremely discontinuous. The best you can hope for is a decision tree.
Hope this helps
Thank you for formally accepting my answer
Greg
  2 Comments
RPatel on 14 Aug 2017
Thanks Greg for your suggestion, but it will not help my study...
Image Analyst on 14 Aug 2017
Too bad, because I think that's your best shot at a possible solution. Since your data overlaps so much, I think that those two parameters are not enough to do the discrimination. You'd best try to look for a third or fourth parameter, like acceleration, velocity vector angles, or something. If you can't, then I think a treebagger/random forest/decision tree type of approach is the best you can hope for, like Greg said. See the scatterplot example on https://www.mathworks.com/help/stats/ensemble-methods.html#bsx62vu . Actually, your ad hoc convex hull example is somewhat related to a treebagger type of solution. It also sounds a bit like DBSCAN: https://en.wikipedia.org/wiki/DBSCAN
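
For readers unfamiliar with the decision tree idea mentioned above, here is a minimal sketch of its basic building block: a single axis-aligned split chosen by Gini impurity. It is written in plain Python rather than MATLAB so that it is self-contained. The function names `gini` and `best_split` are hypothetical, and the sketch is not how `TreeBagger` or `fitctree` are implemented. A full tree would repeat this search recursively on each side of the split.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 means a pure group)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split(X, y):
    """Try every feature and every midpoint between consecutive values.

    Returns (feature_index, threshold, weighted_impurity) for the split with
    the lowest weighted Gini impurity, or (None, None, gini(y)) if no split
    improves on the unsplit data.
    """
    best = (None, None, gini(y))
    n = len(y)
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [lab for row, lab in zip(X, y) if row[j] <= t]
            right = [lab for row, lab in zip(X, y) if row[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best[2]:
                best = (j, t, score)
    return best
```

With heavily overlapping classes, as in the question, the best achievable impurity stays well above zero, which is one way to see why extra parameters may be needed.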



John D'Errico on 4 Aug 2017
But why would a polynomial regression fit have any chance of satisfying this goal? It would be pure random chance if it came even close. It is especially wrong to hope that such a fit, based purely on distance as the independent variable, would have a chance.
It seems you are looking for a nonlinear discriminant curve, based on both velocity and distance. I'd suggest neural nets, but just because you want to see a 90% success rate does not mean any such function exists. You could just as easily have insisted on a 99.99% target success rate. If wishes were horses, beggars would ride.
What you need to be modeling is a boolean result, thus collision or not, as a function of TWO independent variables, vehicle velocity and inter-vehicle distance. Again, use a tool of your choice. But a polynomial regression is still NOT the tool I would ever advise here.
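
As a rough illustration of modeling a boolean outcome as a function of two variables, here is a minimal sketch using logistic regression. This is a simpler stand-in chosen for brevity, not the neural net suggested above, and it yields a straight-line boundary rather than a nonlinear one. It is plain Python, the function names are hypothetical, and both inputs are assumed to be scaled to roughly the range 0 to 1. Labels are 1 for a probable collision and 0 for a collision avoided.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit P(collision) = sigmoid(w0 + w1*velocity + w2*distance).

    Uses plain batch gradient descent on the mean log-loss.
    X is a list of (velocity, distance) pairs, y a list of 0/1 labels.
    """
    w = [0.0, 0.0, 0.0]
    n = len(X)
    for _ in range(epochs):
        grad = [0.0, 0.0, 0.0]
        for (v, d), label in zip(X, y):
            err = sigmoid(w[0] + w[1] * v + w[2] * d) - label
            grad[0] += err
            grad[1] += err * v
            grad[2] += err * d
        w = [wi - lr * gi / n for wi, gi in zip(w, grad)]
    return w


def predict_proba(w, velocity, distance):
    """Predicted probability of a collision at the given point."""
    return sigmoid(w[0] + w[1] * velocity + w[2] * distance)


def boundary_distance(w, velocity, p):
    """Distance at which the predicted probability equals p (needs w[2] != 0)."""
    logit = math.log(p / (1.0 - p))
    return (logit - w[0] - w[1] * velocity) / w[2]
```

Sweeping `p` in `boundary_distance` traces a family of boundary lines, one per probability level. Note that `p` is a predicted probability, not a guaranteed class ratio on either side of the line, so nothing ensures that a boundary with a chosen 90/10 split exists for overlapping data.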
  1 Comment
RPatel on 4 Aug 2017
Hello John,
I have never had to do something of this sort before, and I have no idea about the diverse tools MATLAB offers to solve this kind of issue. 'Polynomial regression fit' is just one that I came across and tried.
Indeed, I would like to have a different curve for each target success rate (90%, 99%, etc.).
I will have a look at neural nets to see if it helps. Thanks for your comments :)



RPatel on 14 Aug 2017
As there doesn't seem to be any solution to this, here is what I did:
I found the centroid and chose the x% of points closest to it. Then I plotted a convex hull around those points. Next, I checked whether a particular point of interest lies inside or outside the convex hull. Using this, I can get the ratio of collisions avoided to probable collisions (for the points inside the hull).
Hope this helps others who might face a similar situation...
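
The steps described above can be sketched roughly as follows. This is plain Python, not the poster's actual code (in MATLAB, `convhull` and `inpolygon` would cover the hull and the inside test). The function names are hypothetical. The sketch assumes the centroid is taken over all points of both classes, which the post does not specify, and that the selected points are not all collinear.

```python
import math


def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def in_hull(hull, p):
    """True if p lies inside or on the boundary of a counter-clockwise convex polygon."""
    n = len(hull)
    return all(cross(hull[i], hull[(i + 1) % n], p) >= 0 for i in range(n))


def hull_class_ratio(points, labels, fraction):
    """Hull around the `fraction` of points nearest the centroid.

    Assumption: the centroid is computed over all points, both classes.
    Returns (hull, share of the points inside the hull whose label is True).
    """
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    by_distance = sorted(points, key=lambda p: math.hypot(p[0] - cx, p[1] - cy))
    k = max(3, round(fraction * len(points)))  # a hull needs at least 3 points
    hull = convex_hull(by_distance[:k])
    inside = [lab for p, lab in zip(points, labels) if in_hull(hull, p)]
    return hull, sum(inside) / len(inside)
```

Here `labels` holds `True` for "collision avoided". Calling `hull_class_ratio` with increasing values of `fraction` gives the share of avoided collisions inside progressively larger hulls, which is one way to look for a region with a chosen ratio.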
