GEV Mixture Model (as opposed to GMM)

Andrew Feenan on 19 June 2023
Commented: Ayush Kashyap on 22 June 2023
Hi,
I'm looking for advice on using a GEV mixture model for clustering instead of a Gaussian Mixture Model (GMM), so that I can compare the results of the two. Is this feasible?
The data set I will be using is approximately 1 TB of time series data.

Answers (1)

Ayush Kashyap on 19 June 2023
If you have a good understanding of your data's underlying distribution, you can use a Generalized Extreme Value (GEV) mixture model for clustering. The GEV distribution is a continuous probability distribution with location, scale, and shape parameters, and it is frequently used to model extreme events such as storm intensities or peak flood levels. Like any mixture model, a GEV mixture assumes that your data come from several subpopulations, each with its own parameters, and it estimates those parameters (together with the mixing weights) from the data.
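For reference, the model described above can be written down as follows. The symbols are my own notation rather than anything from the question: \mu is the location, \sigma the scale, \xi the shape, \pi_k the mixing weights, and K the number of clusters. The case \xi = 0 is understood as the Gumbel limit, F(x) = exp(-exp(-(x-\mu)/\sigma)).

```latex
% Assumed notation: \mu location, \sigma > 0 scale, \xi shape, \pi_k mixing weights.
% GEV cumulative distribution function (for \xi \neq 0):
F(x;\mu,\sigma,\xi) = \exp\!\left\{-\left[1+\xi\,\frac{x-\mu}{\sigma}\right]^{-1/\xi}\right\},
\qquad 1+\xi\,\frac{x-\mu}{\sigma} > 0

% GEV density:
g(x;\mu,\sigma,\xi) = \frac{1}{\sigma}\, t(x)^{\xi+1}\, e^{-t(x)},
\qquad t(x) = \left[1+\xi\,\frac{x-\mu}{\sigma}\right]^{-1/\xi}

% Mixture density with K components:
f(x) = \sum_{k=1}^{K} \pi_k\, g(x;\mu_k,\sigma_k,\xi_k),
\qquad \pi_k \ge 0,\quad \sum_{k=1}^{K}\pi_k = 1
```

Clustering then means assigning each observation to the component k with the largest posterior weight \pi_k g(x;\mu_k,\sigma_k,\xi_k) / f(x).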
Compared with Gaussian mixture models, GEV mixture models can capture asymmetric shapes and heavy tails that Gaussian components represent poorly. As a result, if your data contain extreme values or long tails, you might find that a GEV mixture model fits the data better than a Gaussian mixture model.
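To make the point about tails concrete, here is a small Python sketch (standard library only) that compares the probability of exceeding a threshold under a GEV distribution and under a normal distribution. The function names and the parameter values (location 0, scale 1, shape 0.2) are illustrative assumptions, not values taken from your data, and the two distributions are not matched in mean or variance, so treat it only as a picture of how differently the tails decay.

```python
import math

def gev_sf(x, mu, sigma, xi):
    """Tail probability P(X > x) for a GEV(mu, sigma, xi) variable."""
    z = (x - mu) / sigma
    if abs(xi) < 1e-9:                      # Gumbel limit as xi -> 0
        return -math.expm1(-math.exp(-z))
    s = 1.0 + xi * z
    if s <= 0:                              # x lies outside the support
        return 1.0 if xi > 0 else 0.0
    return -math.expm1(-s ** (-1.0 / xi))   # 1 - exp(-s^(-1/xi))

def normal_sf(x, mu, sigma):
    """Tail probability P(X > x) for a normal(mu, sigma) variable."""
    return 0.5 * math.erfc((x - mu) / (sigma * math.sqrt(2.0)))

# Illustrative parameters (assumed): location 0, scale 1, GEV shape 0.2.
for threshold in (2.0, 4.0, 6.0):
    print(threshold, gev_sf(threshold, 0.0, 1.0, 0.2), normal_sf(threshold, 0.0, 1.0))
```

With a positive shape parameter the GEV tail decays polynomially rather than like exp(-x^2/2), which is why a Gaussian component tends to treat genuine extremes as near-impossible events.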
However, you should be aware that fitting GEV mixture models can be more time-consuming than fitting Gaussian mixture models, particularly with a dataset as large as the one you describe. In a Gaussian mixture model, the mean and covariance of each component can be updated in closed form. Each GEV component instead has three parameters (location, scale, and shape) whose maximum-likelihood estimates have no closed form, so they must be found by numerical optimization, which is more complicated and takes longer to compute. To process your data in a reasonable time, you may need to use parallel computing resources such as GPUs or clusters.
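As a rough illustration of what the fitting involves, below is a minimal one-dimensional sketch in plain Python. It is not a method from any toolbox: it is a generalized EM loop in which the E-step computes cluster responsibilities and the M-step improves each component's parameters with a crude coordinate search, standing in for a proper numerical optimizer. All function names, the initialization, the step sizes, and the synthetic data are assumptions made for the example.

```python
import math
import random

def gev_logpdf(x, mu, sigma, xi):
    """Log-density of GEV(mu, sigma, xi); -inf outside the support."""
    if sigma <= 0:
        return -math.inf
    z = (x - mu) / sigma
    try:
        if abs(xi) < 1e-9:                       # Gumbel limit as xi -> 0
            return -math.log(sigma) - z - math.exp(-z)
        s = 1.0 + xi * z
        if s <= 0:
            return -math.inf
        return -math.log(sigma) - (1 + 1 / xi) * math.log(s) - s ** (-1 / xi)
    except OverflowError:                        # density is numerically zero
        return -math.inf

def e_step(data, weights, params):
    """Return per-point responsibilities and the total log-likelihood."""
    resp, loglik = [], 0.0
    for x in data:
        logs = [math.log(max(w, 1e-300)) + gev_logpdf(x, *p)
                for w, p in zip(weights, params)]
        m = max(logs)
        if m == -math.inf:
            raise ValueError("no component covers data point %r" % x)
        total = sum(math.exp(l - m) for l in logs)   # log-sum-exp trick
        resp.append([math.exp(l - m) / total for l in logs])
        loglik += m + math.log(total)
    return resp, loglik

def m_step(data, r_k, p, scales=(1.0, 0.3, 0.1, 0.03)):
    """Improve one component's (mu, sigma, xi) by a simple coordinate search."""
    def q(cand):                                 # responsibility-weighted log-likelihood
        return sum(r * gev_logpdf(x, *cand) for x, r in zip(data, r_k) if r > 0.0)
    best, best_q = tuple(p), q(p)
    for scale in scales:
        for i, unit in enumerate((p[1], p[1], 0.1)):   # step units for mu, sigma, xi
            for sign in (1.0, -1.0):
                cand = list(best)
                cand[i] += sign * scale * unit
                val = q(cand)
                if val > best_q:                 # accept only strict improvements
                    best, best_q = tuple(cand), val
    return best

def fit_gev_mixture(data, k=2, n_iter=15):
    """Fit a k-component GEV mixture; returns weights, params, log-likelihood trace."""
    xs, n = sorted(data), len(data)
    spread = (xs[-1] - xs[0]) / (2 * k) or 1.0
    # Start every component as a Gumbel (xi = 0) centred on a data quantile.
    params = [(xs[(2 * j + 1) * n // (2 * k)], spread, 0.0) for j in range(k)]
    weights, history = [1.0 / k] * k, []
    for _ in range(n_iter):
        resp, loglik = e_step(data, weights, params)
        history.append(loglik)
        weights = [sum(r[j] for r in resp) / n for j in range(k)]
        params = [m_step(data, [r[j] for r in resp], params[j]) for j in range(k)]
    return weights, params, history

def gev_sample(rng, mu, sigma, xi):
    """Draw one GEV variate by inverting the CDF."""
    u = min(max(rng.random(), 1e-12), 1 - 1e-12)
    if abs(xi) < 1e-9:
        return mu - sigma * math.log(-math.log(u))
    return mu + sigma * ((-math.log(u)) ** (-xi) - 1.0) / xi

# Synthetic demo data (assumed parameters), two well-separated clusters.
rng = random.Random(0)
data = ([gev_sample(rng, 0.0, 1.0, 0.1) for _ in range(150)]
        + [gev_sample(rng, 10.0, 1.5, 0.1) for _ in range(150)])
weights, params, history = fit_gev_mixture(data, k=2)
print(weights, params, history[-1])
```

Because every M-step only accepts parameter changes that raise the weighted log-likelihood, the overall log-likelihood in `history` should never decrease, which is a handy sanity check. For 1 TB of data you would not run this as written: you would replace the coordinate search with a real optimizer and work on features or subsamples of the series, but the E-step/M-step structure stays the same.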
In conclusion, if your data have extreme values or heavy tails, a GEV mixture model can be used for clustering, but expect it to be more computationally demanding than a Gaussian mixture model and possibly to require specialized computing resources. By fitting both models and comparing the results, you can determine which one represents your data better, which will also help you understand the underlying distribution of your data.
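One common way to make that comparison, offered here as a suggestion rather than the only option, is an information criterion such as BIC, which penalizes the extra shape parameter that each GEV component carries. The sketch below assumes one-dimensional data and K components in each model; the log-likelihood values are placeholders that you would replace with the values from your own fits.

```python
import math

def n_params_gmm_1d(k):
    """Free parameters of a 1-D Gaussian mixture: k means, k variances, k-1 weights."""
    return 3 * k - 1

def n_params_gev_mixture(k):
    """Free parameters of a 1-D GEV mixture: k locations, scales, shapes, k-1 weights."""
    return 4 * k - 1

def bic(loglik, n_params, n_obs):
    """Bayesian information criterion; lower values indicate a better trade-off."""
    return n_params * math.log(n_obs) - 2.0 * loglik

# Placeholder inputs (hypothetical, NOT real results): substitute your own fits.
n_obs = 10_000
loglik_gmm = -25_000.0
loglik_gev = -24_900.0
print("GMM BIC:", bic(loglik_gmm, n_params_gmm_1d(3), n_obs))
print("GEV mixture BIC:", bic(loglik_gev, n_params_gev_mixture(3), n_obs))
```

Both models should be scored on the same observations, and it is worth checking the resulting cluster assignments as well, since a better density fit does not always mean more useful clusters.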
2 Comments
Andrew Feenan on 19 June 2023
Thank you for your response!
The data does fit the GEV distribution better than the Gaussian distribution. My understanding of clustering is very limited. How would you recommend going about creating a GEV mixture model? I have found it difficult to find resources or research papers on the topic, whereas GMM is a very common approach.
I would really appreciate any advice you may have to offer.
Ayush Kashyap on 22 June 2023
You may refer to the following documentation for a better understanding of clustering and of the GEV distribution in particular:

Release: R2023a