Summing over observations in unbalanced panel data
Hello, I have an unbalanced panel data set. For each id, I'd like to sum all values of x up to the latest time I observe that id, and record the sum in a new variable. How can I do this? Ideally, I'd like to avoid looping, as I have a large dataset and am trying to speed up the process.
Thank you in advance!
Selcen
3 Comments
dpb
17 Aug 2024
As @the cyclist notes, w/o a sample dataset we're pretty much without recourse to a direct response, but look at rowfun and/or splitapply
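For illustration, the splitapply route mentioned above could look like the following minimal sketch. Since no sample data was posted, it assumes a table with columns named id and x; those names and all values are hypothetical. findgroups assigns a group number to each row, and splitapply then applies sum to each group.

```matlab
% Hypothetical sample data; the column names 'id' and 'x' are assumptions
T = table([2; 1; 3; 2; 1; 3; 1], [5; 20; 30; 15; 10; 25; 35], ...
    'VariableNames', {'id', 'x'});

% G holds the group number of each row, ids the matching unique id values
[G, ids] = findgroups(T.id);

% Sum x within each group, with no explicit loop
sumX = splitapply(@sum, T.x, G);

% One row per id
totals = table(ids, sumX, 'VariableNames', {'id', 'SumX'});
disp(totals);
```

Because a plain total does not depend on row order, this sketch does not need the data to be sorted by time.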
Accepted Answer
Shishir Reddy
20 Aug 2024
Edited: Shishir Reddy
20 Aug 2024
Hi Selcen
As I understand it, you would like to sum all the values of a variable for each id, up to the latest time at which that id is observed in an unbalanced panel data set, and record the sum in a new variable.
Assuming the data is in a MATLAB table with at least three columns, ‘id’, ‘time’, and ‘x’, the following sample code does this without loops.
% Sample unsorted data
data = table([2; 1; 3; 2; 1; 3; 1], [1; 3; 2; 2; 1; 1; 2], [5; 20; 30; 15; 10; 25; 35], 'VariableNames', {'id', 'time', 'x'});
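data = sortrows(data, {'id', 'time'}); % Sort the table by 'id', then by 'time'
[~, idx] = unique(data.id, 'last'); % Row of the latest time for each 'id' (valid because the table is sorted)
[~, firstIdx, G] = unique(data.id, 'first'); % First row of each 'id', and the group number of every row
runningTotal = cumsum(data.x); % Running total of 'x' over the whole sorted table
offset = runningTotal(firstIdx) - data.x(firstIdx); % Total accumulated before each 'id' starts
data.cumSumX = runningTotal - offset(G); % Cumulative sum of 'x' restarted within each 'id'
latestCumulativeSum = data.cumSumX(idx); % Cumulative sum at the latest time for each 'id'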
result = table(data.id(idx), latestCumulativeSum, 'VariableNames', {'id', 'LatestCumSumX'});
% Display the result
disp(result);
For more information on the ‘cumsum’ function, please refer to the documentation: https://www.mathworks.com/help/matlab/ref/cumsum.html
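If the per-id total should instead be stored as a new variable on every row of the original table, rather than in a one-row-per-id summary, a sketch along the following lines may work. It reuses the sample values from the code above; the variable names totalPerId and totalX are assumptions chosen for illustration.

```matlab
% Same sample data as in the code above
data = table([2; 1; 3; 2; 1; 3; 1], [1; 3; 2; 2; 1; 1; 2], ...
    [5; 20; 30; 15; 10; 25; 35], 'VariableNames', {'id', 'time', 'x'});

G = findgroups(data.id);            % group number (1..number of ids) for each row
totalPerId = accumarray(G, data.x); % sum of x within each group
data.totalX = totalPerId(G);        % copy each id's total back to all of its rows
disp(data);
```

Here accumarray does the grouped sum in one vectorized call, and indexing with G expands the per-id totals back to the original row order.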
I hope this helps.
More Answers (0)