Consecutive count of values based on multiple conditions

3 ビュー (過去 30 日間)
mtango
mtango 2020 年 11 月 19 日
回答済み: Setsuna Yuuki. 2020 年 11 月 19 日
I have a very big table consisting of approximately 3 million rows. The table consists of four columns [ID period amount pbehind], see the following screenshot including the output column I want.
For every ID, I want to count the consecutive amount of times of behind payment (pbehind). But there are a few conditions to this:
  1. when pbehind <= 0.5, then output = 0
  2. when pbehind > 1.5, but value in previous period is 0, then output = 0;
  3. when pbehind > 0.5, but amount in next period (which is last period) is zero, then output=0.
There are a couple of things to note:
  1. period does not have to start at 1 and increase by 1 (see ID = 2). However it is ordered by ID and period using sortrows(table,[1,2]).
  2. An ID can make multiple payments for a certain period, however pbehind will always be equal to for that period (see ID=3).
I know for the first condition that I can just do the following:
output = pbehind;
output(output<=0.5)=0;
But I cannot figure out how to efficiently implement the other conditions. Does anyone know how to do this?

回答 (2 件)

Peter Perkins
Peter Perkins 2020 年 11 月 19 日
Put your data in a table, and use rowfun with ID as the grouping variable. Write a function that does all the logic you need to enforce. Your function will be applied to one group of rows at a time, and should return a column vector of output values. Here's a simple example:
>> t = table([1;1;1;2;2],rand(5,1),rand(5,1),'VariableNames',{'ID' 'X' 'Y'})
t =
5×3 table
ID X Y
__ ________ ________
1 0.82202 0.26854
1 0.041591 0.63908
1 0.91635 0.031734
2 0.17678 0.33395
2 0.92236 0.8908
>> t2 = rowfun(@(x,y) x - mean(y),t,'GroupingVariable','ID')
t2 =
5×3 table
ID GroupCount Var3
__ __________ ________
1 3 0.50891
1 3 -0.27152
1 3 0.60323
2 2 -0.4356
2 2 0.30998
>> t.Output = t2.Var3
t =
5×4 table
ID X Y Output
__ ________ ________ ________
1 0.82202 0.26854 0.50891
1 0.041591 0.63908 -0.27152
1 0.91635 0.031734 0.60323
2 0.17678 0.33395 -0.4356
2 0.92236 0.8908 0.30998
Obviously you will need to write your own function.
  1 件のコメント
mtango
mtango 2020 年 11 月 19 日
The problem is that I cannot figure out how to create code to count the consecutive amount of times of behind payment (pbehind) for each ID...

サインインしてコメントする。


Setsuna Yuuki.
Setsuna Yuuki. 2020 年 11 月 19 日
you should try with a series of if..else. For example:
for i = length(output)
if(pbehind(i) > 1.5 && pbehind(i-1) == 0)
output(i) =0;
elseif(condition)
...
end

カテゴリ

Help Center および File ExchangeTables についてさらに検索

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by