Rank stock returns by month

5 ビュー (過去 30 日間)
Syabil
Syabil 2023 年 5 月 12 日
回答済み: Eric Sofen 2023 年 6 月 5 日
I have a table of stock returns for each company in the US, in each month from 2000-2019. Each month has roughly 3000 companies (and forecasted returns). I want to sort these forecasted returns into ten deciles, with decile 1 having the highest 10% of forecasted returns in each month (90th percentile), and so on so decile 10 has the lowest 10% forecasted returns of the month (10th percentile). I tried discretize but this function ranks the whole table together, not by month. Finally, I want to be able to combine all decile 1 ACTUAL returns into one table, decile 2 returns into another table and so on, and then find the average returns for each month, in each decile. I know I can do this using groupsummary, but I need to get the deciles assigned first. Any tips? Thanks.
I attached a sample data that I believe represents the structure of my table. I just took random parts of my table and put it together. The original table is 700k+ rows long. I need to rank by "TPER" (forecasted returns), and in the end I will want to find average "RET" (actual return) for each month in each decile. Thanks!
  2 件のコメント
William Rose
William Rose 2023 年 5 月 12 日
@Syabil, To help others help you, please provide the table, or a subset of the table, or simulated data with the same table structure, that illustrates how the data is arranged. Someone could waste time devising a solution that doesn;t work for your table. For example, your table could have 3000 rows and 241 columns (col.1=company name, columns 2-241=monthly returns, for the months from Jan 2000 to Dec 2019). Or the table could have 240 x 2 columns, because the company names could change from month to month, as some companies fail and others start. Or the columns could be companies and the rows could be months. Or something else.
Syabil
Syabil 2023 年 5 月 13 日
Hi, sorry for that. I added some sample data in my question now. Thanks!

サインインしてコメントする。

回答 (2 件)

Santosh Fatale
Santosh Fatale 2023 年 5 月 18 日
Hi Syabil,
I understand that you want to process the data available in tabular format in MATLAB.
Kindly refer to the following code snippet, which could help you to understand about extracting rows of interests from the table variable.
load("sampledata.mat");
originalTable = sampledata;
% Sort table as per Year+Month and assign it to new table variable.
[sortedValues, sortingIndices] = sort(originalTable.dates,'ascend');
updatedTable = originalTable(sortingIndices,:);
% Find unique dates from the updated table.
uniqueDates = unique(updatedTable.dates);
% Loop over each unique date and process the data.
for idx = 1 : length(uniqueDates)
% Select the rows as per
tempTable = updatedTable(updatedTable.dates == uniqueDates(idx),:);
% Write your logic to process data from tempTable
end
In “for” loop, you could add logic for processing data extracted from the original table.
Refer to the documentation page of table, unique, sort to learn more about functions used in the code snippet and how to process tabular data in MATLAB.
  1 件のコメント
Syabil
Syabil 2023 年 5 月 20 日
編集済み: Syabil 2023 年 5 月 20 日
Hi Santosh,
Thank you for this! Just one final question, I did the processing, but after the loop ends, I'm only left with values of d for the last loop (ie. Dec 2018 since its the last month in my table). I do know that each loop would overwrite previous interations. So how do I extract the values for the previous months (preferably into the original table) before it is overwritten? Most answers I find on this website are when each row has unique i values, such as i=1:10 with also 10 rows, but my data will have thousands of rows with the same idx value. How do I get around that? Thanks.
for idx = 1 : length(uniqueDates)
% Select the rows as per
%fulltable and yyyymm are simply my actual table and variable names
tempTable = fulltable(fulltable.yyyymm == uniqueDates(idx),:);
% Write your logic to process data from tempTable
d= discretize(tempTable{:,"TPER"},quantile(tempTable{:,"TPER"},[0:10]/10));
end

サインインしてコメントする。


Eric Sofen
Eric Sofen 2023 年 6 月 5 日
First off, get your dates into datetime rather than YYYYMM doubles!
Then, you can use groupsummary to group the data by month and do the discretize you have above. The tricky part is that each month appears to have a different number of companies, so the size of the output varies from month to month. To deal with this, we wrap the output in a cell. Then, for any given month, if you want the company numbers corresponding to the quantiles, you'll have to index back into that list...
load sampledata.mat
sampledata.dates = datetime(string(sampledata.dates),InputFormat="yyyymm");
head(sampledata)
dates companynumber TPER RET ____________________ _____________ _______ _______ 01-Jan-2000 00:01:00 79678 0.10131 0.5681 01-Jan-2000 00:01:00 84828 0.10181 1.4585 01-Jan-2000 00:01:00 81127 0.10192 1.129 01-Jan-2000 00:01:00 80320 0.10388 1.1066 01-Jan-2000 00:01:00 77668 0.10641 0.74181 01-Jan-2000 00:01:00 87092 0.11021 0.1373 01-Jan-2000 00:01:00 80049 0.11454 0.92289 01-Jan-2000 00:01:00 83531 0.1147 0.51898
pctlFun = @(x) {discretize(x,quantile(x,[0:10]/10))};
coNumList = @(y) {horzcat(y)};
monthlyPercentiles = groupsummary(sampledata,"dates","month",pctlFun,"TPER")
monthlyPercentiles = 4×3 table
month_dates GroupCount fun1_TPER ___________ __________ _____________ Jan-2000 30 {30×1 double} Jan-2004 15 {15×1 double} Jan-2011 15 {15×1 double} Jan-2015 30 {30×1 double}
Depending on the format you ultimately want, you could also use rowfun to find which companies are in each quantile, then put each quantile's companies into a cell that goes into the table. Similar to this example.

カテゴリ

Help Center および File ExchangeShifting and Sorting Matrices についてさらに検索

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by