Hey, just realised that the int8(a) stuff and the clear T and clear C are unnecessary. It is probably also better to use R = numel(intersection) rather than [R C] = size(.). Still blocked on the cell format issue, though.
How to allocate the content of an M*1 vector to a T*N matrix (M < T*N)? Problem with cell format
Dear all
I am a newcomer to MATLAB and have a question that, I believe, should be easy to solve. Yet I have lost way too many hours on this particular detail and find myself unable to advance. I hope someone can point out the problem. Maybe the function is poorly coded to begin with. Anyway, thanks in advance!
I have a set of unbalanced monthly data for various firms.
I want to create a matrix y with dim(y) = T * N, where T is the number of periods and N is the number of firms.
My idea is to create a function - ymatrix - that retrieves the value of interest at time t for firm n and places it in position (t,n) in the matrix y.
With this in mind, I wrote the function below:
function y = ymatrix(intersection,date,id,var)
% intersection is an argument that allows me to determine the number of columns of y.
% id is the identification number of the firm, in cell format.
% var is the variable of interest.
[R C] = size(intersection);
T = [2007 - 1964 + 1]*12 + 12;
clear C
y = NaN(T,R);
clear T
sortDate = sort(date,'ascend'); % I do not sort the IDs as they are already sorted in the dataset
T = length(date);
N = length(id);
past = cell(T,1);
for t = 1 : T
for n = 1 : N
if sum(ismember(past,id{n})) == 0
% This is where the problem seems to lie. Since I want each column of y to be a firm,
% and since a given id appears many times in the dataset, I must apply the current loop only to firms to which
% I haven't applied it before. Otherwise, I will have several repeated columns.
% Hence, at the end of each loop, I want to add the id of the firm I am currently looping through to a
% local variable - "past" - that I can then use - in the next loop - to decide whether to skip that particular
% loop. The problem is that id is in cell format. Either I preallocate past in cell format and
% use past{n,1} = id{n,1}, in which case, in the next loop, I receive an error in
% sum(ismember(past,id{n})) == 0 that reads:
% "Input must be cell arrays of strings"; or I do not preallocate it as a cell and allocate it as,
% for instance, past = NaN(10,1), and I receive an error in past{n,1} = cusip{n,1} that
% reads: "Cell contents assignment to a non-cell array object."
a = ismember(date,sortDate(t));
b = ismember(id,id{n});
a = int8(a);
b = int8(b);
d = find(a == 1);
e = find(b == 1);
index = intersect(d,e); % Note that the intersection must either be a number or empty
past{n,1} = id{n,1};
if isempty(index) % This is to account for the possibility that the intersection is empty,
% in which case I simply skip the current loop
else
y(t,n) = var(index,1);
end
else
end
end
end
end
Accepted Answer
Guillaume
21 April 2016
Edited: Guillaume
21 April 2016
If I understood correctly, this should work:
[udate, ~, drow] = unique(date); %udate could be replaced by ~ if you don't want a table
[uid, ~, idrow] = unique(id); %uid could be replaced by ~ if you don't want a table
y = nan(max(drow), max(idrow));
y(sub2ind(size(y), drow, idrow)) = var;
You can prettify the output with:
t = array2table(y, 'VariableNames', uid, 'RowNames', cellstr(num2str(udate)))
Note that neither date nor id needs to be sorted, since unique does the sorting anyway.
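To see what the unique / sub2ind lines do, here is a small sketch on made-up data. The dates, ids and values below are invented for illustration (they are not from the original dataset), and it assumes date is a numeric column, id is a cell column of strings, and var is a numeric column of the same length.

```matlab
% Toy unbalanced panel (made-up values): firm 'A' is missing in 196403,
% firm 'B' is missing in 196402, and firm 'C' appears only in 196402.
date = [196401; 196402; 196401; 196403; 196402];
id   = {'A'; 'A'; 'B'; 'B'; 'C'};
var  = [1.0; 2.0; 3.0; 4.0; 5.0];   % same name as in the question (shadows the built-in var)

[udate, ~, drow]  = unique(date);   % drow(k)  = row of y for observation k
[uid,   ~, idrow] = unique(id);     % idrow(k) = column of y for observation k

y = nan(numel(udate), numel(uid));  % periods x firms, NaN where no observation exists
y(sub2ind(size(y), drow, idrow)) = var;

disp(y)
% y = [  1    3  NaN
%        2  NaN    5
%      NaN    4  NaN]
```

Rows follow udate (196401, 196402, 196403) and columns follow uid ('A', 'B', 'C'), so no loop over t and n is needed.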
More Answers (1)
Renato Agurto
21 April 2016
If you preallocate past:
past = NaN(10,1);
then you should use
past(n,1) = id{n,1};
instead of
past{n,1} = id{n,1};
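The numeric preallocation above only fits ids that are numeric scalars. If the ids are strings (an assumption on my part; the ids in this sketch are made up), one possible way to keep past as a cell is to test membership with strcmp instead of ismember, because strcmp simply returns false for the still-empty cells:

```matlab
% Hypothetical string ids with repeats (not the original data).
id = {'A'; 'A'; 'B'; 'A'; 'C'};

past    = cell(numel(id), 1);    % preallocated cell; entries start as []
isFirst = false(numel(id), 1);   % true where an id is seen for the first time

for n = 1:numel(id)
    % strcmp compares id{n} with every cell of past and returns false
    % for cells that do not hold text, so the [] entries cause no error.
    if ~any(strcmp(past, id{n}))
        isFirst(n) = true;
        past{n,1}  = id{n,1};
    end
end

disp(isFirst')   % 1 0 1 0 1
```

Preallocating with past = repmat({''}, numel(id), 1) should also let the original ismember test run, since every cell then holds a string.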