Add to a cell array in a for loop

Question

Hi,

I am trying to figure out how to add values to a cell array in a for loop. I am analyzing college major data from the ACS and want to determine which majors have salaries in the 75th percentile of non-STEM fields but the 25th percentile of STEM fields. I then found the top 5 salaries under these conditions and am now trying to work out which majors correspond to those salaries. I have a for loop that runs through all the majors and checks whether each major has a 25th or 75th percentile salary that is in the top 5. The problem is that more than one major can share a salary in the top 5, so I need my cell array to store every major that corresponds to each salary. Here are the lines of code I am trying to get to work:

for j = 1:length(majors)
    indx = find(top5salary_nstem75_stem25==P25(j)|P75(j));
    top5majors_nstem75_stem25(indx) = majors(indx); % need this to store multiple cells for each major that satisfies logic
end

Answer 1

Star Strider, 1 April 2019

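Your ‘indx’ assignment is not coded correctly. Because == binds more tightly than |, the condition is evaluated as (top5salary_nstem75_stem25==P25(j)) | P75(j), so any nonzero P75(j) makes every element true. See the documentation section on Apply Multiple Conditions.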

This may work better:

indx = find(top5salary_nstem75_stem25==P25(j) | top5salary_nstem75_stem25==P75(j));

I can’t run your code.
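
To illustrate the difference between the two conditions, here is a minimal sketch. The values in top5, p25 and p75 are made up for the example and are not taken from recent-grads.csv.

```matlab
% Hypothetical toy values, not taken from recent-grads.csv
top5 = [110000; 95000; 90000; 80000; 70000];
p25  = 50000;   % matches nothing in top5
p75  = 70000;   % matches the last entry only

% Original form: parsed as (top5 == p25) | p75, and a nonzero p75 counts as true
wrong = find(top5 == p25 | p75);

% Corrected form: each comparison is written out in full
right = find(top5 == p25 | top5 == p75);

disp(wrong.')   % every index, 1 through 5
disp(right)     % only index 5
```

With the original form, every position is returned whenever p75 is nonzero, which is why the comparison has to be repeated on both sides of the | operator.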

Camden Ford, 1 April 2019

recent-grads.csv

Yeah, I realize that is probably an issue. Attached are my code and dataset.

% The Economic Impact of College Majors
% Code written by Camden Ford (cmf68)
% Updated 04012019
% NB - This script requires MATLAB 2018 or later as function maxk and 
% some syntax will not run in previous versions
%% Initialize the script, set format, and load data set
clear; clc;
format shortG;
DATA = readtable('recent-grads.csv');
%% Separate relevant categories of interest from the data file
majorstype = DATA.Major_category;
majors = DATA.Major;
P75 = DATA.P75th;
P25 = DATA.P25th;
UnemploymentRate = DATA.Unemployment_rate;
MedianSalary = DATA.Median;
TotalNumber_major = DATA.Total;
%% 75th Percentile
% Find top 5 paying majors, how much $ they make and their index in the data set
[top5, idx] = maxk(P75,5);
top5majors = majors(idx);
% Remove engineering majors from 75th percentile
Patern = "Engineering";
TF = strcmp(majorstype,Patern);
majors_no_engineers = majors(~TF,:);
P75_no_engineers = P75(~TF,:);
% Find top 5 paying 75th percentile majors neglecting Engineering, how much $ they 
% make and their index in the data set
[top5salary_75_no_engineers, idx] = maxk(P75_no_engineers,5);
top5majors_75_no_engineers = majors_no_engineers(idx);
%% 25th Percentile
% Find top 5 highest 25th percentile salaries
[top5_25, idx] = maxk(P25,5);
top5_25_majors = majors(idx);
% Remove engineering majors from 25th percentile
P25_no_engineers = P25(~TF,:);
% Find top 5 paying 25th percentile  majors neglecting Engineering, how much $ they 
% make and their index in the data set    
[top5salary_25_no_engineers, idx] = maxk(P25_no_engineers,5);
top5majors_25_no_engineers = majors_no_engineers(idx);
%% Top 75th non-STEM, 25th of STEM
% Find indexes of all STEM majors and create logical variable TFSTEM
STEM = ["Agriculture & Natural Resources", "Biology & Life Science",... 
"Computers & Mathematics", "Engineering", "Health", "Industrial Arts & Computer Services",...
"Physical Sciences"];
TFSTEM = ismember(majorstype,STEM);
% Create a list of non-STEM majors from 75th percentile
P75_nstem = P75.*(~TFSTEM);
% Create a list of STEM majors from 25th percentile
P25_stem = P25.*(TFSTEM); 
% Create a new variable called salary_nstem75_stem25 that takes the value 
% of p75th for non-stem majors and p25th for stem majors
 salary_nstem75_stem25 = P75_nstem+P25_stem;
 top5salary_nstem75_stem25 = maxk(salary_nstem75_stem25,5);
% Now find the majors associated with those top 5 salaries
% Takes care of the fact that there is missing data, 
% and NAN value added to anything becomes NAN, so we set that to 0
P75(isnan(P75)) = 0; 
P25(isnan(P25)) = 0;
for j = 1:length(majors)
    indx = find(top5salary_nstem75_stem25==P25(j) | top5salary_nstem75_stem25==P75(j));
    top5majors_nstem75_stem25(indx) = majors(indx);
end 
%% Calculate the expected salary of each major
ExpectedSalary = zeros(length(UnemploymentRate),1);
for k = 1:length(UnemploymentRate)
    ExpectedSalary(k) = (1-UnemploymentRate(k))*(.5*(MedianSalary(k))+...
        .25*(P75(k))+.25*(P25(k)));
end
[top5ExpectedSalary idx] = maxk(ExpectedSalary,5);
top5majors_ExpectedSalary = majors(idx);
%% Modify the above equations by weighting each major by its # of graduates relative to major category
% Start by finding the number of graduates per major
% Takes care of the fact that there is missing data, 
% and NAN value added to anything becomes NAN, so we set that to 0
TotalNumber_major(isnan(TotalNumber_major)) = 0; 
MajorCategories = unique(majorstype);
nummajors = zeros(length(MajorCategories),1);
Weight = zeros(length(majorstype),1);
for j = 1:length(majorstype)
    % Find the index where the major's type matches the MajorCategory
    indx = find(ismember(MajorCategories,majorstype(j)));
    % Add this major's total number of students to its MajorCategory entry
    nummajors(indx) = nummajors(indx) + TotalNumber_major(j);
end 
% Find the weight by dividing each major's total by the total for that major's
% category
for j = 1:length(majorstype)
    indx = find(ismember(MajorCategories,majorstype(j)));
    Weight(j) = TotalNumber_major(j)/nummajors(indx);
end 
% Find the weighted average of median, p75, and p25
Median_weighted = MedianSalary.*Weight;
P75_weighted = P75.*Weight;
P25_weighted = P25.*Weight;
Average_Median_weighted = zeros(length(MajorCategories),1);
Average_P75_weighted = zeros(length(MajorCategories),1);
Average_P25_weighted = zeros(length(MajorCategories),1);
for j = 1:length(majorstype)
    indx = find(ismember(MajorCategories,majorstype(j)));
    Average_Median_weighted(indx) = Average_Median_weighted(indx)+Median_weighted(j);
    Average_P75_weighted(indx) = Average_P75_weighted(indx)+P75_weighted(j);
    Average_P25_weighted(indx) = Average_P25_weighted(indx)+P25_weighted(j);
end 
% Find the weighted average of 75th non-STEM, 25th STEM
% 75th non-STEM
Average_P75_nstem_weighted = zeros(length(MajorCategories),1);
for j = 1:length(majorstype)
    indx = find(ismember(MajorCategories,majorstype(j)));
    Average_P75_nstem_weighted(indx) = Average_P75_weighted(indx)*(~TFSTEM(j));  
end 
% 25th STEM
Average_P25_stem_weighted = zeros(length(MajorCategories),1);
for j = 1:length(majorstype)
    indx = find(ismember(MajorCategories,majorstype(j)));
    Average_P25_stem_weighted(indx) = Average_P25_weighted(indx)*(TFSTEM(j));  
end 
% Combine non-STEM and STEM
salary_nstem75_stem25_weighted = Average_P75_nstem_weighted+Average_P25_stem_weighted;
% Find expected salary using weighted median, 75th, and 25th
ExpectedSalary_weighted = 0.5*Average_Median_weighted+0.25*Average_P75_weighted+...
    0.25*Average_P25_weighted;
% Generate table summarizing results of weighted averages
T = table(MajorCategories,Average_Median_weighted,Average_P75_weighted,Average_P25_weighted,...
    salary_nstem75_stem25_weighted,ExpectedSalary_weighted);
% Find the max of each of the above variables and their major category
[MaxMajor(1) indx] = maxk(T.Average_Median_weighted,1);
MaxMajorCategory(1) = MajorCategories(indx);
[MaxMajor(2) indx] = maxk(T.Average_P75_weighted,1);
MaxMajorCategory(2) = MajorCategories(indx);
[MaxMajor(3) indx] = maxk(T.Average_P25_weighted,1);
MaxMajorCategory(3) = MajorCategories(indx);
[MaxMajor(4) indx] = maxk(T.salary_nstem75_stem25_weighted,1);
MaxMajorCategory(4) = MajorCategories(indx);
[MaxMajor(5) indx] = maxk(T.ExpectedSalary_weighted,1);
MaxMajorCategory(5) = MajorCategories(indx);

Camden Ford, 1 April 2019

Thanks for the help, but the problem I still have is that multiple majors share the same salary in top5salary_nstem75_stem25. I want to add every major whose salary is in the top 5 (for example, several majors meet the logic statement at 70,000), but my cell array only stores the last iteration. I want to keep track of all the majors that meet this requirement. How do I add strings for all of these majors to the cell array top5majors_nstem75_stem25 in a for loop?

Star Strider, 1 April 2019

Edited: Star Strider, 2 April 2019

I’m not certain what you’re referring to.

Try this:

for j = 1:length(majors)
    indx{j,:} = find(top5salary_nstem75_stem25==P25(j) | top5salary_nstem75_stem25==P75(j));
%     top5majors_nstem75_stem25(indx) = majors(indx);
end 
idx = unique([indx{:}]);
top5majors_nstem75_stem25(idx) = majors(idx);

There are only five, those being 1, 2, 3, 4, and 8.
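
As a side illustration of the question as asked (growing a cell array inside a loop), here is a minimal sketch. It assumes one combined salary value per major; the names majors, salary, top5salary and matched, and all the values, are hypothetical stand-ins rather than the variables from the script.

```matlab
% Hypothetical toy data standing in for the majors and their combined salaries
majors     = {'Major A'; 'Major B'; 'Major C'; 'Major D'};
salary     = [70000; 65000; 70000; 40000];
top5salary = [70000; 65000];           % stand-in for the top-5 salary list

matched = {};                          % start empty and grow as matches appear
for j = 1:numel(majors)
    if any(top5salary == salary(j))    % is this major's salary in the list?
        matched{end+1, 1} = majors{j}; %#ok<AGROW> append instead of overwriting
    end
end

disp(matched)   % Major A, Major B and Major C
```

Indexing with end+1 appends a new cell on every match, so two majors that share a salary both survive. Indexing the output by the position in the top-5 list leaves only one slot per salary, which is one plausible reason only the last match was kept.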

EDIT — (1 Apr 2019 at 00:49)

I was primarily concerned with the logic of your find call, so I didn’t look much further through your code. The only other possibility is to use the ismember or ismembertol function instead of find and the loop.

I have no other suggestions.
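
For reference, the ismember route mentioned above might look like the sketch below. It is only an illustration under assumptions: majors, salary, top5salary, matchedMajors and majorsBySalary are hypothetical names with made-up values, and the salaries are assumed to be whole numbers so that exact comparison is safe (ismembertol would be the alternative for floating-point values).

```matlab
% Hypothetical toy data, not taken from recent-grads.csv
majors     = {'Major A'; 'Major B'; 'Major C'; 'Major D'};
salary     = [70000; 65000; 70000; 40000];
top5salary = [70000; 65000];

% One logical entry per major: true where its salary appears in the list
inTop = ismember(salary, top5salary);
matchedMajors = majors(inTop);

% Optionally group the majors under each listed salary
majorsBySalary = cell(numel(top5salary), 1);
for k = 1:numel(top5salary)
    majorsBySalary{k} = majors(salary == top5salary(k));
end

disp(matchedMajors)       % Major A, Major B and Major C
disp(majorsBySalary{1})   % Major A and Major C share 70000
```

The logical mask keeps every tied major without any loop, and the optional grouping stores a nested cell array per salary, so ties stay attached to the salary they share.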
