To add more detail: what I have tried in Excel is to take the difference between consecutive time points (since the spacing is identical except at the loop) to show where I need to split the column. This is a very long process in Excel, though, because I then manually copy and paste each segment into its own new column.
Parse 1 large column of uneven data into an array of columns by nth rows.
Everyone, this may be a very easy question, and as I mentioned in previous work I am a biochemist trying to apply some basic (I think basic) MATLAB to help expedite data management. I have a text file that is 4 columns by about 136,000 rows (see attached text). I am running a loop where voltage is constant and current is measured. The time between measurements is identical EXCEPT when the loop occurs, which creates a larger time gap. I want to extract each loop and reshape (neat MATLAB word I recently learned, haha) the text so that each column is one loop, for however many loops there are.
So... the data file is too large and it won't let me attach it. I can explain the data in more detail if needed.
7 comments
Jeffrey Clark
November 10, 2022
@David Probst, please make a small file to attach, or at least tell us what each of the four columns contains; you only mention time, voltage and current.
David Probst
November 10, 2022
Jeffrey,
Will do! Sorry I didn't think of that right away. See attached!
Jeffrey Clark
November 11, 2022
@David Probst, thanks, but the file you attached was created as text with only 3 significant digits. This quickly becomes a problem for the time column after line 200 or so (201 x 0.05 needs 4 significant digits, and it just keeps getting worse). Please repost with sufficient significant digits for the entire "short" file so we can see what the gaps are supposed to be like.
David Probst
November 14, 2022
The "normal" time step for each point should be ~0.01-0.07 seconds, and the loop occurs when the delta in time is about 340 seconds (plus or minus some decimals). So in Excel I take the difference between consecutive time points in a separate column, then use conditional formatting to highlight values higher than 1 (which are only the loop wait times). Afterwards I manually copy and paste each column segment between the highlighted regions. Obviously this is VERY long and VERY, VERY inefficient. I apologize if there is missing information you need; I am a biochemist trying to learn this basic (I think basic) MATLAB to help out our research.
Answers (1)
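The Excel procedure described above (difference between consecutive time points, then flag values above 1) can be sketched in MATLAB roughly as follows. The time stamps here are made up for illustration, and the variable names are hypothetical rather than taken from the original file:

```matlab
% Hypothetical time stamps: 0.05 s steps with one ~340 s wait between loops
t = [0; 0.05; 0.10; 0.15; 340.15; 340.20; 340.25];

dt = diff(t);                             % time difference between consecutive rows
gapThreshold = 1;                         % seconds; same cutoff as the Excel highlighting
lastRowOfLoop = find(dt > gapThreshold);  % row just before each long wait
firstRowOfNextLoop = lastRowOfLoop + 1;   % row where the next loop begins
```

These row indices play the role of the highlighted cells in Excel: each loop runs from one flagged row plus one up to the next flagged row.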
Bjorn Gustavsson
November 11, 2022
If your file contains an equal number of samples for every "loop" and only contains data from full "loops", then something like this should work:
n_per_loop = 37; % adjust
idx_time_var = 1; % index to the column with the time-stamps
sz_data = size(datasmaple);
n_loops = floor(sz_data(1)/n_per_loop); % this should get us the number of full loops
% This should extract the time-stamps and put them in an
% [ n_per_loop x n_loops ] array
t_all = reshape(datasmaple(1:(n_per_loop*n_loops),idx_time_var),n_per_loop,n_loops);
for i2 = setdiff(1:sz_data(2),idx_time_var) % Here we loop over the other columns
    % Extract those columns, reshape them and put them into a cell-array
    data_cell_format{i2} = reshape(datasmaple(1:(n_per_loop*n_loops),i2),n_per_loop,n_loops);
end
HTH
13 comments
David Probst
November 11, 2022
Bjorn,
Thank you for the code. The loop sizes are not exactly the same (they all differ by about 1-3 points), although I think if I treat them as the same it should still work, since the change in results would be negligible.
That being said, I am working through your code and get this error:
>> ParseTest
Error using tabular/reshape (line 194)
Undefined function 'reshape' for input arguments of type 'table'.
Error in ParseTest (line 7)
t_all = reshape(datasmaple(1:(n_per_loop*n_loops),idx_time_var),n_per_loop,n_loops);
I'm playing with a few things, such as ensuring that n_per_loop is a number that divides evenly into the total data set (right now there are some points at the end that are "leftover", since the loop size does not divide the total number exactly). I don't know if that's the issue. Again, I apologize if this is not the most helpful. I appreciate the support.
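The error message above says that reshape is not defined for tables; indexing a table with parentheses returns another table, not a numeric array. A minimal sketch of one way around this, assuming all four columns are numeric, is to convert first and then trim the leftover rows. The mock table, its column names and the loop length below are invented for illustration:

```matlab
% Mock table standing in for the imported data (values and names are made up)
T = array2table([(1:10)' * 0.05, rand(10, 3)], ...
    'VariableNames', {'Time', 'Voltage', 'Current', 'Other'});

A = table2array(T);                        % numeric matrix; T{:,:} is equivalent here
n_per_loop = 4;                            % hypothetical loop length
n_loops = floor(size(A, 1) / n_per_loop);  % number of full loops (2 here)
nUsed = n_per_loop * n_loops;              % the 2 leftover rows are dropped

% reshape now works because A(1:nUsed, 1) is a numeric column vector
t_all = reshape(A(1:nUsed, 1), n_per_loop, n_loops);
```

The leftover points at the end do not cause the error by themselves, as long as only the first n_per_loop*n_loops rows are passed to reshape.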
Bjorn Gustavsson
November 11, 2022
Then I would convert the table to an array (but that's mostly because I've never had any reason to start working with tables). Step 2 would be to create an array with indices of the end of each "loop" (did those correspond to end-of-day?). Perhaps something like this as a first step:
datasample = table2array(datasmaple);
sz_data = size(datasample);
t_all = datasample(:,idx_time_var); % lets say the time is in seconds
doy = floor(t_all/(24*3600))+1; % day number, starting at 1
udays = sort(unique(doy),'descend'); % descending, so the first assignment sizes the arrays
% Extract those columns and put them into a cell-array
% now it seems we have to do this day-by-day
for iDay = udays(:)' % row vector, so that for iterates one day at a time
    iCurr = find(doy==iDay);
    for i2 = 1:sz_data(2) % Here we loop over all columns
        data_cell_format{i2}(1:numel(iCurr),iDay) = datasample(iCurr,i2);
    end
end
It is most likely not the fastest, but should get the job done...
HTH
David Probst
November 14, 2022
I will try this today and let you know! Thank you for the support. I may post some questions (so I can learn some of the details here), if that's cool!
David Probst
November 14, 2022
Also, the time gaps are not at the "end of each day"; it's more like every 6000 points, +/- 5-10 points due to sampling error of the tool. So one column may need to be about 6000 rows while another is 6005, and the time step is 0.5 seconds for every point, except when the loop occurs, which has a longer (~340 second) wait time.
Bjorn Gustavsson
November 14, 2022
Well the "only" thing to change then is to build the index to the current loop taking that into account. Something like this:
dt_jump = 100; % or whatever jump-condition you can use between loops
% t_all is a column vector, so concatenate vertically (;) to append the last row index
idxEnd = [find( diff(t_all) > dt_jump ); numel(t_all)];
idxStart = [1; idxEnd(1:end-1)+1];
Then change the loop to something like this:
for iLoop = 1:numel(idxStart)
    iCurr = idxStart(iLoop):idxEnd(iLoop); % rows belonging to the current loop
    for i2 = 1:sz_data(2) % Here we loop over all columns
        data_cell_format{i2}(1:numel(iCurr),iLoop) = datasample(iCurr,i2);
    end
end
David Probst
November 14, 2022
So to make sure I understand this, I am using the following code:
datasample1 = table2array(datasmaple);
sz_data = size(datasmaple);
idx_time_var = 1;
t_all = datasample1(:,idx_time_var); % lets say the time is in seconds
dt_jump = 100; % or whatever jump-condition you can use between loops
idxEnd = [find( diff(t_all) > dt_jump ), numel(t_all)];
idxStart = [1,idxEnd(1:end-1)+1];
for iLoop = 1:numel(idxStart)
    iCurr = idxStart(iLoop):idxEnd(iLoop)
    for i2 = 1:sz_data(2) % Here we loop over all columns
        data_cell_format{i2}(1:numel(iCurr),iLoop) = datasample(iCurr,i2);
    end
end
There is an issue at the line "idxEnd = [find( diff(t_all) > dt_jump ), numel(t_all)];". I looked up each function, and from what I understand this is trying to build an index array where the first column is where we find the change in time to be larger than 100, and the second function, numel(t_all), is to ensure we do this over the entire dataset?
Bjorn Gustavsson
November 14, 2022
Yes, as far as I understand your comment at your question, you have sample intervals between 0.01 and 0.07 seconds during the loop and then more than 100 s of dwell time between the end of one loop and the start of the next. If that time-step separation is reliable, you should be able to use the time differences that are larger than 1 s (10? 50?) to identify the end of one loop and the start of the next.
David Probst
November 21, 2022
Hey, so I am having challenges with implementing the following code:
datasample1 = table2array(datasmaple);
sz_data = size(datasmaple);
idx_time_var = 1;
t_all = datasample1(:,idx_time_var); % lets say the time is in seconds
dt_jump = 5; % or whatever jump-condition you can use between loops
idxEnd = [find(diff(t_all)>dt_jump,numel(t_all))];
idxStart = [1,idxEnd(1:end-1)+1];
for iLoop = 1:numel(idxStart)
    iCurr = idxStart(iLoop):idxEnd(iLoop)
    for i2 = 1:sz_data(2) % Here we loop over all columns
        data_cell_format{i2}(1:numel(iCurr),iLoop) = datasample(iCurr,i2);
    end
end
It seems to break down where I bolded (idxStart).
Any advice or suggestions for me to test? I appreciate the support!
David Probst
November 21, 2022
So I changed the code to the following:
datasample1 = table2array(datasmaple);
sz_data = size(datasmaple);
idx_time_var = 1;
t_all = datasample1(:,idx_time_var); % lets say the time is in seconds
dt_jump = 5; % or whatever jump-condition you can use between loops
idxEnd = [find(diff(t_all)>dt_jump,numel(t_all))];
idxStart = [idxEnd(1:end-1)];
for iLoop = 1:numel(idxStart)
    iCurr = idxStart(iLoop):idxEnd(iLoop)
    for i2 = 1:sz_data(2) % Here we loop over all columns
        data_cell_format{i2}(1:numel(iCurr),iLoop) = datasample(iCurr,i2);
    end
end
It works, but fills the final arrays with values only in the first row and column; all other values are 0.
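A possible explanation for the mostly-zero result, offered as a sketch and not a confirmed diagnosis: in the code as posted, numel(t_all) sits inside the call to find, where it acts as the maximum number of indices to return instead of being appended as the last row. In addition, idxStart is set to idxEnd(1:end-1), so each loop starts and ends on the same row. The small time vector below is made up to show the difference:

```matlab
t_all = [1:5, 11:15, 21:25]';   % mock time stamps; jumps after rows 5 and 10
dt_jump = 3;

% As posted: numel(t_all) is find's "maximum number of results" argument
idxEndPosted = find(diff(t_all) > dt_jump, numel(t_all));  % last loop has no end index
idxStartPosted = idxEndPosted(1:end-1);                    % same row as the loop end
rowsPosted = idxStartPosted(1):idxEndPosted(1);            % a single row only

% Intended: append the last row index (vertical concatenation for a column
% vector), and start each loop one row after the previous loop ended
idxEnd = [find(diff(t_all) > dt_jump); numel(t_all)];
idxStart = [1; idxEnd(1:end-1) + 1];
```

With only one row copied per loop, MATLAB fills every other element of the growing array with zeros, which would match the output described above. The posted loop also reads from datasample while the converted array is named datasample1, which may be worth checking as well.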
Bjorn Gustavsson
November 22, 2022
It might be easiest if you mock up a very small test data set, with say some 5 data points between jumps and a couple of loops, 6-7 perhaps. Then you can step through the algorithm with the debugger line by line and check what the different variables and indices should be and how to correct them. If you, for example, define a time variable like this:
t_all = [1:5,11:15,21:25,31:34,41:45]';
You should rather easily be able to make some data and figure out the indices of the starts and stops of each loop, then check what you get from the code snippet and how to modify it.
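Following that suggestion, here is a minimal worked sketch using the mock t_all above. The measurement column (current) and the output name (loops) are invented for illustration, and padding shorter loops with NaN instead of 0 is an assumption about what is convenient downstream:

```matlab
t_all = [1:5, 11:15, 21:25, 31:34, 41:45]';   % mock time stamps from above
current = 10 * t_all;                          % made-up measurement column
dt_jump = 3;                                   % larger than the in-loop step of 1

idxEnd = [find(diff(t_all) > dt_jump); numel(t_all)];  % last row of each loop
idxStart = [1; idxEnd(1:end-1) + 1];                   % first row of each loop
nPerLoop = idxEnd - idxStart + 1;                      % loop lengths: 5, 5, 5, 4, 5

% One column per loop; loops shorter than the longest one are padded with NaN
loops = NaN(max(nPerLoop), numel(idxStart));
for iLoop = 1:numel(idxStart)
    iCurr = idxStart(iLoop):idxEnd(iLoop);
    loops(1:numel(iCurr), iLoop) = current(iCurr);
end
```

Stepping through this in the debugger shows idxStart as 1, 6, 11, 16, 20 and idxEnd as 5, 10, 15, 19, 24, and the fourth column of loops ends in NaN because that loop has only four points.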
David Probst
November 22, 2022
That is a good idea! Thank you! I appreciate the support; I am very, very new to coding.