How to extract specific rows with requirements from multiple CSV files?

5 ビュー (過去 30 日間)
Ryan Wang
Ryan Wang 2017 年 8 月 28 日
コメント済み: MUKESH KUMAR 2020 年 6 月 3 日
Hi,
I am having a question regarding to data extracting and need your help.
I have roughly 700 CSV files under three folders with the same format. Each CSV contains 40 columns, and the columns I am interested in are the 32th and 33th column, which are "Latitude" and "Longitude".
Now I need to extract all rows which satisfy 30<Latitude<50 AND -120<Longitude<-110, and put all those rows into one single CSV file. What I have known so far is to open the file, textscan those data into 40 different cells, and close the file.
Could someone help me on this issue? Thanks!
Typical CSV files are like:
Date Time ...... Latitude Longitude ......
08/27 7:00 ...... 33.44 110.43 ......
08/28 7:00 ...... 35.44 160.43 ......
08/29 7:00 ...... 33.44 120.48 ......
08/29 7:00 ...... 32.44 150.42 ......
08/29 7:00 ...... 38.44 130.46 ......
08/29 7:00 ...... 53.44 110.63 ......
......
Two out of 600 sample CSV files have been attached.
  2 件のコメント
Akira Agata
Akira Agata 2017 年 8 月 28 日
To import CSV file, readtable function would be more easier, because it reads csv data and saves it to table format.
Regarding an extraction of specific rows in table, Access Data in a Table page will be helpful.
If you could upload your csv file here, I (...or someone) can provide more detailed solution. So, I would recommend uploading your sample data.
Ryan Wang
Ryan Wang 2017 年 8 月 28 日
Thank you very much Akira. I just uploaded two csv files, where I have nearly 600 of them. Specifically, the latitude and longitude columns are the 32th and the 33th, and their requirements are 33.975~33.982, and -117.373~-117.331, respectively.

サインインしてコメントする。

採用された回答

Akira Agata
Akira Agata 2017 年 8 月 28 日
Hi Ziran-san, thank you for uploading some sample csv files. Thanks to these files, I could understand the point!
Based on these files, I believe the solution for your problem will be like the following. In this code, I assumed that your csv files are stored in ./data folder.
But looking at your sample csv files, there are no rows that matches your condition. So, using your two csv files, the output of the following code is empty (I hope this is simply due to these two sample files...). Anyway, I would be happy if my answer could be your help! :-)
% List of CSV files
dataFolder = './data/'
list = dir([dataFolder,'*.csv']);
% CSV data format
format = ['%{MM/dd/yy HH:mm:ss}D',repmat('%f',1,39)];
% Extraction condition
latLim = [33.975, 33.982];
lonLim = [-117.373, -117.331];
% Read each CSV file, extract rows and store them to 'Output'
Output = table();
for kk = 1:numel(list)
data = readtable(...
fullfile(list(kk).folder, list(kk).name),...
'HeaderLines',2,...
'Format', format);
idxLat = (data{:,32} >= latLim(1)) & (data{:,32} <= latLim(2));
idxLon = (data{:,33} >= lonLim(1)) & (data{:,33} <= lonLim(2));
idx = idxLat & idxLon;
Output = [Output; data(idx,:)];
end
  3 件のコメント
Ryan Wang
Ryan Wang 2017 年 9 月 4 日
Hi Akira,
Just one further question. I am getting an error "Reference to non-existent field 'folder' today. Before this happened, the code works pretty well. Do you know why does this happen?
Thanks,
Ziran

サインインしてコメントする。

その他の回答 (0 件)

カテゴリ

Help Center および File ExchangeApplication Deployment についてさらに検索

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by