Create a signalDatastore from csv files
Hello everyone,
I have to train some neural networks on a large data set, but this is my first time working with data of this size and I am stuck. I'd like to create a signalDatastore, which seems to be the best option for my purpose, from many CSV files. Each file contains around 80 features (one per column), but the features are not the same in every file. I'd like to create a datastore from these CSV files, select only the features present in every file (through SelectedVariableNames), and then go on to filtering, etc. I already know which features are shared among all of the files, and I would rather not read every file and pre-select them before creating the datastore, because that would be time- and resource-consuming.
If my workflow is not correct, please let me know, I'd be happy to hear from you.
Thanks in advance.
Accepted Answer
LeoAiE
24 Apr 2023
In your case, you can use a tabularTextDatastore to read the CSV files, and then use the SelectedVariableNames property to select only the shared features. Since you mentioned that you already know which features are shared among all the files, you can directly set the SelectedVariableNames property.
Here's an example of how to create a tabularTextDatastore from multiple CSV files, and select specific features using SelectedVariableNames:
% Create a list of CSV files
fileList = {'file1.csv', 'file2.csv', 'file3.csv'}; % Replace with your actual file names
% Create a tabularTextDatastore from the list of CSV files
ds = tabularTextDatastore(fileList, 'TreatAsMissing', 'NA', 'MissingValue', NaN);
% Set the selected features (Replace 'Feature1', 'Feature2', etc. with your actual shared feature names)
sharedFeatures = {'Feature1', 'Feature2', 'Feature3'};
ds.SelectedVariableNames = sharedFeatures;
% Read the data from the datastore
data = readall(ds);
% Proceed with filtering, processing, and training the neural network
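One caveat for large data: readall loads every selected column from every file into memory at once. A minimal sketch of reading in chunks instead is given here. The temporary demo files, the feature names, and the variable numRows are made up for illustration, and reading one file per call is an assumption, not a requirement.

```matlab
% Sketch: process the datastore chunk by chunk instead of calling readall.
% The demo files and feature names are hypothetical placeholders.
tmpDir = tempname;
mkdir(tmpDir);
for k = 1:3
    T = table(rand(5,1), rand(5,1), rand(5,1), ...
        'VariableNames', {'Feature1', 'Feature2', 'Feature3'});
    writetable(T, fullfile(tmpDir, sprintf('file%d.csv', k)));
end

% Build the datastore from the folder and keep only the wanted columns
ds = tabularTextDatastore(tmpDir, 'FileExtensions', '.csv');
ds.SelectedVariableNames = {'Feature1', 'Feature3'};
ds.ReadSize = 'file';   % one file per read; a positive row count also works

numRows = 0;
while hasdata(ds)
    chunk = read(ds);                    % table with only the selected columns
    numRows = numRows + height(chunk);   % replace with filtering or preprocessing
end
```

Only one chunk is held in memory at a time, so the peak memory use depends on ReadSize rather than on the total size of the data set.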
By using the SelectedVariableNames property, you'll only read the shared features from each CSV file, avoiding the need to pre-process or filter the data beforehand. This should help you save both time and resources.
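A further caveat, stated as an assumption rather than something tested here: tabularTextDatastore detects variable names and formats from the first file, so files whose columns differ in number or order may not be read as intended. If that turns out to be a problem, one possible alternative is a fileDatastore with a custom read function that picks the shared columns by name in each file. The sketch below uses made-up demo files and hypothetical names (pickShared, readShared, parts).

```matlab
% Sketch: per-file column selection by name with fileDatastore.
% Demo data (hypothetical): two CSV files with different, differently ordered columns.
tmpDir = tempname;
mkdir(tmpDir);
writetable(table([1;2], [3;4], [5;6], ...
    'VariableNames', {'Feature1', 'Feature2', 'OnlyInA'}), fullfile(tmpDir, 'a.csv'));
writetable(table([7;8], [9;10], [11;12], ...
    'VariableNames', {'OnlyInB', 'Feature2', 'Feature1'}), fullfile(tmpDir, 'b.csv'));

sharedFeatures = {'Feature1', 'Feature2'};

% Select columns by name; this errors if a file lacks one of the shared features
pickShared = @(T) T(:, sharedFeatures);
readShared = @(filename) pickShared(readtable(filename));

fds = fileDatastore(tmpDir, 'ReadFcn', readShared, 'FileExtensions', '.csv');

% Collected here only to show the result; with large data, process each
% table inside the loop instead of accumulating them.
parts = {};
while hasdata(fds)
    parts{end+1} = read(fds); %#ok<SAGROW>
end
data = vertcat(parts{:});
```

Note that this sketch parses each file completely before discarding the unshared columns, so it limits memory use rather than parsing time.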