Read many data blocks from text file

Timo on 13 Jun 2017
Commented: Timo on 7 Jul 2017
Hello everybody,
I am trying to read certain data blocks from several text files (as well as .lila files). All files are structured in more or less the same way: they have 2 columns (delimited by ;), and each block starts with a 17-line header, followed by a time series with hourly time steps in column 1 and the corresponding value in column 2. This scheme repeats many times within each file, but the time series of each data block is different, so the number of lines between two headers varies.
Here is an example of the data in a file:
Station;X1
Stationsnummer;1
Stationskennung;M
Betreiber;D1
Datenart;TTAU
Datentyp;P
Datenursprung;mes
Datenbezug;Station
Dimension;Grad C
Zeitintervall;01:00
Zeitzone;UTC+1
X-Koordinate;4385224
Y-Koordinate;5604764
Koordinatensystem;31468
Hoehe;450
Hohensystem;m ue NN
Kommentar;jweuiwe 01.01.1997 00:00;-100
01.01.1997 01:00;-1
01.01.1997 02:00;-1
01.01.1997 03:00;-1
01.01.1997 04:00;-1
01.01.1997 05:00;-1
01.01.1997 06:00;-3
01.01.1997 07:00;-1
01.01.1997 08:00;-1
01.01.1997 09:00;-1
01.01.1997 10:00;-1
01.01.1997 11:00;-4
Station;X2
Stationsnummer;2
Stationskennung;SMUE
Betreiber;DWD
Datenart;TTAU
Datentyp;P
Datenursprung;mes
Datenbezug;Station
Dimension;Grad C
Ztintervall;01:00
Zeitzone;UTC+1
X-Koordinate;4412996
Y-Koordinate;5613131
Koordinatensystem;31468
Hoehe;937
Hohensystem;m ue NN
Kommentar;wewerreewr
01.01.1997 00:00;-1
01.01.1997 01:00;-1
01.01.1997 02:00;-12...
In reality the time series is not restricted to a handful of measurements as shown above; it can contain thousands of lines, and each gauge has a different time series. My goal is to separate the time series of each station, that is, to drop each station's header and read the dates and corresponding values into a separate array or cell array. I have read a lot about how to do this and tried different ways, but I am new to this and am having great difficulty finding a solution. My approach so far:
FID=fopen('teststation.lila','r')
C=textscan(FID,'%s %s','delimiter',';')
C2=[C{:}]
g=0
for i=1:length(C2)
    if strcmp(C2{i,1},'Station')
        g=g+1
        Cneu{g,1}=i
    end
end
lCneu=length(Cneu)
Data={}
for t=1:lCneu
    for i=Cneu{t,1}+17:Cneu{t+1,1}-1
        Data{i-17,1}=C2{i,1}
    end
end
At first it reads the time series between the stations, but at the end the error "Index exceeds matrix dimensions" is raised (for the line with i=Cneu{t,1}+17:Cneu{t+1,1}-1). Then I tried it like this:
for t=1:lCneu
    if t>lCneu
        Cneu{t+1,1}=90
        for i=Cneu{t,1}+17:Cneu{t+1,1}-1
            Data{i-17,1}=C2{i,1}
        end
    end
end
But then I only get an empty cell-array.
During my search I often found recommendations involving ~feof(FID) and fgetl(FID), but I have not figured out how to use these commands or whether they are really helpful for my problem. Another option I was thinking about was to use fprintf, because I have to compare each time series against a reference time series to find missing dates (or hours).
Hopefully someone could help me. Thanks in advance :) Greetings
Timo

Accepted Answer

Jayaram Theegala on 26 Jun 2017
Since the data that you are dealing with has the same number of columns throughout, with the same delimiter, you can use the "readtable" function. For more information about this function, see its documentation page.
Once the data is in a table (with every entry stored as a character vector/string), you can process it further based on its contents, which also lets you remove the header rows.
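As a rough illustration of that filtering step, the sketch below marks a row as data when its first column looks like a timestamp and drops everything else. The variables col1 and col2 are hypothetical stand-ins for the two text columns of the table, and the regular expression is an assumption about the date layout based on the sample in the question.

```matlab
% Hypothetical sample rows standing in for the two text columns of the table
col1 = {'Station'; 'Kommentar'; '01.01.1997 00:00'; '01.01.1997 01:00'; 'Station'};
col2 = {'X1'; 'wewerreewr'; '-1'; '-3'; 'X2'};

% A row counts as data when column 1 has the form dd.MM.yyyy HH:mm
pattern = '^\d{2}\.\d{2}\.\d{4} \d{2}:\d{2}$';
isData = ~cellfun(@isempty, regexp(col1, pattern, 'once'));

% Keep only the data rows and convert them from text to typed arrays
dates  = datetime(col1(isData), 'InputFormat', 'dd.MM.yyyy HH:mm');
values = str2double(col2(isData));
```

A logical mask like this avoids a loop over the rows, although it still puts the data of all stations into one array.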
You may find the following MATLAB script helpful to get started:
% The sample data from the question was copied into test.txt
data = readtable('test.txt', 'Delimiter', ';', 'ReadVariableNames', false);
dates = datetime.empty(0, 1); % will store every date and time that parses
counter = 1;
for i = 1:height(data)
    try
        % HH is the 24-hour clock (hh is the 12-hour form)
        dates = [dates; datetime(data{i,1}, 'InputFormat', 'dd.MM.yyyy HH:mm')];
        counter = counter + 1;
    catch ME
        % Header rows do not parse as dates and are skipped here;
        % you can display that a header was omitted if you want
    end
end
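The script above gathers all timestamps into one array. To keep each station separate, as the question asks, one possible sketch is to locate every 'Station' row and append a sentinel index one past the last row, so that the final block also has an end. The names col1, col2, headerLen and blocks are hypothetical, and the header is shortened to 2 lines here to keep the sample small (it is 17 lines in the real files).

```matlab
% Hypothetical sample standing in for the two text columns of one file
col1 = {'Station'; 'Kommentar'; '01.01.1997 00:00'; '01.01.1997 01:00'; ...
        'Station'; 'Kommentar'; '01.01.1997 00:00'};
col2 = {'X1'; 'a'; '-1'; '-3'; 'X2'; 'b'; '-2'};

headerLen = 2;   % assumption for this sample; 17 in the real files

% Row index of every block start, plus a sentinel one past the last row
starts = find(strcmp(col1, 'Station'));
bounds = [starts; numel(col1) + 1];

blocks = cell(numel(starts), 1);
for k = 1:numel(starts)
    % Data rows lie between the end of this header and the next block start
    rows = (bounds(k) + headerLen):(bounds(k+1) - 1);
    blocks{k} = struct( ...
        'station', col2{starts(k)}, ...
        'dates',   datetime(col1(rows), 'InputFormat', 'dd.MM.yyyy HH:mm'), ...
        'values',  str2double(col2(rows)));
end
```

The sentinel is what prevents an "Index exceeds matrix dimensions" error of the kind described in the question: the last block reads up to numel(col1) instead of looking for a block start that does not exist.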
  1 Comment
Timo on 7 Jul 2017
Thank you, and sorry for my late response. This serves as a good introduction and makes the data extraction a lot easier; hopefully I won't struggle again :) Thanks a lot!
