Strange behavior from readtable
現在この質問をフォロー中です
- フォローしているコンテンツ フィードに更新が表示されます。
- コミュニケーション基本設定に応じて電子メールを受け取ることができます。
エラーが発生しました
ページに変更が加えられたため、アクションを完了できません。ページを再度読み込み、更新された状態を確認してください。
古いコメントを表示
I have some text files that I'm reading in with readtable. For most of the files, everything seems to work fine (example1.txt). Strings come through as strings, and numbers come through as numbers. But for some reason, I'm getting strange results when trying to read in example2.txt. The last 2 columns....which are all numbers....are coming through as strings. I'm thinking there has to be some kind of strange format in the example2.txt file, but I've scanned both files, and don't see anything that would be causing this. My call is simply data = readtable(filename). Obviously I'll try reading in everything as strings to try to fix this, but was just curious as to what could be causing this.
1 件のコメント
Frank
2023 年 6 月 27 日
Oops....my call is data = readtable(filename,'NumHeaderLines',2,'TreatAsMissing','-999')
採用された回答
While READTABLE's automagic file format detection is great, the more a file deviates (missing data, lines filled with asterisks) the more help you will have to give it. For this use DETECTIMPORTOPTIONS, SETVARTYPE, etc.
Once it is set up correctly you do not need to repeat this for each file: you can reuse the options object for all files.
F1 = 'example1.txt';
F2 = 'example2.txt';
Ob = detectImportOptions(F1, 'FileType','fixedwidth', 'Range',1, 'TreatAsMissing','-999');
Ob = setvartype(Ob, 'double');
Ob = setvartype(Ob, {'Latitude___','Longitude___'},'char');
Ob = setvartype(Ob, {'FlightTimeSinceLaunch_mins_secs_','UTCTime_hrs_mins_secs_'},'duration');
Ob = setvaropts(Ob, 'FlightTimeSinceLaunch_mins_secs_', 'InputFormat','mm:ss');
T1 = readtable(F1,Ob)
Warning: Column headers from the file were modified to make them valid MATLAB identifiers before creating variable names for the table. The original column headers are saved in the VariableDescriptions property.
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names.
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names.
T1 = 7063×10 table
FlightTimeSinceLaunch_mins_secs_ UTCTime_hrs_mins_secs_ Altitud___ Latitude___ Longitude___ Pressure_hPa_ Temperature__C_ Humidity__RH_ WindSpeed_m_s_ WindDirection___
________________________________ ______________________ __________ ___________ ____________ _____________ _______________ _____________ ______________ ________________
-08:31 NaN NaN {0×0 char} {0×0 char} 1023.6 NaN NaN NaN NaN
-08:30 NaN NaN {0×0 char} {0×0 char} 1023.6 28.7 NaN NaN NaN
-08:29 NaN NaN {0×0 char} {0×0 char} 1023.6 28.7 NaN NaN NaN
-08:28 NaN NaN {0×0 char} {0×0 char} 1023.6 28.6 74.9 NaN NaN
-08:27 NaN NaN {0×0 char} {0×0 char} 1023.6 28.6 75 NaN NaN
-08:26 NaN NaN {0×0 char} {0×0 char} 1023.6 28.6 75.1 NaN NaN
-08:25 NaN NaN {0×0 char} {0×0 char} 1023.6 28.6 75.1 NaN NaN
-08:24 NaN NaN {0×0 char} {0×0 char} 1023.6 NaN NaN NaN NaN
-08:23 NaN NaN {0×0 char} {0×0 char} 1023.6 28.6 75.2 NaN NaN
-08:22 NaN NaN {0×0 char} {0×0 char} 1023.6 28.6 75.2 NaN NaN
-08:21 NaN NaN {0×0 char} {0×0 char} 1023.6 28.6 75.1 NaN NaN
-08:20 NaN NaN {0×0 char} {0×0 char} 1023.6 28.6 75 NaN NaN
-08:19 NaN NaN {0×0 char} {0×0 char} 1023.6 28.6 75 NaN NaN
-08:18 NaN NaN {0×0 char} {0×0 char} 1023.6 28.6 75 NaN NaN
-08:17 NaN NaN {0×0 char} {0×0 char} 1023.6 28.6 75 NaN NaN
-08:16 NaN NaN {0×0 char} {0×0 char} 1023.6 28.6 75 NaN NaN
T1{270:279,2}
ans = 10×1 duration array
NaN
NaN
NaN
NaN
14:41:01
14:41:02
14:41:03
14:41:04
14:41:05
14:41:06
T2 = readtable(F2,Ob)
Warning: Column headers from the file were modified to make them valid MATLAB identifiers before creating variable names for the table. The original column headers are saved in the VariableDescriptions property.
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names.
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names.
T2 = 7136×10 table
FlightTimeSinceLaunch_mins_secs_ UTCTime_hrs_mins_secs_ Altitud___ Latitude___ Longitude___ Pressure_hPa_ Temperature__C_ Humidity__RH_ WindSpeed_m_s_ WindDirection___
________________________________ ______________________ __________ _______________ ________________ _____________ _______________ _____________ ______________ ________________
-12:57 NaN NaN {0×0 char } {0×0 char } 1021.8 26.7 NaN NaN NaN
-12:56 NaN NaN {0×0 char } {0×0 char } 1021.8 26.7 NaN NaN NaN
-12:55 NaN NaN {0×0 char } {0×0 char } 1021.8 26.7 NaN NaN NaN
-12:54 NaN NaN {0×0 char } {0×0 char } 1021.8 26.7 NaN NaN NaN
-12:53 NaN NaN {0×0 char } {0×0 char } 1021.8 26.7 74.1 NaN NaN
-12:52 NaN NaN {0×0 char } {0×0 char } 1021.8 NaN NaN NaN NaN
-12:51 NaN NaN {0×0 char } {0×0 char } 1021.8 NaN NaN NaN NaN
-12:50 NaN NaN {0×0 char } {0×0 char } 1021.8 26.7 74.2 NaN NaN
-12:49 NaN NaN {0×0 char } {0×0 char } 1021.8 26.7 74.2 NaN NaN
-12:48 NaN NaN {0×0 char } {0×0 char } 1021.8 26.7 74.2 NaN NaN
-12:47 NaN NaN {0×0 char } {0×0 char } 1021.8 26.7 74.2 NaN NaN
-12:46 NaN NaN {0×0 char } {0×0 char } 1021.8 26.7 74.2 NaN NaN
-12:45 10:02:23 -18.3 {'28°28'32.9"'} {'-80°33'29.7"'} 1021.8 26.7 74.2 NaN NaN
-12:44 10:02:24 -17.7 {'28°28'32.9"'} {'-80°33'29.7"'} 1021.8 26.7 74.2 NaN NaN
-12:43 10:02:25 -18.4 {'28°28'32.9"'} {'-80°33'29.7"'} 1021.8 26.7 74.2 NaN NaN
-12:42 10:02:26 -18.9 {'28°28'32.9"'} {'-80°33'29.7"'} 1021.8 26.7 74.2 NaN NaN
Note that you can use CONVERTVARS to efficiently convert the degree+minute+second text into numeric vectors (or scalars), e.g.:
T2 = convertvars(T2,@iscell,@myfun)
T2 = 7136×10 table
FlightTimeSinceLaunch_mins_secs_ UTCTime_hrs_mins_secs_ Altitud___ Latitude___ Longitude___ Pressure_hPa_ Temperature__C_ Humidity__RH_ WindSpeed_m_s_ WindDirection___
________________________________ ______________________ __________ ___________________ ___________________ _____________ _______________ _____________ ______________ ________________
-12:57 NaN NaN NaN NaN NaN NaN NaN NaN 1021.8 26.7 NaN NaN NaN
-12:56 NaN NaN NaN NaN NaN NaN NaN NaN 1021.8 26.7 NaN NaN NaN
-12:55 NaN NaN NaN NaN NaN NaN NaN NaN 1021.8 26.7 NaN NaN NaN
-12:54 NaN NaN NaN NaN NaN NaN NaN NaN 1021.8 26.7 NaN NaN NaN
-12:53 NaN NaN NaN NaN NaN NaN NaN NaN 1021.8 26.7 74.1 NaN NaN
-12:52 NaN NaN NaN NaN NaN NaN NaN NaN 1021.8 NaN NaN NaN NaN
-12:51 NaN NaN NaN NaN NaN NaN NaN NaN 1021.8 NaN NaN NaN NaN
-12:50 NaN NaN NaN NaN NaN NaN NaN NaN 1021.8 26.7 74.2 NaN NaN
-12:49 NaN NaN NaN NaN NaN NaN NaN NaN 1021.8 26.7 74.2 NaN NaN
-12:48 NaN NaN NaN NaN NaN NaN NaN NaN 1021.8 26.7 74.2 NaN NaN
-12:47 NaN NaN NaN NaN NaN NaN NaN NaN 1021.8 26.7 74.2 NaN NaN
-12:46 NaN NaN NaN NaN NaN NaN NaN NaN 1021.8 26.7 74.2 NaN NaN
-12:45 10:02:23 -18.3 28 28 32.9 -80 33 29.7 1021.8 26.7 74.2 NaN NaN
-12:44 10:02:24 -17.7 28 28 32.9 -80 33 29.7 1021.8 26.7 74.2 NaN NaN
-12:43 10:02:25 -18.4 28 28 32.9 -80 33 29.7 1021.8 26.7 74.2 NaN NaN
-12:42 10:02:26 -18.9 28 28 32.9 -80 33 29.7 1021.8 26.7 74.2 NaN NaN
Basic matrix multiplication could also be very useful to combine the degrees+minutes+seconds into degrees:
T2.Longitude___ * [1;1/60;1/3600]
ans = 7136×1
NaN
NaN
NaN
NaN
NaN
NaN
NaN
NaN
NaN
NaN
function V = myfun(C);
V = nan(numel(C),3);
V(~cellfun(@isempty,C),:) = sscanf([C{:}],'%f°%f''%f"',[3,Inf]).';
end
Thinks out loud: it would be interesting if DURATION accepted a wider range of formats, then degree+minutes+seconds could be natively imported as a duration type...
3 件のコメント
Frank
2023 年 6 月 28 日
Note to all, I erred saying example2.txt was the problem file......it's in fact example1.txt which is the problem file. Sorry about that. Also note that it's really the last 5 columns I'm interested in. I don't really care about the date or lat/lon data.
Thanks for the feedback Stephen. It's interesting when I do 0b.VariableTypes on example1, the result is...
'char' 'char' 'char' 'char' 'char' 'double' 'double' 'double' 'char' 'char'
while the results for example2 are...
'char' 'duration' 'double' 'char' 'char' 'double' 'double' 'double' 'double' 'double'
Randomly selecting several other files, they have the same result as example2......which is what I expect.
I guess example1.txt just has something strange embedded in it that is throwing the read off. I'll either have to toss that file, or build a different read function.
" It's interesting when I do 0b.VariableTypes on example1, the result is..."
That should not be the result after the STEVARTYPE calls. Please check after the variable types and options have been set. That is rather the whole point of many answer, I would be very surprised if it did not work.
As I wrote in my answer, you can reuse the same options object for all files. It should not change.
Frank
2023 年 6 月 28 日
Yes.....I'm dumb.....sorry about that. I was so fixated on the fact that example1.txt was different than the others, that I didn't realize that I can specify what those data types should be. Thanks Stephen, your suggestion did the job. I actually made it simple by doing...
Ob = detectImportOptions(F2, 'FileType','fixedwidth', 'Range',1, 'TreatAsMissing','-999');
and then...
data = readtable('example1.txt',Ob);
Thanks all for the help!!
その他の回答 (0 件)
カテゴリ
ヘルプ センター および File Exchange で MATLAB についてさらに検索
タグ
参考
Web サイトの選択
Web サイトを選択すると、翻訳されたコンテンツにアクセスし、地域のイベントやサービスを確認できます。現在の位置情報に基づき、次のサイトの選択を推奨します:
また、以下のリストから Web サイトを選択することもできます。
最適なサイトパフォーマンスの取得方法
中国のサイト (中国語または英語) を選択することで、最適なサイトパフォーマンスが得られます。その他の国の MathWorks のサイトは、お客様の地域からのアクセスが最適化されていません。
南北アメリカ
- América Latina (Español)
- Canada (English)
- United States (English)
ヨーロッパ
- Belgium (English)
- Denmark (English)
- Deutschland (Deutsch)
- España (Español)
- Finland (English)
- France (Français)
- Ireland (English)
- Italia (Italiano)
- Luxembourg (English)
- Netherlands (English)
- Norway (English)
- Österreich (Deutsch)
- Portugal (English)
- Sweden (English)
- Switzerland
- United Kingdom (English)
