Problem with dates and intersect

MC3105 on 3 Nov 2014
Edited: dpb on 4 Nov 2014
Hello everyone,
I came across the following problem this morning and haven't been able to fix it for the last few hours, so I am hoping someone here might be able to help out.
I have two date vectors. One contains all dates for the year 2011:
dat_2011=(datenum(2011,01,01,0,0,0):1/24:datenum(2011,12,31,23,0,0))';
The other one contains only some of the dates of the year 2011: 8461 in total. This vector is called dat_xx.
Now I want to use intersect to find out which indices in dat_2011 correspond to the dates in dat_xx, so it is very important to me to get ia and ib:
[dat_xx,ia,ib]=intersect(dat_2011,dat_xx,'rows');
When I run my code, MATLAB tells me that only 7722 dates are the same in the two vectors; ia and ib both have 7722 entries. The problem is that I know there should be 8461 dates that are part of both vectors.
Did I maybe do something wrong when I created the vector dat_2011?
I can use several MATLAB functions like hour, month, year, minute, second on all the elements in both of my date vectors, so I have no idea what the problem could be. MATLAB seems to recognize the elements in both vectors as dates.
Thanks a lot!

Accepted Answer

dpb on 3 Nov 2014
Edited: dpb on 4 Nov 2014
dat_2011=(datenum(2011,01,01,0,0,0):1/24:datenum(2011,12,31,23,0,0))';
... Did I maybe do something wrong when I created the vector dat_2011?
Ayup... you generated the date vector from the two end date values and a floating-point delta using colon, instead of using values generated internally by datenum. Use
dat_2011=datenum(2011,1,1,[0:24*365-1].',0,0);
instead. Internally datenum will generate self-consistent values that will work for comparisons while the colon operation uses different algorithms to minimize error between initial and final values.
The general rule is to use integer-valued increments of the proper size to span the desired time at the desired granularity, rather than using fractional days in floating point and introducing that external rounding error. The values will be very similar, but when you use floating-point comparisons later, even a single bit of difference in the least significant position will cause a failure.
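As a minimal sketch of that rule at a finer granularity, assume a 15-minute grid is wanted for 2011. The names mins and dat_15min are made up for illustration. The call relies on datenum carrying out-of-range minute values over into hours and days, the same behavior the hourly call above depends on.

```matlab
% Integer minute offsets from 2011-01-01 00:00, in steps of 15 minutes
mins = (0:15:60*24*365-15).';

% Let datenum do the conversion to fractional days internally
dat_15min = datenum(2011,1,1,0,mins,0);

% First and last entries, formatted for a quick sanity check
disp(datestr(dat_15min([1 end])))
```

The increment stays an exact integer (15) until datenum converts it, so no fractional step such as 1/96 is ever accumulated outside datenum.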
Try looking at the difference between the two series as generated above; I would expect you'll find that the difference is on the order of 1e-15 and symmetric around the midpoint of the series, owing to how : works internally.
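One way to look at that difference is sketched below. The variable names dat_colon, dat_int, and d are illustrative only, and the printed numbers may vary with MATLAB release, so none are quoted here.

```matlab
% Hourly series built with colon and a fractional-day step
dat_colon = (datenum(2011,1,1,0,0,0):1/24:datenum(2011,12,31,23,0,0)).';

% Same hours built from integer hour offsets inside datenum
dat_int = datenum(2011,1,1,(0:24*365-1).',0,0);

% Element-wise difference between the two series
d = dat_colon - dat_int;

n_diff  = nnz(d);                          % number of elements that differ at all
max_ulp = max(abs(d)) / eps(dat_int(1));   % largest difference in units of eps

fprintf('%d of %d elements differ; largest difference is %g eps\n', ...
        n_diff, numel(d), max_ulp);
```

Following this with plot(d,'.') shows where along the series the mismatches fall.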
ADDENDUM
I did the comparison. The actual difference between the two is
>> dat_2011=(datenum(2011,01,01,0,0,0):1/24:datenum(2011,12,31,23,0,0))';
>> dn=datenum(2011,1,1,[0:24*365-1].',0,0);
>> max(dat_2011-dn)
ans =
1.1642e-10
>>
In the previous error estimate I forgot to factor in the magnitude of datenums, which is roughly 7e5 for dates in 2011:
>> eps(dn(1))
ans =
1.1642e-10
>> eps(1)
ans =
2.2204e-16
>>
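The eps value above can be reproduced by hand. This is a small worked check, assuming IEEE double precision with 52 fraction bits; the names x, e, and spacing are illustrative.

```matlab
x = datenum(2011,1,1);     % 734504, which lies between 2^19 and 2^20
e = floor(log2(x));        % binary exponent of x, here 19
spacing = 2^(e - 52);      % gap between adjacent doubles near x, i.e. 2^-33

fprintf('eps(x) = %.4e, 2^(e-52) = %.4e\n', eps(x), spacing);

% One such gap expressed in microseconds of time
fprintf('one eps is about %.1f microseconds\n', spacing*86400*1e6);
```

So a one-bit mismatch between two datenums in 2011 is a time difference of about 10 microseconds: far too small to matter for hourly data, but enough to make an exact comparison fail.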
ADDENDUM 2
...The general rule is to use integer-valued increments of the proper size to span the desired time at the desired granularity, rather than using fractional days in floating point and introducing that external rounding error. The values will be very similar, but when you use floating-point comparisons later, even a single bit of difference in the least significant position will cause a failure.
ATTN: TMW. The above caution or similar should be in the documentation on date numbers. This issue arises repeatedly because what seems to be a reasonable way to generate the date vector is, in fact, guaranteed to fail as demonstrated above. This question comes up over and over and is, afaik, never mentioned in the docs. Although someone with some experience can infer it from floating-point behavior, it will catch virtually everybody at one time or another until they have seen or experienced it.
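If one of the vectors cannot be regenerated, a possible workaround (not part of the fix above, just a sketch) is to compare integer hour counts instead of the raw doubles. Here dat_sub is a hypothetical stand-in for dat_xx, and key_all and key_sub are illustrative names. The approach assumes all timestamps are meant to fall on whole hours.

```matlab
% Colon-built hourly vector, as in the question
dat_colon = (datenum(2011,1,1,0,0,0):1/24:datenum(2011,12,31,23,0,0)).';

% Hypothetical subset standing in for dat_xx: every third hour,
% built from integer hour offsets inside datenum
dat_int = datenum(2011,1,1,(0:24*365-1).',0,0);
dat_sub = dat_int(1:3:end);

% Exact comparison on the raw doubles may miss some matches
[~,ia_raw] = intersect(dat_colon, dat_sub);

% Rounding to whole hours removes the last-bit differences,
% so the comparison is done on exact integers
key_all = round(dat_colon*24);
key_sub = round(dat_sub*24);
[~,ia,ib] = intersect(key_all, key_sub);

fprintf('raw matches: %d, integer-key matches: %d of %d\n', ...
        numel(ia_raw), numel(ia), numel(dat_sub));
```

The rounding is safe here because the errors involved are a few eps (around 1e-10 days), many orders of magnitude below half an hour. For data at another granularity the scale factor would change, for example 24*60 for whole minutes.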
