Parsing ⅛ and ⅓ Characters from actxserver Outlook Mail object Body and Converting to Floats

Mark Whirdy on 31 Jan 2014
Commented: Walter Roberson on 5 Feb 2014
Hi all
I am parsing Outlook mails in Matlab by actxserver and regexp.
Some mails contain fraction characters.
The ½, ¼, ¾ characters are read ok, but the eighths (⅛, ⅜, ⅝, ⅞) and thirds (⅓, ⅔) are present in the Body property of the mail object as "?" [char(63)], as seen when the mail body is printed at the command line.
Matlab recognises only ¼ ½ ¾ [char(188:190)], so I guess I need to access non-ASCII chars. It's not clear whether the issue is Matlab's 16-bit Unicode or the actxserver object. The characters are available in the Arial font on Windows Vista as U+215C, U+215E, etc.
You can verify this for yourself by emailing yourself a mail with the subject line
⅛¼⅓⅜½⅝⅔¾⅞
and then running the code below in Matlab to regexp the subject line of that mail in your inbox. Put a breakpoint at the regexp line to inspect what the subject variable looks like; you should see "?" in there.
Two questions here:
1. How could I extend Matlab's ASCII set to read these characters?
2. Is there a neat way to convert them into equivalent floats (3¼ --> 3.25) within regexp?
Grateful for any suggestions here
Mark
% The function below will need to be adapted depending on how your Outlook folders are set up:
function myfrac = TestReadFractions
    outlook = actxserver('Outlook.Application');
    mapi = outlook.GetNamespace('mapi');
    folder1 = mapi.Folders(1);
    myaccount = folder1.Item(2);
    inboxmails = myaccount.Folders.Item(2).Folders.Item(9).Items;
    count = inboxmails.Count;
    myfrac = {};
    % Check the last (up to) 11 items in the folder without indexing below 1
    for i = count:-1:max(count-10, 1)
        if strcmp(inboxmails.Item(i).SenderEmailAddress,'yourname@youraddress.com')
            subject = inboxmails.Item(i).Subject;      % Mail Subject-Line
            myfrac = regexp(subject,'\x215c','match'); % look for U+215C
        end
    end
end
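On the second question (3¼ --> 3.25), one possible approach is a lookup table from code point to value, applied to regexp tokens. The sketch below is only an illustration and does not come from the thread: the variable names (codes, values, fracClass) are made up, and it assumes the string already holds the real Unicode characters rather than "?".

```matlab
% Code points of the vulgar fractions and their numeric values.
% Order: 1/4 1/2 3/4 (U+00BC..U+00BE), 1/3 2/3 (U+2153, U+2154),
%        1/8 3/8 5/8 7/8 (U+215B..U+215E)
codes  = [188 189 190 8531 8532 8539 8540 8541 8542];
values = [1/4 1/2 3/4 1/3  2/3  1/8  3/8  5/8  7/8];

% Example input built from code points, equivalent to '3¼ and 99⅞'
str = ['3' char(188) ' and 99' char(8542)];

% Optional integer part followed by exactly one fraction character
fracClass = ['[' char(codes) ']'];
tok = regexp(str, ['(\d*)(' fracClass ')'], 'tokens');

result = zeros(1, numel(tok));
for k = 1:numel(tok)
    whole = str2double(tok{k}{1});
    if isnan(whole)
        whole = 0;                 % no integer part, e.g. a bare '½'
    end
    result(k) = whole + values(codes == double(tok{k}{2}));
end
```

Here result holds one number per matched mixed fraction. Building the characters with char(...) avoids depending on the encoding of the .m file itself.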

Answers (1)

Walter Roberson on 3 Feb 2014
regexprep('ABC','B','\x215c')
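The one-liner above appears to make the point that a MATLAB char array can itself hold U+215C. A minimal sketch of the same check, written with char and hex2dec instead of a regexp escape (the variable names are illustrative only):

```matlab
% Build the three-eighths character directly from its code point
c = char(hex2dec('215C'));          % U+215C
s = ['A' c 'C'];

codes = double(s);                  % numeric code of each character
isMatch = ~isempty(regexp(s, c, 'once'));   % regexp can find it in s
```

If codes shows a value above 255 in the middle position, MATLAB is storing the character correctly, which would suggest that the "?" is introduced before the string reaches MATLAB.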
  4 Comments
Mark Whirdy on 5 Feb 2014
Edited: Mark Whirdy on 5 Feb 2014
Hi Walter,
emailing myself "⅛¼⅓⅜½⅝⅔¾⅞" and reading as per code above gives
K>> subject+0
ans =
63 188 63 63 189 63 63 190 63
Since all the question marks have the same integer value, 63, I think that passing the string through native2unicode will not work.
I can't change the PC locale, as I think this would affect all other applications.
Do you know why changing the InternetCodepage property of the Outlook mail doesn't work? (I.e. as above, if I set it to anything other than 65001, it is still 65001 when I check it again.) I guess the property is immutable; perhaps there is a way of setting it in the actxserver constructor? But even if I can do this, I don't know which InternetCodepage value would fix it.
Mark
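The code values reported in this comment can be compared against the code points that were sent. The sketch below only restates those numbers (variable names are illustrative) and shows which positions were lost:

```matlab
% Code points of the subject line that was sent: ⅛¼⅓⅜½⅝⅔¾⅞
expected = [8539 188 8531 8540 189 8541 8532 190 8542];
% Values reported by subject+0 in the comment above
received = [63 188 63 63 189 63 63 190 63];

% Positions where a character was replaced by '?' (code 63)
lost = find(received == 63 & expected ~= 63);

% Every lost character has a code point above 255; every survivor is below 256
allLostAreWide  = all(expected(lost) > 255);
survivorsNarrow = all(expected(received ~= 63) < 256);
```

Only the characters with code points below 256 survive, which would be consistent with the text passing through an 8-bit code page somewhere between Outlook and MATLAB, although the thread does not establish where. Because six different characters all arrive as the same value 63, no later decoding step such as native2unicode can tell them apart again.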
Walter Roberson on 5 Feb 2014
Sorry, I am not familiar with how InternetCodepage properties work.
