MATLAB Answers

How do I extract the contents of an HTML table on a web page into a MATLAB table?

73 ビュー (過去 30 日間)
Pat Canny
Pat Canny 2020 年 6 月 23 日
I'd like to plot and analyze the TSA traveler data from this website: https://www.tsa.gov/coronavirus/passenger-throughput
The data is embedded on the page as an HTML table element.
How do I extract the table content into a MATLAB table?

採用された回答

Pat Canny
Pat Canny 2020 年 6 月 23 日
You can extract the <table> content, which is all stored in a set of <td> tags, as a string array and go from there.
You first need to use findElement and extractHTMLText on an htmlTree object.
You then can use reshape to arrange the data, then use array2table to convert to a table.
Here is one approach:
travel_data = webread('https://www.tsa.gov/coronavirus/passenger-throughput');
travel_data_tree = htmlTree(travel_data);
selector = "td";
subtrees = findElement(travel_data_tree,selector);
str = extractHTMLText(subtrees);
table_data = str(4:end); % first three elements are just the column names
reshape_ncols = 3;
reshape_nrows = length(table_data)/reshape_ncols;
table_data_reshaped = reshape(table_data,reshape_ncols,reshape_nrows)';
% Convert to table
traveler_data_table = array2table(table_data_reshaped,'VariableNames',["Date" "Travelers_Today" "Travelers_Last_Year"]); % I got lazy with VariableNames, I know.
% Convert data types from strings to appropriate types
traveler_data_table.Date = datetime(traveler_data_table.Date);
traveler_data_table.Travelers_Today = str2double(traveler_data_table.Travelers_Today);
traveler_data_table.Travelers_Last_Year = str2double(traveler_data_table.Travelers_Last_Year);
traveler_data_table.Traveler_Ratio = traveler_data_table.Travelers_Today ./ traveler_data_table.Travelers_Last_Year;
% Plot the results
figure
plot(traveler_data_table.Date,traveler_data_table.Traveler_Ratio)
title("TSA Traveler Ratio by Date (2020 vs. 2019)")
grid on
% Some more fun analysis
% When did it bottom out?
[min_ratio,idx] = min(traveler_data_table.Traveler_Ratio);
min_ratio_pct = 100*min_ratio;
min_date = traveler_data_table.Date(idx);
disp("The minimum traveler ratio of " + min_ratio_pct + "% occurred on " + string(min_date))
latest_pct = 100*traveler_data_table.Traveler_Ratio(1);
disp("The current ratio is " + latest_pct + "%")
  1 件のコメント
Yesid Fonseca Vargas
Yesid Fonseca Vargas 2021 年 3 月 16 日
Thank u, It works perfectly to solve my problem

サインインしてコメントする。

その他の回答 (0 件)

タグ

製品


リリース

R2020a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by