How do you extract from a website table?
Christopher Taylor
June 3, 2022
Answered: Toshiaki Takeuchi
October 24, 2023
I'm trying to extract data from the table on this page (http://www.newyorkschools.com/districts/nyc-district-11.html).
I've tried to use webread, but it isn't quite working for me. I'm attempting to extract the school names and grade levels and then place them into an Excel file. (I'm helping a friend who is starting a STEM program.)
How do you think I should do this?
url ='http://www.newyorkschools.com/districts/nyc-district-7.html';
data = webread(url)
tree=htmlTree(url)
selector = 'School Name'
subtrees = findElement(tree,selector)
subtrees(:)
Accepted Answer
Christopher Creutzig
June 7, 2022
The problem with this page is that it is not using an HTML <table> for the data you are looking for. Otherwise, you would be able to simply use readtable(url) or maybe readtable(url,TableIndex=2).
Also, the selector needs to match what is actually in the HTML source, which this particular page again does not make easy. What you need to put there is determined by the page, not by MATLAB.
Here's something to get you started:
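url = 'http://www.newyorkschools.com/districts/nyc-district-7.html';
data = webread(url);                              % download the page source as text
tree = htmlTree(data);                            % parse the HTML (Text Analytics Toolbox)
tabs = findElement(tree,"#myTabContent > div");   % direct <div> children of #myTabContent
schools = tabs(1);                                % the first one is taken as the school list
rows = findElement(schools,".p_div");             % elements with class "p_div" (one per row)
schoolnames = findElement(schools,".pp-col-40");  % elements with class "pp-col-40" (the names)
extractHTMLText(schoolnames)                      % plain text of each match, as a string array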
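To get from there to the Excel file the question asks for, one possible continuation is to walk the rows, pull the name and grade cell out of each, and write the result with writetable. The sketch below runs on a small inline HTML string instead of the live page so it does not depend on the site. The class names "p_div" and "pp-col-40" come from the code above; the class "pp-col-20" for the grade cell, the sample school names, and the output file name are assumptions made for illustration, so check the real page source before reusing them.

```matlab
% Stand-in for the page structure. "pp-col-20" and the sample rows are
% hypothetical; only "p_div" and "pp-col-40" come from the answer above.
html = "<div id='myTabContent'><div>" + ...
    "<div class='p_div'><div class='pp-col-40'>School A</div>" + ...
    "<div class='pp-col-20'>K-5</div></div>" + ...
    "<div class='p_div'><div class='pp-col-40'>School B</div>" + ...
    "<div class='pp-col-20'>6-8</div></div>" + ...
    "</div></div>";

tree = htmlTree(html);                            % requires Text Analytics Toolbox
tabs = findElement(tree,"#myTabContent > div");
rows = findElement(tabs(1),".p_div");             % one element per school

% Extract both cells row by row so names and grades stay aligned.
names  = strings(numel(rows),1);
grades = strings(numel(rows),1);
for k = 1:numel(rows)
    names(k)  = extractHTMLText(findElement(rows(k),".pp-col-40"));
    grades(k) = extractHTMLText(findElement(rows(k),".pp-col-20"));
end

T = table(names,grades,VariableNames=["SchoolName","GradeLevel"]);

% Hypothetical output location; change the path as needed.
outfile = fullfile(tempdir,"schools.xlsx");
writetable(T,outfile);
```

Going row by row, rather than calling findElement once per column on the whole tab, keeps a name and its grade together even if some row is missing a cell. A row with a missing cell would make the assignment in the loop fail, which at least makes the problem visible.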
More Answers (2)
Seth Furman
June 6, 2022
Try the approach suggested in the following MATLAB Answers post.
Toshiaki Takeuchi
October 24, 2023
You can use readtable when the page does contain an HTML <table>: https://www.mathworks.com/help/matlab/ref/readtable.html
url = "https://www.mathworks.com/help/matlab/text-files.html";
T = readtable(url,TableSelector="//TABLE[contains(.,'readtable')]", ...
ReadVariableNames=false)
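To try the two ways of picking a table without depending on a live page, here is a minimal sketch that writes a small HTML file with two tables and reads the second one back, once with an XPath TableSelector and once with TableIndex. The file name and table contents are made up for illustration, and writelines assumes R2022a or later.

```matlab
% Write a small HTML file containing two tables (contents are made up).
file = fullfile(tempdir,"two_tables.html");
html = [
    "<html><body>"
    "<table><tr><th>Name</th><th>Value</th></tr>"
    "<tr><td>alpha</td><td>1</td></tr></table>"
    "<table><tr><th>School</th><th>Grades</th></tr>"
    "<tr><td>School A</td><td>K-5</td></tr></table>"
    "</body></html>"];
writelines(html,file);                 % one string per line; R2022a or later

% Select a table by content: the first <table> whose text contains 'School'.
T1 = readtable(file,TableSelector="//TABLE[contains(.,'School')]");

% Select a table by position: the second <table> in the document.
T2 = readtable(file,TableIndex=2);

disp(T1)
disp(T2)
```

Selecting by content tends to survive layout changes better than selecting by position, since TableIndex breaks as soon as the page adds or removes a table earlier in the document.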