How do you extract from a website table?

Question

Christopher Taylor, June 3, 2022


https://jp.mathworks.com/matlabcentral/answers/1732795-how-do-you-extract-from-a-website-table

Answered: Toshiaki Takeuchi, October 24, 2023

I'm trying to extract data from the table on this page (http://www.newyorkschools.com/districts/nyc-district-11.html).

I've tried to use webread, but it isn't quite working for me. I'm attempting to extract the school names and grade levels and then place them into an Excel file. (I'm helping a friend who is starting a STEM program.)

How should I approach this? Here is what I have so far:

url = 'http://www.newyorkschools.com/districts/nyc-district-7.html';
data = webread(url)
tree = htmlTree(url)
selector = 'School Name'
subtrees = findElement(tree,selector)
subtrees(:)


Answer 1

Christopher Creutzig, June 7, 2022


The problem is that this page does not use an HTML <table> element for the data you want. If it did, you could simply call readtable(url), or readtable(url,TableIndex=2) to pick the second table on the page.
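For pages that do use real <table> elements, a minimal sketch of the TableIndex approach could look like the following. The HTML, the file name two_tables_demo.html, and the school names are all made up for illustration, and the page is written to a temporary file so the example does not depend on any live site. It assumes a release where readtable can import HTML (R2021b or later) and where writelines is available (R2022a or later).

```matlab
% Hypothetical two-table page; the first table stands in for a navigation
% menu, the second for the data we actually want.
file = fullfile(tempdir, "two_tables_demo.html");
html = "<html><body>" + ...
    "<table><tr><th>Menu</th></tr><tr><td>Home</td></tr></table>" + ...
    "<table><tr><th>School</th><th>Grades</th></tr>" + ...
    "<tr><td>Example School A</td><td>K-5</td></tr>" + ...
    "<tr><td>Example School B</td><td>6-8</td></tr></table>" + ...
    "</body></html>";
writelines(html, file);             % save the page to disk

% TableIndex=2 selects the second <table> in document order.
T = readtable(file, TableIndex=2);
disp(T)
```

The same call should work with a URL in place of the file name, provided the page really contains <table> markup.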

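Also, the selector passed to findElement must be a CSS selector that matches the page's HTML source; visible text such as 'School Name' will not match anything. This page does not make finding the right selector easy, and the answer is dictated by the page's markup, not by MATLAB.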

Here's something to get you started:

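url = 'http://www.newyorkschools.com/districts/nyc-district-7.html';
data = webread(url);                             % download the page as HTML text
tree = htmlTree(data);                           % parse the HTML (Text Analytics Toolbox)
tabs = findElement(tree,"#myTabContent > div");  % the <div> children of #myTabContent
schools = tabs(1);                               % first tab
rows = findElement(schools,".p_div");            % one element per row (not used below)
schoolnames = findElement(schools,".pp-col-40"); % name cells, including the header cell
extractHTMLText(schoolnames)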
ans = 38×1 string array
    "School Name"
    "Academy For Public Relations"
    "Alfred E. Smith Vocational High School"
    "Bronx Academy Of Letters"
    "Community High School For Social Justice"
    "Foreign Language Academy Of Global Studies"
    "Health Opportunities Program"
    "Hostos-Lincoln Academy Of Science"
    "I.S. 184 Rafael C. Y. Molina School"
    "Is 222"
    "J.H.S. 151 Henry Lou Gehrig Junior High School"
    "Jhs 162 L. Rodriguez De Tio School"
    "Mott Haven Village Prep High School"
    "Ms 203"
    "Ms 223 The Labratory School Of Finance"
    "New Explorers High School"
    "P.S. 1 Courtland School"
    "P.S. 154 Jonathan D. Hyatt School"
    "P.S. 156 Benjamin Banneker School"
    "P.S. 157 Grove Hill School"
    "P.S. 161 Ponce De Leon School"
    "P.S. 18 John Peter Zenger School"
    "P.S. 220 Mott Haven Village School"
    "P.S. 25 Bilingual School"
    "P.S. 277"
    "P.S. 30 Wilton School"
    "P.S. 43 Jonas Bronck School"
    "P.S. 49 Willis Avenue School"
    "P.S. 5 Port Morris School"
    "P.S. 65 Mother Hale Academy"

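To get from here to the Excel file mentioned in the question, one option is to walk the rows, pull each cell out by class, and write the result with writetable. The sketch below runs on a made-up HTML fragment rather than the real page: only .p_div and .pp-col-40 come from the code above, while the .pp-col-20 class for the grade column, the school names, and the grade values are assumptions for illustration. The real class name has to be read from the page source.

```matlab
% Hypothetical fragment mimicking a div-based "table".
% .pp-col-20 is an ASSUMED class for the grade cell; check the real source.
html = "<div id='myTabContent'><div>" + ...
    "<div class='p_div'><div class='pp-col-40'>School Name</div>" + ...
    "<div class='pp-col-20'>Grades</div></div>" + ...
    "<div class='p_div'><div class='pp-col-40'>Example School A</div>" + ...
    "<div class='pp-col-20'>6-8</div></div>" + ...
    "<div class='p_div'><div class='pp-col-40'>Example School B</div>" + ...
    "<div class='pp-col-20'>PK-5</div></div>" + ...
    "</div></div>";

tree = htmlTree(html);                 % requires Text Analytics Toolbox
rows = findElement(tree, ".p_div");    % one element per row

names  = strings(numel(rows), 1);
grades = strings(numel(rows), 1);
for k = 1:numel(rows)
    % Assumes every row contains exactly one cell of each class.
    names(k)  = extractHTMLText(findElement(rows(k), ".pp-col-40"));
    grades(k) = extractHTMLText(findElement(rows(k), ".pp-col-20"));
end

% Drop the header row and collect the rest into a table.
T = table(names(2:end), grades(2:end), VariableNames=["SchoolName","Grade"]);

% Write the table to an Excel workbook in the temporary folder.
writetable(T, fullfile(tempdir, "schools_demo.xlsx"));
```

Looping over rows instead of collecting each column separately keeps names and grades aligned if a cell is missing, although a row that lacks one of the cells would need extra handling.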

Answer 2

Seth Furman, June 6, 2022


Try the approach suggested in the following MATLAB Answers post.

https://www.mathworks.com/matlabcentral/answers/553537-how-do-i-extract-the-contents-of-an-html-table-on-a-web-page-into-a-matlab-table


Answer 3

Toshiaki Takeuchi, October 24, 2023


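You can use readtable (https://www.mathworks.com/help/matlab/ref/readtable.html). Its TableSelector option takes an XPath expression, so you can pick a table on the page by its content: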

url = "https://www.mathworks.com/help/matlab/text-files.html";
T = readtable(url,TableSelector="//TABLE[contains(.,'readtable')]", ...
    ReadVariableNames=false)
T = 4×2 table
          Var1                             Var2                    
    ________________    ___________________________________________

    "readtable"         "Create table from file"                   
    "writetable"        "Write table to file"                      
    "readtimetable"     "Create timetable from file (Since R2019a)"
    "writetimetable"    "Write timetable to file (Since R2019a)"   
