How do I read raw text from a webpage (ESPN Fantasy Baseball)?

James Bopp on 20 May 2019
Commented: James Bopp on 20 May 2019
I am trying to read the raw text from the following URL (using Google Chrome and MATLAB R2016b):
The format of the text is JSON. All I need is the raw text so I can parse it with JSONDECODE. Right now I'm having to manually copy and paste the text from the webpage into a text document, which allows me to read the data as a string. I don't want to do this. I am open to any methods that will allow me to go to the above url and directly read all of the text into a string.
Bonus problem: the above URL is for a public website. However, I would like to be able to log into ESPN at the following URL: http://www.espn.com/login and then proceed with getting the text from the URL.
To sum up, I want to do the following:
1) Log into ESPN (go to its URL and enter the username/password, either manually or programmatically)
2) Go to an ESPN url to read in its data (either the raw text which I can decode with JSONDECODE, or read it in directly as a JSON structure).

Answers (1)

Geoff Hayes on 20 May 2019
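James - as a first step, try using webread to read the data from the URL. When the server reports a JSON content type, webread already returns the decoded data; if you get plain text back instead, pass it to jsondecode. For example: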
urlToFetch = 'http://fantasy.espn.com/apis/v3/games/flb/seasons/2019/segments/0/leagues/15243217?view=mMatchupScore&view=mRoster&view=mScoreboard&view=mSettings&view=mTopPerformers&view=mTeam&view=modular&view=mNav';
jsonData = webread(urlToFetch);
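If you specifically want the raw text first (as asked in the question), one option is to tell webread not to decode the response. The following is only a sketch: it assumes the URL above is still reachable without logging in, and the variable names rawText and data are just illustrative.

```matlab
% Sketch: fetch the response body as raw text, then decode it explicitly.
urlToFetch = 'http://fantasy.espn.com/apis/v3/games/flb/seasons/2019/segments/0/leagues/15243217?view=mMatchupScore&view=mRoster&view=mScoreboard&view=mSettings&view=mTopPerformers&view=mTeam&view=modular&view=mNav';

% 'ContentType','text' asks webread for a character vector instead of an
% automatically decoded value; 'Timeout' is in seconds.
opts = weboptions('ContentType', 'text', 'Timeout', 30);

try
    rawText = webread(urlToFetch, opts);   % character vector with the JSON text
    data = jsondecode(rawText);            % MATLAB value built from the JSON
    if isstruct(data)
        disp(fieldnames(data));            % list the top-level field names
    end
catch err
    % Network failures, HTTP errors (such as 401) and decoding errors land here.
    fprintf('Request or decoding failed: %s\n', err.message);
end
```

Keeping the raw text around can be handy for saving a copy to disk before decoding it.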
As for logging into the website, that may be a little more difficult. Do you know of any public APIs that would allow this? I suspect that when you log in, a token of some kind is returned that you would need to send on subsequent calls to get the other data.
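If the site does hand back a session cookie or token after login, one possible way to send it along from MATLAB is through the KeyName/KeyValue options of weboptions, which add a single header field to the request. This is a rough sketch under assumptions: the cookie names and values below are made-up placeholders (copy the real ones from a logged-in browser session), and I have not confirmed that ESPN accepts requests authenticated this way.

```matlab
% Sketch: attach a cookie header to requests made with webread.
% ASSUMPTION: the cookie string is a hypothetical placeholder, not a real
% ESPN cookie. Replace it with the value copied from your own browser.
cookieString = 'sessionName=sessionValue; otherName=otherValue';

% KeyName/KeyValue add one header field (here "Cookie") to every request
% that uses these options.
opts = weboptions('ContentType', 'text', ...
                  'KeyName', 'Cookie', ...
                  'KeyValue', cookieString);

% Usage (privateUrl is a placeholder for the private league URL):
% rawText = webread(privateUrl, opts);
% data = jsondecode(rawText);
```

If the request still comes back with a 401, the cookie is probably missing, expired, or not what the server checks.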
3 Comments
Geoff Hayes on 20 May 2019
Edited: Geoff Hayes on 20 May 2019
The 401 error means that you are unauthorized and so need to provide the correct login credentials.
As for getting the content (the json) from the web browser, could you try something like
urlToFetch = 'http://fantasy.espn.com/apis/v3/games/flb/seasons/2019/segments/0/leagues/15243217?view=mMatchupScore&view=mRoster&view=mScoreboard&view=mSettings&view=mTopPerformers&view=mTeam&view=modular&view=mNav';
[stat,h] = web(urlToFetch);
jsonData = get(h, 'HtmlText') ;
using the private URL once you have logged in.
James Bopp on 20 May 2019
Thanks again Geoff,
It's funny, I had tried that before on a different problem I was working on and ran into trouble, so when I came upon this problem I thought I had already tried that. As it turns out, it seems I hadn't. The getHtmlText seems to work (just need to strip the HTML tags, but that's no problem). When I get home from work I will look more into this, but I believe you have solved the problem for my workaround and I can proceed with my project. Thanks!
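For anyone following along, stripping the tags might look roughly like the sketch below. The htmlText value is a made-up sample standing in for what the browser returns, and the regular expression is a simple assumption that would also remove any tag-like text inside the JSON strings themselves.

```matlab
% Sketch: strip HTML tags from browser text and decode the remaining JSON.
% ASSUMPTION: htmlText is a hypothetical sample, not real ESPN output.
htmlText = '<html><head></head><body><pre>{"id":15243217,"teams":[{"abbrev":"AAA"},{"abbrev":"BBB"}]}</pre></body></html>';

% Remove anything that looks like an HTML tag, then trim the whitespace.
rawJson = strtrim(regexprep(htmlText, '<[^>]*>', ''));

% Decode the remaining text; an array of objects with identical fields
% becomes a struct array.
data = jsondecode(rawJson);

fprintf('League %d has %d teams\n', data.id, numel(data.teams));
```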
Now, if anyone can figure out how to log into ESPN's website and use that for webscraping follow-on calls to ESPN's site, that would still be greatly appreciated.
