MATLAB Answers

webread CSV is missing rows that show in the browser

Ryan Klingert on 30 Jan 2020
Answered: Guillaume on 7 Feb 2020
I want to start by saying I am pretty new to web scraping. While I have had some success working with HTML and the string editing functions, I haven't been able to figure out how to download a table.
The overall background is that I am working on a project to build a roster-picking model for daily fantasy sports. Several websites, including the one I am using, publish relatively accurate projections of each player's daily points. To backtest my model I need to collect projections from past seasons, so I am trying to scrape this site.
The site displays a table of historical results and also has a link to download these results as a CSV:
The issue is that when you visit that link in a web browser you get a CSV with hundreds of rows, matching the HTML page. However, when you use webread to systematically download and save the CSV, you only get a select few of those rows. Code is posted below.
Any help would be great!
options = weboptions('Timeout',15);
date = datetime(2019,12,12);
% Zero-pad the day to two digits
useDay = char(string(day(date)));
if size(useDay,2) == 1
    useDay = '0' + string(useDay);
end
% Zero-pad the month to two digits
useMonth = char(string(month(date)));
if size(useMonth,2) == 1
    useMonth = '0' + string(useMonth);
end
% Append the yyyy-mm-dd date to the download link and request it
html = webread('' + string(year(date)) + '-' + string(useMonth) + '-' + string(useDay), options);
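The manual zero-padding in the code above can also be done by the datetime itself. This is a minimal sketch, assuming the download link expects a yyyy-MM-dd date; the variable names `d`, `d2`, `dateStr` and `dateStr2` are only illustrative.

```matlab
% Build a zero-padded yyyy-MM-dd string directly from a datetime.
d = datetime(2019,12,12);
d.Format = 'yyyy-MM-dd';   % Format controls how the datetime converts to text
dateStr = string(d);       % "2019-12-12"

% Single-digit months and days are padded automatically.
d2 = datetime(2020,1,5);
d2.Format = 'yyyy-MM-dd';
dateStr2 = string(d2);     % "2020-01-05"
```

Using a name other than `date` also avoids shadowing MATLAB's built-in `date` function.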

  1 Comment

Rik on 30 Jan 2020
I suspect the title might have triggered the spam filter. A word of advice: remove all content that is not relevant to MATLAB. The point of your question is that webread doesn't download the same CSV as you see in your browser, so that is the only part that belongs in the question title.


Answers (1)

Guillaume on 7 Feb 2020
If I try to download the file from your link using a web browser, I only get a few rows. Considering that the main webpage shows a prominent banner telling you that rosters are only visible to premium users, the problem seems clear: you need to be logged in to download the full file.
Modifying your weboptions to specify username/password should work (assuming the website is designed properly):
options = weboptions('Timeout',15, 'Username', '??', 'Password', '***');
%rest of code as is...
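As a rough sketch of how the authenticated download could be checked end to end, the snippet below builds the request and reports how many rows come back. The `baseUrl` value is a hypothetical placeholder, since the real link is not shown in the thread, and the credentials are the same placeholders as above. Note that the `Username` and `Password` options in weboptions apply to HTTP authentication; a site that logs users in through a web form would probably need a different approach.

```matlab
% Hypothetical placeholder URL; the real download link is not shown in the thread.
baseUrl = "https://example.com/export.csv?date=";
d = datetime(2019,12,12);
d.Format = 'yyyy-MM-dd';
url = baseUrl + string(d);

% 'ContentType','table' asks webread to parse the CSV response into a table.
% Username and Password are placeholders to be replaced with real credentials.
options = weboptions('Timeout',15, 'Username','??', 'Password','***', ...
    'ContentType','table');

doDownload = false;   % set to true after filling in a real URL and credentials
if doDownload
    data = webread(url, options);
    fprintf('Downloaded %d rows\n', height(data));
    % Save a local copy for backtesting
    writetable(data, "projections_" + string(d) + ".csv");
end
```

Comparing the printed row count with the number of rows shown in the browser while logged in is a quick way to confirm whether the login is being applied.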

