Project: Scraping marathon results
Project weight: 5 points
The website results.chicagomarathon.com/well-known/2021/ contains the results of the 2021 Chicago Marathon. The site provides a list of runners, and the name of each runner links to a webpage with more detailed information about that runner.
Objectives
Use requests and BeautifulSoup to scrape information about the top 50 male runners. The information you retrieve should include the following data:
name
age group
bib number
age
city/state
split times
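As a rough starting point, the sketch below shows how BeautifulSoup can pull runner names and links out of a results list. The HTML fragment, the `runner-name` class, the `example.com` base URL, and the `runner_links` helper are all assumptions made up for illustration. The actual markup of the marathon site has to be inspected in a browser first, and the selector adjusted to match.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Hypothetical fragment standing in for the results list page.
# The real page's tags and class names will differ.
SAMPLE_HTML = """
<ul>
  <li><h4 class="runner-name"><a href="detail/A1">Rupp, Galen (USA)</a></h4></li>
  <li><h4 class="runner-name"><a href="detail/B2">Suzuki, Kengo (JPN)</a></h4></li>
</ul>
"""


def runner_links(html, base_url, limit=50):
    """Return up to `limit` (name, absolute_url) pairs found in `html`."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    # Assumed selector: each runner name is a link inside <h4 class="runner-name">.
    for anchor in soup.select("h4.runner-name a[href]"):
        name = anchor.get_text(strip=True)
        # Turn the relative href into an absolute URL for the detail page.
        links.append((name, urljoin(base_url, anchor["href"])))
        if len(links) == limit:
            break
    return links


print(runner_links(SAMPLE_HTML, "https://example.com/2021/"))
```

The same pattern (parse, select, extract text) applies to each runner's detail page, where the age group, bib number, city/state, and split times would be read from whatever tags the site actually uses.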
Record this data in a pandas DataFrame and show the DataFrame in the notebook. The format of the resulting DataFrame should be as follows:
marathon_df
| | Name (CTZ) | Age Group | Bib Number | City, State | Short | 05K | 10K | 15K | 20K | HALF | 25K | 30K | 35K | 40K | Finish |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Tura Abdiwak, Seifu (ETH) | 20-24 | 3 | Iseo | ST | 00:14:43 | 00:29:15 | 00:44:21 | 00:59:13 | 01:02:29 | 01:14:42 | 01:30:06 | 01:45:01 | 01:59:44 | 02:06:12 |
1 | Rupp, Galen (USA) | 35-39 | 9 | Portland | GR | 00:14:43 | 00:29:25 | 00:44:23 | 00:59:24 | 01:02:40 | 01:14:44 | 01:30:07 | 01:45:02 | 01:59:53 | 02:06:35 |
2 | Kiptanui, Eric (KEN) | 30-34 | 7 | Iten | EK | 00:14:43 | 00:29:17 | 00:44:21 | 00:59:13 | 01:02:29 | 01:14:42 | 01:30:06 | 01:45:01 | 02:00:05 | 02:06:51 |
3 | Suzuki, Kengo (JPN) | 25-29 | 5 | Chiba City Chiba | KS | 00:14:44 | 00:29:16 | 00:44:22 | 00:59:15 | 01:02:30 | 01:14:44 | 01:30:07 | 01:45:30 | 02:01:48 | 02:08:50 |
4 | Tamru Aredo, Shifera (ETH) | 20-24 | 6 | Iseo | ST | 00:14:36 | 00:29:15 | 00:44:06 | 00:59:10 | 01:02:29 | 01:14:43 | 01:30:08 | 01:45:59 | 02:02:16 | 02:09:39 |
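One way to assemble such a table is to collect one dictionary per runner and pass the list to `pandas.DataFrame`. The sketch below reuses two rows from the sample output above and keeps only a few of the split columns for brevity. The `records` list and its layout are illustrative assumptions, not a required structure.

```python
import pandas as pd

# One dictionary per runner; keys become column names.
# Values are copied from the sample output, with most split columns omitted.
records = [
    {
        "Name (CTZ)": "Rupp, Galen (USA)",
        "Age Group": "35-39",
        "Bib Number": 9,
        "City, State": "Portland",
        "05K": "00:14:43",
        "HALF": "01:02:40",
        "Finish": "02:06:35",
    },
    {
        "Name (CTZ)": "Suzuki, Kengo (JPN)",
        "Age Group": "25-29",
        "Bib Number": 5,
        "City, State": "Chiba City Chiba",
        "05K": "00:14:44",
        "HALF": "01:02:30",
        "Finish": "02:08:50",
    },
]

# Column order follows the key order of the dictionaries.
marathon_df = pd.DataFrame(records)
print(marathon_df)
```

Building the list first and creating the DataFrame once at the end is usually simpler than appending rows to a DataFrame inside the scraping loop.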
Notes
Do not scrape more than 50 runners to limit the usage of the web server hosting the marathon website.
Add a delay of 2 seconds between each request sent to the server.
This is a programming assignment. There is no required narrative aside from code documentation and any notes explaining how your code works. Reports will be graded on:
code correctness and completeness (80%)
report organization and code documentation (20%)
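The two request limits in the notes above (at most 50 runners, a 2-second pause between requests) can be sketched as a small loop. The `fetch_politely` helper, its parameters, and the `example.com` URLs are hypothetical names chosen for this illustration. The `fetch` and `sleep` arguments are passed in so the demonstration runs offline without contacting any server.

```python
import time

MAX_RUNNERS = 50      # cap on the number of runner pages requested
DELAY_SECONDS = 2.0   # pause between consecutive requests


def fetch_politely(urls, fetch, delay=DELAY_SECONDS, limit=MAX_RUNNERS, sleep=time.sleep):
    """Fetch at most `limit` URLs, sleeping `delay` seconds between requests.

    `fetch` is any callable that takes a URL and returns the page content.
    """
    pages = []
    for index, url in enumerate(urls[:limit]):
        if index > 0:
            sleep(delay)  # no pause before the first request in this batch
        pages.append(fetch(url))
    return pages


# Offline demonstration: a stand-in fetcher and a recording "sleep",
# so nothing is sent over the network and the demo finishes instantly.
pauses = []
demo_urls = [f"https://example.com/runner/{i}" for i in range(60)]
demo_pages = fetch_politely(
    demo_urls,
    fetch=lambda url: f"<html>{url}</html>",
    sleep=pauses.append,
)
print(len(demo_pages), len(pauses))  # 50 pages, 49 pauses
```

In the notebook, `fetch` could be something like `lambda url: requests.get(url, timeout=30).text`, with the default `sleep` left in place. If the list page was requested immediately beforehand, one extra `time.sleep(2)` before calling the helper keeps that first gap at 2 seconds as well.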