Project: Scraping marathon results

Project weight: 5 points

The website results.chicagomarathon.com/well-known/2021/ contains the results of the 2021 Chicago Marathon. The site provides a list of runners, and the name of each runner is linked to a webpage with more detailed information about that runner.

Objectives

Use requests and BeautifulSoup to scrape information about the top 50 male runners. The information you retrieve should include the following data:

  • name

  • age group

  • bib number

  • age

  • city/state

  • split times
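The extraction step for a single runner page might be sketched as follows. The HTML here is a made-up stand-in (label/value table rows, the `runner-info` and `splits` class names, and the `parse_runner` helper are all assumptions for illustration); the real page's markup has to be checked in the browser's inspector and the selectors adapted accordingly.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the tags and class names below are assumptions,
# not the actual structure of the marathon results pages.
SAMPLE_HTML = """
<table class="runner-info">
  <tr><th>Name (CTZ)</th><td>Rupp, Galen (USA)</td></tr>
  <tr><th>Age Group</th><td>35-39</td></tr>
  <tr><th>Bib Number</th><td>9</td></tr>
  <tr><th>City, State</th><td>Portland</td></tr>
</table>
<table class="splits">
  <tr><th>05K</th><td>00:14:43</td></tr>
  <tr><th>10K</th><td>00:29:25</td></tr>
  <tr><th>Finish</th><td>02:06:35</td></tr>
</table>
"""


def parse_runner(html):
    """Return a dict mapping each <th> label to the text of its <td>."""
    soup = BeautifulSoup(html, "html.parser")
    record = {}
    for row in soup.find_all("tr"):
        label = row.find("th")
        value = row.find("td")
        # Skip rows that do not have both a label cell and a value cell.
        if label is not None and value is not None:
            record[label.get_text(strip=True)] = value.get_text(strip=True)
    return record


runner = parse_runner(SAMPLE_HTML)
print(runner)
```

Returning one dict per runner keeps the parsing step separate from the downloading step, so the parser can be tested on a saved page without contacting the server.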

Record this data in a pandas DataFrame and show the DataFrame in the notebook. The format of the resulting DataFrame should be as follows:

[5]:
marathon_df
[5]:
   Name (CTZ)                  Age Group  Bib Number  City, State       Short  05K       10K       15K       20K       HALF      25K       30K       35K       40K       Finish
0  Tura Abdiwak, Seifu (ETH)   20-24      3           Iseo              ST     00:14:43  00:29:15  00:44:21  00:59:13  01:02:29  01:14:42  01:30:06  01:45:01  01:59:44  02:06:12
1  Rupp, Galen (USA)           35-39      9           Portland          GR     00:14:43  00:29:25  00:44:23  00:59:24  01:02:40  01:14:44  01:30:07  01:45:02  01:59:53  02:06:35
2  Kiptanui, Eric (KEN)        30-34      7           Iten              EK     00:14:43  00:29:17  00:44:21  00:59:13  01:02:29  01:14:42  01:30:06  01:45:01  02:00:05  02:06:51
3  Suzuki, Kengo (JPN)         25-29      5           Chiba City Chiba  KS     00:14:44  00:29:16  00:44:22  00:59:15  01:02:30  01:14:44  01:30:07  01:45:30  02:01:48  02:08:50
4  Tamru Aredo, Shifera (ETH)  20-24      6           Iseo              ST     00:14:36  00:29:15  00:44:06  00:59:10  01:02:29  01:14:43  01:30:08  01:45:59  02:02:16  02:09:39
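One possible way to assemble such a DataFrame is to collect one dict per runner and pass the list to the `pd.DataFrame` constructor. The sketch below uses two rows and a subset of the columns from the sample output; the `records` and `COLUMNS` names are illustrative assumptions, not part of the assignment.

```python
import pandas as pd

# Column order to enforce in the final table (a subset, for brevity).
COLUMNS = ["Name (CTZ)", "Age Group", "Bib Number", "05K", "Finish"]

# One dict per runner, e.g. as produced by a page-parsing function.
records = [
    {"Name (CTZ)": "Tura Abdiwak, Seifu (ETH)", "Age Group": "20-24",
     "Bib Number": 3, "05K": "00:14:43", "Finish": "02:06:12"},
    {"Name (CTZ)": "Rupp, Galen (USA)", "Age Group": "35-39",
     "Bib Number": 9, "05K": "00:14:43", "Finish": "02:06:35"},
]

# Passing `columns` fixes the column order regardless of dict key order.
marathon_df = pd.DataFrame(records, columns=COLUMNS)
print(marathon_df)
```

In a notebook, ending a cell with the bare name `marathon_df` displays the table, as in the sample output.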

Notes

  • Do not scrape more than 50 runners, to limit the load on the web server hosting the marathon website.

  • Add a delay of 2 seconds between consecutive requests sent to the server.

  • This is a programming assignment. There is no required narrative, aside from code documentation and possible notes explaining how your code works. Reports will be graded on:

    • code correctness and completeness (80%)

    • report organization and code documentation (20%)
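The first two notes above (the 50-runner cap and the 2-second delay) could be enforced in one small loop. This is only a sketch under assumptions: `fetch_politely`, its parameters, and the fake URLs are hypothetical names, and the download function is passed in as an argument so the loop can be tried offline without contacting the server.

```python
import time

MAX_RUNNERS = 50     # cap stated in the notes
DELAY_SECONDS = 2    # pause stated in the notes


def fetch_politely(urls, fetch, delay=DELAY_SECONDS, limit=MAX_RUNNERS,
                   sleep=time.sleep):
    """Fetch at most `limit` URLs, pausing `delay` seconds between requests."""
    pages = []
    for i, url in enumerate(urls[:limit]):
        if i > 0:
            sleep(delay)  # wait before every request except the first
        pages.append(fetch(url))
    return pages


# Offline demonstration: stand-ins replace the real download and the real sleep.
fake_urls = [f"runner-{n}" for n in range(60)]
pauses = []
pages = fetch_politely(fake_urls, fetch=str.upper, sleep=pauses.append)
print(len(pages), len(pauses))
```

With requests, the `fetch` argument could be something like `lambda url: requests.get(url, timeout=10).text`, leaving `sleep` at its default so the real 2-second pause applies.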