National Hockey League Statistics

We can look at NHL statistics by team, using game logs from hockey-reference.com, or by player, using the scoring tables from ESPN NHL Statistics.

Statistics by Team

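# team abbreviation used in the hockey-reference.com URL
team = 'EDM'
# the year in which the season ends, so '2019' selects the 2018-19 season
year = '2019'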

# download the data
import pandas as pd
team_stats_url = 'https://www.hockey-reference.com/teams/'+team+'/'+year+'_games.html'
team_stats = pd.read_html(team_stats_url)[0]
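# clean up the data
# remove the header rows that are repeated inside the table
team_stats = team_stats[team_stats['Date']!='Date']
# index the games by game number and drop the columns we do not need
team_stats = team_stats.set_index('GP').drop(columns=['W','L','OL','Streak','Notes'])
team_stats.columns = ['Date', 'Away', 'Opponent', 'Goals For', 'Goals Against', 'Win', 'Overtime', 'Attendance', 'Duration']
# encode the flags as numbers: '@' (away), 'W' (win), 'OT' and 'SO' (overtime or shootout) become 1, while 'L' and blanks become 0
team_stats = team_stats.fillna(0).replace('@', 1).replace('OT', 1).replace('W',1).replace('SO',1).replace('L',0)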
# convert text string columns to number columns
team_stats['Goals For'] = pd.to_numeric(team_stats['Goals For'])
team_stats['Goals Against'] = pd.to_numeric(team_stats['Goals Against'])
team_stats['Attendance'] = pd.to_numeric(team_stats['Attendance'])
# convert duration in h:mm to duration in minutes
duration_values = team_stats['Duration'].str.split(':', expand=True).astype(int)
team_stats['Duration'] = duration_values[0]*60 + duration_values[1]
# display the data
team_stats
GP   Date        Away  Opponent              Goals For  Goals Against  Win  Overtime  Attendance  Duration
1    2018-10-06  1     New Jersey Devils     2          5              0    0         12044       151
2    2018-10-11  1     Boston Bruins         1          4              0    0         17565       148
3    2018-10-13  1     New York Rangers      2          1              1    0         17085       144
4    2018-10-16  1     Winnipeg Jets         5          4              1    1         15321       149
5    2018-10-18  0     Boston Bruins         3          2              1    1         18347       146
...  ...         ...   ...                   ...        ...            ...  ...       ...         ...
78   2019-03-30  0     Anaheim Ducks         1          5              0    0         18347       141
79   2019-04-01  1     Vegas Golden Knights  1          3              0    0         18367       144
80   2019-04-02  1     Colorado Avalanche    2          6              0    0         17021       142
81   2019-04-04  0     San Jose Sharks       2          3              0    0         18347       147
82   2019-04-06  1     Calgary Flames        3          1              1    0         19289       145

82 rows × 9 columns
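
Because the cleaned table stores Away and Win as 0 or 1, simple summaries become short. As a minimal sketch, the code below rebuilds only the ten games displayed above by hand, so it runs without downloading anything, and compares home and away games. The names sample and summary are illustrative and not part of the original notebook; with the full data you would use team_stats instead, and ten games are far too few to draw conclusions from.

```python
import pandas as pd

# The ten games displayed above, typed in by hand so this runs offline.
# `sample` is an illustrative stand-in for the full `team_stats` table.
sample = pd.DataFrame({
    'Away':          [1, 1, 1, 1, 0, 0, 1, 1, 0, 1],
    'Goals For':     [2, 1, 2, 5, 3, 1, 1, 2, 2, 3],
    'Goals Against': [5, 4, 1, 4, 2, 5, 3, 6, 3, 1],
    'Win':           [0, 0, 1, 1, 1, 0, 0, 0, 0, 1],
}, index=pd.Index([1, 2, 3, 4, 5, 78, 79, 80, 81, 82], name='GP'))

# goal difference for each game (positive means the team outscored its opponent)
sample['Goal Difference'] = sample['Goals For'] - sample['Goals Against']

# Win is 0 or 1, so its mean is the fraction of games won
summary = sample.groupby('Away').agg(
    games=('Win', 'size'),
    win_rate=('Win', 'mean'),
    mean_goal_difference=('Goal Difference', 'mean'),
)
print(summary)
```

In the summary, the row labelled 0 holds the home games and the row labelled 1 holds the away games.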

Statistics by Player

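This data set contains the following columns for each player listed. The source table repeats the G and A headers for its power-play and short-handed columns, so pandas renames the duplicates with .1 and .2 suffixes: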

  • GP: Games Played

  • G: Goals

  • A: Assists

  • PTS: Points

  • +/-: Plus/Minus Rating

  • PIM: Penalty Minutes

  • PTS/G: Points Per Game

  • SOG: Shots on Goal

  • PCT: Shooting Percentage

  • GWG: Game-Winning Goals

  • G.1: Power-Play Goals

  • A.1: Power-Play Assists

  • G.2: Short-Handed Goals

  • A.2: Short-Handed Assists
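
Several of these columns look like they are derived from the others. As a hedged sketch, the check below assumes that PTS = G + A, PTS/G = PTS / GP, and PCT = 100 × G / SOG, and compares those assumptions with five rows of the player table typed in by hand. The formulas are assumptions rather than something the data source states here, and the name top5 is illustrative.

```python
import pandas as pd

# five rows of the player table, typed in by hand (illustrative sample)
top5 = pd.DataFrame({
    'PLAYER': ['Nikita Kucherov', 'Nathan MacKinnon', 'Brayden Point',
               'Miro Heiskanen', 'Mikko Rantanen'],
    'GP':    [20, 15, 18, 22, 15],
    'G':     [6, 9, 9, 5, 7],
    'A':     [20, 16, 16, 18, 14],
    'PTS':   [26, 25, 25, 23, 21],
    'PTS/G': [1.30, 1.67, 1.39, 1.05, 1.40],
    'SOG':   [72, 65, 54, 47, 55],
    'PCT':   [8.3, 13.8, 16.7, 10.6, 12.7],
})

# assumed formulas, not confirmed by the data source
points_check = (top5['G'] + top5['A']) == top5['PTS']
# allow for rounding to two decimals (PTS/G) and one decimal (PCT)
per_game_check = (top5['PTS'] / top5['GP'] - top5['PTS/G']).abs() <= 0.005
pct_check = (100 * top5['G'] / top5['SOG'] - top5['PCT']).abs() <= 0.05

print(points_check.all(), per_game_check.all(), pct_check.all())
```

If all three checks pass, the displayed values are consistent with the assumed formulas up to rounding, at least for these five rows.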

This will take a while to run, since it needs to get data from multiple pages.

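# download the data
import pandas as pd
points_url = 'http://www.espn.com/nhl/statistics/player/_/stat/points'
pages = []
# pages are requested in steps of 40 rows, starting at 1, 41, 81, ...
for i in range(20):
    try:
        p = pd.read_html(points_url+'/count/'+str(1+40*i), header=1)[0]
        # drop repeated header rows and empty rows, then fill blank cells from the row above
        p = p[p['PLAYER']!='PLAYER'].dropna(subset=['PLAYER']).ffill()
        pages.append(p)
    # skip the page if the site has run out of data
    except Exception:
        pass
# combine the pages into one table (DataFrame.append was removed in pandas 2.0)
points = pd.concat(pages, ignore_index=True)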
# convert text string columns to number columns
for column in points.columns:
    if column != 'PLAYER' and column != 'TEAM':
        points[column] = pd.to_numeric(points[column])
# split the player name and position into two columns, trimming stray spaces
name_and_position = points['PLAYER'].str.split(',', expand=True)
points['PLAYER'] = name_and_position[0].str.strip()
points['POSITION'] = name_and_position[1].str.strip()
# display the data
points
     RK   PLAYER            TEAM  GP   G    A    PTS  +/-  PIM  PTS/G  SOG  PCT   GWG  G.1  A.1  G.2  A.2  POSITION
0    1    Nikita Kucherov   TB    20   6    20   26   13   20   1.30   72   8.3   1    0    7    0    0    RW
1    2    Nathan MacKinnon  COL   15   9    16   25   13   12   1.67   65   13.8  0    3    6    0    0    C
2    2    Brayden Point     TB    18   9    16   25   10   8    1.39   54   16.7  2    1    3    0    0    C
3    4    Miro Heiskanen    DAL   22   5    18   23   5    2    1.05   47   10.6  0    2    6    0    0    D
4    5    Mikko Rantanen    COL   15   7    14   21   11   6    1.40   55   12.7  0    2    6    0    0    RW
...  ...  ...               ...   ...  ...  ...  ...  ...  ...  ...    ...  ...   ...  ...  ...  ...  ...  ...
370  361  Marcus Foligno    MIN   4    0    1    1    -2   5    0.25   4    0.0   0    0    0    0    0    LW
371  361  Carl Hagelin      WSH   8    0    1    1    -4   2    0.13   4    0.0   0    0    0    0    0    LW
372  361  Jake Evans        MTL   6    0    1    1    -1   0    0.17   3    0.0   0    0    0    0    0    C
373  361  Ilya Kovalchuk    WSH   8    0    1    1    0    2    0.13   5    0.0   0    0    0    0    0    LW
374  361  Morgan Geekie     CAR   8    0    1    1    -1   0    0.13   4    0.0   0    0    0    0    0    C

375 rows × 18 columns
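
With the POSITION column split out, the table can be ranked or grouped. The sketch below is one possible way to do that, using only the ten players displayed above, typed in by hand so it runs offline. The names sample, ranked and by_position are illustrative; with the full data you would apply the same calls to points.

```python
import pandas as pd

# the ten players displayed above, typed in by hand (illustrative sample)
sample = pd.DataFrame({
    'PLAYER': ['Nikita Kucherov', 'Nathan MacKinnon', 'Brayden Point',
               'Miro Heiskanen', 'Mikko Rantanen', 'Marcus Foligno',
               'Carl Hagelin', 'Jake Evans', 'Ilya Kovalchuk', 'Morgan Geekie'],
    'TEAM': ['TB', 'COL', 'TB', 'DAL', 'COL', 'MIN', 'WSH', 'MTL', 'WSH', 'CAR'],
    'POSITION': ['RW', 'C', 'C', 'D', 'RW', 'LW', 'LW', 'C', 'LW', 'C'],
    'GP':  [20, 15, 18, 22, 15, 4, 8, 6, 8, 8],
    'PTS': [26, 25, 25, 23, 21, 1, 1, 1, 1, 1],
})

# recompute points per game from the totals and rank the players by it
sample['PTS/G'] = sample['PTS'] / sample['GP']
ranked = sample.sort_values('PTS/G', ascending=False)
print(ranked[['PLAYER', 'POSITION', 'PTS/G']].head(3))

# total points by position, largest first
by_position = sample.groupby('POSITION')['PTS'].sum().sort_values(ascending=False)
print(by_position)
```

Sorting by points per game instead of total points changes the order, because players who appeared in fewer games are no longer penalized for it.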
