Using Python to Mimic NFL Sportsbook Odds Determination

To be clear, NFL oddsmakers use software that combines sophisticated algorithms, large-scale data analytics, and real-time updates to generate highly accurate betting lines. A single person is not going to replicate that power on a laptop. But this project is worthwhile because it lifts the hood and shows, in simplified form, how these systems run.

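Below we build a simplified model that mimics how sportsbooks determine odds. Real sportsbooks weigh team statistics, player injuries, and historical matchups; this version uses team statistics only and leaves the rest as possible enhancements.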

This seed project can help you get started on your own odds calculator. It provides a foundation for fetching NFL data, calculating team metrics, and computing betting odds, so you have a solid base to build on.

Setting Up Your Environment

First, ensure you have Python installed. You can download it from the official Python website. You will also need several libraries to facilitate data manipulation and web scraping. Open your terminal and run:

pip install pandas numpy requests beautifulsoup4

Project Structure

Create a project folder named nfl_odds_calculator. Inside this folder, create four files:

  • fetch_data.py
  • odds_calculator.py
  • model.py
  • main.py

Step 1: Fetching NFL Data

In fetch_data.py, we fetch team statistics from an online source. You could scrape a website, but the better way is to use an API (for example MySportsFeeds; see the MySportsFeeds/mysportsfeeds-api repository on github.com). The sample script below retrieves team statistics from the MySportsFeeds API. The JSON field names it reads are for demonstration purposes and should be checked against the actual response.

# fetch_data.py

import requests
import pandas as pd

def fetch_nfl_data():
    # Replace this with your API key from MySportsFeeds
    api_key = "YOUR_API_KEY"
    
    # MySportsFeeds API endpoint for NFL team statistics
    url = "https://api.mysportsfeeds.com/v2.1/pull/nfl/current/team_stats_totals.json"

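    # MySportsFeeds v2.x uses HTTP Basic auth: the API key is the username and
    # the literal string "MYSPORTSFEEDS" is the password (verify in your account docs).
    # requests base64-encodes the pair and sets the Authorization header for us.
    response = requests.get(url, auth=(api_key, "MYSPORTSFEEDS"), timeout=30)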
    
    if response.status_code == 200:
        # Parse the JSON response
        data = response.json()
        
        # List to store team data
        teams_data = []

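        # Loop through the teams and extract relevant statistics.
        # NOTE: the key names below are illustrative; inspect the real payload
        # (for example with data.keys()) and adjust them to match.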
        for team_info in data['teams']:
            team_stats = team_info['teamStats']
            team_data = {
                'team': team_info['team']['name'],
                'wins': team_stats['standings']['wins'],
                'losses': team_stats['standings']['losses'],
                'points_scored': team_stats['offense']['pointsFor'],
                'points_allowed': team_stats['defense']['pointsAgainst'],
            }
            teams_data.append(team_data)
        
        # Create a DataFrame from the extracted data
        df = pd.DataFrame(teams_data)
        
        # Save the DataFrame to a CSV file
        df.to_csv('nfl_team_stats.csv', index=False)
        print("NFL data fetched and saved to nfl_team_stats.csv")
    else:
        print(f"Failed to fetch data. Status code: {response.status_code}")
        print(response.text)

if __name__ == "__main__":
    fetch_nfl_data()

Explanation:

  1. API Endpoint: We use the MySportsFeeds endpoint for NFL team stats. This URL fetches team statistics for the current season.
  2. Authentication: The MySportsFeeds API requires HTTP Basic authentication. Basic auth sends the base64-encoded string username:password in the Authorization header; for MySportsFeeds v2.x the username is your API key and the password is the literal string MYSPORTSFEEDS. Placing the raw key after "Basic" will not authenticate, so let requests build the header by passing auth=(api_key, "MYSPORTSFEEDS").
     • Replace YOUR_API_KEY with your actual API key.
  3. Response Handling: The API returns JSON. We extract team statistics from the response and build a list of dictionaries with the required fields (team, wins, losses, points_scored, and points_allowed).
  4. Saving Data: The data is saved to a CSV file (nfl_team_stats.csv) for easy access and analysis.
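
To make the authentication step concrete, here is a minimal sketch of how an HTTP Basic Authorization header is assembled: the string username:password is base64-encoded and prefixed with "Basic ". The helper name basic_auth_header is made up for illustration, and the pairing of API key with the literal password MYSPORTSFEEDS is an assumption based on the MySportsFeeds v2.x convention, so confirm it against your account's documentation.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict:
    """Build the header dict for HTTP Basic authentication (RFC 7617)."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Assumed pairing for MySportsFeeds v2.x: API key as username, fixed password.
headers = basic_auth_header("YOUR_API_KEY", "MYSPORTSFEEDS")
print(headers["Authorization"])
```

With requests, passing auth=(username, password) to requests.get produces the same header, so you rarely need to build it by hand.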

Step 2: Modeling Team Performance

Next, we need to analyze the fetched data. In model.py, we will define functions to calculate advanced statistics that can influence odds. For example, we will consider metrics such as scoring differential, win percentage, and average points per game.
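
As a quick worked example of these metrics, the sketch below computes them for a single team in plain Python. The team record is invented for illustration, and team_metrics is a hypothetical helper, not one of the project files.

```python
def team_metrics(wins, losses, points_scored, points_allowed):
    """Return win percentage, scoring differential, and per-game averages."""
    games_played = wins + losses
    differential = points_scored - points_allowed
    if games_played == 0:
        # No games yet: the ratios are undefined, so report None
        # instead of dividing by zero.
        return {
            "win_percentage": None,
            "scoring_differential": differential,
            "average_points_scored": None,
            "average_points_allowed": None,
        }
    return {
        "win_percentage": wins / games_played,
        "scoring_differential": differential,
        "average_points_scored": points_scored / games_played,
        "average_points_allowed": points_allowed / games_played,
    }


# Made-up record: 10 wins, 6 losses, 400 points scored, 320 allowed
metrics = team_metrics(10, 6, 400, 320)
print(metrics)
```

For this record the win percentage is 0.625, the scoring differential is +80, and the averages are 25.0 points scored and 20.0 points allowed per game.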

# model.py

import pandas as pd

def calculate_team_metrics(df):
    # Mark teams with 0 games played as NaN so the divisions below yield NaN rather than inf
    games_played = df['wins'] + df['losses']
    df['games_played'] = games_played.where(games_played != 0)
    
    # Calculate win percentage and scoring differential
    df['win_percentage'] = df['wins'] / df['games_played']
    df['scoring_differential'] = df['points_scored'] - df['points_allowed']
    df['average_points_scored'] = df['points_scored'] / df['games_played']
    df['average_points_allowed'] = df['points_allowed'] / df['games_played']

    return df

def compute_odds(df):
    # A win percentage of 0 would make 1 / 0 evaluate to inf, so mask it to NaN first
    win_percentage = df['win_percentage']
    df['odds'] = 1 / win_percentage.where(win_percentage != 0)

    # Teams with no games played or no wins end up with NaN odds; set those to 0
    df['odds'] = df['odds'].fillna(0)

    # Normalize the odds for betting lines
    total_odds = df['odds'].sum()
    if total_odds > 0:
        df['normalized_odds'] = df['odds'] / total_odds
    else:
        df['normalized_odds'] = 0

    return df[['team', 'win_percentage', 'scoring_differential', 'normalized_odds']]
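
To see what the normalization in compute_odds produces, here is a small pure-Python sketch of the same arithmetic on three invented win percentages. The team names and numbers are made up, and normalized_odds is a hypothetical stand-in for the pandas version above.

```python
def normalized_odds(win_percentages):
    """Mirror compute_odds: reciprocal odds, then each team's share of the total."""
    odds = {
        team: (1 / pct if pct else 0.0)  # 0 or missing win percentage -> odds of 0
        for team, pct in win_percentages.items()
    }
    total = sum(odds.values())
    if total == 0:
        return {team: 0.0 for team in odds}
    return {team: value / total for team, value in odds.items()}


# Invented win percentages for three teams
shares = normalized_odds({"Team A": 0.8, "Team B": 0.5, "Team C": 0.25})
for team, share in shares.items():
    print(f"{team}: {share:.3f}")
```

The reciprocal odds here are 1.25, 2.0, and 4.0, so the shares come out to roughly 0.172, 0.276, and 0.552. Note that the weakest team receives the largest share, because reciprocal odds grow as win percentage shrinks. The normalized figure is therefore a relative payout weight, not a win probability.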

Step 3: Calculating Odds

Now, we will create the odds_calculator.py file, which will read the data, compute team metrics, and calculate the odds.

# odds_calculator.py

import pandas as pd
from model import calculate_team_metrics, compute_odds

def calculate_odds():
    # Load the data and check for missing columns
    try:
        df = pd.read_csv('nfl_team_stats.csv')
        required_columns = ['team', 'wins', 'losses', 'points_scored', 'points_allowed']
        
        # Check if all required columns are present
        for column in required_columns:
            if column not in df.columns:
                raise ValueError(f"Missing required column: {column}")
        
        # Ensure numeric columns are of the correct type
        df[['wins', 'losses', 'points_scored', 'points_allowed']] = df[['wins', 'losses', 'points_scored', 'points_allowed']].apply(pd.to_numeric, errors='coerce')

        # Calculate team metrics
        df = calculate_team_metrics(df)

        # Calculate odds based on metrics
        odds_df = compute_odds(df)

        # Print the calculated odds
        print("Calculated Odds:")
        print(odds_df)

    except FileNotFoundError:
        print("The file 'nfl_team_stats.csv' was not found.")
    except ValueError as ve:
        print(f"Data Error: {ve}")
    except Exception as e:
        print(f"An unexpected error occurred: {e}")

if __name__ == "__main__":
    calculate_odds()
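
The validation steps in calculate_odds can be tried in isolation. The sketch below reproduces the same two checks (required columns present, numeric fields parseable) with only the standard library, reading from an in-memory CSV. The sample rows and the helper name load_team_rows are invented for illustration; unparseable numbers become None, which plays the role NaN plays under errors='coerce'.

```python
import csv
import io

REQUIRED_COLUMNS = ["team", "wins", "losses", "points_scored", "points_allowed"]
NUMERIC_COLUMNS = REQUIRED_COLUMNS[1:]


def load_team_rows(csv_text):
    """Parse CSV text, validate the header, and coerce numeric fields."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"Missing required column: {missing[0]}")

    rows = []
    for row in reader:
        for column in NUMERIC_COLUMNS:
            try:
                row[column] = float(row[column])
            except (TypeError, ValueError):
                row[column] = None  # unparseable value, like NaN under errors='coerce'
        rows.append(row)
    return rows


# Invented sample data; the second row has a bad value in the wins column
sample = (
    "team,wins,losses,points_scored,points_allowed\n"
    "Team A,10,6,400,320\n"
    "Team B,n/a,8,310,355\n"
)
for row in load_team_rows(sample):
    print(row)
```

Rows with None in a numeric field can then be dropped or reported before any metrics are calculated.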

Step 4: Orchestrating the Application

Finally, we need a main script in main.py that ties everything together. This script will fetch the data, calculate metrics, and then determine the odds.

# main.py

from fetch_data import fetch_nfl_data
from odds_calculator import calculate_odds
import os

def main():
    fetch_nfl_data()  # Step 1: Fetch data
    
    # Check if the CSV was created before proceeding
    if os.path.exists('nfl_team_stats.csv'):
        calculate_odds()  # Step 2: Calculate odds
    else:
        print("Error: nfl_team_stats.csv not found. Data fetching may have failed.")

if __name__ == "__main__":
    main()
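
One caveat with the existence check in main: a CSV left over from an earlier run also passes it, so stale data could be used after a failed fetch. Below is a minimal sketch of one way around this, assuming it is acceptable to delete the old file before fetching. The function run_pipeline and its fetch and calculate parameters are hypothetical stand-ins for fetch_nfl_data and calculate_odds.

```python
import os
import tempfile


def run_pipeline(csv_path, fetch, calculate):
    """Delete any old CSV, fetch, and calculate only if a new file appeared."""
    if os.path.exists(csv_path):
        os.remove(csv_path)  # a leftover file must not mask a failed fetch

    fetch()
    if not os.path.exists(csv_path):
        print(f"Error: {csv_path} not found. Data fetching may have failed.")
        return False

    calculate()
    return True


def failed_fetch():
    """Stand-in for a fetch that writes nothing, like a failed request."""


# Demonstration in a temporary directory with a stale file already present
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "nfl_team_stats.csv")
    with open(path, "w") as f:
        f.write("stale data\n")  # simulate a file from a previous run

    ok = run_pipeline(path, failed_fetch, lambda: print("calculating"))
    print(ok)
```

In this demonstration the stale file is removed, the stand-in fetch writes nothing, and the calculation step is skipped.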

Running the Project

To run your project, open your terminal and navigate to your project folder:

cd path/to/nfl_odds_calculator

Then run the main script:

python main.py

The program will fetch NFL team statistics and calculate odds from each team's win percentage; scoring differential is shown alongside for context. You will see the calculated odds printed in your terminal.

Enhancements to Consider

This basic setup serves as a foundation for more complex modeling. Here are some ways to enhance your project:

  1. Advanced Statistics: Incorporate additional metrics, such as yards gained and allowed, turnovers, and penalties, for a more robust model.
  2. Injury Adjustments: Include player injury data to dynamically adjust odds based on which players will be active in upcoming games.
  3. Betting Trends: Track how public betting influences odds. You can scrape betting data to see how shifts occur before game day.
  4. User Input: Allow users to input specific matchups to calculate odds for particular games rather than all teams at once.
  5. Visualizations: Use libraries like Matplotlib or Seaborn to visualize team performance over time, allowing users to easily grasp trends.
  6. Database Integration: Store historical data in a local database like SQLite, making it easier to run complex queries and maintain data over time.
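
As a starting point for the database idea in the list above, the sketch below stores team rows in SQLite using Python's built-in sqlite3 module and queries them back. The table name team_stats, its columns, and the sample rows are all assumptions made for illustration; an in-memory database is used so the example leaves no file behind.

```python
import sqlite3

# Invented sample rows: (season, team, wins, losses, points_scored, points_allowed)
rows = [
    (2023, "Team A", 10, 6, 400, 320),
    (2023, "Team B", 7, 9, 310, 355),
    (2022, "Team A", 8, 8, 350, 350),
]

conn = sqlite3.connect(":memory:")  # use a file path instead to persist data
conn.execute(
    """
    CREATE TABLE team_stats (
        season INTEGER NOT NULL,
        team TEXT NOT NULL,
        wins INTEGER NOT NULL,
        losses INTEGER NOT NULL,
        points_scored INTEGER NOT NULL,
        points_allowed INTEGER NOT NULL,
        PRIMARY KEY (season, team)
    )
    """
)
conn.executemany("INSERT INTO team_stats VALUES (?, ?, ?, ?, ?, ?)", rows)
conn.commit()

# Total scoring differential per team across all stored seasons
query = """
    SELECT team, SUM(points_scored - points_allowed) AS differential
    FROM team_stats
    GROUP BY team
    ORDER BY differential DESC
"""
results = conn.execute(query).fetchall()
print(results)
conn.close()
```

The composite primary key on season and team keeps one row per team per season, which makes it straightforward to append each new season's data without duplicating earlier rows.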

For free NFL statistics, you can utilize several APIs that offer access to a wide range of data. One popular option is MySportsFeeds, which provides a RESTful API allowing access to NFL stats, including schedules, scores, play-by-play data, and more. It is free for developers, students, and hobbyists for non-commercial use, making it a great resource for personal projects (MySportsFeeds/mysportsfeeds-api on github.com).

Another excellent resource is nflverse, which hosts a collection of data packages for NFL analytics. It provides access to various data formats, including CSV, which can be quite handy for Python projects (nflverse on GitHub). The data includes player statistics, team performance, and other relevant metrics.

You can also explore options like nfl_data_py, a Python library that simplifies access to NFL data sourced from repositories like nflfastR and nfldata. This package includes play-by-play data, seasonal stats, and more (GitHub).

These resources should provide you with the necessary data to effectively mimic how sportsbooks calculate NFL odds. For more detailed information, you can check out MySportsFeeds and the nflverse GitHub page.

A really good Medium article on this subject: "How Bookmakers Create their Odds, from a Former Odds Compiler" by Trademate Sports.


Thank you for following along with this tutorial. We hope you found it helpful and informative. If you have any questions, or if you would like to suggest new Python code examples or topics for future tutorials/articles, please feel free to join and comment. Your feedback and suggestions are always welcome!

You can find the same tutorial on Medium.com.
