Scraping Soccerdonna

The transfer window is coming to a close in women’s football, and with that I wanted to express the importance of data. This time I don’t want to talk as much about performance data, because I don’t want to measure how well or badly a player performs. I want to gather data that’s important for building a portfolio for recruitment. In short, I want biographical data.

In men’s football we use Transfermarkt and we rely heavily on it. Perhaps too heavily? We should be more critical of that. For a few years now we have also had Soccerdonna on the women’s side, and only recently has its database become complete enough to use.

I have created a Soccerdonna scraping package for Python (R will follow) so you can easily get the right information from that website for media, scouting or other purposes.

Which information do I need and want?

I want an overview of player information covering all teams in the league I choose. This is useful for scouting and recruitment, and it also helps when writing journalistic articles.

These are the pieces of information I want to pull.

  • Name
  • Club
  • Date of birth
  • Nationality
  • Position
  • Height
  • Shirt number
  • Profile URL
  • Additional attributes from the Soccerdonna player page
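The fields above map naturally onto a small record type. The sketch below uses a hypothetical `PlayerRecord` dataclass; it is not the schema the package actually exports, and the field names, the centimetre unit for height and the `age_on` helper are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class PlayerRecord:
    """Hypothetical record for one player; not the package's real schema."""
    name: str
    club: str
    date_of_birth: date
    nationality: str
    position: str
    height_cm: Optional[int]      # assumed unit; may be missing on a profile
    shirt_number: Optional[int]   # may be missing on a profile
    profile_url: str

    def age_on(self, day: date) -> int:
        """Return the player's age in whole years on the given day."""
        had_birthday = (day.month, day.day) >= (
            self.date_of_birth.month, self.date_of_birth.day)
        return day.year - self.date_of_birth.year - (0 if had_birthday else 1)


# Made-up example values, purely for illustration.
player = PlayerRecord(
    name="Example Player",
    club="Example FC",
    date_of_birth=date(2000, 9, 15),
    nationality="England",
    position="Midfield",
    height_cm=168,
    shirt_number=8,
    profile_url="https://example.invalid/player",
)
print(player.age_on(date(2025, 9, 1)))  # 24, birthday not yet reached
```

Keeping the date of birth as a real date rather than a string makes age filters for recruitment shortlists straightforward.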

I have created a Python package that helps with pulling the data from the website.

The soccerdonna_scraper package provides tools for extracting player and club data from Soccerdonna.de, one of the largest databases for women’s football.

It can scrape all clubs in a given league, collect detailed player profiles, and export the data into an Excel file for analysis.

Installation

You can install the package directly from GitHub:

pip install git+https://github.com/marclamberts/soccerdonna_scraper.git

The following dependencies are installed automatically:

  • requests (HTTP requests)
  • beautifulsoup4 (HTML parsing)
  • pandas (data handling)
  • openpyxl (Excel export)

Python 3.8 or newer is required.

Usage

You can use the scraper either from Python or via the command line interface (CLI).

from soccerdonna_scraper import run_scraper

# Scrape the Women's Super League (ENG1) and save results
run_scraper("womens-super-league", "ENG1", output_file="WSL_ENG1_players.xlsx")

Parameters:

  • league_code: The league slug used in Soccerdonna URLs (e.g., womens-super-league).
  • comp_code: The league competition code (e.g., ENG1).
  • output_file (optional): Name of the Excel file to save. Defaults to {league_code}_{comp_code}_players.xlsx.
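The default file name described above can be reproduced with a one-line helper. This is a sketch of the naming rule only; `default_output_file` is a hypothetical name and not a function exported by the package.

```python
def default_output_file(league_code: str, comp_code: str) -> str:
    """Build the default Excel file name from the two codes (hypothetical helper)."""
    return f"{league_code}_{comp_code}_players.xlsx"


print(default_output_file("womens-super-league", "ENG1"))
# womens-super-league_ENG1_players.xlsx
```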

Using from CLI

After installation, a command-line script wsl-scraper becomes available:

wsl-scraper womens-super-league ENG1 -o WSL_ENG1.xlsx

Arguments:

  • league_code: The league slug (e.g., kvindeliga or womens-super-league).
  • comp_code: The competition code (e.g., 3FL or ENG1).
  • -o, --output: Optional Excel file name.
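For readers curious how such an interface is typically wired up, here is a minimal argparse sketch that accepts the same three arguments. It is an assumption about the shape of the CLI, not the package's actual source, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented arguments."""
    parser = argparse.ArgumentParser(prog="wsl-scraper")
    parser.add_argument("league_code", help="league slug, e.g. womens-super-league")
    parser.add_argument("comp_code", help="competition code, e.g. ENG1")
    parser.add_argument("-o", "--output", default=None,
                        help="optional Excel file name")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(
    ["womens-super-league", "ENG1", "-o", "WSL_ENG1.xlsx"])
print(args.league_code, args.comp_code, args.output)
# womens-super-league ENG1 WSL_ENG1.xlsx
```

Leaving `--output` as `None` when omitted lets the calling code fall back to the default file name.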

Examples

Scraping the Kvindeliga 2024-25 season:

from soccerdonna_scraper import run_scraper
run_scraper("kvindeliga", "3FL", "Kvindeliga_2024_2025.xlsx")

Scraping the Women’s Super League:

wsl-scraper womens-super-league ENG1 -o WSL_ENG1.xlsx
