Data-Driven Volleyball: Creating Player Statistics Using Computer Vision

This blog post presents a master's thesis carried out by inovex in cooperation with the Baden Volleys Karlsruhe. The thesis investigates the automated generation of volleyball player statistics using computer vision.

As sports become more professional, data-driven evaluation is becoming increasingly important. Collected statistics help to identify the best player, pinpoint opponents' weaknesses, and highlight areas for improvement in training. Coaches have to make decisions such as which player to place in which position in a given situation. Even in amateur sports, objective data could complement subjective evaluations and put decisions on a firmer footing. Collecting statistics manually takes significant effort, however, and for less popular or amateur sports the benefits may not justify it. Popular professional sports such as football, soccer, or ice hockey commonly use automated systems instead, which employ various sensors and cameras to obtain precise 3D data. For volleyball, research in this area is limited because the sport is less popular.
We therefore set out to build a system that automatically generates volleyball player statistics from game videos.

The final system consists of four modules:

  • Object Detection to detect players and volleyballs
  • Multiple Object Tracking to identify players and track them over multiple frames
  • Action Detection to determine the moment of player actions
  • Action Classification to classify the detected actions and finally create player statistics

Object Detection

The goal of the object detection module is to identify players and volleyballs on the playing field. A pre-trained, state-of-the-art object detector can already identify many objects, including players, referees, and spectators. For this research, YOLOv5x was used. However, for creating player statistics, we are only interested in players on the close side of the playing field and the volleyball.

YOLOv5x pre-trained

To detect only the players on the close side of the playing field and the volleyball, the model has to be fine-tuned on custom data. The training data comes from the VolleyVision project and from self-annotated game recordings of the Baden Volleys. The results are displayed below.

YOLOv5x fine-tuned on players and volleyballs

Multiple Object Tracking

After players and volleyballs have been detected, the position of each player has to be followed over time for further analysis. This is usually achieved with multiple object tracking. The video below shows ByteTrack applied to the YOLO detections. Annotations are drawn as green boxes and the tracker output as blue boxes. When a track is lost, the player reappears under a new, higher track number. Existing trackers such as SORT or ByteTrack did not perform well on our data because tracks were frequently lost when players occluded each other.

Vanilla ByteTrack

To improve the tracking, we combined an optical character recognition model (EasyOCR) with multiple object tracking to develop a more reliable volleyball player tracking algorithm. The final model is shown in the video below. When the model reads a jersey number, it assigns the number to the corresponding track and uses it to remap lost tracks. The video shows that players in the back row are detected reliably, while the detection of players in the front row is less accurate. Further development is necessary to detect all players reliably. A red box is displayed while optical character recognition is running.
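The remapping idea can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the thesis: the class `JerseyTrackMapper` and its `update` method are hypothetical names, the tracker is assumed to deliver an integer track ID per player box, and the OCR step is assumed to return either a jersey number or `None`.

```python
class JerseyTrackMapper:
    """Hypothetical helper: links unstable tracker IDs to jersey numbers."""

    def __init__(self):
        # tracker ID -> jersey number read by OCR
        self.track_to_jersey = {}

    def update(self, track_id, jersey_number=None):
        """Store an OCR reading (if any) and return the known identity.

        Returns the jersey number for this track, or None if no number
        has been read for it yet.
        """
        if jersey_number is not None:
            self.track_to_jersey[track_id] = jersey_number
        return self.track_to_jersey.get(track_id)


mapper = JerseyTrackMapper()
mapper.update(3, jersey_number=17)   # OCR reads "17" on track 3
# Track 3 is lost during an occlusion; the player reappears as track 9.
mapper.update(9)                     # identity unknown until OCR succeeds
mapper.update(9, jersey_number=17)   # OCR reads "17" again
# Both tracker IDs now resolve to the same player.
same_player = mapper.update(3) == mapper.update(9)
```

Keying statistics by jersey number instead of track ID means a lost track only causes a gap until the next successful OCR reading, rather than creating a new player.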

Custom MOT algorithm

Action Detection

Before player actions can be evaluated, the moment at which each action takes place has to be determined. This is the task of action detection.

Various player actions are of interest in volleyball statistics. This research focused on the reception, which is the first action in a rally after the serve and determines the subsequent attack options. Receptions are carried out by players in the back row, who can be identified and tracked by the custom MOT algorithm we developed. A reception typically begins with the opponent serving. In the example below, the libero (player 17) receives the ball and passes it to the setter (player 10), who then sets the ball to the attacker.

Volleyball rally including serve, reception, and set

The vertical and horizontal ball positions in this clip can be plotted over time. A green dot marks a frame in which the ball lies inside a player box with a recognized jersey number. An orange dot means the ball lies inside a player box whose number was not recognized.
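The dot colors amount to a point-in-box test per frame. The sketch below illustrates this under assumed conventions: the function name `classify_ball_position` is hypothetical, boxes are given as pixel corners `(x1, y1, x2, y2)` plus a jersey number or `None`, and a recognized number takes priority when boxes overlap.

```python
def classify_ball_position(ball_xy, player_boxes):
    """Return "green", "orange", or None for a single frame.

    ball_xy: (x, y) ball centre in pixels.
    player_boxes: list of (x1, y1, x2, y2, jersey_number_or_None).
    """
    x, y = ball_xy
    inside_unknown_box = False
    for x1, y1, x2, y2, jersey in player_boxes:
        if x1 <= x <= x2 and y1 <= y <= y2:
            if jersey is not None:
                return "green"        # ball in a box with a known number
            inside_unknown_box = True
    # Ball in a player box without a recognized number, or in no box at all
    return "orange" if inside_unknown_box else None


boxes = [(0, 0, 10, 10, 17), (20, 0, 30, 10, None)]
print(classify_ball_position((5, 5), boxes))    # green
print(classify_ball_position((25, 5), boxes))   # orange
print(classify_ball_position((15, 5), boxes))   # None
```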

Vertical ball position
Horizontal ball position

Sliding window technique to identify reception actions

Plotting the ball position for several receptions shows that the same pattern repeats. This reception pattern, a sample of 180 frames, can be located in the game data with a sliding window technique, which works as follows:

  • The 180-frame sample is moved across the entire time series, one frame at a time.
  • At each step, every data point of the sample is subtracted from the corresponding point of the time series, and the differences are combined into a single distance.
  • The resulting distance for each step is displayed in a plot.
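The procedure above can be sketched as follows. The post does not state which distance measure was used, so the sum of absolute point-wise differences here is an assumption, as is the function name `sliding_window_distances`. The toy example uses a three-frame template instead of 180 frames.

```python
def sliding_window_distances(series, template):
    """Distance between the template and every window of the series.

    Assumption: the distance is the sum of absolute point-wise
    differences; the original work may have used another measure.
    """
    n = len(template)
    distances = []
    # Move the template across the series, one frame at a time.
    for start in range(len(series) - n + 1):
        window = series[start:start + n]
        distances.append(sum(abs(w - t) for w, t in zip(window, template)))
    return distances


ball_height = [0, 0, 1, 3, 1, 0, 0]
reception_pattern = [1, 3, 1]
print(sliding_window_distances(ball_height, reception_pattern))
# [4, 5, 0, 5, 4]
```

The distance drops to 0 where the window matches the template exactly, so low points in the distance plot mark candidate receptions.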

Sliding window technique example

Applying the sliding window technique to a game produces a distance plot. Windows whose distance falls below a threshold of 42 are classified as receptions, all others as no reception. The triangles mark the receptions that should be detected: green triangles indicate receptions that were found, red triangles receptions that were missed. Raising the threshold detects more receptions, but it also increases the number of false detections.
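A thresholding step of this kind could look like the sketch below. The post only states the threshold value; reporting one detection per contiguous run of frames below the threshold, placed at the run's minimum, is an assumption made here for illustration, and `detect_below_threshold` is a hypothetical name.

```python
def detect_below_threshold(distances, threshold):
    """Return one frame index per contiguous run below the threshold.

    Assumption: each run of low distances counts as a single reception,
    reported at the frame with the smallest distance in that run.
    """
    detections = []
    run_start = None
    # The infinite sentinel closes a run that reaches the end of the list.
    for i, d in enumerate(list(distances) + [float("inf")]):
        if d < threshold:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            best = min(range(run_start, i), key=lambda j: distances[j])
            detections.append(best)
            run_start = None
    return detections


distances = [50, 40, 30, 45, 60, 41, 70]
print(detect_below_threshold(distances, 35))   # [2]
print(detect_below_threshold(distances, 42))   # [2, 5]
```

In this toy series, the stricter threshold misses the weaker match at frame 5, which mirrors the trade-off between missed receptions and false detections.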

Distance plot of sliding window technique

Deep Learning

To improve the reception detection results, we trained a 1D ResNet with three residual blocks, based on a Keras template for time series classification. The model was trained on windows of 180 ball positions around receptions, including samples with and without receptions. Because reception occurrence correlated much more strongly with the vertical ball position than with the horizontal one, the network was trained on vertical ball positions only. The final ResNet was trained on all receptions of one training game and, on a separate test game, detected 69 % of all receptions with a precision of 72 %. With a variable threshold, this corresponds to an AUCPRC (area under the precision-recall curve) of 0.72.
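Preparing such training samples can be sketched in plain Python. The helper names `extract_window` and `build_dataset` are hypothetical, and centring each 180-frame window on an annotated frame is an assumption; the post does not describe how the windows were positioned or how negative samples were chosen.

```python
def extract_window(vertical_positions, center, length=180):
    """Cut `length` vertical ball positions centred on frame `center`.

    Returns None if the window would extend beyond the recording.
    """
    start = center - length // 2
    end = start + length
    if start < 0 or end > len(vertical_positions):
        return None
    return vertical_positions[start:end]


def build_dataset(vertical_positions, reception_frames, other_frames, length=180):
    """Build (samples, labels): 1 = reception window, 0 = no reception."""
    samples, labels = [], []
    for label, frames in ((1, reception_frames), (0, other_frames)):
        for frame in frames:
            window = extract_window(vertical_positions, frame, length)
            if window is not None:        # skip windows cut off at the edges
                samples.append(window)
                labels.append(label)
    return samples, labels


positions = list(range(1000))             # stand-in for real ball heights
samples, labels = build_dataset(positions, [200, 50], [600])
print(len(samples), labels)               # 2 [1, 0]
```

The reception at frame 50 is dropped because its window would start before the first frame of the recording.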

1D ResNet architecture

Action Classification

To determine the precise moment of the reception within the identified 180-frame windows, we used the ball touches visible in the ball position graphs above. The first ball touch on the team's side during a rally is considered the reception, and the second ball touch is considered the set. A reception is rated by the ball location at which the set can be executed, which should be near the middle of the net, with the ball at an adequate height. This information can be translated into statistical rules that classify receptions as good or bad, which in turn yields the player statistics.
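Such a rule could take the form sketched below. All names and numbers are hypothetical: the post does not give the actual rules, so the normalized coordinates (0 to 1), the assumed net centre at 0.5, and the thresholds `max_offset` and `min_height` are placeholders that would have to be calibrated for a real camera setup.

```python
def classify_reception(set_ball_x, set_ball_height,
                       net_center_x=0.5, max_offset=0.15, min_height=0.6):
    """Rate a reception by the ball position at the following set touch.

    set_ball_x: horizontal ball position at the set, normalized to [0, 1].
    set_ball_height: ball height at the set, normalized to [0, 1].
    All thresholds are illustrative assumptions, not values from the thesis.
    """
    near_net_middle = abs(set_ball_x - net_center_x) <= max_offset
    high_enough = set_ball_height >= min_height
    return "good" if near_net_middle and high_enough else "bad"


print(classify_reception(0.55, 0.8))   # good
print(classify_reception(0.90, 0.8))   # bad (too far from the middle)
print(classify_reception(0.50, 0.3))   # bad (ball too low)
```

Counting the "good" and "bad" labels per jersey number then gives a per-player reception statistic.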

Ideal ball position at the set after a reception (at the middle of the net and high enough)

Note that 7 receptions were not found and 8 additional receptions were wrongly identified, so that 45 receptions were predicted against 44 expected ones. The classification works well on the sum of all receptions, but it is not reliable for individual players. In particular, bad receptions are often not recognized, especially for the libero (number 17). To improve the per-player classification, the detection and tracking of jersey numbers need further refinement. The table below compares predicted and expected counts per jersey number.

Number                | Good (predicted/expected) | Bad (predicted/expected) | Σ (predicted/expected)
3                     | 6/6                       | 0/2                      | 6/8
15                    | 4/4                       | 3/5                      | 7/9
16                    | 5/7                       | 2/1                      | 7/8
17                    | 7/11                      | 0/8                      | 7/19
Number not detected   | 5/0                       | 6/0                      | 11/0
Wrong number detected | 5/0                       | 2/0                      | 7/0
Σ                     | 32/28                     | 13/16                    | 45/44
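The detection counts quoted before the table can be checked with a short calculation. The function name `detection_metrics` is hypothetical, and the precision and recall values are derived here from the quoted counts only; they are not figures reported in the thesis and should not be confused with the ResNet results on the test game.

```python
def detection_metrics(expected, missed, extra):
    """Derive detection counts and ratios from annotated totals.

    expected: number of annotated receptions.
    missed:   annotated receptions that were not found.
    extra:    detections without a matching annotated reception.
    """
    found = expected - missed
    predicted = found + extra
    return {
        "found": found,
        "predicted": predicted,
        "recall": found / expected,
        "precision": found / predicted,
    }


metrics = detection_metrics(expected=44, missed=7, extra=8)
print(metrics["found"], metrics["predicted"])   # 37 45
print(round(metrics["recall"], 2))              # 0.84
print(round(metrics["precision"], 2))           # 0.82
```

The near-matching totals (45 predicted against 44 expected) therefore hide 15 individual errors, which is one reason the aggregate looks better than the per-player rows.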

Further Work

Object Detection

The object detection model is overfitted to data from the Baden Volleys. To use the model with other teams and different camera setups, it has to be retrained on a more diverse set of training data.

Multiple Object Tracking

To improve optical character recognition, preprocessing techniques can be applied and other models, such as Tesseract, can be tried. Embeddings could also improve tracking by helping to remap lost tracks. However, standard embeddings, like those used in DeepSORT, are too imprecise to distinguish between players wearing the same jersey. A model that produces a more detailed embedding would therefore have to be developed.

Action Detection on Still Images

To detect other actions and to identify the exact moment of a player’s action, action detection on still images could be used in addition to ball touch detection. A small demo is provided below.

Action detection demo
