Authors:
Sam Andersson (https://orcid.org/0009-0003-2012-8626), Centre for Psychiatry Research, Department of Clinical Neuroscience, Karolinska Institutet, & Stockholm Health Care Services, Region Stockholm, Stockholm, Sweden
Per Carlbring (https://orcid.org/0000-0002-2172-8813), Department of Psychology, Stockholm University, Stockholm, Sweden; School of Psychology, Korea University, Seoul, South Korea
Keenan Lyon (https://orcid.org/0000-0002-4374-2077), LeoVegas Group, Stockholm, Sweden
Måns Bermell, LeoVegas Group, Stockholm, Sweden
Philip Lindner (https://orcid.org/0000-0002-3061-501X), Centre for Psychiatry Research, Department of Clinical Neuroscience, Karolinska Institutet, & Stockholm Health Care Services, Region Stockholm, Stockholm, Sweden
Open access

Abstract

Background and Aims

The digitalization of gambling provides unprecedented opportunities for early identification of problem gambling, a well-recognized public health issue. This study aimed to advance current practices by employing advanced machine learning techniques to predict problem gambling behaviors and assess the temporal stability of these predictions.

Methods

We analyzed player account data from a major Swedish online gambling provider, covering a 4.5-year period. Feature engineering was applied to capture gambling behavior dynamics. We trained gradient-boosted machine learning models (XGBoost) to classify players into low-risk and higher-risk categories. Temporal stability was evaluated by progressively truncating the training dataset at various time points (30, 60, and 90 days) and assessing model performance across truncations.

Results

The models demonstrated considerable predictive accuracy and temporal stability. Key features such as loss-chasing behavior and net balance trend consistently contributed to accurate predictions across all truncation periods. The model's performance on a separate holdout set, measured by metrics such as F1 score and ROC AUC, remained robust, with no significant decline observed even with reduced data, supporting the feasibility of early and reliable detection.

Discussion and Conclusions

These findings indicate that machine learning can reliably predict problem gambling behaviors over time, offering a scalable alternative to traditional methods. The temporal stability of these predictions highlights their potential for real-time application in gambling operators' Duty of Care. Consequently, advanced techniques could strengthen early identification and intervention strategies, potentially improving public health outcomes by preventing the escalation of harmful behaviors.

Introduction

Due to the high societal and individual costs associated with problem gambling, early identification is crucial (Eadington, 2003; Hofmarcher, Romild, Spångberg, Persson, & Håkansson, 2020; Jonsson, Abbott, Sjöberg, & Carlbring, 2017). The digitalization of gambling (Jonsson, Munck, Volberg, & Carlbring, 2017) offers unprecedented opportunities to do so, since every login, deposit, bet, and outcome is logged. Once identified, timely interventions can significantly increase the likelihood that individuals are helped before gambling-related harm accumulates (Clune et al., 2024). Existing methods for identifying problem gambling, such as self-report questionnaires and behavioral tracking, have varying degrees of validity and reliability (Edgren et al., 2016; Jonsson, Munck, et al., 2017). Self-report methods depend on individuals accurately reporting their behaviors and experiences. However, these methods can be susceptible to underreporting and bias (Goldstein et al., 2017; Sato & Kawahara, 2011), whereas behavioral tracking requires sophisticated data analytics to interpret effectively (Bitar et al., 2017; Catania & Griffiths, 2021; Haeusler, 2016; Kuentzel, Henderson, & Melville, 2008). Thus, the multifaceted nature of gambling, which involves various psychological, social, and situational factors, makes it challenging to assess with a singular approach (Browne et al., 2017; Hahmann, Hamilton-Wright, Ziegler, & Matheson, 2021). For example, individuals may stop gambling for diverse reasons, including not only harm or financial loss but also personal or strategic considerations (Weatherly, Montes, Peters, & Wilson, 2012).

Much of the existing literature on identification focuses on cross-sectional data (Gainsbury, Sadeque, Mizerski, & Blaszczynski, 2013), which provides only a snapshot of gambling behavior at a single point in time and fails to capture the temporal dynamics by design. While understanding the temporal patterns of gambling behavior is crucial, as it allows for a more accurate identification of problem gambling at different timepoints (Braverman, LaPlante, Nelson, & Shaffer, 2013; Braverman & Shaffer, 2012; Deng, Lesch, & Clark, 2019), it is equally important to consider aggregated behavioral data that captures broader trends and patterns over time. Longitudinal studies, although more complex, offer the potential for deeper insights into the evolution of gambling behavior and the onset of problem gambling (Dowling et al., 2017) and could in theory extend the prediction window, allowing identification of not just current problem gamblers, but also future ones.

With access to player account data, predictive analytics can develop scalable, data-driven methods to identify problem gamblers (Auer & Griffiths, 2022; Perrot et al., 2022). Various machine learning models have shown promise in identifying problem gamblers (Kairouz et al., 2023; Murch et al., 2023; Perrot et al., 2022), revealing that they can leverage complex datasets and scientifically informed feature engineering to identify patterns of gambling behavior related to problem gambling. However, while these studies demonstrate significant potential, they also have limitations. For instance, many existing models often rely on self-reported data, which can be prone to biases and inaccuracies (Percy, França, Dragičević, & d’Avila Garcez, 2016). Moreover, in many predictive studies, researchers often not only utilize cross-sectional data but also frame the prediction problem itself as a cross-sectional analysis, rather than leveraging longitudinal or retrospective data windows (Paterson, Taylor, & Gray, 2020). This approach, particularly in how data is aggregated and features are engineered, often overlooks the temporal richness inherent in the raw data (Suzuki, Nakamura, Inagaki, Watanabe, & Takagi, 2019), potentially skewing the results towards recent data points while missing out on longer-term trends and broader progressions over time (Park, Eom, Seo, & Choi, 2020).

The Swedish Gambling Act mandates counteracting excessive gambling through continuous monitoring of gambling behavior (Swedish Gambling Act, 2018). Whether the Duty of Care should extend to predictive analytics that foresee problematic patterns before they fully develop remains to be thoroughly examined and empirically validated. This study aims to enhance understanding of the temporal dynamics in identifying problem gambling by applying advanced machine learning methods focused on predicting manual assessments and evaluating the temporal stability of these predictions through truncating the training set at various time points. Our approach leverages aggregated data to capture broader behavioral indicators, ensuring comprehensive analysis and improved prediction accuracy. By transitioning from monitoring to proactive prediction, our research enables gambling operators to implement timely interventions to prevent the escalation of problem gambling behaviors. Such advancements align with legislative frameworks and could significantly improve public health outcomes by reducing gambling-related harms through early prediction.

Methods

Participants

We utilized player account data from one of Sweden's largest licensed online gambling providers, covering 4.5 years from January 1, 2019 (at which point Sweden switched to a licensed gambling market), to July 1, 2023. The dataset included extensive behavioral and transactional details for n = 35,048 unique, authenticated players, all of whom are based in Sweden, allowing for a comprehensive analysis of gambling behaviors within this specific context.

Measures

Data preprocessing and feature engineering

All data pre-processing and analyses were conducted using Python (3.11); the fully reproducible code is available online (https://github.com/SamAndersson-C/temporal-dynamics-problem-gambling). We performed extensive feature engineering on raw data consisting of 11 data frames, which included information on bets, transactions, sessions, demographics, payments, responsible gambling actions and predictions, manual risk assessments, and multiple accounts. Using SQL scripts, we combined the data shards into raw tables within a PostgreSQL database. As in past research (Hopfgartner, Auer, Griffiths, & Helic, 2022, 2024), features were derived to reflect various aspects of online gambling behavior, such as loss chasing, betting frequency, session lengths, and spending patterns. Accurate alignment of all tables was crucial given the granularity of the timestamps, both to enable meaningful feature engineering (Wang et al., 2009) and to capture the evolution of gambling behaviors and detect significant changes or trends; we therefore ensured temporal alignment of data tables and took specific care to avoid data leakage (i.e., using information from outside the training set in model training). We enhanced performance through indexing, partitioning, and query optimization. By precisely aligning and securely managing the data, we prevented inadvertent leakage that could produce overly optimistic results. All feature aggregations strictly used activity data up to each labeling date (Fig. 1).
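
As a minimal illustration of this leakage rule, the sketch below aggregates a player's deposit events strictly up to a labeling date, so that no post-label activity enters the features. Field names and values are hypothetical, not the study's actual feature code.

```python
from datetime import date

def aggregate_up_to(events, label_date):
    """Aggregate deposit events up to and including the labeling date,
    so that no post-label activity leaks into the features."""
    window = [amount for day, amount in events if day <= label_date]
    return {
        "n_deposits": len(window),
        "total_deposited": sum(window),
        "max_deposit": max(window, default=0.0),
    }

events = [
    (date(2022, 1, 5), 100.0),
    (date(2022, 2, 1), 50.0),
    (date(2022, 6, 10), 999.0),  # after the labeling date: must be excluded
]
features = aggregate_up_to(events, date(2022, 3, 1))
# features == {"n_deposits": 2, "total_deposited": 150.0, "max_deposit": 100.0}
```

The same cutoff filter would apply to every aggregated table (bets, sessions, transactions) before features are joined to a labeled instance.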

Fig. 1.

Data pre-processing and analysis pipeline

Citation: Journal of Behavioral Addictions 2025; 10.1556/2006.2025.00013

Nominal variables were numerically coded to facilitate modeling. Features with more than 50% missing values were excluded, while others underwent median imputation to preserve data integrity. Heavily skewed features with an excess of zero values were log-transformed to achieve a more normalized distribution. To evaluate the temporal stability of our predictions, we implemented a temporal division for training and test sets, reserving the final year's data for testing. This ensured the model was evaluated on unseen data, simulating a real-world deployment scenario (Barros, Nascimento, Guedes, & Monsueto, 2023). We employed a data truncation strategy based on timestamps from the gambling operator's raw data to further assess temporal stability. Using June 1, 2022, as a general reference, we truncated each player's data by removing records 30, 60, and 90 days prior to their maximum timestamp in the training set. This resulted in three distinct datasets: 30-day, 60-day, and 90-day truncated data, each undergoing the same feature engineering for model training and evaluation. This allowed us to analyze how model performance varies with different amounts of historical data, providing insights into the temporal stability of the predictions over varying time horizons.
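
The per-player truncation step can be sketched as follows: each player's records are cut at an offset from that player's own latest timestamp, mirroring the 30-, 60-, and 90-day datasets described above. This is an illustrative simplification with made-up records, not the study's pipeline code.

```python
from datetime import date, timedelta

def truncate_history(records, days):
    """Drop the final `days` days of a player's records, measured from
    that player's own latest timestamp (as in the truncation design)."""
    cutoff = max(day for day, _ in records) - timedelta(days=days)
    return [(day, value) for day, value in records if day <= cutoff]

history = [(date(2022, 3, 1), 10.0), (date(2022, 5, 1), 20.0),
           (date(2022, 6, 1), 30.0)]
# 30-day truncation removes everything within 30 days of this player's
# latest record (June 1), leaving the March and May entries
truncated = truncate_history(history, 30)
```

Running the same feature engineering on each truncated dataset then yields directly comparable models over shrinking historical windows.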

Labeling

The primary label indicating customer risk was derived from manual assessments conducted by the gambling provider as part of their Responsible Gambling (RG) operations as per their Duty of Care commitment (Cisneros Örnberg & Hettne, 2018). These assessments targeted players exhibiting concerning gambling behaviors, classifying them into five risk levels based on deposit patterns, session length, denied transactions, and responsible gambling tool use. Higher-risk cases involved persistent high deposits, prolonged play, or self-reported loss of control. Additionally, certain customers were flagged for manual review by the support team if communication raised concerns. Since assessments focused on flagged individuals rather than a random sample, the study population primarily reflects at-risk players rather than all gamblers, congruent with the aim of prediction algorithms in this context. A database logging error was discovered during the analysis, revealing that most of the manually assessed labels were concentrated at the beginning of the dataset's time frame, with some customers (n = 5,848) being flagged with “unknown risk” as their label. These were accounts that the operator's RG analysts began to review but could not complete due to unsuccessful attempts to contact the individual in question, resulting in the suspension of the review. These accounts were temporarily restricted from gambling until the company could establish contact with the individuals, allowing them to complete the review process in accordance with the RG procedures. These customers were included in the training data if they had a corresponding risk label from the RG prediction table on the same date as the “unknown risk” label (n = 1,844). If such a label was available, we replaced the “unknown risk” with the corresponding RG prediction label. 
Subsequent customers labeled as “unknown risk” without a corresponding RG prediction label were discarded from the training data (n = 4,004) while all “unknown risk” customers were discarded from the hold-out set (n = 1,902). This approach aimed to fill the gaps in the training data, thereby enhancing the dataset's coverage and the robustness of the predictive models and providing us with a more comprehensive and temporally distributed training dataset.
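
The substitution rule can be expressed compactly. In this sketch (string labels and the `None` sentinel are assumptions, not the study's schema), an “unknown risk” manual label is replaced by the RG prediction logged on the same date if one exists; otherwise the record is marked for discarding:

```python
def resolve_label(manual_label, rg_prediction_same_date):
    """Apply the imputation rule for 'unknown risk' manual assessments:
    substitute the same-date RG prediction label if one exists, otherwise
    return None to signal the record should be discarded from training."""
    if manual_label != "unknown risk":
        return manual_label
    return rg_prediction_same_date  # None -> discard
```

For the hold-out set, every “unknown risk” record is discarded outright rather than resolved, as described above.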

We acknowledge that this imputation method, while beneficial, does introduce some potential noise into the model. However, this noise can have a regularizing effect on the complex model. Since the imputation was applied only to the training data, we avoided any potential data leakage. Without imputation, large temporal gaps between manually assessed labels could have led to overfitting on sparse data patterns. By filling these gaps, we provided a more continuous and diverse set of training examples, helping the model generalize better to unseen data. This regularizing effect reduced overfitting, enhancing the robustness and reliability of the model's predictions.

Initially, customers were categorized into six risk levels, creating a multi-class classification problem. However, preliminary models performed poorly, as fine-grained classification can introduce unnecessary complexity and variance (Blanco, Perez-de-Viñaspre, Pérez, & Casillas, 2020; Elyan & Gaber, 2017). To gain deeper insight into risk labels and reduce the number of categories, we first generated average SHAP (SHapley Additive exPlanations) (Lundberg, Allen, & Lee, n.d.; Ukhov, Bjurgert, Auer, & Griffiths, 2021) value plots based on our preliminary models. From these plots, we selected the top 25 most influential variables and used them in a hierarchical clustering analysis with the complete linkage method, known for its robustness to noise and outliers (Laurikkala & Juhola, 2001). Euclidean distance was used; single linkage was also considered but found unsuitable due to data noise. This analysis revealed that the data naturally clustered into two groups: low-risk and higher-risk. This supported our decision to binarize the labels into low risk and all other risk levels. This binary framework allowed us to focus on distinguishing low-risk customers from those at elevated risk, aligning with responsible gambling objectives.
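
The clustering step can be reproduced in miniature. The sketch below applies complete-linkage hierarchical clustering with Euclidean distance, as in the analysis, to two synthetic groups that stand in for low-risk and higher-risk players; the five stand-in features are fabricated, not the study's top-25 SHAP variables.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
# Two well-separated synthetic groups of 20 "players" x 5 features
low_risk = rng.normal(0.0, 0.5, size=(20, 5))
higher_risk = rng.normal(4.0, 0.5, size=(20, 5))
X = np.vstack([low_risk, higher_risk])

# Complete linkage with Euclidean distance, robust to noise and outliers
Z = linkage(X, method="complete", metric="euclidean")
# Cut the dendrogram into at most two clusters
labels = fcluster(Z, t=2, criterion="maxclust")
```

Cutting the dendrogram at two clusters recovers the two groups, which is the kind of natural two-way split that motivated binarizing the risk labels.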

Procedure

Feature selection

After initial feature engineering in SQL, we conducted feature selection using SHAP values (Lundberg et al., n.d.; Ukhov et al., 2021) and Generalized Matrix Learning Vector Quantization (GMLVQ) (Lövdal & Biehl, 2024) to identify the most relevant features for the binary classification task. SHAP values decompose a model's prediction for an individual instance into contributions from each feature, providing local and consistent explanations. They ensure that the sum of SHAP values equals the difference between the model's prediction for that instance and the average prediction over the dataset, making them useful for interpreting complex models with clear, additive feature contributions.
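
The additivity property can be checked by hand on a toy model. The sketch below computes exact Shapley values for a two-feature function by averaging each feature's marginal contribution over all feature orderings; this is a brute-force illustration of the definition, not the TreeSHAP algorithm used for real models.

```python
from itertools import permutations

def exact_shapley(f, x, baseline):
    """Exact Shapley values: average each feature's marginal contribution
    over every ordering in which features flip from baseline to x."""
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        z = list(baseline)
        prev = f(z)
        for i in order:
            z[i] = x[i]
            current = f(z)
            phi[i] += (current - prev) / len(orderings)
            prev = current
    return phi

# Toy model with an interaction term
f = lambda z: 2 * z[0] + 3 * z[1] + z[0] * z[1]
phi = exact_shapley(f, x=[1.0, 1.0], baseline=[0.0, 0.0])
# Additivity: phi sums to f(x) - f(baseline) = 6.0
```

Here `phi` comes out as `[2.5, 3.5]`: the interaction's contribution of 1.0 is split evenly, and the values sum exactly to the prediction difference, which is the property that makes SHAP attributions consistent.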

In parallel, we applied GMLVQ, a supervised learning technique designed to enhance the discriminative power of features by optimizing a relevance matrix. GMLVQ adjusts the feature space to maximize the margin between classes, which is crucial for effectively distinguishing between classes. GMLVQ assigns different levels of relevance to each feature, thereby improving the model's ability to focus on the most discriminative features for accurate predictions. This approach not only aids in classification but also provides a way to interpret the contribution of each feature to the decision boundaries defined by the model.

To ensure equal contribution of all features during model training, we scaled them using a standard scaler. We reduced redundancy by calculating a correlation matrix and removing one feature from each pair of highly correlated features (Yu & Liu, 2003). Subsequently, we trained a model on the scaled training dataset: XGBoost (Chen & Guestrin, 2016). SHAP values were computed to evaluate the importance of each feature in the prediction process.
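
The redundancy-removal step can be sketched as a greedy filter over the correlation matrix: a feature is kept only if its absolute correlation with every already-kept feature stays below the threshold. The 0.9 cutoff and feature names here are illustrative assumptions.

```python
import numpy as np

def drop_correlated(X, names, threshold=0.9):
    """Greedily keep features, dropping any feature whose absolute
    correlation with an already-kept feature exceeds the threshold."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    kept = []
    for i in range(len(names)):
        if all(corr[i, j] <= threshold for j in kept):
            kept.append(i)
    return [names[i] for i in kept]

rng = np.random.default_rng(1)
a = rng.normal(size=200)
b = rng.normal(size=200)
# The second column is nearly a rescaled copy of the first
X = np.column_stack([a, 2 * a + 0.01 * rng.normal(size=200), b])
selected = drop_correlated(X, ["f0", "f1", "f2"])
```

Here `f1` is dropped as a near-duplicate of `f0`, leaving one representative from each correlated pair before SHAP-based importance ranking.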

To finalize the feature selection, we combined the top 25 features identified by each method. Choosing 25 features was a heuristic decision to balance model complexity and interpretability. This subset size ensured the final models were both accurate and interpretable, maintaining manageable complexity. By merging the most informative and discriminative features from both methods, we created a comprehensive and optimized feature set for the classification task.

Statistical analysis

We used XGBoost to classify customers into low-risk and higher-risk categories. Comprehensive hyperparameter tuning was conducted using Optuna (Akiba, Sano, Yanase, Ohta, & Koyama, 2019), an automated optimization framework, to ensure the model's accuracy and generalizability across different datasets. We focused on optimizing the F1 score to balance precision and recall, exploring hyperparameters such as learning rate, number of estimators, maximum tree depth, subsampling ratio, column sampling ratio, and regularization parameters. We also optimized a probability threshold for converting predicted probabilities into binary classifications. To respect the chronological structure of the data during model selection and avoid any leakage from future observations, we employed a nested forward-chaining cross-validation procedure. Specifically, we sorted all training instances by date and split them into a 5-fold outer loop using a time-series split, ensuring that each validation fold came strictly after the training folds in time. Within each outer training fold, we performed a 3-fold time-series split in an inner loop to refine hyperparameters, again preserving the temporal order. This approach minimized overfitting and ensured that each step of hyperparameter tuning and model evaluation respected the temporal sequence of events.
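
The nested forward-chaining structure can be sketched with scikit-learn's `TimeSeriesSplit`; the Optuna objective and XGBoost fitting are omitted here, and the 100 toy instances simply stand in for date-sorted training data.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# Instances are assumed sorted by date, so positional order equals time order
X = np.arange(100).reshape(-1, 1)

outer = TimeSeriesSplit(n_splits=5)
for train_idx, valid_idx in outer.split(X):
    # Forward chaining: every validation index follows every training index
    assert train_idx.max() < valid_idx.min()
    # 3-fold inner split over the outer training fold, where hyperparameter
    # tuning (e.g., an Optuna objective scoring mean F1) would run
    inner = TimeSeriesSplit(n_splits=3)
    for inner_train, inner_valid in inner.split(train_idx):
        assert inner_train.max() < inner_valid.min()
```

Because each validation fold lies strictly after its training folds at both levels, no hyperparameter choice can be informed by future observations.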

We ran up to 1,000 Optuna trials, training XGBoost with different hyperparameters and selecting the best based on average F1 score across inner cross-validation folds. Using these optimal parameters, we trained final models on the full dataset and each truncation period (30-day, 60-day, 90-day) before evaluating them on a hold-out test set (the unused data) to assess real-world generalizability.

Predicted probabilities were converted into binary predictions using the optimized threshold, and performance metrics—including F1 score, ROC AUC, precision, recall, accuracy, and confusion matrices—were computed. To assess the stability of predictions across different amounts of historical data, we repeated this process for each truncation period (30-day, 60-day, 90-day, and full) and compared performance metrics. Finally, we applied linear regression to these metrics to identify trends as the amount of data decreased and used bootstrapping to compute confidence intervals for the slopes, determining whether changes in performance were statistically significant over time.
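
The bootstrap slope procedure can be illustrated with the Precision values reported in Table 1; the resampling details below (percentile intervals, 5,000 resamples) are a simplified re-implementation, not necessarily the exact settings used in the study.

```python
import numpy as np

def bootstrap_slope_ci(x, y, n_boot=5000, alpha=0.05, seed=0):
    """Bootstrap a (1 - alpha) percentile CI for the least-squares slope
    of y on x by resampling the (x, y) points with replacement."""
    rng = np.random.default_rng(seed)
    x, y = np.asarray(x, float), np.asarray(y, float)
    slopes = []
    while len(slopes) < n_boot:
        idx = rng.integers(0, len(x), len(x))
        if np.ptp(x[idx]) == 0:  # degenerate resample: all points share one x
            continue
        slopes.append(np.polyfit(x[idx], y[idx], 1)[0])
    return tuple(np.quantile(slopes, [alpha / 2, 1 - alpha / 2]))

# Precision (PPV) from Table 1, ordered full -> 30 -> 60 -> 90 days truncated
truncation_days = [0, 30, 60, 90]
precision = [0.696, 0.691, 0.690, 0.687]
low, high = bootstrap_slope_ci(truncation_days, precision)
# Because precision decreases monotonically with truncation, the whole
# interval lies below zero, matching the reported negative Precision slope
```

A metric whose interval straddles zero (as for Accuracy, Recall, F1, and ROC AUC) would instead show no significant trend across truncations.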

In addition to classification, we conducted a regression analysis to predict continuous risk scores, providing a more granular understanding of the model's predictive capabilities. We used XGBoost as a regressor to predict risk scores on a continuous scale, which were subsequently categorized into low, medium, and high-risk levels.
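
Mapping the continuous predictions onto categories can be done with simple cut-points, as sketched below; the thresholds 0.45 and 0.70 are illustrative placeholders, not the operator's actual boundaries.

```python
import numpy as np

def bin_risk(scores, low_cut=0.45, high_cut=0.70):
    """Map continuous predicted risk scores onto low/medium/high bands.
    The cut-points here are hypothetical, chosen only for illustration."""
    scores = np.asarray(scores, dtype=float)
    return np.where(scores < low_cut, "low",
                    np.where(scores < high_cut, "medium", "high"))

bands = bin_risk([0.20, 0.55, 0.80])
```

This is the form in which the regression output can be compared, band by band, against the true category means reported in the Results.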

Ethics

The study procedures were carried out in accordance with the Declaration of Helsinki. The study was reviewed and approved by the Swedish Ethical Review Authority (Dnr 2023-07288-02). Informed consent was waived by the review board to permit research on pre-existing registry data.

Results

Predictions of problem gambling exhibited considerable temporal stability, even with progressively truncated data. Across all truncation periods (30-day, 60-day, 90-day, and full data), “loss chasing behavior weekly log transformed,” “net balance trend,” “max deposit log transformed,” “session sum p25,” and “total bets daily log transformed” consistently had the highest SHAP values, indicating a strong influence on the model's predictions (Fig. 2). As shown in Fig. 3, hold-out set metrics improved slightly with more data, suggesting larger datasets enhance generalization and decision boundaries—particularly in identifying true positives. Overall, model performance was modest yet consistent (Table 1).

Fig. 2.

Feature importance plot


Fig. 3.

Temporal evaluation and prediction stability


Table 1.

Model performance metrics for different truncation labels

Truncation             Accuracy  Precision (PPV)  Recall (Sensitivity)  F1 Score  ROC AUC  Specificity  NPV
30-day-truncated data  0.668     0.691            0.929                 0.793     0.621    0.107        0.411
60-day-truncated data  0.659     0.690            0.911                 0.785     0.613    0.117        0.378
90-day-truncated data  0.684     0.687            0.984                 0.810     0.608    0.036        0.515
Full data              0.637     0.696            0.834                 0.758     0.613    0.213        0.374

A bootstrap analysis of linear slopes across truncation periods (Full → 30-day → 60-day → 90-day) found no significant trend for most metrics, as their 95% confidence intervals included zero: Accuracy [−0.009, 0.031], Recall (Sensitivity) [−0.018, 0.095], F1 Score [−0.008, 0.035], and ROC AUC [−0.008, 0.008]. However, Precision (PPV) had a 95% CI entirely below zero [−0.005, −0.001], indicating a consistently negative slope with increasing truncation. Practically, Precision dropped slightly when moving from full to truncated data. Despite this, overall model performance remained stable across all truncation periods.

We used a regression model to predict continuous risk scores—grouped as low, medium, and high risk—to evaluate performance by risk level (Table 2 and Fig. 4). The model performed well for medium- and high-risk categories, with predicted means closely matching true means. For instance, in the 30-day dataset, the medium-risk group's actual mean was 0.618 vs. a predicted mean of 0.500, and the high-risk group's actual mean was 0.761 vs. 0.758. In the 60-day dataset, the high-risk group's actual mean was 0.765 vs. 0.755. However, the model consistently underestimated risk for the low-risk category in every dataset: for example, in the 60-day dataset, the low-risk group's true mean was 0.529 vs. a predicted mean of 0.248, and in the full dataset, 0.557 vs. 0.219. Thus, the model effectively identifies medium- and high-risk individuals but struggles to accurately capture low-risk cases.

Table 2.

Risk category prediction table with difference

Dataset                Risk Category  True Mean  Predicted Mean  Difference
30-day-truncated data  Low Risk       0.600      0.242           0.358
30-day-truncated data  Medium Risk    0.618      0.500           0.118
30-day-truncated data  High Risk      0.761      0.758           0.003
60-day-truncated data  Low Risk       0.529      0.248           0.281
60-day-truncated data  Medium Risk    0.625      0.497           0.127
60-day-truncated data  High Risk      0.765      0.755           0.010
90-day-truncated data  Low Risk       0.588      0.255           0.333
90-day-truncated data  Medium Risk    0.609      0.538           0.072
90-day-truncated data  High Risk      0.771      0.724           0.047
Full data              Low Risk       0.557      0.219           0.337
Full data              Medium Risk    0.646      0.458           0.188
Full data              High Risk      0.779      0.722           0.057
Fig. 4.

Difference between true and predicted means


Figure 4 shows better performance in the medium- and high-risk groups, with smaller gaps between true and predicted means, whereas the model underestimated risk in the low-risk group across all datasets. In the full dataset, for example, the difference for low-risk cases reached 0.337, compared to 0.188 for medium risk and 0.057 for high risk.

Discussion

The results suggest that machine learning predictions of problem gambling, assessed manually or through proxy measures, show relative stability over time, with time being intrinsically linked to data amount. This indicates that early predictions are consistent and reliable, highlighting our model's robustness. Our claims are based on the model's performance on a holdout validation set. By reserving a full year of data for validation, we evaluated the model on unseen data, mimicking real-world conditions for Duty of Care obligations. Our findings confirm that predictive analytics and machine learning are promising in identifying problem gamblers (Auer & Griffiths, 2022; Deng et al., 2019; Perrot et al., 2022), validating the effectiveness of these methods in a temporally robust manner. Metrics like ROC AUC and F1 score remained consistent across data truncation levels, indicating model reliability. Bootstrapping showed no significant slopes for Accuracy, Recall, F1, and ROC AUC, but Precision exhibited a slight, consistently negative slope from full data to 30-, 60-, and 90-day truncations. Despite this, overall performance stayed relatively stable. Unlike preliminary analysis based on training data and time series cross-validation, the holdout evaluation did not show a decline in performance metrics with the full dataset; instead, metrics such as recall and F1 score improved with increased dataset size, underscoring the importance of using a separate validation set for an accurate reflection of the model's true performance. Therefore, our methods avoid the limitations of traditional approaches like self-report questionnaires and simple behavioral tracking, which often suffer from validity and reliability issues (Edgren et al., 2016; Hodgins & Makarchuk, 2003; MacKillop, Anderson, Castelda, Mattson, & Donovick, 2006). Our machine learning approach offers a more reliable and scalable solution. 
The model consistently demonstrates reliable performance across different truncation periods, with SHAP values clarifying which features drive its predictions. This highlights the model's ability to effectively interpret complex behavioral data that traditional methods might not capture.

Finally, studies relying on cross-sectional data inherently struggle to capture the temporal dynamics of gambling behavior (Castrén, Kontto, Alho, & Salonen, 2018; Gainsbury et al., 2013; Paterson et al., 2020). Our study addresses this gap by evaluating the temporal stability of predictions. The consistent importance of key features across different truncation periods, as shown by SHAP values and performance metrics, underscores this stability. This is critical for developing models that can accurately predict problem gambling over extended periods, enhancing our understanding of gambling behavior dynamics.

The findings have practical implications for early identification and intervention in problem gambling. The stability of predictions supports the timely implementation of preventive measures, which can mitigate the risks associated with problem gambling and aid stakeholders in developing effective public health monitoring and intervention programs (Jonsson, Munck, Hodgins, & Carlbring, 2023).

Limitations

This study has several limitations. First, inconsistent application of risk labels over time may cause the model to capture temporal biases rather than genuine risk patterns, especially in dynamic environments like gambling where user behavior and risk profiles can change rapidly. The presence of “unknown risk” labels led to an imbalanced dataset, underrepresenting certain risk categories and potentially skewing the model's learning process toward more prevalent categories. Our imputation strategy—filling gaps with responsible gambling (RG) prediction labels—aimed to mitigate this by improving the quality and quantity of labeled training data. This approach increased the number of labeled data points and ensured a more uniform temporal distribution, allowing the models to learn from a broader and more representative sample. While this enhanced dataset reduced the risk of overfitting and increased generalizability, inherent imbalances may still pose challenges. Importantly, the hold-out validation data did not suffer from this limitation.

Second, potential bias introduced by manual assessments used for labeling must be acknowledged. Analysts' subjective judgments could have impacted the consistency and accuracy of the labels. Despite this potential bias, manual assessments are generally considered more reliable than self-assessments, which are often prone to inaccuracies (whether deliberate or unintentional) and inconsistencies.

Third, our truncation strategy was intended to ensure temporal stability by focusing on consistent windows of activity. However, it may have inadvertently caused accounts with the most cumulative activity to contribute disproportionately to the predictions. Initially, we attempted to use accounts with 30, 60, or 90 days of total activity, but too few accounts met these criteria for meaningful model training. Consequently, we opted for an activity truncation strategy as a compromise, including enough data points for model training but possibly biasing the model toward accounts with more extensive histories.

Fourth, our dataset comes from a single gambling operator in a competitive market and does not include any given gambler's activity at other operators. Problem gamblers are typically more likely to gamble with multiple operators, so incomplete behavioral histories can lead to underestimation or misclassification of certain gambling behaviors and limit the broader applicability of our findings. Ideally, a "single customer view" mechanism, aggregating data across operators, would yield more comprehensive insights and potentially more accurate predictive models. In the absence of a centralized system for sharing account-tracking data across operators, operator-specific predictions remain the pragmatic approach to minimizing gambling harms.

Lastly, although the analysis uses a robust setup (temporal holdout splits and nested cross-validation), the limited bootstrapping approach (four samples per metric) may reduce sensitivity to subtle trends. Even so, the narrow confidence intervals suggest that performance metrics remained stable over time, indicating temporal consistency. Future research with larger samples or alternative resampling methods could further validate these findings.
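For context, a standard percentile bootstrap of a performance metric looks as follows. The four metric values and the choice of 1,000 resamples are illustrative only; the study itself used far fewer bootstrap samples, which is precisely the limitation noted above.

```python
import numpy as np

def bootstrap_ci(values, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of a metric.

    Resamples `values` with replacement `n_boot` times and returns the
    (alpha/2, 1 - alpha/2) percentiles of the resampled means.
    """
    rng = np.random.default_rng(seed)
    boot_means = [
        rng.choice(values, size=len(values), replace=True).mean()
        for _ in range(n_boot)
    ]
    lo, hi = np.percentile(boot_means, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return lo, hi

# Hypothetical AUC values across four truncation evaluations.
aucs = np.array([0.81, 0.83, 0.80, 0.82])
lo, hi = bootstrap_ci(aucs)
print(lo <= aucs.mean() <= hi)  # True
```

With only four underlying values, the interval is dominated by the small sample rather than the number of resamples, which is why larger samples (not merely more resampling) are needed to detect subtle drift.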

Future research directions

Our findings suggest several avenues for future research. One key area is determining the optimal data window for reliable predictions, balancing data sufficiency with model performance. Exploring other machine learning techniques or refining labeling methods could further enhance accuracy. Validating the model with different datasets or in varied contexts will improve its generalizability and robustness.

To improve predictive capabilities, gambling operators should routinely collect relevant features reflecting various risk levels of problem gambling, beyond purely transactional data. These might include browsing patterns, time spent on different site areas, or engagement with specific features. Just as physical casinos observe customer behavior on the floor, incorporating such behavioral indicators online could enhance the model's ability to identify at-risk individuals.

Conclusions

This study demonstrates the value of advanced machine learning techniques and rigorous methodology in gambling research. Our findings show stable long-term prediction performance, evidenced by consistent metrics across the different truncation periods. This supports the feasibility of early detection and timely intervention, and underscores the importance of methodological rigor in developing reliable predictive models. These results provide a strong foundation for further research and development in the field.

Funding sources

This study was funded by the LeoVegas Group, a licensed gambling operator in Sweden.

Authors' contributions

SA conceived the analysis pipeline, designed the methods, performed all statistical analyses, developed and executed the modeling and feature engineering, built the database, and drafted the manuscript. PL, as the main supervisor, was responsible for project organization and oversight, provided critical revision of the manuscript, and engaged in discussions on the study concept and on methodological and analytical approaches, particularly around temporal stability. PC, as co-supervisor, contributed to the study concept alongside PL and provided feedback during critical revision of the manuscript. KL, a data scientist at LeoVegas, supervised the data science aspects, provided technical input, facilitated data transfer, and participated in discussions on machine learning and statistical methods. MB, as head of data science at LeoVegas, contributed to data acquisition, addressed data-related queries, and provided expertise on study feasibility, technical aspects, and conceptual input. All authors were part of the steering group, which met monthly and was organized by PL. SA and PL had full access to all data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis. All authors approved the final manuscript.

Conflict of interest

This study is part of an industry-academia collaboration on Responsible Gambling, financed by the LeoVegas Group, a licensed gambling operator in Sweden. The research was planned, performed, and submitted under full academic freedom, guaranteed by a written agreement. The funders had no role in the design or execution of the study, nor in the decision to publish.

SA's doctoral position is financed by the LeoVegas Group, but SA is an employee of Karolinska Institutet and reports no other potential conflicts of interest. PL and PC report past and ongoing industry-academia collaborations with several gambling providers, including project-specific funding, but have no personal ties to the gambling industry, financial or otherwise. MB is employed by the LeoVegas Group.

Generative AI use

During the preparation of this work, the corresponding author used ChatGPT to improve writing style and correct grammar and spelling. The authors thoroughly reviewed and edited the output and take full responsibility for the final content of the publication. The AI's role was limited to enhancing the clarity, coherence, and presentation of the manuscript.

Acknowledgements

We would like to acknowledge Prof. Ion Petre at the University of Turku, whose lecture series “Foundations of Machine Learning I-III” inspired much of the analysis pipeline used in this study.


Editor-in-Chief contact: Dr. Zsolt Demetrovics
Institute of Psychology, ELTE Eötvös Loránd University
Address: Izabella u. 46., H-1064 Budapest, Hungary
Phone: +36-1-461-2681
E-mail: jba@ppk.elte.hu

Indexing and Abstracting Services:

  • Web of Science [Science Citation Index Expanded (also known as SciSearch®)]
  • Journal Citation Reports / Science Edition
  • Social Sciences Citation Index®
  • Journal Citation Reports / Social Sciences Edition
  • Current Contents®/Social and Behavioral Sciences
  • EBSCO
  • Google Scholar
  • PsycINFO
  • PubMed Central
  • SCOPUS
  • Medline
  • CABI
  • CABELLS Journalytics

2023 metrics

Web of Science
  • Journal Impact Factor: 6.6
  • Rank by Impact Factor: Q1 (Psychiatry)
  • Journal Citation Indicator: 1.59

Scopus
  • CiteScore: 12.3
  • CiteScore rank: Q1 (Clinical Psychology)
  • SNIP: 1.604

Scimago
  • SJR index: 2.188
  • SJR Q rank: Q1

Journal of Behavioral Addictions
Publication Model: Gold Open Access
Submission Fee: none
Article Processing Charge: 990 EUR/article (1,400 EUR/article effective from 1 February 2025)
Regional discounts based on country of the funding agency: World Bank lower-middle-income economies, 50%; World Bank low-income economies, 100%
Further Discounts: corresponding authors affiliated with an EISZ member institution subscribing to the journal package of Akadémiai Kiadó, 100%
Subscription Information: Gold Open Access

Journal of Behavioral Addictions
Language: English
Size: A4
Year of Foundation: 2011
Volumes per Year: 1
Issues per Year: 4
Founder: Eötvös Loránd Tudományegyetem
Founder's Address: H-1053 Budapest, Hungary, Egyetem tér 1–3.
Publisher: Akadémiai Kiadó
Publisher's Address: H-1117 Budapest, Hungary; 1516 Budapest, PO Box 245
Responsible Publisher: Chief Executive Officer, Akadémiai Kiadó
ISSN 2062-5871 (Print)
ISSN 2063-5303 (Online)

Senior editors

Editor-in-Chief: Zsolt DEMETROVICS

Assistant Editors: Csilla ÁGOSTON, Dana KATZ

Associate Editors

  • Stephanie ANTONS (University of Duisburg-Essen, Germany)
  • Joel BILLIEUX (University of Lausanne, Switzerland)
  • Beáta BŐTHE (University of Montreal, Canada)
  • Matthias BRAND (University of Duisburg-Essen, Germany)
  • Daniel KING (Flinders University, Australia)
  • Gyöngyi KÖKÖNYEI (ELTE Eötvös Loránd University, Hungary)
  • Ludwig KRAUS (IFT Institute for Therapy Research, Germany)
  • Marc N. POTENZA (Yale University, USA)
  • Hans-Jürgen RUMPF (University of Lübeck, Germany)
  • Ruth J. VAN HOLST (Amsterdam UMC, The Netherlands)

Editorial Board

  • Sophia ACHAB (Faculty of Medicine, University of Geneva, Switzerland)
  • Alex BALDACCHINO (St Andrews University, United Kingdom)
  • Judit BALÁZS (ELTE Eötvös Loránd University, Hungary)
  • Maria BELLRINGER (Auckland University of Technology, Auckland, New Zealand)
  • Henrietta BOWDEN-JONES (Imperial College, United Kingdom)
  • Damien BREVERS (University of Luxembourg, Luxembourg)
  • Julius BURKAUSKAS (Lithuanian University of Health Sciences, Lithuania)
  • Gerhard BÜHRINGER (Technische Universität Dresden, Germany)
  • Silvia CASALE (University of Florence, Florence, Italy)
  • Luke CLARK (University of British Columbia, Vancouver, B.C., Canada)
  • Jeffrey L. DEREVENSKY (McGill University, Canada)
  • Geert DOM (University of Antwerp, Belgium)
  • Nicki DOWLING (Deakin University, Geelong, Australia)
  • Hamed EKHTIARI (University of Minnesota, United States)
  • Jon ELHAI (University of Toledo, Toledo, Ohio, USA)
  • Ana ESTEVEZ (University of Deusto, Spain)
  • Fernando FERNANDEZ-ARANDA (Bellvitge University Hospital, Barcelona, Spain)
  • Naomi FINEBERG (University of Hertfordshire, United Kingdom)
  • Sally GAINSBURY (The University of Sydney, Camperdown, NSW, Australia)
  • Belle GAVRIEL-FRIED (The Bob Shapell School of Social Work, Tel Aviv University, Israel)
  • Biljana GJONESKA (Macedonian Academy of Sciences and Arts, Republic of North Macedonia)
  • Marie GRALL-BRONNEC (University Hospital of Nantes, France)
  • Jon E. GRANT (University of Minnesota, USA)
  • Mark GRIFFITHS (Nottingham Trent University, United Kingdom)
  • Joshua GRUBBS (University of New Mexico, Albuquerque, NM, USA)
  • Anneke GOUDRIAAN (University of Amsterdam, The Netherlands)
  • Susumu HIGUCHI (National Hospital Organization Kurihama Medical and Addiction Center, Japan)
  • David HODGINS (University of Calgary, Canada)
  • Eric HOLLANDER (Albert Einstein College of Medicine, USA)
  • Zsolt HORVÁTH (Eötvös Loránd University, Hungary)
  • Susana JIMÉNEZ-MURCIA (Clinical Psychology Unit, Bellvitge University Hospital, Barcelona, Spain)
  • Yasser KHAZAAL (Geneva University Hospital, Switzerland)
  • Orsolya KIRÁLY (Eötvös Loránd University, Hungary)
  • Chih-Hung KO (Faculty of Medicine, College of Medicine, Kaohsiung Medical University, Taiwan)
  • Shane KRAUS (University of Nevada, Las Vegas, NV, USA)
  • Hae Kook LEE (The Catholic University of Korea, Republic of Korea)
  • Bernadette KUN (Eötvös Loránd University, Hungary)
  • Katerina LUKAVSKA (Charles University, Prague, Czech Republic)
  • Giovanni MARTINOTTI (‘Gabriele d’Annunzio’ University of Chieti-Pescara, Italy)
  • Gemma MESTRE-BACH (Universidad Internacional de la Rioja, La Rioja, Spain)
  • Astrid MÜLLER (Hannover Medical School, Germany)
  • Daniel Thor OLASON (University of Iceland, Iceland)
  • Ståle PALLESEN (University of Bergen, Norway)
  • Afarin RAHIMI-MOVAGHAR (Tehran University of Medical Sciences, Iran)
  • József RÁCZ (Hungarian Academy of Sciences, Hungary)
  • Michael SCHAUB (University of Zurich, Switzerland)
  • Marcantonio M. SPADA (London South Bank University, United Kingdom)
  • Daniel SPRITZER (Study Group on Technological Addictions, Brazil)
  • Dan J. STEIN (University of Cape Town, South Africa)
  • Sherry H. STEWART (Dalhousie University, Canada)
  • Attila SZABÓ (Eötvös Loránd University, Hungary)
  • Hermano TAVARES (Instituto de Psiquiatria do Hospital das Clínicas da Faculdade de Medicina, Universidade de São Paulo, São Paulo, Brazil)
  • Wim VAN DEN BRINK (University of Amsterdam, The Netherlands)
  • Alexander E. VOISKOUNSKY (Moscow State University, Russia)
  • Aviv M. WEINSTEIN (Ariel University, Israel)
  • Anise WU (University of Macau, Macao, China)
  • Ágnes ZSILA (ELTE Eötvös Loránd University, Hungary)

