
Computational text analysis on unstructured police data: A scoping review

 

Police reports made following attendance at events such as crashes, domestic violence and theft often contain rich contextual details that are not typically captured in structured administrative data, interviews or official statistics. These details include indicators of mental health issues or abuse types, as well as the persons or entities involved and their relationships. However, the sheer volume of information, along with strict data access protocols, renders manual analysis impractical. Computational text analysis methods offer a feasible and effective way to process this underutilised data source automatically.

The research team, led by Dr Wilson Lukmanjaya (University of New South Wales), included VISION Research Fellow Dr Darren Cook. The team reviewed studies that applied computational text analysis (e.g., text mining, natural language processing (NLP)) to unstructured police data, with the aim of providing a guide for researchers interested in employing similar methodologies.

The review, reported in their article Computational text analysis on unstructured police data: A scoping review, was conducted in accordance with the PRISMA-ScR guidelines. It followed a pre-defined protocol and used two screening stages (title/abstract screening and full-text screening). The search covered seven electronic databases and the past 20 years.

After removing duplicate entries and screening titles/abstracts and full-text publications, 61 studies met the inclusion criteria. Included studies were published between 2004 and 2024, with most from the United States, Australia and the Netherlands.

The scoping review indicates that applications of computational text analysis to unstructured police data achieve moderate to high performance. Common limitations included variable data quality, with reliability depending on the level of detail provided by the police report’s author, and a failure to report ethical implications or methodological limitations.

Computational text analysis can extract key information from unstructured police data. However, future research should clearly report ethics approvals, ethical implications and methodological limitations.

Recommendation

  1. Establishing a structured data-sharing framework between law enforcement and researchers is crucial to facilitate access and support high quality, impactful research in this field.

To download the paper: Computational text analysis on unstructured police data: A scoping review

To cite: Lukmanjaya, W., Halmich, C., Butler, T., Cook, D., Karystianis, G. Computational text analysis on unstructured police data: a scoping review. Crime Sci (2026). https://doi.org/10.1186/s40163-026-00272-2

Photograph from Adobe Photo Stock subscription

Use of text mining to study Intimate Partner Violence

Computational text mining methods are proposed as a useful methodological innovation in Intimate Partner Violence (IPV) research. Text mining can offer researchers access to existing or new datasets, sourced from social media or from IPV-related organisations, that would be too large to analyse manually. This article aims to give an overview of current work applying text mining methodologies in the study of IPV, as a starting point for researchers wanting to use such methods in their own work.

A systematic review was conducted in accordance with PRISMA guidelines. It searched eight databases and identified 22 unique studies for inclusion.

The studies cover a wide range of methodologies and outcomes. Supervised and unsupervised approaches are represented, including rule-based classification (n = 3), traditional Machine Learning (n = 8), Deep Learning (n = 6) and topic modelling (n = 4) methods. Datasets are mostly sourced from social media (n = 15), with other data being sourced from police forces (n = 3), health or social care providers (n = 3), or litigation texts (n = 1). Only a few studies commented on the ethics of computational IPV research.
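To make the simplest of these categories concrete, the following is a minimal sketch of rule-based classification applied to a short narrative. It is an illustration only and is not taken from any of the reviewed studies: the label names, the keyword patterns and the `classify` function are all hypothetical, and a real study would need a validated lexicon and an evaluation against manually coded text.

```python
import re

# Hypothetical keyword rules, invented for illustration only.
# They are not drawn from the reviewed studies or from any validated lexicon.
ABUSE_PATTERNS = {
    "physical": re.compile(r"\b(hit|punched|kicked|slapped|choked)\b", re.IGNORECASE),
    "verbal": re.compile(r"\b(shouted|yelled|insulted|swore)\b", re.IGNORECASE),
    "financial": re.compile(r"\b(money|wages|bank account)\b", re.IGNORECASE),
}


def classify(text: str) -> list[str]:
    """Return the sorted labels whose keyword pattern occurs in the text."""
    return sorted(
        label for label, pattern in ABUSE_PATTERNS.items() if pattern.search(text)
    )


if __name__ == "__main__":
    # Made-up example sentence, not real police or social media data.
    print(classify("He punched the wall and yelled at her."))
    # prints ['physical', 'verbal']
```

Rules like these are transparent and easy to audit, but they miss paraphrases and misspellings, which is one reason the reviewed studies also use machine learning, deep learning and topic modelling.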

Text mining methodologies offer promising data collection and analysis techniques for IPV research. However, future work in this space must consider the ethical implications of computational approaches.

For further information, please see: A Systematic Literature Review of the Use of Computational Text Analysis Methods in Intimate Partner Violence Research (SpringerLink), or contact Lilly Neubauer at j.neubauer@cs.ucl.ac.uk or Dr Leonie Tanczer at l.tanczer@ucl.ac.uk

Illustration: graphicwithart / Shutterstock.com