Computational text analysis on unstructured police data: A scoping review

 

Police reports made following attendance at events such as crashes, domestic violence and theft often contain rich contextual details, including indicators of mental health issues or abuse types, the persons or entities involved, and the relationships between them. These details are not typically captured in structured administrative data, interviews or official statistics. However, the sheer volume of information, along with strict data access protocols, renders manual analysis impractical. Computational text analysis methods offer a feasible and effective way to process this underutilized data source automatically.

The research team, led by Dr Wilson Lukmanjaya (University of New South Wales), included VISION Research Fellow Dr Darren Cook. The team reviewed studies that apply computational text analysis, such as text mining and natural language processing (NLP), to unstructured police data, producing a guide for researchers interested in using similar methods.

Their article, Computational text analysis on unstructured police data: A scoping review, reports a review conducted in accordance with the PRISMA-ScR guidelines. The review followed a pre-defined protocol and two screening stages (title/abstract screening and full-text screening). The search covered seven electronic databases and the past 20 years.

After removing duplicate entries and screening titles/abstracts and full-text publications, 61 studies met the inclusion criteria. Included studies were published between 2004 and 2024, with most from the United States, Australia and the Netherlands.

The scoping review indicates that applications of computational text analysis to unstructured police data achieve moderate to high performance. Common limitations included variable data quality, with reliability depending on the level of detail provided by the police report’s author, and failure to report ethical implications or methodological limitations.

Computational text analysis can extract key information from unstructured police data. However, future research should clearly report ethics approvals, ethical implications and methodological limitations.

Recommendation

  1. Establishing a structured data-sharing framework between law enforcement and researchers is crucial to facilitate access and support high quality, impactful research in this field.

To download the paper: Computational text analysis on unstructured police data: A scoping review

To cite: Lukmanjaya, W., Halmich, C., Butler, T., Cook, D., Karystianis, G. Computational text analysis on unstructured police data: a scoping review. Crime Sci (2026). https://doi.org/10.1186/s40163-026-00272-2

