Archives

Computational text analysis on unstructured police data: A scoping review

 

Police reports made following attendance at events such as crashes, domestic violence and theft often contain rich contextual details, including indicators of mental health issues or abuse types, and the persons or entities involved and their relationships. These details are not typically captured in structured administrative data, interviews or official statistics. However, the sheer volume of information, along with strict data access protocols, renders manual analysis impractical. Computational text analysis methods offer a feasible and effective approach to automatically process this underutilized data source.

The research team led by Dr Wilson Lukmanjaya (University of New South Wales) included VISION Research Fellow Dr Darren Cook. The team conducted an overview of studies using computational text analysis (e.g., text mining, natural language processing (NLP)) on unstructured police data, to serve as a guide for researchers interested in employing similar methodologies.

Their article, Computational text analysis on unstructured police data: A scoping review, reports a review conducted in accordance with the PRISMA-ScR guidelines, with a pre-defined protocol and two screening stages (title/abstract and full text). The search covered seven electronic databases and the past 20 years.

After removing duplicate entries and screening titles/abstracts and full-text publications, 61 studies met the inclusion criteria. Included studies were published between 2004 and 2024, with most from the United States, Australia and the Netherlands.

The scoping review indicates applications of computational text analysis on unstructured police data have moderate to high performance. Common limitations included variable data quality, with reliability depending on the level of detail provided by the police report’s author, and failure to report ethical implications or methodological limitations.

Computational text analysis can extract key information from unstructured police data. However, future research should clearly report ethics approvals and implications, and methodological limitations. 

Recommendation

  1. Establishing a structured data-sharing framework between law enforcement and researchers is crucial to facilitate access and support high quality, impactful research in this field.

To download the paper: Computational text analysis on unstructured police data: A scoping review

To cite: Lukmanjaya, W., Halmich, C., Butler, T., Cook, D., Karystianis, G. Computational text analysis on unstructured police data: a scoping review. Crime Sci (2026). https://doi.org/10.1186/s40163-026-00272-2

Photograph from Adobe Photo Stock subscription

Reforms to British policing: Does performance equal progress?

 

 

Reflections on performance and productivity markers in the 2026 police reforms white paper

 

By Mattie Jones, PhD student, Violence & Society Centre, City St George’s University of London

In January, the British government released the white paper ‘From local to national: a new model for policing,’ outlining sweeping proposed changes to the police in England and Wales. To date, the media has primarily focused on the proposed National Police Service (NPS), calling it the “British FBI.” While the creation of the NPS is a major section of the paper, it is simply one piece of a much larger effort to reform British policing in a new and developing, post-Casey Review era. 

As a policing researcher and former United States police officer, what gave me pause when analysing the proposals were the pervasive underlying themes of improving productivity and tracking performance. Alongside the white paper, the Home Office introduced the Police Performance Framework. This framework sets out performance metrics that clarify parameters of success and identify areas for improvement. While performance metrics are useful for informing evidence-based policing, elements of this framework appear to lack specific direction on how forces will achieve the objectives, and it does not properly contextualise the desired outcomes that coincide with the numeric change. 

An example of this from the framework is to ‘increase the volume of crimes’ where a suspect receives VAWG-related charges. While commendable, goals like this might lend themselves to target-driven enforcement.  Officers may feel pressure to chase targets and demonstrate productivity, which could lead to unnecessary minor arrests without actually reducing serious crime or benefitting victims. In 2015 the Home Office interrogated issues of target chasing, directly attributing it to mis-recording crimes and shifting efforts to minor or ‘volume crime’ to meet metrics.  The proposed framework, combining a focus on performance indicators and a prioritisation of productivity, raises similar warning signs for a policing environment inclined toward quota-driven enforcement. 

‘Quota’ is a loaded term in police practice, and it is important not to conflate all performance metrics with quotas. Police quotas combine four elements: formal channels and/or informal mechanisms of implementation, quantification of an acceptable threshold, a requirement to meet the threshold, and negative action upon failure to meet it (Ossei-Owusu, 2021). Policing in a quota-based system leads officers and forces to focus on meeting metrics, which may be at odds with exercising discretion and pursuing positive outcomes for the public (Ossei-Owusu, 2021). These concerns, along with issues of discrimination, are why many scholars, practitioners, and members of the public push back against quotas (Ossei-Owusu, 2021). With the introduction of the Police Performance Framework, the Police Performance Dashboard for data monitoring between forces, a Tiered Performance System, and a “more active, ‘hands on’ Home Office,” the white paper outlines an environment ripe for quota-driven enforcement.

The line between creating quality metrics that provide data to drive improvements and encouraging forces to adopt quotas or enforcement targets is very fine.  To strike the right balance, the Home Office will need to take care and offer specificity when operationalising the objectives and quantifying the outcomes of these proposed changes. Quality data on policing and performance is necessary, but we must be cautious that we don’t let the pursuit of quantification and measurement lend itself to ill-advised practice. 

For further information, please contact Mattie at mattie.jones@citystgeorges.ac.uk

Illustration from Adobe Stock subscription.

Improving police recorded crime data with natural language processing

Understanding and preventing Domestic Violence and Abuse (DVA) is hampered by long-standing data quality issues in police records. Accurate police-recorded crime data is vital for responding to DVA, yet it often contains missing values and inaccuracies.

Across all crime types, the quality of police data in England and Wales has been a concern. While there have been improvements in overall crime data recording since 2014, individual police forces still encounter difficulties adequately recording instances of DVA in police-recorded crime datasets. 

Correcting poorly recorded or missing data at this scale is non-trivial and beyond the capabilities of manual intervention alone. Fortunately, increasingly available computational methods such as text mining, machine learning and natural language processing (NLP) can augment and, to a degree, offset much of this manual processing. A growing body of interdisciplinary research shows that valuable information can be automatically extracted from unstructured data such as crime reports and case summaries.

However, automated prediction systems are not without risk, particularly when applied in sensitive domains such as policing. Data inherently reflects societal biases that poorly designed AI solutions can amplify, and in the context of DVA, these biases may stem from underreporting of marginalized demographic groups or inconsistencies in police recording practices.

In their recent study, Improving police recorded crime data for domestic violence and abuse through natural language processing, VISION researchers Dr Darren Cook and Dr Ruth Weir (City St George’s University of London) and Dr Leslie Humphries (University of Lancashire) evaluated the capability of supervised machine learning models to automatically extract victim–offender relationship information from free-text crime notes in DVA cases.

Both models demonstrated that such tools could serve as cost-effective and efficient alternatives to manual coding, accurately classifying relationship type in around four out of five cases. The incorporation of a selective classification function improved precision for the most challenging cases by abstaining from low-confidence predictions, though at the cost of reduced coverage. This research represents a meaningful step toward addressing concerns about the completeness and reliability of police-recorded crime data.
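The trade-off described above, between abstaining on low-confidence predictions and covering fewer cases, can be illustrated with a minimal sketch. It is not the study's implementation: the function names, the 0.6 threshold and the class probabilities are all invented for illustration, and accuracy on the answered cases stands in for the precision reported in the paper.

```python
from typing import Optional


def selective_predict(probs: dict[str, float], threshold: float) -> Optional[str]:
    """Return the most probable label, or None to abstain when confidence is low."""
    label, confidence = max(probs.items(), key=lambda kv: kv[1])
    return label if confidence >= threshold else None


def coverage_and_accuracy(predictions, gold):
    """Coverage is the share of cases answered; accuracy uses answered cases only."""
    answered = [(p, g) for p, g in zip(predictions, gold) if p is not None]
    coverage = len(answered) / len(gold)
    accuracy = sum(p == g for p, g in answered) / len(answered) if answered else 0.0
    return coverage, accuracy


# Invented class probabilities for four hypothetical crime notes (not study data).
model_outputs = [
    {"Partner": 0.90, "Ex-Partner": 0.07, "Family": 0.03},
    {"Partner": 0.48, "Ex-Partner": 0.45, "Family": 0.07},
    {"Partner": 0.10, "Ex-Partner": 0.15, "Family": 0.75},
    {"Partner": 0.40, "Ex-Partner": 0.55, "Family": 0.05},
]
gold_labels = ["Partner", "Ex-Partner", "Family", "Ex-Partner"]

# A threshold of 0.0 never abstains; 0.6 is an arbitrary illustrative cut-off.
for threshold in (0.0, 0.6):
    preds = [selective_predict(p, threshold) for p in model_outputs]
    cov, acc = coverage_and_accuracy(preds, gold_labels)
    print(f"threshold={threshold:.1f} coverage={cov:.2f} accuracy={acc:.2f}")
```

In this toy example, raising the threshold removes the two least certain notes from the answered set, so the remaining predictions are more reliable but half of the notes are left for manual coding.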

Recommendation

Given that police-recorded crime lost its status as an accredited official statistic in 2014 due in part to weaknesses in data collection and processing, the application of data science methods to reliably impute missing values offers a promising route to restoring confidence in these records. Police constabularies are encouraged to use the available technology and implement text mining and NLP solutions to extract valuable information from unstructured data such as crime reports and case summaries.

For further information: Please contact Darren at darren.cook@citystgeorges.ac.uk

To cite: Cook, D., Weir, R., Humphries, L. Improving police recorded crime data for domestic violence and abuse through natural language processing. Front. Sociol., 24 November 2025, Sec. Medical Sociology, Volume 10 – 2025. https://doi.org/10.3389/fsoc.2025.1686632

Photograph from Adobe Photo Stock subscription

Exploring Better Responses to Teenage Relationship Abuse

Dr Ruth Weir

Blog by Dr Ruth Weir, VISION Senior Research Fellow in Criminology

Between April and July, Gloucestershire Constabulary’s Deputy Chief Constable Katy Barrow-Grint and I led a series of three place-based roundtables in Gloucestershire, Northumbria and Oxford, as part of our ongoing research into teenage relationship abuse (TRA).

The events were supported by VISION colleagues Annie Bunce, Polina Obolenskaya and Julia Sahin, alongside Kat Hadjimatheou, Honorary Senior Fellow at the Violence and Society Centre and Senior Lecturer at University of Essex. Each roundtable brought together a wide range of local practitioners—from policing, social care, education, health, and specialist services—as well as people with lived experience of adolescent domestic abuse or teenage relationship abuse. The aim was to explore what is working locally, where the challenges lie, and what would be needed for the local area to become a national exemplar in responding to TRA.

The level of engagement was striking. Attendance was high across all three sessions, with more than 80 participants in Northumbria alone. Far from being one-off conversations, the roundtables have sparked ongoing collaboration, with local working groups already being set up to continue improving multi-agency responses to teenage relationship abuse.

The roundtables also provided a platform for Katy, VISION’s first Practitioner in Residence and now an Honorary Research Fellow, to share early findings from her national survey of police forces’ current responses to teenage relationship abuse. These insights are helping to build a clearer picture of practice across the country and will inform the next stage of research and policy development.

For further information, please contact Ruth at ruth.weir@citystgeorges.ac.uk

Dr Ruth Weir and Gloucestershire Constabulary’s Deputy Chief Constable Katy Barrow-Grint

New leadership role in National Criminology Network

Dr Darren Cook

We’re delighted to share that Dr Darren Cook, Research Fellow at VISION and the Violence & Society Centre at City St George’s UoL, has been appointed Co-Chair of the British Society of Criminology’s Policing Network.

In this role, Darren will help lead national discussions on contemporary policing research and practice, and support collaboration across the UK’s criminology and policing research community.

Darren will play a key role in shaping the Network as a space for thoughtful, impactful, and outward-facing policing research.

The team plans to foster cross-disciplinary collaboration, support early-career researchers, and ensure criminological perspectives are part of broader debates around policing, technology, and justice.

Deputy Chief Constable awarded Practitioner in Residence at Violence and Society Centre

Katy Barrow-Grint, Deputy Chief Constable, Gloucestershire

City St George’s, UoL, offers a Practitioner in Residence programme at the School for Policy and Global Affairs. It is for mid-level and senior policy practitioners within the UK and provides a platform to grow and explore their practice in partnership with the school.

Katy Barrow-Grint, Deputy Chief Constable in Gloucestershire and an executive leader in national policing, became aware of the opportunity via her work with VISION Senior Research Fellow Dr Ruth Weir on the VISION adolescent domestic abuse (ADA) research programme. Katy recently wrote a book entitled ‘Policing Domestic Abuse’ with Ruth and others, and the research identified a national gap, both academically and in policing, in how ADA is understood.

Katy’s focus will be on how police constabularies document ADA, and on developing a better understanding of the impact of the statutory age limitations on the practical work police officers do on the front line.

Forces do not routinely record ADA as the statutory guidance states that domestic abuse occurs in relationships where both parties are aged 16 or over. As a result, whilst crimes against young people will be recorded and investigated, they are not necessarily classified as domestic abuse, and it may be that child protection, domestic abuse or front-line response teams deal with the case.

Her project work will seek to understand how forces are recording such incidents, and what type of officer and role is investigating them. Katy will work with policing nationally through the National Police Chiefs’ Council (NPCC) domestic abuse and child protection portfolios and collate an up-to-date picture across all forces in England and Wales to understand how they are recording ADA and who is investigating it.

Katy is also undertaking specific localised work in Oxfordshire, Gloucestershire and Northumbria, hosting roundtables with Dr Ruth Weir and  practitioners from all relevant agencies to gain a qualitative understanding of the problems staff encounter when dealing with ADA.

Photograph from Adobe Photo Stock subscription

Natural Language Processing: Improving Data Integrity of Police Recorded Crime

By Darren Cook, Research Fellow in Natural Language Processing at City, University of London

Did you know that police recorded crime data for England and Wales are not accredited by the UK’s Office for Statistics Regulation (OSR)? This decision, made by the OSR after an audit in 2014, was due to concerns about the reliability of the underlying data.

Various factors affect the quality of police-recorded data. Differences in IT systems, personnel decision-making, and a lack of knowledge-sharing all contribute to reduced quality and consistency. Poor data integrity leads to a lack of standardisation across police forces and an increase in inaccurate or missing entries. I recently spoke about this issue at the Behavioural and Social Sciences in Security (BASS) conference at the University of St. Andrews, Scotland.

Correcting missing values is no small feat. In a dataset of 18,000 police-recorded domestic violence incidents, we found over 4,500 (25%) missing entries for a single variable. Let’s assume it takes 30 seconds to find the correct value for this variable – that’s 38 hours of effort – almost a full working week. Given that there could be as many as twenty additional variables, it would take over four months to populate all the missing values in our dataset. Expanding such effort across multiple police forces and for multiple types of crime highlights the inefficiency of human effort in this endeavour.
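The back-of-envelope estimate above can be reproduced in a few lines. The incident count, missing-entry count, 30-second lookup time and twenty additional variables come from the post; the 37.5-hour working week, the weeks-per-month conversion and the assumption that every variable is missing at the same rate are added here for illustration.

```python
# Figures taken from the post.
total_incidents = 18_000
missing_entries = 4_500
seconds_per_lookup = 30
additional_variables = 20

# Assumptions not stated in the post.
HOURS_PER_WEEK = 37.5        # assumed length of a working week
WEEKS_PER_MONTH = 52 / 12    # assumed average weeks per month

missing_share = missing_entries / total_incidents
hours_one_variable = missing_entries * seconds_per_lookup / 3600
# Assumes each additional variable has the same number of missing entries.
hours_all = hours_one_variable * additional_variables
months_all = hours_all / HOURS_PER_WEEK / WEEKS_PER_MONTH

print(f"Missing share for one variable: {missing_share:.0%}")
print(f"Hours for one variable: {hours_one_variable:.1f}")
print(f"Hours for {additional_variables} variables: {hours_all:.0f}")
print(f"Working months for {additional_variables} variables: {months_all:.1f}")
```

Under these assumptions one variable takes 37.5 hours (the post rounds this to 38) and twenty variables take roughly four and a half working months, in line with the "over four months" quoted above.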

In my talk, I outlined an automated solution to this problem using Natural Language Processing (NLP) and supervised machine learning (ML). NLP describes the processes and techniques used by machines to understand human language, and supervised ML describes how machines learn to predict an outcome based on previously seen examples. In this case, we sought to predict the relationship between the victim and offender – an important piece of demographic information vital to ensuring victim safety.

The proposed system would use a text-based crime ‘note’ completed by a police officer to classify the victim-offender relationship as either ‘Ex-Partner’, ‘Partner’, or ‘Family’ – in keeping with the distinction made by Women’s Aid. Crime notes are an often-overlooked source of information in police data, yet we found they consistently referenced the victim-offender relationship. The goal of our system, therefore, was to extract the salient information from the free-form crime notes and populate the corresponding missing value in our structured data fields.
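To make the idea of supervised classification of crime notes concrete, here is a minimal sketch. The post does not say which model was used, so a small multinomial Naive Bayes classifier written with the standard library stands in for it; the class name `TinyNaiveBayes` and every training note are invented for illustration and are not police data.

```python
import math
from collections import Counter, defaultdict


def tokenize(text: str) -> list[str]:
    """Lower-case and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


class TinyNaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (illustrative stand-in)."""

    def fit(self, notes, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for note, label in zip(notes, labels):
            self.word_counts[label].update(tokenize(note))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, note: str) -> str:
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # class prior
            for word in tokenize(note):
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            scores[label] = score
        return max(scores, key=scores.get)


# Invented example notes with hand-assigned labels (previously seen examples).
train_notes = [
    "victim reports her husband pushed her during an argument",
    "caller states boyfriend threw phone at her",
    "ex boyfriend turned up at address after they separated",
    "former partner sending repeated messages since break up",
    "son demanded money from his mother and damaged door",
    "brother assaulted sister at family home",
]
train_labels = ["Partner", "Partner", "Ex-Partner", "Ex-Partner", "Family", "Family"]

model = TinyNaiveBayes().fit(train_notes, train_labels)
print(model.predict("her ex boyfriend keeps calling since they separated"))
print(model.predict("mother reports son damaged door"))
```

The predicted label would then be written into the missing structured field. Note that words such as "boyfriend" and "partner" appear in both the ‘Partner’ and ‘Ex-Partner’ examples, so even this toy setup has some vocabulary shared between the two classes.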

Existing solutions based on keywords and syntax parsing are used by multiple UK police forces. While effective, they require manual effort to create, update, and maintain the dictionaries, and they don’t generalise well. Our supervised ML system, however, can be automatically updated and monitored to maintain accuracy.
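For contrast, a keyword-based approach might look something like the sketch below. The dictionary contents and the function name `keyword_label` are hypothetical and are not taken from any police force's system; the point is only to show why such dictionaries need hand maintenance.

```python
import re
from typing import Optional

# Hypothetical hand-maintained dictionary of relationship keywords.
KEYWORDS = {
    "Ex-Partner": ["ex-partner", "ex partner", "ex-boyfriend", "ex-girlfriend", "former partner"],
    "Partner": ["husband", "wife", "boyfriend", "girlfriend", "partner"],
    "Family": ["mother", "father", "son", "daughter", "brother", "sister"],
}


def keyword_label(note: str) -> Optional[str]:
    """Return the first label whose keyword appears as a whole word, else None."""
    text = note.lower()
    # "Ex-Partner" is checked first so "ex-boyfriend" is not matched as "boyfriend".
    for label in ("Ex-Partner", "Partner", "Family"):
        for keyword in KEYWORDS[label]:
            if re.search(r"\b" + re.escape(keyword) + r"\b", text):
                return label
    return None


print(keyword_label("Victim states her ex-boyfriend attended the address"))
print(keyword_label("Argument between brother and sister"))
print(keyword_label("Suspect is the victim's fiancé"))
```

The last note is left unlabelled because "fiancé" is not in the dictionary: every new term, spelling or abbreviation has to be added by hand, and the order in which labels are checked also has to be managed manually.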

When tested, our system achieved 80% accuracy, correctly labelling the relationship type in four out of five cases. In comparison, humans performed this task with approximately 82% accuracy – an arguably negligible difference. Moreover, once trained, our system could classify the entire test set (over 1,000 crime notes) in just sixteen seconds.

However, we noted some limitations, the biggest of which was a high linguistic overlap in crime notes between ‘Ex-Partner’ and ‘Partner’ that caused several misclassifications. We believe more advanced language models (i.e., word embeddings) will improve discrimination between these relationships.

We also discovered a potential prediction bias against minorities. Although victim ethnicity wasn’t included in our training setup, we observed reduced accuracy for Black or Asian victims. The source and extent of this bias are subjects of ongoing research.
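One simple way to surface this kind of disparity is to compare accuracy across groups on held-out data. The sketch below is an assumption about how such a check could be done, not the analysis used in this work; the function name, group names and records are invented.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """records: iterable of (group, true_label, predicted_label) tuples."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, truth, predicted in records:
        total[group] += 1
        correct[group] += int(truth == predicted)
    return {group: correct[group] / total[group] for group in total}


# Invented evaluation records; the group is used only for auditing,
# it is not an input to the classifier.
records = [
    ("group_a", "Partner", "Partner"),
    ("group_a", "Ex-Partner", "Ex-Partner"),
    ("group_a", "Family", "Family"),
    ("group_a", "Partner", "Ex-Partner"),
    ("group_b", "Partner", "Ex-Partner"),
    ("group_b", "Family", "Family"),
]

for group, accuracy in accuracy_by_group(records).items():
    print(f"{group}: accuracy={accuracy:.2f}")
```

A gap between groups in a check like this does not explain where the bias comes from, but it flags where further investigation is needed.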

Our findings highlight the promise of automated solutions but serve as a cautionary tale against assuming these systems can be applied carte blanche without careful consideration of their limitations. Several outstanding questions remain. Is a system with 80% accuracy good enough? Is it better to leave missing values rather than predict incorrect ones? Incorrectly identifying a perpetrator as a current partner rather than an ex-partner could significantly impact the victim’s safety. Additionally, a model biased against certain ethnicities risks overlooking the specific needs of minority groups.

The conference sparked lively and engaging conversation about many of these issues, as well as the role that automation can play within the social sciences more broadly. A research article describing these results in full is the focus of ongoing work, and the presentation slides are available below as a download.

For further information please contact Darren at darren.cook@city.ac.uk or via LinkedIn @darrencook1986

Dr Darren Cook, An application of Natural Language Processing (NLP) to free-form Police crime notes – 1 download

Photo by Markus Spiske on Unsplash

Differentiating risk: The association between relationship type and risk of repeat victimization of domestic abuse

Much of the literature on domestic abuse focuses on those in intimate partner relationships or ex-partners. In the UK, however, the Home Office definition also includes those in familial relationships. The Domestic Abuse, Stalking, and Harassment and Honour-Based Violence Risk Assessment assumes homogeneous risk factors across all relationships.

This paper, Differentiating risk: The association between relationship type and risk of repeat victimization of domestic abuse, therefore examines the risk factors for repeat victimization of domestic abuse by relationship type between the victim and perpetrator in a UK police force.

Using police-recorded domestic abuse incident and crime data, a logistic regression model found that the most similar repeat victimization risk profiles for 14,519 victims were amongst partners and ex-partners, with both relationships demonstrating the greatest degree of gender asymmetry, compared with other familial relationships. Physical violence was the strongest predictor of repeat victimization and was a statistically significant predictor for ex-partners, partners, and all familial relationships. Coercive behaviour was also a significant predictor for all relationships apart from partners, but not at the same magnitude as physical abuse.

Recognizing the difference in risk by relationship type may assist the police in deciding the most appropriate response and interventions to reduce the risk of further harm. 

 For further information please see: https://academic.oup.com/policing/article/doi/10.1093/police/paae024/7641219?login=false

Or contact Ruth at ruth.weir@city.ac.uk  

Photo from licensed Adobe Stock library

Better utilisation of healthcare data to measure violence

Despite violence being recognised as a harm to health, it is not consistently or adequately captured in healthcare data systems. Administrative health records could be a valuable source for researching violence and understanding the needs of victims, but such datasets are currently underutilised for this purpose.

VISION researcher Dr Anastasia Fadeeva, with input from Dr Estela Capelas Barbosa, Professor Sally McManus and Public Health Wales’ Dr Alex Walker, examined violence indicators in emergency care, primary care, and linked healthcare datasets in the paper Using Primary Care and Emergency Department datasets for Researching Violence Victimisation in the UK.

Anastasia worked with Hospital Episode Statistics Accident and Emergency (HES A&E) and the Emergency Care Data Set (ECDS) while on secondment at the Department of Health and Social Care (DHSC), with helpful review provided by researchers in the department.

Among the datasets reviewed in the study, the South Wales Violence Surveillance dataset (police and emergency department data linked by Public Health Wales) had the most detail about violent acts and their contexts, while the Clinical Practice Research Datalink (CPRD) provided the most extensive range of socioeconomic factors about patients and extensive linkage with other datasets. Currently, detailed safeguarding information is routinely removed from the ECDS extracts provided to researchers, limiting its utility for violence research. In the HES A&E, only physical violence was consistently recorded.

Addressing these limitations and increasing awareness of the potential utility of health administrative datasets to violence-related research has the potential to provide insight into the health service needs of victims.

For further information please see: Using Primary Care and Emergency Department Datasets for Researching Violence Victimisation in the UK: A Methodological Review of Four Sources, Social Sciences (mdpi.com)

Or contact Dr Anastasia Fadeeva at anastasa.fadeeva@city.ac.uk

Photo from licensed Adobe Stock library

VISION Adolescent Domestic Abuse conference

This event is in the past.

If registered, please enter through the main entrance in the University Building, across from Northampton Square, a green space with a gazebo. There is also a silver sculpture in front of University Building.

Only those who registered will be able to enter the conference room.

To register please see: VISION and VASC Adolescent Domestic Abuse conference

The UK Prevention Research Partnership Violence, Health & Society (VISION) consortium and the Violence and Society Centre at City, University of London, are pleased to announce the Adolescent Domestic Abuse conference.

Thursday 18th April 2024, 10:00 – 17:00 followed by a reception 
Oliver Thompson Lecture Theatre (Tait Bldg), City, University of London, EC1V 0HB

Adolescent domestic abuse, which includes physical, emotional, and/or sexual abuse that occurs between young people who are, or were, dating, is often overlooked in research, policy and practice. The current definition of domestic abuse leaves those aged under 16 in teenage relationships falling into the gap between child protection procedures and adult-focused domestic abuse policy. 

The conference brings together academics, practitioners, and policy makers to share existing research, policy and practice.

Registration is required and free. This is an in person conference only and catering will be provided. If you cannot attend but would like the slides, please contact the email listed below.

The programme: 

  • 9:30 – 10:00 Registration & refreshments 
  • 10:00 – 10:20 Welcome & setting the scene, Dr Ruth Weir, Violence and Society Centre, City, University of London and Katy Barrow-Grint, Assistant Chief Constable, Thames Valley Police
  • 10:20 – 10:40 Introductory Speaker, Louisa Rolfe OBE, Metropolitan Police and National Police Chiefs’ Council lead for Domestic Abuse
  • 10:40 – 11:00 Rapid evidence review on domestic abuse in teenage relationships, Flavia Lamarre, and Dr Ruth Weir, City, University of London
  • 11:00 – 11:30 Learning from the lived experience, SafeLives Changemakers
  • 11:30 – 12:00 Researching abuse within teenage relationships: A critique of a decade’s work and what we could do better, Professor Christine Barter, Co-Director of the Connect Centre for International Research on Interpersonal Violence and Harm, University of Central Lancashire 
  • 12:00 – 13:00 Lunch
  • 13:00 – 14:20 Panel 1: Teenage relationships and abuse: What the research says, chaired by Professor Sally McManus, Director of the Violence and Society Centre and Deputy Director of the VISION research project
  • Panel 1: Step up, Speak Out: Amplifying young people’s voices in understanding and responding to adolescent domestic abuse, Janelle Rabe, Centre for Research into Violence and Abuse, Durham University
  • Panel 1: ‘In practice it can be so much harder’: Young people’s approaches and experiences of supporting friends experiencing domestic abuse, Jen Daw and Sally Steadman South, SafeLives
  • Panel 1: Healthy relationships: children and young people attitudes and influences, Hannah Williams and Sarah Davidge, Women’s Aid
  • Panel 1: Intimate partner femicide against young women, Dr Shilan Caman, Karolinska Institutet, Sweden
  • 14:20 – 14:35 Break
  • 14:35 – 15:35 Panel 2: Sexual violence in teenage relationships, chaired by Katy Barrow-Grint, Thames Valley Police
  • Panel 2: “Always the rule that you can’t say no”: Adolescent women’s experiences of sexual violence in dating relationships – Dr Kirsty McGregor, Loughborough University 
  • Panel 2: Empowering Youth: Addressing Online Pornography and Adolescent Domestic Abuse – Insights from the CONSENT Project – Berta Vall, Elena Lloberas and Jaume Grané, Blanquerna, Barcelona, Spain and The European Network for Work with Perpetrators of Domestic Violence, Berlin, Germany
  • Panel 2: Image-Based Sexual Abuse as a Facet of Domestic Abuse in Young People’s Relationships – Dr Alishya Dhir, Durham University
  • 15:35 – 15:50 Break
  • 15:50 – 16:50 Panel 3: Specialist services and local government, chaired by Dr Olumide Adisa, University of Suffolk
  • Panel 3: The role and value of Early Intervention Workers in supporting children and young people aged 11–18 in a domestic abuse service context – Elaha Walizadeh and Leonor Capelier, Refuge 
  • Panel 3: Prevention, Identification, Intervention and Protection: Learning on teenage domestic abuse from a multi-agency model in the London Borough of Islington – Aisling Barker, Islington Borough Council
  • Panel 3: Tackling adolescent domestic abuse in Lambeth – Rose Parker, Erika Pavely, Ariana Markowitz, and Siofra Peeren, Lambeth Health Inequalities Research and Evaluation Network 
  • 16:50 – 17:00 Closing remarks and next steps
  • 17:00 onwards Drinks reception. Conference attendees are invited to a drinks reception in the Oliver Thompson foyer

The abstracts

The abstracts and information on the poster presentations and stands are below for downloading.

For further information and any questions, please contact VISION at VISION_Management_Team@city.ac.uk

Photo by Tim Mossholder on Unsplash