Archives

Natural Language Processing: Improving Data Integrity of Police Recorded Crime

    By Darren Cook, Research Fellow in Natural Language Processing at City, University of London

    Did you know that police recorded crime data for England and Wales are not accredited by the UK’s Office for Statistics Regulation (OSR)? This decision, made by the OSR after an audit in 2014, was due to concerns about the reliability of the underlying data.

    Various factors affect the quality of police-recorded data. Differences in IT systems, personnel decision-making, and a lack of knowledge-sharing all contribute to reduced quality and consistency. Poor data integrity leads to a lack of standardisation across police forces and an increase in inaccurate or missing entries. I recently spoke about this issue at the Behavioural and Social Sciences in Security (BASS) conference at the University of St. Andrews, Scotland.

    Correcting missing values is no small feat. In a dataset of 18,000 police-recorded domestic violence incidents, we found over 4,500 (25%) missing entries for a single variable. If finding the correct value takes 30 seconds per entry, that is 38 hours of effort – almost a full working week. Given that there could be as many as twenty additional variables, it would take over four months to populate all the missing values in our dataset. Scaling that effort across multiple police forces and multiple types of crime shows how inefficient manual correction is.
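
    The effort estimate can be reproduced with a few lines of arithmetic. This is only a back-of-envelope sketch: the 30-second lookup, the 4,500 missing entries and the twenty extra variables come from the text, while the 37.5-hour working week, the 52/12 weeks per month, and the idea that every variable has the same number of missing entries are assumptions made here for illustration.

```python
# Back-of-envelope estimate of manual effort, using the figures quoted in the post.
# Assumptions (not from the post): a 37.5-hour working week, 52/12 weeks per month,
# and the same number of missing entries for every variable.

SECONDS_PER_LOOKUP = 30
MISSING_PER_VARIABLE = 4_500
TOTAL_INCIDENTS = 18_000
VARIABLES = 1 + 20           # the first variable plus twenty additional ones
HOURS_PER_WEEK = 37.5        # assumption
WEEKS_PER_MONTH = 52 / 12    # assumption

missing_share = MISSING_PER_VARIABLE / TOTAL_INCIDENTS
hours_one_variable = MISSING_PER_VARIABLE * SECONDS_PER_LOOKUP / 3600
hours_all_variables = hours_one_variable * VARIABLES
months_all_variables = hours_all_variables / HOURS_PER_WEEK / WEEKS_PER_MONTH

print(f"Missing share: {missing_share:.0%}")
print(f"One variable: {hours_one_variable:.1f} hours")
print(f"All variables: {hours_all_variables:.1f} hours, "
      f"about {months_all_variables:.1f} working months")
```

    Under these assumptions one variable costs 37.5 hours and all twenty-one cost a little under five working months, which is consistent with the "over four months" figure above.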

    In my talk, I outlined an automated solution to this problem using Natural Language Processing (NLP) and supervised machine learning (ML). NLP describes the processes and techniques used by machines to understand human language, and supervised ML describes how machines learn to predict an outcome based on previously seen examples. In this case, we sought to predict the relationship between the victim and offender – an important piece of demographic information vital to ensuring victim safety.

    The proposed system would use a text-based crime ‘note’ completed by a police officer to classify the victim-offender relationship as ‘Ex-Partner’, ‘Partner’, or ‘Family’ – in keeping with the distinction made by Women’s Aid. Crime notes are an often-overlooked source of information in police data, yet we found they consistently referenced the victim-offender relationship. The goal of our system, therefore, was to extract the salient information from the free-form crime notes and populate the corresponding missing value in our structured data fields.
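
    To make the idea concrete, here is a minimal sketch of supervised text classification used to fill a missing field. It is not the system described in this post, whose model and features are not specified here: it swaps in a small multinomial Naive Bayes classifier over word counts, written in plain Python. The class name NaiveBayesNotes, the record layout, and all crime notes below are invented for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the note and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesNotes:
    """Multinomial Naive Bayes over bag-of-words counts, with add-one smoothing."""

    def fit(self, notes, labels):
        self.label_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for note, label in zip(notes, labels):
            self.word_counts[label].update(tokenize(note))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, note):
        # Words never seen in training carry no information, so drop them.
        tokens = [t for t in tokenize(note) if t in self.vocab]
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            score += sum(math.log((counts[t] + 1) / denom) for t in tokens)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented training notes; real crime notes are longer and far messier.
train_notes = [
    "victim reports her ex boyfriend sent threatening messages after they separated",
    "ex husband attended the address and refused to leave",
    "victim states her husband pushed her during an argument at home",
    "argument between victim and her boyfriend who lives with her",
    "victim assaulted by her brother at the family home",
    "son damaged property belonging to his mother",
]
train_labels = ["Ex-Partner", "Ex-Partner", "Partner", "Partner", "Family", "Family"]

model = NaiveBayesNotes().fit(train_notes, train_labels)

# Hypothetical structured records: only the missing relationship is predicted.
records = [
    {"note": "brother damaged property at the family home", "relationship": None},
    {"note": "husband shouted at victim", "relationship": "Partner"},
]
for record in records:
    if record["relationship"] is None:
        record["relationship"] = model.predict(record["note"])

print(records)
```

    Values that an officer already recorded are left untouched; the classifier is only consulted where the structured field is empty.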

    Existing solutions based on keywords and syntax parsing are used by multiple UK police forces. While effective, they require manual effort to create, update, and maintain the dictionaries, and they don’t generalise well. Our supervised ML system, however, can be automatically updated and monitored to maintain accuracy.
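
    For comparison, a keyword approach might look like the sketch below. The dictionary and the function classify_by_keyword are hypothetical; the dictionaries actually used by police forces are not described here. The example is meant to show why such rules need constant maintenance: any phrasing that is not in the dictionary is either missed or mislabelled.

```python
import re

# Hypothetical keyword dictionary, checked in this order so that
# "ex boyfriend" is tried before the bare word "boyfriend".
KEYWORDS = {
    "Ex-Partner": ["ex partner", "ex boyfriend", "ex girlfriend", "ex husband", "ex wife"],
    "Partner": ["partner", "boyfriend", "girlfriend", "husband", "wife"],
    "Family": ["mother", "father", "brother", "sister", "son", "daughter"],
}


def classify_by_keyword(note):
    """Return the first label whose keyword appears as a whole phrase, else None."""
    # Replace punctuation and hyphens with spaces so "ex-boyfriend" matches "ex boyfriend".
    text = re.sub(r"[^a-z]+", " ", note.lower())
    for label, phrases in KEYWORDS.items():
        for phrase in phrases:
            if re.search(rf"\b{re.escape(phrase)}\b", text):
                return label
    return None


print(classify_by_keyword("Victim's ex-boyfriend sent threatening texts"))
print(classify_by_keyword("Suspect is the victim's former husband"))
print(classify_by_keyword("Victim assaulted by her fiance"))
```

    The first note is labelled ‘Ex-Partner’ as intended. The second is labelled ‘Partner’ because "former husband" is not in the dictionary, and the third returns nothing at all. Each gap has to be patched by hand.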

    When tested, our system achieved 80% accuracy, correctly labelling the relationship type in four out of five cases. In comparison, humans performed this task with approximately 82% accuracy – an arguably negligible difference. Moreover, once trained, our system could classify the entire test set (over 1,000 crime notes) in just sixteen seconds.

    However, we noted some limitations, the biggest of which was a high linguistic overlap in crime notes between ‘Ex-Partner’ and ‘Partner’ that caused several misclassifications. We believe more advanced language models (e.g., word embeddings) will improve discrimination between these relationships.
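
    One common way to see this kind of overlap is to count how often each true label is predicted as each other label. The sketch below does that in plain Python; the labels and predictions are invented and do not reproduce our results.

```python
from collections import Counter

LABELS = ["Ex-Partner", "Partner", "Family"]


def confusion_counts(true_labels, predicted_labels):
    """Count how often each (true label, predicted label) pair occurs."""
    return Counter(zip(true_labels, predicted_labels))


# Invented labels for illustration only.
true_labels = ["Ex-Partner", "Ex-Partner", "Ex-Partner", "Partner", "Partner", "Family", "Family"]
predicted = ["Ex-Partner", "Partner", "Partner", "Partner", "Ex-Partner", "Family", "Family"]

counts = confusion_counts(true_labels, predicted)

# Rows are true labels, columns are predicted labels, in the order of LABELS.
for truth in LABELS:
    row = "  ".join(f"{counts[(truth, pred)]:>3}" for pred in LABELS)
    print(f"{truth:<11}{row}")
```

    In this toy data every error falls in the ‘Ex-Partner’ and ‘Partner’ cells, while ‘Family’ is never confused with either, which is the pattern described above.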

    We also discovered a potential prediction bias against minorities. Although victim ethnicity wasn’t included in our training setup, we observed reduced accuracy for Black or Asian victims. The source and extent of this bias are subjects of ongoing research.
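
    A simple way to check for this kind of disparity is to compute accuracy separately for each group, even when the group variable was never a model input. The sketch below assumes evaluation rows of the form (group, true label, predicted label); the groups "A" and "B" and every row are placeholders, not our data.

```python
from collections import defaultdict


def accuracy_by_group(rows):
    """rows: iterable of (group, true_label, predicted_label) tuples."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, truth, predicted in rows:
        total[group] += 1
        correct[group] += int(truth == predicted)
    return {group: correct[group] / total[group] for group in total}


# Invented evaluation rows for illustration only.
rows = [
    ("A", "Partner", "Partner"),
    ("A", "Ex-Partner", "Ex-Partner"),
    ("A", "Family", "Family"),
    ("A", "Ex-Partner", "Partner"),
    ("B", "Partner", "Partner"),
    ("B", "Ex-Partner", "Partner"),
    ("B", "Family", "Family"),
    ("B", "Partner", "Ex-Partner"),
]

per_group = accuracy_by_group(rows)
gap = max(per_group.values()) - min(per_group.values())
print(per_group, gap)
```

    A gap on its own does not explain where the disparity comes from; it only flags that overall accuracy is hiding different error rates for different groups.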

    Our findings highlight the promise of automated solutions but serve as a cautionary tale against assuming these systems can be applied wholesale, without careful consideration of their limitations. Several outstanding questions remain. Is a system with 80% accuracy good enough? Is it better to leave missing values rather than predict incorrect ones? Incorrectly identifying a perpetrator as a current partner rather than an ex-partner could significantly impact the victim’s safety. Additionally, a model biased against certain ethnicities risks overlooking the specific needs of minority groups.

    The conference sparked lively and engaging conversation about many of these issues, as well as the role that automation can play within the social sciences more broadly. A research article describing these results in full is the focus of ongoing work, and the presentation slides are available below as a download.

    For further information please contact Darren at darren.cook@city.ac.uk or via LinkedIn @darrencook1986

    Dr Darren Cook, An application of Natural Language Processing (NLP) to free-form Police crime notes – 1 download

    Photo by Markus Spiske on Unsplash

    Reflections on producing evidence syntheses on violence and abuse

      The VISION systematic review team, Dr Natalia Lewis, Dr Elizabeth Cook, Dr Jessica Corsi, Dr Sophie Carlisle and Dr Annie Bunce, presented at the London Evidence Syntheses and Research Use Seminars on 17 July 2024. The event was an initiative jointly hosted by the EPPI Centre (at UCL) and the Centre for Evaluation at London School of Hygiene and Tropical Medicine (LSHTM).

      The team presented under the collective theme Producing evidence syntheses on violence and abuse: reflections on the disciplinary variations and practicalities, with the aim of prompting conversation about how systematic review methodologies can be adapted across disciplines. Dr Natalia Lewis (Systematic Review Lead, University of Bristol) introduced the session, describing the emergence of systematic reviews at the top of the ‘hierarchy of evidence’ that is often referenced in evidence-based medicine. Accompanying systematic reviews are a range of reporting standards, tools and guidance stipulating recommended practices for conducting reviews. However, as the team discussed in their presentations, such standardised frameworks and approaches do not provide space for reflection on the process and implications of adapting these methodologies to evidence on violence and abuse.

      The seminar included the following presentations:

      • Dr Elizabeth Cook (City, University of London): Evidence synthesis in a global context: A systematic review of sex/gender disaggregated homicide.
      • Dr Jessica Corsi (City, University of London): Evidence synthesis on legal records: challenges and adaptations.
      • Dr Sophie Carlisle (Health Innovation East Midlands): Evidence synthesis in the context of UK domestic and sexual violence services: Involving professional stakeholders.

      The presentations are available to view and download below.

      Dr Natalia Lewis, Producing evidence syntheses on violence and abuse: reflections on the disciplinary variations and practicalities

      Dr Sophie Carlisle, Evidence synthesis in the context of UK domestic and sexual violence services: involving professional stakeholders

      Dr Elizabeth Cook, Evidence synthesis in a global context: A systematic review of sex/gender disaggregated homicide

      Dr Jessica Corsi, Evidence synthesis on legal records: Challenges and adaptations

      The seminar was held in hybrid format. The talks were recorded and are available through the following link: https://eppi.ioe.ac.uk/cms/Coursesseminars/Previousseminarsandevents/tabid/3317/Default.aspx

      Presentations from the 2024 VISION Annual Conference

        The presentations from the 3rd VISION annual conference are now available for downloading.

        The event was held at King’s College London, Strand campus, on 11 June. The theme was Violence prevention in research and policy: Bridging silos. The keynote speakers, Dr Claudia Garcia-Moreno (World Health Organization) and Professor Katrin Hohl (City, UoL), considered the changes needed for effective violence prevention from the perspectives of health and justice. Three symposiums highlighted interdisciplinary research from the VISION consortium and partners on:

        – Violence against older people: Challenges in research and policy;

        – Learning across statutory review practices: Origins, ambitions and future directions; and

        – Responding to experiences and expressions of interpersonal violence in the workplace

        Approximately 80 academics, central and local government officials, practitioners, and voluntary and community sector organisations attended from a range of health and crime / justice disciplines.

        All the slides that could be shared are available below. Please feel free to download.

        Photo caption: Symposium 3, ‘Responding to experiences and expressions of interpersonal violence in the workplace’. From left to right: Chair, Dr Olumide Adisa (University of Suffolk) and Panellists Dr Vanessa Gash (City, UoL), Dr Alison Gregory (Alison Gregory Consulting), Catherine Buglass (Employers’ Initiative on Domestic Abuse) and Dr Niels Blom (City, UoL)

        Professor Gene Feder, VISION Director – Welcome – 1 download

        Keynote Speaker, Dr Claudia Garcia-Moreno – Violence against women: From research to policy and action – 1 download

        Symposium 1 – Violence against older people: Challenges in research and policy – 4 downloads (Hourglass, Office for National Statistics, Public Health Wales & VISION)

        Symposium 2 – Learning across statutory review practices: Origins, ambitions and future directions – 1 download

        Symposium 3 – Responding to experiences and expressions of interpersonal violence in the workplace – 3 downloads (Employers’ Initiative on Domestic Abuse, and 2 from VISION)

        Presentations from 2nd VISION annual conference now available

          We are pleased to provide the presentations from our 2nd annual conference held 21 September 2023 at Mary Ward House in London. 

          The theme was Responding to violence across the life course. Sessions included presentations on childhood and teenage years; working life, poverty & economic impacts; older years; and social inclusion in policy and research. The conference concluded with a panel discussion on violence and complex systems.

          Seventy-seven academics, central and local government officials, practitioners, and voluntary and community sector organisations attended from a range of health and crime / justice disciplines.

          Please feel free to download the presentations below. Each session is one download.

          Photo caption: Dr Ladan Hashemi, Senior Research Fellow at VISION, answers a question after her presentation, ‘Adverse Childhood Experiences and Childhood Obesity: Exploring Potential Mediating and Moderating Factors’

          Download the Welcome slides

          Download the slides from Session 1 – Childhood and teenage years

          Download the slides from Session 2 – Social inclusion in policy & research

          Download the slides from Session 3 – Working life, poverty and economic impacts

          Download the slides from Session 4 – Older people

          Presentations available from the 1st VISION Annual Conference

            The September 2022 Annual Conference marked the first year of the UKPRP Violence, Health and Society consortium. Participants, including the VISION researchers, Third Sector organisations, government, and academics, reflected on the first year and looked forward to the next four years. VISION presentations covered the entire research project: Health & Health Services, Crime & Justice, Data Integration, and Ethnicity & Intersectionality. These presentations provided highlights of completed research and thoughts on next steps. For further information, please see the slide show directly below or feel free to download the file underneath the slide show.

            Photo credit: Sincerely Media / Unsplash.com