
Natural Language Processing: Improving Data Integrity of Police Recorded Crime

By Darren Cook, Research Fellow in Natural Language Processing at City, University of London

Did you know that police recorded crime data for England and Wales are not accredited by the UK’s Office for Statistics Regulation (OSR)? This decision, made by the OSR after an audit in 2014, was due to concerns about the reliability of the underlying data.

Various factors affect the quality of police-recorded data. Differences in IT systems, personnel decision-making, and a lack of knowledge-sharing all contribute to reduced quality and consistency. Poor data integrity leads to a lack of standardisation across police forces and an increase in inaccurate or missing entries. I recently spoke about this issue at the Behavioural and Social Sciences in Security (BASS) conference at the University of St. Andrews, Scotland.

Correcting missing values is no small feat. In a dataset of 18,000 police-recorded domestic violence incidents, we found over 4,500 (25%) missing entries for a single variable. Let’s assume it takes 30 seconds to find the correct value for this variable – that’s 38 hours of effort – almost a full working week. Given that there could be as many as twenty additional variables, it would take over four months to populate all the missing values in our dataset. Expanding such effort across multiple police forces and for multiple types of crime highlights the inefficiency of human effort in this endeavour.
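The estimate above can be checked with a few lines of arithmetic. This is only a rough sketch: the 37.5-hour working week and the assumption that every additional variable has the same number of missing entries are mine, not figures from the dataset.

```python
# Figures taken from the text
SECONDS_PER_LOOKUP = 30        # time to find one correct value
MISSING_PER_VARIABLE = 4_500   # missing entries for a single variable
TOTAL_VARIABLES = 1 + 20       # the first variable plus twenty additional ones

# Assumption (not from the text): a 37.5-hour full-time working week
HOURS_PER_WORKING_WEEK = 37.5

# Effort for one variable, in hours
hours_one_variable = MISSING_PER_VARIABLE * SECONDS_PER_LOOKUP / 3600

# Assumption: every variable has the same number of missing entries
hours_all_variables = hours_one_variable * TOTAL_VARIABLES
weeks_all_variables = hours_all_variables / HOURS_PER_WORKING_WEEK

print(f"One variable: {hours_one_variable} hours")
print(f"All variables: {hours_all_variables} hours, "
      f"or {weeks_all_variables} working weeks")
```

Under these assumptions, one variable takes 37.5 hours (rounded to 38 in the text) and all twenty-one variables take 21 working weeks, which is consistent with "over four months".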

In my talk, I outlined an automated solution to this problem using Natural Language Processing (NLP) and supervised machine learning (ML). NLP describes the processes and techniques used by machines to understand human language, and supervised ML describes how machines learn to predict an outcome based on previously seen examples. In this case, we sought to predict the relationship between the victim and offender – an important piece of demographic information vital to ensuring victim safety.

The proposed system would use a text-based crime ‘note’ completed by a police officer to classify the victim-offender relationship as either ‘Ex-Partner’, ‘Partner’, or ‘Family’ – in keeping with the distinction made by Women’s Aid. Crime notes are an often-overlooked source of information in police data, yet we found they consistently referenced the victim-offender relationship. The goal of our system, therefore, was to extract the salient information from the free-form crime notes and populate the corresponding missing value in our structured data fields.
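To give a feel for what supervised classification of crime notes involves, here is a minimal sketch in plain Python. It is not the system described in this post: the algorithm (a multinomial naive Bayes classifier with add-one smoothing) is a stand-in chosen for brevity, and the six training notes are invented for illustration and are not real police data.

```python
import math
import re
from collections import Counter, defaultdict

# Invented example notes (NOT real police data), each with a relationship label
TRAINING_NOTES = [
    ("victim states her ex-boyfriend turned up at the address uninvited", "Ex-Partner"),
    ("suspect is the former husband of the victim, separated last year", "Ex-Partner"),
    ("argument between victim and her husband at their shared home", "Partner"),
    ("victim lives with her boyfriend who became aggressive", "Partner"),
    ("victim reports her adult son demanded money", "Family"),
    ("dispute between the victim and his brother", "Family"),
]

def tokenize(note):
    """Lower-case the note and split it into word tokens."""
    return re.findall(r"[a-z']+", note.lower())

def train(labelled_notes):
    """Count how often each word appears under each label."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for note, label in labelled_notes:
        label_counts[label] += 1
        word_counts[label].update(tokenize(note))
    vocabulary = {word for counts in word_counts.values() for word in counts}
    return word_counts, label_counts, vocabulary

def predict(note, model):
    """Return the label with the highest naive Bayes log-probability."""
    word_counts, label_counts, vocabulary = model
    total_notes = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total_notes)
        total_words = sum(word_counts[label].values())
        for word in tokenize(note):
            if word not in vocabulary:
                continue  # ignore words never seen in training
            # Add-one smoothing so unseen word/label pairs are not zero
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train(TRAINING_NOTES)
print(predict("caller reports her brother shouting outside", model))
print(predict("her ex partner came to the address", model))
```

The predicted label could then be written into the missing structured field. A real system would need far more training data, a held-out test set, and the kind of accuracy and bias checks discussed later in this post.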

Existing solutions based on keywords and syntax parsing are used by multiple UK police forces. While effective, they require manual effort to create, update, and maintain the dictionaries, and they don’t generalise well. Our supervised ML system, however, can be automatically updated and monitored to maintain accuracy.
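For comparison, a keyword-based approach can be sketched as follows. The dictionary below is hypothetical and far smaller than anything a police force would use; it is only meant to show why such dictionaries need constant manual upkeep.

```python
import re

# Hypothetical keyword dictionary. Order matters: 'Ex-Partner' terms are
# checked first because "ex-boyfriend" also contains the word "boyfriend".
KEYWORDS = {
    "Ex-Partner": ["ex-partner", "ex-boyfriend", "ex-girlfriend",
                   "ex-husband", "ex-wife", "former partner"],
    "Partner": ["partner", "boyfriend", "girlfriend", "husband", "wife"],
    "Family": ["mother", "father", "son", "daughter", "brother", "sister"],
}

def classify_by_keyword(note):
    """Return the first label whose keyword appears in the note, else None."""
    text = note.lower()
    for label, terms in KEYWORDS.items():
        for term in terms:
            if re.search(r"\b" + re.escape(term) + r"\b", text):
                return label
    return None

print(classify_by_keyword("Victim's ex-husband attended the address"))
print(classify_by_keyword("argument with her former boyfriend"))
print(classify_by_keyword("neighbour dispute"))
```

The second note is labelled ‘Partner’ even though it describes an ex-partner, because "former boyfriend" is not in the dictionary. Every such gap has to be found and patched by hand, which is the maintenance burden described above.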

When tested, our system achieved 80% accuracy, correctly labelling the relationship type in four out of five cases. In comparison, humans performed this task with approximately 82% accuracy – an arguably negligible difference. Moreover, once trained, our system could classify the entire test set (over 1,000 crime notes) in just sixteen seconds.

However, we noted some limitations, the biggest of which was a high linguistic overlap in crime notes between ‘Ex-Partner’ and ‘Partner’ that caused several misclassifications. We believe more advanced language models (e.g., word embeddings) will improve discrimination between these relationships.

We also discovered a potential prediction bias against minorities. Although victim ethnicity wasn’t included in our training setup, we observed reduced accuracy for Black or Asian victims. The source and extent of this bias are subjects of ongoing research.
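One simple way to look for this kind of bias is to compare accuracy across groups on a labelled test set. The sketch below assumes each test record carries a true label, a predicted label, and a group attribute used only for the audit and not for training. The records and the group names "A" and "B" are made up for illustration and do not reflect our results.

```python
from collections import defaultdict

def accuracy_by_group(records):
    """Compute the share of correct predictions within each group.

    Each record is a (true_label, predicted_label, group) tuple.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for true_label, predicted_label, group in records:
        total[group] += 1
        correct[group] += int(true_label == predicted_label)
    return {group: correct[group] / total[group] for group in total}

# Made-up records for illustration only
example_records = [
    ("Partner", "Partner", "A"),
    ("Ex-Partner", "Ex-Partner", "A"),
    ("Family", "Family", "A"),
    ("Partner", "Ex-Partner", "A"),
    ("Partner", "Partner", "B"),
    ("Ex-Partner", "Partner", "B"),
]

print(accuracy_by_group(example_records))
```

A gap between groups on a small sample may simply be noise, so in practice the comparison would need enough records per group and an appropriate statistical test before drawing conclusions.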

Our findings highlight the promise of automated solutions but serve as a cautionary tale against assuming these systems can be applied carte blanche without careful consideration of their limitations. Several outstanding questions remain. Is a system with 80% accuracy good enough? Is it better to leave missing values rather than predict incorrect ones? Incorrectly identifying a perpetrator as a current partner rather than an ex-partner could significantly impact the victim’s safety. Additionally, a model biased against certain ethnicities risks overlooking the specific needs of minority groups.

The conference sparked lively and engaging conversation about many of these issues, as well as the role that automation can play within the social sciences more broadly. A research article describing these results in full is the focus of ongoing work, and the presentation slides are available below as a download.

For further information please contact Darren at darren.cook@city.ac.uk or via LinkedIn @darrencook1986

Dr Darren Cook, An application of Natural Language Processing (NLP) to free-form Police crime notes

Photo by Markus Spiske on Unsplash

Uncovering ‘hidden’ violence against older people

By Dr Anastasia Fadeeva, VISION Research Fellow

Violence against older people is often overlooked. As a society, we often associate violence with young people, gangs, unsafe streets, and ‘knife crime’. However, violence also takes place behind front doors, perpetrated by families and partners, and victims include older people.

Some older people may be particularly vulnerable due to poorer physical health, disability, dependence on others, and financial challenges after retirement. Policy rarely addresses the safety of this population, with even health and social care professionals sometimes assuming that violence does not affect older people. For example, doctors may dismiss injuries or depression as inevitable problems related to old age and miss opportunities to identify victims (1). In addition, older people may be less likely to report violence and abuse because they may not recognise it themselves, do not want to accuse family members, or are afraid (2).

Given that victims of violence often remain invisible to health and social services, police, or charities, the most reliable statistics on violence often come from national surveys such as the Crime Survey for England and Wales (CSEW) conducted by the Office for National Statistics. However, for a long time the CSEW self-completion – the part of the interview with the most detail on violence and abuse – excluded those aged 60 or more, and was only recently extended to include those over 74. Some national surveys specifically focus on older people, but these ask very little about violence and abuse. Additionally, despite people in care homes or other institutional settings experiencing a higher risk of violence, it can be challenging to collect information from them. Therefore, many surveys only interview people in private households, which excludes many higher-risk groups.

We need a better grasp of the extent and nature of violence and abuse in older populations. First, reliable figures can improve the allocation of resources and services targeted at the protection of older people. Second, better statistics can identify the risk factors for experiencing violence in later life and the most vulnerable groups.

In the VISION consortium, we used the Adult Psychiatric Morbidity Survey (APMS 2014) to examine violence in people aged 60 and over in England (3). While we found that older people of minoritised ethnic backgrounds are at higher risk of violence (prevalence of 6.0% versus 1.7% in white people in the 12 months prior to the survey), more research needs to be done to distinguish the experiences of different ethnic groups. Our research also showed that loneliness and social isolation were strongly related to violence in later life. Older people may experience social isolation due to limiting health issues or economic situations, and perpetrators can exploit this (4). Moreover, isolation of victims is a tool commonly used by perpetrators, especially in cases of domestic abuse (5). Knowing about these and other risk factors can help us better spot and protect potential victims.

Additionally, more needs to be learnt about the consequences of life course exposure to violence for health and well-being in later life. This is still a relatively unexplored area due to limited data and a lack of reporting from older victims and survivors. It is sometimes more difficult to establish the link between violence and health problems because the health impacts are not always immediate but can accumulate or emerge in later life (6). Also, as people develop more illnesses as they age, it is more challenging to distinguish health issues attributable to violence. Therefore we are also using the English Longitudinal Study of Ageing (ELSA) to examine temporal relationships between lifetime violence exposure and health in older age.

In an inclusive society, every member should be able to lead a life where they feel safe and respected. We are delighted that the CSEW has removed the upper age limit to data collection on domestic abuse, which is one step towards making older victims and survivors heard. Continuous work on uncovering the ‘hidden’ statistics and examining the effects of intersectional characteristics on violence is crucial in making our society more inclusive, equal, and safe for everyone. For example, one VISION study (7) has demonstrated that the risks of repeated victimisation in domestic relationships had opposite trends for men and women as they aged. We are committed to supporting the Hourglass Manifesto to end the abuse of older people (8), and are willing to provide decision makers with evidence to enable a safer ageing society.

For further information, please see: Violence against older people and associations with mental health: A national probability sample survey of the general population in England – ScienceDirect

Or please contact Anastasia at anastasia.fadeeva@city.ac.uk

Footnotes

  • 1.  SafeLives U. Safe later lives: Older people and domestic abuse, spotlights report. 2016.
  • 2.  Age UK. No Age Limit: the blind spot of older victims and survivors in the Domestic Abuse Bill. 2020.
  • 3.  Fadeeva A, Hashemi L, Cooper C, Stewart R, McManus S. Violence against older people and mental health: a probability sample survey of the general population. forthcoming.
  • 4.  Tung EL, Hawkley LC, Cagney KA, Peek ME. Social isolation, loneliness, and violence exposure in urban adults. Health Affairs. 2019;38(10):1670-8.
  • 5.  Stark E. Coercive control. Violence against women: Current theory and practice in domestic abuse, sexual violence and exploitation. 2013:17-33.
  • 6.  Knight L, Hester M. Domestic violence and mental health in older adults. International review of psychiatry. 2016;28(5):464-74.
  • 7.  Weir R. Differentiating risk: The association between relationship type and risk of repeat victimization of domestic abuse. Policing: A Journal of Policy and Practice. 2024;18:paae024.
  • 8.  Hourglass. Manifesto A Safer Ageing Society by 2050. 2024.

Photo from licensed Adobe Stock library

The next generation of researchers studying violence

By City criminology undergraduate student, Matilde Sciarrini

As a Criminology with Data Analytics student, I had the opportunity to complete a work placement at the Violence and Society Centre (VASC), through the Q-step programme, which aims to improve quantitative skills in social science students. My initial interest in VASC and their main research project, the VISION Consortium, stemmed from the desire to better understand the different experiences of victims of violent crimes, and the amount of support they received from their family, friends and social services. 

The work placement took place one day a week for 10 weeks, during which I was tasked with analysing crime reporting trends using the Crime Survey for England and Wales. This survey is divided into a non-victim form, which gathers general demographic information about the respondents, such as sex, age and ethnicity; and a victim form, which asks specifically about crimes they experienced in the past year.

During the first few weeks, I selected the relevant variables, refined my recoding skills, and harmonised the variables from 2001 to 2020. The variables I selected for my analysis included:

  • Did the police come to know about the matter?  
  • How did they come to know about it?  
  • Can you tell me why you decided to report this crime to the police?  
  • Can you tell me why you decided NOT to report this crime to the police? 
  • Do you think the police treated you fairly?  
  • Were you satisfied or dissatisfied with the way in which you were able to report the matter? 
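Harmonising a survey question across years usually means mapping each year's variable name and value codes onto one common variable. The placement work was done in Stata; the sketch below shows the same idea in Python. The variable names (`polknow`, `copsknow`) and value codes are hypothetical and are not taken from the actual survey codebooks.

```python
# Hypothetical per-year codebooks: the source variable name and its value codes.
# Real codebooks would come from the survey documentation for each year.
CODEBOOKS = {
    2001: {"name": "polknow", "values": {1: "yes", 2: "no"}},
    2020: {"name": "copsknow", "values": {1: "yes", 2: "no", 9: None}},
}

def harmonise(row, year):
    """Map one respondent's raw answer onto the common 'police_aware' variable.

    Codes that are missing from the codebook, or mapped to None
    (e.g. "don't know"), become None in the harmonised data.
    """
    book = CODEBOOKS[year]
    raw_value = row.get(book["name"])
    return {"year": year, "police_aware": book["values"].get(raw_value)}

print(harmonise({"polknow": 1}, 2001))
print(harmonise({"copsknow": 9}, 2020))
```

Keeping the mapping in one table makes it easy to review each year's coding decisions and to add further years later.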

I decided to specifically focus on violent and sexual crimes for my analysis. This analysis emphasised the importance of understanding the various reasons for reporting and not reporting violent crimes to the police amongst different groups in society. This would not only help explain the discrepancy between police-recorded crime and the figures from the national victimisation survey (Crime Survey for England and Wales), but would also aid in more effectively addressing victims’ specific socio-cultural needs.

My experience at the Centre was insightful and a valuable opportunity to understand the workings of a research centre firsthand. I found that VASC was a more sociable environment than I had anticipated. Everyone I met was open to providing coding guidance throughout the workday. Moreover, a productive degree of teamwork took place at the Centre, with full-staff meetings occurring on a weekly basis and constant communication between colleagues. This high level of teamwork was also present in their work, with multiple coding debates taking place every day. Although I had no previous knowledge of the Stata software, I was given the opportunity to learn it and use it to write my code.

This placement gave me the opportunity not only to enhance my data analysis skills, but also to learn how to work in an office environment and improve my communication with others. I have come to understand how my criminological knowledge can contribute to research, and how it can shape social policies and affect governmental practices. I thoroughly enjoyed the placement from the very first week, having learned the importance of seeking assistance as well as independently solving problems.

Finally, I am grateful for the help and support of my line managers at VASC/VISION, who were always open to providing help and feedback about my work and my future career aspirations. I consider myself very fortunate to have had such an amazing opportunity, and I would encourage others to take an interest in the ongoing work at the Centre.

Calling all crime analysts: Share your experiences of using text data in analysis

Are you a crime analyst or researcher? If so, VISION would really like to hear about your experiences of using text data in your analysis.

We developed a short survey that will take approximately 5 minutes to complete. Qualtrics Survey | Crime Analyst Survey

This survey is designed to explore your experiences working with free-text data. Your feedback will enable us to evaluate the need for software designed to assist analysts working with large amounts of free text data.

Participation is voluntary and all responses will be anonymous. Information will be confidential and will not be shared with any other parties, and will be deleted once it is no longer needed.

The deadline to provide feedback using the link above is 30 June 2024.

Illustration from licensed Adobe Stock library

VISION researchers presenting at UK Data Service Health Studies Conference 2024

Two researchers of the VISION consortium project are presenting at the Health Studies Conference in July.

Dr Elizabeth Cook, Senior Lecturer at City, University of London, will present ‘Indirect victims of violence: mental health and the close relatives of serious assault victims in England’.

Dr Annie Bunce, Research Fellow at City, University of London, will present ‘Prevalence and nature of workplace bullying and harassment and associations with mental health conditions in England: a cross-sectional probability sample survey’.

The free event, on 1 July at University College London (UCL), organised by the UK Data Service in collaboration with UCL and the National Centre for Social Research, will provide updates from the data producers of key UK social surveys with health-related content, such as the Health Survey for England, Understanding Society and the English Longitudinal Study of Ageing. There will also be presentations by researchers who have conducted analyses using health data.

Register for the event

Illustration at top of page is from licensed Adobe Stock library

Prevalence of physical violence against people in insecure migration status 

VISION researchers from the Systematic Review working group (Andri Innes, Sophie Carlisle, Hannah Manzur, Elizabeth Cook, Jessica Corsi and Natalia Lewis) have published a systematic review and meta-analysis in PLOS One, estimating prevalence of physical violence against people in insecure migration status. This is the first review of its type, synthesizing global data on violence against migrants in all types of insecure status. 

The review finds that around 1 in 3 migrants in insecure status experience physical violence. Violence included physical interpersonal, community and state violence. Insecure status was conceptualised as encompassing undocumented status, lapsed statuses, asylum seeking and other pending applications, and any status that embeds a form of insecurity by tying status to a particular relationship (such as spousal or employer-employee). Studies were only included in the review if the violence happened while the victim was in insecure status.

The VISION team reviewed academic literature published between January 2000 and May 2023, across social and health sciences. The study was global in scope, although data was limited by the English language search.  

Key Findings 

More than one in four migrants in insecure status disclosed intimate partner violence specifically. Spousal visas embed a particular risk of violence because the visa status is connected to an intimate partner relationship, creating an important power disparity. Nevertheless, there was no significant difference in prevalence of violence by gender across the dataset. Prevalence also did not differ meaningfully across geographic region, perpetrator, status type or time frame.  

One of the most significant findings was that violence exposure is not meaningfully different for people in undocumented status than in other types of insecure status. Physical violence is a concern across all types of insecure migration status.

The findings were limited because of high levels of heterogeneity in the data. It was also difficult to consider intersectional identity characteristics such as age, race or ethnicity, nationality, religion, marital status, socio-economic status, education level or motivation for migration because these were not standardised across included studies. This suggests that further, more specific research is needed in this area.

The review is open access and is available to read in full here

If you have any comments or feedback for the authors, please contact Andri at alexandria.innes@city.ac.uk  

Photo from licensed Adobe Stock library

VISION Policy Series: The impact of intimate partner violence on job loss and time off work in the UK

Key research findings

The latest research by VISION colleagues, Vanessa Gash and Niels Blom at City, finds serious negative effects of intimate partner violence and abuse (IPVA) on labour market outcomes, with 3.6% of those who experienced intimate partner violence losing their jobs because of the abuse. Furthermore, 1 in 10 of those who experienced intimate partner violence took a period of leave from work, with 1 in 4 of those who took leave needing to take a month or more off work.

Based on a large statistically representative sample for England and Wales, this research is one of the first to examine different types of IPVA, with five categories distinguished in the analysis.

The report examines differences between those who experienced: (1) physical abuse, (2) sexual abuse, (3) stalking, (4) coercive or controlling behaviour, as well as those who were (5) threatened with abuse by a current or former intimate partner. There were strong differences in prevalence of IPVA by sex, with women disproportionately exposed to threats (34% compared to 15% for men) and to sexual violence (7% compared to 3% for men). Additionally, compared to men, women were more likely to report multiple types of violence and abuse.

Job loss is associated with all five forms of IPVA, and the risks were highest for those who experienced stalking, sexual violence, or physical threats by an intimate partner. The research also includes qualitative findings from those with lived experience of IPVA. Participants noted an ongoing stigmatisation of victims of abuse, which had serious impacts on disclosure. Victim-survivors noted their fear of being declared ‘unfit for work’ and of becoming a ‘marked person’ should they disclose their abuse to relevant managers.

Policy implications

  • Though IPVA was found to have significant effects on victims’ experiences at work, those with lived experience noted a reluctance to disclose IPVA to relevant managers.
  • Employers may therefore want to consider enhanced IPVA and domestic abuse (DA) support systems for employees in the workplace.
  • While we can expect enhanced support to improve job retention and productivity, we currently lack the appropriate data to directly examine these effects.

For further information please download the full report below and / or contact Dr Vanessa Gash at vanessa.gash.1@city.ac.uk.

About the authors

Dr Vanessa Gash is a Reader in the Department of Sociology and Criminology at City and a member of the UKPRP VISION team based at the Violence & Society Centre.

Dr Niels Blom is a Research Fellow at the Violence & Society Centre and a member of the UKPRP VISION team.

Consultation: Is there a need for a VAWG data dashboard?

In 2022, the UK Office for National Statistics (ONS) developed a prototype violence against women and girls (VAWG) dashboard. The tool presents statistics and charts on violence against women and girls in England and Wales, drawing on multiple sources. However, due to re-prioritisation at ONS, maintenance of the dashboard was halted and from 1st April 2024 it will no longer be accessible.

The VISION consortium is consulting on whether there is a need for a VAWG data dashboard. This consultation is seeking views on:

  •  Whether the dashboard was useful
  •  Who used it and why
  •  If the dashboard were to continue, what aspects should be kept, dropped or added.

The consultation link is here: Qualtrics Survey | Qualtrics Experience Management

Anyone interested in the idea of a VAWG data dashboard is welcome to respond to the survey, particularly if interested in using one in the future.

Answer as many questions as you like. You can provide contact details or complete this anonymously. The findings will be used to draft a report and provide recommendations on whether the dashboard should continue. The report will include a list of the groups and organisations that participated (where details are provided). Individuals will not be named, although quotes may be taken from the text provided. The report may be published, for example on the VISION website.

The ONS VAWG dashboard was available online until 31 March 2024. Therefore, if you would like to participate in this consultation, please view the sample screenshots of the tool below.

This consultation closes Monday 22 April.

For further information, please contact us at VISION_Management_Team@city.ac.uk

Better utilisation of healthcare data to measure violence

Despite violence being recognised as a harm to health, it is not consistently or adequately captured in healthcare data systems. Administrative health records could be a valuable source for researching violence and understanding the needs of victims, but such datasets are currently underutilised for this purpose.

VISION researcher Dr Anastasia Fadeeva, with input from Dr Estela Capelas Barbosa, Professor Sally McManus and Public Health Wales’ Dr Alex Walker, examined violence indicators in emergency care, primary care, and linked healthcare datasets in the paper Using Primary Care and Emergency Department datasets for Researching Violence Victimisation in the UK.

Anastasia worked with Hospital Episode Statistics Accident and Emergency (HES A&E) and the Emergency Care Data Set (ECDS) while on secondment at the Department of Health and Social Care (DHSC), with helpful review provided by researchers in the department.

Among the datasets reviewed in the study, the South Wales Violence Surveillance dataset (police and emergency department data linked by Public Health Wales) had the most detail about violent acts and their contexts, while the Clinical Practice Research Datalink (CPRD) provided the most extensive range of socioeconomic factors about patients and extensive linkage with other datasets. Currently, detailed safeguarding information is routinely removed from the ECDS extracts provided to researchers, limiting its utility for violence research. In the HES A&E, only physical violence was consistently recorded.

Addressing these limitations and increasing awareness of the potential utility of health administrative datasets to violence-related research has the potential to provide insight into the health service needs of victims.

For further information please see: Social Sciences | Free Full-Text | Using Primary Care and Emergency Department Datasets for Researching Violence Victimisation in the UK: A Methodological Review of Four Sources (mdpi.com)

Or contact Dr Anastasia Fadeeva at anastasia.fadeeva@city.ac.uk

Photo from licensed Adobe Stock library

Working with specialist services’ administrative data

VISION researchers Dr Annie Bunce and Dr Estela Capelas Barbosa have been working with administrative data provided by specialist domestic and sexual violence and abuse (DSVA) support services.

Whilst the wealth and breadth of the data collected creates exciting opportunities for improving our understanding of patterns in experiences of violence and service use, the process of preparing the data for analysis has its challenges. Such challenges – and potential strategies for overcoming them – are not well documented, creating missed opportunities for improving the utilisation of specialist services’ data.

In their new publication, Annie and Estela, along with City, University of London PhD student, Katie Smith, and Dr Sophie Carlisle, a former VISION researcher, reviewed the scope and merits of administrative data generally, and that collected by specialist DSVA services specifically, and the evidence to date for its use by researchers.

They found that the extent to which new insights on violence from specialist services’ data can be used to inform policy and practice is limited by three interrelated challenges: different approaches to the measurement of violence and abuse; the issue of disproportionate funding and capacity of services; and the practicalities of multi-agency working.

Nonetheless, the authors maintain that DSVA services’ administrative data can make a unique contribution to knowledge on violence, and are hopeful that the paper will encourage further discussion about how to better utilise it. Additional resources, collaboration between multiple agencies, service providers and researchers, and the integration of specialist services’ data with other sources of data on violence are needed to maximise policy impact. Given the benefits individuals and society stand to gain, this is a worthwhile endeavour.

For further information please see: Challenges of using specialist domestic and sexual violence and abuse service data to inform policy and practice on violence reduction in the UK in: Journal of Gender-Based Violence – Ahead of print (bristoluniversitypressdigital.com)

Or contact Dr Annie Bunce at annie.bunce@city.ac.uk

Photo from licensed Adobe Stock library