The DNA database and you - How big is it? How many get off it? Your questions answered...

El Reg Special Report: The DNA database and you

The National DNA Database (NDNAD) keeps growing: it now holds more than five million DNA profiles of individuals. Getting off the database, if you have been sampled by England or Wales forces, remains as unlikely as ever. And it remains difficult to make sense of the stats bandied about, with the press quoting wildly differing figures. So we decided to investigate.

In August, the Daily Mail reported that "4.5 million genetic profiles [are] on record. Up to 1.5 million - or a third of these - are from innocent people".

In another article on the same day, the Mail reported "[t]he figure of 573,639 people on the database who have not been convicted, cautioned, formally warned or reprimanded has pushed the overall total to 4.2 million."

This is an extreme example of the difficulty of making sense of statistics concerning the NDNAD. Our first step was to find source data.

In May 2007 the National Policing Improvement Agency (NPIA) started to administer the NDNAD. We'll use data obtained in a recent response to a Freedom of Information request to the NPIA to get some sense out of the data and figure out all the implied assumptions.

Data from the NPIA is authoritative, but the organisation's view of what the NDNAD is remains a matter of opinion. The NPIA claims that "The NDNAD is not a criminal records database. It holds very little information about a subject's identity - only their name, date of birth, sex and ethnic appearance. Inclusion on the DNA database does not signify a criminal record, and there is no personal cost or disadvantage by being on it".

The National DNA Database Ethics Group and the Human Genetics Commission both consider the NDNAD to be a crime-related intelligence database. Recent findings suggest that composite statistics do not mask identity within genome-wide association studies and that DNA profiles previously considered anonymous and not containing genetic markers may reveal much more than was thought.

Assumptions are often made about the data, and these may not be the same for different sets of data. The NDNAD includes profiles of DNA samples taken by forces from England, Wales, Scotland and Northern Ireland, but figures given in Parliament or in the press are often only for samples taken by the England and Wales forces.

The NDNAD includes DNA profiles of the DNA samples taken from individuals and profiles of the DNA found at crime scenes. Here are the figures up to 2008-09-01.

(At 2008-09-01)                         England & Wales forces   Other forces
Total number of subject profiles        4,969,225                327,088
Estimated total number of individuals   4,319,807                273,358
Total number of crime scene profiles    320,335                  13,749

The subject profiles consist of both profiles of DNA samples taken from individuals following arrest for a recordable offence, known as criminal justice samples, and profiles of subjects who volunteered a DNA sample, for example for elimination purposes. (Whether those who volunteer are sufficiently informed before they give their consent is an issue that was raised during the presentation of the Nuffield Council on Bioethics; the NDNAD Ethics Group has been discussing the volunteer consent form for DNA sampling and accompanying information.)

Another source of confusion, and one that is often misrepresented, is that the number of subject profiles on the NDNAD is higher than the estimated number of individuals on it. This happens because some of the profiles held are replicates. Multiple samples are taken from the same subject and profiled when, on different occasions, there is confusion concerning the person's name. Replication also happens when the police decide to resample an individual. The proportion of replicates is estimated at around 13 per cent (it varies over time and between police forces).
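As a rough check on that 13 per cent figure, the sketch below derives a replication rate from the table above. It assumes the rate is simply the gap between subject profiles and estimated individuals, taken as a share of subject profiles; the NPIA's own estimation method is not stated, and the function name is ours.

```python
# Figures from the NPIA table (at 2008-09-01).
profiles = {"England & Wales forces": 4_969_225, "Other forces": 327_088}
individuals = {"England & Wales forces": 4_319_807, "Other forces": 273_358}


def replication_rate(n_profiles, n_individuals):
    """Share of subject profiles that duplicate another profile.

    Assumption: replicates = profiles - estimated individuals.
    """
    return (n_profiles - n_individuals) / n_profiles


for force in profiles:
    rate = replication_rate(profiles[force], individuals[force])
    print(f"{force}: {rate:.1%}")
# England & Wales forces: 13.1%
# Other forces: 16.4%
```

On these numbers the England & Wales rate matches the quoted 13 per cent, while the rate for other forces comes out somewhat higher, consistent with the remark that it varies between forces.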

Define innocent

A common question is how many of these individuals are innocent. This is particularly difficult to find out.

First, the National DNA Database was allegedly never set up to record this information; it is held in the Police National Computer (PNC).

Second, what is meant by innocent is not always consistent; the obvious definition of all those never charged and those acquitted may not map directly to the information available. The NPIA ran a report on 2008-03-31:

(At 2008-03-31 for England & Wales forces)                    Total individuals   Percentage of total
With a conviction, caution, formal warning or reprimand       3,259,347           79%
No conviction, caution, formal warning or reprimand listed    573,639             14%
Not known as PNC record removed                               283,727             7%
Estimated total number of individuals                         4,116,713           100%

From the above table it can be deduced that, as of March 2008, there were DNA profiles for at least 573,639 innocent individuals and possibly for as many as 857,366, the higher figure counting everyone whose PNC record has been removed as innocent. Between 14 and 21 per cent of the individuals sampled by England & Wales forces and recorded in the NDNAD are therefore innocent. Furthermore, that does not take into account any mistakes in the PNC.
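The bounds can be reproduced from the table with a few lines of arithmetic. This is only a sketch: it assumes, as the text does, that the lower bound counts individuals with no conviction, caution, formal warning or reprimand listed, and that the upper bound adds everyone whose PNC record has been removed. The variable names are ours.

```python
# NPIA report of 2008-03-31, England & Wales forces.
with_record = 3_259_347   # conviction, caution, formal warning or reprimand
no_record = 573_639       # none of the above listed
pnc_removed = 283_727     # status unknown: PNC record removed
total = 4_116_713         # estimated total number of individuals

# The three categories should add up to the stated total.
assert with_record + no_record + pnc_removed == total

lower = no_record                 # certainly without a record
upper = no_record + pnc_removed   # assumes every removed record was an innocent

print(f"lower bound: {lower:,} ({lower / total:.1%})")
print(f"upper bound: {upper:,} ({upper / total:.1%})")
# lower bound: 573,639 (13.9%)
# upper bound: 857,366 (20.8%)
```

Rounded to whole numbers, these are the 14 and 21 per cent quoted above.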

What happens to the DNA samples and profiles of all those innocents? Most of them are kept forever. The procedure to get off the NDNAD is complex, and it assumes that one's case is considered exceptional enough to justify such a procedure in the first place.

See El Reg's How to delete your DNA profile for more on this. (Note that the only process map the Metropolitan Police has published since is a rehash of the usual guidelines and the Specialist Crime Directorate 12 wrote that '[t]here is no additional information I can supply on this subject'.)

Subject profiles removals   2003     2004     2005     2006     2007     2008 (adjusted)
England & Wales forces      677      34       81       271      310      222
Other forces                23,492   19,160   21,580   21,969   21,265   19,164

The huge difference in numbers between removals of samples taken by England & Wales forces and by other forces is due to differences between English & Welsh and Scottish laws. DNA profiles and samples of innocents taken by Scottish forces can't be kept forever.

Whether England and Wales forces can keep stalling on the removal of DNA profiles (and destruction of DNA samples) of innocents has gone all the way to the Grand Chamber of the European Court of Human Rights:

"The [Marper and S v. UK] case concerns the decision to continue storing fingerprints and DNA samples taken from the applicants after unsuccessful criminal proceedings against them were closed." The hearing was in February and the ruling will be given later this year. (Note that the adjusted figure for 2008 is based on data up to September adjusted for the rest of the year.)

I did not request the data for calendar-year additions to the NDNAD, but to put things in perspective, the yearly average number of subject profiles added to the NDNAD for the financial years 2005-07 was 711,645 (NPIA NDNAD Annual report data). For England and Wales forces it was 646,767 (John Reid in Parliament written answers).

Profiling at a young age

There's particular concern as to how many young individuals are included in the NDNAD. If you consider the NDNAD to be a criminal database, being included in it at a young age is worrying.

(At 2008-09-01)                              England & Wales forces   Other forces
Total subject profiles from 10-17 year old   343,745                  10,671

The England & Wales forces again lead in their aggressiveness to sample DNA. Six per cent of all the subject profiles in the NDNAD were taken by other forces, but only three per cent of the DNA profiles of subjects 10 to 17 years old (when the report was run) were for samples taken by other forces.
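Both shares follow from the subject profile counts at 2008-09-01 given in the tables. A minimal sketch, assuming "all the profiles" means subject profiles only (crime scene profiles excluded); the helper name is ours.

```python
# Subject profiles at 2008-09-01: (England & Wales forces, other forces).
all_profiles = (4_969_225, 327_088)
youth_profiles = (343_745, 10_671)   # subjects aged 10 to 17


def other_forces_share(england_wales, other):
    """Fraction of the profiles that were taken by forces outside England & Wales."""
    return other / (england_wales + other)


print(f"all subject profiles: {other_forces_share(*all_profiles):.1%}")
print(f"10-17 year olds:      {other_forces_share(*youth_profiles):.1%}")
# all subject profiles: 6.2%
# 10-17 year olds:      3.0%
```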

The NPIA last ran a more complete report concerning 10-17 year-olds on 2008-04-10:

(At 2008-04-10 for England & Wales forces)                    Total individuals   Percentage of total
With a conviction, caution, formal warning or reprimand       264,297             87%
No conviction, caution, formal warning or reprimand listed    39,095              13%
Estimated total number of individuals                         303,393             100%

(The number of those with a PNC record is one less than the estimated total number of individuals. The NPIA did not state whether there is one youngster whose PNC record has already been removed, which is unlikely, or whether this should be viewed as a statistical error.) From the above table, it can be seen that at least 39,095 innocent youngsters are affected.
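The one-record discrepancy and the percentages can be checked directly against the table. Again a sketch with variable names of our own; it takes the two listed rows to be the individuals who still have a PNC record.

```python
# NPIA report of 2008-04-10, 10-17 year-olds, England & Wales forces.
with_record = 264_297   # conviction, caution, formal warning or reprimand
no_record = 39_095      # none of the above listed
stated_total = 303_393  # estimated total number of individuals

listed = with_record + no_record
print(f"listed rows sum to {listed:,}; unexplained: {stated_total - listed}")
print(f"with a record: {with_record / stated_total:.1%}")
print(f"no record:     {no_record / stated_total:.1%}")
# listed rows sum to 303,392; unexplained: 1
# with a record: 87.1%
# no record:     12.9%
```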

If you happen to live in England or Wales, being young or innocent, or both, is not enough to ensure you won't be captured in this massive database. ®


It can be argued that retaining DNA profiles of individuals is not even effective in solving crimes. Helen Wallace, from GeneWatch, debunked the assumption that it is last year when looking at who should be on the NDNAD:

"Collecting more DNA from crime scenes has made a big difference to the number of crimes solved, but keeping DNA from more and more people who have been arrested - many of whom are innocent - has not. Since April 2003, about 1.5 million extra people have been added to the Database, but the chances of detecting a crime using DNA has remained constant, at about 0.36%."

(Two days ago, the Lords voted in favour of an amendment to the Counter Terrorism Bill, which aims 'to try to spark a national debate about the retention of samples and to inform the public about what information is being held on them'.)

