
The news where you are: digital preservation and the digital dark ages

(Image c/o Pierre-Louis FERRER on Flickr.)

The following article was contributed by William Kilbride, Executive Director of the Digital Preservation Coalition

That’s all from us, now the news where you are….

This awkward cliché, repeated at the end of every BBC news report, signals a crude shift in gear. It seems that ‘The News’ has two parts: ‘the news where we are’ (London-centred politics, war, economics, English premiership football); and ‘the news where you are’ (local and parochial oddities that may entertain the yeomanry but which won’t deflect the ship of state from its mighty progress). Ruthlessly and deservedly lampooned during last year’s independence debate, the phrase came to mind last week as Vint Cerf shared his fears about the evanescence of digital memory and the need for collective action to counter the pernicious and ubiquitous impact of obsolescence. Reported by the BBC, the Independent, the Guardian and others (mostly from San Jose, CA), it would seem that a digital black hole is set to initiate a digital dark age sometime soon. There’s a choice of metaphors, but none of them good.

The news where I am (The Digital Preservation Coalition) is surprisingly different from the news where they are.

First things first: I don’t have a copy of Vint Cerf’s original remarks, so my observations are really only about the reportage. In fact almost anything he might choose to say would have been welcome. It’s undoubtedly true that preserving digital content through technological change is a real and sometimes daunting challenge. Our generation has invested as never before in digital content, and it is frankly horrifying to consider what rapid changes in technology could do to that investment. Vint, as one of the architects of the modern world, is exceptionally well placed to help us raise the issue among the engineers and technologists who need to understand the problem.

We do desperately need to raise awareness of the challenge of digital preservation so that solutions can be found and implemented. Politicians and decision makers are consistently under-informed or unaware of the problem. In fact, awareness raising was one of the reasons the DPC was founded. Since 2002 the DPC has been at the forefront of joint activity on the topic in the UK and Ireland, supporting specialist training, helping to develop practical solutions, promoting good practice and building relationships. A parliamentarian recently asked me which department of government would be best supported by all this work (presumably in an attempt to decide which budget should pay for it). I answered ‘all of them’. I am not sure whether the question or the answer was more naïve: it’s hard to imagine an area of public and private life that isn’t improved by having the right data available in the right format to the right people at the right time, or, conversely, frustrated by its absence. Digital preservation is a concern for everyone.

But that’s not the same as saying that a digital black hole is imminent. It might have been in 2002, but since then there has been rather a lot to celebrate in the collective actions of the global digital preservation community (and especially here in the UK and Ireland), where agencies and individuals are waking up to the problem in large numbers. These days we’re seeing real interest from across the spectrum of industry and commerce. Put simply, the market is ripe for large-scale solutions. It’s easy to focus on the issue of loss, but we can also talk confidently now about the creative potential of digital content over an extended lifecycle.

In January this year the DPC welcomed its 50th organisational member: the Bank of England. It’s a household name, but it is not a memory institution with a core mission to preserve. Other new members in the last year include HSBC, NATO and the Royal Institute of British Architects. They all depend on data and they all need to ensure the integrity of their processes, but none of them is a memory institution with a mission to preserve. Any organisation that depends on data beyond the short lifespans of current technology – and we’re all data-driven decision makers now – needs to put digital preservation on its agenda.

If the last decade has taught us anything, it’s that we face a social and cultural challenge as well as a technical one. We certainly need better tools, smarter processes and enhanced capacity, which is ultimately what Vint’s suggestion of a Digital Vellum is about (though others dispute the detail of his proposal). But this alone won’t solve the problem. We also need competent and responsive workforces ready to address the challenges of digital preservation. Time and again, surveys of the digital preservation community show that the necessary skills are lacking and, where they exist, are themselves subject to rapid obsolescence. We know that digital skills are in critically short supply in the UK economy: at the same time as Vint was arguing for Digital Vellum, the Chief Constable of Police Scotland had to apologise for having misled parliament because statistics about draconian stop-and-search powers were inadvertently deleted. The nation’s most senior policeman could lose his job because his organisation lacked digital preservation skills. Arguably the lack of skills is a bigger challenge than obsolescence.

Moreover, a political and institutional climate responsive to the need for digital preservation would allow us to make sense of the peculiarities of copyright. Those who argue for the right to be forgotten ingenuously assume an infrastructure in which you will be remembered: a somewhat populist rush for data protection and cybersecurity is tending to stifle reasonable calls for data retention. This is pretty raw stuff. At the same time as the technology commentators were worrying about technical obsolescence, a senior politician was caught deleting content of his own which contained comments that now seem ill-judged. The machinations of those who want us to forget may well be a bigger threat to our collective memory than digital obsolescence.

The DPC was founded to ensure closer and more productive collaboration among its members. I grant you that some of this has involved the slow grind of hard problems: a standard here, a training programme there, a research project peering into the future, a policy review, a procedures manual. All of it is worth celebrating, and we have been doing so for years. I have no idea why journalists haven’t noticed: we’ve been trying to get their attention all that time.

San Jose is lovely in early spring. But there’s a better story about digital preservation where we are.


Do you have something to say on a current issue facing the information world? We’re always looking for new contributions to Informed from the information professional community. If you would like to write something for the site, do drop us a line!

How better information management could minimise disasters

The following article was submitted by Katharine Schopflin.

(Image c/o Eric the Fish on Flickr.)

I recently attended a talk given by Jan Parry, who provided information and research support to the Hillsborough Independent Panel, which, in 2012, oversaw the disclosure of public documents related to the 1989 disaster. The Panel’s website provides very clear background on what happened on 15th April 1989, as well as outlining the work it did to gather related documents and make them publicly available. I should state that the following is entirely my own interpretation: Jan’s measured, even-handed talk was essentially factual and did not offer opinions on either the disaster or the work of the Panel. But it made me think about the information public organisations hold, how they use it, and its value for preventing disasters and revealing facts. I have selected a couple of things from Jan’s very full talk, and any mistakes in reporting are my own misremembering or misinterpretation.

Jan began by showing us the map of the Hillsborough ground which was used on the day. This plan both showed and lacked information pertinent to what happened. Much has been written about the power of maps to give and withhold information: they are representations, and a choice is always made about what is included and how it is shown. This map shows how space was constricted at the Leppings Lane end of the ground, but it does not convey how confusing the divided inner concourse was for supporters, nor how uneven the distribution of turnstiles was – things which, had they been more apparent, might have encouraged different policing decisions. Jan also mentioned information that was available about previous (non-fatal) crushing incidents at Hillsborough, and the results of the report into the Bradford City ground, which identified the dangers of using ‘pens’ on football terraces. Neither was considered as part of plans for crowd control during the match.

As well as information which was available but ignored, some information was out of date. The police Operational Plan, which exists for every football match, had not been updated since the two teams had last met at the ground the previous year. From a knowledge management point of view, the two teams having met at the same ground for the 1988 semi-final could have provided useful learning points to inform a new Operational Plan, but if there was a debrief of the police officers present at the previous match, its lessons were not added to the plan. And in this case knowledge transfer was vital, as the Match Commander in 1989 had no experience at Hillsborough. Other information could have been added to the plan: roadworks which made many Liverpool fans late are thought to have contributed to the crowding which built up at the Leppings Lane entrance. I am not in a position to judge whether a better map of the ground, an up-to-date Operational Plan, or more shared experience of managing a match at the ground would have prevented the disaster. But these seem to me to be vital pieces of knowledge which needed to be expressed and, once expressed, shared and made explicit.

Jan also talked at length about the evidence gathered in the lengthy series of inquiries and investigations which followed the disaster. The stated purpose of the Panel was to allow documents related to the disaster to be released into the public domain ahead of the usual 30 years, but in the process an enormous amount of information was assembled together for the first time. Each contributing organisation – the government, the police authorities, the coroner’s service and the NHS – was asked to scan, digitise, redact and catalogue its documents, with Home Office support. Some of the documents had always been available, but had not been read as significant. South Yorkshire Police happily released material which illustrated that some police statements had been edited before being made public, without seeing that this might have damaged earlier investigations. Many of these documents are now being used as new evidence, and cases are being reopened.

One of the recommendations from the Panel was that public bodies such as police authorities and the coroner’s office should have a mandate to manage and keep their records (something which, to my surprise, was not already the case). South Yorkshire Police had been going to dispose of some of the records which they instead gave to the Panel, and they were within their rights to do so. Of course there are limits to ‘just in case’ record-keeping, but these public sector bodies could, had they wished, have destroyed information pertinent to new investigations. If the recommendations are followed, they will have a responsibility to keep records and to manage them in a way that enables them to be found again. This seems essential to me if we want to be able to prevent, or subsequently investigate, significant events in the future.

However, I think this alone is not enough. Public sector information also needs to be joined up and exposed to analysis, in a way that modern technology now makes possible. Take the absence of information about roadworks in the Operational Plan for the match at Hillsborough in April 1989. Modern data analytics tools would enable police to combine public domain data about planned roadworks, standard traffic flows and their own knowledge of expected crowds to build a clear picture of what the outside of any stadium turnstile will look like just before a match. And this isn’t a theoretical picture: an actual map illustrating anticipated road and pedestrian congestion at any time of day could be generated without difficulty. The guidance on policing football matches is undoubtedly more sophisticated now as a result of lessons learned from Hillsborough. Whether or not police forces have these kinds of tools at their disposal I do not know, but it is surely vital that they do.
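To make the idea concrete, here is a minimal, purely illustrative sketch of the kind of data fusion described above. The routes, delay figures and the 30-minute risk threshold are all invented for this example, and any real policing tool would use far richer data and models; the sketch simply shows how planned-roadworks data might be joined with baseline journey times and expected crowd numbers to flag approaches where arrivals will be compressed into the minutes before kick-off.

```python
# A hypothetical sketch of joining up public data sources to anticipate
# congestion at a stadium. All figures and names are invented for illustration.
import pandas as pd

# Public-domain data: planned roadworks on each approach route to the ground.
roadworks = pd.DataFrame({
    "route": ["M62 eastbound", "A616", "A61"],
    "expected_delay_min": [40, 10, 0],
})

# Baseline journey times and expected supporter numbers per approach route.
routes = pd.DataFrame({
    "route": ["M62 eastbound", "A616", "A61"],
    "baseline_journey_min": [75, 30, 20],
    "expected_fans": [14000, 6000, 4000],
})

# Join the datasets and estimate how delayed each group of fans will be.
picture = routes.merge(roadworks, on="route", how="left").fillna(0)
picture["predicted_journey_min"] = (
    picture["baseline_journey_min"] + picture["expected_delay_min"]
)

# Flag routes where delays would compress arrivals into the final minutes
# before kick-off, concentrating large crowds at the turnstiles at once.
picture["congestion_risk"] = picture["expected_delay_min"] >= 30

print(picture[["route", "expected_fans",
               "predicted_journey_min", "congestion_risk"]])
```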

Similarly, every investigation into a large-scale disaster or act of violence that I have read about has indicated that joining up information at an earlier date could have prevented death or injury, or at the very least provided answers to the questions asked by victims and their families. Once disparate information is brought together, it becomes clear if something in the situation is, or was, ‘not right’. Indeed, it was with these ideas in mind that the Home Office’s large-scale IT systems, HOLMES and HOLMES2, were introduced across police services in the UK in the 1980s and 1990s. Many local authorities are to be lauded for releasing data into the public domain, but this is by no means mandated across all public bodies. Technology has created great possibilities for manipulating data, but the data needs to be available to begin with. This requires public bodies to maintain their records and, within reasonable and compliant limits, make them available as early as possible. The technology and the analytical expertise exist; opening up the data needs to follow.