CIO Folder: The devil is in the data
This month’s TechPro lead feature is about ‘Invisible Technology’ and the ongoing movement towards hiding or at least masking ICT complexity so as not to frighten the punters. Those punters certainly include in-house IT teams and even CIOs. In truth, it is potentially a great approach and likely to be the new orthodoxy. Outsource the heavyweight infrastructural stuff, to the cloud or wherever, and concentrate on the applications and immediate daily things that keep the users and the customers and the Board happy. Smart, innovative things. Clever, joined-up solutions to business processes. Time and money savers.
‘Keeping the lights on’ and all of the other clichés about it make ICT infrastructure almost the ultimate example of hygiene factors — those things that no one notices until they go wrong. Then they stink to high heaven in minutes. But as every CIO and IT manager knows, getting budget by threat is both tough and not conducive to good relations with the CFO. The longer the threat takes to materialise the more the disposition of the money and resources is resented. If it does, when it does, guess who gets the blame? So when you cannot win, changing the game is an attractive option. Taking all that infrastructural complexity and maintenance and expense and hiving it off to a service provider would solve many CIO problems at a swoop.
When that is done, it would be a good idea to open your nose to the ROT. That wonderful American acronym stands for Redundant, Obsolete/Outdated or Trivial, and it applies specifically to data in the organisation. In these days of business intelligence and data analytics, it comes into focus, or at least it should, because data is right back front and centre in ICT and what it is actually for.
Which brings us in turn back to the good old days when data types and quality and relative importance were all considerations for investment in expensive storage resources. Volume was a challenge, as it is in Big Data, but for more down to earth reasons — it was expensive to store data properly. Remember when tiered storage was as much about economics as performance? We used the Information Life Cycle to allow us to make data disposal decisions — to save on storage capacity costs.
Retention policies
Data retention policies are central today, but in general they are guided by compliance and governance regulations rather than by the needs of data analytics. A CIO pal in financial services once remarked: “There’s only one thing worse than being caught by an investigation without information you should have — that’s being caught with information you did not need to have.”
There are several potential reasons for that but the simple one is that if you can say ‘Sorry, over seven* years ago, so we have deleted that’ there is nothing more for you to do. But if you do have it, you now have to follow through with a complete search and compliance procedure with its attendant consumption of time and resources. Forensic data discovery is now an expert and expensive art. If you make your organisation liable for it unnecessarily, there will be costs — and quite probably repercussions.
On the other hand, data analytics and data mining are all about massive data volumes, the bigger the better. It is a sort of perverse truth that the value of any individual piece of data goes down with time but the value of a collection of data goes up with time. Funnily enough, quite often the more heterogeneous and even random the information is the better. So perhaps in many types of organisation the proper impulse should be to store and archive everything. Like museums and libraries since the dawn of civilisation, today’s ephemera are tomorrow’s treasures.
That in turn suggests that data classification should be an ongoing preoccupation of the CIO, or of somebody like a Chief Data Officer, if there is one. But there soon has to be a Data Protection Officer under EU law, principally to ensure compliance. In organisations that are not really large enough for that role to be full-time, it would certainly make sense to add to it the internal responsibility for the proper management of corporate data.
Compliance needs
Come to think of it, the emerging role of the CDO could cover both sides: the management of data for business purposes and the protection and deletion of data for compliance. Over and above the strict regulatory regimes that legally demand compliance there are multiple ethical questions that have been arising like flies from data cowpats for years, long before the term Big Data was coined. A simple one is the issue of cross-selling, because one side of the house knows something about a customer that might help sell something else. Google can use such associations freely, but what about financial products or healthcare? Even if customers give consent, are they really aware of or informed about the potential consequences? Can you sell customer lists with past behaviour or buying patterns?
There should not be any real conflict of interest between the two roles or objectives, compliance and responsible corporate data management. There is also the potential in data science for the purging of personal and other regulated information while still retaining the characteristics and patterns shown up by analytics when the detailed data was validly held. The salient point is that there are at least two major strands in data management that are becoming more critical by the day and we are clearly not doing enough, in the commercial world or even, arguably, in the state sector.
Now between all of this it is easy to come to the melancholy conclusion that the devil is in the data. That is where danger lurks. Apps and applications are wonderful, systems are ever-faster and ever more powerful, cloud is no longer vapour-ware and we can treat ICT more and more like a commodity with the mysteries tactfully hidden behind the scenes. But all of those progressive things exist for the sake of information. Data processing, after something like 60 years, is still the simplest and most accurate term in ICT. That is one paradigm that has never shifted.
Constant danger
The data is also bedevilled because it is what ICT security is needed to protect, and that security is, as we all know, less than universally or consistently successful. Loss of data is a constant danger. The Association of Data Protection Officers in a recent survey established that a full third of data protection professionals experienced a data breach in the last 12 months. It is mathematically spurious, but one cannot help suggesting that this implies that Irish organisations big enough to have a data protection pro nevertheless have a data breach at least once every three years. But of course it could be more a question of serial offenders and averages.
Another angle from this survey is that 71% of the data breaches were attributed to inadvertent actions (or inactions) by staff. That is another data management issue in itself. Leaving aside deliberate, malicious acts, how can this be possible? It is the old story about the lost laptop in the taxi/train/plane. Never mind the device: why was the data on it, and why was it not protected by encryption or strong access security? As for data embezzlement or other employee crime, serious data protection includes access control or prevention. Sure, you can look at information record by record, file by file, if you have the right permission. But you sure as hell cannot copy the entire database unless you are the DBA with the requisite authorisation.
Or can you? Time to do a data governance and protection audit. On the negative side, you are in danger of prosecution, legal suits by unhappy customers, loss of money or customers or market position, reputational damage and lots of unnecessary time and trouble. Should we mention directors and the possibilities of personal liability? On the positive side, your store of data is a core asset with real value that is appreciating as it builds.
It is all about the data.
* Or whatever the specific requirement is