
Data value


1 May 2012

Anybody who remembers ‘data mining’ and its promise of riches untold in predicting customer and consumer behaviour will know it faded but was never going to go away. The idea is just too attractive: put a mass of data in place, run it through some sort of super-powered data grinder and get answers to all sorts of business questions. Early data mining applications were asking the same questions: who buys our products? What else do they buy? The more we know the better we can target marketing, incentives, even placement in a store.

Who would link beer and nappies? Yet a famous discovery in the early 90s by a first generation data mining application in the giant Wal-Mart chain showed that young husbands allocated the family shopping chores are inevitably attracted by the ultra-giant bargain packs of disposable nappies (so that fewer future trips will be necessary, Irish cynics will speculate). What else do they buy on these expeditions? Six-packs of beer of course! So an enterprising merchandising exec put two and two together, put special offers of nappies and 24-can slabs of beer at the ends of adjacent aisles and discovered that 2+2 sometimes does make 22 because sales of both items soared.
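The kind of link in the beer-and-nappies story is usually measured today as "lift": how much more often two items turn up in the same basket than chance alone would suggest. What follows is a minimal sketch with invented baskets, not the Wal-Mart data or the method that early application actually used; the names `BASKETS`, `pair_lift` and `LIFTS` are illustrative assumptions.

```python
from collections import Counter
from itertools import combinations

# Invented toy baskets, purely for illustration (not real retail data).
BASKETS = [
    {"nappies", "beer", "crisps"},
    {"nappies", "beer"},
    {"nappies", "wipes"},
    {"beer", "crisps"},
    {"milk", "bread"},
    {"nappies", "beer", "milk"},
    {"bread", "crisps"},
    {"milk", "wipes"},
]


def pair_lift(baskets):
    """Return {(a, b): lift} for every pair of items bought together at least once.

    lift = P(a and b) / (P(a) * P(b)); a value above 1 means the pair
    appears together more often than independent purchases would predict.
    """
    n = len(baskets)
    item_counts = Counter(item for basket in baskets for item in basket)
    # Sort each basket so a pair is always keyed in alphabetical order.
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    lifts = {}
    for (a, b), together in pair_counts.items():
        expected = (item_counts[a] / n) * (item_counts[b] / n)
        lifts[(a, b)] = (together / n) / expected
    return lifts


LIFTS = pair_lift(BASKETS)
```

In this toy set, beer and nappies each appear in four of eight baskets and together in three, so `LIFTS[("beer", "nappies")]` comes out at 1.5. A real analysis would also need enough baskets for the figure to be statistically meaningful.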

Correlations
The quest goes on. Hidden and potentially valuable connections and insights can be found by data mining and other analytical techniques. The beer and nappies combination has struck marketing people ever since as a revelation, the kind of wacky insight that was unlikely ever to have been discovered by orthodox research other than by fluke. It gave a major impetus to the through line of data handling and analysis from data mining and data warehousing through to the smart customer relationship management (CRM) systems of today.

Once upon a time the simple objective of data management was to record and store information so that it would be secure and available quickly when retrieval was required. Ah, innocent times! Today, we generally assume that almost all data can yield added value if you collect enough of it and analyse it smartly enough. There would be no hype about Big Data (apart from the cost of storing it all) if we did not regard it as potentially useful. Like other forms of mining, however, it is taking a lot of seismic probes to establish if there is actually anything there, and even more to find out if it would be economic to mine.


There is a huge focus on data analytics today, in business as in science, government, security and of course our friends in the camouflage gear. It has come to the fore in recent years (look at the way BI has become mainstream, and almost commoditised with some Microsoft tools) because of a number of drivers. Global, consumer mass markets are unmanageable without clear views from the top and across marketing, supply chain and partner relationships. Social networking has thrown many market leaders into a spin because consumer behaviour has become increasingly erratic compared to the traditional distribution/retail model with Christmas planning in February and next season’s fashions three or four months ahead.

User view
IBM’s Mike Baker is consultant for Business Analytics Software Solutions in Ireland and tends to take an end-user point of view. "There are lots of terms out there now that are downright confusing, ‘predictive’ and ‘smarter’ and so on and even BI itself, which has lost precise meaning and is certainly abused in the market. IBM has been in this field for decades and today we are concerned with the useful and practical application of analytics in specific business areas and sectors. Financial performance and strategy has always been a key area; traditional BI has extended analytics out to other business functions, notably sales and marketing but other operational areas as well."

Baker sees three broad strands of applied analytics as the most important in today’s market: "Customer insight is very important and clearly also is one of the potential glamour applications, especially when some predictive power is added in. The mainstream is operational and performance efficiency monitoring with increasing elements of ‘What if…?’ scenario modelling. Global supply chain management is a major area, for example."

Business analytics systems are becoming particularly significant, he adds, in the monitoring of all aspects of risk, with which these days governance and compliance can be bundled. "Most analytics users are comfortable with the range of ‘accuracy’ that can be expected from their systems. They understand perfectly well that something like Solvency II compliance requires total accuracy based on clean, well-managed data stores. On the other hand, marketing and consumer behaviour stuff is inherently about gaining insights on which to make better informed human judgements."

There is a growing trend towards more free-wheeling data analysis tools that can work with less formally organised data or, for instance, blend highly reliable data as in a data warehouse with other looser information. "Trying to model possible business scenarios will probably involve different strands of data with varying degrees of ‘reliability’ because, after all, you are usually trying to gain insights into real life," Baker points out. "Today there are analytical tools that allow users to take a ‘follow your nose’ approach to exploring data, investigating relationships and possibilities outside of traditional reports or ways of looking at things. Again, you are not expecting high standards of ‘mathematical’ accuracy but you can certainly gain valuable understanding."

Processing power
Another view of analytics today is that the sheer power of current computing to run complex algorithms on large data stores in practical time scales is driving or at least encouraging the expansion of business analytics. Adrian Simpson, CTO of SAP for UK and Ireland, reckons there may be some truth in that but it hardly matters because Big Data is a genuine current problem. "Our challenges include the three Vs: Volume, Variety and Velocity. Of those three, velocity is the toughest in many respects. Today’s business speeds in retail, financial services, supply chain, web marketing and all other global operations need analysis on two fronts. The traditional retrospective analysis of stored data has great value still, but enterprises need live analysis on transactional processing as it happens."

Taking SAP’s now well-known HANA in-memory technology to make his point, Simpson explains that powerful technology like this can support near real-time analysis in transactions where time is a critical element. But its power also gives a platform for crunching through large volumes of data in realistic time scales where I/O and other technology limits can render analysis of Big Data impracticable.

But data analysis is about data and quality, Simpson points out, which brings in human skills and judgement when designing systems and projects. "Telcos are often used as examples, because mostly they have customer records going back decades and more. Now call records (and that can certainly be Big Data territory) have very few parameters so there are definite limits to what we might learn from analysis. But add in customer records and you have lots more information to play with, geographical and gender and possibly demographic and so on. But in today’s world with accelerating mobile use and changing habits, would any more than the most recent few years’ data really have anything to tell you? Even the mobile carriers, who have been keenly analysing churn factors and all of that for years, may have to decide that more than a couple of years’ data is just unlikely to help in predicting today’s changed market characteristics."

That sort of questioning becomes central to decisions about the expensive mining of big data. Is the vast bulk of the data going to be sheer noise with minimal contribution to the validity of the analysis? On the other hand, in looking at retail shopper behaviour, changing habits may give very good clues to changed economic circumstances, pregnancy, dieting and so on.

Data consistency
Data quality is central, in the view of Bob Duffy, database architect and consultant with Prodata, a Microsoft SQL specialist. "Conformance, the consistency of data from different systems, is a key start point for business analytics generally and BI and all of the increasingly powerful tools that are coming into the market and being used to good effect by non-professional people. Dealing with large volumes of data and empowering those end users to slice and dice and visualise, there is a need for aggregation engines or formal data warehouse structures."

"It would be harsh to say data warehousing is essential, but all analysis depends in large degree for its performance on clean, consistent data," Duffy believes. "Self-service BI is becoming the most popular form of business analytics, with newer tools like PowerPivot and PowerView adding real power to enable users to look in fresh ways at their information. It is literally a new generation of data exploration with no expertise required. Google’s cloud-based Data Explorer offers public information in a way that can be linked to in-house tools to look for patterns, possible causal factors and so on. This is potentially extraordinary analytical power for ordinary people just to be led by their brains."

High end, high volume, click stream analysis of complex data in real time is the top level, although separated more by processing power and specialist expertise than by any conceptual differences in the analytics. "This level is possibly epitomised by online games and gambling," Duffy says. "The casino poker enterprises are monitoring everything, working with 50TB or even 100TB or more of data in real time to alert for fraud or cheating and other irregularities before they can become significant. It is Las Vegas security on a global scale. But it is a model and exemplar for the power of applied analytics."

Data appliance
As one of the world’s giant software vendors that began as a database company, Oracle is certainly aiming to stay in the vanguard with its Exadata ‘appliance’, essentially a dedicated database machine that is engineered specifically for high performance data analysis, whether for data warehouse retrospective applications or for OLTP real-time stuff. "There’s just no question, it is the huge horsepower advances in today’s IT that have enabled analytics to move on to greater levels yet be deployed in a practical way in real life business situations," says John Caulfield, Oracle Ireland solutions director.

The Exadata appliance shows the way things are trending, engineered specially for the needs of high speed analytics. It is not just computing power but data loading at 10 or 20 times the norm, special compression technology and of course massive RAM. A phrase from the marketing like "…more than 5 terabytes of Exadata Smart Flash Cache" certainly gives a fair clue to the architectural approach.

Caulfield’s attitude to Big Data challenges is that in practice in sectors like retail, telcos and especially mobile carriers, financial services, online sales, gambling and many others, the time to apply smart analytics is always now. "John is in the store/online and is doing something usual/unusual/variable. That is the moment. That is when the power of your analytics is tried and applied and it is a lot more than just CRM operating very fast. The other side of this new world is that consumer behaviour, product and service offerings and other factors are changing so rapidly that your carefully gathered historical data, maybe just from last year, may already be largely irrelevant."

Beyond warehousing
In the past decade we have progressed from business analytics that was based on data warehouses and deep reporting or BI and insights into operations and customers and markets that was all about the past, says Paul Pierotti, senior analytics manager of Accenture and based in its new Analytics Innovation Centre in Ireland. With 25 jobs already filled it is on target for a total team of 100 next year. "Now the preoccupation is with predictive analytics, knowing at least broadly where the market and the customers will be at tomorrow and adjusting the enterprise and its offerings to maximise the business gains from that advance knowledge.

"On a narrower front, analysis to monitor and predict specific fraud potential has become very sophisticated and is being applied in sectors from banking and insurance to online gaming. From a boardroom perspective, smart analytical applications to manage risk and ensure compliance have become an investment priority."

"We are seeing massive demand from clients globally, multinationals and smaller enterprises that are just ambitious and smart, also from governments and agencies. Yes, it’s the obvious human lure of seeing into the future but the point is that we now have many practical and proven tools to do that," Pierotti says. "Modern analytics is a blend of mathematics and statistics, clever algorithms and computing power. It is by no means all about magic computing machines, despite some of the marketing hype from vendors. Serious human professional skills are key, which is why we are recruiting the best people we can find, because the problems and challenges are different in every organisation, much less between sectors, markets and cultures."

There is no one size to fit all in data analytics, he emphasises, because the analysis itself has to be designed to fit the case. Even after that, systems have to be refined and adjusted to incorporate what has been learned from previous stages. "At the heart of most business analytics we are trying to assess, measure and predict risks and possible alternative scenarios. Probability is the key, tempered by data inputs from actual experience or external factors of known value."
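One common way to read "probability tempered by data inputs from actual experience" is as Bayesian updating: start from a prior estimate of a risk and revise it as observations arrive. Pierotti does not name a method, so the sketch below is only an illustration of that reading. It uses a standard Beta-Binomial update, and the fraud-rate figures and the names `update_beta` and `beta_mean` are hypothetical.

```python
def update_beta(prior_alpha, prior_beta, events, non_events):
    """Combine a Beta(prior_alpha, prior_beta) belief with observed counts."""
    return prior_alpha + events, prior_beta + non_events


def beta_mean(alpha, beta):
    """Expected event rate under a Beta(alpha, beta) belief."""
    return alpha / (alpha + beta)


# Hypothetical prior: roughly a 2% fraud rate, held with the weight of
# about 100 earlier transactions (alpha=2, beta=98).
PRIOR = (2, 98)

# Hypothetical new experience: 5 fraudulent transactions out of 100.
POSTERIOR = update_beta(*PRIOR, events=5, non_events=95)

# The revised estimate sits between the prior (2%) and the new data (5%).
ESTIMATE = beta_mean(*POSTERIOR)
```

With these invented numbers the estimate moves from 2% to 3.5%. The size of the prior counts controls how quickly fresh experience overrides earlier belief, which is a design decision for each organisation rather than a fixed rule.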

Knowledge hunger
Tom Khabaza is an independent consultant with more than 20 years’ experience in data analytics who works regularly with Presidion, formerly SPSS. He is quite clear that the topic is front and centre in today’s society, as well as business, because there has been a pent-up hunger for the value and accuracy of logical analysis in many spheres. "Now we have the IT power for that hunger to begin to be satisfied. Big business, for example, has grown beyond the ability of the owner-proprietor to know all the customers. So in a very real sense business analytics today has the power to become the corporate brain, to monitor and remember and even ‘learn’ what is required and how to make good decisions. Point of sale and CRM and SCM and other systems are the eyes and ears of the enterprise at the touch points with customers and partners. The business can be modelled and then all of its activity and transactions serve to enrich the model over time. That should, in theory, foster greater efficiency and decision making and all the rest."

He is firm also on what applied analysis is for: "To my mind there is only one value, business value. The validity of an analytical model is in its usefulness, remembering that we are talking business knowledge, not statistical certainty. If your analysis or modelling comes up with something mad, you will examine it in the light of business knowledge and experience. There is a strong link between today’s analytics and automated decision making, but it is a link not a process."

Khabaza differentiates between BI and other types of business analytics based on existing data and processes. Data mining and attempts at predictive analysis are different, looking to achieve a kind of corporate perception. "Data of its nature, and especially Big Data, is just beyond the capacity of the individual human brain to grasp in totality, to see patterns and anomalies. So the analytical tools are serving to amplify and expand the cognitive capacity of humans, meaning organisations in this context. As always, the objective is to improve decision making."


TechCentral.ie