Rethinking data platforms to enable digital transformation: Part 1
Evaluating the changing technology landscape
Organizations are enthusiastically transforming their operational processes and infrastructure with a razor-sharp focus on the customer at their core. The success of a business now depends on the rate and depth of analytics within the organization, as it strives to give customers a more intimate and targeted experience. The growing adoption of digital technology by consumers and businesses to buy and sell has convinced organizations to invest in appropriate data infrastructure and advanced analytics processes to cater to today's ever-evolving, digitally driven customer. At LatentView Analytics, we specialize in driving digital transformation for our clients by rethinking their data platforms and helping their internal business processes advance along the analytics adoption curve.
My colleague Ramesh Hariharan, in his series of posts on decoding the data science terrain, has explained how analytics is conducted, and how it should be conducted, in diverse organizational set-ups. A data science terrain, as we understand it through our years of industry experience, is a potpourri of data, analytics, infrastructure, security and services directed towards the betterment of internal processes, ultimately aiming at customer satisfaction. At the intersection of data, infrastructure and analytics lie modern data platforms which, when built for specific business needs and securely presented for adoption, can yield manifold returns compared with a traditional data warehousing set-up.
In this series of blog posts, I would like to address the various components and stakeholders involved in rethinking data platforms, and also dive into our experience of implementing modern data architectures at scale.
Identifying the weakest link
The traditional data warehouse set-up has accommodated conventional business use cases like reporting and data extraction. To power and support the diverse needs of different data user personas, the entire data architecture has to evolve and support multiple analytical workloads such as interactive, real-time and advanced analytics. Cloud and other newer technology paradigms have led to sharp growth in the adoption of data to power everyday business processes. While migrating from on-premises data warehouses to the cloud will eventually provide the estimated cost and efficiency benefits, it does not directly solve the core problems associated with engineering analytics. To build and provision a platform that holds the central ground truth and enables an accountable way of delivering analysis, one must ensure business continuity by providing the necessary support to data consumers.
In this regard, organizations can look at short-term solutions like fine-tuning the warehouse to accommodate diverse workloads. However, this will eventually lead to throttled resources and restricted access. A data analyst's typical concern is having sufficient horsepower and workspace to narrate a convincing business point of view, which warrants provisioning custom workspaces and additional capacity. Business users and managers want to explore data visually and analyze the relationships between metrics. As data warehouses are not naturally designed to support such requirements at scale, business users have relied heavily on third-party tools to serve themselves data and propel their analysis. Data scientists might want to build sophisticated, best-of-breed analytical models for various business use cases. As they pre-process and extract years of transactional and dimensional data, the warehouse naturally comes to a grinding halt. Data scientists are then forced to down-sample the data as much as possible, in a stratified way, and to benchmark their advanced analytics approaches to multi-dimensional, nonlinear business problems on that sample.
This creates an ecosystem where multiple versions of the truth are maintained, as data has to be pumped into multiple destinations for consumption by different business groups.
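To make the stratified down-sampling that data scientists fall back on more concrete, here is a minimal sketch in Python. It assumes proportional allocation, meaning the same fraction is drawn from every stratum; this is only one of several possible schemes. The function name `stratified_downsample`, the `segment` field and the sample transactions are all hypothetical and chosen purely for illustration.

```python
import random
from collections import defaultdict


def stratified_downsample(rows, key, fraction, seed=0):
    """Keep roughly `fraction` of the rows from each stratum defined by `key`."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    strata = defaultdict(list)
    for row in rows:
        strata[key(row)].append(row)

    sample = []
    for group in strata.values():
        # Keep at least one row per stratum so small segments do not vanish.
        k = max(1, round(len(group) * fraction))
        sample.extend(rng.sample(group, k))
    return sample


# Hypothetical transactions: 900 "retail" rows and 100 "enterprise" rows.
transactions = (
    [{"segment": "retail", "amount": i} for i in range(900)]
    + [{"segment": "enterprise", "amount": i} for i in range(100)]
)
sample = stratified_downsample(
    transactions, key=lambda r: r["segment"], fraction=0.1
)
```

The sample preserves the segment mix of the full data set, but it is still a separate copy of the data. That is exactly how an additional version of the truth comes into being.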
A bigger challenge in such data architectures is the lack of processes to benchmark, qualify and productionize advanced analytics models. To support the newer analytical workloads, the data ecosystem gets fractured. The result is no central ground truth, an inability to collaborate across different data ecosystems, constraints on productionizing advanced analytics solutions at scale and, above all, a lack of accountability and of a user-friendly way to consume data.
Why you need to rethink your data platforms
Any infrastructure that deals with analytics should be proactive and should be built to reduce the lead time to insights. As data flows through the platform, its systems and infrastructure should extract value from it. Being at the right place at the right time with the right focus makes all the difference. But fundamentally, you have to be data-hungry!
“It is a capital mistake to theorize before one has data. Insensibly, one begins to twist the facts to suit theories, instead of theories to suit facts.”
– Sherlock Holmes
The world is indeed one big data problem. Industries have certainly moved from data insufficiency to data inundation. This poses a huge concern for organizations building and evolving their analytics infrastructure. The most critical need is the ability to mine this data deluge to build customer-centric business processes.
What would it take to build an ideal data ecosystem that supports multiple analytical workloads without compromising on the central ground truth? What would the framework of a workload-aware data ecosystem look like? Which core problems will be solved, and which new challenges will arise, as we set out to implement a fundamental but significant change that embeds analytics at the core of all an organization's business processes? We will look to address these questions in our next blog post!