Architecting Modern Data Platforms

As organisations struggle to capture and leverage multitudes of data, there is a surge of technological options to choose from. Well-designed data platforms facilitate experimentation, shorten time to market, adapt quickly to the latest advancements in data technologies and promote self-service, thereby accelerating data adoption. With data being the key enabler of business transformation, it is vital to build platforms that accelerate validation of use cases and can handle scaling of use cases and users. Designing a platform elastic enough to embody all of the above can be quite a daunting task.


The primary points to consider when architecting modern data platforms:

  • Customer-centric

Despite the heavy emphasis on hyper-personalization, organisations battle immensely with legacy data technologies to deliver personalization and customer experience. Thinking along the lines of creating a 360° customer view helps align technology choices with business pain points.

  • Cloud Native

Cloud solutions support elastic scaling, high availability and secure fully managed services, with integration to a range of enterprise security systems including LDAP, Active Directory, Kerberos and SAML. Cloud solutions allow a pluggable architecture – components can be replaced when better options become available, with minimal rework. Cloud platforms eliminate the time-consuming work of provisioning resources and infrastructure, thereby reducing time to market.

  • Multi-platform architectures

Be it multi-cloud or multiple data storage patterns, it should be the use cases that dictate the architectural patterns and not vice versa. Data warehouses, data lakes and NoSQL databases can all co-exist on multi-cloud platforms if the use cases demand it. Organisations should avoid platform/vendor lock-in, which forces technology choices that are not in the best interests of the company.

  • Microservice-enabled

It is critical to envision data as more than a means for visualization or a diagnostic tool: data helps organizations adapt to change in evolving business environments and to innovate, and every company wants to be the first to market with innovative products and services. Monolithic applications are a major bottleneck here. In a microservices-based design, small decoupled services are developed completely independently of each other to meet business requirements faster, typically communicating through REST APIs or event streams.
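In such a design, services react to published events rather than calling each other directly. A minimal in-process sketch of that decoupling (service names, topics and fields are hypothetical, not from any particular platform):

```python
from collections import defaultdict

class EventBus:
    """Minimal in-process publish/subscribe bus, for illustration only."""
    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, event):
        for handler in self._subscribers[topic]:
            handler(event)

bus = EventBus()
shipments = []

# The "shipping" service reacts to order events without knowing who emits them.
bus.subscribe("order.placed", lambda e: shipments.append(e["order_id"]))

# The "order" service publishes an event instead of calling shipping directly.
bus.publish("order.placed", {"order_id": 42, "sku": "ABC-1"})
print(shipments)  # → [42]
```

In a real deployment the bus would be a message broker or event stream rather than an in-memory dictionary, but the decoupling principle is the same: either service can be replaced independently.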

  • Flexible

Modern data platforms should be flexible enough to accommodate rapidly evolving business requirements, be it integrating new data sources or feeding data into future data products. They should also simplify testing new ideas on a small scale prior to making heavy investments in infrastructure.

Modernization continues to be a strong trend in data platforms, whether on Hadoop or RDBMS or multi-tenant solutions. It is the ease of integrating new data sources, TCO, prototyping functionalities, security and scaling that matter most in modern platform architectures.

 

Intelligence Of Things


IoT – the Internet of Things – is the science of an interconnected everyday life: devices communicating over WiFi, cellular, ZigBee, Bluetooth and other wireless and wired protocols, RFID (radio frequency identification), sensors and smartphones. Data monetization has led to generating revenue by gathering and analyzing customer data, industrial data, web logs from traditional IT systems, online streams, mobile devices and sensors – and an interconnection of them all, in other words, IoT. IoT is hailed as the new way to transform education, retail, customer care, logistics, supply chain and health care. IoT and data monetization have a domino effect on each other, generating actionable insights for business metrics, transformation and further innovation.

Wearable devices are a great way to keep tabs on patient heart rates, step counts, and calories consumed and burnt. The data gathered from such devices is not only useful for checking vital signs but can also be used to scrutinize the effectiveness of drug trials, analyzing why the body reacts to different stimuli the way it does. In logistics, IoT – reading bar codes at every touch point to track the delivery of products, comparing estimated with actual delivery times and analyzing the reasons for the difference – can help businesses build better processes. In smart buildings, HVAC (heating, ventilation, air conditioning), electric meter and security alarm data are integrated and analyzed to monitor building security, improve operational efficiency, reduce energy consumption and improve occupant experience.
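The estimated-versus-actual delivery comparison described above can be sketched in a few lines; the shipment IDs and timestamps below are hypothetical:

```python
from datetime import datetime

# Hypothetical scan records: (shipment_id, estimated delivery, actual delivery).
deliveries = [
    ("S1", "2024-05-01 10:00", "2024-05-01 12:30"),
    ("S2", "2024-05-01 09:00", "2024-05-01 08:45"),
]

fmt = "%Y-%m-%d %H:%M"
# Delay in minutes; negative means the shipment arrived early.
delays = {
    sid: (datetime.strptime(actual, fmt) - datetime.strptime(est, fmt)).total_seconds() / 60
    for sid, est, actual in deliveries
}
late = {sid: m for sid, m in delays.items() if m > 0}
print(late)  # → {'S1': 150.0}
```

Aggregating such per-shipment delays by route, carrier or warehouse is what points to the process worth fixing.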

IoT is expected to generate large amounts of data from varied sources at high volume and very high velocity, thereby increasing the need to better index, store and process such data. Earlier, the data gathered from each source was analyzed in a central hub and communicated to other devices, but IoT brings a new dimension: M2M (machine-to-machine) communication. The highlights of such M2M platforms are:

  • Improved device connectivity
  • API, JSON, RDF/XML integration availability for data exchange
  • Flexible to be able to capture all formats of data
  • Data Scalability
  • Data security across multiple protocols
  • Real-time data management – On premise, cloud or hybrid platforms
  • Low TCO (total cost of ownership)
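A JSON data-exchange message between device and platform, of the kind listed above, might look like this sketch (the device ID, field names and gateway tag are hypothetical):

```python
import json

# Hypothetical device payload as it might arrive over an M2M API.
raw = '{"device_id": "thermo-7", "ts": "2024-05-01T10:00:00Z", "temperature_c": 21.4}'

message = json.loads(raw)  # decode on the platform side

# Enrich before forwarding, e.g. tag which gateway ingested the reading.
message["gateway"] = "gw-eu-1"
print(json.dumps(message, sort_keys=True))
```

Because JSON is self-describing and schema-light, the same pipeline can capture readings from heterogeneous devices, which is what the "flexible to capture all formats" requirement amounts to in practice.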

The data flow for an end-to-end IoT use case entails capturing sensor-based data – using SPARQL for RDF-encoded data – from different devices and wearables into a common data platform, where it is standardised, processed, analyzed and communicated further as dashboards, insights or input to another device, fuelling continuous business growth and transformation. Splunk, Amazon and Axeda are some of the M2M platform vendors that provide end-to-end connectivity of multiple devices, data security, and real-time data storage and mining. Data security, including adherence to data retention policies, is another important aspect of IoT. As IoT evolves, so will the interconnectivity of machine-to-machine platforms – exciting times ahead!
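SPARQL queries over RDF data work by pattern-matching (subject, predicate, object) triples. A stdlib-only sketch of that matching idea, with hypothetical triples standing in for sensor metadata:

```python
# A tiny RDF-style triple store; SPARQL pattern-matches (subject, predicate,
# object) triples in much the same way as this simplified sketch.
triples = [
    ("wearable:1", "reports", "heartRate"),
    ("wearable:1", "wornBy", "patient:42"),
    ("wearable:2", "reports", "stepCount"),
]

def match(pattern):
    """None acts as a variable, like ?x in a SPARQL triple pattern."""
    return [t for t in triples
            if all(p is None or p == v for p, v in zip(pattern, t))]

# Roughly: SELECT ?s WHERE { ?s reports heartRate }
print(match((None, "reports", "heartRate")))
```

A real deployment would use an RDF store with a SPARQL endpoint rather than an in-memory list; this only illustrates the query model.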

The data value chain

The Consumer Lifecycle

The terms “data driven” and “Big Data” are the buzzwords of today – hyped, definitely, but the implications and potential are real and huge! Tapping into the enormous amount of data and associating data from multiple sources creates a data chain that proves valuable for any organisation. Creating a data value chain consists of four parts: collection, storage, analysis and implementation. With data storage getting cheaper, the volume and variety of data available to be exploited is increasing exponentially. But storing data does not help unless businesses ask the right questions, understand the value the data brings and are sufficiently informed to make the right decisions. For example, in marketing, organisations can gather data from multiple sources: about acquiring a customer, the customer’s purchasing behaviour, customer feedback on different social media, the company’s inventory and the logistics of product delivery. Analyzing this stored data can lead to a substantial number of customers being retained.

A few of the actionable insights can be as follows:
  • Improving SEO (search engine optimization), increasing the visibility of the product site and attracting more customers
  • CRO (conversion rate optimization), i.e. converting prospects into sales by analyzing the sales funnel. A typical sales funnel is: home page > search results page > product page > proposal generation and delivery > negotiation > checkout
  • Better inventory control systems, resulting in faster deliveries
  • Predicting products that a consumer might be interested in, from the vast inventory, by implementing good recommendation algorithms that scan through the consumer behaviour and can predict their preferences
  • If some of the above points are taken care of, customer loyalty can increase manifold, based on the overall experience during the entire consumer lifecycle.
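Step-to-step conversion rates along such a funnel can be computed directly from visitor counts; the counts below are hypothetical, and the funnel is abbreviated:

```python
# Hypothetical visitor counts at each step of a simplified sales funnel.
funnel = [
    ("home page", 10000),
    ("search results page", 6000),
    ("product page", 3000),
    ("checkout", 600),
]

# Step-to-step conversion rates reveal where prospects drop off.
rates = [(step, n / n_prev)
         for (prev, n_prev), (step, n) in zip(funnel, funnel[1:])]
for step, r in rates:
    print(f"reached {step}: {r:.0%} of previous step")
```

Here the sharpest drop is at checkout (20% of product-page visitors), which tells the CRO effort where to focus first.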
Data blending which leads to a Single Customer View and Actionable Insights

Often the focus lies on the Big Data technology rather than the business value of implementing big data projects. Data is revolutionising the way we do business, and organisations today are inundated with it. To make sense of the data and create a value chain, there has to be a starting point, and the customer is a good one. The customer’s lifecycle, with experiences at every touch point, defines business growth, innovation and product development. Big data implementations allow blending data from multiple sources, leading to a holistic single view of the customer, which in turn gives rise to enlightening insights. Data pertaining to the customer from multiple sources – CRM, ERP, order management, logistics, social media, cookie trackers, click traffic etc. – should be stored, blended and analysed to gain useful actionable insights.

In order to store this gigantic amount of data, organisations have to invest in robust big data technologies. Earlier BI technologies do not support the new forms of data sources, such as unstructured data, nor the huge volume, variety and velocity of data. A big data architecture consists of integration from the data sources, a data storage layer and a data processing layer where data exploration can be performed, optionally topped with a data visualization layer. Both structured and unstructured data from various sources can be ingested into the big data platform using Apache Sqoop or Apache Flume; real-time interactive analyses can then be performed on massive data sets stored in HDFS or HBase using SQL with Impala or Hive, or using a statistical programming language such as R. There are very good visualization tools, such as Pentaho, Datameer and Jaspersoft, that can be integrated into the Hadoop ecosystem for visual insights. Organisations can offload expensive data warehouses to low-cost, high-storage enterprise big data technology.
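Impala and Hive expose SQL over data stored in HDFS; the sketch below uses Python's built-in sqlite3 purely to illustrate the style of aggregation query one would run there (the table and its rows are hypothetical):

```python
import sqlite3

# sqlite3 stands in for Hive/Impala here; the SQL idiom is the point,
# not the engine. Table and data are made up for illustration.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE clicks (customer TEXT, page TEXT)")
con.executemany("INSERT INTO clicks VALUES (?, ?)",
                [("alice", "home"), ("alice", "product"),
                 ("bob", "home")])

# Aggregate click traffic per page, the kind of query a BI layer issues.
rows = con.execute(
    "SELECT page, COUNT(*) AS visits FROM clicks "
    "GROUP BY page ORDER BY visits DESC").fetchall()
print(rows)  # → [('home', 2), ('product', 1)]
```

On Hive or Impala the same statement would run over tables backed by files in HDFS, which is what makes SQL skills transferable to the big data stack.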

Edited image from Hortonworks

Irrespective of the technical implementation, business metrics such as increasing revenue, reducing operational costs and improving customer experience should always be kept in mind. The manner in which the data is analyzed can create new business opportunities and transform businesses. Data is an asset, and investing in a value chain – gathering, analyzing, implementing, analyzing the implementations and evolving continuously – will result in huge business gains.

Streamlining the process of processing

Customer expectations are very different now. Decisions need to be taken in real time to convert a prospective customer into a committed one. In an age where customers seek instant gratification, organisations with a long time to market due to cumbersome internal processes find customer loyalty hard to win. For example, when a customer visits your physical store, offering a discount at the very first visit raises the chances that the customer will revisit. On the other hand, merely noting customer behaviour and pushing it through unwieldy processes before meting out a discount coupon means the second visit, if it happens at all, may never come. Advanced analytics systems can now handle data influx from multiple disparate systems, cleanse it and house it in DMPs (data management platforms), ready to be queried in real time to deliver predictive and actionable insights on the fly.

However, if the business methodologies used do not complement this speed of data processing, the business will still suffer. The widely used Lean methodology preaches creating more value for customers with fewer resources: anything that does not yield value should be eliminated. But organisations need to adopt only the best of the best practices; following methodologies by the book, on the contrary, causes bottlenecks. To leverage more out of business analytics systems and solutions, both the processes and the tools need to be streamlined to create customer satisfaction. Many business intelligence projects take too long to deliver and are inflexible, resulting in functional business teams procuring BI tools that promise quick wins. The problem with such data discovery tools, apart from creating data silos, is that they lack data governance, hinder data sharing at an enterprise level and increase licensing costs.

It is not a solution to have no business process at all. There needs to be accountability, and that comes from business processes. Finding the right balance between processes and the speed of delivering value – keeping costs low and increasing profitability – is a continuous, iterative exercise. One size does not fit all, and that applies to organisations as well: methodologies and processes need to be tweaked, tuned and tailor-made for each company. Organisations that try to implement Lean/Agile/Scrum often fail because they lose customer focus; some companies have no clear strategy in place, assign employees foggy responsibilities and communicate poorly, and the focus shifts from the task at hand to the nitty-gritty of such project management methods.

To avoid these pitfalls, a clear business strategy needs to be defined, specifying business goals in order to maximise gains. The next step is to streamline all the processes that lead to this gain.