Many, many years ago, in a galaxy far, far away, business owners simply looked at their computer systems as a way to ensure proper invoicing of their customers. The thought process was: I need to get away from pen and paper and ensure timely and accurate billing of my customers, so that I can control my cash flow. However, over the years, business owners began to realize that the underlying data behind these invoices was worth much more than the efficiencies gained by improved control of cash flow and billing – they started to understand that the data was one of the most important assets of their business. This data answered questions like: who is buying my products, when are they buying them, how much am I making on each sale, and is a product I’m selling really the best thing on the market or a dog that I’m wasting valuable cash to keep in inventory? Furthermore, it also shed light on other elements of the business, like how the sales staff is performing period over period and how often the inventory is turning over in the warehouse. Finally, it provided a mechanism to quantify sales lost due to not having the inventory in stock when the customer needs it. This data has real value!

In those early days (I’ve been in this game for over 30 years), computer speed was not great enough to ‘crunch’ this data into meaningful facts, disk storage was costly and limited, and many businesses couldn’t afford to maintain multiple years of sales history. Therefore, much of that data was purged at the end of the month or, at the very least, at the end of the year. Now let’s fast forward to today. Computer speeds have increased greatly. Hard drives can store years upon years of historical data, and businesses can’t afford NOT to maintain this history.

The questions now are: how much data is enough, who needs to see it, how fast do I need it, and what am I going to do with it?

Many businesses I visit these days are as excited about the different ways they’ll be able to cull, slice and dice their data as they are about the daily processing and workflows that we’ve architected. The questions surrounding who is going to be able to see what data – employees vs. vendors vs. ownership vs. customers – are all the rage.

  • Employees from purchasing to sales ownership – across the enterprise – need to see the data to plan their next selling cycle
  • Vendors want to see depletion data to understand how fast product is being sold off the shelves of their distribution channel so that they can make their plans
  • Ownership, well, they want to know if the business is growing and where they should be investing their money.

In many cases, the data is extended beyond the application and made accessible over the Internet via a web portal, so that end customers can not only enter orders but also view historical sales and invoices in real time. So, to answer the question of who – the answer is everyone.

The Value of a Data Warehouse

With all these groups of people looking to access the data, one has to wonder how much of a strain will be put on the database and how it will impact the company as it tries to process the daily business. We also have to think about the security that is needed to protect this data from theft and hacking. Although this blog will not delve deeply into the security side of the conversation, it is important to understand that the data needs to be protected and that we only want to expose the data that is relevant and tailored to those consuming it. We don’t want to give the proverbial keys to the kingdom to those who only need to see small snippets of data.

And now to the question of how much data and how fast do I need it. In terms of how much, the sky is both literally and figuratively the limit. With the confluence of the cloud and the extremely low cost of storage/disk space, you can store as much data as you need.

In terms of performance, it’s important to understand that there are two basic ways to serve your data – from a data warehouse or through live reads against the production database. The idea of a data warehouse is not as foreign as it might seem. Essentially, it is a storage location similar to the one where your business stores its inventory, but here the inventory is the DATA. This data is valuable and needs to be protected, and there is a cost to maintain and protect it. When people access your data in a data warehouse, they are only accessing the data you felt was important and necessary for them to see. It will be ‘cleansed’ so that it is precisely what the user needs – nothing more and nothing less. Typically, data gets into the warehouse via nightly transfers from your live database, run when the network load is at a minimum to limit the disruption to your system.
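To make the nightly transfer a little more concrete, here is a minimal sketch in Python using the built-in sqlite3 module. It is only an illustration under stated assumptions, not how any particular system does it: the table names (invoices, sales_history), the column names, and the cleansing rules (trim stray spaces, skip rows with no amount) are all hypothetical. The point is simply that only the fields you choose to expose ever reach the warehouse.

```python
import sqlite3


def nightly_load(live: sqlite3.Connection, warehouse: sqlite3.Connection) -> int:
    """Copy only the reporting-relevant invoice fields into the warehouse."""
    warehouse.execute(
        """CREATE TABLE IF NOT EXISTS sales_history (
               invoice_id INTEGER PRIMARY KEY,
               customer   TEXT,
               product    TEXT,
               sale_date  TEXT,
               amount     REAL
           )"""
    )
    # Select only the columns consumers should see; internal fields such as
    # unit_cost stay behind in the live database. TRIM and the NULL filter
    # stand in for whatever cleansing rules the business actually needs.
    rows = live.execute(
        "SELECT invoice_id, TRIM(customer), TRIM(product), sale_date, amount "
        "FROM invoices WHERE amount IS NOT NULL"
    ).fetchall()
    # INSERT OR REPLACE keeps the load safe to re-run for the same night.
    warehouse.executemany(
        "INSERT OR REPLACE INTO sales_history VALUES (?, ?, ?, ?, ?)", rows
    )
    warehouse.commit()
    return len(rows)


def demo():
    """Build a tiny hypothetical live database and load it into a warehouse."""
    live = sqlite3.connect(":memory:")
    live.execute(
        "CREATE TABLE invoices (invoice_id INTEGER PRIMARY KEY, customer TEXT, "
        "product TEXT, sale_date TEXT, amount REAL, unit_cost REAL)"
    )
    live.executemany(
        "INSERT INTO invoices VALUES (?, ?, ?, ?, ?, ?)",
        [
            (1, " Acme ", "Widget", "2024-01-05", 120.0, 80.0),
            (2, "Bolt Co", "Gadget", "2024-01-06", None, 40.0),  # incomplete row
            (3, "Acme", "Gadget", "2024-01-07", 75.5, 40.0),
        ],
    )
    warehouse = sqlite3.connect(":memory:")
    loaded = nightly_load(live, warehouse)
    columns = [c[1] for c in warehouse.execute("PRAGMA table_info(sales_history)")]
    return loaded, columns, warehouse
```

In a real deployment a job like this would be scheduled for the overnight window and pointed at your actual production and warehouse databases rather than in-memory ones.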

Therefore, when users access the data in the warehouse, it is already properly organized, and those accessing it will not impact your employees trying to run their daily functions. This is where speed comes in: a data warehouse may store multiple years of data, but that data is only touched when it is needed. This is a great way to keep your production database lean in terms of historical data – you draw upon the history in the data warehouse only when needed. However, it is important to note that one drawback of the data warehouse is latency – it is typically a day behind. Since the data was transferred the night before, it is not real time. In most cases, though (depending, of course, on the data and scenario), this delay does not keep the consumers of the data from getting the information they’re looking for. As I’ve already stated, the main advantage of the data warehouse is speed and efficiency.

On the other hand, in the absence of a data warehouse, when reporting is done through live reads against the production database, the impact on your system’s performance could be significant across the enterprise. Not only would your reporting consume valuable processing power, but your data would also not be staged and cleansed, which slows the time it takes to process the requested report. Further, users would be getting way more data than they need – they would have to wade through data fields they might not want or understand to find the data they are actually looking for. And if ever your production system needed to be updated, rebooted or repaired, the data would be inaccessible.

Given the choice between leveraging a data warehouse and performing live reads against the production database, I think that, for all of the reasons mentioned above, a data warehouse is a no-brainer. Once the data warehouse is in place, web pages can be written to pull the data when users log in to view it, emails can be created to push important information to users as needed, and reports can be generated to slice the data any which way without impacting your internal users. Isn’t that really the desired effect? Allowing others access to your data without impacting your daily business activities.
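As a rough illustration of what ‘slicing’ the warehouse might look like, the sketch below totals sales by customer and month. It is an assumption-laden example rather than a prescribed design: the sales_history table, its columns and the sample rows are hypothetical, and Python’s built-in sqlite3 module stands in for whatever warehouse platform you actually use. Note that the query never touches the production database.

```python
import sqlite3


def sales_by_customer_and_month(warehouse: sqlite3.Connection) -> list:
    """Total sales per customer per month (month = first 7 chars of an ISO date)."""
    return warehouse.execute(
        "SELECT customer, substr(sale_date, 1, 7), SUM(amount) "
        "FROM sales_history "
        "GROUP BY customer, substr(sale_date, 1, 7) "
        "ORDER BY 1, 2"
    ).fetchall()


def demo_report() -> list:
    """Run the report against a small, made-up warehouse table."""
    warehouse = sqlite3.connect(":memory:")
    warehouse.execute(
        "CREATE TABLE sales_history (invoice_id INTEGER PRIMARY KEY, "
        "customer TEXT, product TEXT, sale_date TEXT, amount REAL)"
    )
    warehouse.executemany(
        "INSERT INTO sales_history VALUES (?, ?, ?, ?, ?)",
        [
            (1, "Acme", "Widget", "2024-01-05", 120.0),
            (2, "Acme", "Gadget", "2024-01-07", 75.5),
            (3, "Acme", "Widget", "2024-02-02", 50.0),
            (4, "Bolt Co", "Gadget", "2024-01-09", 30.0),
        ],
    )
    return sales_by_customer_and_month(warehouse)
```

The same query could feed a web page, a scheduled email or a printed report; only the presentation changes, not the load on your daily operations.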

Lastly, it is important to work with a competent networking organization to properly secure this data with firewalls and security hardware and/or software, to ensure your data remains your data and is not exposed to your competitors. These types of vulnerabilities will only embolden the naysayers who will claim that things were better off in the old days, when everything was done on paper and nobody was able to see our data – and this couldn’t be further from the truth.

If you haven’t already thought about your data strategy, contact our team so that we can help you understand how you can make sense of your data – your business’s most important asset. Or, schedule a free infrastructure assessment.