Whether you are a data engineer, a data scientist or part of a general data team, how do you go from a high-level goal to a final data product? What process do you follow to create a product that provides insights and drives action?
Just to clarify, a data product can be a model, a dashboard, a web app/API or even a simple Excel output. But it all has to start somewhere.
Note: In this series we will walk through creating a data product, based on a company that has both online and in-person sales. We will be creating videos, posts and designs, as in an actual project, and going through the SQL, Python and other possible steps.
Let’s say we are a company like Ikea, Bernhardt or Dorel. A furniture company in the modern era would probably have an e-commerce website as well as in-person locations that sell furniture.
A top-level director or VP at this company might decide they want to increase sales of product category X or Y, improve sales in a specific region, improve profit margins, reduce costs, etc. This is where the ball starts rolling for metric and KPI development.
High Level Goals
The problem with metrics is that you can track almost anything, down to very specific granularities. But this doesn’t always provide value.
To provide value, you first need to define what is important to track, and to know what to track you need a high-level goal for what you are trying to improve. That is why the first step in the process is deciding which metrics will help track and support the business goals.
To create and track these metrics, the data team needs a general understanding of what the business team is trying to do.
Nothing needs to be set in stone yet, but it is good for the stakeholders to know what is important and what strategies they are looking to put into place. In addition, having a general goal like “we want to increase the average sale per person by 5%” is good because it makes it easier to foresee which strategies might work (and it also makes it easier to point out when a strategy has failed).
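To make a goal like that checkable, here is a minimal Python sketch. It assumes "average sale per person" means total revenue divided by the number of paying customers in a period; the function names and all the figures are hypothetical and only there for illustration.

```python
def average_sale_per_person(total_revenue: float, customers: int) -> float:
    """Average sale per person: revenue divided by the number of paying customers."""
    if customers <= 0:
        raise ValueError("customers must be positive")
    return total_revenue / customers


def goal_met(baseline: float, current: float, target_lift: float = 0.05) -> bool:
    """Return True when current is at least baseline * (1 + target_lift)."""
    return current >= baseline * (1 + target_lift)


# Hypothetical figures, invented for illustration only.
baseline = average_sale_per_person(total_revenue=120_000.0, customers=400)
current = average_sale_per_person(total_revenue=133_000.0, customers=420)
lift = current / baseline - 1  # relative change versus the baseline period

print(f"baseline={baseline:.2f} current={current:.2f} lift={lift:.1%}")
print("5% goal met:", goal_met(baseline, current))
```

Writing the goal down this precisely forces the business and data teams to agree on the baseline period and on who counts as a "person" before any dashboard gets built.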
So step one is making sure these goals are written out somewhere as a proposal. It doesn’t have to be more than a page. It just needs to clearly state the background of the goal, the why, etc. This will help the data team decipher what the business team is looking for.
We pointed this out earlier, but it is important for the business team to have a clear idea of what they are trying to do because there are so many metrics that could be tracked.
Without that clarity, a few issues can arise. A good data team won’t start work until you have provided more context. A less experienced team, however, might try to do the work anyway and struggle to create anything of value. They might spend 6 months creating a product that has too many metrics or the wrong metrics, or they might never finish, because it is really hard to make forward progress when you don’t know where you are going.
This is why it is important in step one for the business team to have a clear ask. To show some of the metrics that could be tracked, we have listed a few below. They all could provide value, but which ones matter depends on the overall goal.
There are a lot of possible metrics out there. Many have strange abbreviations and different rules, yet they somehow track very similar things from different angles.
Let’s go over a few that the furniture store might be interested in using.
Average Revenue Per User (ARPU): This is a pretty standard metric. However, if we add some form of ad campaign, there is another angle to consider: when the person was exposed to the ad. We will discuss how to analyze this in a later post.
Average Purchases Per Week, Month, Year Per Store: This metric can be tricky. It depends heavily on how much your average product costs and on the types of up-sell opportunities available. For instance, a car is very expensive, so very few people are going to buy two at once. However, people may be willing to spend an extra $1,000 to include a DVD player or a specific set of rims. On Amazon, by contrast, you are usually buying products in the $20 to $200 range, which makes it easy to bundle an extra book or two, or a pan when you buy a spatula.
Customer Acquisition Cost (CAC): This metric measures the cost of acquiring a new customer. It is important because one goal a company might have is to reduce the overall CAC. Depending on how this is approached, it can tell you whether ads are getting more effective or possibly whether products are getting better. That depends on what the company did prior to the shift in CAC.
Number Of New Visitors (i.e. Acquisition): This is a whole number and not a difficult metric to calculate.
Percent Of New Users Compared To Base: Compared to the new visitors metric, viewing growth as a percentage gives you a fairer picture of how your growth is going.
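As a rough illustration of how a few of the metrics above could be computed, here is a small Python sketch. The article does not give formulas, so the definitions used here are common conventions taken as assumptions: ARPU as revenue divided by active users, CAC as acquisition spend divided by new customers, and percent of new users as new users divided by the existing base. The `PeriodStats` class, its field names and the sample numbers are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PeriodStats:
    """Hypothetical totals for one reporting period (field names are assumptions)."""
    revenue: float            # total revenue in the period
    active_users: int         # users who were active in the period
    acquisition_spend: float  # ad and marketing spend aimed at acquisition
    new_customers: int        # customers acquired in the period
    new_users: int            # first-time users or visitors in the period
    base_users: int           # existing user base at the start of the period


def safe_div(numerator: float, denominator: float) -> Optional[float]:
    """Divide, returning None instead of raising when the denominator is zero."""
    return numerator / denominator if denominator else None


def arpu(stats: PeriodStats) -> Optional[float]:
    """Average revenue per user (assumed: revenue / active users)."""
    return safe_div(stats.revenue, stats.active_users)


def cac(stats: PeriodStats) -> Optional[float]:
    """Customer acquisition cost (assumed: acquisition spend / new customers)."""
    return safe_div(stats.acquisition_spend, stats.new_customers)


def pct_new_users(stats: PeriodStats) -> Optional[float]:
    """New users as a percentage of the existing base."""
    return safe_div(100 * stats.new_users, stats.base_users)


# Sample numbers invented for illustration only.
stats = PeriodStats(
    revenue=50_000.0,
    active_users=2_000,
    acquisition_spend=6_000.0,
    new_customers=150,
    new_users=500,
    base_users=10_000,
)

print("ARPU:", arpu(stats))
print("CAC:", cac(stats))
print("New users (count):", stats.new_users)
print("New users (% of base):", pct_new_users(stats))
```

Returning `None` for a zero denominator is a design choice for this sketch; a real pipeline might prefer to raise an error or store a null, depending on how the downstream dashboard handles missing values.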
These are just a few examples of metrics and we will continue to walk through this process.
In our next article we will start to discuss taking data from an operational database and moving it into a data warehouse. This is a key step in analyzing data because it makes the data available to data engineers and analysts. We will discuss ETLs, data models, etc.
We are a team of data scientists and network engineers who want to help your functional teams reach their full potential!