Unstructured Data and How to Analyze It!
Content creation and promotion can play a huge role in a company's success in getting its product out there. Think about Star Wars and Marvel. Both of these franchises are just as much commercials for their merchandise as they are plain high-quality content.
Companies post blogs, make movies, and even run Pinterest accounts. All of this produces customer responses and network reactions that can be analyzed, melded with existing data sets, and run through various predictive models to help a company better target users, produce promotional content, and alter products and services to be more in tune with the customer.
You can develop a machine learning model by finding value and relationships in all the different forms of data your content produces, segmenting your users and responders, and melding all your data together. In turn, your company can gain a lot more information than the standard balance sheet data provides (see picture above).
Change Words to Numbers
The machine learning community has produced a host of libraries that can simplify the way your team performs data analysis. In fact, Python has several libraries that let programmers with a high-level knowledge of data science and machine learning application design produce fast and meaningful analysis.
One great Python library that can process content data like blog posts, news articles, and social media posts is TextBlob. TextBlob has some great functions. Take a sentence like:
“Scary Monsters love to eat tasty, sweet apples”
You can use TextBlob to pull out the nouns and the words that were used to describe them.
How to use TextBlob to Analyze Text Data
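Here is a minimal sketch of one way to do that, not the exact code from our original example. It assumes part-of-speech tags in the (word, tag) shape that TextBlob's `.tags` property returns, using Penn Treebank codes. The helper names `nouns_with_descriptors` and `tag_text` are our own, and the example tags are written by hand, so a real tagger may label some words differently.

```python
def nouns_with_descriptors(tags):
    """Pair each noun with the adjectives that directly precede it.

    `tags` is a list of (word, part_of_speech) tuples with Penn Treebank
    codes, the same shape TextBlob's `.tags` property returns.
    """
    pairs = []
    pending_adjectives = []
    for word, pos in tags:
        if pos.startswith("JJ"):      # JJ, JJR, JJS are adjectives
            pending_adjectives.append(word)
        elif pos.startswith("NN"):    # NN, NNS, NNP, NNPS are nouns
            pairs.append((word, pending_adjectives))
            pending_adjectives = []
        else:                         # any other word ends the adjective run
            pending_adjectives = []
    return pairs


def tag_text(text):
    """Tag raw text with TextBlob (requires `pip install textblob` and its corpora)."""
    from textblob import TextBlob  # imported lazily so the helper above has no dependencies
    return TextBlob(text).tags


# Hand-written tags for the example sentence (an assumption, not real tagger output).
example_tags = [
    ("Scary", "JJ"), ("Monsters", "NNS"), ("love", "VBP"), ("to", "TO"),
    ("eat", "VB"), ("tasty", "JJ"), ("sweet", "JJ"), ("apples", "NNS"),
]
print(nouns_with_descriptors(example_tags))
# [('Monsters', ['Scary']), ('apples', ['tasty', 'sweet'])]
```

With TextBlob installed, `nouns_with_descriptors(tag_text(sentence))` runs the same pairing on real text. TextBlob also exposes `TextBlob(text).sentiment.polarity`, a score between -1.0 and 1.0, which is a handy numeric column to store next to each post.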
This takes data that is very unstructured and hard to analyze, and begins to create a more analysis-friendly data set. Other great uses of this library include projects such as chat bots.
From here, you can combine polarity, positivity, shares, and topic focus to see which types of social media posts, blog posts, etc. become the most viral.
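As a rough sketch of that combination step, the snippet below groups posts by topic and polarity bucket and ranks the groups by average shares. The posts, the field names, and the 0.1 neutrality threshold are all made up for illustration; in practice the polarity would come from a sentiment library and the shares from your social media exports.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical posts: polarity in [-1, 1], a topic label, and a share count.
posts = [
    {"polarity": 0.8, "topic": "product", "shares": 120},
    {"polarity": 0.6, "topic": "product", "shares": 80},
    {"polarity": -0.5, "topic": "product", "shares": 30},
    {"polarity": 0.7, "topic": "industry", "shares": 40},
    {"polarity": -0.4, "topic": "industry", "shares": 200},
    {"polarity": 0.0, "topic": "industry", "shares": 10},
]


def polarity_bucket(polarity, threshold=0.1):
    """Turn a polarity score into a coarse label (the threshold is an assumption)."""
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"


def average_shares_by_group(posts):
    """Return ((topic, bucket), average shares) pairs, most shared group first."""
    groups = defaultdict(list)
    for post in posts:
        key = (post["topic"], polarity_bucket(post["polarity"]))
        groups[key].append(post["shares"])
    averages = {key: mean(values) for key, values in groups.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


for (topic, bucket), avg in average_shares_by_group(posts):
    print(topic, bucket, avg)
```

A table like this is only a starting point, but it quickly shows whether, say, negative industry commentary travels further than upbeat product news in your own data.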
Another tool worth checking out is word2vec, which has implementations in Python, R, Java, etc. For instance, check out deeplearning4j.
Marketing Segmentation with Data Science
Social media makes once hard-to-get data, such as people's opinions on products, their likes, dislikes, gender, location, and job, much more accessible. Sometimes you may have to purchase it; other times, some sites are kind enough to let you take it freely.
In either case, this gives companies an open door to segmenting markets in much finer detail. This isn’t based on small surveys of only 1,000 people; we are talking about millions, even billions, of people. Yes, there is a lot more data scrubbing required. But there is an opportunity to segment individuals and use their networks to support your company's products.
One example is a tweet we once sent to SQL Server. They quickly responded. Now, based on the fact that we interacted with SQL Server and talk so much about data science and data, you can probably assume we are into technology, databases, etc. This is basically what Twitter, Facebook, Google, etc. do to place the right ads in front of you. They also combine cookies and other data sources like geolocation.
If you worked for Oracle, perhaps you would want us to see some posts about the benefits of switching to Oracle, or you would ask for our opinion on why someone prefers using SQL Server over Oracle (we personally have very little preference, as we have used both and find both useful). Whatever it may be, there are opportunities to swing customers. Now, what if your content was already placed in front of the right people? Maybe you tag a user, or ask them to help you out or join your campaign! Involve them, and see how you can help them.
For instance, bloggers are always looking for ways to get their content out there. If your company involves them, or partners with them in a transparent way, your product now has access to a specific network. Again, this is another great place where data science and basic statistics come into play.
If you haven’t tried tools like NodeXL, it is a great example of a tool for finding strong influencers in specific networks. This tool is pretty nifty. However, it is limited, so you might want to build some of your own.
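If you do build your own, a very small starting point is to count how many distinct accounts mention or share each account. This sketch uses in-degree, one of the simplest centrality measures; it is our illustration and not a description of how NodeXL scores influencers. The account names and edges are hypothetical.

```python
from collections import Counter

# Hypothetical (source, target) edges: source mentioned or shared target.
edges = [
    ("ana", "dataguru"), ("ben", "dataguru"), ("cy", "dataguru"),
    ("dataguru", "sqlfan"), ("ana", "sqlfan"), ("ben", "ana"),
]


def top_influencers(edges, n=3):
    """Rank accounts by in-degree: how many distinct accounts point at them."""
    unique_edges = set(edges)  # count each source -> target pair only once
    in_degree = Counter(target for _, target in unique_edges)
    return in_degree.most_common(n)


print(top_influencers(edges, n=2))
# [('dataguru', 3), ('sqlfan', 2)]
```

In-degree ignores who is doing the mentioning, so a natural next step is a measure that weights mentions from well-connected accounts more heavily.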
Using the data gathered from various sites, along with algorithms like k-nearest neighbors, PCA, etc., you can find the words used in the profiles, posts, and shares that the company's customers interact with. Then:
The list goes on. It may be better to start with NodeXL, just to see what you are looking for.
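To make the k-nearest neighbors idea concrete, here is a small sketch that assigns a new profile to a segment based on the words it shares with profiles you have already labeled. It uses plain word counts and cosine similarity; the profiles, segment names, and the choice of k are all assumptions for illustration, and PCA is not covered here.

```python
import math
from collections import Counter


def word_counts(text):
    """Lowercase the text and count whitespace-separated words."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_segment(profile_text, labeled_profiles, k=3):
    """Vote on a segment using the k labeled profiles most similar to the text."""
    target = word_counts(profile_text)
    ranked = sorted(
        labeled_profiles,
        key=lambda item: cosine_similarity(target, word_counts(item[0])),
        reverse=True,
    )
    votes = Counter(segment for _, segment in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical labeled profiles: (profile text, segment).
labeled_profiles = [
    ("data engineer sql databases", "tech"),
    ("python machine learning data", "tech"),
    ("sql server dba", "tech"),
    ("food blogger recipes baking", "food"),
    ("baking bread recipes", "food"),
    ("travel photography", "travel"),
]

print(knn_segment("sql data science", labeled_profiles))  # tech
```

Real profile text is much noisier than this, so expect to spend time on cleaning, stop words, and choosing k before the segments become useful.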
Now what is the value of doing all this analysis, data melding, and analytics?
ROI Of Content:
At the end of the day, you have plenty of questions to answer.
These aren’t the easiest questions to answer. However, here is where you can help turn the data from your social presence into value for your company:
Typical predictive analytics utilizes standard business data (balance sheet, payroll, CRM, and operational data). This limits companies to the “what” happened, and not the “why.” Managers will ask: why did the company see that spike in Q2? Or that dip in Q3? It is difficult to paint a picture when you are only looking at data that offers very little insight into the why. Simply computing a running average isn’t always enough, and adding seasonal factors is limited by domain knowledge.
However, data has grown, and now having access to the “why” is much more plausible. Everything from social media to CRMs to online news provides much better insight into why your customers are coming or going!
This data has a lot of noise, and it wouldn’t really be worth it for humans to go through it. This is where having an automated exploratory system will help out a lot.
Finding correlations between content, historical news, and internal company data would take analysts years. By the time they found any value, the moment would have passed.
Instead, having an automated correlation discovery system will save your company time and be much better at finding value. You can use this system to find those small correlating factors that have a big effect. Maybe your customers are telling you what is wrong with your product, and you just aren’t listening. Maybe you find a new product idea.
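A bare-bones version of such a system can be sketched in a few lines: compute the Pearson correlation for every pair of metrics and keep only the strong ones. This is a simplified illustration, not our production process. The metric names, the weekly values, and the 0.7 cutoff are hypothetical.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def discover_correlations(columns, min_abs_r=0.7):
    """Check every pair of columns and return the strongly correlated ones."""
    findings = []
    for (name_a, a), (name_b, b) in combinations(columns.items(), 2):
        r = pearson(a, b)
        if abs(r) >= min_abs_r:
            findings.append((name_a, name_b, round(r, 3)))
    # Strongest relationships first, whether positive or negative.
    return sorted(findings, key=lambda finding: abs(finding[2]), reverse=True)


# Hypothetical weekly metrics melded from content and internal data.
columns = {
    "blog_posts": [1, 2, 3, 4, 5, 6],
    "avg_polarity": [0.2, -0.1, 0.4, 0.0, 0.3, 0.1],
    "sales": [10, 12, 15, 17, 21, 22],
    "support_tickets": [30, 28, 25, 22, 18, 17],
}

for name_a, name_b, r in discover_correlations(columns):
    print(name_a, name_b, r)
```

With hundreds of columns, some pairs will correlate by pure chance, so treat each hit as a hypothesis to test rather than a conclusion.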
In the Acheron Analytics process, this would be part of our second and third phases. We always look for as many correlations as possible, and then develop hypotheses and prototypes that lead to company value.
This process allows companies to have data help define their next steps. It provides their managers with data-defended plans, ones that they can confidently take to their own managers.
When it comes to analyzing your company's content and marketing investments, techniques like machine learning, sentiment analysis, and segmentation can help develop data-driven marketing strategies.
We hope this inspired some ideas on how to meld your company’s data! Let us know if you have any questions.
We are a team of data scientists and network engineers who want to help your functional teams reach their full potential!