I recently gave a webinar on Requirements for Analytics Projects and had a few interesting questions come up during it. I thought I’d share some of the Q&A here, since we didn’t get to cover them all in the webinar.
You can listen to the webinar here.
Here are some of the questions and answers.
Q: Is there a way of adding structure to unstructured data?
First off, most data is at least semi-structured. But regardless, though you can’t just put it in tables with rows and columns, you can use data management products to give it some structure. The software frameworks available allow you to store, analyze, order, and access unstructured data. I’m not an expert in these by any means, but Hadoop is the one I most commonly hear about.
Q: Privacy issues are important to my project: lots of people may not want to be followed closely everywhere they go online. How do you handle this?
I think this is probably pretty common, or will become more common soon. People are starting to realize how much data is being collected about them, and it’s alarming to some. There will also likely be more regulation around this. Either way, as a business analyst, you need to understand and specify your organization’s business rules around privacy. For example, you might have a rule that you strip personal information as part of storing the data, and you will need to specify exactly which data fields are stripped. Your organization can still look at trends in the data, though. You won’t know WHO the trends are about, but if you have some basic information about the type of people, then you can still look at trends relative to general characteristics (gender, age, etc.). Obviously this makes things like targeted marketing more difficult. Some companies are giving customers a choice – they get some benefits if they allow the company to know who they are and track what marketing is doing for those individuals.
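As a rough illustration of the kind of rule described above, here is a minimal Python sketch. The field names (`name`, `email`, `ip_address`, `age_band`, `page`) and the sample records are made up for the example; in a real project, the list of stripped fields would come from your organization's own business rules.

```python
from collections import Counter

# Hypothetical business rule: fields that identify a person are stripped before storage.
PERSONAL_FIELDS = {"name", "email", "ip_address"}


def strip_personal_fields(record, personal_fields=PERSONAL_FIELDS):
    """Return a copy of the record without the fields named in the rule."""
    return {key: value for key, value in record.items() if key not in personal_fields}


def count_by(records, *characteristics):
    """Count records for each combination of general characteristics."""
    return Counter(tuple(record[c] for c in characteristics) for record in records)


# Made-up sample events, as they might arrive before storage.
raw_events = [
    {"name": "Ann", "email": "ann@example.com", "ip_address": "203.0.113.5",
     "gender": "F", "age_band": "25-34", "page": "shoes"},
    {"name": "Bob", "email": "bob@example.com", "ip_address": "203.0.113.9",
     "gender": "M", "age_band": "25-34", "page": "shoes"},
    {"name": "Cy", "email": "cy@example.com", "ip_address": "203.0.113.12",
     "gender": "M", "age_band": "45-54", "page": "hats"},
]

# Apply the rule at storage time, then look at trends on what is left.
stored = [strip_personal_fields(event) for event in raw_events]
trends = count_by(stored, "age_band", "page")

for (age_band, page), visits in sorted(trends.items()):
    print(age_band, page, visits)
```

The stored records no longer say who visited, but they still support questions like "which age bands look at which pages". Writing the rule as an explicit list of fields is also a convenient way to capture it in a requirements document.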
Q: Experience with the company’s domain seems very important in order to structure data. How would you mitigate lack of this experience?
We actually find that good business analysts can ramp up on a domain relatively quickly, so I don’t think this is a major issue. Also, our work is not necessarily to structure the data as much as it is to understand what the data is and what the users want to do with it – and help bridge that gap. That said, for a data scientist who is deeply immersed in analyzing data, the answer might differ. Also, there is some research from Professor Dan Berry showing that, for requirements elicitation in general, a team that combines domain experts and domain ignorants (people new to the domain) is the most productive.
Q: How do you handle all the different data silos in a company?
The answer ultimately lies in data management and governance software that helps unify data sources. There are products that bring unstructured and structured data together in one source. So I think in the end, the IT organization has to be able to demonstrate the business case for putting data management software in place to bring siloed data together. The business value could be something like showing that knowing more in aggregate about customers lets you market to them in a way that leads to more revenue.
A recent example from one customer demonstrated the silo effect. One part of the organization was analyzing the data to determine some changes that needed to happen to the retail site. But the business analysts documenting requirements were far removed from those analytics results. They were told what features were needed by some executives, but no one was helping make the connection to the actual data results. I think in the near future we’ll see this link become more direct.
Q: What if the company you are working for doesn’t know what it wants out of all the data available? It’s not just for fun, they are curious, but where would you suggest business analytics start? With the past?
This is a great question, and it really represents how analytics efforts start. I would suggest getting a team of people together to collaboratively brainstorm ideas about what might be relevant in your company. Sometimes there are industry papers specific to a domain that might help guide you too. You can look at past data, but I think it’s more interesting to first think about what might be a useful outcome from analytics, then work backwards to what data you should look at or acquire, and then to what system changes would help implement those. Particularly if you are new to this and don’t know what is needed, you are well set up to take an agile approach to analytics – build a little at a time and grow it through evolution.