Data Drive Through: Phil Simon

Phil Simon is a speaker, journalist, and award-winning author. He has written four books on technology, business, and Big Data. His latest is Too Big to Ignore: The Business Case for Big Data.

In our latest Data Drive Thru we spoke to Phil about privacy, buzzwords, human intuition, and other hot topics around Big Data.

What role should instinct and intuition play in a company that wants to rely more on Big Data insights?

Well, it still has an important role. The era of Big Data doesn’t mean that judgement and intuition go away. But we no longer have to guess at things that we can potentially quantify.

I don’t know if Big Data will necessarily tell us what will be a big hit, or what will be a great product. Some things you still can’t quantify and I’m not sure that’s changing anytime soon.

But if you look at what some companies like Netflix are doing, they are bringing data into the decision. It isn’t just, “Is this a good series idea?” They’re saying, “Well, what do our users typically like?” And if enough of them like similar content they may drop $100 million, like they did on “House of Cards.”

So, there are ways to use data to sort of complement intuition. I don’t think data replaces or supplants intuition. The need for that will always exist.

Gregory Piatetsky-Shapiro and Christopher Mims both say that small companies don’t have Big Data problems. Do you agree?

Define big companies. Are we talking about amount of revenue, number of employees, public versus private?

In the book “Too Big to Ignore,” I write about two small companies that take advantage of Big Data.

Now, large organizations honestly by definition generate more data. Look, do I as a small business owner generate billions of transactions? I wish! So there is some truth to that. I’m not going to say they’re wrong.

But in the book I talk about companies like Explorys, which is a small Cleveland startup, and they do Big Data for healthcare and they’re dealing with petabytes of unstructured data. If that’s not Big Data then I don’t know what is. They’re not a big company. I think they have 130 employees.

How can an organization experiment with getting value out of Big Data before jumping in with both feet?

There are so many ways to get a “little bit pregnant” with Big Data. There are sites like TopCoder, InnoCentive, and Kaggle that will let you essentially put out a contest – “We got this data set; we don’t know what to do with it. So we’re offering $50,000 to someone who can come up with an algorithm that can solve this problem.”

Or you can do many different things with Big-Data-as-a-service companies. You don’t have to buy millions of dollars’ worth of hardware and software and hire a bunch of data scientists at $200,000 a pop. You really can experiment and see if it makes sense.

This isn’t 1998, when in order to, say, run an ERP or CRM system you would need to make multimillion-dollar investments. You really can get started very organically, very inexpensively.

Big Data means that companies can know more about their customers than ever before. Where do you see the notion of privacy headed in the new world of Big Data?

It’s a huge issue. In chapter 7 of the book I write about “Big Problems with Big Data,” specifically privacy and security.

And I wrote the book before the PRISM scandal broke, but that definitely shed light on what is a major issue in this world these days. No one ever thought that Barack Obama would ever use the word “metadata” in public. So it is an enormous issue.

I’d say that if you are a user of a product like Facebook or Google and you don’t pay them any money, then they are going to try to monetize you and your data. And if you don’t like it you don’t have to use Facebook or Google. There are other search engines.

There are private social networks like Diaspora that take privacy much more seriously than Facebook does. There are search engines like DuckDuckGo that, while not as well known as Google, have privacy baked right in.

So if you are just a user and you are taking advantage of these companies’ products and services I would argue that this doesn’t give them the right to post your information everywhere, but they are going to try to monetize that data.

So I think that it is important to understand what you’re getting into. But to me there’s a big difference between being a user and being a customer.

Once the world has mastered Big Data, what will become common that may seem extraordinary to us today?

Look what’s happening with wearable technology.

Look what’s happening with the Internet of Things. Right now our laptops, our tablets, our smartphones are connected to the internet.

What happens when your light bulbs are, too? What happens when your television or your refrigerator is?

That’s coming.

In the book I write about the Nest thermostat, which basically learns what you’re doing and how you like the temperature at different times of the day, times of the year, and days of the week.

We are going to see the arrival, sooner than you think, of the Internet of Things.

GE is spending billions of dollars creating the industrial Internet. There’s no reason why I won’t be able to, if I have a pool (and I don’t), activate the heater when I’m ten minutes from home so that in the winter I can jump in for a swim. There’s no reason that I have to do that when I get home.

With regard to wearable technology, we’re seeing the Nike FuelBand, we’re seeing Fitbit, we’re seeing people constantly quantifying themselves. So, when I think about what’s coming (I’m working on a new book on data visualization that will be out early next year), we’re going to be consistently interacting with data, and not just at work. More of our lives will be spent trying to interpret information.

So, that’s what I think is coming. My crystal ball isn’t perfect, but I’m not the only person who recognizes those three things are a big deal.
