This article originally appeared on the Forbes website.
There have been a number of pain points uncovered by the Covid-19 pandemic. Across all sectors, businesses, researchers, governments, and citizens have come face-to-face with a hard reality: when push came to shove, the information and resources they needed in order to function properly weren’t guaranteed.
Data in particular has been a glaring problem. While there was no shortage of data producers eager to provide relevant Covid-19-related information to the public (WHO, Johns Hopkins, CDC), the suddenness and rapid evolution of the crisis meant that these data sources were changing daily, sometimes hourly. The WHO, ostensibly the primary source of truth for infection rates, released its information in daily situation reports. These PDFs were critical data points for understanding how the pandemic was changing over time, but they were released in a format that made them effectively impossible to use – it’s nice to see the information, but you can’t model, spot trends, or predict outcomes with PDFs.
Faced with this problem (lots of data available, very little of it accessible), organizations are making do with what they have, tying into data sources when possible, piecing together spreadsheets when not.
In the first four months of the pandemic, these primary sources of data were important in order to understand where Covid-19 was and where it was going. Now, as organizations look towards returning to “business-as-usual,” many people are considering how they can use external data more generally to help them navigate the uncertain economy. I recently connected with Bryan Smith, CEO of ThinkData Works, about how businesses can streamline their data infrastructure without creating a lot of additional overhead.
Gary Drenik: As we head into the sixth month of working from home, what do you think is the primary business objective for organizations that are trying to get back on track?
Bryan Smith: I think we’re going to see a dramatic shift in business priorities from innovation to optimization. Data science divisions have been overburdened for years and that needs to change, fast. The janitorial work of connecting to external data is a huge problem, and it’s burning 80% of a data scientist’s time. Optimizing this process should be priority one for any business that wants to become “data first.”
Drenik: There’s a lot of data out there. How do you define “external” data?
Smith: It’s a good question. Our company started out in the open data space, and over the years we’ve watched the conversation shift a lot. To us, external data is anything you didn’t create, whether it’s coming from government sources, other businesses, or Joan in accounting. If you didn’t make the data, it’s external, and you’re probably working harder than you should to get it. There’s an assumption that the enterprise has this figured out, but if you scratch the surface of most data pipelines, there’s a lot of painstaking and manual work being done to use even just a fraction of what’s available. That’s why we have teamed up with Prosper to bring a fresh perspective on high-quality external data that goes beyond stats of past behaviors and provides factual insights and data about how consumers are feeling, what they are doing, why they are doing it, and what they plan on doing in the future. This is especially timely in today’s world, where disruption seems to be the norm and has left many wondering what the future, now being shaped by new consumer behaviors, will look like.
Drenik: What role do you think external data plays for businesses that are trying to map out their game plan for next year?
Smith: Traditional models are going to have to be enhanced with new information. For many companies, finding these sources of data is still a big headache. For us, partnering with organizations like Prosper gives us and our network the ability to find new sources of data from a central clearinghouse, which lets them plug into analysis-ready consumer data for economic forecasting and predictive intelligence. We used to assume that most organizations had figured out a way to pull in data from statistical agencies like the Census Bureau and Statistics Canada, but what we’re hearing these days is that there’s a lot more manual work being done to gather this data than we previously thought. If you’re down in the mines trying to get basic demographic indicators from government sources, you probably don’t have the bandwidth to pull in signal-rich data from third-party data providers. The problem is that these days you really need both. Figuring out a scalable way to automate the flow of public data and connect to new sources of data should be top of mind for anyone who’s trying to get predictive.
Drenik: What do you say to a company that’s invested in data management solutions but isn’t seeing a big lift in overall analysis and insight?
Smith: You’ve solved one piece of the puzzle. Figuring out a way to get data into your environment is a big hurdle. A good ETL gives you a pipeline of raw information, but at the end of the day your data scientists are still managing a raw resource, and it’s a misuse of their time. Refining this data into decision-grade products lets them perform actual data science.
Drenik: What do you think is the biggest blind spot facing businesses that use external data? How should they overcome it?
Smith: I think there’s a big disconnect between business priorities and data science realities. At the end of the day, organizations need to stream more data, reduce overhead, and add confidence to the entire process. To do this, the first step is to understand where you’re at. Now is the right time to perform a data audit across all divisions to see what data you’re pulling in from where and eliminate as much overhead as possible. A lot of data providers have made a lot of money selling the same product to different divisions within the same company. As businesses tighten their budgets, the best way to free up some spend on new data is to eliminate the redundancies that have cropped up over time.
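As a rough illustration of the kind of audit described here, the sketch below groups a company's data subscriptions by provider and product and flags anything licensed by more than one division. It is a minimal sketch, not a description of any ThinkData Works tooling: the `Subscription` fields, the sample inventory, and the savings estimate (which assumes the single most expensive license could cover every division) are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Subscription:
    division: str       # team paying for the feed
    provider: str       # vendor name (hypothetical)
    product: str        # dataset or feed name (hypothetical)
    annual_cost: float  # what this division pays per year


def find_redundant(subscriptions):
    """Group by (provider, product); keep groups bought by more than one division."""
    groups = defaultdict(list)
    for sub in subscriptions:
        # Normalize names so "Acme Data" and "acme data " count as the same vendor.
        key = (sub.provider.strip().lower(), sub.product.strip().lower())
        groups[key].append(sub)
    return {
        key: subs
        for key, subs in groups.items()
        if len({s.division for s in subs}) > 1
    }


def potential_savings(redundant):
    """Upper-bound estimate: keep the priciest license per product, drop the rest.

    Assumes one license could serve every division, which may not hold in practice.
    """
    return sum(
        sum(s.annual_cost for s in subs) - max(s.annual_cost for s in subs)
        for subs in redundant.values()
    )


# Illustrative inventory; every name and figure here is made up.
INVENTORY = [
    Subscription("Marketing", "Acme Data", "Foot Traffic", 40000),
    Subscription("Finance", "acme data", "foot traffic ", 35000),
    Subscription("Risk", "Globex", "Credit Signals", 50000),
]

if __name__ == "__main__":
    duplicates = find_redundant(INVENTORY)
    for (provider, product), subs in duplicates.items():
        divisions = ", ".join(s.division for s in subs)
        print(f"{provider} / {product}: licensed by {divisions}")
    print("Potential savings:", potential_savings(duplicates))
```

In practice the inventory would come from procurement or contract records rather than a hard-coded list, and matching products across divisions usually needs more than name normalization.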
Drenik: How should businesses change their economic models to manage the fluctuations we’re seeing in the market?
Smith: It’s not about starting from square one but adding to what you already have. Obviously, data from the OECD and World Bank is still going to be useful, but what a lot of people are struggling with right now is the latency of these traditional sources. Getting GDP stats from July in September is fine as a baseline, but you need to also grab data that’s up to date. This is where getting data from third-party providers can give you a better overall picture of the economy. Products like the ones you’ve designed at Prosper – which help you understand behavioral trends, consumer sentiment, impulsivity – aren’t necessarily traditional signals, but they’re increasingly important right now.
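The latency problem mentioned here comes down to simple date arithmetic. The sketch below, which assumes each source can report the end date of its most recent reporting period, measures how old each source is and flags those past a chosen threshold. The source names, dates, and the 30-day threshold are illustrative assumptions, not figures from the interview.

```python
from datetime import date


def latency_days(period_end, as_of):
    """Days between the end of the period a dataset covers and the day it is used."""
    return (as_of - period_end).days


def stale_sources(sources, as_of, max_age_days):
    """Names of sources whose latest period ended more than max_age_days before as_of."""
    return [
        name
        for name, period_end in sources.items()
        if latency_days(period_end, as_of) > max_age_days
    ]


# Hypothetical catalog: source name -> end of the most recent period it covers.
SOURCES = {
    "gdp_monthly": date(2020, 7, 31),     # July figures, read in mid-September
    "consumer_survey": date(2020, 9, 1),  # a more recent third-party signal
}

if __name__ == "__main__":
    today = date(2020, 9, 15)
    for name, period_end in SOURCES.items():
        print(name, latency_days(period_end, today), "days old")
    print("Stale:", stale_sources(SOURCES, today, max_age_days=30))
```

A check like this does not replace the slower source; it only makes visible where a baseline needs a fresher signal alongside it.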
Drenik: Between CCPA and GDPR there are a lot of new regulations around data use. How can businesses ensure they’re being compliant when the landscape is changing?
Smith: Flexibility is important. Companies need to develop a strategy to manage and audit the flow of data through their environment, and this strategy needs to accommodate new regulations as they’re passed down. It’s not enough to have a good policy framework; you also need to back that up with infrastructure that supports the rules you’ve set up. Monitoring who’s accessing data, how it’s changing over time, and how it’s shared are all technical requirements wrapped in policy questions.
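To make those three monitoring requirements concrete, here is a minimal in-memory sketch of an audit log that records who touched a dataset, fingerprints its contents so changes can be detected, and tracks sharing events. The `AuditLog` class, its method names, and the action labels are hypothetical; a production system would persist these events and tie them to real identity and access controls.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone


def fingerprint(rows):
    """SHA-256 hash of the dataset's rows; a different hash means the content changed."""
    digest = hashlib.sha256()
    for row in rows:
        digest.update(repr(row).encode("utf-8"))
    return digest.hexdigest()


@dataclass
class AuditLog:
    events: list = field(default_factory=list)

    def record(self, user, dataset, action, rows=None):
        """Append one event; action is a free-form label such as "read", "update", "share"."""
        self.events.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "dataset": dataset,
            "action": action,
            "fingerprint": fingerprint(rows) if rows is not None else None,
        })

    def who_accessed(self, dataset):
        """Sorted list of distinct users with any recorded event on the dataset."""
        return sorted({e["user"] for e in self.events if e["dataset"] == dataset})

    def changed(self, dataset):
        """True if more than one distinct content fingerprint was recorded."""
        prints = {
            e["fingerprint"]
            for e in self.events
            if e["dataset"] == dataset and e["fingerprint"] is not None
        }
        return len(prints) > 1

    def shares(self, dataset):
        """All recorded sharing events for the dataset."""
        return [
            e for e in self.events
            if e["dataset"] == dataset and e["action"] == "share"
        ]


if __name__ == "__main__":
    log = AuditLog()
    log.record("ana", "census_extract", "read", rows=[("ON", 100)])
    log.record("bo", "census_extract", "update", rows=[("ON", 101)])
    log.record("ana", "census_extract", "share")
    print(log.who_accessed("census_extract"))
    print(log.changed("census_extract"))
    print(len(log.shares("census_extract")))
```

Keeping the log separate from the policy rules is the design point: when a regulation changes, the questions asked of the log change, while the record of events stays the same.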
Drenik: What’s the future of data in the enterprise?
Smith: Every company needs to become data first in order to survive. If you look at Amazon and Apple, who have always focused on data as a core feature of how they do business, it’s clear that the market will be defined by the organizations that figure out how to streamline the flow of data into their everyday processes. Optimizing external data use doesn’t sound as sexy as innovation, but it’s the prerequisite for unlocking the value of data for your business.
Drenik: Thanks, Bryan, for your astute insights on the value of quality external data in today’s data-centric world and the need for businesses to sync data science with business priorities in order to achieve organizational data success.
Complimentary Coronavirus/Covid-19 findings are available at AWS Data Exchange. To learn more, see Strategic Insights: Coronavirus Covid-19 Consumer.