The value of data

In a digital economy, data can be costly to acquire and structure. Ultimately, its value is set by the benefits that derive from data-driven predictions.

In a nutshell

  • Quality, use and volume make data a multifaceted class of goods
  • Data can play different roles in the digital economy
  • The value of data lies in how it can lead to better decision-making

“Data is the new oil,” goes the saying, but it is wrong. The economics of data is intricate. It is not the abundance of data that drives value; it is the benefit one can extract from data-driven predictions. Data as a good is multifaceted. And so is the economics of data.

Base good and complement

Much of the digital economy, especially artificial intelligence (AI), is about prediction. And prediction relies on data. As economists put it, data is a complement to prediction. Complements are goods that add value to one another. Usually, the base good is relatively cheap, and the complement good is relatively expensive. For example, printers are a base good and ink cartridges are their complement; the printer is cheap and the cartridge expensive. The maker of printers locks in customers with the relatively more affordable base good and makes its profits via the expensive complement. In the digital world, a free application is a base good and the paid in-app services are the complement.

The same economics apply to the digital economy at large, especially for AI. The AI making the prediction, the algorithm, is made relatively cheap, locking in users. The data used by this algorithm is valuable. The cheaper the algorithm as a base good becomes, the more the value of data as a complement increases. This trend is likely to continue. Programming the mathematical calculations that make up the algorithms will become more standardized, easier, and, therefore, cheaper. Getting the “right” data in the “right” way and using it “correctly” will increasingly become the differentiating factor and, therefore, more valuable.

Aspects of data value

While data can be a differentiator and value driver, not all data is the same. Quality, use and volume make data a multifaceted good – or rather, class of goods. Big data economics postulates that the more data there is, the better the results – for example, the predictions an algorithm can yield. However, the quality of data is equally important. The better the data, the better the predictions. Regarding data quality, architecture is critical. Data that has been correctly labeled and structured is more valuable than loose data points whose information content must be discovered and repackaged, often manually.

And then there is data usage. Data can play three different roles in the digital economy. When it comes to AI, data can be input data, namely data fed to an algorithm to make a prediction. When a user looks up directions from one place to another, AI uses maps as input data to calculate the route. But data can be training data, too, to make the AI good enough to predict the complexities of the real world. This kind of data is used to teach AI to select routes and predict arrival times.

Finally, data can be feedback data, used to improve the AI’s performance with experience. When someone decides to take a different route than the one suggested by the algorithm, this provides valuable feedback data that can enhance future calculations.

In some situations, considerable overlap between these data uses exists, such as when the same data plays all three roles. The more overlap, the better the data, since its structure and labeling enable the AI to manage its simultaneous usage more readily, focusing on learning, predicting and reacting to feedback.
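
The three roles can be made concrete with a deliberately tiny sketch. The class below is a hypothetical toy, not how any real routing service works: it assumes travel time is simply distance multiplied by a learned average of minutes per kilometre. The names `TravelTimeModel`, `train`, `predict` and `feedback` are illustrative assumptions.

```python
class TravelTimeModel:
    """Toy route-time predictor that learns average minutes per kilometre."""

    def __init__(self):
        self.total_km = 0.0
        self.total_minutes = 0.0

    def train(self, past_trips):
        """Training data: historical (distance_km, minutes) pairs."""
        for distance_km, minutes in past_trips:
            self._observe(distance_km, minutes)

    def predict(self, distance_km):
        """Input data: the distance of the route the user asks about."""
        if self.total_km == 0:
            raise ValueError("model has not seen any trips yet")
        return distance_km * self.total_minutes / self.total_km

    def feedback(self, distance_km, actual_minutes):
        """Feedback data: what actually happened on a completed trip."""
        self._observe(distance_km, actual_minutes)

    def _observe(self, distance_km, minutes):
        # Training and feedback share one well-structured record format,
        # which is what lets the same data play more than one role.
        self.total_km += distance_km
        self.total_minutes += minutes


model = TravelTimeModel()
model.train([(10, 20), (20, 40)])   # training data
estimate = model.predict(5)         # input data
model.feedback(10, 30)              # feedback data: the trip took longer
revised = model.predict(4)
```

Because training and feedback records have the same structure here, a single method can absorb both, which is a small-scale version of the overlap described above.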

Cost of data

Data can be costly to acquire and structure. Thus, the investment involves a trade-off between the benefit of more and better data and the acquisition cost. The optimal amount and quality of data depend on the benefits generated by AI-based prediction. Let us first look at the cost.

From the vantage point of economic theory, data as such has decreasing returns to scale. Adding a third data point to a second is much more valuable than adding a 100th to a 99th point. On the other hand, adding more and better data pushes up marginal costs. Incorporating the eight-millionth data point is more difficult or costly than adding the 14th. It is like learning one’s way in a new city: the first and second time one takes the bus, one learns a lot about the city’s layout and mass transit system. On the three-hundredth trip, it has become routine. Only vestigial new information is being acquired (decreasing returns). Or one would have to pay an incommensurate amount of attention to the minor details to learn something new (increasing marginal costs).
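
The trade-off between decreasing returns and increasing marginal costs can be sketched numerically. The functional forms and parameters below are assumptions chosen purely for illustration (a logarithmic benefit curve and a quadratic cost curve); the article does not specify any. The function names are hypothetical.

```python
import math


def total_benefit(n, scale=100.0):
    """Assumed benefit curve: grows with n, but ever more slowly."""
    return scale * math.log1p(n)


def total_cost(n, unit_cost=0.01, growth=0.0001):
    """Assumed cost curve: each extra data point costs a little more."""
    return unit_cost * n + growth * n * n


def marginal(f, n):
    """Change in f caused by adding the n-th data point."""
    return f(n) - f(n - 1)


def optimal_points(max_n=100_000):
    """Keep adding data points while the next one is worth at least its cost."""
    n = 0
    while n < max_n and marginal(total_benefit, n + 1) >= marginal(total_cost, n + 1):
        n += 1
    return n


# The third point adds more benefit than the 100th (decreasing returns).
third_vs_hundredth = marginal(total_benefit, 3) > marginal(total_benefit, 100)
# The eight-millionth point costs more than the 14th (increasing marginal cost).
late_vs_early = marginal(total_cost, 8_000_000) > marginal(total_cost, 14)
best_n = optimal_points()
```

Under these assumed curves, `best_n` is the point beyond which one more data point costs more than the benefit it adds. Different curves would move that point, but the logic of stopping where marginal benefit falls below marginal cost stays the same.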

The main cost drivers in acquiring and structuring data are opening channels for data collection and exchange, labeling, developing a flexible architecture for evaluating and using the different data points, and setting up, adapting and expanding the physical infrastructure that enables these activities. Even if individual data points can be acquired free of charge, the processes for extracting their informational value and taking advantage of them are not.

Benefits of data

Data is more than a cost factor in the digital economy at large, especially for AI. The benefits one can extract from data are the differentiating factor in business models and value drivers. Conceptualizing these benefits requires a shift of vantage point. The economic value of data cannot be measured by the investment required to acquire and maintain it. Neither can it be assessed by how well data technically fits the results of calculations. The critical economic insight about data benefits is that the value of data is a function of how it improves the value one gets from the algorithm using it. To use the example of predictions again: the value of better data is not in how much more precise a prediction is but in how the prediction improves the user’s choices.
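
A small decision example may help separate precision from value. The sketch below is an illustration under stated assumptions, not a method from the article: a user decides whether to carry an umbrella based on a rain forecast, and the loss figures (`umbrella_hassle`, `soaked_loss`) are hypothetical numbers.

```python
def best_action(p_rain, umbrella_hassle=1.0, soaked_loss=4.0):
    """Choose the action with the lower expected loss given a forecast."""
    # Carrying an umbrella always costs the hassle; going without one
    # costs soaked_loss only if it rains.
    if umbrella_hassle < p_rain * soaked_loss:
        return "umbrella"
    return "no umbrella"


def expected_loss(action, true_p_rain, umbrella_hassle=1.0, soaked_loss=4.0):
    """Loss the user actually expects to bear, given the true rain probability."""
    if action == "umbrella":
        return umbrella_hassle
    return true_p_rain * soaked_loss


def value_of_better_forecast(old_forecast, new_forecast, true_p_rain):
    """Expected loss avoided by acting on the new forecast instead of the old."""
    old_loss = expected_loss(best_action(old_forecast), true_p_rain)
    new_loss = expected_loss(best_action(new_forecast), true_p_rain)
    return old_loss - new_loss


# Forecast sharpened from 0.7 to the true 0.9: the user carries an umbrella
# either way, so the extra precision changes nothing.
no_gain = value_of_better_forecast(0.7, 0.9, true_p_rain=0.9)
# Forecast sharpened from 0.2 to the true 0.4: the user now takes the
# umbrella, and the same-sized gain in precision has real value.
some_gain = value_of_better_forecast(0.2, 0.4, true_p_rain=0.4)
```

Both forecasts become more accurate by the same margin, yet only the second changes what the user does. In this toy setup, that change in choice is the only source of value.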

Take internet search engines. For common queries, most of them yield the same results. At the time of this writing, Google, DuckDuckGo and Bing produce roughly the same results for “Beethoven.” In this context, data is not a differentiator. However, in a less conventional search, such as for “arbitrage,” differentiation kicks in. Bing mainly yields definitions; DuckDuckGo shows definitions and links to financial sites, while Google comes up with definitions, financial sites and some academic references.

Incorporating more data and better-structured data into its architecture gives Google an advantage. It shows results that increase the choices of the user. This enhanced benefit offered by Google translates disproportionately into the company’s market share. Economists call this phenomenon increasing returns to data differentiation.

Most users use Google for both rare and common searches. Being even a little better in search results can lead to a big difference in market share. To “be even a little better,” a digital company needs to pay special attention to data acquisition and quality. Additional effort in these areas creates a differentiating factor by enhancing user benefits. That results in a disproportionate gain in market position and revenues, and strengthens the digital business model.
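
One way to picture how a small quality lead can turn into a large share gap is a logit choice model. This is an assumption introduced here for illustration, not the article's own model: each user picks a provider with probability proportional to `exp(sensitivity * quality)`, and the quality scores and `sensitivity` value are hypothetical.

```python
import math


def market_shares(quality, sensitivity=50.0):
    """Logit shares: share_i is proportional to exp(sensitivity * quality_i)."""
    # Subtract the top score before exponentiating for numerical stability;
    # this leaves the resulting shares unchanged.
    top = max(quality.values())
    weights = {
        name: math.exp(sensitivity * (score - top))
        for name, score in quality.items()
    }
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


# Provider B is assumed to be only 5% better than provider A.
shares = market_shares({"A": 1.00, "B": 1.05})
# With users insensitive to quality, the same gap would not matter at all.
indifferent = market_shares({"A": 1.00, "B": 1.05}, sensitivity=0.0)
```

With a high assumed sensitivity, the slightly better provider ends up with most of the market, while with zero sensitivity the shares are equal. How strongly real users react to quality differences is an empirical question that this sketch does not answer.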


Facts & figures

  • Data is essential for digital economies and AI; however, it is a multifaceted class of goods.
  • Data differentiates itself according to volume, quality, and use.
  • While data can be costly to obtain and structure, even slightly better data can generate disproportionate benefits for users, which, in turn, creates a market advantage for the digital business models providing this added value.
  • The critical economic insight about data is that its value is not a function of how much it improves a result but of how it enhances user benefits.
  • Often, there are increasing returns to data differentiation: the additional benefit that users derive from better data translates disproportionately into market share and revenues.

There are three base scenarios for picturing how data can further impact the value of digital economies, especially AI.

Market monopolization

In the first and least likely scenario, some companies will specialize even more in data acquisition and structuring, giving them a widening lead in generating user benefits. This will allow them to expand their market share to quasi-monopolies able to gather even more data and invest in improved architecture, which, in turn, will solidify their position. Such a feedback loop ends in the monopolization of markets. It is the least likely scenario because of data’s multifaceted and dynamic nature, which makes its monopolization nigh impossible.

Abundance of data

In a second and more likely scenario, data could lose its differentiating power. This can occur if channels for acquiring and structuring data become abundant. That, in turn, can happen if data protection and intellectual property regulations are relaxed, if agents agree on complete, unconstrained, real-time data diffusion, or with the advent of new and simpler paradigms in data architecture. In this case, it will be easier to acquire and structure data. However, the specific advantage derived from harvesting it and creating a differentiating factor is likely to diminish too. This scenario depends on the convergence of various elements in regulation, technology, and values. In the medium term, such a convergence is only likely in smaller communities of users.

Unfettered competition

The third and most likely scenario is a gradual improvement of data acquisition and architecture paired with intense competition among incumbents to enhance users’ benefits. Additionally, incumbents will be challenged by new firms trying to increase the added value of predictions or to unlock more information from less data, creating the same value as others but with cheaper processes. This case is the continuation of the economic logic set out here and can significantly enhance user experiences while also increasing the revenues and gains of digital companies.
