Research prospect: data factor of production

Journal of Internet and Digital Economics

ISSN: 2752-6356

Article publication date: 12 November 2021

Issue publication date: 16 November 2021

Citation

Xu, X. (2021), "Research prospect: data factor of production", Journal of Internet and Digital Economics, Vol. 1 No. 1, pp. 64-71. https://doi.org/10.1108/JIDE-09-2021-005

Publisher

Emerald Publishing Limited

Copyright © 2021, Xiang Xu

License

Published in Journal of Internet and Digital Economics. Published by Emerald Publishing Limited. This article is published under the Creative Commons Attribution (CC BY 4.0) licence. Anyone may reproduce, distribute, translate and create derivative works of this article (for both commercial and non-commercial purposes), subject to full attribution to the original publication and authors. The full terms of this licence may be seen at http://creativecommons.org/licences/by/4.0/legalcode


In 2012, a book named Big Data: A Revolution That Will Transform How We Live, Work and Think was published, setting off a wave of “big data” that swept the world. Since then, communities everywhere have been discussing big data and its related technologies, which have brought or will bring profound changes to the human economy, society, politics and thought. Today, big data is used by governments, companies and individuals all around the world, and its popularity continues to grow nearly ten years after the publication of the book.

Big data is also widely used in public governance. Governments in various countries and regions have carried out digital transformation and gradually formed a public management and service mechanism of “decision-making, management and service with data”. The capability of data governance has become the main evaluation criterion for the modernization of governance capability. In company operations, big data analysis has gradually changed from a trendy, attention-seeking practice into a prerequisite for the survival and development of companies. Correspondingly, data analysts, AI trainers and other highly skilled professionals have become highly sought after in the job market. Beyond the government and firm levels, personalized recommendation based on big data has become a standard feature of most e-commerce platforms and shapes personal consumption choices. However, price discrimination based on big data against existing customers has also become a key issue in the protection of consumers' rights.

During his presidency, former US President Barack Obama attached great importance to data and its social applications. In March 2012, the Obama administration announced the “Big Data Research and Development Initiative” and set up a start-up fund of $200 million to enhance the ability to collect, analyze and extract knowledge from massive data, so as to accelerate the pace of American invention in science and engineering, to enhance national security and to change existing modes of teaching and learning. Building on the Data Protection Directive adopted in 1995, the EU brought the General Data Protection Regulation (GDPR) into force in May 2018 to regulate the use of personal information and sensitive data by Internet and big data companies. In addition, the EU released A European Strategy for Data in February 2020 to explore the possibility and growth potential of the EU as a data union. The Chinese government promulgated The Data Security Law of The People's Republic of China (Draft) on July 3, 2020 and solicited public opinions throughout the country. The draft proposes that the state will implement hierarchical and classified protection of data, and that those carrying out data activities must fulfill the obligation of data security protection and undertake social responsibility.

It can be seen from the above that all major economies around the world are making efforts to work out a successful plan for the valuable resource of data. The series of plans, strategies and legislation described above all indicate, to a certain extent, that the country that takes the lead in the development and utilization of data resources will hold the initiative in economic development and technological progress in the 21st century.

From the perspective of social science research, problems related to the “data economy” have become one of the hottest topics in economics, statistics and sociology in recent years. The Allied Social Science Associations (ASSA), led by the American Economic Association (AEA), held a sub-forum on “big data, national accounts and public policy” at its 2020 annual meeting to discuss in depth the accounting and application of data at the macroeconomic level. A large number of top economics journals, including the Journal of Economic Literature (JEL), have published dedicated articles introducing the research frontier of data and macroeconomics. As a country with tremendous data resources, China has also set up a number of economics research institutions focusing on big data technology and the data economy and has produced a series of research findings on how data resources play a role in economic activities. Most of these studies examine how data work as a factor of production, which is also the perspective taken in this paper.

1. Data factor in the production process

Generally speaking, traditional production factors such as labor, capital and land are physical factors that directly enter the production process. Knowledge, technology, management and other new production factors are virtual factors that affect the form of production organization and production efficiency. Formally, the data factor is close to knowledge, technology and management. The existing literature also emphasizes that the data factor, as an indirect input, promotes industrial upgrading and economic growth by linking other production factors.

However, as ICT and big data technology continue to develop and the input of data into the production process keeps increasing, the data factor has begun to engage more directly in the production process, like the three direct physical factors (labor, capital and land). Most companies adopting this production mode are those in emerging industries or leading enterprises in traditional industries. For example, the Chinese and American Internet giants, abbreviated as ATM (Alibaba, Tencent and Meituan) and FAANG (Facebook, Apple, Amazon, Netflix and Google), have remarkable advantages over other companies in data collection, integration, processing and utilization. By using these advantages, they have obtained dominant positions in their respective market segments.
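The distinction between data as an indirect input and data as a direct input can be written down compactly. The notation below is only an illustrative sketch and does not appear in the literature cited here: Y (output), K (capital), L (labor), N (land), D (data), A (an efficiency term) and the function F are all assumed symbols, and no particular functional form is implied.

```latex
% Indirect role: data raises the efficiency with which other factors are combined
Y = A(D)\, F(K, L, N)

% Direct role: data enters production alongside the physical factors
Y = A\, F(K, L, N, D)
```

Under the first form, data matters only through the efficiency term; under the second, it is an argument of the production function in the same way as labor, capital and land.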

There are two ways that data factor can enter the production process.

The first mode is the traditional data-driven decision-making process (hereinafter referred to as the “DDD model”). In this production process, the producer takes the data factor as the initial input, integrates it with ICT and other technologies, and then analyzes the data in depth using big data technology and data science. Actionable economic decisions, business assessments or production knowledge are formed from the results of the data analysis. On that basis, firms can take or suspend business actions, improve or upgrade the production process, and ultimately increase value. Under the DDD model, the added value of the data factor in production activities is reflected in the economic return of these actions. Empirical studies have shown that the profitability and productivity of companies adopting the DDD model are significantly higher than those of companies in the same industry that do not (Brynjolfsson et al., 2011).

The second mode by which the data factor enters the production process is a revised version of the DDD model. In this production process, the data factor serves not only as an intermediate product but also as a final product, so its value chain is extended. This production process is mainly used in data services, business media, investment consulting and similar industries. Its basic process can be summarized as follows: producers generate or collect raw data on a large scale by investing labor, capital and other production factors. After cleaning, screening, accumulating and analyzing these data, they generate data products or digital services that can be consumed directly, in the form of the data factor. After obtaining the data factor, producers have three choices: (1) use the data factor they have produced directly; (2) sell the data factor to other companies, institutions or individuals; (3) package the data factor with other products and services (such as the financial information services provided by financial information platforms) and then sell the package to other entities.

The second mode is easy to understand. In this mode, the sources of data are more diversified, and the destinations of data are also multiple. Of course, to upgrade to the second mode, two basic prerequisites cannot be ignored: society as a whole must further recognize the importance of the data factor, and significant progress in big data technology is needed.

In the first mode, economic entities directly use data factor or information and knowledge obtained through data processing and analysis to improve business decision-making and economic behavior. In the second mode, economic entities regard data factor as a tradable commodity or service containing value. It is the recreation of raw and original data based on the development of big data technology, which focuses on the value mining of data itself.

2. The DDD model

In the past two decades, the DDD model has become an essential step in firm decision-making. With the continuous development of the digital economy, more and more companies reduce their dependence on leaders' intuition and rely more on data-based analysis when making production and management decisions. Brynjolfsson et al. (2011) developed a statistical method for measuring the use of the DDD model by companies to describe the collection and analysis activities around external and internal data. Based on survey data and public information on 179 listed companies in the USA, their results indicated that the DDD model can explain 5–6% of the output and productivity growth in American companies from 2005 to 2009. Their endogeneity analysis, using instrumental variables and alternative models, also suggests that the relationship between the application of DDD and productivity growth is causal rather than a mere correlation or reverse causality. Further research by Brynjolfsson and McElheran (2016b) found that the proportion of companies using the DDD model in the US manufacturing industry nearly tripled from 2005 to 2010 (from 11% of factories to 30%), and this proportion was expected to exceed 50% by 2020. The DDD model has evidently become the “new normal” of American manufacturing companies.

McAfee et al. (2012) also pointed out that data-driven decision-making is better than the subjective decision-making of firm managers. The use of big data can encourage managers to make decisions based on “evidence” rather than “intuition”, so the way companies are managed may change thoroughly. Through their survey, they also found that companies in the top third of their industry in the use of data-driven decision-making were, on average, 5% more productive and 6% more profitable than their competitors. Provost and Fawcett (2013) studied the conceptual differences between data science and the DDD model. They proposed that the development of data science supports the DDD model and makes large-scale automatic decision-making possible.

Brynjolfsson and McElheran (2016a) used the manufacturing data collected by the United States Census Bureau in 2005 and 2010 to analyze the use of the DDD model by manufacturing companies, as well as their investment in ICT and the application of other structured management practices. They summarized the following six conclusions about the DDD model:

First, large factories belonging to multi-unit firms use the DDD model more, and adopted it earlier. Second, the average added value of single-unit firms using the DDD model is 3% higher than that of companies not using it. Third, the performance of most companies has improved significantly after adopting the DDD model. Fourth, the performance gap between the pioneers and latecomers of the DDD model will decrease over time. Fifth, there is a mutually reinforcing and complementary relationship between the DDD model, ICT capital and skilled workers (this is consistent with the complementarity of ICT capital and the data factor). Finally, factories where frontline workers make decisions can obtain higher benefits by adopting the DDD model.

Regarding the specific method selection of the DDD model, Brynjolfsson and Mitchell (2017) studied the “learning apprentice” system in the field of machine learning as a typical case of the DDD model. In short, the “learning apprentice” is to let AI programs act as apprentices to assist human workers. AI learns by observing human decisions and using them as examples to train itself. The “learning apprentice” system helps the machine learn from the combined data of several humans it assists, which may enable the machine to make better decisions than each human in the team that trains it. Nevertheless, the knowledge learned by machines is still limited by the technical capability of the team and the availability of relevant decision variables. The DDD model also makes mistakes and is not perfect.
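The intuition that a machine trained on the combined decisions of several humans may outperform each of them can be illustrated with a toy simulation. The sketch below is not the “learning apprentice” system itself: it replaces learning with a simple majority vote over pooled decisions, and the function name, the number of simulated humans, their accuracy of 0.7 and the assumption that their errors are independent are all hypothetical choices made for illustration.

```python
import random


def simulate_learning_apprentice(n_cases=2000, n_humans=5,
                                 human_accuracy=0.7, seed=0):
    """Compare each simulated human's accuracy with a pooled majority vote.

    Assumptions (illustrative only): binary decisions, every human is
    right with the same probability, and errors are independent.
    """
    rng = random.Random(seed)
    # Ground-truth correct decision for every case (0 or 1).
    truths = [rng.randint(0, 1) for _ in range(n_cases)]

    # decisions[h][i] is the decision of human h on case i.
    decisions = [
        [t if rng.random() < human_accuracy else 1 - t for t in truths]
        for _ in range(n_humans)
    ]

    # Share of cases each human decided correctly.
    individual = [
        sum(d == t for d, t in zip(human, truths)) / n_cases
        for human in decisions
    ]

    # The "apprentice" pools all observed decisions and follows the majority.
    pooled_correct = 0
    for i, truth in enumerate(truths):
        votes_for_one = sum(human[i] for human in decisions)
        pooled_decision = 1 if 2 * votes_for_one > n_humans else 0
        pooled_correct += pooled_decision == truth

    return individual, pooled_correct / n_cases


if __name__ == "__main__":
    individual, pooled = simulate_learning_apprentice()
    print("individual accuracies:", [round(a, 3) for a in individual])
    print("pooled accuracy:", round(pooled, 3))
```

The gain depends entirely on the independence assumption: if all team members share the same blind spot, pooling their decisions reproduces it, which matches the caveat that what the machine learns is limited by the capability of the team that trains it.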

We have to admit that adopting the DDD model can indeed bring many competitive advantages to companies. However, the transformation from “rule by man” to “rule by data” cannot be achieved overnight. Companies need to reform their management mode correspondingly in order to adopt the DDD model. Senior decision makers must gradually accept data-driven and evidence-based decision-making. To support them, firms need to employ “data scientists” or “data strategists” who can discover the patterns in data and transform them into usable business information. At the same time, the whole organization must reshape its understanding of the evaluation process.

Beyond providing all kinds of support for the DDD model, many related factors influence the quality of decision-making. Big data and big data analysis do support better decisions, but many other prerequisites must be met. Big data is collected from different sources with different data quality and processed by different organizational entities to form a big data chain (including data collection, preparation, analysis and decision-making). As the basis of decision-making, the quality of the source data, the way data are processed and transmitted, and the human capital of data scientists all influence the quality and efficiency of companies' decision-making based on the data factor. The diversity and wide use of big data and big data analysis enhance the ability to detect fraud, which helps to reduce decision-making errors. On the other hand, they may also aggravate discrimination against users and consumers.

In fact, we also observe greater use of the data factor when governments establish institutions and initiate policies, not only in companies' decision-making. With the support of big data analysis, the government policy cycle no longer rigidly follows the traditional mode of implementing all stages in succession, but can continuously evaluate policy outcomes at each stage. Governments can then use big data to analyze different scenarios, develop alternative solutions and, when necessary, even give up policies that were planned in advance. In short, the policy-making process of governments is becoming more flexible and more immediate.

3. The spillover effects of the data factor

The development of data factor, data science and big data technology has deeply changed our understanding of the world. In addition to directly entering the production process as a production factor, data factor also has a significant indirect impact on productive activities. It displays strong spillover effects on economic and social development.

Firstly, data can optimize the utilization of other production factors and improve both the efficiency with which each factor is used and the efficiency with which factors are combined. Job search platforms built on the salary data and firm information shared by incumbents and candidates are a good example. The existence of these platforms can effectively reduce uncertainty in the labor market and help companies to find suitable candidates more quickly and at a lower cost. At the same time, they help candidates to avoid being misled by job advertisements and to get reliable information about working conditions and salaries.

Secondly, by combining the data factor with traditional physical capital, automation technology can to a certain extent replace labor with robots and human capital with AI. For instance, building on ICT capital and the data factor, Google has developed “Duplex AI”, an AI that can make phone calls. Duplex AI understands the intention of the calling customer accurately and continues the dialog naturally enough to pass for a human speaker. This is very different from the clumsy AI customer service that we currently encounter, which can only read fixed sentences and pick out keywords. There is no doubt that Duplex AI can replace manual customer service to a certain extent.

Besides promoting the integration of existing production factors, the spillover effect of data factor is also reflected in its core technology: the general-purpose technology (GPT) nature of big data technology.

The term GPT came from a far-reaching article, “General purpose technologies ‘Engines of growth’?”, published by Bresnahan and Trajtenberg in the Journal of Econometrics in 1995. In this paper, they defined a GPT as a technology that has a profound impact on the transformation of human economy and society. They believed that these technologies have acted as engines of economic growth throughout human history. Inventions that are generally regarded as GPTs include writing, printing, the steam engine, electricity, the wheel, automation and the Internet.

Mr. An Xiaopeng, the vice president of Ali Research Institute, believes that most GPTs have the following four characteristics. First, a GPT can be widely used in various industries. Generally speaking, a GPT first appears in the form of a special purpose technology (SPT) and is gradually extended to other fields. Second, the use of GPTs can continuously raise productivity and reduce users' costs. With the further development and application of a new technology, the cost of applying it continues to decline, and its scope keeps expanding. Third, a GPT can promote other technological innovations and the production of new products. There are strong complementarities and positive externalities between a GPT and other technologies, and as the GPT itself evolves, it promotes the innovation and application of other new technologies. Fourth, the application of a GPT continues to promote the adjustment and optimization of production, circulation and organizational management. Such technologies not only promote technological innovation in the production process and the transformation of the production mode but also improve companies' organization and management, realizing a comprehensive upgrade of product technology, process technology and organization technology.

It is not hard to find that the big data technology developed in the past few decades precisely has the above characteristics of GPT. Firstly, big data technology is universal. Its related technologies can be widely used in various production processes and social practices. Big data technology also has strong permeability and can quickly expand to all fields of social production activities. We can find “big data” in every corner of economy and society.

At the same time, big data technology also has complementarity with other technologies, such as ICT. On the one hand, ICT capital (computers, servers and software/hardware of the Internet) is the infrastructure of big data technology. On the other hand, big data technology is an important application of ICT capital, which improves the value of ICT capital and maximizes the “computing power” of companies.

In addition, big data technology is a strong source of inspiration for innovation. According to the existing literature, there are three points of consensus in the research on big data-driven management innovation. First, big data constitutes a competitive resource for companies in the era of the digital economy and helps to improve the dynamic capabilities of firms. Many companies have started to pay attention to the concept of being “data-empowered”, which means enhancing the ability of economic entities by improving big data technology and data analysis ability, and thereby creating value. Building on “data-empowered”, some studies go further and propose the concept of “data enabled”, involving mode disruption, capability revolution and even a new competition paradigm, which emphasizes the new possibilities brought by the data factor. Second, big data resources improve the learning ability of companies. For example, through big data analysis and processing, companies can improve their capacity for exploratory and exploitative organizational learning. Third, the value-creating process of big data requires the adaptation, adjustment and updating of companies' key structures and capabilities. Firms have to overcome the inertia of the organization's original procedures and business modes to enhance the positive impact of big data analysis on management decisions.

All in all, big data technology provides a new digital and intelligent tool for production process. The combination of big data technology and massive data has promoted profound economic transformation and social change. From this perspective, the data factor has indeed gone beyond the scope of economic capital and has begun to gradually approach the level of generalized capital in Bourdieu's “field theory”.

4. Data-intensive enterprises

According to the different types of utilized production factors in production and the intensiveness of resources, companies can generally be divided into “labor-intensive”, “capital-intensive” and “resource-intensive” ones. In recent years, with the general recognition of the technology and knowledge production factors, the concepts of “technology-intensive” and “knowledge-intensive” have gradually appeared.

In the era of the digital economy, data are having a significant and profound impact on economic and social development as a unique production factor. Therefore, we can also propose a definition of the “data-intensive enterprise”: an enterprise that invests data on a large scale in the production process and relies much more on data than on other production factors and resources.

Which enterprises can be regarded as data-intensive? Four groups can be included in this category: platform enterprises that collect and process large amounts of data, such as Facebook and LinkedIn; enterprises that take data as their main production objects or products, namely data analysis companies and data service platforms such as Bloomberg and Wind; financial investment enterprises that rely on data analysis for business decisions, such as Ant Group; and database and cloud service providers that supply data-related infrastructure and services, such as Amazon AWS and Alibaba Cloud. On this basis, almost all major technology companies in various economies can be regarded as data-intensive enterprises, which is further evidence of the significance of the data factor to the national economy.

Among the above data-intensive enterprises, the development of platform companies has attracted the most attention. At present, Internet platform companies have grown to the point of having “more money than God”. The influence of these platforms in the political, economic and cultural fields even surpasses that of many sovereign countries. They have begun to gradually replace traditional industrial companies and international financial groups at the core of the modern economy. Around 2010, the companies with the highest market value in the world were traditional economic giants like China National Petroleum Corporation (CNPC), Industrial and Commercial Bank of China (ICBC), Exxon Mobil and Broken Hill Proprietary (BHP). Only Microsoft and Apple were in the IT field. By 2020, however, four of the top five and seven of the top ten companies by market value were Internet platforms or IT companies.

Compared with other data-intensive enterprises such as data analysis companies, financial investment companies or database companies, platform companies have so far been studied by the economics community more comprehensively, systematically and deeply. The theoretical research related to platform economics, such as the research on firm behavior and business modes, has grown into a self-contained body of work, which can help us better understand the commonalities and differences of data-intensive enterprises.

The Guidelines on Antitrust in the Field of Platform Economy (Draft for Comments), issued by China's State Administration for Market Regulation in November 2020, define Internet platform companies as follows: a platform is a form of business organization that enables interdependent multilateral entities to interact under the rules and matching provided by specific carriers through network IT, so as to create value together. From the perspective of economics, a remarkable characteristic that distinguishes platform companies from traditional firms is that platform companies gather a large number of users on the same platform and bring about transactions by promoting interaction among these users. Therefore, platform companies not only provide information services on transactions and supply–demand matching for user groups but also build and operate a platform market in the form of a physical site (such as a shopping center) or a virtual space (such as Taobao), so that buyers and sellers can conclude transactions on this platform.

The Digital Markets Act proposed by the EU at the end of 2020 clearly defines large digital platforms as “gatekeepers” of the online market. They usually operate a core platform service (online intermediary service, online search engine, social network, video-sharing platform, number-independent interpersonal communication service, operating system, cloud computing, advertising service, etc.). As a channel through which operators on the platform reach end consumers, they embed the network effect into their own platform ecosystem and therefore occupy, or are expected to occupy, a deep-rooted and sustained position in the data market.

Platform companies connect the supply and demand sides at the same time, which demonstrates the characteristics of bilateral market structure, forming the so-called two-sided market. The two-sided market is also called the two-sided network, namely an economic network with two independent user groups that provide network benefits to each other. The emergence of two-sided markets has changed the operation mode of traditional markets, creatively derived new pricing strategies and profit modes, and allowed platform companies to make profits by charging transaction, access, enhanced access and enhanced content management service fees.

However, because of technological drivers, competition over standards, low marginal costs and network externalities, platform companies operating two-sided markets can easily gain an oligopoly or monopoly position, which has recently led to the discussion on “antitrust against Internet platforms”. The pricing strategies and market structure of platform companies and two-sided markets have been explained in detail by digital economics and industrial organization theory. This paper will not repeat that discussion, but will focus on how platform companies build up and use the data factor.

It is no exaggeration to say that the data factor and algorithms are the lifeblood of data-intensive platform enterprises. Google stood out in the search engine competition of the early 21st century, defeated strong rivals such as Microsoft and Yahoo and finally became the biggest winner. This success is inseparable from a development purpose and business philosophy that have always regarded data and algorithms as the core competitiveness of the company.

5. Prospects

At present, economic research on data, the data factor and big data technology is still a frontier field with more questions than answers. Due to the explosion in the scale and types of data, and the rapid development of data analysis technology, it is no longer feasible for companies to rely on individual employees to collect, integrate and analyze the massive data that firms own. Doing so would not only cause information to be omitted but also result in data failure. We need to investigate these problems more deeply and come up with effective solutions.

References

Brynjolfsson, E. and McElheran, K. (2016a), “Data in action: data-driven decision making in US manufacturing”, Paper, No. CES-WP-16-06, US Census Bureau Center for Economic Studies.

Brynjolfsson, E. and McElheran, K. (2016b), “The rapid adoption of data-driven decision-making”, American Economic Review, Vol. 106 No. 5, pp. 133-139.

Brynjolfsson, E. and Mitchell, T. (2017), “What can machine learning do? Workforce implications”, Science, Vol. 358 No. 6370, pp. 1530-1534.

Brynjolfsson, E., Hitt, L.M. and Kim, H.H. (2011), “Strength in numbers: how does data-driven decision-making affect firm performance”, Working Paper, No. 1819486, SSRN.

McAfee, A., Brynjolfsson, E., Davenport, T.H., Patil, D.J. and Barton, D. (2012), “Big data: the management revolution”, Harvard Business Review, Vol. 90 No. 10, pp. 60-68.

Provost, F. and Fawcett, T. (2013), “Data science and its relationship to big data and data-driven decision making”, Big Data, Vol. 1 No. 1, pp. 51-59.

Acknowledgements

The author thanks Sijia Hu and Xiaoxuan Tian for their contributions to this paper.
