Episode 9: Big Data – a wealth and energy that drives economic growth

Date: 26 - 05 - 2019

Dr. Eng. Ali Mohamed El Khouri

As we explained in the previous article, data has become the primary driver of value-added chains and the economic value network, and has even become a form of wealth and economic value in itself.

With the emergence of Big Data as a discipline with its own tools and goals, the world increasingly recognized, during the first decade of the twenty-first century, the importance of data and of benefiting from its huge daily accumulation. Data came to be seen as a material wealth whose value increases the more it is used and which, unlike natural resources such as oil, gold, iron and plants, is inexhaustible. The scientific definition describes big data as data sets so large in size and complex in structure that traditional data processing programs cannot handle them; advanced systems are needed to manage and analyze them within a reasonable period of time.

Data types and their characteristics

Large data sets can take different forms: (1) structured data, such as that held in databases, and (2) unstructured data, such as website, social media, and newspaper data, whose in-depth analysis can yield important concepts and conclusions. According to widely used models, big data has several defining qualities:

  1. Volume: the size of the data, which can reach millions of trillions of records.
  2. Variety: the diversity of the extracted data, including electronic files, databases, web pages, video recordings, and others. Preparing such data in a form suitable for analysis takes time and effort.
  3. Velocity: the speed at which data is produced and extracted. Such data requires real-time processing and analysis systems, which are critical for supporting decision-making.
  4. Accuracy and reliability (Veracity): the extent to which the data can be trusted.
  5. Value: the real goal of combining the first four qualities, for example when information is collected and analyzed to create a product line, open new sales opportunities, or reduce costs.

In this sense, data can be regarded as a raw material. It must be analyzed so that the facts and knowledge it carries can be extracted and put to use, as is the case with oil. Merely extracting oil from the ground does not make it ready as a fuel for industrial use; it has to be refined and filtered to produce usable materials.

How does the economy benefit from big data?

The digital economy views big data as one of the most important foundations for building any sustainable economy, now and in the future, because such an economy will be based on knowledge, real analysis and real understanding rather than on chance or luck. To transform big data into clear information and usable knowledge, two fields are employed as approaches to data modeling and big data processing: data science and data analytics.

Data science is a field that combines multiple disciplines such as statistics, mathematics, and intelligent data collection techniques, and adapts them to analysis in order to extract insights and information. This science mainly focuses on exploring the things we do not know and finding better ways to analyze information, by asking possible questions and hypotheses, and linking them with different, disparate and unrelated data sources.

Data analytics (Big Data Analytics), for its part, focuses on statistical analysis of available data sets to reveal hidden patterns and unknown relationships, such as market trends and customer preferences. It also focuses on designing simplified ways to display data and information in order to solve problems, answer the questions that concern the decision maker, and reach results that can lead to immediate improvements.

To understand the difference between data science and data analysis, data science can be described as the holistic umbrella that includes scientific methods, mathematics, statistics, and other tools used to analyze and process data. Data analytics is usually more focused than data science because rather than just looking for relationships between variables and data sets, data analysts have the specific goal of presenting summaries of statistical data in a way that supports the decision maker.

Business Intelligence is one of the most popular applications of data analysis today. Many government and private institutions cannot do without business intelligence applications and rely on them as dashboards and as a tool to support decision-making.

Big data is associated with many modern technologies such as machine learning and artificial intelligence, and is considered an essential input for them. This is especially true of machine learning, which, along with artificial intelligence, is often regarded as a branch of data science.

It is difficult to draw clear lines between these technologies; they share a great deal of common ground and are deeply integrated.

In sum, the basic stages of big data management and analysis processes can be arranged according to the following steps:

  1. Collecting data and identifying its various sources and types, and the mechanisms for its transmission and storage.
  2. Preparing and processing the data for analysis. This includes addressing any deficiencies or errors in the data and determining how to correct them, so that the data is in a standardized form ready for analysis. This is a very important stage, especially for unstructured data, which requires clear criteria for classifying and identifying fields, variables and constants.
  3. The analysis stage, in which the objectives of the analysis and its method are determined. Analysis falls into four types:
  • Descriptive analysis (Descriptive Analytics): This analysis identifies the general statistical indicators that describe what happened in the past. The best known of these indicators are the arithmetic mean, standard deviation, total, count, and ratios.
  • Diagnostic analysis (Diagnostic Analytics): This analysis examines the causes of the phenomena that occurred and the interrelationships between variables, such as the relationship between vapor pressure and temperature, or between a product's availability and its price.
  • Predictive analysis (Predictive Analytics): This type of analysis tries to understand the nature of the relationships between variables and to express them as a working model in mathematical form, which can then predict future outcomes before they occur. This analysis is closely related to machine learning.
  • Prescriptive analysis (Prescriptive Analytics): This analysis combines diagnostic and predictive analysis and proposes hypothetical solutions for the predicted outcomes. This analysis is closely related to artificial intelligence techniques.
  4. The implementation phase, in which the knowledge extracted from the data is put at the service of the decision maker. The decision may be instantaneous and carried out directly by a machine, such as the decisions made by artificial intelligence while driving a self-driving car.
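
The stages above can be illustrated with a small sketch. It is only a toy example under stated assumptions: the records, the field names (`price`, `units`), the candidate prices and the helper names are all hypothetical, and real big data systems work at a scale far beyond a few in-memory records. It uses a simple least-squares line as a stand-in for a predictive model.

```python
import statistics

# Stage 1: collection. Hypothetical raw records; None marks a missing value.
raw_records = [
    {"price": 10.0, "units": 100},
    {"price": 12.0, "units": 90},
    {"price": None, "units": 85},  # incomplete record
    {"price": 14.0, "units": 80},
    {"price": 16.0, "units": 70},
]

# Stage 2: preparation. Drop incomplete records so the data is in a standard form.
clean = [r for r in raw_records if r["price"] is not None and r["units"] is not None]
prices = [r["price"] for r in clean]
units = [r["units"] for r in clean]

# Stage 3a: descriptive analytics (count, total, mean, standard deviation).
summary = {
    "count": len(units),
    "total": sum(units),
    "mean": statistics.mean(units),
    "stdev": statistics.stdev(units),
}


def deviation_sums(xs, ys):
    """Return (sxx, syy, sxy): sums of squared and cross deviations from the means."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxx, syy, sxy


sxx, syy, sxy = deviation_sums(prices, units)

# Stage 3b: diagnostic analytics. Pearson correlation between price and demand.
correlation = sxy / (sxx * syy) ** 0.5

# Stage 3c: predictive analytics. Least-squares line: units = intercept + slope * price.
slope = sxy / sxx
intercept = statistics.mean(units) - slope * statistics.mean(prices)


def predict_units(price):
    """Predict demand at a given price from the fitted line."""
    return intercept + slope * price


# Stage 3d: prescriptive analytics. Pick the candidate price with the
# highest predicted revenue (price times predicted units).
candidate_prices = [12.0, 15.0, 18.0]
best_price = max(candidate_prices, key=lambda p: p * predict_units(p))

# Stage 4: implementation. Hand the result to the decision maker.
print(summary)
print(f"correlation={correlation:.2f}, recommended price={best_price}")
```

In this sketch each stage feeds the next: the cleaned data produces the summary indicators, the correlation hints at why demand changes, the fitted line predicts demand at untested prices, and the final step turns that prediction into a recommendation.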

Today, there are many applications specialized in big data, some of which are available for free to programmers and developers. One of the best known is Apache Hadoop, among the most prominent and widely used tools in the big data industry thanks to its enormous ability to process data on a large scale. Hadoop is a fully open source (free) framework that can run on data center servers or through cloud services. Apache Spark is the second most important application for big data. It is worth noting that Spark can handle both real-time data and batch data. Because Spark relies on in-memory execution, it can process data much faster than conventional disk-based processing systems.

Setting the technical details aside, the most important question is how to assess the extent to which different business sectors benefit from large amounts of data.

We find that the uses of big data in the industrial field have a significant impact on improving product quality and developing its characteristics based on a more accurate understanding of market requirements. It also has a major role in providing the information necessary for planning and managing the supply chain (Supply-Chain Management).

A McKinsey study estimated the impact of big data applications in health care in the United States of America at a 12% to 17% reduction in expenses, or between 300 and 400 billion dollars. Interestingly, the report described these percentages as very conservative, given that technological innovations will have much greater effects in the future.

Another McKinsey report points to the importance of big data in medical research and patient follow-up. The data held by hospitals and insurance companies has acquired great material value and is bought and sold by research agencies and drug manufacturers. It has largely replaced targeted field research, which is very expensive and lasts for many years, because big data is readily available and can reveal many causal relationships between the patient and the disease, its causes, and the likelihood of responding to treatment. Analyzing this data will save a great deal of time and help deliver new treatments. In addition, the revolution in wearable digital devices connected to the Internet will play a significant role in protecting patients from sudden medical events, in responding quickly to any worrying changes, and in predicting them before they occur.

Big data has become a commercial commodity in itself, bought and sold in a large global market, and its use offers many advantages to private and government businesses alike. A report by IDC, a technology consultancy, projected that big data revenues worldwide would grow from $130.1 billion in 2016 to more than $203 billion in 2020, a compound annual growth rate of 11.7%.
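
As a quick check on the arithmetic, the compound annual growth rate implied by the two revenue figures can be computed directly. The sketch below assumes the end value is exactly $203 billion and that the period spans the four years from 2016 to 2020; the function name `cagr` is illustrative.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate: the constant yearly rate that
    turns start_value into end_value over the given number of years."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted in the text, in billions of US dollars.
rate = cagr(130.1, 203.0, 2020 - 2016)
print(f"{rate:.2%}")
```

Under these assumptions the result is a little under 11.8%, close to the 11.7% quoted. The small gap presumably reflects rounding or the exact forecast period used in the report, which the text does not specify.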

Overall, organizations in the public and private sectors that have worked with big data hold a very positive view of it. In a recent study by the consulting company Accenture, which covered many institutions that had carried out one or more big data projects, officials confirmed that the return on investment in big data reached rates of more than 92%, and 89% of respondents agreed that these projects were very important to the success, development and profitability of their business. Moreover, 85% of respondents emphasized that big data had played a revolutionary role in reshaping and developing the business of their organizations.

But what is the state of data in the Arab world? Multiple statistics indicate that, although digital maturity varies across the Arab region and adoption of this technology is higher in the Gulf countries than elsewhere, the Arab economy has not yet benefited seriously from big data. The areas in which the technology is used remain limited, confined to some large institutions such as banks, airlines and governments, and even there on a limited scale.

Moreover, to achieve real positive economic results in the Arab world, we need legislative support, which is a key governmental role, that encourages the movement of data and strikes a balance between openness to big data and its benefits on the one hand and the protection of personal rights on the other.

It must also be recognized that the Arab common market does not concern goods alone; it must extend to data as well. The data market must flourish, whether for raw data or for analyzed information and the reports extracted from it, and this data should be used at the level of the economy, especially in feasibility studies. This alone would save billions in expenditure and create economic opportunities worth several times that amount.

We also call for support for emerging companies that specialize in fourth-generation technologies, especially big data, because they are quite simply the engine on which we must rely in the near future in the Arab region.

We are confident that the road is paved and that Arab capabilities, especially those of young people, can be developed to play a greater role in improving the Arab economic reality. All they need today is support for their skills and knowledge so that they can enter the market, compete, and achieve real growth.