Dr. Eng. Ali Mohamed Al-Khouri
As we explained in the previous article, data has become the primary driver of value chains and of the economic value network, and has even become a form of wealth and economic value in itself. With the emergence of Big Data as a discipline with its own tools and goals, the world came to recognize, during the first decade of the twenty-first century, the importance of data and of benefiting from its huge daily accumulation. Data is a form of wealth that grows in value the more it is used and, unlike other natural resources such as oil, gold, iron, plants and others, is inexhaustible. Big data is defined as a set of data so large in size and complex in structure that traditional data processing programs cannot handle it; it requires advanced systems to manage and analyze it within a reasonable period of time.
Data types and their partitions
Large data sets can take different forms: (1) structured data, such as that held in databases, and (2) unstructured data, such as website, social media, and newspaper data. In-depth analysis of such data can yield important concepts and conclusions. According to widely used models, big data has several characteristics:
- Volume, which can reach millions of trillions of records.
- Variety, meaning the diversity of the data collected, including electronic files, databases, web pages, video recordings, and others; preparing it in a form suitable for analysis takes time and effort.
- Velocity, which refers to the speed at which data is produced and extracted; such data requires real-time processing and analysis systems, which are critical to supporting decision-making.
- Veracity (accuracy and reliability), which refers to how far the data can be trusted.
- Value, which is the real goal of combining the first four characteristics: for example, information collected and analyzed to create a product line, open new sales opportunities, or identify measures to reduce costs.
In this way, data can be regarded as a raw material. The facts and knowledge it carries must be extracted through analysis before they can be put to use, as is the case with oil: extracting it from the ground does not make it ready as fuel for industrial use; it must first be refined and filtered to produce usable products.
How does the economy benefit from big data?
The digital economy views big data as one of the most important foundations for building any sustainable economy, now and in the future, because such an economy is based on knowledge, sound analysis and real understanding rather than on chance or luck. To transform big data into clear information and usable knowledge, two branches of science are employed as approaches to data modeling and big data processing: data science and data analytics.
Data science is a field that combines multiple disciplines, such as statistics, mathematics, and intelligent data collection techniques, and applies them to analysis in order to extract insights and information. It focuses mainly on exploring what we do not yet know and on finding better ways to analyze information, by posing questions and hypotheses and linking them with diverse, disparate and seemingly unrelated data sources.
Data analytics (Big Data Analytics), by contrast, focuses on statistical analysis of available data sets to reveal hidden patterns and unknown relationships, such as market trends and customer preferences, and on designing simple ways to present data and information in order to solve problems and answer the questions that concern the decision maker. Its results can lead to immediate improvements.
The difference between the two is that data science is the broad umbrella covering the scientific methods, mathematics, statistics, and other tools used to analyze and process data. Data analytics is usually more focused: rather than just looking for relationships between variables and data sets, data analysts have the specific goal of presenting statistical summaries in a way that supports the decision maker.
Business Intelligence is one of the most popular applications of data analytics today. Many government and private institutions can no longer do without business intelligence applications, relying on them as dashboards and as tools to support decision-making.
Big data is associated with many modern technologies, such as machine learning and artificial intelligence, and is an essential input for them, especially for machine learning, which is generally regarded as a branch of artificial intelligence and overlaps with data science.
It is difficult to draw sharp lines between these technologies; they share much common ground and are deeply integrated.
In sum, the basic stages of big data management and analysis can be arranged in the following steps:
- Collecting data and identifying its various sources and types, and the mechanisms for its transmission and storage.
- Processing and preparing data for analysis, which includes addressing any gaps or errors in the data and determining how to correct them, so that the data is in a standardized form ready for analysis. This is a very important stage, especially for unstructured data, which requires setting clear criteria for classifying and identifying fields, variables and constants.
- The analysis stage, in which the objectives and method of the analysis are determined. Analysis falls into four types:
  - Descriptive analysis (Descriptive analytics): This analysis identifies the general statistical indicators that describe what happened in the past. The best known of these indicators are the arithmetic mean, standard deviation, total, count, and ratios.
  - Diagnostic analysis (Diagnostic analytics): This analysis examines the causes of observed phenomena and the interrelationships between variables, such as the relationship between vapor pressure and temperature, or between a product's availability and its price.
  - Predictive analysis (Predictive analytics): This type of analysis tries to understand the nature of the relationships between variables and to express them as a working model in mathematical form, which can then predict future outcomes before they occur. This analysis is closely related to machine learning.
  - Prescriptive analysis (Prescriptive analytics): This analysis combines diagnostic and predictive analysis and proposes hypothetical solutions for the predicted outcomes. This analysis is closely related to artificial intelligence techniques.
- The implementation phase, in which the knowledge extracted from the data is put at the service of the decision maker. The decision may be instantaneous and carried out directly by the machine, such as a decision made by artificial intelligence while driving a self-driving car.
