DATA MINING - the automated processing of large amounts of DATA in order to extract KNOWLEDGE - clandestine and hidden patterns, trends, correlations, etc. - by using various methods of statistics, data analysis, and machine learning


1. Data Science

1.1. a multifaceted discipline that encompasses machine learning, statistics, and related branches of mathematics, and increasingly borrows from high-performance scientific computing, all in order to ultimately

1.1.1. EXTRACT INSIGHT from DATA and USE this newfound information to tell stories

2. Artificial Intelligence

2.1. systems that display intelligent behaviour by

2.1.1. ANALYSING their environment and TAKING actions with some degree of autonomy - to achieve specific goals

2.2. The mined data (and the accompanying patterns and hypotheses) can then be used as the basis for both artificial intelligence and machine learning.

2.3. data mining, artificial intelligence, and machine learning are so intertwined that it’s difficult to establish a ranking or hierarchy between the three. Instead, they’re involved in symbiotic relationships by which a combination of methods can be used to produce more accurate results. Data mining is an integral part of coding programs with the information, statistics, and data necessary for AI to create a solution.

3. Knowledge Discovery in Databases (KDD)

3.1. the overall process of discovering

3.1.1. useful knowledge from data
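The KDD process is usually described as a pipeline of steps (selection, preprocessing, transformation, mining, interpretation). The sketch below walks through those steps on invented customer records; the field names, thresholds, and the 1.5-standard-deviation outlier rule are all assumptions chosen for illustration, not part of any standard.

```python
# A minimal sketch of the KDD pipeline; data and thresholds are invented.
from statistics import mean, stdev

def kdd_pipeline(raw_records):
    # Selection: keep only records that have the fields we need
    selected = [r for r in raw_records if "age" in r and "spend" in r]

    # Preprocessing: drop records with implausible values (noise)
    cleaned = [r for r in selected if 0 < r["age"] < 120 and r["spend"] >= 0]

    # Transformation: reduce each record to the feature we mine on
    spends = [r["spend"] for r in cleaned]

    # Data mining: flag spends more than 1.5 standard deviations from the mean
    mu, sigma = mean(spends), stdev(spends)
    patterns = [r for r in cleaned if abs(r["spend"] - mu) > 1.5 * sigma]

    # Interpretation/evaluation: return the discovered "knowledge"
    return patterns

records = [
    {"age": 34, "spend": 120.0}, {"age": 29, "spend": 95.0},
    {"age": 41, "spend": 110.0}, {"age": 38, "spend": 105.0},
    {"age": 55, "spend": 900.0},   # unusually high spender
    {"age": -3, "spend": 50.0},    # noise: impossible age
]

print(kdd_pipeline(records))
```

Each stage only narrows or reshapes the data; the "mining" step is the one that actually surfaces a pattern (here, the single outlying spender).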

4. Data Analysis

4.1. is used to TEST models and HYPOTHESES on a dataset - such as the effectiveness of a marketing campaign

4.2. in contrast, data mining uses machine-learning and statistical models to uncover CLANDESTINE or HIDDEN patterns in a large volume of data.

4.3. Data mining is the process of extracting nontrivial and potentially useful information, or knowledge, from the enormous data sets available in experimental sciences (historical records, reanalysis, GCM simulations, etc.), providing explicit information that has a readable form and can be used to solve diagnosis, classification or forecasting problems. Traditionally, these problems were solved by direct hands-on data analysis using standard statistical methods, but the increasing volume of data has motivated the study of automatic data analysis using more complex and sophisticated tools which can operate directly on data. Thus, data mining identifies trends within data that go beyond simple analysis.
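The "test a hypothesis" side of data analysis described above can be made concrete with a two-sample comparison: did a marketing campaign raise average spend? The numbers below are invented, and the statistic is a standard Welch t value computed from scratch.

```python
# Testing one specific, pre-stated hypothesis on a dataset.
# All figures are made up for illustration.
from math import sqrt
from statistics import mean, variance

control = [20.0, 22.5, 19.0, 21.0, 20.5, 23.0]   # did not see the campaign
exposed = [24.0, 26.5, 23.5, 25.0, 27.0, 24.5]   # saw the campaign

def welch_t(a, b):
    # t = (mean_a - mean_b) / sqrt(var_a/n_a + var_b/n_b)
    return (mean(a) - mean(b)) / sqrt(variance(a)/len(a) + variance(b)/len(b))

t = welch_t(exposed, control)
print(round(t, 2))   # a large |t| suggests the campaign had a real effect
```

This is the deductive direction: one question, posed in advance, answered by the data. The mining direction described in the next node works the other way round, letting the data propose the relationships.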

5. Pattern Recognition

5.1. the automated recognition of

5.1.1. PATTERNS and REGULARITIES in data

5.2. the process of recognizing

5.2.1. PATTERNS by using machine learning algorithms

5.3. the classification of data based on knowledge already gained or on statistical information extracted from patterns and/or their representation

5.4. “The act of taking in raw data and taking an ‘action’ based on the ‘category’ of the pattern”

5.5. Data Mining – produces insight and understanding about the structure of large observational datasets – e.g.

5.5.1. Find interesting relationships

5.5.2. Summarize the data in new ways that are understandable and actionable
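The quoted definition above - taking in raw data and taking an action based on the category of the pattern - can be sketched with a toy 1-nearest-neighbour classifier. The points and labels are invented; classification by distance to known examples is the simplest instance of "knowledge already gained".

```python
# A 1-nearest-neighbour pattern classifier using only the standard library.
from math import dist   # Python 3.8+

# "Knowledge already gained": labelled example points (all made up)
training = [((1.0, 1.0), "spam"), ((1.2, 0.8), "spam"),
            ((6.0, 6.2), "ham"),  ((5.8, 6.0), "ham")]

def classify(point):
    # Assign the label of the closest known example
    _, label = min(training, key=lambda ex: dist(ex[0], point))
    return label

print(classify((1.1, 0.9)))   # -> spam
print(classify((6.1, 6.1)))   # -> ham
```

The "action" here is just returning a label, but the same structure underlies any act-on-category system.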

6. Statistics

6.1. a component of data mining that provides the tools and analytics techniques for dealing with large amounts of data. It is the science of

6.1.1. LEARNING from DATA and includes everything from COLLECTING and ORGANISING to ANALYSING and PRESENTING data.

6.2. What distinguishes data mining from conventional statistical data analysis is that data mining is usually done for the purpose of 'secondary analysis', aimed at finding unsuspected relationships, perhaps unrelated to the purposes for which the data were originally collected. In other words, data mining is very much an inductive exercise, as opposed to the traditional hypothetico-deductive approach of statistics.

6.3. Statistics is concerned with quantifying data: much like mathematics, it supplies tools for finding the relevant properties of data, and those tools are what data mining builds on. Data mining, on the other hand, builds models to detect patterns and relationships in data, particularly in large databases.
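The inductive, "secondary analysis" character described above can be illustrated by scanning every pair of variables for strong correlations, rather than testing one planned hypothesis. The variables, values, and the 0.9 threshold below are all invented.

```python
# Inductive mining: scan all variable pairs for unsuspected relationships.
from math import sqrt
from itertools import combinations

def pearson(xs, ys):
    # Pearson correlation coefficient, computed from scratch
    n = len(xs)
    mx, my = sum(xs)/n, sum(ys)/n
    cov = sum((x - mx)*(y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx)**2 for x in xs))
    sy = sqrt(sum((y - my)**2 for y in ys))
    return cov / (sx * sy)

data = {
    "temperature": [10, 15, 20, 25, 30],
    "ice_cream":   [12, 18, 24, 31, 35],   # tracks temperature closely
    "umbrellas":   [30, 25, 22, 14, 10],   # moves the opposite way
    "shoe_size":   [41, 39, 42, 40, 41],   # unrelated noise
}

# Report only pairs whose |r| exceeds an (arbitrary) 0.9 threshold
for a, b in combinations(data, 2):
    r = pearson(data[a], data[b])
    if abs(r) > 0.9:
        print(f"{a} ~ {b}: r = {r:.2f}")
```

No hypothesis was stated in advance; the strong pairs emerge from the scan, while the noise variable stays silent - the inductive exercise the node above contrasts with hypothesis testing.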

7. Deep Packet Inspection

7.1. a type of DATA PROCESSING that INSPECTS in detail the data being sent over a computer network, and usually TAKES ACTION by blocking, re-routing, or logging it accordingly

7.2. enables advanced network management, user service, and security functions as well as internet data mining, eavesdropping, and internet censorship.

7.3. a generic name for technologies that enable service providers to CAPTURE and INSPECT (DATA) packet flows

7.4. a technology that enables the network owner to ANALYSE internet traffic (DATA FLOW) through the network in real time and to differentiate flows according to their payload. Since this has to be done in real time at high speeds, it cannot be implemented by software running on normal processors or switches. It has only become possible in the last few years through advances in computer engineering and in PATTERN MATCHING ALGORITHMS.

7.5. DPI systems use expressions to define PATTERNS of interest in network data streams. The equipment is programmed to make decisions about how to handle the packet or a stream of packets based on the recognition of a regular expression or pattern in the payload. This allows networks to CLASSIFY and control traffic based on the content, applications, and subscribers.
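The pattern-and-action mechanism described above can be sketched with a few regular expressions over payload bytes. The patterns and the allow/block/log policy here are invented examples, not rules from any real DPI product (though the `\x16\x03` prefix is the genuine TLS record header).

```python
# A sketch of regex-based payload classification and per-flow actions.
import re

# (compiled pattern, protocol label, action) - illustrative rules only
RULES = [
    (re.compile(rb"^GET |^POST "),        "http",       "allow"),
    (re.compile(rb"^\x16\x03"),           "tls",        "allow"),
    (re.compile(rb"BitTorrent protocol"), "bittorrent", "block"),
]

def inspect(payload: bytes):
    for pattern, proto, action in RULES:
        if pattern.search(payload):
            return proto, action
    return "unknown", "log"   # default: pass through but record it

print(inspect(b"GET /index.html HTTP/1.1"))    # ('http', 'allow')
print(inspect(b"\x13BitTorrent protocol..."))  # ('bittorrent', 'block')
print(inspect(b"\x00\x01random bytes"))        # ('unknown', 'log')
```

Real DPI engines push this matching into specialised hardware precisely because, as the node above notes, running such patterns over every packet at line rate is beyond ordinary software on general-purpose processors.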