Data Mining - Concepts and Techniques
Course Overview
| ID | 2858813 |
| Duration | 2.0 days |
| Methods | Lecture with examples and exercises. |
| Prerequisites | Basics in Statistics |
| Target group | Information workers, IT professionals |
Overview
Data mining (the analysis step of the "Knowledge Discovery in Databases" process, or KDD) is the computational process of discovering patterns in large data sets involving methods at the intersection of artificial intelligence, machine learning, statistics, and database systems. The overall goal of the data mining process is to extract information from a data set and transform it into an understandable structure for further use. Aside from the raw analysis step, it involves database and data management aspects, data pre-processing, model and inference considerations, interestingness metrics, complexity considerations, post-processing of discovered structures, visualization, and online updating.
Dates
No open dates are currently available. Alternatively, use the in-house option.
Learn with customized examples and content, precisely tailored to your requirements.
Your benefits at a glance
- Flexible preferred date
- Customized content
- Intensive exchange
- High practical relevance
Description
In this seminar, you'll learn what data mining is and how you can use it to answer questions about your data. Use data mining in the graphical open-source tool Weka from the University of Waikato to identify patterns in data, such as groups, important variables, or relationships that can be used for classification and prediction.
Services
- Lunch / catering
- Help with hotel / travel
- Comelio certificate
- Flexible: free cancellation up to one day before
Comelio Media
Still looking for additional reading? Discover suitable specialist books in our catalog.
Content
Introduction to Data Mining
Overview: Why Data Mining? What Is Data Mining? What Kinds of Data Can Be Mined? What Kinds of Patterns Can Be Mined? Which Technologies Are Used? - Data Preparation: Data Objects and Attribute Types, Basic Statistical Descriptions of Data, Measuring Data Similarity and Dissimilarity - Data Preprocessing: Data Cleaning, Data Integration, Data Reduction, Data Transformation and Data Discretization - Data Warehousing and Online Analytical Processing (OLAP)
Data Mining for Frequent Patterns
Frequent Itemset Mining Methods - The Apriori Algorithm - Market Basket Analysis - Pattern Evaluation Method
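As a rough illustration of the frequent-pattern topics listed above, the following Python sketch computes support and confidence and runs a simplified level-wise search in the spirit of Apriori. The seminar itself works with Weka; the transactions, function names, and threshold here are hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical toy transactions; not taken from the course material.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def frequent_itemsets(transactions, min_support):
    """Simplified level-wise search: a k-itemset is only counted
    if all of its (k-1)-subsets were frequent on the previous level."""
    items = sorted(set().union(*transactions))
    frequent = {}
    current = [frozenset([i]) for i in items]
    k = 1
    while current:
        level = {c: support(c, transactions) for c in current}
        level = {c: s for c, s in level.items() if s >= min_support}
        frequent.update(level)
        k += 1
        # Join step: build k-item candidates from items that are still frequent.
        survivors = sorted(set().union(*level)) if level else []
        current = [
            frozenset(c)
            for c in combinations(survivors, k)
            # Prune step: every (k-1)-subset must itself be frequent.
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return frequent

def confidence(antecedent, consequent, transactions):
    """Confidence of the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

result = frequent_itemsets(transactions, min_support=0.5)
rule_confidence = confidence({"bread"}, {"milk"}, transactions)
```

With this toy data, every single item and every pair reaches the 0.5 threshold, while the three-item set does not, so the search stops after the third level.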
Classification using Decision Trees
Decision Tree Induction - Attribute Selection Measures - Tree Pruning - Scalability and Decision Tree Induction - Rule-Based Classification
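One common attribute selection measure is information gain, which compares the entropy of the class labels before and after splitting on an attribute. The sketch below is a minimal Python illustration under that assumption; the tiny data set and the attribute names `outlook`, `windy`, and `play` are hypothetical and not part of the course material.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(rows, attribute, target):
    """Entropy of the target minus the weighted entropy after splitting on attribute."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r[target] for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder

# Hypothetical toy data: "outlook" separates the classes perfectly, "windy" not at all.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "yes"},
]

gain_outlook = information_gain(rows, "outlook", "play")
gain_windy = information_gain(rows, "windy", "play")
```

A tree learner that uses this measure would split on the attribute with the larger gain first, which in this toy data is `outlook`.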
Classification using Probabilistic Approaches
Bayes Classification Methods - Bayes' Theorem - Naïve Bayes Algorithm - Bayesian Networks - Model Evaluation and Selection - Techniques to Improve Classification Accuracy
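Bayes' theorem, the basis of the probabilistic classifiers listed above, can be sketched in a few lines of Python for the two-class case. The function name and the probabilities used below are hypothetical values chosen for illustration only.

```python
def posterior(prior, likelihood, false_positive_rate):
    """P(class | evidence) for a binary class via Bayes' theorem.

    prior:               P(class)
    likelihood:          P(evidence | class)
    false_positive_rate: P(evidence | not class)
    """
    # Total probability of observing the evidence under both classes.
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence

# Hypothetical numbers: a rare class (1 %) and a fairly reliable indicator.
p = posterior(prior=0.01, likelihood=0.9, false_positive_rate=0.05)
```

Even with a reliable indicator, the posterior stays well below 0.5 here because the class is rare, which is why the prior matters in Bayesian classification.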
Classification: Advanced Methods
Classification by Backpropagation and Artificial Neural Networks - Support Vector Machines - Lazy Learners
Cluster Analysis
Overview of Basic Clustering Methods - Measuring Data Similarity and Dissimilarity: Data Matrix versus Dissimilarity Matrix, Proximity Measures for Nominal, Ordinal, and Binary Attributes, Dissimilarity of Numeric Data - Partitioning Methods (k-Means and k-Medoids) - Hierarchical Methods: Agglomerative versus Divisive Hierarchical Clustering
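The partitioning idea behind k-means can be sketched in plain Python: assign each point to its nearest centroid using Euclidean distance, then move each centroid to the mean of its cluster, and repeat. This is a minimal sketch with hypothetical points and starting centroids, not the implementation used in Weka.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Basic k-means on tuples of numbers with fixed starting centroids."""
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: math.dist(p, centroids[i]),
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster;
        # an empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical 2-D points forming two visibly separate groups.
points = [(1.0, 1.0), (1.0, 2.0), (8.0, 8.0), (9.0, 8.0)]
centroids, clusters = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
```

The sketch runs a fixed number of iterations for simplicity; a fuller version would stop once the assignments no longer change.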
Instructor
Our statistics and data mining trainer, Marco Skulschus, studied economics in Wuppertal and Paris and has worked for over 10 years as a lecturer, an author of specialist books on databases, and a developer of analytics platforms. He builds reporting solutions with data mining components in Microsoft Fabric and Oracle DB and programs in R, Python, and Oracle PL/SQL.
Publications
- Grundlagen empirische Sozialforschung (Comelio Medien), ISBN 978-3-939701-23-1
- System und Systematik von Fragebögen (Comelio Medien), ISBN 978-3-939701-26-2
- Oracle SQL (Comelio Medien), ISBN 978-3-939701-41-5
- MS SQL Server - T-SQL - Abfragen und Analysen (Comelio Medien), ISBN 978-3-939701-69-9
