Data Mining using R
Course Overview
| ID | 2201006 |
| Duration | 2.0 days |
| Methods | Lecture with examples and exercises. |
| Prerequisites | General knowledge of math |
| Target group | Information workers, IT professionals |
| Preceding course | 2201001 |
Overview
Data mining (the analysis step of the "Knowledge Discovery in Databases" process, or KDD) is the computational process of discovering patterns in large data sets involving methods at the intersection of artificial intelligence, machine learning, statistics, and database systems. The overall goal of the data mining process is to extract information from a data set and transform it into an understandable structure for further use. Aside from the raw analysis step, it involves database and data management aspects, data pre-processing, model and inference considerations, interestingness metrics, complexity considerations, post-processing of discovered structures, visualization, and online updating.
Dates
No open dates are currently available. Alternatively, use the in-house option.
Learn with customized examples and content, precisely tailored to your requirements.
Your benefits at a glance
- Flexible preferred date
- Customized content
- Intensive exchange
- High practical relevance
Description
R offers a variety of packages for multivariate analysis and data mining. Use R for data mining to identify patterns in data, such as groups, important variables, or relationships that can be used for classification and prediction. This seminar shows you how to apply a range of data mining methods using RStudio and common R packages. It covers the mathematical background of each method and demonstrates how to carry out data mining in practice with R, RStudio, and R Data Miner (Rattle).
Services
- Lunch / catering
- Help with hotel / travel
- Comelio certificate
- Flexible: free cancellation up to one day before
Comelio Media
Still looking for additional reading? Discover suitable specialist books in our catalog.
Content
Data Mining Fundamentals
Statistics, multivariate statistics, and data mining – The data mining cycle – Data pre-processing: descriptive data aggregation, data cleaning, data integration and transformation – Data reduction – Discretization and concept hierarchies – Data mining and business intelligence: databases, data warehouses, and OLAP as a basis for data mining
Data Mining with Association Analysis
Finding frequent combinations (frequent itemset mining) – Apriori algorithm – Association rules and association analysis – Market basket analysis
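To give a feel for the topics in this module, here is a minimal sketch of level-wise frequent itemset mining in the spirit of the Apriori algorithm, plus the confidence of an association rule. It is written in plain Python for brevity and is not course material (the seminar itself works in R). The basket data, the function names `support`, `apriori`, and `confidence`, and the threshold are all assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical toy baskets, invented for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
    {"bread", "butter", "milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def apriori(transactions, min_support):
    """Return {itemset: support} for all itemsets with support >= min_support."""
    items = sorted({item for t in transactions for item in t})
    current = [frozenset([i]) for i in items]  # candidate 1-itemsets
    frequent = {}
    k = 1
    while current:
        # Keep only the candidates that are frequent at this level.
        level = {c: support(c, transactions) for c in current}
        level = {c: s for c, s in level.items() if s >= min_support}
        frequent.update(level)
        # Join step: combine frequent k-itemsets into (k+1)-candidates.
        k += 1
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        current = [
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return frequent


def confidence(antecedent, consequent, transactions):
    """Confidence of the rule antecedent -> consequent."""
    both = support(antecedent | consequent, transactions)
    return both / support(antecedent, transactions)


frequent = apriori(transactions, min_support=0.6)
for itemset, s in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
print("confidence(bread -> milk):", confidence({"bread"}, {"milk"}, transactions))
```

In this toy data every pair of items clears the 0.6 threshold, while the three-item basket does not, so the search stops after the third level.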
Data Mining with Decision Trees
Deriving decision trees – Attribute selection – Tree pruning – Deriving rules – Quality measures and model comparison
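Attribute selection can be illustrated with information gain, one common splitting criterion; which criterion the seminar or a given R package actually uses is not stated here. The following plain-Python sketch (not course material) uses an invented four-row data set and hypothetical helper names `entropy` and `information_gain`.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy of the target minus the weighted entropy after splitting on `attribute`."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r[target] for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


# Hypothetical toy data, invented for illustration only.
rows = [
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "rainy", "windy": "yes", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
]

for attribute in ("outlook", "windy"):
    print(attribute, information_gain(rows, attribute, "play"))
```

Here `outlook` separates the classes perfectly and `windy` carries no information about the target, so a tree builder using this criterion would split on `outlook` first.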
Data Mining with Probability Theory
Probability theory and Bayes' theorem – Naïve Bayes algorithm – Bayesian networks
Advanced Data Mining Methods for Classification
Artificial neural networks and the backpropagation algorithm – Support vector machines for linearly and non-linearly separable data – Classification with association analysis – Lazy and eager learners
Cluster Analysis
Introduction to cluster analysis – Similarity and distance measures – Variants and basic techniques – Partitioning methods: k-means – Hierarchical methods: agglomerative and divisive approaches – Further methods: density- and grid-based methods
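As a rough illustration of the partitioning approach, the sketch below implements k-means as Lloyd's algorithm with Euclidean distance. It is plain Python rather than R and is not course material. The points, the starting centers, and the function name `kmeans` are assumptions chosen so that the run is deterministic.

```python
import math


def kmeans(points, centers, max_iter=100):
    """Lloyd's algorithm: alternate an assignment step and a centroid update step."""
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(max_iter):
        # Assignment step: label each point with the index of its nearest center.
        labels = [
            min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        # Update step: move each center to the mean of its assigned points.
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centers.append(old)  # an empty cluster keeps its old center
        if new_centers == centers:
            break  # converged: no center moved
        centers = new_centers
    return labels, centers


# Hypothetical 2-D points forming two visibly separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]

# Starting centers are fixed here for reproducibility; real runs usually
# start from random centers and repeat several times.
labels, centers = kmeans(points, centers=[points[0], points[3]])
print(labels)
print(centers)
```

The result of k-means depends on the starting centers, which is why fixed starts are used in this sketch and repeated random starts are common in practice.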
Instructor
Our trainer for statistics and data mining with R, Marco Skulschus, studied economics in Wuppertal and Paris. He has worked for more than 10 years as a lecturer, as an author of specialist books on databases and data analysis, and as a consultant for statistical analysis with R. Participants in his R seminars work in marketing or quality assurance, or are (aspiring) data scientists who want to use R for statistics and data mining.
Publications
- Grundlagen empirische Sozialforschung (Comelio Medien), ISBN 978-3-939701-23-1
- System und Systematik von Fragebögen (Comelio Medien), ISBN 978-3-939701-26-2
- Oracle SQL (Comelio Medien), ISBN 978-3-939701-41-5
- MS SQL Server - T-SQL - Abfragen und Analysen (Comelio Medien), ISBN 978-3-939701-69-9
