Exploratory Data Analysis using R
Course Overview
| ID | 2201005 |
| Duration | 3.0 days |
| Methods | Presentation with examples and hands-on labs. |
| Prerequisites | Basic knowledge of R and statistics |
| Target group | Data Analysts |
| Preceding course | 2201001 |
Overview
Exploratory Data Analysis (EDA) is a statistical approach to analyzing data sets in order to summarize their main characteristics. This training focuses on four main EDA techniques: Principal Component Analysis (PCA) for quantitative variables, Correspondence Analysis (CA) and Multiple Correspondence Analysis (MCA) for categorical variables, and (hierarchical and partitioning) clustering methods. As umbrella techniques, the training also covers Factor Analysis (FA) and Multiple Factor Analysis (MFA). For the hands-on labs and practical examples, participants use R and, in particular, FactoMineR, an R package dedicated to exploratory data analysis.
Dates
No open dates are currently available. Alternatively, use the in-house option.
Learn with customized examples and content, precisely tailored to your requirements.
Your benefits at a glance
- Flexible preferred date
- Customized content
- Intensive exchange
- High practical relevance
Description
Use different techniques to discover important variables, combine variables into factors, and discover differences and similarities in your data.
Services
- Lunch / catering
- Help with hotel / travel
- Comelio certificate
- Flexible: free cancellation up to one day before
Comelio Media
Still looking for additional reading? Discover suitable specialist books in our catalog.
Content
Principal Component Analysis (PCA)
Objectives of PCA and Introduction to PCA - Studying Individuals: The Cloud of Individuals, Fitting the Cloud of Individuals - Variables: The Cloud of Variables, Fitting the Cloud of Variables - Relationships - Interpreting the Data - Testing the Significance of the Components - Implementation with R and FactoMineR
Correspondence Analysis (CA)
Objectives and the Independence Model - Fitting the Clouds: Row and Column Profiles - Interpreting the Data - Implementation with R and FactoMineR
Multiple Correspondence Analysis (MCA)
Objectives: Studying Individuals, Variables, and Categories - Defining Distances between Individuals and Distances between Categories - CA on the Indicator Matrix: Relationship between MCA and CA, The Cloud of Individuals, Variables, and Categories - Implementation with R and FactoMineR
Clustering
Concepts of Similarity and Distance: Similarity between Individuals and Groups - Ward's Method - Partitioning and Hierarchical Clustering - Direct Search for Partitions: K-means Algorithm - Clustering and Principal Component Methods - Implementation with R and FactoMineR
Multiple Factor Analysis (MFA)
Factorial Analysis of Mixed Data - Weighting Groups of Variables - Comparing Groups of Variables and the INDSCAL Model - Qualitative and Mixed Data - Multiple Factor Analysis and Procrustes Analysis - Hierarchical Multiple Factor Analysis - Implementation with R and FactoMineR
Instructor
Our trainer for statistics and data mining with R, Marco Skulschus, studied economics in Wuppertal and Paris. For more than 10 years he has worked as a lecturer, as an author of specialist books on databases and data analysis, and as a consultant for statistical analysis with R. Participants in his R seminars work in marketing or quality assurance, or are (aspiring) data scientists who want to use R for statistics and data mining.
Publications
- Grundlagen empirische Sozialforschung (Comelio Medien), ISBN 978-3-939701-23-1
- System und Systematik von Fragebögen (Comelio Medien), ISBN 978-3-939701-26-2
- Oracle SQL (Comelio Medien), ISBN 978-3-939701-41-5
- MS SQL Server - T-SQL - Abfragen und Analysen (Comelio Medien), ISBN 978-3-939701-69-9
