Exploratory Data Analysis using R

Course Overview

ID 2201005
Duration 3.0 days
Methods Presentation with examples and hands-on labs.
Prerequisites Basic knowledge of R and statistics
Target group Data Analysts
Preceding course 2201001

Overview

Exploratory Data Analysis (EDA) is a statistical approach to analyzing data sets in order to summarize their main characteristics. This training focuses on four main EDA techniques: Principal Component Analysis (PCA) for quantitative variables, Correspondence Analysis (CA) and Multiple Correspondence Analysis (MCA) for categorical variables, and hierarchical and partitioning clustering methods. As umbrella techniques, the training also covers Factor Analysis (FA) and Multiple Factor Analysis (MFA). For the hands-on labs and practical examples, participants use R and, in particular, FactoMineR, an R package for exploratory data analysis.

Dates

There are currently no open dates available. Please use the in-house option instead.

Learn with customized examples and content, precisely tailored to your requirements.

Your benefits at a glance

  • Flexible preferred date
  • Customized content
  • Intensive exchange
  • High practical relevance

Description

Use different techniques to discover important variables, combine variables into factors, and discover differences and similarities in your data.

Services

  • Lunch / catering
  • Help with hotel / travel
  • Comelio certificate
  • Flexible: free cancellation up to one day before the course

Content

Principal Component Analysis (PCA)

  • Objectives of PCA and Introduction to PCA
  • Studying Individuals: The Cloud of Individuals, Fitting the Cloud of Individuals
  • Variables: The Cloud of Variables, Fitting the Cloud of Variables
  • Relationships
  • Interpreting the Data
  • Testing the Significance of the Components
  • Implementation with R and FactoMineR

Correspondence Analysis (CA)

  • Objectives and the Independence Model
  • Fitting the Clouds: Row and Column Profiles
  • Interpreting the Data
  • Implementation with R and FactoMineR

Multiple Correspondence Analysis (MCA)

  • Objectives: Studying Individuals, Variables, and Categories
  • Defining Distances between Individuals and Distances between Categories
  • CA on the Indicator Matrix: Relationship between MCA and CA, The Cloud of Individuals, Variables, and Categories
  • Implementation with R and FactoMineR

Clustering

  • Concepts of Similarity and Distance: Similarity between Individuals and Groups
  • Ward's Method
  • Partitioning and Hierarchical Clustering
  • Direct Search for Partitions: K-means Algorithm
  • Clustering and Principal Component Methods
  • Implementation with R and FactoMineR

Multiple Factor Analysis (MFA)

  • Factorial Analysis of Mixed Data
  • Weighting Groups of Variables
  • Comparing Groups of Variables and INDSCAL Model
  • Qualitative and Mixed Data
  • Multiple Factor Analysis and Procrustes Analysis
  • Hierarchical Multiple Factor Analysis
  • Implementation with R and FactoMineR

Instructor

Our trainer for statistics and data mining with R, Marco Skulschus, studied economics in Wuppertal and Paris. For more than 10 years he has worked as a lecturer, as an author of specialist books on databases and data analysis, and as a consultant for statistical analysis with R. Participants in his R seminars work in marketing or quality assurance, or are (aspiring) data scientists who want to use R for statistics and data mining.

Publications

  • Grundlagen empirische Sozialforschung (Comelio Medien), ISBN 978-3-939701-23-1
  • System und Systematik von Fragebögen (Comelio Medien), ISBN 978-3-939701-26-2
  • Oracle SQL (Comelio Medien), ISBN 978-3-939701-41-5
  • MS SQL Server - T-SQL - Abfragen und Analysen (Comelio Medien), ISBN 978-3-939701-69-9

Projects

As a consultant, Mr. Skulschus designs analysis systems based on relational databases and then develops statistical models and analyses in R. His clients include market research companies, marketing departments, quality assurance and process optimization departments, and research institutions.

Research

He led a multi-year research project to develop a questionnaire system with an ontology-based data model and innovative question-answer representations. The project was funded by the BMWi and carried out in collaboration with various universities.