Microsoft Fabric Data Factory
Microsoft Fabric Data Factory is a cloud-based service for data integration, transformation, and orchestration. In data engineering, it is used to connect data sources in a structured way, automate data flows, and run ETL/ELT processes in a traceable manner.
The focus is on stable data integration, from extraction and transformation through to delivery into analytical target systems.

Data integration

Connecting multiple data sources
Pipelines can be built to extract data from various sources, both on-premises and in the cloud. Typical sources include SQL databases, NoSQL databases, APIs, file storage such as Azure Blob Storage, or third-party sources such as Amazon S3.
Hybrid data processing
Using the on-premises data gateway, data can be processed regardless of whether it is stored on-premises or in the cloud. (The Integration Runtime is the corresponding component in Azure Data Factory.)
ETL and ELT

Data extraction (Extract)
Data is extracted from source systems and prepared for further processing. This can be scheduled or event-driven.
Data transformation (Transform)
Raw data is transformed by applying calculations, validation, cleansing, aggregation, and preparation for analytics. In Fabric Data Factory this is often implemented with Dataflow Gen2 (Mapping Data Flows are the counterpart in Azure Data Factory).
Data loading (Load)
Transformed data is loaded into target systems such as data warehouses, data lakes (e.g., Azure Data Lake), or reporting systems.
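
The three stages above can be sketched in plain Python. This is only an illustration of the extract, transform, and load sequence, not Fabric code: the sample data, the table name sales_by_region, and all function names are hypothetical, and an in-memory SQLite database stands in for the target system.

```python
import csv
import io
import sqlite3

# Hypothetical raw input standing in for an export from a source system.
RAW_CSV = """order_id,region,amount
1,north,10.50
2,south,
3,north,4.50
4,south,20.00
"""


def extract(raw_text):
    """Extract: read the raw rows as dictionaries."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def transform(rows):
    """Transform: drop rows without an amount, then total the amount per region."""
    totals = {}
    for row in rows:
        if not row["amount"].strip():
            continue  # cleansing: skip incomplete records
        region = row["region"].strip().lower()
        totals[region] = totals.get(region, 0.0) + float(row["amount"])
    return totals


def load(totals, connection):
    """Load: write the aggregated result into a target table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS sales_by_region "
        "(region TEXT PRIMARY KEY, total REAL)"
    )
    connection.executemany(
        "INSERT OR REPLACE INTO sales_by_region (region, total) VALUES (?, ?)",
        sorted(totals.items()),
    )
    connection.commit()


def run_etl(raw_text, connection):
    """Run the three stages in order."""
    load(transform(extract(raw_text)), connection)
```

Calling run_etl(RAW_CSV, sqlite3.connect(":memory:")) fills the target table with one row per region. In an ELT variant, the raw rows would be loaded first and the aggregation would run inside the target system.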
Automation

Building data pipelines
Pipelines orchestrate data flows across processes. Activities, conditional execution, and branching can be configured within pipelines.
Workflow automation
Triggers can automate pipelines so they run on schedules or in response to events.
Error handling
Error handling can be configured, for example retry settings and failure paths on activities, so that a process stops safely or recovers when a step fails.
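
To make the ideas of activities, conditional execution, and error handling concrete, here is a minimal sketch in plain Python. It does not use any Fabric API; the function names and the three activities (copy, transform, notify) are assumptions chosen for illustration, and a real pipeline would configure the equivalent behavior in the pipeline designer.

```python
import time


def run_with_retry(activity, max_attempts=3, delay_seconds=0.0):
    """Run an activity, retrying on failure up to max_attempts times."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return activity()
        except Exception as error:  # a real pipeline would catch narrower errors
            last_error = error
            if attempt < max_attempts:
                time.sleep(delay_seconds)
    raise RuntimeError(
        f"activity failed after {max_attempts} attempts"
    ) from last_error


def run_pipeline(copy_activity, transform_activity, notify_activity):
    """Copy with retries, transform only if rows arrived, notify if the copy fails."""
    log = []
    try:
        row_count = run_with_retry(copy_activity)
        log.append(("copy", "succeeded"))
    except RuntimeError:
        # Failure path: record the failure, send a notification, stop safely.
        log.append(("copy", "failed"))
        notify_activity("copy failed")
        log.append(("notify", "succeeded"))
        return log

    if row_count > 0:  # conditional execution
        transform_activity(row_count)
        log.append(("transform", "succeeded"))
    else:
        log.append(("transform", "skipped"))
    return log
```

The returned log lists each activity with its outcome, which mirrors how a run history shows which branch of a pipeline was taken.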
Data migration

Migration from on-premises to cloud
Data can be migrated from local systems to the cloud, with Microsoft Fabric Data Factory handling both the transfer and any transformation required along the way.
Data movement between cloud services
Data can be moved between different cloud services, for example from Amazon S3 to Azure Blob Storage or between Azure services.
Data preparation for machine learning

Data preprocessing for ML
Data can be prepared for machine learning models by cleansing, formatting, and aggregating it, often in combination with Microsoft Fabric machine learning workflows.
Automation of ML data pipelines
Data pipelines can be automated to deliver continuously updated datasets for machine learning.
Data governance and security

Access control
Security policies can be implemented to control access to sensitive data and ensure that only authorized users can access specific pipelines and sources.
Compliance and auditing
Traceability can be supported by logging data movement and transformation steps to meet compliance and auditing requirements.
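
As an illustration of what logging a data movement step might look like, the following sketch builds one structured audit line per step. The field names and the function audit_record are hypothetical and are not part of Fabric; they only show the kind of information (when, which pipeline, which step, from where, to where, how many rows) that makes a run traceable.

```python
import json
from datetime import datetime, timezone


def audit_record(pipeline, step, source, target, rows, timestamp=None):
    """Build one JSON audit line describing a data movement or transformation step."""
    when = timestamp or datetime.now(timezone.utc)
    return json.dumps(
        {
            "timestamp": when.isoformat(),
            "pipeline": pipeline,
            "step": step,
            "source": source,
            "target": target,
            "rows": rows,
        },
        sort_keys=True,
    )
```

Writing one such line per step to an append-only store gives auditors a chronological record that can be filtered by pipeline or source.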
Features
Microsoft Fabric Data Factory is a cloud-based service for data integration that enables organizations to connect, transform, and manage data from multiple sources.

Data integration
Microsoft Fabric Data Factory supports collecting and integrating data from sources such as databases, APIs, file systems, and cloud services (e.g., Azure Blob Storage, SQL Server, Salesforce, Amazon S3).
ETL/ELT processes
Microsoft Fabric Data Factory provides tools for ETL (Extract, Transform, Load) and ELT (Extract, Load, Transform).
It supports extracting data from multiple sources, transforming it (e.g., data cleansing, calculations), and loading it into targets such as data warehouses or data lakes.
Data orchestration
A visual interface allows users to build data pipelines. Pipelines consist of activities and connections that extract, transform, and load data.
Data flows and transformations
With Dataflow Gen2 in Fabric, complex transformations can be created without manually writing code, using the visual Power Query editor to design the data flow.
Scalability and performance
Data Factory can process large data volumes efficiently and scale dynamically as requirements change, leveraging Fabric’s cloud capabilities for parallel processing.
Automation and scheduling
Pipelines can be automated to run at scheduled intervals or in response to specific events.
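
The idea of a fixed-interval schedule can be sketched as follows. This is a conceptual illustration only, assuming a schedule defined by a start time and a repeat interval; the function next_runs is hypothetical and does not reflect how Fabric evaluates schedules internally.

```python
from datetime import datetime, timedelta


def next_runs(start, interval, now, count=3):
    """Return the next `count` run times on a fixed-interval schedule after `now`."""
    if interval <= timedelta(0):
        raise ValueError("interval must be positive")
    if now < start:
        first = start
    else:
        # Number of whole intervals already elapsed since the schedule started.
        elapsed_intervals = (now - start) // interval
        first = start + (elapsed_intervals + 1) * interval
    return [first + i * interval for i in range(count)]
```

For a schedule starting at 06:00 with a six-hour interval, a call made at 13:30 on the same day yields 18:00, then 00:00 and 06:00 on the following day.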
Frequently Asked Questions about Microsoft Fabric Data Factory
In this FAQ you will find the topics that come up most frequently in consulting and training. Each answer is concise and refers to further content where appropriate. Is your question missing? Feel free to contact us.

Which data sources can be connected with Microsoft Fabric Data Factory?
It can integrate relational and non-relational databases, APIs, file storage, cloud services, and on-premises systems.
Does Microsoft Fabric Data Factory support hybrid architectures?
Yes. Using the on-premises data gateway, data can be processed from both on-premises systems and cloud environments.
How are transformations implemented?
Transformations can be implemented via Dataflow Gen2, where data can be validated, cleansed, aggregated, and structurally adjusted.
Can Microsoft Fabric Data Factory be used for data migrations?
Yes. It can be used for structured transfer and transformation of data between on-premises and cloud environments, as well as between different cloud services.
