Azure Data Factory documentation
Azure Data Factory (ADF) is Azure's cloud ETL service for scale-out, serverless data integration and data transformation. It offers a code-free UI for authoring, plus single-pane-of-glass monitoring and management. You can also lift and shift existing SSIS packages to Azure and run them with full compatibility in ADF. The SSIS integration runtime is a fully managed service, so you don't have to manage the underlying infrastructure.
- Data Factory Documentation
- Overview
- Quickstarts
- Tutorials
- List of tutorials
- Copy and ingest data
- Copy data to and from a Fabric Lakehouse
- From Azure Blob Storage to Azure SQL Database
- From a SQL Server database to Azure Blob Storage
- From Amazon Web Services S3 to Azure Data Lake Storage
- From Azure Data Lake Storage Gen1 to Azure Data Lake Storage Gen2
- From Azure SQL Database to Azure Synapse Analytics
- From SAP BW to Azure Data Lake Storage Gen2
- From Microsoft 365 to Azure Blob storage
- Multiple tables in bulk
- Incrementally load data
- From one Azure SQL Database table
- From multiple SQL Server database tables
- Using change tracking information in SQL Server
- Using CDC in Azure SQL MI
- New files by last modified date
- New files by time partitioned file name
- Build a copy pipeline using managed VNet and private endpoints
- Transform data
- Transform data with mapping data flows
- Prepare data with wrangling
- Using external services
- HDInsight Spark
- Databricks Notebook
- Hive transformation in virtual network
- Build mapping dataflow pipeline using managed VNet and private endpoints
- Control Flow
- Run SSIS packages in Azure
- Lineage
- End-to-end labs
- Managed virtual network
- Self-hosted integration runtime
- Samples
- Concepts
- Pipelines and activities
- Pipeline parameters and variables
- Annotations and user properties
- Nested activities
- Linked services
- Datasets
- Pipeline execution and triggers
- Integration runtime
- Data flows
- Transform data with mapping data flows
- Prepare data with Power Query data wrangling
- Change data capture
- Roles and permissions
- Naming rules
- Data redundancy
- How-to guides
- Create a data factory in UI
- Create Data Factory Programmatically
- Author
- Visually author data factories
- Iterative development and debugging
- Deactivate and reactivate
- Management hub
- Source control
- Author a change data capture resource
- Author a change data capture resource with schema evolution
- Connect to Azure DevOps in another tenant
- Manage your environment with DataOps
- Continuous integration and delivery
- Automated publishing for CI/CD
- Deploy linked ARM templates with VSTS
- Connectors
- Connector overview
- Amazon Marketplace Web Service (Deprecated)
- Amazon RDS for Oracle
- Amazon RDS for SQL Server
- Amazon Redshift
- Amazon S3
- Amazon S3 Compatible Storage
- AppFigures
- Asana
- Avro format
- Azure Blob Storage
- Azure AI Search
- Azure Cosmos DB analytical store
- Azure Cosmos DB for NoSQL
- Azure Cosmos DB for MongoDB
- Azure Data Explorer
- Azure Data Lake Storage Gen1
- Azure Data Lake Storage Gen2
- Azure Database for MariaDB
- Azure Database for MySQL
- Azure Database for PostgreSQL
- Azure Databricks Delta Lake
- Azure File Storage
- Azure SQL Database
- Azure SQL Managed Instance
- Azure Synapse Analytics
- Azure Table Storage
- Binary format
- Cassandra
- Common Data Model format
- Concur
- Couchbase
- data.world
- DB2
- Dataverse
- Delimited text format
- Delta format
- Drill
- Dynamics 365
- Dynamics AX
- Dynamics CRM
- Excel format
- File System
- FTP
- GitHub
- Google Ads
- Google BigQuery
- Google Cloud Storage
- Google Sheets
- Greenplum
- HBase
- HDFS
- Hive
- HTTP
- HubSpot
- Iceberg format
- Impala
- Informix
- Jira
- JSON format
- Magento
- MariaDB
- Marketo
- Microsoft 365
- Microsoft Access
- Microsoft Fabric Lakehouse
- Microsoft Fabric Warehouse
- MongoDB
- MongoDB Atlas
- MySQL
- Netezza
- OData
- ODBC
- Oracle
- Oracle Cloud Storage
- Oracle Eloqua
- Oracle Responsys
- Oracle Service Cloud
- ORC format
- Parquet format
- PayPal
- Phoenix
- PostgreSQL
- Presto
- Quickbase
- QuickBooks Online
- REST
- Salesforce
- Salesforce Service Cloud
- Salesforce Marketing Cloud
- SAP Business Warehouse Open Hub
- SAP Business Warehouse MDX
- SAP CDC
- SAP Cloud for Customer
- SAP ECC
- SAP HANA
- SAP Table
- ServiceNow
- SFTP
- SharePoint Online List
- Shopify
- Smartsheet
- Snowflake
- Spark
- SQL Server
- Square
- Sybase
- TeamDesk
- Teradata
- Twilio
- Vertica
- Web Table
- Xero
- XML format
- Zendesk
- Zoho
- Move data
- Copy data using copy activity
- Monitor copy activity
- Delete files using Delete activity
- Copy data tool
- Metadata driven copy data
- Format and compression support
- Copy activity performance
- Preserve metadata and ACLs
- Schema and type mapping
- Fault tolerance
- Data consistency verification
- Copy activity log
- Format and compression support (legacy)
- Transform data
- Execute Data Flow activity
- Execute Power Query activity
- Azure Function activity
- Custom activity
- Databricks Jar activity
- Databricks Notebook activity
- Databricks Python activity
- Data Explorer Command activity
- Data Lake U-SQL activity
- HDInsight Hive activity
- HDInsight MapReduce activity
- HDInsight Pig activity
- HDInsight Spark activity
- HDInsight Streaming activity
- Machine Learning Execute Pipeline activity
- Machine Learning Studio classic Batch Execution activity
- Machine Learning Studio classic Update Resource activity
- Mapping data flows
- Stored Procedure activity
- Script activity
- Compute linked services
- Synapse Notebook activity
- Synapse Spark job definition activity
- Control flow
- Append Variable activity
- Execute Pipeline activity
- Fail activity
- Filter activity
- For Each activity
- Get Metadata activity
- If Condition activity
- Lookup activity
- Set Variable activity
- Set Pipeline Return Value
- Switch activity
- Until activity
- Validation activity
- Wait activity
- Web activity
- Webhook activity
- Data flow transformations
- Parameterize
- Security
- Data movement security considerations
- Data access strategies
- Azure integration runtime IP addresses
- Store credentials in Azure Key Vault
- Use Azure Key Vault secrets in pipeline activities
- Encrypt credentials for self-hosted integration runtime
- Configure outbound allow lists (Preview)
- Credentials in Data Factory
- Managed identity for Data Factory
- Encrypt data factory with customer managed key
- Managed virtual network
- Azure private link for Data Factory
- Azure security baseline
- Settings
- Monitor and manage
- Monitor visually
- Monitor with Azure Monitor
- Monitor with SDKs
- Pipeline failure and error handling
- Monitor pipelines with email notifications
- Monitor pipelines with Microsoft Teams notifications
- Monitor integration runtime
- Monitor managed virtual network integration runtime
- Monitor Azure-SSIS integration runtime
- Run Data Pipelines with Service Level Agreements
- Reconfigure Azure-SSIS integration runtime
- Copy or clone a data factory
- Create integration runtime
- Azure integration runtime
- Self-hosted integration runtime
- Create and configure a self-hosted integration runtime
- Self-hosted integration runtime auto-update and expiration notification
- Shared self-hosted integration runtime
- Automation scripts of self-hosted integration runtime
- Run self-hosted integration runtime in a Windows container
- Diagnostic tool for self-hosted integration runtime
- Monitor self-hosted integration runtime in Azure
- Configure self-hosted integration runtime for log analytics collection
- Azure-SSIS integration runtime
- Run SSIS packages in Azure
- Run SSIS packages in Azure from SSDT
- Run SSIS packages with Azure SQL Managed Instance Agent
- Run SSIS packages with Azure-enabled dtexec
- Run SSIS packages with Execute SSIS Package activity
- Run SSIS packages with Stored Procedure activity
- Schedule Azure-SSIS integration runtime
- Join Azure-SSIS IR to a virtual network
- Configure Self-Hosted IR as a proxy for Azure-SSIS IR
- Enable Microsoft Entra authentication for Azure-SSIS IR
- Connect to data with Windows Authentication
- Save files and connect to file shares
- Provision Enterprise Edition for Azure-SSIS IR
- Built-in and preinstalled components on Azure-SSIS IR
- Customize setup for Azure-SSIS IR
- Install licensed components for Azure-SSIS IR
- Configure high performance for Azure-SSIS IR
- Configure disaster recovery for Azure-SSIS IR
- Clean up SSISDB logs automatically
- Use Azure SQL Managed Instance with Azure-SSIS IR
- Migrate SSIS jobs with SSMS
- Manage packages with Azure-SSIS IR package store
- Create triggers
- Data Catalog and Governance
- Scenarios
- Send email from a pipeline
- Send Microsoft Teams notifications from a pipeline
- Data migration for data lake & EDW
- Azure Machine Learning
- Transformation using mapping data flow
- SSIS migration from on-premises
- Templates
- Overview of templates
- Copy files from multiple containers
- Copy new files by LastModifiedDate
- Bulk copy from database
- Bulk copy from files to database
- Delta copy from database
- Replicate data from SAP CDC
- Detect and mask PII data
- Extract data from PDF source
- Migrate data from Amazon S3 to Azure Storage
- Move files
- Transformation with Azure Databricks
- Call Synapse pipeline with a notebook activity
- Understanding pricing
- Data flow reserved capacity overview
- Data flow understand reservation charges
- Better understand different integration runtime charges
- Plan and manage costs
- FinOps in Azure Data Factory
- Pricing examples
- Pricing overview
- Copy data from AWS S3 to Azure Blob storage
- Copy data and transform with Azure Databricks
- Copy/transform data with dynamic parameters
- Run SSIS packages on Azure-SSIS integration runtime
- Using mapping data flow debug for a workday
- Transform blob data with mapping data flows
- Data integration with Managed VNET
- Get delta data from SAP ECC via SAP CDC in mapping data flows
- Troubleshooting guides
- Azure Data Factory Studio
- Activities
- Change data capture
- Connectors
- Overview and general copy activity errors
- Azure Blob Storage
- Azure Cosmos DB including Azure Cosmos DB for NoSQL
- Azure Data Explorer
- Azure Data Lake Gen1 and Gen2
- Azure Database for PostgreSQL
- Azure Files
- Azure Synapse Analytics, Azure SQL Database, SQL Server, Azure SQL Managed Instance, and Amazon RDS for SQL Server
- Azure Table Storage
- DB2
- Delimited text format
- Dynamics 365, Dataverse (Common Data Service), and Dynamics CRM
- File system
- FTP, SFTP, and HTTP
- Google Ads
- Hive
- MongoDB
- Oracle
- ORC format
- Parquet format
- REST
- Salesforce and Salesforce Service Cloud
- SAP Table, SAP Business Warehouse Open Hub, and SAP ODP
- SharePoint Online list
- Snowflake
- XML format
- Pipeline Triggers
- Data Flows
- Continuous Integration and Deployment
- Security and access control
- Self-hosted Integration Runtimes
- Azure-SSIS Integration Runtime
- Package Execution in Azure-SSIS IR
- Diagnose connectivity in Azure-SSIS IR
- Azure Data Factory known issues
- SAP knowledge center
- Workflow Orchestration Manager
- Tutorials
- Concepts
- Get Started
- How-to
- Create Airflow Environment
- Import DAGs using Azure Blob Storage
- Delete DAGs
- How to change the Airflow password
- Install a Private Package
- Enable Azure Key Vault for Airflow
- Kubernetes secret to access private container registry
- REST APIs for the Airflow integrated runtime
- Retrieve the IP address of an Airflow cluster
- Sync a GitHub repository
- Diagnostic logs and metrics
- CI/CD Patterns
- Airflow Configurations
- Pricing
- Reference
- Resources