Azure Data Factory is a cloud-based data integration service that orchestrates and automates the movement and transformation of data.

Step 1: I will place multiple .csv files on the local drive in “D:\Azure Data Files\InternetSales”, as shown in the screenshot below.

I1
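Before copying anything, it can help to confirm which files the source folder actually contains. The snippet below is a minimal sketch, not part of the original walkthrough: the helper name list_csv_files is hypothetical, and it only assumes the folder path from Step 1.

```python
from pathlib import Path


def list_csv_files(folder):
    """Return the sorted names of the .csv files directly inside `folder`."""
    folder = Path(folder)
    if not folder.is_dir():
        # The folder does not exist on this machine, so there is nothing to list.
        return []
    return sorted(
        p.name
        for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() == ".csv"
    )


# Folder path taken from Step 1; adjust it for your own machine.
for name in list_csv_files(r"D:\Azure Data Files\InternetSales"):
    print(name)
```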

Step 2: I will create an Azure Data Lake Store in Azure, as shown in the screenshot below.

I2

Click Data Explorer. It opens a new window where you can create a new folder in the Azure Data Lake Store.

I3

When you click New Folder, you are asked to enter the name of the folder, in this case “InternetSales”. Click OK.

I4

Step 3:  Azure Data Lake Store uses Azure Active Directory for authentication. Before authoring an application that works with Azure Data Lake Store or Azure Data Lake Analytics, you must decide how to authenticate your application with Azure Active Directory (Azure AD). The two main options available are:

  • End-user authentication
  • Service-to-service authentication

This walkthrough uses service-to-service authentication, which requires creating and configuring an Azure AD web application for Azure Data Lake Store. The following steps show how to create the Azure AD app.

I5

Click Azure Active Directory. It opens a new blade.

I6

Click “New application registration”.

I7

Click the Create button.

I8

Click “DataLakeApp”, which is highlighted in red.

I

The Application ID is used as the Service Principal ID in the Azure Data Lake Store Linked Service.

Click “Keys” to generate the Service Principal Key, as shown in the screenshot below.

I9

Note: The Application ID and the key value are used as the Service Principal ID and the Service Principal Key, respectively.
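To make the role of these two values concrete, here is a rough sketch of the OAuth 2.0 client-credentials request that a service-to-service client would send to Azure AD. It only builds the request and does not send it. The function name build_token_request and the placeholder values are hypothetical, and the sketch assumes the Azure AD v1 token endpoint with https://datalake.azure.net/ as the resource; Azure Data Factory performs this exchange for you once the linked service is configured.

```python
from urllib.parse import urlencode

# Resource identifier for Azure Data Lake Store (assumption: Azure AD v1 endpoint).
ADLS_RESOURCE = "https://datalake.azure.net/"


def build_token_request(tenant_id, application_id, key):
    """Build the URL and form body of a client-credentials token request."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/token"
    body = urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": application_id,  # Application ID = Service Principal ID
            "client_secret": key,         # key value = Service Principal Key
            "resource": ADLS_RESOURCE,
        }
    )
    return url, body


# Placeholder values; substitute the ones from your own app registration.
url, body = build_token_request("<tenant-id>", "<application-id>", "<key>")
print(url)
print(body)
```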

Step 4: Grant the Azure AD app access to the “InternetSales” folder in the Azure Data Lake Store.

In the Azure Data Lake Store, click Data Explorer. It shows all the folders created in the Azure Data Lake Store.

Click the InternetSales folder.

I10

I11

Click the “Access” button.

I12

Click Select User or Group, search for the web app created in Azure Active Directory, and set the permissions as shown below.

I100

Click OK.

You will then see the permissions assigned on that folder in the Azure Data Lake Store.

I13

Step 5: Download and install the Data Management Gateway on the machine from which the files will be copied into Azure Data Lake Store.

Step 6: Using Azure Data Factory, let us create the following:

  • A Linked Service for Azure Data Lake Store
  • A Linked Service for the On-Premises File System
  • A Dataset for Azure Data Lake Store
  • A Dataset for the On-Premises File System
  • A Pipeline that connects the On-Premises File System dataset to the Azure Data Lake Store dataset and runs on a regular schedule

I101

Click Author and deploy to create the Linked Services, Datasets and Pipelines.

I15

Select Linked services, then click New data store → Azure Data Lake Store.

I111

Click on Azure Data Lake Store.

servicePrincipalId and servicePrincipalKey are the Application ID and the key value generated for the web app registration in Azure Active Directory.
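As an illustration, the sketch below builds a linked service definition of the kind the editor expects and prints it as JSON. It assumes the Azure Data Factory version 1 JSON layout for the AzureDataLakeStore type. The function name, the linked service name and every value in angle brackets are hypothetical placeholders, not values from this walkthrough.

```python
import json


def adls_linked_service(name, account_name, service_principal_id,
                        service_principal_key, tenant,
                        subscription_id, resource_group):
    """Build an Azure Data Lake Store linked service definition (ADF v1 layout assumed)."""
    return {
        "name": name,
        "properties": {
            "type": "AzureDataLakeStore",
            "typeProperties": {
                "dataLakeStoreUri": f"https://{account_name}.azuredatalakestore.net/webhdfs/v1",
                "servicePrincipalId": service_principal_id,    # Application ID of the web app
                "servicePrincipalKey": service_principal_key,  # key value generated under Keys
                "tenant": tenant,
                "subscriptionId": subscription_id,
                "resourceGroupName": resource_group,
            },
        },
    }


definition = adls_linked_service(
    name="AzureDataLakeStoreLinkedService",  # hypothetical name
    account_name="<account-name>",
    service_principal_id="<application-id>",
    service_principal_key="<key-value>",
    tenant="<tenant>",
    subscription_id="<subscription-id>",
    resource_group="<resource-group>",
)
print(json.dumps(definition, indent=2))
```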

Select Linked services, then click New data store → File System.

A1
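The File System linked service could look roughly like the definition built below. This is a sketch that assumes the Azure Data Factory version 1 layout for the OnPremisesFileServer type, where a local drive on the gateway machine is given as the host. The function name, linked service name, gateway name and credentials are hypothetical placeholders.

```python
import json


def file_system_linked_service(name, host, user_id, password, gateway_name):
    """Build an on-premises file system linked service definition (ADF v1 layout assumed)."""
    return {
        "name": name,
        "properties": {
            "type": "OnPremisesFileServer",
            "typeProperties": {
                "host": host,                 # drive or share that holds the files
                "userid": user_id,
                "password": password,
                "gatewayName": gateway_name,  # Data Management Gateway from Step 5
            },
        },
    }


definition = file_system_linked_service(
    name="OnPremisesFileSystemLinkedService",  # hypothetical name
    host="D:\\",                               # local drive used in Step 1
    user_id="<domain>\\<user>",
    password="<password>",
    gateway_name="<gateway-name>",
)
print(json.dumps(definition, indent=2))
```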

 

Click “More”, then select New Dataset → Azure Data Lake Store.

A2
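For reference, a dataset that points at the “InternetSales” folder in the Data Lake Store might be defined as in the sketch below. It assumes the Azure Data Factory version 1 dataset layout. The function name, dataset name, linked service name and hourly availability are hypothetical choices made for illustration.

```python
import json


def adls_dataset(name, linked_service_name, folder_path,
                 frequency="Hour", interval=1):
    """Build an Azure Data Lake Store dataset definition (ADF v1 layout assumed)."""
    return {
        "name": name,
        "properties": {
            "type": "AzureDataLakeStore",
            "linkedServiceName": linked_service_name,
            "typeProperties": {"folderPath": folder_path},
            # How often a slice of this dataset is produced.
            "availability": {"frequency": frequency, "interval": interval},
        },
    }


definition = adls_dataset(
    name="InternetSalesDataLakeDataset",                    # hypothetical name
    linked_service_name="AzureDataLakeStoreLinkedService",  # hypothetical name
    folder_path="InternetSales/",                           # folder created in Step 2
)
print(json.dumps(definition, indent=2))
```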

 

On-Premises File System Dataset

A3
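The on-premises dataset could be sketched in the same way. This assumes the Azure Data Factory version 1 FileShare dataset type, with the folder path given relative to the host of the linked service and "external" set to true because the files are produced outside the data factory. The function name, dataset name, linked service name and availability are hypothetical.

```python
import json


def file_share_dataset(name, linked_service_name, folder_path,
                       frequency="Hour", interval=1):
    """Build an on-premises file dataset definition (ADF v1 layout assumed)."""
    return {
        "name": name,
        "properties": {
            "type": "FileShare",
            "linkedServiceName": linked_service_name,
            "typeProperties": {"folderPath": folder_path},
            "external": True,  # the source files are not produced by a pipeline
            "availability": {"frequency": frequency, "interval": interval},
        },
    }


definition = file_share_dataset(
    name="InternetSalesOnPremDataset",                        # hypothetical name
    linked_service_name="OnPremisesFileSystemLinkedService",  # hypothetical name
    folder_path="Azure Data Files\\InternetSales",            # relative to host D:\
)
print(json.dumps(definition, indent=2))
```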

 

Click “… More” and select “New pipeline” to create the pipeline.

A4
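A pipeline with a single copy activity that reads from the on-premises dataset and writes to the Data Lake Store dataset might look like the sketch below. It assumes the Azure Data Factory version 1 pipeline layout with a FileSystemSource and an AzureDataLakeStoreSink. The function name, pipeline and dataset names, hourly schedule, and start and end times are all hypothetical.

```python
import json


def copy_pipeline(name, input_dataset, output_dataset, start, end):
    """Build a pipeline with one Copy activity (ADF v1 layout assumed)."""
    return {
        "name": name,
        "properties": {
            "description": "Copy .csv files from the on-premises file system "
                           "to Azure Data Lake Store",
            "activities": [
                {
                    "name": "CopyFilesToDataLake",
                    "type": "Copy",
                    "inputs": [{"name": input_dataset}],
                    "outputs": [{"name": output_dataset}],
                    "typeProperties": {
                        "source": {"type": "FileSystemSource"},
                        "sink": {"type": "AzureDataLakeStoreSink"},
                    },
                    # Run the copy once per hour within the active period.
                    "scheduler": {"frequency": "Hour", "interval": 1},
                }
            ],
            "start": start,
            "end": end,
        },
    }


definition = copy_pipeline(
    name="CopyInternetSalesPipeline",               # hypothetical name
    input_dataset="InternetSalesOnPremDataset",     # hypothetical name
    output_dataset="InternetSalesDataLakeDataset",  # hypothetical name
    start="2017-01-01T00:00:00Z",                   # hypothetical active period
    end="2017-01-02T00:00:00Z",
)
print(json.dumps(definition, indent=2))
```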