{"id":10211,"date":"2018-05-09T00:00:00","date_gmt":"2018-05-09T05:00:00","guid":{"rendered":"https:\/\/threecloud.wpengine.com\/post\/azure-data-factory-pipelines-and-activities\/"},"modified":"2022-11-30T09:13:01","modified_gmt":"2022-11-30T15:13:01","slug":"azure-data-factory-pipelines-and-activities","status":"publish","type":"post","link":"https:\/\/3cloudsolutions.com\/resources\/azure-data-factory-pipelines-and-activities\/","title":{"rendered":"Azure Data Factory Pipelines and Activities"},"content":{"rendered":"<p>In this post I\u2019d like to go a bit deeper into Azure Data Factory Version 2 and review pipelines and activities. In essence, a pipeline is a logical grouping of activities. If you\u2019re familiar with SSIS, think of a pipeline as being like an SSIS package: a grouping of activities that act on the data.<\/p>\n<p>For example, a pipeline might pull data from a website, file server or database up into Azure, apply some kind of transformation to that data, and then make it available for reporting. Within the pipeline, multiple activities can be defined. If no dependency is defined between activities \u2013 that is, one activity does not have to wait for another to finish \u2013 then they can run in parallel.<\/p>\n<p>This is good to keep in mind as you design your pipelines, because you may need to schedule activities or define dependencies so that they don\u2019t run in parallel and one runs only after another.<\/p>\n<p><iframe loading=\"lazy\" src=\"https:\/\/www.youtube.com\/embed\/Aq60MPHjmW4\" width=\"560\" height=\"315\" frameborder=\"0\" allowfullscreen=\"allowfullscreen\"><\/iframe><br \/>\nThere are three main types of activities:<\/p>\n<p><strong>1. Data Movement Activities<\/strong> \u2013 These cover the sources you pull data from, such as Azure Blob Storage, Azure Data Lake, Azure SQL Database and Azure SQL Data Warehouse. 
You can also set up an on-premises gateway and pull from commonly used databases such as DB2, MySQL, Oracle, SAP, Sybase and Teradata, as well as NoSQL databases like Cassandra and MongoDB.<\/p>\n<p>I also mentioned files; you can pull from Amazon S3, file systems, FTP, HTTP, etc. You also have the Software as a Service (SaaS) options: Dynamics, HubSpot, Marketo, QuickBooks, and Salesforce, to name a few. You can find the complete list in the Azure online documentation.<\/p>\n<p><strong>2. Data Transformation Activities<\/strong>\u00a0\u2013 This is where you take your data after it\u2019s ingested into Azure and do something with it. Some common ones are the HDInsight Hive, Pig, MapReduce, Hadoop Streaming and Spark transformations. These allow you to transform your big data in your Azure environment and stage it for your reporting.<\/p>\n<p>Other common uses include machine learning on an Azure VM, as well as stored procedures. You can define your stored procedures in SQL Server in Azure and then run them from the pipeline, and you can also use U-SQL with Data Lake Analytics.<\/p>\n<p><strong>3. Control Activities<\/strong> \u2013 With these activities you can do things like execute other pipelines, run a ForEach loop or use a Lookup activity \u2013 the kinds of things that control how the pipeline works and interacts with the data.<\/p>\n<p>Need further help? Our expert team and solution offerings can help your business with any Azure product or service, including Managed Services offerings. 
Contact us at 888-8AZURE or\u00a0 <a href=\"mailto:sales@3cloudsolutions.com\">sales@3cloudsolutions.com<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In this post I\u2019d like to go a bit deeper into Azure Data Factory Version&mldr;<\/p>\n","protected":false},"author":21,"featured_media":9572,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"content-type":"","footnotes":""},"categories":[260],"tags":[],"class_list":["post-10211","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-data-ai","topics-blog"],"acf":[],"_links":{"self":[{"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/posts\/10211","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/users\/21"}],"replies":[{"embeddable":true,"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/comments?post=10211"}],"version-history":[{"count":0,"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/posts\/10211\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/media\/9572"}],"wp:attachment":[{"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/media?parent=10211"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/categories?post=10211"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/tags?post=10211"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}