{"id":10477,"date":"2020-08-27T00:00:00","date_gmt":"2020-08-27T05:00:00","guid":{"rendered":"https:\/\/threecloud.wpengine.com\/post\/why-use-azure-databricks\/"},"modified":"2022-11-30T09:24:37","modified_gmt":"2022-11-30T15:24:37","slug":"why-use-azure-databricks","status":"publish","type":"post","link":"https:\/\/3cloudsolutions.com\/resources\/why-use-azure-databricks\/","title":{"rendered":"Why Use Azure Databricks?"},"content":{"rendered":"<p><span style=\"font-size: 12.0pt;\">There are many good reasons to use Azure Databricks. In this session of our mini-series on Azure Databricks, I\u2019ll dig deeper into <strong>why you should use Databricks and the advantages that you\u2019ll gain<\/strong>. <\/span><\/p>\n<ul>\n<li><span style=\"font-size: 12.0pt;\">With Databricks you\u2019ll get the <strong>proprietary runtime improvements over Apache Spark<\/strong>. The people who created Spark, which began as a faster alternative to Hadoop MapReduce, went on to found the Databricks company. That platform is now offered on Azure as Azure Databricks, a stand-alone service. <\/span><\/li>\n<li><span style=\"font-size: 12.0pt;\"><strong>You can process huge amounts of data with Databricks, and since it is part of Azure, that data is cloud native<\/strong>. The data can be analyzed, processed, reported on, etc. all in the cloud. There are also many machine learning features to take advantage of. <\/span><\/li>\n<li><span style=\"font-size: 12.0pt;\">It <strong>processes data in memory where it can, which makes it faster than traditional disk-based approaches<\/strong>. <\/span><\/li>\n<li><span style=\"font-size: 12.0pt;\"><strong>Clusters are easier to set up and configure<\/strong>. <\/span><\/li>\n<li><span style=\"font-size: 12.0pt;\">Because your data is stored in the Azure cloud, <strong>storage is separated from compute. This saves you money, as you are charged separately for compute and storage, and the storage is fairly cheap. Also, when you shut down or delete a cluster, your data still lives in the cloud. 
<\/strong><\/span><\/li>\n<\/ul>\n<p><strong><span style=\"font-size: 14.0pt;\">Benefits of Databricks: <\/span><\/strong><\/p>\n<ul>\n<li><span style=\"font-size: 12.0pt;\"><strong>A key benefit is the tight integration with your Azure subscription<\/strong>. <\/span>\n<ul>\n<li><span style=\"font-size: 12.0pt;\">You can <strong>integrate with the Azure Data Lake Store and Blob Storage<\/strong> to store, retrieve, and update data. <\/span><\/li>\n<li><span style=\"font-size: 12.0pt;\"><strong>Azure Data Factory can be used as part of your cloud-based extract, transform, load (ETL) process<\/strong>. <\/span><\/li>\n<li><span style=\"font-size: 12.0pt;\">It has an <strong>Azure Synapse Analytics connector<\/strong>, as well as the ability to connect to Azure DB. <\/span><\/li>\n<li><span style=\"font-size: 12.0pt;\">It integrates with your <strong>Active Directory<\/strong>.<\/span><\/li>\n<li><span style=\"font-size: 12.0pt;\">You can <strong>spin up a cluster using Azure DevOps<\/strong> and maintain your code in a source code repository. By using DevOps, you save time on the administrative tasks that have traditionally come with working with this type of data. <\/span><\/li>\n<li><span style=\"font-size: 12.0pt;\">It uses <strong>Ganglia to collect your cluster metrics<\/strong>. <\/span><\/li>\n<li><span style=\"font-size: 12.0pt;\">Databricks <strong>supports multiple languages<\/strong>. Spark itself is written in Scala, but Databricks also works well with Python, SQL, and R. <\/span><\/li>\n<li><span style=\"font-size: 12.0pt;\">You get a <strong>collaborative notebook environment<\/strong>. As in Google Docs, people can comment in the margins of the notebook, and those comments appear in real time. There is also revision control to track changes. 
<\/span><\/li>\n<li><span style=\"font-size: 12.0pt;\">Databricks <strong>breaks down the silos between data engineers and data scientists<\/strong>, allowing both to work on the same code at the same time across the ELT, machine learning, and other components that you integrate into your workflow. <\/span><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p><span style=\"font-size: 12.0pt;\">There are <strong>many ETL use cases for Azure Databricks, such as genomics mapping, insurance, risk &amp; regulation (fraud detection), IoT, and supply chain, among others<\/strong>.<\/span><\/p>\n<p><iframe loading=\"lazy\" src=\"https:\/\/www.youtube.com\/embed\/_Y4T0g-wTm0\" width=\"560\" height=\"315\" frameborder=\"0\" allowfullscreen=\"allowfullscreen\"><\/iframe><\/p>\n<p><strong><span style=\"font-size: 14.0pt;\">Summary<\/span><\/strong><\/p>\n<p><span style=\"font-size: 12.0pt;\"><strong>Azure Databricks helps developers code quickly on a scalable cluster that is tightly integrated with their Azure subscription.<\/strong> At the end of the day, you can extract, transform, and load your data within Databricks Delta for speed and efficiency. You can also \u2018productionalize\u2019 your notebooks into your Azure data workflows.<\/span><\/p>\n<p><strong>Need further help? Our expert team and solution offerings can help your business with any Azure product or service, including Managed Services offerings. Contact us at 888-8AZURE or <a href=\"mailto:sales@3cloudsolutions.com\">sales@3cloudsolutions.com<\/a>.<\/strong><\/p>\n","protected":false},"excerpt":{"rendered":"<p>There are many good reasons to use Azure Databricks. 
In this session of our mini-series&mldr;<\/p>\n","protected":false},"author":29,"featured_media":10830,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"content-type":"","footnotes":""},"categories":[260],"tags":[],"class_list":["post-10477","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-data-ai","topics-blog"],"acf":[],"_links":{"self":[{"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/posts\/10477","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/users\/29"}],"replies":[{"embeddable":true,"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/comments?post=10477"}],"version-history":[{"count":0,"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/posts\/10477\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/media\/10830"}],"wp:attachment":[{"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/media?parent=10477"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/categories?post=10477"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/tags?post=10477"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}