{"id":16013,"date":"2016-02-16T20:52:02","date_gmt":"2016-02-17T04:52:02","guid":{"rendered":"https:\/\/devwww.3cloudsolutions.com\/post\/5-reasons-to-get-excited-about-sql-server-2016-and-big-data-2\/"},"modified":"2023-10-04T15:27:52","modified_gmt":"2023-10-04T22:27:52","slug":"5-reasons-to-get-excited-about-sql-server016-and-big-data","status":"publish","type":"post","link":"https:\/\/3cloudsolutions.com\/resources\/5-reasons-to-get-excited-about-sql-server016-and-big-data\/","title":{"rendered":"5 Reasons to Get Excited About SQL Server 2016 and Big Data"},"content":{"rendered":"<p>As Melissa Coates showed us in her recent article, <a href=\"\/blog\/top-3-reasons-to-upgrade-your-analytics-environment-to-sql-server-2016\">Top 3 Reasons to Upgrade Your Analytics Environment to SQL 2016<\/a>, the upcoming\u00a0release is huge! It includes a large number of new features, many of them enabling deep analytics and integration with Big Data solutions. One of the features\u00a0aligned with Big Data is Polybase.\u00a0 Polybase is included with the\u00a0SQL 2016 Enterprise Edition.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" style=\"display: block; margin-left: auto; margin-right: auto;\" title=\"polybase.jpeg\" src=\"https:\/\/3cloudsolutions.com\/wp-content\/uploads\/2022\/11\/polybase.jpeg\" alt=\"polybase.jpeg\" width=\"716\" height=\"363\" \/><\/p>\n<p><!--more--><\/p>\n<p>Now, Polybase isn&#8217;t exactly new. It was actually developed to be used alongside SQL Server Parallel Data Warehouse (PDW) in the Analytics Platform System (APS)\u00a0appliance. 
This is, however, the first time many enterprise customers will be introduced to Polybase, as APS is more of a niche product as opposed to the broad adoption of SQL Server.<\/p>\n<h2>What is Polybase?<\/h2>\n<p>Polybase is a feature of SQL Server that bridges the gap between SQL and Hadoop.\u00a0 Simply put, it allows a SQL engineer to write a standard T-SQL query that can reach into a Hadoop cluster and return data. It&#8217;s a very powerful tool for organizations that are currently building, or evaluating, a Data Lake environment.\u00a0 Here are 5 reasons why you should be excited about SQL Server 2016 with Polybase:<\/p>\n<h2>Do Big Data without Learning New Tools<\/h2>\n<p>If that isn&#8217;t the biggest selling point for Polybase, then I don&#8217;t know what is!\u00a0 Seriously, this is a very powerful feature of Polybase. It allows data analysts to use the\u00a0very commonly known T-SQL, in a very commonly used development environment &#8212;\u00a0SQL Server Management Studio &#8212; to query data stored in a Hadoop cluster.\u00a0 No Java or MapReduce required!<\/p>\n<h2>Flexible Storage Options<\/h2>\n<p>Data queried with Polybase doesn&#8217;t reside on your SQL Server. It is persisted on external storage.\u00a0 Polybase provides a couple of options here:<\/p>\n<ul>\n<li>HDFS on Hadoop &#8211; The most common distributed Hadoop file system, HDFS, is a great place to store all of your Big Data.\u00a0 HDFS is a distributed, resilient, redundant\u00a0file system designed to store exabytes of information &#8212; a massive amount (5 exabytes could\u00a0encompass all the words ever spoken on Earth in any language, according to the What&#8217;s A Byte website.)\u00a0 Polybase can reach directly into HDFS and return data from Hadoop alongside your SQL Server data.<\/li>\n<li>Windows Azure Blob storage &#8211; If you don&#8217;t have a Hadoop cluster, you can still take advantage of Polybase&#8217;s ability to query external data. 
Simply\u00a0place your data in Windows Azure Blob\u00a0storage and provide Polybase with the location information.<\/li>\n<\/ul>\n<h2>Scalable Performance Management<\/h2>\n<p>Polybase does a pretty good job overall of managing the\u00a0performance of remotely executed queries. It automatically shifts between modes of transporting data to SQL Server for native processing, and remote execution on the Hadoop cluster (when run in Hadoop connectivity mode).<\/p>\n<p>However, some situations arise where the developer would like more control over the performance management. Polybase allows this with full predicate push down. With this mode enabled, Polybase will generate a native MapReduce application that will be executed through YARN on the Hadoop cluster.\u00a0 This mode will allow long-running jobs to take full advantage of parallel processing, with minimal data movement across the wire to SQL Server.<\/p>\n<p>Additionally, when extra horsepower is needed, Azure SQL Data Warehouse can easily be scaled to 2x, 4x, 8x, or even more processing power! \u00a0The environment stays up the entire time it is being scaled, so no downtime is required. \u00a0Scaling works the other way also &#8212; when you don&#8217;t need that much\u00a0<em>oomph, <\/em>scale it back to save money.<\/p>\n<h2>Aligned with Enterprise Security<\/h2>\n<p>Security is always an important concern with any major data management project.\u00a0 Polybase supports Kerberos negotiation through the Hadoop cluster to ensure that queries will only touch data the logged-in user is allowed to see.<\/p>\n<p>Standard SQL Security is included as well, which means you&#8217;ll have object-level access to secure your data. 
\u00a0Transparent Data Encryption (TDE) is also supported to ensure that, in the rare case that someone who shouldn&#8217;t gets their hands on your data, they won&#8217;t be able to make sense of it.<\/p>\n<h2>Platform Support<\/h2>\n<p>Polybase includes support for both on-premises and cloud-based unstructured data storage platforms.\u00a0 Currently, Polybase supports the following Hadoop clusters:<\/p>\n<ul>\n<li>Cloudera CDH 4.3<\/li>\n<li>Cloudera CDH 5.1<\/li>\n<li>Hortonworks HDP 1.3 (Windows\/Linux)<\/li>\n<li>Hortonworks HDP 2.1 (Linux)<\/li>\n<li>Hortonworks HDP 2.1 (Windows\/Linux)<\/li>\n<li>Hortonworks HDP 2.2 (Windows\/Linux)<\/li>\n<\/ul>\n<p>In addition to supporting Hadoop clusters, Polybase also supports the following cloud-based solutions:<\/p>\n<ul>\n<li>Windows Azure HDInsight (Use the appropriate HDP version)<\/li>\n<li>Windows Azure Blob storage<\/li>\n<li>Windows Azure Data Lake (Future support)<\/li>\n<\/ul>\n<p>When Polybase is connected to Windows Azure Blob storage, it cannot perform predicate push down, as there is no underlying processing framework attached.<\/p>\n<p>Polybase is only one of the many new features in SQL Server 2016.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>One of the new features of SQL 2016 aligned to Big Data is Polybase.  
Polybase is a feature of SQL Server that bridges the gap between SQL and Hadoop.<\/p>\n","protected":false},"author":21,"featured_media":15587,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"content-type":"","footnotes":""},"categories":[260],"tags":[304],"class_list":["post-16013","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-data-ai","tag-modern-data-platform","topics-blog"],"acf":[],"_links":{"self":[{"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/posts\/16013","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/users\/21"}],"replies":[{"embeddable":true,"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/comments?post=16013"}],"version-history":[{"count":0,"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/posts\/16013\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/media\/15587"}],"wp:attachment":[{"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/media?parent=16013"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/categories?post=16013"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/tags?post=16013"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}