{"id":15656,"date":"2021-11-02T13:30:00","date_gmt":"2021-11-02T20:30:00","guid":{"rendered":"https:\/\/devwww.3cloudsolutions.com\/post\/creating-efficient-power-bi-datasets-over-snowflake-3\/"},"modified":"2023-10-04T10:20:06","modified_gmt":"2023-10-04T17:20:06","slug":"creating-efficient-power-bi-datasets-over-snowflake","status":"publish","type":"post","link":"https:\/\/3cloudsolutions.com\/resources\/creating-efficient-power-bi-datasets-over-snowflake\/","title":{"rendered":"Creating Efficient Power BI Datasets Over Snowflake"},"content":{"rendered":"<p>For the past few months, we&#8217;ve been helping a global Financial Services company with their ongoing Snowflake implementation, which will ultimately sunset older, on-premises data marts in Oracle and SQL Server. Part of this engagement has included Power BI QuickStarts, where we&#8217;ve gotten to roll up our sleeves and put Power BI to work on top of Snowflake warehouses. As we partnered with the client&#8217;s various technology teams, we quickly realized that success lay less in implementing additional features into their environment, than in working to correct several modeling assumptions they were making within Power BI.<\/p>\n<p>Inspired by ideas presented in <a href=\"https:\/\/info.microsoft.com\/ww-thankyou-build-scalable-BI-solutions-using-power-BI-and-snowflake.html?lcid=en-us\">Microsoft and Snowflake\u2019s joint webinar<\/a>, and actual issues encountered during our client\u2019s QuickStarts, this post will aim to educate on how to create efficient Power BI solutions on top of Snowflake.<span style=\"font-size: 20px;\"><!--more--><\/span><\/p>\n<p style=\"font-size: 45px;\"><span style=\"color: #007cba;\"><img decoding=\"async\" style=\"width: 1000px;\" src=\"https:\/\/3cloudsolutions.com\/wp-content\/uploads\/2022\/10\/iStock-665418054-1.jpg\" alt=\"iStock-665418054-1\" width=\"1000\" \/><\/span><span style=\"color: #007cba;\">Connecting to Snowflake<\/span><\/p>\n<p>The first step to 
building Power BI solutions on top of Snowflake is to connect to a Snowflake warehouse. This post assumes you already know how to use the native Snowflake connector in Power BI to connect to a warehouse. If this is a new concept, Microsoft\u2019s documentation has an easy-to-follow <span style=\"color: #007cba;\"><a style=\"color: #007cba;\" href=\"https:\/\/docs.microsoft.com\/en-us\/power-bi\/connect-data\/desktop-connect-snowflake\">tutorial<\/a><\/span> on how to establish a connection from Power BI Desktop.<\/p>\n<p>Power BI also supports SSO access to Snowflake. While the topic of Snowflake SSO is out-of-scope for this post, this <span style=\"color: #007cba;\"><a style=\"color: #007cba;\" href=\"https:\/\/docs.snowflake.com\/en\/user-guide\/oauth-powerbi.html\">Snowflake documentation<\/a><\/span> has a thorough walkthrough on the process end-to-end. For Power BI Administrators, there is a <a href=\"https:\/\/docs.microsoft.com\/en-us\/power-bi\/connect-data\/service-connect-snowflake#admin-portal\"><span style=\"color: #007cba;\">tenant setting<\/span><\/a> you need to be aware of, and enable, if looking to leverage Snowflake SSO.<\/p>\n<p><img decoding=\"async\" style=\"width: 750px; margin-left: auto; margin-right: auto; display: block;\" src=\"https:\/\/3cloudsolutions.com\/wp-content\/uploads\/2022\/10\/image-png-Oct-28-2021-02-32-51-46-PM.png\" width=\"750\" \/><\/p>\n<h2 style=\"font-size: 43px;\"><span style=\"color: #007cba;\">Power BI Dataset Connectivity Modes<\/span><\/h2>\n<p>When working with Snowflake, or any <span style=\"color: #007cba;\"><a style=\"color: #007cba;\" href=\"https:\/\/3cloudsolutions.com\/resources\/synapse-vs-snowflake-the-data-warehouse-debate\/\" rel=\"noopener\">relational source system<\/a><\/span>, it\u2019s important to understand what your options are for building semantic models on top of it in Power BI. 
Below, we\u2019ll introduce the three different types of <span style=\"color: #007cba;\"><a style=\"color: #007cba;\" href=\"https:\/\/docs.microsoft.com\/en-us\/power-bi\/connect-data\/service-dataset-modes-understand\">dataset connectivity modes in Power BI<\/a><\/span> and provide general considerations for each.<\/p>\n<h3 style=\"font-weight: bold;\"><span style=\"color: #000000;\">Import<\/span><\/h3>\n<p>Import mode is the most common mode we see when working with customers. With import mode, you ingest a copy of the source data into an in-memory dataset. Power BI\u2019s underlying storage engine (VertiPaq\/xVelocity) provides significant compression capabilities. A quick rule of thumb is that you can typically expect about 10x compression when importing data into Power BI. That said, there are several variables that can affect compression, which we\u2019ll talk more about below.<\/p>\n<div style=\"overflow-x: auto; max-width: 100%; width: 100%; margin: 0px auto;\" data-hs-responsive-table=\"true\">\n<table style=\"width: 100%; border-collapse: collapse; table-layout: fixed; border: 1px solid #ff0201; height: 357.833px;\">\n<tbody>\n<tr style=\"height: 39.0333px;\">\n<td style=\"border: 2px solid #58595b; width: 100%; padding: 5px; text-align: center; height: 39px; background-color: #007cba;\" colspan=\"2\" width=\"312\"><span style=\"color: #ffffff;\"><strong>Import Mode<\/strong><\/span><\/td>\n<\/tr>\n<tr style=\"height: 39.0333px;\">\n<td style=\"border: 2px solid #58595b; width: 50%; padding: 5px; height: 39px; text-align: center;\" width=\"312\"><span style=\"color: #007cba;\"><strong>Pros<\/strong><\/span><\/td>\n<td style=\"width: 50%; padding: 5px; height: 39px; border: 2px solid #58595b; text-align: center;\" width=\"312\"><span style=\"color: #007cba;\"><strong>Cons<\/strong><\/span><\/td>\n<\/tr>\n<tr style=\"height: 112.433px;\">\n<td style=\"border: 1px solid #bdd6ee; width: 50%; padding: 5px; height: 112px; border-color: #58595b;\" 
width=\"312\">Typically, the fastest report and query performance due to the data being stored in-memory within Power BI.<\/td>\n<td style=\"width: 50%; padding: 5px; height: 112px; vertical-align: middle; border: 1px solid #CFE2F3; border-color: #58595b;\" width=\"312\">Data latency could be an issue since this is a cache of the source data. Use <span style=\"color: #007cba;\"><a style=\"color: #007cba;\" href=\"https:\/\/docs.microsoft.com\/en-us\/power-bi\/connect-data\/refresh-scheduled-refresh\">scheduled refreshes<\/a><\/span> to keep your dataset current.<\/td>\n<\/tr>\n<tr style=\"height: 166px;\">\n<td style=\"border: 1px solid #bdd6ee; width: 50%; padding: 5px; height: 166px; border-color: #58595b;\" width=\"312\">Full Power BI feature set is available.<\/td>\n<td style=\"width: 50%; padding: 5px; height: 166px; border: 1px solid #58595b;\" width=\"312\">Large source data will need thoughtful up-front planning. Features like Power BI premium <a href=\"https:\/\/docs.microsoft.com\/en-us\/power-bi\/admin\/service-premium-large-models\"><span style=\"color: #007cba;\">Large Dataset Storage<\/span><\/a> and <a href=\"https:\/\/docs.microsoft.com\/en-us\/power-bi\/connect-data\/incremental-refresh-overview\"><span style=\"color: #007cba;\">Incremental Refresh<\/span><\/a> should be considered for importing large data volumes.<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<p><span style=\"font-size: 20px; color: #000000;\"><img decoding=\"async\" style=\"width: 750px; margin: 0px auto; display: block;\" src=\"https:\/\/3cloudsolutions.com\/wp-content\/uploads\/2022\/10\/image-png-Oct-28-2021-03-08-48-66-PM.png\" width=\"750\" \/><\/span><\/p>\n<h3 style=\"font-weight: bold;\"><span style=\"color: #000000;\">Direct Query<\/span><\/h3>\n<p>With DirectQuery datasets, no data is imported into Power BI. Instead, your Power BI dataset is simply metadata (e.g., tables, relationships) of how your model is structured to query the data source. 
Data is only brought into Power BI reports and dashboards at query-time (e.g., a user runs a report, or changes filters).<\/p>\n<table style=\"border-collapse: collapse; table-layout: fixed; margin-left: auto; margin-right: auto; width: 100%; border: 1px solid #99acc2; height: 435px;\">\n<tbody>\n<tr style=\"height: 39.0333px;\">\n<td style=\"border: 2px solid #58595b; width: 100%; padding: 5px; text-align: center; height: 39px; background-color: #007cba;\" colspan=\"2\" width=\"312\"><span style=\"color: #ffffff;\"><strong>DirectQuery Mode<\/strong><\/span><\/td>\n<\/tr>\n<tr style=\"height: 39.0333px;\">\n<td style=\"border: 2px solid #58595b; width: 50%; padding: 5px; height: 39px; text-align: center;\" width=\"312\"><span style=\"color: #007cba;\"><strong>Pros<\/strong><\/span><\/td>\n<td style=\"width: 50%; padding: 5px; height: 39px; border: 2px solid #58595b; text-align: center;\" width=\"312\"><span style=\"color: #007cba;\"><strong>Cons<\/strong><\/span><\/td>\n<\/tr>\n<tr style=\"height: 110.433px;\">\n<td style=\"height: 110px;\" width=\"312\">Dataset size limits do not apply as all data still resides in the underlying database.<\/td>\n<td style=\"height: 110px;\" width=\"312\">Typically, query-time performance will suffer, even when querying a cloud data warehouse like Snowflake.<\/td>\n<\/tr>\n<tr style=\"height: 137px;\">\n<td style=\"height: 137px;\" width=\"312\">Datasets do not require a scheduled refresh as data is only retrieved at query-time.<\/td>\n<td style=\"height: 137px;\" width=\"312\">Concurrency (e.g., multiple users running reports) against DirectQuery datasets could cause performance issues against the underlying database.<\/td>\n<\/tr>\n<tr style=\"height: 110px;\">\n<td style=\"height: 110px;\" width=\"312\">Near real-time reports can be developed by leveraging <a href=\"https:\/\/docs.microsoft.com\/en-us\/power-bi\/create-reports\/desktop-automatic-page-refresh\">automatic page refresh<\/a> and Power BI Premium.<\/td>\n<td 
style=\"height: 110px;\" width=\"312\">Limited Power Query and data modeling feature set.<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Here&#8217;s a good reference for a thorough <a href=\"https:\/\/docs.microsoft.com\/en-us\/power-bi\/connect-data\/desktop-directquery-about\" rel=\"noopener\"><span style=\"color: #007cba;\">list of considerations<\/span><\/a> when using DirectQuery mode.<\/p>\n<p style=\"font-size: 17px;\"><span style=\"color: black;\"><img decoding=\"async\" style=\"width: 750px;\" src=\"https:\/\/3cloudsolutions.com\/wp-content\/uploads\/2022\/10\/image-png-Oct-28-2021-03-22-56-52-PM.png\" width=\"750\" \/><\/span><\/p>\n<h3 style=\"font-weight: bold;\"><span style=\"color: #000000;\">Composite<\/span><\/h3>\n<p>Composite models attempt to combine the best aspects of Import and DirectQuery modes into a single dataset. With Composite models, data modelers can configure the storage mode for each table in the model. A common design pattern for working with large, frequently updated fact tables is to set the central fact table to DirectQuery mode, and set the surrounding dimension tables to Import or <a href=\"https:\/\/docs.microsoft.com\/en-us\/power-bi\/transform-model\/desktop-storage-mode#propagation-of-the-dual-setting\"><span style=\"color: #007cba;\">dual<\/span><\/a> mode. This pattern keeps the contextual data points (dimension attributes) used for filtering and slicing-and-dicing in memory, while the fact data stays in the source.<\/p>\n<p><img decoding=\"async\" style=\"width: 750px; margin-left: auto; margin-right: auto; display: block;\" src=\"https:\/\/3cloudsolutions.com\/wp-content\/uploads\/2022\/10\/image-png-Oct-28-2021-03-25-40-94-PM.png\" width=\"750\" \/><\/p>\n<h2>Common Mistakes<\/h2>\n<h3>DirectQuery All the Things!<\/h3>\n<p>The first common mistake we encountered in our QuickStarts was that developers used, or wanted to use, only DirectQuery mode. While DirectQuery mode isn\u2019t inherently bad, it should only be explored if you have specific requirements that make it a necessity. 
For example, extremely large data volumes, frequently changing data, or the need for <a href=\"https:\/\/docs.snowflake.com\/en\/user-guide\/oauth-powerbi.html\"><span style=\"color: #007cba;\">Snowflake SSO<\/span><\/a>, are all candidates for DirectQuery.<\/p>\n<h5><span style=\"font-size: 21px; color: #000000;\">BFTs<\/span><\/h5>\n<p>The next most common mistake we encountered in our customers\u2019 datasets was what I call BFTs, or Big Fat\/Flat Tables. While it may be easy to get started with a BFT model (after all, it\u2019s only a single table), you will quickly realize the importance of a well-defined data model. We always recommend <a href=\"https:\/\/3cloudsolutions.com\/resources\/dimensional-modeling-in-the-advanced-analytics-age\/\" rel=\"noopener\">dimensional models<\/a>\/ Star Schemas for our customers because the benefits far outweigh the initial data shaping it will take to create one. We outline reasons you should use a star schema model in the Power BI Optimization Best Practices section below.<\/p>\n<p>Speaking of dimensional models, that brings us to our next common mistake\u2026<\/p>\n<h5 style=\"font-size: 21px;\"><span style=\"color: #000000;\">Bidirectional Relationships Galore!<\/span><\/h5>\n<p>One technology team we partnered with had a great star schema design implemented in Snowflake. However, once that design was imported into Power BI, they broke it with bidirectional relationships. Because they were leveraging bidirectional relationships between each table, they were unable to model a constellation schema (multiple facts) because they kept running into the dreaded \u201cambiguity\u201d errors. This happens when the tabular engine has multiple paths, or directions, it can take to filter a table within the model, but no clear instruction on which one, so rather than guess, it doesn\u2019t evaluate at all. 
Since the relational model was already composed of one-to-many relationships between dimensions and facts, simply changing all the relationships to a \u201csingle\u201d filter direction allowed them to consolidate several Power BI datasets into one. This dataset consolidation then allowed them to easily analyze different business processes by their conformed dimensions within the same reports.<\/p>\n<h5 style=\"font-size: 24px;\"><span style=\"color: #000000;\"><span style=\"font-size: 21px;\">Size of Data &gt; Number of Records<\/span><\/span><\/h5>\n<p>This mistake refers to a common statement we heard throughout our QuickStarts:<br \/>\n\u201cWe want to stress test Power BI with n million rows.\u201d<\/p>\n<p>These customer statements inevitably led to the typical consultant response:<br \/>\n\u201cWell, that depends on\u2026\u201d<\/p>\n<p>The number of records you can import into Power BI depends on several variables, including:<\/p>\n<ul>\n<li>How wide (number of columns) is your table?<\/li>\n<li>What is the cardinality (number of distinct values) for each column?<\/li>\n<li>What is the data type of each column?<\/li>\n<li>What is the data length of each column?<\/li>\n<\/ul>\n<p>A quick example came up when we were importing investment metrics into Power BI. The source table had 100 columns, of which 85 were numeric ratios stored to 12 decimal places. Simply rounding those 85 metrics to four decimal places decreased the cardinality of those columns, which in turn improved how well Power BI could compress the data in memory. The net result was that the dataset\u2019s overall memory footprint was cut in half. There is more on data size in the <em>Power BI Optimization Best Practices<\/em> section below.<\/p>\n<h2><span style=\"font-size: 45px; color: #000000;\"><span style=\"color: #007cba;\">Power BI Optimization Best Practices<\/span><\/span><\/h2>\n<p>This section highlights key Power BI data modeling best practices. 
Many of the best practices listed also appear in the <a href=\"https:\/\/info.microsoft.com\/ww-thankyou-build-scalable-BI-solutions-using-power-BI-and-snowflake.html?lcid=en-us\"><span style=\"color: #007cba;\">Build Scalable BI Solutions Using Power BI and Snowflake webinar<\/span><\/a> mentioned in the Introduction. Additionally, we&#8217;ve added a few best practices from our own experience. Lastly, all the best practices listed can be applied to other relational database sources, too! These aren\u2019t exclusive to connecting to Snowflake.<\/p>\n<h5 style=\"font-size: 24px; font-weight: bold;\"><span style=\"color: #000000;\"><span style=\"font-size: 21px;\">Dimensional Models<br \/>\n<\/span><\/span><\/h5>\n<p>As mentioned above, we always recommend taking the time to properly model your data into a star schema design. There are several reasons for this, including:<\/p>\n<ul>\n<li><strong>Cleaner and easier to use than a BFT<\/strong> \u2013 Related attributes and measures can be stored in the same tables for quick navigation<\/li>\n<li>The <strong>granularity (level of detail) <\/strong>of a BFT may not properly reflect the business model, resulting in workarounds, poor calculations, and limits to scenarios which can be performed<\/li>\n<li>Instead, creating separate <strong>fact tables<\/strong> for each subject area or business process, then relating them with <strong>conformed dimensions<\/strong> will allow you to analyze metrics from different business processes by the dimensions they share<\/li>\n<\/ul>\n<p><span style=\"color: black;\"><img decoding=\"async\" style=\"width: 500px; margin-left: auto; margin-right: auto; display: block;\" src=\"https:\/\/3cloudsolutions.com\/wp-content\/uploads\/2022\/10\/image-png-Oct-28-2021-03-42-28-21-PM.png\" width=\"500\" \/><\/span><\/p>\n<p style=\"text-align: center; font-size: 12px;\"><em>1- e.g., Analyzing sales, support incident requests, and delinquent accounts receivables all for a single 
customer<\/em><\/p>\n<ul>\n<li>You may find <strong>significant memory savings<\/strong> by separating your contextual information out into dimensions, then storing only integer surrogate keys in your fact tables<img decoding=\"async\" style=\"width: 500px; margin-left: auto; margin-right: auto; display: block;\" src=\"https:\/\/3cloudsolutions.com\/wp-content\/uploads\/2022\/10\/image-png-Oct-28-2021-03-43-33-74-PM.png\" width=\"500\" \/>\n<p style=\"text-align: center; font-size: 12px;\"><em>2 &#8211; Simple star schema with a centralized fact table<\/em><\/p>\n<\/li>\n<\/ul>\n<h5 style=\"font-size: 24px; font-weight: bold;\"><span style=\"color: #000000;\"><span style=\"font-size: 21px;\"><br \/>\nLimit Number of Visuals<br \/>\n<\/span><\/span><\/h5>\n<p>Each visual on a report page will generate its own DAX query (yes, even slicers).\u00a0 When using DirectQuery mode, these DAX queries also get translated to SQL and executed against the underlying Snowflake database. For these reasons, it\u2019s imperative to be thoughtful with the design of each report page. You should strive to <span style=\"color: #007cba;\"><a style=\"color: #007cba;\" href=\"https:\/\/3cloudsolutions.com\/resources\/knowing-subtle-data-viz-differences-makes-better-dashboards-and-reporting\/\" rel=\"noopener\">limit the number of visuals on a page<\/a><\/span>, not only for performance, but also in the name of conveying information at a glance.<\/p>\n<h5 style=\"font-size: 24px; font-weight: bold;\"><span style=\"color: #000000;\"><span style=\"font-size: 21px;\">Limit Interactivity between Visuals<br \/>\n<\/span><\/span><\/h5>\n<p>When building report pages, you should also consider the interactivity (e.g., cross-filter, cross-highlight) between your visuals. This is especially true for reports that are developed on top of a DirectQuery dataset or table. 
The reason is that every time you click a data point in a visual with interactivity enabled, Power BI attempts to filter or highlight the other visuals on your page, kicking off another round of queries. You can manually set the interactivity between visuals by selecting a visual, choosing Edit Interactions from the Format tab of the ribbon, then toggling how that visual interacts with the other visuals on the page.<\/p>\n<p><img decoding=\"async\" style=\"width: 750px; margin-left: auto; margin-right: auto; display: block;\" src=\"https:\/\/3cloudsolutions.com\/wp-content\/uploads\/2022\/10\/image-png-Oct-28-2021-03-46-47-48-PM.png\" width=\"750\" \/><\/p>\n<h5 style=\"font-size: 24px; font-weight: bold;\"><span style=\"color: #000000;\"><span style=\"font-size: 21px;\">Query Reduction<br \/>\n<\/span><\/span><\/h5>\n<p>If you want a quicker way to eliminate all interactivity by default, then check out <a href=\"https:\/\/docs.microsoft.com\/en-us\/power-bi\/connect-data\/desktop-directquery-about#report-design-guidance\"><span style=\"color: #007cba;\">Query Reduction<\/span><\/a>. 
Query Reduction is a report-level feature that allows report creators to reduce the number of queries that get generated by interactivity and filtering actions (e.g., slicers and filter pane).<\/p>\n<p>Query Reduction can be accessed from:<br \/>\nFile &#8211;&gt; Options and Settings &#8211;&gt; Options &#8211;&gt; Current File &#8211;&gt; Query Reduction<\/p>\n<p>Query Reduction allows you to:<\/p>\n<ul>\n<li>Disable cross highlighting\/filtering by default<\/li>\n<li>Add an Apply button to each slicer, so you can control when filtering occurs<\/li>\n<li>Add an Apply button to each filter, or the entire filter pane, so you can control when filtering occurs<\/li>\n<\/ul>\n<p><span style=\"color: #007cba; font-size: 48px;\"><img decoding=\"async\" style=\"width: 750px; margin-left: auto; margin-right: auto; display: block;\" src=\"https:\/\/3cloudsolutions.com\/wp-content\/uploads\/2022\/10\/image-png-Oct-28-2021-03-48-59-03-PM.png\" width=\"750\" \/><\/span><\/p>\n<p>Patrick at <span style=\"color: #007cba;\"><a style=\"color: #007cba;\" href=\"https:\/\/guyinacube.com\/\" rel=\"noopener\">Guy in a Cube<\/a><\/span> has a <span style=\"color: #007cba;\"><a style=\"color: #007cba;\" href=\"https:\/\/www.youtube.com\/watch?v=4kVw0eaz5Ws\">great video<\/a><\/span> on Query Reduction and how it can help performance in your DirectQuery reports.<\/p>\n<h5 style=\"font-size: 24px; font-weight: bold;\"><span style=\"font-size: 21px; color: #000000;\">Assume Referential Integrity<\/span><\/h5>\n<p>The <a href=\"https:\/\/docs.microsoft.com\/en-us\/power-bi\/connect-data\/desktop-assume-referential-integrity\"><span style=\"color: #007cba;\">Assume Referential Integrity<\/span><\/a> button is an advanced feature found in the relationships dialog. This button only appears when modifying a relationship in a DirectQuery model. This feature will tell Power BI to generate inner joins instead of outer joins when generating the SQL that executes in Snowflake. 
If you are confident in the data quality of your Snowflake tables, then enabling this feature will allow Power BI to generate more efficient SQL queries to retrieve data.<\/p>\n<h5 style=\"font-size: 24px; font-weight: bold;\"><span style=\"font-size: 21px; color: #000000;\">Only use Bidirectional Filters when Necessary<\/span><\/h5>\n<p>This is a common mistake that we see customers who are new to tabular modeling make all the time. 99% of the time you should stick with a single filter direction in your table relationships. This means that the filter should only flow from the one side (dimension) to the many side (fact) within the model.<\/p>\n<p>However, rather than say \u201cnever use bidirectional relationships\u201d, we think it\u2019s important to call out <a href=\"https:\/\/docs.microsoft.com\/en-us\/power-bi\/guidance\/relationships-bidirectional-filtering\"><span style=\"color: #007cba;\">acceptable use cases<\/span><\/a>. The primary use case involves modeling many-to-many relationships. Take the screenshot below for example. Our customer wanted to group different investment funds:<\/p>\n<ul>\n<li>A fund can belong to more than one fund group<\/li>\n<li>A fund group can contain more than one fund<\/li>\n<\/ul>\n<p>To solve this problem, we created a traditional factless fact\/bridge table to establish the set of keys. The bridge table allows you to create one-to-many relationships between the two dimensions back to the bridge table. The bidirectional relationship comes into play when we want to allow filter, slicer, or grouping capabilities by fund groups. We need the filter to flow from the bridge table to the main fund dimension that is connected to all our downstream facts. 
Without the bidirectional relationship, our fund groups would filter our bridge table, but nothing else.<\/p>\n<p><img decoding=\"async\" style=\"width: 750px; margin-left: auto; margin-right: auto; display: block;\" src=\"https:\/\/3cloudsolutions.com\/wp-content\/uploads\/2022\/10\/image-png-Oct-28-2021-03-53-51-64-PM.png\" width=\"750\" \/><\/p>\n<h5 style=\"font-size: 24px; font-weight: bold;\"><span style=\"font-size: 21px; color: #000000;\"><br \/>\nOnly Bring in Data that&#8217;s Needed for Analysis<\/span><\/h5>\n<p>This is probably the simplest best practice on this list, yet often one of the most violated. It is very common to see customers pull in things like:<\/p>\n<ul>\n<li>Fact table keys<\/li>\n<li>Business\/natural keys<\/li>\n<li>GUIDs that aren\u2019t used in relationships<\/li>\n<li>ETL\/auditing metadata fields<\/li>\n<\/ul>\n<p>Common examples like the list above should remain in your data sources if they aren\u2019t providing any analytical value. In addition to limiting the fields that are brought into your model, you should also consider if there are any ways to filter the data that you bring into your model. The best models are the simplest ones that provide only the data points you need for analysis.<\/p>\n<h5 style=\"font-size: 24px; font-weight: bold;\"><span style=\"font-size: 21px; color: #000000;\">Aggregations<\/span><\/h5>\n<p>Aggregations are a set of Power BI features that allow data modelers and dataset owners to improve query performance over large DirectQuery datasets. 
Currently, there are two types of aggregations in Power BI: <a href=\"https:\/\/docs.microsoft.com\/en-us\/power-bi\/transform-model\/aggregations-advanced\"><span style=\"color: #007cba;\">User-Defined Aggregations<\/span><\/a> and <a href=\"https:\/\/docs.microsoft.com\/en-us\/power-bi\/admin\/aggregations-auto\"><span style=\"color: #007cba;\">Automatic Aggregations<\/span><\/a>.<\/p>\n<p>The premise for both types of aggregations is to boost report performance by reducing the number of \u201cround trips\u201d queries must take to return results to a report. The obvious caveat to aggregations is that they may introduce a level of data latency to your queries if data is being imported into memory, as is the case with Automatic Aggregations. In that case, a report query that can be satisfied by an aggregation is answered from the in-memory aggregate rather than from the data source, so its results are only as current as the last refresh of that aggregate. Aggregations should be explored if you want to improve query-time performance without giving up DirectQuery entirely.<\/p>\n<h5 style=\"font-size: 24px; font-weight: bold;\"><span style=\"font-size: 21px; color: #000000;\">Incremental Refresh<\/span><\/h5>\n<p><a href=\"https:\/\/docs.microsoft.com\/en-us\/power-bi\/connect-data\/incremental-refresh-overview\"><span style=\"color: #007cba;\">Incremental Refresh<\/span><\/a> is an important feature to consider when working with large tables that you would like to import into memory. With Incremental Refresh, partitions are automatically created on your Power BI table based on the amount of history to retain, as well as the partition size you would like to set. 
Since data modelers can define the size of each partition, refreshes will be faster, more reliable, and able to build history over time.<\/p>\n<p>When setting up Incremental Refresh, keep in mind the following:<\/p>\n<ul>\n<li>You must create <strong>RangeStart<\/strong> and <strong>RangeEnd<\/strong> <em>date\/time<\/em> parameters.\u00a0 These parameters <strong>must<\/strong> be set to a date\/time data type and <strong>must<\/strong> be named RangeStart and RangeEnd.<\/li>\n<li>The initial refresh in the Power BI service will take the longest due to the creation of partitions and loading of historical data.\u00a0 Subsequent refreshes will only process the latest partition(s) based on how the feature was configured.<\/li>\n<li>Using Incremental Refresh within a dataset in Premium capacity will allow additional extensibility via the XMLA endpoint.\u00a0 For example, SQL Server Management Studio (SSMS) can be used to manage partitions.<\/li>\n<\/ul>\n<h2><span style=\"color: #007cba; font-size: 45px;\">In Summary<\/span><\/h2>\n<p>Developing efficient Power BI solutions on top of Snowflake doesn\u2019t need to be a daunting task. Rather, it needs thoughtful requirements gathering to ensure the <span style=\"color: #007cba;\"><a style=\"color: #007cba;\" href=\"\/blog\/the-ultimate-list-of-ai-features-in-power-bi\" rel=\"noopener\">optimal Power BI features are being leveraged<\/a><\/span> for the desired use case. In the case of our customer, educating them on common Power BI modeling mistakes and more effective alternatives, opened a door of opportunities that they didn\u2019t think were possible a few months ago.<\/p>\n<p>Our team of BI experts can help you learn more about Power BI and how to use it effectively in your organization. 
<a href=\"\/get-started\/\"><span style=\"color: #007cba;\">Contact 3Cloud<\/span><\/a> today.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Inspired by ideas presented in Microsoft and Snowflake\u2019s joint webinar, and actual issues encountered during a recent client engagement, this post will aim to educate on how to create efficient Power BI solutions on top of a Snowflake warehouse. Part of this engagement included several Power BI QuickStarts, and partnering with different technology teams, to educate the customer to dispel several modeling assumptions and mistakes they were making within Power BI.<\/p>\n","protected":false},"author":21,"featured_media":12375,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"content-type":"","footnotes":""},"categories":[394,260],"tags":[303,305],"class_list":["post-15656","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-business-intelligence","category-data-ai","tag-modern-analytics","tag-modern-bi","topics-blog"],"acf":[],"_links":{"self":[{"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/posts\/15656","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/users\/21"}],"replies":[{"embeddable":true,"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/comments?post=15656"}],"version-history":[{"count":0,"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/posts\/15656\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/media\/12375"}],"wp:attachment":[{"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/media?parent=15656"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/3cloudsolution
s.com\/wp-json\/wp\/v2\/categories?post=15656"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/tags?post=15656"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}