Cross Database Query Folding For SQL Server Sources In Power Query In Power BI And Excel

A few months ago a new option was added to the Sql.Database and Sql.Databases functions in Power Query in Power BI and Excel which allows Power Query queries that combine data from different SQL Server databases to fold. Here’s a simple example showing how to use it.

On my local PC I have SQL Server installed and the Adventure Works DW 2017 and Contoso Retail DW sample databases:

Both of these databases have date dimension tables called DimDate. Let’s say you want to create a Power Query query that merges these two tables.

Here’s the M code for a Power Query query called DimDate AW to get just the DateKey and CalendarYear columns from the DimDate table of the Adventure Works DW 2017 database:

let
    Source = Sql.Database("localhost", "AdventureWorksDW2017"),
    dbo_DimDate = Source{[Schema="dbo",Item="DimDate"]}[Data],
    #"Removed Other Columns" = Table.SelectColumns(dbo_DimDate,{"DateKey", "CalendarYear"})
in
    #"Removed Other Columns"

Here’s the M code for a Power Query query called DimDate Contoso to get just the Datekey and CalendarYear columns from the DimDate table in the ContosoRetailDW database:

let
    Source = Sql.Database("localhost", "ContosoRetailDW"),
    dbo_DimDate = Source{[Schema="dbo",Item="DimDate"]}[Data],
    #"Removed Other Columns" = Table.SelectColumns(dbo_DimDate,{"Datekey", "CalendarYear"})
in
    #"Removed Other Columns"

Both of these Power Query queries fold. However if you create a third query to merge these two queries (ie do the equivalent of a SQL join between them) on the CalendarYear columns like so:

let
  Source = Table.NestedJoin(
    #"DimDate Contoso",
    {"CalendarYear"},
    #"DimDate AW",
    {"CalendarYear"},
    "DimDate AW",
    JoinKind.LeftOuter
  ),
  #"Expanded DimDate AW" = Table.ExpandTableColumn(
    Source,
    "DimDate AW",
    {"DateKey", "CalendarYear"},
    {"DimDate AW.DateKey", "DimDate AW.CalendarYear"}
  )
in
  #"Expanded DimDate AW"

…this query does not fold, because it combines data from two different SQL Server databases.

However if you edit the Sql.Database function in the Source step of both of the first two queries above to set the new EnableCrossDatabaseFolding option to true, like so:

    Source = 
    Sql.Database(
        "localhost", 
        "ContosoRetailDW", 
        [EnableCrossDatabaseFolding=true]
        ),

…then the query that merges these two queries does fold:

15 thoughts on “Cross Database Query Folding For SQL Server Sources In Power Query In Power BI And Excel

  1. Yep, the main limitation is that it’s only for multiple DBs on the same server, AND where cross-database queries are supported by using multipart names.

    1. It doesn’t work in conjunction with Value.NativeQuery, so there are possible queries which could have been broken if it were enabled by default.

    1. Folding a join pushes the work of joining the two tables back to the SQL Server instead of having to download all of the rows from the two tables into Power BI to join there. Downloading rows is slow so much better to avoid. Joining data can also require a lot of CPU and memory in Power BI – much better to let the database server do it – it is designed for that type of operation and the server can also take advantages of indexes on the tables in the database.
      For large tables (millions of rows) the difference can be very stark – a join done on the database server may take seconds, vs minutes/hours if all the data has to be downloaded to Power BI and joined there.

  2. This was a most awaited feature. Unfortunately it doesn’t seem to work with SSAS, not even 2022. Any chance it will be added? I would like to no longer have to define views just to fake using only one database.

    1. I think that would have to wait until the version of Power Query used by SSAS is updated, and that might not happen until the next major release.

Leave a Reply