New HostProperties Data In The ApplicationContext Column In Power BI Log Analytics

If you’re a fan of using Log Analytics for monitoring Power BI activity, then you may have noticed there’s some new data in the ApplicationContext column of the PowerBIDatasetsWorkspace table. Until recently the ApplicationContext column only contained IDs identifying the report and the visual that generated a DAX query (something I blogged about here); it now also contains information on how the Power BI report that generated the query was consumed and an ID for the user session.

Here’s an example of the payload, with the new data in the HostProperties object:

[
  {
    "ReportId": "2beeb311-56c8-471a-83a3-6d7523d40dc7",
    "VisualId": "477b5dc44249fe897411",
    "HostProperties": {
      "ConsumptionMethod": "Power BI Web App",
      "UserSession": "a3d941bd-c374-4e0e-b911-5086310cb345"
    }
  }
]

Here’s an example KQL query that returns the new ConsumptionMethod and UserSession data from HostProperties:

PowerBIDatasetsWorkspace
| where TimeGenerated > ago(1d)
| where OperationName == 'QueryEnd'
| where ApplicationContext != ""
| extend hp = parse_json(ApplicationContext)
| extend ConsumptionMethod = hp.Sources[0].HostProperties.ConsumptionMethod,
    UserSession = hp.Sources[0].HostProperties.UserSession
| project TimeGenerated, EventText, ApplicationName, ConsumptionMethod, UserSession
| order by TimeGenerated desc

ConsumptionMethod is less useful than it might first appear: at the time of writing it only returns data for Power BI reports (and not for other report types, such as paginated reports), although it does allow you to differentiate between different ways of viewing a Power BI report, such as viewing via a browser or viewing via Teams. It should be used in combination with the ApplicationName column to get a fuller picture of how reports are being consumed.
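For example, here’s a minimal sketch of a query that combines the two columns, counting DAX queries by ApplicationName and ConsumptionMethod over the last week. The column and event names are the same ones used in the query above, but the grouping itself is just an illustration:

PowerBIDatasetsWorkspace
| where TimeGenerated > ago(7d)
| where OperationName == 'QueryEnd'
| where ApplicationContext != ""
| extend hp = parse_json(ApplicationContext)
// ConsumptionMethod comes back as a dynamic value, so cast it to a string before grouping
| extend ConsumptionMethod = tostring(hp.Sources[0].HostProperties.ConsumptionMethod)
| summarize QueryCount = count() by ApplicationName, ConsumptionMethod
| order by QueryCount desc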

UserSession is something that I need to explore in more detail. Grouping user activity into sessions is something I blogged about here, but this is the user session ID used internally and therefore a lot more reliable. I don’t know the rules which govern how activity is grouped into sessions though, so I will only blog about this when I find out more.
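In the meantime, one simple way to start exploring it is to group query activity by UserSession and look at the number of queries and the time span of each session. This is just a sketch built on the same columns as the query above; since the rules for how activity is grouped into sessions aren’t documented, treat the results as indicative only:

PowerBIDatasetsWorkspace
| where TimeGenerated > ago(1d)
| where OperationName == 'QueryEnd'
| where ApplicationContext != ""
| extend hp = parse_json(ApplicationContext)
| extend UserSession = tostring(hp.Sources[0].HostProperties.UserSession)
// Not every event carries a UserSession, so filter those out before grouping
| where isnotempty(UserSession)
| summarize QueryCount = count(), SessionStart = min(TimeGenerated), SessionEnd = max(TimeGenerated) by UserSession
| extend SessionDuration = SessionEnd - SessionStart
| order by SessionStart desc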

[Thanks to my colleague Andreea Sandu for this information]

5 thoughts on “New HostProperties Data In The ApplicationContext Column In Power BI Log Analytics”

  1. Hi Chris, thanks for the blog.

    This past week I spent a tremendous amount of time trying to understand how much RAM needed to be available in my Premium capacity to refresh a calculated column and a user hierarchy. The table is about 150 million rows, and no matter what I did, I wasn’t able to refresh the required features in my 25 GB capacity. Even though the final full-year partition was only going to require 2 GB, the act of creating it was taking far more than that (not just 2 GB… I’m guessing it would have been somewhere between 20 GB and 200 GB).

    Eventually I decided it was not worth my time to work on the problem, since there are no tools to investigate memory utilization. I even tried profiler tracing, and there didn’t seem to be any profiler events that displayed the memory utilization of my workspace or capacity.

    Here is a blog from Teo Lachev showing how we would investigate memory back in the early days of AAS:
    https://prologika.com/high-memory-usage-and-calculated-columns/

    Will log analytics give us the metrics we need to investigate the memory consumption that is happening in the “calculate” phase of a Power BI refresh? I haven’t tried it yet, but I’m hoping this is the answer. … There is one part of the docs that says we are limited to getting stuff from the “Workspace-level Log Analytics” configuration scope. Perhaps that precludes the monitoring of RAM usage. If you have a way to monitor RAM during a refresh, please let me know.

    1. No, I don’t think Log Analytics will give you information on the amount of memory consumed on a node over time. You can use the PeakMemory value from a refresh command (see https://blog.crossjoin.co.uk/2023/06/04/monitoring-power-bi-dataset-refresh-memory-and-cpu-usage-with-log-analytics/) to get the maximum amount of memory during a refresh and then just do more granular types of refresh such as refreshing just the table to see how this value changes. In your case, though, you know what the problem is so I don’t see how the memory graph that Teo shows in his blog would help.
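      Here’s a very rough sketch of the kind of query I mean; it assumes the PeakMemory figure shows up in the EventText of the refresh command’s CommandEnd event, so check the post linked above and adjust the event name and the parsing to match what you actually see in your logs:

      PowerBIDatasetsWorkspace
      | where TimeGenerated > ago(7d)
      | where OperationName == 'CommandEnd'
      // Assumed format - the exact wording of the PeakMemory figure in EventText may differ
      | where EventText contains "PeakMemory"
      | extend PeakMemoryKB = tolong(extract(@"PeakMemory[^0-9]*(\d+)", 1, EventText))
      | project TimeGenerated, ApplicationName, PeakMemoryKB, EventText
      | order by TimeGenerated desc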

      1. NOW I know what the problem is. 😉 After spending a whole week ruling out all the things that seemed more obvious to me, like the number of distinct column values, the calculated columns, and so on, it ended up being a very trivial three-level user hierarchy that was swallowing all the memory.

        I still feel fairly new to tabular modeling. I remember using SQL Server 2000 and being able to build very large multidimensional cubes with only 2 GB of RAM for the entire server process (3 GB with large address awareness).

        In tabular modeling, though, things are very different: there isn’t any effort or motive to conserve memory or swap to disk. During the “calculate” phase, it is anyone’s guess how much RAM would need to be added to the service to successfully create this user hierarchy. The error messages are frustrating in that they act like an advertisement to upgrade the Premium tier… and even the advertisements aren’t that actionable, because if 25 GB wasn’t enough RAM for my simple hierarchy, who really knows how much it would have taken? 50 GB? 100 GB?

        IMHO a Power BI customer should not be encouraged to upgrade their capacity for a two-minute RAM spike that only happens once a day and uses 10x or 100x more RAM than normal. I think a visualization of RAM over time is desperately needed so that customers can make better-informed decisions about how to handle these RAM spikes, and about whether it is worth upgrading the whole organization’s capacity. It doesn’t seem like that is ever the right solution when a single dataset (out of hundreds or thousands) briefly spikes RAM from 2 GB to over 25 GB.

        I really appreciate your input and those blogs. I see that you’ve spent a good deal of time raising awareness about memory utilization during refresh operations.

      2. A user hierarchy caused the memory spike? That’s interesting – I didn’t know that user hierarchy creation could be that expensive. What is the cardinality of each of the levels?
