Skip to content
Ed Freeman By Ed Freeman Software Engineer I
Power BI Dataflow refresh polling

If you're a frequent user of the Power BI REST API and Power BI Dataflows, you may have come across the problem that there's seemingly no programmatic way to get the refresh history of a Dataflow. This is unlike Datasets - there is an endpoint for that.

The ability to know the status of a refresh operation is useful when you're performing automated operations, and you need to know that something has succeeded or failed before deciding what to do next. For example, a desired feature in the Power BI Service is to be able to refresh a dataflow, and automatically refresh a dataset that depends on that dataflow. Without a refresh history endpoint, this is made more complicated than necessary.

You could just schedule a dataset refresh to occur 30 minutes after a dataflow refresh, but if that dataflow refresh fails for any reason, or hasn't finished refreshing, then you'll run into problems.

Alternatively, there's a nice workaround written by Olivier Travers to have a Logic App/Power Automate workflow monitoring an inbox to which Dataflow refresh notifications are configured, but with this method one has to add more moving parts to one's solution, and there'd be a reliance on notification emails being sent successfully and in a timely manner upon refresh completion.

It turns out it is possible to programmatically retrieve a Dataflow's refresh history, enabling you to determine a refresh operation's current refresh status. This can serve as a polling endpoint, so you can determine when to perform subsequent operations based on the current refresh status.

The technique for getting the refresh history for a Dataflow outlined in this blog is not ideal, but it works (at least, as of March 2020). All you need is:

  1. An AAD bearer token for Power BI (you can get one of these numerous ways - probably the easiest is using the Power BI PowerShell Get-PowerBIAccessToken cmdlet)
  2. The base url of the API the UI uses to make requests - this differs based on the region of your Power BI tenant. I've highlighted this in one of the screenshots below.

Currently, when you trigger a Dataflow refresh, you get a 200 code and an empty response. This 200 response code doesn't mean that the underlying refresh operation has succeeded - merely that the request has succeeded. This response code is correct - the problem is that there is no way of tracking how the underlying operation is performing.

As mentioned, unlike Datasets, there's no "official" endpoint in the REST API to get the refresh history, but the fact that it's possible to view the refresh history in the UI made me wonder - can I get my hands on the data that populates that dialogue box?

If we enter the dev tools in the browser (I'm using Chrome), then navigate to the Dataflow refresh history, we can see the request that's made to retrieve the history:

Drilling into that request, we can see this:

The thing I've blurred out in the Request URL is an ID of some sort identifying my Dataflow. This isn't the Dataflow ID that you can see in the url when navigating to a Dataflow (as you can probably tell by its length) - it's 'some other ID'. I've also pointed out the baseUrl for my tenant - yours will be different if your tenant is in a different location - be sure to check. Anyway, here we're making a request to https://<baseUrlForYourRegion>/metadata/dataflows/<someSortOfId>/refreshhistories, and the response looks like this:

As shown in the screenshot, the cdsaModelId is the same as the ID that forms part of the request URL. Also shown in the response is a status attribute. As you've probably guessed, this is the status of our refresh, represented as an int (which is probably an enum behind the scenes). After a bit of investigation, we found the mapping to be something along the lines of:

  • 1: Completed successfully
  • 2: Failed (Incorrect credentials/Dataflow schema issues/internal error)
  • 4: In progress
  • 5: ? (Think it's something to do with a now-deleted-Dataflow that used to be on a V2 workspace which belonged to a shared capacity... Maybe?)
  • 8: Failed (Workspace assigned to capacity but capacity switched off)

Note: in brackets are the errors we've seen that have resulted in that particular status code. They're also the only codes that we've seen so far - it would be fair to assume that, given the gaps in the range of numbers above, there almost certainly are more.

I tried making the same request as above in Postman, using the same bearer token. Everything worked fine.

I bet you're now thinking "That's great, but where can I get that cdsaModelId?" I had the same thought. As it transpires, if you hit refresh on your page whilst in the Power BI Service, the browser will make a request that looks like this:

As you can see, the endpoint https://<baseUrlForYourRegion>//powerbi/metadata/app?preferReadOnlySession=true doesn't have any notion of the workspace you're currently in. From what I can tell, the request seems to return lots of details about all the artifacts in all the (V2?) workspaces that you have access to in your tenant.

As you can imagine, the response payload is therefore quite large. It does contain what we need, though. There's an attribute called cdsaModels which seems to list out the dataflows that live in V2 workspaces that you have access to. I think it may also list out some deleted dataflows in those same workspaces (but that might be wrong). Anyway, the object we're after looks like this:

The id is the cdsaModelId we saw at the start of the blog. The displayName is the name of your Dataflow. More crucially, the objectId is the ID of your Dataflow that you see in the url when you navigate to your Dataflow (i.e. the ID that you use for interacting with a specific dataflow in the Dataflows REST API).

This means that as long as you have the object ID of your Dataflow, you can hit this endpoint, extract the cdsaModels attribute, and easily locate your target Dataflow.

If you have a keen eye, you may have noticed that the above object actually includes a lastRefreshStatus attribute. This means that we don't even need the first endpoint we spoke about earlier. However, this endpoint's response is much larger - if you're wanting to poll frequently (every couple of seconds), you may want to hit this endpoint first, grab the cdsaModelId, and then poll the first endpoint we discussed.

And that's it!

Hacky, I know. But this solution to get the refresh history is better than no solution at all!  Hopefully the Power BI team will add a "Get Refresh History In Group" endpoint to the Dataflows API in the near future, but for now this method should serve your needs.

Discover your Power BI Maturity Score by taking our FREE 5 minute quiz.
TPL Dataflow by Example code samples for LINQPad

TPL Dataflow by Example code samples for LINQPad

Howard van Rooijen

TPL Dataflow is one of the most powerful, yet unknown framworks contained within .NET. If you need to do high performance data processing TPL Dataflow should be your first port of call.
Testing Power BI Dataflows using SpecFlow and the Common Data Model

Testing Power BI Dataflows using SpecFlow and the Common Data Model

James Broome

Whilst testing Power BI Dataflows isn't something that many people think about, it's critical that business rules and associated data preparation steps are validated to ensure the right insights are available to the right people across the organisation. Data insights are useless, even dangerous, if they can't be trusted, so despite the lack of "official support" or recommended approaches from Microsoft, endjin treat Power BI solutions just as any other software project with respect to testing - building automated quality gates into the end to end development process. This post outlines an approach that endjin has used to test Power BI Dataflows to add quality gates and build confidence in large and complex Power BI solutions.
Power BI Data Type Mappings

Power BI Data Type Mappings

Ed Freeman

If you've worked with Power BI at all, you'll have probably realised that there are numerous mediums through which you work with (potentially the "same") data. Data types across these mediums can be called different things, but actually refer to the same thing. They can also (unsurprisingly) be called different things and actually mean different things. It's useful to know what the corresponding data types are across these mediums, as you may need to, for example, convert queries from one format to another. This blog and containing report intend to clarify what the corresponding data types are across each of the separate mediums within Power BI.

Ed Freeman

Software Engineer I

Ed Freeman

Ed is a Software Engineer helping to deliver projects for clients of all shapes and sizes, providing best of breed technology solutions to industry specific challenges. He focusses primarily on cloud technologies, data analytics and business intelligence, though his Mathematical background has also led to a distinct interest in Data Science, Artificial Intelligence, and other related fields.

He also curates a weekly newsletter, Power BI Weekly, where you can receive all the latest Power BI news, for free.

Ed won the Cloud Apprentice of the Year at the Computing Rising Star Awards 2019.