Reliably refreshing a Semantic Model from Microsoft Fabric Pipelines

A common requirement we see in Microsoft Fabric pipelines is the need to refresh a Power BI semantic model as the final step in a data pipeline. This ensures that as soon as new data is processed, it's reflected in the reports and dashboards that drive actions and decisions.
Of course, there's a built-in activity for just this purpose in Fabric, and it works well for simple scenarios. However, there are some gotchas to be aware of when you use it in real-world situations. In this post, I'll describe a more robust and reusable solution for reliably refreshing your semantic models, building on the out-of-the-box activity with some of the techniques and patterns I outlined in a related post implementing the same functionality in Azure Data Factory and Azure Synapse Pipelines.
Fabric's Semantic Model Refresh activity
Let's start with the Fabric Data Pipeline's Semantic Model Refresh activity. Once it's added to your pipeline, you simply point it at an existing semantic model, choosing from the pre-loaded list of workspaces and models that Fabric knows about. For a single, isolated refresh, it works exactly as you'd expect. Behind the scenes, the activity uses the Power BI REST API to initiate the refresh. Triggering a refresh is asynchronous, so the activity automatically polls the API until the refresh is complete. If the refresh succeeds, so does the activity, and your pipeline continues (or exits successfully). But if the refresh fails, the activity throws an error and fails the pipeline run.
Triggering multiple refreshes will fail
However, consider a scenario where multiple pipeline runs attempt to refresh the same model concurrently. This isn't uncommon, given the variety of options for triggering pipeline runs automatically. If data is being updated frequently and your semantic model takes a while to process, it's highly likely that you'll hit this scenario.
The problem here is that the Power BI API will throw an error if you attempt to trigger a new refresh while one is already in progress, and the Semantic Model Refresh activity doesn't attempt to handle this gracefully. So, whilst you can have multiple pipeline runs executing simultaneously, when it comes to the model refresh step your pipeline is going to error and report a failure.
While the built-in activity offers some basic retry settings (as do all pipeline activities), this doesn't provide us with a reliable solution. We don't have a clear understanding of how long the refresh will take, nor do we know how many other processes might be trying to trigger a refresh at the same time.
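For context, those retry settings live on each activity's policy. A hypothetical example (the values here are illustrative, not a recommendation) that retries three times, two minutes apart, would look like this:

"policy": {
    "timeout": "0.12:00:00",
    "retry": 3,
    "retryIntervalInSeconds": 120,
    "secureOutput": false,
    "secureInput": false
}

Even with generous values like these, a long-running refresh can outlast the entire retry window, and concurrent runs can keep colliding on each attempt, so retries only mask the underlying problem.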
A more robust approach: polling the Power BI REST API
What we need is a mechanism to reliably trigger a refresh, ensuring that any previous refreshes have completed first. My previous post about doing this in Azure Synapse and Azure Data Factory includes all the pieces we need.
To achieve this, we need to interact with the Power BI REST API directly to query the status of running refreshes. We can do this easily using a simple Web Activity within our Fabric pipeline, parameterising the Workspace ID and the Semantic Model (or Dataset) ID to construct the URL dynamically. We'll issue a GET request to the API, including a query string parameter to filter the results and return only the latest refresh ($top=1).
{
    "name": "Check Refresh Status",
    "type": "WebActivity",
    "dependsOn": [],
    "policy": {
        "timeout": "0.12:00:00",
        "retry": 0,
        "retryIntervalInSeconds": 30,
        "secureOutput": false,
        "secureInput": false
    },
    "typeProperties": {
        "method": "GET",
        "relativeUrl": "/v1.0/myorg/groups/@{pipeline().parameters.WorkspaceId}/datasets/@{pipeline().parameters.DatasetId}/refreshes?$top=1"
    },
    "externalReferences": {
        "connection": ""
    }
}
Connections in Microsoft Fabric work slightly differently to Linked Services in Azure Synapse or Azure Data Factory. The code sample above uses a relative Url to the /refreshes endpoint, so the Web Activity will need a configured Web v2 Connection, with:
- The Base Url property pointing to the Power BI REST API Url: https://api.powerbi.com
- The Token Audience Uri pointing to: https://analysis.windows.net/powerbi/api
Whichever authentication method you use, you'll need to ensure the identity has Contributor permission on the workspace to allow querying of refresh statuses.
Once the request is made, we can capture the result from the JSON response body and store it in a pipeline variable called RefreshStatus. The Power BI REST API documentation explains that the status will be "Unknown" while a refresh is actively running, and "Completed" or "Failed" once it has finished.
if it has finished. Knowing this, we can wrap the API query in an Until
activity loop, polling the API until we get a response that isn't "Unknown"
, waiting for 10 seconds between each check. We also need to initialise the RefreshStatus
variable to "Unknown"
before the loop begins.
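Both the Web Activity above and the loop below assume the pipeline declares two parameters and the RefreshStatus variable. A minimal sketch of those declarations (the names must match the expressions used throughout):

"parameters": {
    "WorkspaceId": { "type": "string" },
    "DatasetId": { "type": "string" }
},
"variables": {
    "RefreshStatus": { "type": "String" }
}

With those declared, the initialisation and polling loop look like this: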
{
    "name": "Set Refresh Status to Unknown",
    "type": "SetVariable",
    "dependsOn": [],
    "policy": {
        "secureOutput": false,
        "secureInput": false
    },
    "typeProperties": {
        "variableName": "RefreshStatus",
        "value": "Unknown"
    }
},
{
    "name": "Until No Refresh Running",
    "type": "Until",
    "dependsOn": [
        {
            "activity": "Set Refresh Status to Unknown",
            "dependencyConditions": [
                "Succeeded"
            ]
        }
    ],
    "typeProperties": {
        "expression": {
            "value": "@not(equals('Unknown', variables('RefreshStatus')))",
            "type": "Expression"
        },
        "activities": [
            {
                "name": "Wait 10 Seconds",
                "type": "Wait",
                "dependsOn": [
                    {
                        "activity": "Set Refresh Status",
                        "dependencyConditions": [
                            "Succeeded"
                        ]
                    }
                ],
                "typeProperties": {
                    "waitTimeInSeconds": 10
                }
            },
            {
                "name": "Set Refresh Status",
                "type": "SetVariable",
                "dependsOn": [
                    {
                        "activity": "Check Refresh Status",
                        "dependencyConditions": [
                            "Succeeded"
                        ]
                    }
                ],
                "policy": {
                    "secureOutput": false,
                    "secureInput": false
                },
                "typeProperties": {
                    "variableName": "RefreshStatus",
                    "value": {
                        "value": "@activity('Check Refresh Status').output.value[0].status",
                        "type": "Expression"
                    }
                }
            },
            {
                "name": "Check Refresh Status",
                "type": "WebActivity",
                "dependsOn": [],
                "policy": {
                    "timeout": "0.12:00:00",
                    "retry": 0,
                    "retryIntervalInSeconds": 30,
                    "secureOutput": false,
                    "secureInput": false
                },
                "typeProperties": {
                    "method": "GET",
                    "relativeUrl": "/v1.0/myorg/groups/@{pipeline().parameters.WorkspaceId}/datasets/@{pipeline().parameters.DatasetId}/refreshes?$top=1"
                },
                "externalReferences": {
                    "connection": ""
                }
            }
        ],
        "timeout": "0.12:00:00"
    }
}
Once the loop exits (either because no refresh was running, or because we've waited for one to finish), we can safely call the original Semantic Model Refresh activity. We can also re-use the two pipeline parameters (Workspace ID and Semantic Model ID) from our Web Activity to parameterise the Semantic Model Refresh activity's settings, using dynamic content.
{
    "name": "Semantic model refresh",
    "type": "PBISemanticModelRefresh",
    "dependsOn": [
        {
            "activity": "Until No Refresh Running",
            "dependencyConditions": [
                "Succeeded"
            ]
        }
    ],
    "policy": {
        "timeout": "0.12:00:00",
        "retry": 0,
        "retryIntervalInSeconds": 30,
        "secureOutput": false,
        "secureInput": false
    },
    "typeProperties": {
        "method": "post",
        "waitOnCompletion": true,
        "commitMode": "Transactional",
        "operationType": "SemanticModelRefresh",
        "groupId": "@{pipeline().parameters.WorkspaceId}",
        "datasetId": "@{pipeline().parameters.DatasetId}"
    },
    "externalReferences": {
        "connection": ""
    }
}
Wrapping up
With this in place, we now have a parameterised pipeline that intelligently waits until any previously running refresh has completed before triggering a new one. We already know that the Semantic Model Refresh activity will automatically poll and wait for the refresh to finish, and will throw an error and fail the pipeline if the refresh itself fails. But if it fails now, we know it's because something genuinely went wrong during the refresh, rather than because of a conflict with another refresh.
This entire pipeline pattern is now reusable. You can call it from any other pipeline using the Execute Pipeline activity whenever you need to reliably refresh a semantic model, simply by passing in the relevant workspace and semantic model IDs.
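For example, an invoking activity might look something like the following sketch. This uses the ADF-style ExecutePipeline JSON shape for illustration; the activity name, pipeline reference, and placeholder IDs are all hypothetical, and Fabric's designer may serialise its equivalent activity differently:

{
    "name": "Refresh Sales Model",
    "type": "ExecutePipeline",
    "typeProperties": {
        "pipeline": {
            "referenceName": "Refresh Semantic Model",
            "type": "PipelineReference"
        },
        "waitOnCompletion": true,
        "parameters": {
            "WorkspaceId": "<your workspace id>",
            "DatasetId": "<your semantic model id>"
        }
    }
}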
I've made this pipeline pattern available as a template, which you can import directly into your own Fabric environments. Alternatively, you can use the JSON source code to build it up yourself. All of this can be found in the Pipeline Patterns repository, along with other useful reusable samples: https://github.com/endjin/data-pipeline-patterns
A Quick Note for Azure Data Factory and Synapse Pipelines Users
If you're looking to achieve the same result in Azure Data Factory or Synapse Pipelines, the approach is slightly different, primarily because there isn't an out-of-the-box Semantic Model Refresh activity. However, with a bit of extra logic (which essentially involves making the API call to trigger the refresh, in addition to polling for its status), you can achieve precisely the same result. I've covered this in detail in a related post!