Reliably refreshing a Semantic Model from Azure Data Factory or Synapse Pipelines

A common requirement we see in Azure Synapse or Azure Data Factory pipelines is the need to refresh a Power BI semantic model as the final step in a data pipeline. This ensures that as soon as new data is processed, it's reflected in the reports and dashboards that drive actions and decisions.
However, there's no out-of-the-box mechanism for doing this in either Azure Synapse or Data Factory (unlike in Microsoft Fabric pipelines). And orchestrating this process introduces a surprising amount of complexity: How do we trigger a refresh programmatically? What happens if multiple pipelines try to refresh the same model at once? How do we monitor the refresh and handle failures gracefully?
In this post, I'll outline a robust and reusable pipeline pattern for Azure Data Factory or Azure Synapse Pipelines that addresses these points. It provides a reliable mechanism for refreshing any semantic model by interacting directly with the Power BI REST API. And if you're using Microsoft Fabric Pipelines, then check out my related post that explains how to achieve the same result in Fabric.
Triggering a model refresh with Web Activity
Unlike Microsoft Fabric, neither Azure Data Factory nor Azure Synapse Pipelines has a built-in activity for refreshing a Power BI semantic model. However, it's easy enough to initiate a new refresh with a Web Activity, making a POST request to the Power BI REST API's /refreshes endpoint.
While this works for triggering a refresh, it's a "fire and forget" approach. Because the API call is asynchronous - it initiates the refresh and immediately returns a 202 Accepted response - the calling pipeline has no visibility into whether the refresh actually succeeds or fails. That's a significant gap if your reports need to reflect the latest data processed by the pipeline run.
On top of that, the Power BI API returns an error if you attempt to start a refresh while one is already in progress for the same semantic model. In a busy environment with multiple data feeds and trigger schedules, it's highly likely that two pipelines will eventually attempt to refresh the same model concurrently, causing one to fail. While the Web Activity offers some basic retry settings (as do all pipeline activities), that alone doesn't provide a reliable solution: we don't know how long a refresh will take, nor how many other processes might be trying to trigger one at the same time.
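To make the failure modes concrete, here's what that fire-and-forget call looks like outside of a pipeline - a minimal Python sketch using only the standard library, assuming you already have an Azure AD access token in hand (the helper names here are mine for illustration, not part of any SDK):

```python
import json
import urllib.error
import urllib.request

PBI_BASE = "https://api.powerbi.com/v1.0/myorg"

def classify_trigger_response(status_code: int) -> str:
    """202 Accepted means the refresh was queued. The API is commonly
    reported to answer with a 400 when a refresh for the same model is
    already in progress - the concurrency conflict described above."""
    if status_code == 202:
        return "accepted"
    if status_code == 400:
        return "conflict"
    return "error"

def trigger_refresh(token: str, workspace_id: str, dataset_id: str) -> str:
    """Fire-and-forget trigger: returns as soon as the POST is answered,
    with no knowledge of whether the refresh itself will succeed."""
    url = f"{PBI_BASE}/groups/{workspace_id}/datasets/{dataset_id}/refreshes"
    req = urllib.request.Request(
        url,
        data=json.dumps({"notifyOption": "NoNotification"}).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    try:
        with urllib.request.urlopen(req) as resp:
            return classify_trigger_response(resp.status)
    except urllib.error.HTTPError as err:
        return classify_trigger_response(err.code)
```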
A More Robust Framework: Polling the Power BI REST API
A more reliable solution is a process that is aware of the semantic model's state. We need our pipeline to intelligently check the status, wait if necessary, and then monitor for completion.
The pattern described below uses a sequence of activities to manage this process:
- Check for an Active Refresh: Before attempting a new refresh, we must first query the API to see if a refresh is already running.
- Trigger the Refresh: If no refresh is active, we can safely initiate a new one.
- Poll for Completion: After triggering, we must wait for the refresh to complete and confirm its final status.
- Handle Success or Failure: Finally, we either complete the pipeline successfully or fail it with a meaningful error if the refresh did not succeed.
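Before building this with pipeline activities, the control flow is easier to see in plain code. The sketch below implements the same four steps, with the actual HTTP calls injected as callables so the sequencing logic stands alone (the function and parameter names are illustrative, not from any SDK):

```python
import time

IN_PROGRESS = "Unknown"  # the Power BI API reports an active refresh as "Unknown"

def refresh_and_wait(get_latest_status, start_refresh, poll_seconds=10):
    """Run the check -> trigger -> poll -> verify sequence.

    get_latest_status: callable returning the status string of the most
        recent refresh ("Unknown", "Completed", "Failed", ...).
    start_refresh: callable that POSTs a new refresh request.
    Returns the final status, or raises if the refresh did not complete.
    """
    # 1. Check for an active refresh: wait for any running refresh to finish.
    while get_latest_status() == IN_PROGRESS:
        time.sleep(poll_seconds)

    # 2. Trigger the refresh: no refresh is active, so it's safe to start one.
    start_refresh()

    # 3. Poll for completion of the refresh we just started.
    status = IN_PROGRESS
    while status == IN_PROGRESS:
        time.sleep(poll_seconds)
        status = get_latest_status()

    # 4. Handle success or failure based on the terminal status.
    if status != "Completed":
        raise RuntimeError(f"Refresh failed with status: {status}")
    return status
```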
To implement this, we'll create a parameterised pipeline that takes the Workspace ID and Dataset ID (Semantic Model ID) as inputs. The core logic is built using Web, Until, and If Condition activities.
First, we need to check if a refresh is already in progress. We can do this by wrapping a Web Activity inside an Until loop. The Web Activity makes a GET request to the /refreshes?$top=1 endpoint of the Power BI API to fetch the latest refresh status.
{
"name": "Until No Refresh Running",
"type": "Until",
"dependsOn": [
{
"activity": "Set Refresh Status to Unknown",
"dependencyConditions": [
"Succeeded"
]
}
],
"userProperties": [],
"typeProperties": {
"expression": {
"value": "@not(equals('Unknown', variables('Refresh Status')))",
"type": "Expression"
},
"activities": [
{
"name": "Check Initial Refresh Status",
"type": "WebActivity",
"dependsOn": [],
"policy": {
"timeout": "0.12:00:00",
"retry": 0,
"retryIntervalInSeconds": 30,
"secureOutput": false,
"secureInput": false
},
"userProperties": [],
"typeProperties": {
"method": "GET",
"url": {
"value": "https://api.powerbi.com/v1.0/myorg/groups/@{pipeline().parameters.WorkspaceId}/datasets/@{pipeline().parameters.DatasetId}/refreshes?$top=1",
"type": "Expression"
},
"connectVia": {
"referenceName": "AutoResolveIntegrationRuntime",
"type": "IntegrationRuntimeReference"
},
"authentication": {
"type": "MSI",
"resource": "https://analysis.windows.net/powerbi/api"
}
}
},
{
"name": "Wait 10 Seconds",
"type": "Wait",
"dependsOn": [
{
"activity": "Set Refresh Status",
"dependencyConditions": [
"Succeeded"
]
}
],
"userProperties": [],
"typeProperties": {
"waitTimeInSeconds": 10
}
},
{
"name": "Set Refresh Status",
"type": "SetVariable",
"dependsOn": [
{
"activity": "Check Initial Refresh Status",
"dependencyConditions": [
"Succeeded"
]
}
],
"policy": {
"secureOutput": false,
"secureInput": false
},
"userProperties": [],
"typeProperties": {
"variableName": "Refresh Status",
"value": {
"value": "@activity('Check Initial Refresh Status').output.value[0].status",
"type": "Expression"
}
}
}
],
"timeout": "0.12:00:00"
}
}
Once the request is made, we can capture the result from the JSON response body and store it in a pipeline variable called Refresh Status. The Power BI REST API documentation explains that the status will be "Unknown" if a refresh is actively running, and "Completed" or "Failed" once it has finished. Knowing this, we can wrap the API query in an Until activity loop, polling the API until we get a response that isn't "Unknown", waiting 10 seconds between each check. We also need to initialise the Refresh Status variable to "Unknown" before the loop begins.
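The expression @activity('Check Initial Refresh Status').output.value[0].status digs the status out of the JSON body returned by the ?$top=1 query, which contains at most one entry in its value array. One edge case worth knowing about: a model that has never been refreshed returns an empty value array, which would make the value[0] expression fail. A hypothetical helper illustrating the same extraction, with that edge case handled (treating an empty history as "no refresh running" is my design choice here, not prescribed by the API):

```python
def latest_refresh_status(response_body: dict) -> str:
    """Mirror of the pipeline expression output.value[0].status applied
    to the JSON returned by GET .../refreshes?$top=1."""
    entries = response_body.get("value", [])
    if not entries:
        # No refresh history at all: treat as terminal so the pipeline
        # can proceed to trigger a fresh refresh (assumption, see above).
        return "Completed"
    return entries[0]["status"]
```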
N.B. If you're using a Service Principal or System Assigned Managed Identity for your Web Activity connection, you will need to ensure the Contributor role (or higher) is applied to the target Power BI workspace to allow triggering refreshes and querying refresh statuses.
Once the loop confirms there is no active refresh, we use another Web Activity to POST to the /refreshes endpoint and trigger a new one. This is followed by a second, near-identical Until loop that polls for the completion of the refresh we just started.
{
"name": "Trigger Refresh",
"type": "WebActivity",
"dependsOn": [
{
"activity": "Until No Refresh Running",
"dependencyConditions": [
"Succeeded"
]
}
],
"policy": {
"timeout": "0.12:00:00",
"retry": 0,
"retryIntervalInSeconds": 30,
"secureOutput": false,
"secureInput": false
},
"userProperties": [],
"typeProperties": {
"method": "POST",
"url": {
"value": "https://api.powerbi.com/v1.0/myorg/groups/@{pipeline().parameters.WorkspaceId}/datasets/@{pipeline().parameters.DatasetId}/refreshes",
"type": "Expression"
},
"connectVia": {
"referenceName": "AutoResolveIntegrationRuntime",
"type": "IntegrationRuntimeReference"
},
"body": {
"notifyOption": "NoNotification"
},
"authentication": {
"type": "MSI",
"resource": "https://analysis.windows.net/powerbi/api"
}
}
},
{
"name": "Poll For Completion",
"type": "Until",
"dependsOn": [
{
"activity": "Trigger Refresh",
"dependencyConditions": [
"Succeeded"
]
}
],
"userProperties": [],
"typeProperties": {
"expression": {
"value": "@not(equals('Unknown', variables('Refresh Status')))",
"type": "Expression"
},
"activities": [
{
"name": "Check Refresh Status",
"type": "WebActivity",
"dependsOn": [],
"policy": {
"timeout": "0.12:00:00",
"retry": 0,
"retryIntervalInSeconds": 30,
"secureOutput": false,
"secureInput": false
},
"userProperties": [],
"typeProperties": {
"method": "GET",
"url": {
"value": "https://api.powerbi.com/v1.0/myorg/groups/@{pipeline().parameters.WorkspaceId}/datasets/@{pipeline().parameters.DatasetId}/refreshes?$top=1",
"type": "Expression"
},
"connectVia": {
"referenceName": "AutoResolveIntegrationRuntime",
"type": "IntegrationRuntimeReference"
},
"authentication": {
"type": "MSI",
"resource": "https://analysis.windows.net/powerbi/api"
}
}
},
{
"name": "Wait 10s",
"type": "Wait",
"dependsOn": [
{
"activity": "Update Refresh Status",
"dependencyConditions": [
"Succeeded"
]
}
],
"userProperties": [],
"typeProperties": {
"waitTimeInSeconds": 10
}
},
{
"name": "Update Refresh Status",
"type": "SetVariable",
"dependsOn": [
{
"activity": "Check Refresh Status",
"dependencyConditions": [
"Succeeded"
]
}
],
"policy": {
"secureOutput": false,
"secureInput": false
},
"userProperties": [],
"typeProperties": {
"variableName": "Refresh Status",
"value": {
"value": "@activity('Check Refresh Status').output.value[0].status",
"type": "Expression"
}
}
}
],
"timeout": "0.12:00:00"
}
}
Finally, an If Condition activity checks the final status. If it's "Completed", the pipeline succeeds. If it's anything else (e.g. "Failed"), we use a Fail activity to stop the pipeline and report the error. With this in place, a pipeline failure indicates a genuine data processing issue within Power BI, not a conflict with another pipeline run.
{
"name": "Check Status",
"type": "IfCondition",
"dependsOn": [
{
"activity": "Poll For Completion",
"dependencyConditions": [
"Succeeded"
]
}
],
"userProperties": [],
"typeProperties": {
"expression": {
"value": "@equals(variables('Refresh Status'), 'Completed')",
"type": "Expression"
},
"ifFalseActivities": [
{
"name": "Dataset Refresh Failed",
"type": "Fail",
"dependsOn": [],
"userProperties": [],
"typeProperties": {
"message": {
"value": "@concat('Power BI Dataset refresh failed with status of: ', variables('Refresh Status'))",
"type": "Expression"
},
"errorCode": "500"
}
}
]
}
}
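The If Condition and Fail pair reduces to a simple mapping from final status to outcome. A hypothetical helper mirroring the two pipeline expressions (the success check and the concatenated failure message):

```python
def final_outcome(refresh_status: str):
    """Succeed only on 'Completed'; otherwise return the same message
    the Fail activity raises (which uses error code 500)."""
    if refresh_status == "Completed":
        return (True, None)
    return (False, f"Power BI Dataset refresh failed with status of: {refresh_status}")
```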
Wrapping Up
With this pattern in place, we have a parameterised, reusable pipeline that can be called from any other pipeline using an Execute Pipeline activity. It intelligently waits until any previously running refresh has completed before triggering a new one, monitors the outcome, and provides a clear success or failure result.
I've made this pipeline pattern available as a template, which you can import directly into your own environments. Alternatively, you can use the JSON source code to build it up yourself. All of this can be found in the Pipeline Patterns repository, along with other useful reusable samples.
https://github.com/endjin/data-pipeline-patterns
A Quick Note for Microsoft Fabric users
If you're looking to achieve the same result in Microsoft Fabric, the approach is slightly simpler as there's an out-of-the-box Semantic Model Refresh activity. However, the issue with concurrent refreshes still applies, so parts of this pattern can be adapted for use in Fabric to achieve the same level of resilience. I've covered this in a related post!