By James Broome, Director of Engineering
Reliably refreshing a Semantic Model from Azure Data Factory or Synapse Pipelines

A common requirement we see in Azure Synapse or Azure Data Factory pipelines is the need to refresh a Power BI semantic model as the final step in a data pipeline. This ensures that as soon as new data is processed, it's reflected in the reports and dashboards that drive actions and decisions.

However, there's no out-of-the-box mechanism for doing this in either Azure Synapse or Data Factory (unlike in Microsoft Fabric pipelines). Orchestrating this process also introduces a surprising amount of complexity: How do we trigger a refresh programmatically? What happens if multiple pipelines try to refresh the same model at once? How do we monitor the refresh and handle failures gracefully?

In this post, I'll outline a robust and reusable pipeline pattern for Azure Data Factory or Azure Synapse Pipelines that addresses these points. It provides a reliable mechanism for refreshing any semantic model by interacting directly with the Power BI REST API. And if you're using Microsoft Fabric Pipelines, then check out my related post that explains how to achieve the same result in Fabric.


Triggering a model refresh with Web Activity

Unlike Microsoft Fabric, neither Azure Data Factory nor Azure Synapse Pipelines has a built-in activity for refreshing a Power BI semantic model. However, it's easy enough to initiate a new refresh with a Web Activity, making a POST request to the Power BI REST API's /refreshes endpoint.

While this works for triggering a refresh, it's a "fire and forget" approach. Because the API call is asynchronous - it initiates the refresh and immediately returns a 202 Accepted response - the calling pipeline has no visibility into whether the refresh actually succeeds or fails. That matters if your reports need to reliably show the latest data processed by the pipeline run.

On top of that, the Power BI API will return an error if you attempt to start a refresh while one is already in progress for the same semantic model. In a busy environment with multiple data feeds and trigger schedules, it's highly likely that two pipelines will eventually attempt to refresh the same model concurrently, causing one to fail. While the Web Activity offers some basic retry settings (as do all pipeline activities), this doesn't provide us with a reliable solution. We don't know how long the refresh will take, nor do we know how many other processes might be trying to trigger a refresh at the same time.
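
For context, those basic retry settings live in each activity's policy block. A hypothetical configuration might look like the fragment below; the values are illustrative assumptions, not recommendations.

```json
{
    "policy": {
        "timeout": "0.00:10:00",
        "retry": 3,
        "retryIntervalInSeconds": 60
    }
}
```

A fixed retry count and interval only helps if the competing refresh happens to finish within that window, which is exactly the guesswork we want to avoid.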


A More Robust Framework: Polling the Power BI REST API

A good way to think about a more reliable solution is building a process that is aware of the semantic model's state. We need our pipeline to intelligently check the status, wait if necessary, and then monitor for completion.

The pattern described below uses a sequence of activities to manage this process:

  1. Check for an Active Refresh: Before attempting a new refresh, we must first query the API to see if a refresh is already running.
  2. Trigger the Refresh: If no refresh is active, we can safely initiate a new one.
  3. Poll for Completion: After triggering, we must wait for the refresh to complete and confirm its final status.
  4. Handle Success or Failure: Finally, we either complete the pipeline successfully or fail it with a meaningful error if the refresh did not succeed.

To implement this, we'll create a parameterised pipeline that uses the Workspace ID and Dataset ID (Semantic Model ID) as inputs. The core logic is built using Web, Until, and If Condition activities.
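
As a rough sketch, the pipeline's outer definition might declare its inputs and the status variable as shown below. The parameter names WorkspaceId and DatasetId and the variable name Refresh Status match the expressions used in the activities in this post; the pipeline name is a hypothetical placeholder, and the activities array is left empty here for brevity.

```json
{
    "name": "Refresh Power BI Semantic Model",
    "properties": {
        "parameters": {
            "WorkspaceId": { "type": "string" },
            "DatasetId": { "type": "string" }
        },
        "variables": {
            "Refresh Status": { "type": "String" }
        },
        "activities": []
    }
}
```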

First, we need to check if a refresh is already in progress. We can do this by wrapping a Web Activity inside an Until loop. The Web Activity makes a GET request to the /refreshes?$top=1 endpoint of the Power BI API to fetch the latest refresh status.

{
    "name": "Until No Refresh Running",
    "type": "Until",
    "dependsOn": [
        {
            "activity": "Set Refresh Status to Unknown",
            "dependencyConditions": [
                "Succeeded"
            ]
        }
    ],
    "userProperties": [],
    "typeProperties": {
        "expression": {
            "value": "@not(equals('Unknown', variables('Refresh Status')))",
            "type": "Expression"
        },
        "activities": [
            {
                "name": "Check Initial Refresh Status",
                "type": "WebActivity",
                "dependsOn": [],
                "policy": {
                    "timeout": "0.12:00:00",
                    "retry": 0,
                    "retryIntervalInSeconds": 30,
                    "secureOutput": false,
                    "secureInput": false
                },
                "userProperties": [],
                "typeProperties": {
                    "method": "GET",
                    "url": {
                        "value": "https://api.powerbi.com/v1.0/myorg/groups/@{pipeline().parameters.WorkspaceId}/datasets/@{pipeline().parameters.DatasetId}/refreshes?$top=1",
                        "type": "Expression"
                    },
                    "connectVia": {
                        "referenceName": "AutoResolveIntegrationRuntime",
                        "type": "IntegrationRuntimeReference"
                    },
                    "body": {
                        "notifyOption": "NoNotification"
                    },
                    "authentication": {
                        "type": "MSI",
                        "resource": "https://analysis.windows.net/powerbi/api"
                    }
                }
            },
            {
                "name": "Wait 10 Seconds",
                "type": "Wait",
                "dependsOn": [
                    {
                        "activity": "Set Refresh Status",
                        "dependencyConditions": [
                            "Succeeded"
                        ]
                    }
                ],
                "userProperties": [],
                "typeProperties": {
                    "waitTimeInSeconds": 10
                }
            },
            {
                "name": "Set Refresh Status",
                "type": "SetVariable",
                "dependsOn": [
                    {
                        "activity": "Check Initial Refresh Status",
                        "dependencyConditions": [
                            "Succeeded"
                        ]
                    }
                ],
                "policy": {
                    "secureOutput": false,
                    "secureInput": false
                },
                "userProperties": [],
                "typeProperties": {
                    "variableName": "Refresh Status",
                    "value": {
                        "value": "@activity('Check Initial Refresh Status').output.value[0].status",
                        "type": "Expression"
                    }
                }
            }
        ],
        "timeout": "0.12:00:00"
    }
}
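
For reference, the response that the Check Initial Refresh Status activity reads from has roughly the following shape while a refresh is in progress. The field names follow the Refresh object in the Power BI REST API documentation; the values are made up for illustration, and an in-progress entry may have no endTime.

```json
{
    "value": [
        {
            "requestId": "11111111-2222-3333-4444-555555555555",
            "refreshType": "ViaApi",
            "startTime": "2024-01-01T09:00:00.000Z",
            "status": "Unknown"
        }
    ]
}
```

The expression output.value[0].status therefore picks out the status of the most recent refresh, because $top=1 limits the response to a single entry.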

Once the request is made, we capture the status from the JSON response body and store it in a pipeline variable called "Refresh Status". The Power BI REST API documentation explains that the status will be "Unknown" if a refresh is actively running, or "Completed" or "Failed" if it has finished. Knowing this, we can wrap the API query in an Until activity loop, polling the API until we get a response that isn't "Unknown", waiting for 10 seconds between each check. We also need to initialise the "Refresh Status" variable to "Unknown" before the loop begins.
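
A minimal sketch of that initialisation step is shown below. The activity name comes from the dependsOn reference in the Until loop above; the rest is an assumed, typical Set Variable definition rather than a copy of the template.

```json
{
    "name": "Set Refresh Status to Unknown",
    "type": "SetVariable",
    "dependsOn": [],
    "userProperties": [],
    "typeProperties": {
        "variableName": "Refresh Status",
        "value": "Unknown"
    }
}
```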

N.B. If you're using a Service Principal or System Assigned Managed Identity for your Web Activity connection, you will need to ensure the correct Contributor permission is applied to the target Power BI workspace to allow querying of refresh statuses.

Once the loop confirms there is no active refresh, we use another Web Activity to POST to the /refreshes endpoint and trigger a new one. This is followed by a second, identical Until loop that polls for the completion of the refresh we just started.

{
    "name": "Trigger Refresh",
    "type": "WebActivity",
    "dependsOn": [
        {
            "activity": "Until No Refresh Running",
            "dependencyConditions": [
                "Succeeded"
            ]
        }
    ],
    "policy": {
        "timeout": "0.12:00:00",
        "retry": 0,
        "retryIntervalInSeconds": 30,
        "secureOutput": false,
        "secureInput": false
    },
    "userProperties": [],
    "typeProperties": {
        "method": "POST",
        "url": {
            "value": "https://api.powerbi.com/v1.0/myorg/groups/@{pipeline().parameters.WorkspaceId}/datasets/@{pipeline().parameters.DatasetId}/refreshes",
            "type": "Expression"
        },
        "connectVia": {
            "referenceName": "AutoResolveIntegrationRuntime",
            "type": "IntegrationRuntimeReference"
        },
        "body": {
            "notifyOption": "NoNotification"
        },
        "authentication": {
            "type": "MSI",
            "resource": "https://analysis.windows.net/powerbi/api"
        }
    }
},
{
    "name": "Poll For Completion",
    "type": "Until",
    "dependsOn": [
        {
            "activity": "Trigger Refresh",
            "dependencyConditions": [
                "Succeeded"
            ]
        }
    ],
    "userProperties": [],
    "typeProperties": {
        "expression": {
            "value": "@not(equals('Unknown', variables('Refresh Status')))",
            "type": "Expression"
        },
        "activities": [
            {
                "name": "Check Refresh Status",
                "type": "WebActivity",
                "dependsOn": [],
                "policy": {
                    "timeout": "0.12:00:00",
                    "retry": 0,
                    "retryIntervalInSeconds": 30,
                    "secureOutput": false,
                    "secureInput": false
                },
                "userProperties": [],
                "typeProperties": {
                    "method": "GET",
                    "url": {
                        "value": "https://api.powerbi.com/v1.0/myorg/groups/@{pipeline().parameters.WorkspaceId}/datasets/@{pipeline().parameters.DatasetId}/refreshes?$top=1",
                        "type": "Expression"
                    },
                    "connectVia": {
                        "referenceName": "AutoResolveIntegrationRuntime",
                        "type": "IntegrationRuntimeReference"
                    },
                    "body": {
                        "notifyOption": "NoNotification"
                    },
                    "authentication": {
                        "type": "MSI",
                        "resource": "https://analysis.windows.net/powerbi/api"
                    }
                }
            },
            {
                "name": "Wait 10s",
                "type": "Wait",
                "dependsOn": [
                    {
                        "activity": "Update Refresh Status",
                        "dependencyConditions": [
                            "Succeeded"
                        ]
                    }
                ],
                "userProperties": [],
                "typeProperties": {
                    "waitTimeInSeconds": 10
                }
            },
            {
                "name": "Update Refresh Status",
                "type": "SetVariable",
                "dependsOn": [
                    {
                        "activity": "Check Refresh Status",
                        "dependencyConditions": [
                            "Succeeded"
                        ]
                    }
                ],
                "policy": {
                    "secureOutput": false,
                    "secureInput": false
                },
                "userProperties": [],
                "typeProperties": {
                    "variableName": "Refresh Status",
                    "value": {
                        "value": "@activity('Check Refresh Status').output.value[0].status",
                        "type": "Expression"
                    }
                }
            }
        ],
        "timeout": "0.12:00:00"
    }
}

Finally, an If Condition activity checks the final status. If it's "Completed", the pipeline succeeds. If it's anything else (e.g., "Failed"), we use a Fail activity to stop the pipeline and report the error. This means that a pipeline failure now indicates a genuine data processing issue within Power BI, not a conflict with another pipeline run.

{
    "name": "Check Status",
    "type": "IfCondition",
    "dependsOn": [
        {
            "activity": "Poll For Completion",
            "dependencyConditions": [
                "Succeeded"
            ]
        }
    ],
    "userProperties": [],
    "typeProperties": {
        "expression": {
            "value": "@equals(variables('Refresh Status'), 'Completed')",
            "type": "Expression"
        },
        "ifFalseActivities": [
            {
                "name": "Dataset Refresh Failed",
                "type": "Fail",
                "dependsOn": [],
                "userProperties": [],
                "typeProperties": {
                    "message": {
                        "value": "@concat('Power BI Dataset refresh failed with status of: ', variables('Refresh Status'))",
                        "type": "Expression"
                    },
                    "errorCode": "500"
                }
            }
        ]
    }
}

Wrapping Up

With this pattern in place, we have a parameterised, reusable pipeline that can be called from any other pipeline using an Execute Pipeline activity. It intelligently waits until any previously running refreshes have completed before triggering a new one, monitors the outcome, and provides a clear success or fail result.
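
As an illustration, a calling pipeline might invoke it with an Execute Pipeline activity along these lines. The activity name, the referenced pipeline name and the GUID placeholders are all hypothetical, so substitute your own values.

```json
{
    "name": "Refresh Semantic Model",
    "type": "ExecutePipeline",
    "dependsOn": [],
    "userProperties": [],
    "typeProperties": {
        "pipeline": {
            "referenceName": "Refresh Power BI Semantic Model",
            "type": "PipelineReference"
        },
        "waitOnCompletion": true,
        "parameters": {
            "WorkspaceId": "00000000-0000-0000-0000-000000000000",
            "DatasetId": "00000000-0000-0000-0000-000000000000"
        }
    }
}
```

Setting waitOnCompletion to true means the calling pipeline only continues, or fails, once the refresh outcome is known.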

I've made this pipeline pattern available as a template, which you can import directly into your own environments. Alternatively, you can use the JSON source code to build it up yourself. All of this can be found in the Pipeline Patterns repository, along with other useful reusable samples.

https://github.com/endjin/data-pipeline-patterns

A Quick Note for Microsoft Fabric users

If you're looking to achieve the same result in Microsoft Fabric, the approach is slightly simpler as there's an out-of-the-box Semantic Model Refresh activity. However, the issue with concurrent refreshes still applies so parts of this pattern can be adapted for use in Fabric to achieve the same level of resilience. I've covered this in a related post!

FAQs

How can I refresh a Power BI semantic model from Azure Data Factory or Azure Synapse Pipelines when there's no built-in activity?
You can use the Power BI REST API with Web Activities to trigger refreshes programmatically. However, for a robust solution that handles concurrent refreshes and monitors completion status, you'll need to implement a polling pattern that checks for active refreshes before triggering new ones and waits for completion.

Why does my Azure Data Factory pipeline fail when trying to refresh a Power BI dataset that's already being refreshed by another process?
The Power BI REST API returns an error when you attempt to start a refresh while one is already in progress. You need to implement a pattern that polls the refresh status API to wait for any running refreshes to complete before triggering a new one, preventing conflicts between concurrent pipeline runs.

What's the best way to ensure my Power BI semantic model refresh actually completes successfully from an Azure Synapse pipeline?
Instead of using a simple 'fire and forget' POST request, implement a pattern that first checks for active refreshes, triggers the refresh when safe to do so, then polls the API until completion to verify success or failure. This ensures your pipeline accurately reflects the true state of your semantic model refresh.

James Broome

James has spent 20+ years delivering high quality software solutions addressing global business problems, with teams and clients across 3 continents. As Director of Engineering at endjin, he leads the team in providing technology strategy, data insights and engineering support to organisations of all sizes - from disruptive B2C start-ups, to global financial institutions. He's responsible for the success of our customer-facing project delivery, as well as the capability and growth of our delivery team.