Sharing access to synchronized Shared Metadata Model objects in Azure Synapse Analytics

By Ed Freeman Software Engineer III 1st September 2022

Sharing access to synchronized Shared Metadata Model objects in Azure Synapse Analytics

As described in the previous article What is the Shared Metadata Model in Azure Synapse Analytics, and why should I use it?, the shared metadata model synchronizes Spark tables with SQL Serverless such that you can query your Spark table data using normal T-SQL through the SQL Serverless endpoint.

Remember, the underlying data that forms these Spark tables isn't stored in the SQL Serverless DB - rather the underlying data is stored in your ADLS account. So, just like when querying data using OPENROWSET, you must make sure that the caller has the appropriate permissions to read the corresponding files in the lake. This can be done using RBAC (+ABAC) roles at the storage account/filesystem level, or by using ACLs at the directory/file level.

In this article we'll take a look at various scenarios you might encounter when configuring access control on the synchronized objects in the shared metadata model.

How to assign access to the Shared Metadata Model objects

Up until recently, there were two primary ways of assigning access to the synchronized objects:

Assign the sysadmin server role to the target LOGIN. This can be directly assigned, or can be inherited by the Synapse Administrator or Synapse SQL Administrator RBAC roles.
Grant a LOGIN the CONNECT ANY DATABASE and SELECT ALL USER SECURABLES permissions.

(as described in the official docs How to set up access control on synchronized objects in serverless SQL pool).

On their own, these were both quite high-privileged permissions sets, as they applied to all replicated Spark databases. So this wasn't ideal if you wanted to reduce the scope of permissions. (Although in conjunction with storage-level permissions, it was possible to further lock-down what end-users could see, even if they were assigned to these roles.)

But recently there's been an update to Synapse to allow the creation of customized objects in the Spark replicated database. This update provides us with another option:

Create a USER on specific replicated database, and assign db_datareader (or some other role that SQL Serverless supports - see Database-level roles). This must be an AAD security principal (User/Group/App), as currently this is the only authentication mechanism supported by the synchronized tables.

This is much better - you can now assign USERs access to individual databases and add them to a role (most likely db_datareader). Unfortunately GRANT statements aren't supported, so you can't assign more granular access to objects within the database. Maybe that will come in the future - who knows.

The best hour you can spend to refine your own data strategy and leverage the latest capabilities on Azure to accelerate your road map.

Here's the code snippet (creating a contained AAD user):

USE [MyReplicatedSparkDB];

CREATE USER [foo.bar@contoso.com] FROM EXTERNAL PROVIDER;

ALTER ROLE db_datareader ADD MEMBER [foo.bar@contoso.com];

So long as the underlying storage access has been assigned for the backing data of the synchronized tables (using the Storage Blob Data * RBAC permissions, or using ACLs), there shouldn't be anything more to do.

Common errors

`Login failed for user '<token-identified principal>'.`

This is common if you're logging in via SSMS or Azure Data Studio, or some other SQL IDE. This error is nothing to do with the Shared Metadata Model - rather the AAD user trying to access doesn't have access to the Serverless pool at all. If you're using Synapse RBAC, then the user needs to be granted either the Synapse Administrator or Synapse SQL Administrator roles. Otherwise they need to be given explicit SQL permissions on the server and/or DB.

`The SELECT permission was denied on the object...`

USER can log-in to the database but has no permissions to access any of the objects (or at least no "SELECT" permissions). They'll need to be assigned to a role like db_datareader to be granted access to the objects. Or, if this is a non-replicated database, then more granular GRANTs can be assigned.

`You do not have permission to use the bulk load statement.`

In order to use the OPENROWSET command (directly or indirectly (e.g. via a VIEW)), the USER needs to be granted the ADMINISTER DATABASE BULK OPERATIONS permission.

`File 'https://...' cannot be opened because it does not exist or it is used by another process.`

We've finally reached an error caused by storage level permissions! This implies that the AAD user (or the identity under which the call is being made) doesn't have the appropriate storage level permissions. If it's an AAD identity (which it will be for queries against synchronized objects in the replicated DB), these permissions can be assigned either through the Storage Blob Data * RBAC permissions, or by ACLs against the filesystem/directories/files.

`External table '...' is not accessible because content of directory cannot be listed.`

Another storage level permissions issue: this is when a user has access to a database and its objects (synchronized external table in this instance), but doesn't have appropriate permissions to the underlying files in the lake. Again, an RBAC role or ACLs are required.

FAQs

How do I share access to synchronized objects in the Shared Metadata Model in Azure Synapse?

The simplest and least privileged method is to create contained USERs on the corresponding replicated database (utilizing the custom object support in Lake databases), and assign the USERs to the db_datareader role. Alternatively, you can assign the user to the sysadmin server role, either directly on the database or through assigning the user the "Synapse Administrator" or "Synapse SQL Administrator" role. Or, you can create a LOGIN, and run the GRANT CONNECT ANY DATABASE and GRANT SELECT ALL USER SECURABLES server-level statements which will provide read access to all tables and views on any database on the Serverless SQL server. In any case, USERs will also need the appropriate permissions on the underlying files in storage, either using RBAC permissions or ACLs.

Who We Are

What We Do

Who We Help

What We Think

Contact Us

How to assign access to the Shared Metadata Model objects

Common errors

`The SELECT permission was denied on the object...`

`You do not have permission to use the bulk load statement.`

`File 'https://...' cannot be opened because it does not exist or it is used by another process.`

`External table '...' is not accessible because content of directory cannot be listed.`

FAQs

Also worth reading:

Customizing Lake Databases in Azure Synapse Analytics

Ed Freeman24/10/2022

What is the Shared Metadata Model in Azure Synapse Analytics, and why should I use it?

Ed Freeman12/07/2022

What is a Lake Database in Azure Synapse Analytics?

Ed Freeman18/10/2022

Ed Freeman

Software Engineer III