How to connect to Azure SQL with AAD authentication and Azure managed identities

Introduction

We’re trying to improve the security posture of our internal applications.

One aspect of this is how we deal with sensitive information, like database connection strings, API keys, or AAD client secrets. The approach we’re using is to store these in Key Vault instances, which can be accessed by the applications that require them, thanks to Azure managed identities.

In the case of Azure SQL, however, we’re using a slightly different technique, leveraging Azure Active Directory authentication, and more specifically token-based authentication. Instead of using a connection string that contains a username and a password, we’re using the following strategy:

  1. If not done already, assign a managed identity to the application in Azure;
  2. Grant the necessary permissions to this identity on the target Azure SQL database;
  3. Acquire a token from Azure Active Directory, and use it to establish the connection to the database.

The main benefit comes from the fact that we don’t need to manage and protect the credentials required to connect to the database. We think this is more secure, because the less sensitive information we have to protect, the smaller the chance of it being accessed by unauthorised parties. After all, isn’t the best password one that doesn’t exist in the first place?

In this post, we’ll talk about how one can connect to Azure SQL using token-based Azure Active Directory authentication, and how to do so using Entity Framework Core.

Connecting to Azure SQL using Azure Active Directory authentication

As mentioned before, this approach doesn’t use the traditional way of having a connection string that contains a username and a password. Instead, the credentials are replaced with an access token, much like you would use when you call an API. Here’s a simple example:

public static async Task Main(string[] args)
{
    var connectionStringBuilder = new SqlConnectionStringBuilder
    {
        DataSource = "tcp:<azure-sql-instance-name>.database.windows.net,1433",
        InitialCatalog = "<azure-sql-database-name>",
        TrustServerCertificate = false,
        Encrypt = true
    };

    await using var sqlConnection = new SqlConnection(connectionStringBuilder.ConnectionString)
    {
        AccessToken = await GetAzureSqlAccessToken()
    };

    await sqlConnection.OpenAsync();
    // ExecuteScalarAsync<T> is an extension method brought in by the Dapper package
    var currentTime = await sqlConnection.ExecuteScalarAsync<DateTime>("SELECT GETDATE()");

    Console.WriteLine($"The time is now {currentTime.ToShortTimeString()}");
}

private static async Task<string> GetAzureSqlAccessToken()
{
    // See https://docs.microsoft.com/en-us/azure/active-directory/managed-identities-azure-resources/services-support-managed-identities#azure-sql
    var tokenRequestContext = new TokenRequestContext(new[] { "https://database.windows.net//.default" });
    var tokenRequestResult = await new DefaultAzureCredential().GetTokenAsync(tokenRequestContext);

    return tokenRequestResult.Token;
}

As previously mentioned, the connection string doesn’t contain a username or a password, only the Azure SQL instance and database we want to connect to. The authentication is performed via an access token that we associate with the SQL connection.

Acquiring the token is done with the help of the Azure.Identity NuGet package through the DefaultAzureCredential class. The killer feature of that class is that it tries to acquire an access token from different sources, including:

  • Using credentials exposed through environment variables;
  • Using credentials of an Azure managed identity;
  • Using the account that is logged in to Visual Studio;
  • Using the account that is logged in to the Visual Studio Code Azure Account extension.

For more information, check out the Azure SDK for .NET GitHub repository.

Integrating AAD authentication with Entity Framework Core

Many of our internal applications use Entity Framework Core to access data. One impact is that the example shown above isn’t viable anymore, because EF Core manages the lifetime of SQL connections, meaning it creates and disposes of connections internally. While this is a big advantage, it means we need to find a way to “inject” an access token in the SQL connection before EF Core tries to use it.

The good news is that EF Core 3.0 introduced the concept of interceptors, which had been present in EF 6 for a long time. Interestingly, I could only find a mention of this capability in the release notes of EF Core 3.0, but not in the EF Core docs.

The AddInterceptors method used in the example below expects instances of IInterceptor, which is a marker interface, making it hard to discover types that implement it. Using the decompiler of your choice — ILSpy in my case — we can easily find them:

Implementations of the IInterceptor interface

The DbConnectionInterceptor type seems like a fit. Luckily, it exposes a ConnectionOpeningAsync method which sounds just like what we need!

Update

If you use synchronous methods over your DbContext instance, like ToList(), Count(), or Any(), you need to override the synchronous ConnectionOpening method of the interceptor.

See more details in this new post!

Let’s get to it, shall we?

public class Startup
{
    public Startup(IConfiguration configuration)
    {
        Configuration = configuration;
    }

    public IConfiguration Configuration { get; }

    public void ConfigureServices(IServiceCollection services)
    {
        services.AddDbContext<AppDbContext>(options =>
        {
            options.UseSqlServer(Configuration.GetConnectionString("<connection-string-name>"));
            options.AddInterceptors(new AadAuthenticationDbConnectionInterceptor());
        });
    }
}

public class AadAuthenticationDbConnectionInterceptor : DbConnectionInterceptor
{
    public override async Task<InterceptionResult> ConnectionOpeningAsync(
        DbConnection connection,
        ConnectionEventData eventData,
        InterceptionResult result,
        CancellationToken cancellationToken)
    {
        var sqlConnection = (SqlConnection)connection;

        //
        // Only try to get a token from AAD if
        //  - We connect to an Azure SQL instance; and
        //  - The connection doesn't specify a username.
        //
        var connectionStringBuilder = new SqlConnectionStringBuilder(sqlConnection.ConnectionString);
        if (connectionStringBuilder.DataSource.Contains("database.windows.net", StringComparison.OrdinalIgnoreCase) && string.IsNullOrEmpty(connectionStringBuilder.UserID))
        {
            sqlConnection.AccessToken = await GetAzureSqlAccessToken(cancellationToken);
        }

        return await base.ConnectionOpeningAsync(connection, eventData, result, cancellationToken);
    }

    private static async Task<string> GetAzureSqlAccessToken(CancellationToken cancellationToken)
    {
        // See https://docs.microsoft.com/en-us/azure/active-directory/managed-identities-azure-resources/services-support-managed-identities#azure-sql
        var tokenRequestContext = new TokenRequestContext(new[] { "https://database.windows.net//.default" });
        var tokenRequestResult = await new DefaultAzureCredential().GetTokenAsync(tokenRequestContext, cancellationToken);

        return tokenRequestResult.Token;
    }
}

The configuration of the EF Core DbContext is ordinary, with the exception of the registration of our interceptor.

The interceptor itself is straightforward as well; we can see that the way we acquire a token is similar to the previous example. One interesting aspect is that we try to detect whether we even need to get an access token, based on the SQL Server instance we connect to, and whether the connection string specifies a username.

During local development, there’s a high chance developers will connect to a local SQL database, so we don’t need a token in this case. Imagine also that, for some reason, we revert to using a connection string that contains a username and password; in that case too, getting a token is not needed.
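That check is easy to exercise in isolation; here’s the same predicate extracted into a standalone helper (the class and method names are mine, for illustration only):

```csharp
using System;

public static class AadTokenHeuristics
{
    // An access token is only needed when we target Azure SQL and the
    // connection string doesn't already carry a username.
    public static bool NeedsAccessToken(string dataSource, string userId)
        => dataSource.Contains("database.windows.net", StringComparison.OrdinalIgnoreCase)
           && string.IsNullOrEmpty(userId);
}
```

A local connection string such as `(localdb)\MSSQLLocalDB` falls through both conditions, so no token is requested during local development.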

Going further: resolving interceptors with Dependency Injection

Update

Good news! There’s a much simpler and terser solution to resolve interceptors from the dependency injection container — please check out this new post.

I strongly recommend that you not use the solution described below, as it involves much more code and hasn’t been fully tested.

Interceptors are a great feature, but at the time of writing, the public API only allows you to add already constructed instances, which can be limiting. What if our interceptor needs to take dependencies on other services?

Registering the interceptors in the application service provider doesn’t work, because EF Core maintains an internal service provider, which is used to resolve interceptors.

I found a way by reverse engineering how EF Core itself is built. However, as you’ll see, the solution is quite involved, and I haven’t fully tested it. As a result, please carefully test it before using this method.

When configuring the DbContext, we can register an extension which has access to the internal service provider; hence, we can use it to register additional services, in this case our interceptor. However, this internal provider doesn’t have as many registered services as a provider used in an ASP.NET Core application. For example, this provider doesn’t have the commonly used ILogger<T> service registered.

Our goal is then to register our interceptor in the internal provider, but somehow have it be resolved from the application provider, so we can take advantage of all the services registered in the latter. To achieve this, we can leverage an extension provided by EF Core, called CoreOptionsExtension, which has a reference to the application service provider — the one we use to register services in the ConfigureServices method of the Startup class of ASP.NET Core applications.

The following implementation is based on the internal CoreOptionsExtension used in EF Core.

public class Startup
{
    public Startup(IConfiguration configuration)
    {
        Configuration = configuration;
    }

    public IConfiguration Configuration { get; }

    public void ConfigureServices(IServiceCollection services)
    {
        // 1. Register our interceptor as "itself"
        services.AddScoped<AadAuthenticationDbConnectionInterceptor>();

        // 2. Add our extension to EF Core
        services.AddDbContext<AppDbContext>(options =>
        {
            options.UseSqlServer(Configuration.GetConnectionString("<connection-string-name>"));
            ((IDbContextOptionsBuilderInfrastructure)options).AddOrUpdateExtension(new AppOptionsExtension());
        });
    }
}

public class AppOptionsExtension : IDbContextOptionsExtension
{
    private DbContextOptionsExtensionInfo _info;

    public DbContextOptionsExtensionInfo Info => _info ??= new ExtensionInfo(this);

    public void ApplyServices(IServiceCollection services)
    {
        // 3. Get application service provider from CoreOptionsExtension, and
        // resolve interceptor registered in step 1, so we can register it in the
        // internal service provider as IInterceptor
        services.AddScoped<IInterceptor>(provider =>
        {
            var applicationServiceProvider = provider
                .GetRequiredService<IDbContextOptions>()
                .FindExtension<CoreOptionsExtension>()
                .ApplicationServiceProvider;

            return applicationServiceProvider.GetRequiredService<AadAuthenticationDbConnectionInterceptor>();
        });
    }

    public void Validate(IDbContextOptions options)
    {
    }

    private class ExtensionInfo : DbContextOptionsExtensionInfo
    {
        public ExtensionInfo(IDbContextOptionsExtension extension) : base(extension)
        {
        }

        public override bool IsDatabaseProvider => false;
        public override string LogFragment => null;
        public override long GetServiceProviderHashCode() => 0L;
        public override void PopulateDebugInfo(IDictionary<string, string> debugInfo)
        {
        }
    }
}

public class AadAuthenticationDbConnectionInterceptor : DbConnectionInterceptor
{
    private readonly ILogger _logger;

    // In this case we inject an instance of ILogger<T>, but you could inject any service
    // that is registered in your application provider
    public AadAuthenticationDbConnectionInterceptor(ILogger<AadAuthenticationDbConnectionInterceptor> logger)
    {
        _logger = logger;
    }

    public override async Task<InterceptionResult> ConnectionOpeningAsync(
        DbConnection connection,
        ConnectionEventData eventData,
        InterceptionResult result,
        CancellationToken cancellationToken)
    {
        var sqlConnection = (SqlConnection)connection;

        //
        // Only try to get a token from AAD if
        //  - We connect to an Azure SQL instance; and
        //  - The connection doesn't specify a username.
        //
        var connectionStringBuilder = new SqlConnectionStringBuilder(sqlConnection.ConnectionString);
        if (connectionStringBuilder.DataSource.Contains("database.windows.net", StringComparison.OrdinalIgnoreCase) && string.IsNullOrEmpty(connectionStringBuilder.UserID))
        {
            try
            {
                sqlConnection.AccessToken = await GetAzureSqlAccessToken(cancellationToken);
                _logger.LogInformation("Successfully acquired a token to connect to Azure SQL");
            }
            catch (Exception e)
            {
                _logger.LogError(e, "Unable to acquire a token to connect to Azure SQL");
            }
        }
        else
        {
            _logger.LogInformation("No need to get a token");
        }

        return await base.ConnectionOpeningAsync(connection, eventData, result, cancellationToken);
    }

    private static async Task<string> GetAzureSqlAccessToken(CancellationToken cancellationToken)
    {
        if (RandomNumberGenerator.GetInt32(10) >= 5)
        {
            throw new Exception("Faking an exception while trying to get a token");
        }

        // See https://docs.microsoft.com/en-us/azure/active-directory/managed-identities-azure-resources/services-support-managed-identities#azure-sql
        var tokenRequestContext = new TokenRequestContext(new[] { "https://database.windows.net//.default" });
        var tokenRequestResult = await new DefaultAzureCredential().GetTokenAsync(tokenRequestContext, cancellationToken);

        return tokenRequestResult.Token;
    }
}

Conclusion

In this post, we covered how we can use Azure Active Directory authentication to connect to Azure SQL, focusing on the token-based aspect of it, since we’re trying to reduce the amount of sensitive information an application needs to deal with.

We also went over a nice way to integrate AAD authentication with Entity Framework Core, by leveraging interceptors. The first benefit of using this approach is that we let EF Core manage SQL connections internally. The second advantage of using interceptors is that they are asynchronous, which means we don’t have to block on asynchronous operations.

Finally, we investigated how we can inject services in our interceptors. The solution we explored involves quite a bit of ceremony, which makes it pretty heavy. I opened an issue on the EF Core repository; we’ll see if the team finds a way to make this more friendly. Please let me know on Twitter if you know of an easier way to achieve this.

I hope you liked this post!

How we sped up an ASP.NET Core endpoint from 20+ seconds down to 4 seconds

This post has initially been published on the Telstra Purple blog at https://purple.telstra.com/blog/how-we-sped-up-an-aspnet-core-endpoint-from-20-seconds-down-to-4-seconds.

Introduction

We have an internal application at work that sends large payloads to the browser, approximately 25MB.

We know it is a problem and it is on our radar to do something about it. In this article, we’ll go through the investigation we performed, and how we ultimately brought the response time of this specific endpoint from 20+ seconds down to 4 seconds.

The problem we faced

We run this application on an Azure App Service, and the offending endpoint had always been slow. I personally assumed it was due to the amount of data it returned, until one day, for testing purposes, I ran the app locally and noticed it was much faster: between 6 and 7 seconds.

To make sure we were not comparing apples to oranges, we made sure that the conditions were as similar as they could be:

  • We were running the same version of the app — that is, the same Git commit; we didn’t go as far as running the exact same binaries;
  • The apps were connecting to the same Azure SQL databases; and
  • They were also using the same instance of Azure Cache for Redis.

The one big difference we could see was that our dev laptops are much more powerful in terms of CPU, RAM, and storage speed.

The investigation

What could explain this endpoint running roughly three times faster when it was connecting to the same resources?

To be perfectly honest, I can’t remember exactly what pointed me in this direction, but at some point I realised two things:

  1. Starting with ASP.NET Core 3.0, synchronous I/O is disabled by default, meaning an exception will be thrown if you try to read the request body or write to the response body in a synchronous, blocking way; see the official docs for more details on that; and
  2. Newtonsoft.Json, also known as JSON.NET, is synchronous. The app used it because the new System.Text.Json stack didn’t exist when it was migrated from ASP.NET Classic to ASP.NET Core.

How then did the framework manage to use a synchronous formatter while the default behaviour is to disable synchronous I/O, all without throwing exceptions?

I love reading code, so this was a great excuse to go and have a look at the implementation. Following the function calls from AddNewtonsoftJson, we end up in NewtonsoftJsonMvcOptionsSetup, where we can see how the System.Text.Json-based formatter is replaced with the one based on Newtonsoft.Json.

That specific formatter reveals it’s performing some Stream gymnastics — see the code on GitHub. Instead of writing directly to the response body, the JSON.NET serializer writes (synchronously) to an intermediate FileBufferingWriteStream, which is then used to write (asynchronously this time) to the response body.

The XML docs of FileBufferingWriteStream explain it perfectly:

A Stream that buffers content to be written to disk. Use DrainBufferAsync(Stream, CancellationToken) to write buffered content to a target Stream.

That Stream implementation will hold the data in memory while it’s smaller than 32kB; any larger than that and it stores it in a temporary file.

If my investigation was correct, the response body would’ve been written in blocks of 16kB. A quick math operation would show: 25MB written in 16kB blocks = 1,600 operations, 1,598 of which involve the file system. Eek!
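The arithmetic checks out, treating 25MB as 25 × 1024 × 1024 bytes; only the first two 16kB blocks fit under the 32kB in-memory threshold, so every other write goes through the file system:

```csharp
using System;

const int payloadBytes = 25 * 1024 * 1024; // ~25MB response body
const int blockBytes = 16 * 1024;          // 16kB write blocks
const int inMemoryBytes = 32 * 1024;       // FileBufferingWriteStream memory threshold

int totalWrites = payloadBytes / blockBytes;
int fileSystemWrites = totalWrites - inMemoryBytes / blockBytes;

Console.WriteLine($"{totalWrites} writes in total, {fileSystemWrites} hitting the file system");
// 1600 writes in total, 1598 hitting the file system
```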

This could explain why the endpoint was executing so much quicker on my dev laptop than on the live App Service; while my laptop has an SSD with near-immediate access times and super quick read/write operations, our current App Service Plan still runs with spinning disks!

How can we verify whether our hypothesis is correct?

Solution #1, quick and dirty

The easiest way I could think of to get the file system out of the equation was to enable synchronous I/O.

// Startup.cs
public void ConfigureServices(IServiceCollection services)
{
    services
        .AddControllers(options =>
        {
            // Suppress buffering through the file system
            options.SuppressOutputFormatterBuffering = true;
        })
        .AddNewtonsoftJson();
}

// Controller.cs
public Task<IActionResult> Action()
{
    // From https://docs.microsoft.com/en-us/dotnet/core/compatibility/aspnetcore#http-synchronous-io-disabled-in-all-servers
    // Allow synchronous I/O on a per-endpoint basis
    var syncIOFeature = HttpContext.Features.Get<IHttpBodyControlFeature>();
    if (syncIOFeature != null)
    {
        syncIOFeature.AllowSynchronousIO = true;
    }

    // Rest of the implementation, omitted for brevity
}

Making both of those changes is required because:

  1. Suppressing output buffering alone would throw an exception, since we’d be writing synchronously to the response body while synchronous I/O is disabled by default;
  2. Allowing synchronous I/O alone wouldn’t change anything, as output buffering is enabled by default, so that projects updating to ASP.NET Core 3.0 don’t break when using Newtonsoft.Json and sending responses larger than 32kB.

Locally, I observed a response time of ~4 seconds, which was a substantial improvement of ~30%.

While it was a good sign that our hypothesis was correct, we didn’t want to ship this version. Our application doesn’t get that much traffic, but synchronous I/O should be avoided if possible, as it is a blocking operation that can lead to thread starvation.

Solution #2, more involved, and more sustainable

The second option was to remove the dependency on Newtonsoft.Json, and use the new System.Text.Json serialiser. The latter is async friendly, meaning it can write directly to the response stream, without an intermediary buffer.

It wasn’t as easy as swapping serialisers because, at the time of writing, System.Text.Json was not at feature parity with Newtonsoft.Json. My opinion is that it’s totally understandable, as JSON.NET has been available far longer.

Microsoft provides a good and honest comparison between the two frameworks: https://docs.microsoft.com/en-us/dotnet/standard/serialization/system-text-json-migrate-from-newtonsoft-how-to

The main thing for us was that System.Text.Json doesn’t support ignoring properties with default values, such as 0 for integers. We couldn’t just ignore this, because the payload is already so large that adding unnecessary properties to it would have made it even larger. Luckily, the workaround was relatively straightforward and well documented: we needed to write a custom converter, which gave us total control over which properties were serialised.
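To illustrate, here’s a sketch of such a converter for a hypothetical Order type — not our actual code, but the same shape. The Write method gets full control over which properties end up in the payload; deserialisation is omitted for brevity:

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

public class Order
{
    public string Name { get; set; }
    public int Quantity { get; set; }
}

public class OrderConverter : JsonConverter<Order>
{
    public override Order Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options)
        => throw new NotSupportedException("This sketch only handles serialisation.");

    public override void Write(Utf8JsonWriter writer, Order value, JsonSerializerOptions options)
    {
        writer.WriteStartObject();
        writer.WriteString("name", value.Name);

        // Skip the property entirely when it holds its default value
        if (value.Quantity != default)
        {
            writer.WriteNumber("quantity", value.Quantity);
        }

        writer.WriteEndObject();
    }
}
```

Once registered in JsonSerializerOptions.Converters, an Order with Quantity = 0 serialises to just {"name":"..."}, keeping the payload lean.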

It was boring, plumbing code to write, but it was easy.

In the end, we removed the changes mentioned above, and plugged in System.Text.Json.

// Startup.cs
public void ConfigureServices(IServiceCollection services)
{
    services
        .AddControllers()
        .AddJsonOptions(options =>
        {
            // Converters for types we own
            options.JsonSerializerOptions.Converters.Add(new FirstCustomConverter());
            options.JsonSerializerOptions.Converters.Add(new SecondCustomConverter());

            // Make sure that enums are serialised as strings by default
            options.JsonSerializerOptions.Converters.Add(new JsonStringEnumConverter());

            // Ignore `null` values by default to reduce the payload size
            options.JsonSerializerOptions.IgnoreNullValues = true;
        });
}

We gave it another go, and again achieved a consistent 4-second response time on that endpoint 🎉. After deploying this new version to our App Service, we were stoked to see a similar response time there as well.

App Insights timings

Conclusion

This was a super fun investigation, and I was once again super happy to dig into the ASP.NET Core internals and learn a bit more about how some of it works.

While I realise our case was extreme given the size of the response payload, in the future I’ll think twice when I encounter a code base using Newtonsoft.Json, and see how hard it’d be to move to System.Text.Json. There’s definitely a big gap between both, but the team is hard at work to fill some of it for .NET 5.

See the .NET 5 preview 4 announcement and search for the “Improving migration from Newtonsoft.Json to System.Text.Json” header to learn more about where the effort goes. You can also check issues with the “area-System.Text.Json” label on the dotnet/runtime repository, or take a look at the specific project board for System.Text.Json if you want to be even closer to what’s happening.

Authentication, antiforgery, and order of execution of filters

Introduction

I recently had to build a super simple application that had two main parts:

  1. the home page was accessible to anonymous users and presented them with a basic form;
  2. an “admin” section that required users to be authenticated, and where they could approve or reject submissions done via the home page.

Super simple, but I faced an issue related to antiforgery that I couldn’t understand at the time. I went with a workaround but thought I’d dig a bit deeper when I had time. Let’s have a look at it together!

How it was set up

Authentication

We’ll first go through how authentication was wired. Because only a subset of pages needed the user to be authenticated, I thought I’d configure the app so that authentication runs only for requests that need it. What this means in practice is not setting the DefaultAuthenticateScheme on AuthenticationOptions, and being explicit about the authentication scheme in the authorisation policy. Doing it this way has the advantage of performing authentication “just in time”, only for requests that need it.

// Startup.cs
public void ConfigureServices(IServiceCollection services)
{
    services
        .AddAuthentication()
        .AddCookie("Cookie");

    services
        .AddAuthorization(options =>
        {
            options.AddPolicy("LoggedInWithCookie", builder => builder
                .AddAuthenticationSchemes("Cookie")
                .RequireAuthenticatedUser());
        });
}

// HomeController.cs
// No authentication required here
public class HomeController : Controller
{
    [HttpGet("")]
    public IActionResult Index() => View();
}

// AdminController.cs
// All requests must be authenticated
[Route("admin")]
[Authorize(Policy = "LoggedInWithCookie")]
public class AdminController : Controller
{
    [HttpGet("")]
    public IActionResult Index() => View();
}

Antiforgery

The other part that we’re interested in is antiforgery.

If you don’t know what that is, it’s the mechanism that protects ASP.NET Core from cross-site request forgery (XSRF). You can read more about it on the great official docs.

My recommendation, instead of opting in to antiforgery on a per-endpoint basis, is to take advantage of the AutoValidateAntiforgeryTokenAttribute filter, which “lights up” the check for all requests except GET, HEAD, TRACE and OPTIONS ones. Should you want to disable antiforgery on a specific endpoint, you can apply the [IgnoreAntiforgeryToken] attribute as an opt-out mechanism — it’s the antiforgery equivalent of [AllowAnonymous].

I chose to apply antiforgery globally like so:

// Startup.cs
public void ConfigureServices(IServiceCollection services)
{
    [...]

    services.AddMvc(options =>
    {
        options.Filters.Add<AutoValidateAntiforgeryTokenAttribute>();
    });
}

The issue

The antiforgery mechanism worked well for the home page, in that trying to send POST requests from Postman didn’t work and returned an expected 400 HTTP response.

However, the approval/rejection requests in the admin section didn’t work, and failed with the following error message:

Microsoft.AspNetCore.Antiforgery.AntiforgeryValidationException: The provided antiforgery token was meant for a different claims-based user than the current user.

The diagnosis

After doing some tests, I came to the conclusion that it was failing because when the antiforgery check was made, authentication had not run yet, so the request was treated as if it was anonymous, and that didn’t match the hidden field POSTed in the HTML form, nor the antiforgery cookie value. This was surprising to me as the documentation for the AutoValidateAntiforgeryTokenAttribute class mentions that the Order property is explicitly set to 1000 so that it runs after authentication.

To validate my suspicions, I changed the minimum logging level on the app to Debug, ran the request again, and this came up (slightly updated to avoid super long lines):

Execution plan of authorization filters (in the following order):
 - Microsoft.AspNetCore.Mvc.ViewFeatures.Internal.AutoValidateAntiforgeryTokenAuthorizationFilter
 - Microsoft.AspNetCore.Mvc.Authorization.AuthorizeFilter

This confirmed my hunch. Now we need to figure out why this is the case.

The solution

It was 100% random that I tried a different way of adding the antiforgery filter to the MVC global filters collection. But it worked 🤔.

// Startup.cs
public void ConfigureServices(IServiceCollection services)
{
    [...]

    services.AddMvc(options =>
    {
        // What it was before
        // options.Filters.Add<AutoValidateAntiforgeryTokenAttribute>();

        // What I tried for no logical reason
        options.Filters.Add(new AutoValidateAntiforgeryTokenAttribute());
    });
}

Why did it work? The fact that ASP.NET Core is open source makes this kind of research really easy, so I compared both overloads of the Add method of the FilterCollection class.

It turns out that the generic overload Add<T>() calls another generic overload with an extra order parameter set to 0, which ultimately creates an instance of TypeFilterAttribute that “wraps” the original filter type, ignoring its Order.

Running the app again after making those changes confirmed that using this overload was respecting the Order set on AutoValidateAntiforgeryTokenAttribute, as I could see the following in the logs (again slightly modified):

Execution plan of authorization filters (in the following order):
 - Microsoft.AspNetCore.Mvc.Authorization.AuthorizeFilter
 - Microsoft.AspNetCore.Mvc.ViewFeatures.Internal.AutoValidateAntiforgeryTokenAuthorizationFilter

Conclusion

Working with an open-source framework makes it easier to figure out why it’s sometimes not behaving as you expect it to. In the future, I’ll refrain from using the generic overload; instead, I’ll instantiate the filter myself to avoid surprises like that one.

I hope you liked that post 👍

ASP.NET Core and JSON Web Tokens - where are my claims?

Introduction

When extracting an identity from a JSON Web Token (JWT), ASP.NET Core — and .NET in general — maps some claims. In other words, the claims in the resulting ClaimsIdentity instance don’t perfectly match the ones found in the JWT payload.

In this post we’ll go through an example of that behaviour, discover where that comes from, and how to opt out.

An example

If you use OpenID Connect- or OAuth2-based authentication in your application, there’s a high chance this is happening, potentially without you knowing, because it’s the default behaviour.

Let’s imagine we have an ASP.NET Core application using OpenID Connect to authenticate its users against an OIDC identity provider.

public void ConfigureServices(IServiceCollection services)
{
    services
        .AddAuthentication(options =>
        {
            options.DefaultScheme = "Cookies";
            options.DefaultChallengeScheme = "OpenIDConnect";
        })
        .AddOpenIdConnect("OpenIDConnect", options =>
        {
            options.Authority = "<the-url-to-the-identity-provider>";
            options.ClientId = "<the-client-id>";
        })
        .AddCookie("Cookies");
}

Here’s what the payload part of the issued JWT could look like:

{
  "aud": "<audience>",
  "iss": "<issuer-of-the-jwt>",
  "iat": 1561237872,
  "nbf": 1561237872,
  "exp": 1561241772,
  "email": "<email-address>",
  "name": "Someone Cool",
  "nonce": "636968349704644732.MjU2MzhiNzMtNDYwNi00NjZjLTkxZDItYjY3YTJkZDMzMzk0ODMyYzQxYzItNmRmNi00NmFiLThiMzItN2QxYjZkNzg5YjE4",
  "oid": "84a52e7b-d379-410d-bc6a-636c3d11d7b2",
  "preferred_username": "Someone Cool",
  "sub": "<some-opaque-identifier>",
  "tid": "<tenant-id>",
  "uti": "bGsQjxNN_UWE-Z2h-wEAAA",
  "ver": "2.0"
}

Now, here’s what the JSON representation of the claims in the extracted ClaimsIdentity would be:

{
  "aud": "<audience>",
  "iss": "<issuer-of-the-jwt>",
  "iat": 1561238329,
  "nbf": 1561238329,
  "exp": 1561242229,
  "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/emailaddress": "<email-address>",
  "name": "Someone Cool",
  "nonce": "636968354285381824.ZmE2M2Y2NWItZjc5NS00NTc3LWE5ZWItMGQxMjI2MjYwNjgyODI3Yjg1NTItYWMzYS00MDE3LThkMjctZjBkZDRkZmExOWI1",
  "http://schemas.microsoft.com/identity/claims/objectidentifier": "84a52e7b-d379-410d-bc6a-636c3d11d7b2",
  "preferred_username": "Someone Cool",
  "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/nameidentifier": "<some-opaque-identifier>",
  "http://schemas.microsoft.com/identity/claims/tenantid": "<tenant-id>",
  "uti": "rzybpqYLHEi4Wyk-yv0AAA",
  "ver": "2.0"
}

While some claims are identical, others had their names changed — let’s list them:

Claim name in JWT Claim name in ClaimsIdentity
email http://schemas.xmlsoap.org/ws/2005/05/identity/claims/emailaddress
oid http://schemas.microsoft.com/identity/claims/objectidentifier
sub http://schemas.xmlsoap.org/ws/2005/05/identity/claims/nameidentifier
tid http://schemas.microsoft.com/identity/claims/tenantid
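In practice, this means application code reads the mapped claims through the ClaimTypes constants rather than the raw JWT names. Here’s a minimal sketch of what that looks like inside an ASP.NET Core controller action — the claim values themselves are illustrative:

```csharp
using System.Security.Claims;

// In a controller, User is the ClaimsPrincipal built from the mapped identity.
// With the default inbound mapping on, the JWT "email" claim is exposed under
// the long XML-namespaced name that the ClaimTypes.Email constant holds.
var email = User.FindFirst(ClaimTypes.Email)?.Value;

// Looking the claim up by its original JWT name returns null while mapping is on.
var rawEmail = User.FindFirst("email")?.Value; // null

// Same story for "sub", which becomes ClaimTypes.NameIdentifier.
var subject = User.FindFirst(ClaimTypes.NameIdentifier)?.Value;
```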

Let’s have a look at where that behaviour comes from.

How does that happen?

The answer to that question lies in the library used to handle JSON Web Tokens — both the validation and the extraction of an identity. This is the System.IdentityModel.Tokens.Jwt NuGet package, whose source code is on GitHub in the AzureAD/azure-activedirectory-identitymodel-extensions-for-dotnet repository.

The main class is JwtSecurityTokenHandler, but the one we’re after is ClaimTypeMapping. Because it’s quite a big portion of code, here’s the link to the relevant part: https://github.com/AzureAD/azure-activedirectory-identitymodel-extensions-for-dotnet/blob/a301921ff5904b2fe084c38e41c969f4b2166bcb/src/System.IdentityModel.Tokens.Jwt/ClaimTypeMapping.cs#L45-L125.

There we have it, a whopping 72 claims being renamed as they’re processed!
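If you’re curious, you can dump that dictionary at runtime to see exactly which claims your version of the package renames — a quick sketch:

```csharp
using System;
using System.IdentityModel.Tokens.Jwt;

// DefaultInboundClaimTypeMap is the static copy of ClaimTypeMapping's
// dictionary that all JwtSecurityTokenHandler instances use by default.
foreach (var pair in JwtSecurityTokenHandler.DefaultInboundClaimTypeMap)
{
    Console.WriteLine($"{pair.Key} -> {pair.Value}");
}
// e.g. email -> http://schemas.xmlsoap.org/ws/2005/05/identity/claims/emailaddress
```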

How do I opt out of this?

This behaviour could be confusing; imagine you consult the documentation of your identity provider to understand which claims you can expect back in a JWT that it issues, only to find that some of them are missing when you develop your .NET application!

Luckily, there are multiple ways to disable that behaviour.

1. The global, application-level way

The JwtSecurityTokenHandler class takes a static copy of the mapping dictionary declared by ClaimTypeMapping, as you can see here on GitHub. This static copy is used by default by all instances of JwtSecurityTokenHandler. The trick is to clear this dictionary when the application starts.

In an ASP.NET Core app, that could be done in Program.Main, for example. My preference would be to put it closer to related code, maybe in Startup.ConfigureServices.

public void ConfigureServices(IServiceCollection services)
{
    // This is the line we just added
    JwtSecurityTokenHandler.DefaultInboundClaimTypeMap.Clear();

    services
        .AddAuthentication(options =>
        {
            options.DefaultScheme = "Cookies";
            options.DefaultChallengeScheme = "OpenIDConnect";
        })
        .AddOpenIdConnect("OpenIDConnect", options =>
        {
            options.Authority = "<the-url-to-the-identity-provider>";
            options.ClientId = "<the-client-id>";
        })
        .AddCookie("Cookies");
}
2. The per-handler way

While opting out at the application level is unlikely to be an issue if you’re developing a new application, it could have unintended consequences in an existing codebase. The good news is that JwtSecurityTokenHandler exposes instance-level properties which allow us to achieve the same result.

The first option is to clear the instance-level claims mappings dictionary of the handler:

public void ConfigureServices(IServiceCollection services)
{
    services
        .AddAuthentication(options =>
        {
            options.DefaultScheme = "Cookies";
            options.DefaultChallengeScheme = "OpenIDConnect";
        })
        .AddOpenIdConnect("OpenIDConnect", options =>
        {
            options.Authority = "<the-url-to-the-identity-provider>";
            options.ClientId = "<the-client-id>";

            // First option
            // Clear the instance-level dictionary containing the claims mappings
            var jwtHandler = new JwtSecurityTokenHandler();
            jwtHandler.InboundClaimTypeMap.Clear();

            options.SecurityTokenValidator = jwtHandler;
        })
        .AddCookie("Cookies");
}

The second one is to instruct the handler not to perform claims mapping at all, regardless of whether the dictionary contains mappings:

public void ConfigureServices(IServiceCollection services)
{
    services
        .AddAuthentication(options =>
        {
            options.DefaultScheme = "Cookies";
            options.DefaultChallengeScheme = "OpenIDConnect";
        })
        .AddOpenIdConnect("OpenIDConnect", options =>
        {
            options.Authority = "<the-url-to-the-identity-provider>";
            options.ClientId = "<the-client-id>";

            // Second option
            // Instruct the handler not to perform claims mapping
            var jwtHandler = new JwtSecurityTokenHandler
            {
                MapInboundClaims = false
            };

            options.SecurityTokenValidator = jwtHandler;
        })
        .AddCookie("Cookies");
}

Conclusion

In this post we went through the default behaviour in which JWT claims are being mapped to different names in .NET applications. By going through the source code of the library that handles JSON Web Tokens, we also pinned down how the library implements the mapping, as well as several ways to disable it.

What library did you wish you knew the internals of better? There’s a high chance it’s open source on GitHub these days, and if it isn’t, you can always use a .NET decompiler to read the code.

The consequences of enabling the 'user assignment required' option in AAD apps

Introduction

Applications in Azure Active Directory have an option labelled “user assignment required”. In this blog post, we’ll talk about how this affects an application.

💡 Quick heads-up — all the examples in this blog post are based on a web application using AAD as its identity provider through the OpenID Connect protocol.

Context

By default, applications created in Azure Active Directory have the “user assignment required” option turned off, which means that all the users in the directory can access the application, both members and guests.

While this might sound like a sensible default, we find ourselves at Readify with a growing number of guests in the directory as we collaborate with people from other companies. Some of our applications contain data that should be available to Readify employees only, so we decided to make use of the “user assignment required” option.

To access this option, in the Azure portal, go to “Azure Active Directory > Enterprise applications > your application > Properties” and the option will be displayed there.

Some of the behaviour changes were expected, but others were not! Let’s go through them.

1. People not assigned to the application can’t use it

Well, duh, isn’t that what the option is supposed to do?!

You’re absolutely right! If someone that hasn’t been explicitly assigned to the application tries to access it, then AAD will reject the authorisation request with a message similar to the following:

AADSTS50105: The signed in user ‘Microsoft.AzureAD.Telemetry.Diagnostics.PII’ is not assigned to a role for the application ‘<application-id>’ (<application-name>)

The message is straightforward and the behaviour expected.

There are several ways to assign someone to the application. I typically use the Azure portal, navigate to “Azure Active Directory > Enterprise applications > my application > Users and groups” and add them there.

2. Nested groups are not supported

This was the first surprise we had. It’s on us, because it’s well documented in the “Important” note of this documentation page: https://docs.microsoft.com/en-us/azure/active-directory/users-groups-roles/groups-saasapps

In other words, if you assign a group to an application, only the direct members of that group will gain access to the application. So instead of using our top-level “all employees” type of group, we had to assign several lower-level groups which only had people inside of them.

3. All permissions need to be consented to by an AAD administrator

Applications in Azure Active Directory can request two types of permissions:

  1. permissions scoped to the end user, like “Access your calendar”, “Read your user profile”, “Modify your contacts” — these permissions are shown to the user the first time they access an application, and they can consent to the application performing those actions on their behalf;
  2. permissions with a broader impact, outside of the user’s scope, like “Read all users’ profiles” or “Read and write all groups” — those permissions need to be consented to by an AAD administrator on behalf of all the users of the application.

When the access to the application is restricted via the “user assignment required”, an Azure Active Directory administrator needs to consent to all the permissions requested by the application, no matter whether users can normally provide consent for them.

As an example, I created an application with only one permission called “Sign in and read user profile”. After enabling the “user assignment required” option, I tried to log in through my web application and got prompted with a page similar to the screenshot below:

AAD application requires admin approval after enabling the "user assignment required" option

While I don’t fully understand that behaviour, it is alluded to in the tooltip associated with the “user assignment required” option, shortened here for brevity, emphasis mine.

This option only functions with the following application types: […] or applications built directly on the Azure AD application platform that use OAuth 2.0 / OpenID Connect Authentication after a user or admin has consented to that application.

The solution is to have an AAD admin grant consent to the permissions for the whole directory. In the Azure portal, go to “Azure Active Directory > Enterprise application > your application > Permissions” and click the “Grant admin consent” button.

4. Other applications not assigned to the application can’t get an access token

It’s not uncommon to see integration between applications. As an example, an application “B” could run a background job every night and call the API of application “A” to get some data.

Before we enabled the “user assignment required” option in application “A”, it was possible for application “B” to request an access token from AAD, allowing it to call the API of application “A”. This is done using the client_credentials OAuth2 flow, where application “B” authenticates itself against AAD with either a client secret (much like a password, though an app can have several) or a certificate.
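As an illustration, here’s roughly what such a token request looks like with the Microsoft.Identity.Client (MSAL) package — the IDs, secret, and scope below are placeholders, and the caller/target names are generic rather than taken from a real setup:

```csharp
using Microsoft.Identity.Client;

// The calling application authenticates as itself (no user involved)
// using the client_credentials flow.
var confidentialClient = ConfidentialClientApplicationBuilder
    .Create("<caller-client-id>")
    .WithClientSecret("<caller-client-secret>")
    .WithAuthority("https://login.microsoftonline.com/<tenant-id>")
    .Build();

// The ".default" scope requests all application permissions that have
// been granted to the caller on the target application.
var result = await confidentialClient
    .AcquireTokenForClient(new[] { "<target-application-id-uri>/.default" })
    .ExecuteAsync();

// result.AccessToken is then sent as a Bearer token when calling the target API.
```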

However, after requiring users to be assigned to the application “A”, the token request returns the following error:

AADSTS501051: Application ‘<application-b-id>’ (<application-b-name>) is not assigned to a role for the application ‘<application-a-id>’ (<application-a-name>).

While it’s similar to the first error we talked about in this post, the resolution is different, as the Azure portal doesn’t let us assign applications to another application in the “Users and groups” page.

I found the solution in this Stack Overflow answer which advises to take the following steps:

  1. create a role in application “A” that can be assigned to applications;
  2. have application “B” request this permission; and
  3. get an AAD admin to grant consent for the permissions requested by application “B”.

Let’s go through these steps one by one.

4.1 Create a role that can be assigned to applications

If you want to get some background information on AAD app roles, I highly suggest reading the following pages on docs.microsoft.com: Application roles and Add app roles in your application and receive them in the token.

To create a role aimed at applications, we’ll use the “Manifest” page and replace the appRoles property with the following:

"appRoles": [{
  "allowedMemberTypes": ["Application"],
  "description": "Consumer apps have access to application A data",
  "displayName": "Access application A",
  "id": "1b4f816e-5eaf-48b9-8613-7923830595ad",
  "isEnabled": true,
  "value": "Access"
}]
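On application “A”’s side, the assigned role then surfaces in the token presented by application “B” as a roles claim whose value matches the value property above. A hypothetical check in one of application “A”’s controllers could look like this — the controller, route, and data are made up for illustration:

```csharp
using Microsoft.AspNetCore.Authorization;
using Microsoft.AspNetCore.Mvc;

// Hypothetical controller in application "A"; the "Access" string matches
// the "value" property of the app role declared in the manifest above.
[Authorize]
[ApiController]
[Route("api/data")]
public class DataController : ControllerBase
{
    [HttpGet]
    public IActionResult Get()
    {
        // Depending on whether inbound claim mapping is enabled, the role may
        // surface under the raw "roles" name or the mapped role claim type;
        // we check the raw name here.
        if (!User.HasClaim("roles", "Access"))
        {
            return Forbid();
        }

        return Ok(/* the data application "B" is after */);
    }
}
```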
4.2 Request that permission in application “B”

Wait, we were talking about creating a role and now we request a permission?

I agree, sorry about the confusion, but the following will hopefully make sense. The terminology changes because assigning that role to application “B” is actually done the other way around: the role is requested from the settings of application “B”.

To do so, we navigate in the Azure portal to “Azure Active Directory > App registrations > application “B” > Required permissions” and then click on the “Add” button. In the new “Add API Access”, we look for application “A”, select it, then pick the “Access application A” application permissions we created in the previous step:

Request the permission to access the target application

💡 Another heads-up — at the time of writing, the Azure portal has a new App registrations experience in preview. The steps mentioned above are for the GA App registrations blade, but the experience is pretty similar in the preview one. If you want to try it out, follow “App registrations (preview) > application “B” > API permissions > Add a permission > APIs my organization uses > application “A” > Application permissions”, then finally pick the “Access application A” one.

Because there’s no user involved, application permissions automatically require admin consent. Follow the steps taken previously, but this time for application “B”. After doing so, the token request from application “B” to access application “A” will work as expected.

Conclusion

When we first used that “user assignment required” option, I was only expecting unassigned users to be bounced by AAD when trying to log in. Little did I know we would encounter all those “bumps” along the way 🤣.

This was a great learning opportunity, and hopefully it’ll be useful to others.