What happens when you start a VSTS agent Docker container?

Here’s what I learnt last week because of a copy/paste error 🤣

Did you know Microsoft provide Docker images for the VSTS agent? The microsoft/vsts-agent image allows you to run the VSTS agent in a Docker container.
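
For context, starting one of these containers looks roughly like this (the account name and personal access token are placeholders; the exact set of supported environment variables is described in the image's documentation):

# Run the VSTS agent image, passing the tenant name and a personal access
# token (PAT) through environment variables.
docker run \
  -e VSTS_ACCOUNT=<tenant-name> \
  -e VSTS_TOKEN=<personal-access-token> \
  -it microsoft/vsts-agent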

A colleague of mine who had run out of free build minutes on VSTS was trying to start one up. Unfortunately, he kept running into the same issue and was presented with this error message:

error: could not determine a matching VSTS agent - check that account '<tenant-name>' is correct and the token is valid for that account

Even though the error message is very explicit, we assumed the token was valid since it had just been generated, and started to think that maybe the environment variables we were passing to the container were wrong.

Knowing that the repository containing the Dockerfiles of the images is open-source, we headed to https://github.com/Microsoft/vsts-agent-docker and searched for that error message.

We landed on a start.sh file where we found our error message, and tried to figure out the execution flow. Here’s the portion of the script we focused on:

echo Determining matching VSTS agent...
VSTS_AGENT_RESPONSE=$(curl -LsS \
  -u user:$(cat "$VSTS_TOKEN_FILE") \
  -H 'Accept:application/json;api-version=3.0-preview' \
  "https://$VSTS_ACCOUNT.visualstudio.com/_apis/distributedtask/packages/agent?platform=linux-x64")

if echo "$VSTS_AGENT_RESPONSE" | jq . >/dev/null 2>&1; then
  VSTS_AGENT_URL=$(echo "$VSTS_AGENT_RESPONSE" \
    | jq -r '.value | map([.version.major,.version.minor,.version.patch,.downloadUrl]) | sort | .[length-1] | .[3]')
fi

if [ -z "$VSTS_AGENT_URL" -o "$VSTS_AGENT_URL" == "null" ]; then
  echo 1>&2 error: could not determine a matching VSTS agent - check that account \'$VSTS_ACCOUNT\' is correct and the token is valid for that account
  exit 1
fi

The first block seems to be making an HTTP request with the curl tool. I tried making that request against my VSTS tenant with a personal access token I just generated, and here’s the response I got back:

{
  "count": 9,
  "value": [
    {
      "type": "agent",
      "platform": "linux-x64",
      "createdOn": "2018-07-11T18:30:02.527Z",
      "version": {
        "major": 2,
        "minor": 136,
        "patch": 1
      },
      "downloadUrl": "https://vstsagentpackage.azureedge.net/agent/2.136.1/vsts-agent-linux-x64-2.136.1.tar.gz",
      "infoUrl": "https://go.microsoft.com/fwlink/?LinkId=798199",
      "filename": "vsts-agent-linux-x64-2.136.1.tar.gz"
    },
    {
      "type": "agent",
      "platform": "linux-x64",
      "createdOn": "2018-05-31T18:02:29.463Z",
      "version": {
        "major": 2,
        "minor": 134,
        "patch": 2
      },
      "downloadUrl": "https://vstsagentpackage.azureedge.net/agent/2.134.2/vsts-agent-linux-x64-2.134.2.tar.gz",
      "infoUrl": "https://go.microsoft.com/fwlink/?LinkId=798199",
      "filename": "vsts-agent-linux-x64-2.134.2.tar.gz"
    },
    {
      "type": "agent",
      "platform": "linux-x64",
      "createdOn": "2018-06-12T17:26:59.84Z",
      "version": {
        "major": 2,
        "minor": 134,
        "patch": 0
      },
      "downloadUrl": "https://vstsagentpackage.azureedge.net/agent/2.134.0/vsts-agent-linux-x64-2.134.0.tar.gz",
      "infoUrl": "https://go.microsoft.com/fwlink/?LinkId=798199",
      "filename": "vsts-agent-linux-x64-2.134.0.tar.gz"
    },
    {
      "type": "agent",
      "platform": "linux-x64",
      "createdOn": "2018-05-04T15:44:30.593Z",
      "version": {
        "major": 2,
        "minor": 133,
        "patch": 3
      },
      "downloadUrl": "https://vstsagentpackage.azureedge.net/agent/2.133.3/vsts-agent-linux-x64-2.133.3.tar.gz",
      "infoUrl": "https://go.microsoft.com/fwlink/?LinkId=798199",
      "filename": "vsts-agent-linux-x64-2.133.3.tar.gz"
    },
    {
      "type": "agent",
      "platform": "linux-x64",
      "createdOn": "2018-05-21T18:03:22.033Z",
      "version": {
        "major": 2,
        "minor": 133,
        "patch": 2
      },
      "downloadUrl": "https://vstsagentpackage.azureedge.net/agent/2.133.2/vsts-agent-linux-x64-2.133.2.tar.gz",
      "infoUrl": "https://go.microsoft.com/fwlink/?LinkId=798199",
      "filename": "vsts-agent-linux-x64-2.133.2.tar.gz"
    },
    {
      "type": "agent",
      "platform": "linux-x64",
      "createdOn": "2018-03-19T16:01:44.94Z",
      "version": {
        "major": 2,
        "minor": 131,
        "patch": 0
      },
      "downloadUrl": "https://vstsagentpackage.azureedge.net/agent/2.131.0/vsts-agent-linux-x64-2.131.0.tar.gz",
      "infoUrl": "https://go.microsoft.com/fwlink/?LinkId=798199",
      "filename": null
    },
    {
      "type": "agent",
      "platform": "linux-x64",
      "createdOn": "2018-02-26T16:29:08.783Z",
      "version": {
        "major": 2,
        "minor": 129,
        "patch": 1
      },
      "downloadUrl": "https://vstsagentpackage.azureedge.net/agent/2.129.1/vsts-agent-linux-x64-2.129.1.tar.gz",
      "infoUrl": "https://go.microsoft.com/fwlink/?LinkId=798199",
      "filename": null
    },
    {
      "type": "agent",
      "platform": "linux-x64",
      "createdOn": "2018-01-26T22:11:32.117Z",
      "version": {
        "major": 2,
        "minor": 127,
        "patch": 0
      },
      "downloadUrl": "https://vstsagentpackage.azureedge.net/agent/2.127.0/vsts-agent-linux-x64-2.127.0.tar.gz",
      "infoUrl": "https://go.microsoft.com/fwlink/?LinkId=798199",
      "filename": null
    },
    {
      "type": "agent",
      "platform": "linux-x64",
      "createdOn": "2017-12-05T19:38:34.7Z",
      "version": {
        "major": 2,
        "minor": 126,
        "patch": 0
      },
      "downloadUrl": "https://vstsagentpackage.azureedge.net/agent/2.126.0/vsts-agent-linux-x64-2.126.0.tar.gz",
      "infoUrl": "https://go.microsoft.com/fwlink/?LinkId=798199",
      "filename": null
    }
  ]
}

Interesting! The container asks VSTS which agents are available for the linux-x64 platform. And then it struck us: the Docker image doesn’t ship with the VSTS agent binaries, which, when you think about it, makes a lot of sense. Bundling them would mean the release cycle of the agent would need to be in line with the release cycle of the Docker image, which is less than ideal.

To work around this, the Docker container will, upon start, install the agent and run it. But we’re not there yet. Let’s have a look at the second block:

if echo "$VSTS_AGENT_RESPONSE" | jq . >/dev/null 2>&1; then
  VSTS_AGENT_URL=$(echo "$VSTS_AGENT_RESPONSE" \
    | jq -r '.value | map([.version.major,.version.minor,.version.patch,.downloadUrl]) | sort | .[length-1] | .[3]')
fi

This was all Greek to me, but knowing that the $VSTS_AGENT_RESPONSE variable should contain the JSON response displayed above, it was clearly running the jq program on it with some parameters. A quick search led us to the official website, which describes jq as a lightweight and flexible command-line JSON processor.

And they have an online playground too. Great, let’s try it. We filled in the JSON and the filter, checked the Raw output option — which we guessed is the equivalent of the -r parameter — and the result was https://vstsagentpackage.azureedge.net/agent/2.136.1/vsts-agent-linux-x64-2.136.1.tar.gz.

We analysed the query more closely and figured that it was a way to get the latest version of the agent. Neat! Let’s decompose the query:

  • .value expands the value property of the JSON object; the result of that is an array of objects;
  • it’s then piped to map([.version.major,.version.minor,.version.patch,.downloadUrl]) which executes a projection over each object, selecting 4 values from each of them, 3 being the version components, the last one being the download URL; at this point, the result is an array of 4-element arrays;
  • these arrays are then sorted; our assumption here is that they’re sorted element by element, so first by the major version, then the minor and finally the patch; the result is the same array, but sorted so that the first entry holds the smallest version and the last one the greatest;
  • .[length-1] selects the last entry of the array, so effectively the one with the latest version; the current result is now a single 4-element array;
  • finally, we assumed that the last part, .[3], selects its fourth element, being the download URL

All this done in a single line! The result of this query is stored in the VSTS_AGENT_URL variable.
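
If you want to see the filter in action without starting a container, here’s a minimal local reproduction; it assumes jq is installed and uses a trimmed-down, made-up response:

# Pipe a small fake response through the exact filter used by start.sh.
cat <<'EOF' | jq -r '.value | map([.version.major,.version.minor,.version.patch,.downloadUrl]) | sort | .[length-1] | .[3]'
{
  "value": [
    { "version": { "major": 2, "minor": 134, "patch": 2 }, "downloadUrl": "https://example.com/agent-2.134.2.tar.gz" },
    { "version": { "major": 2, "minor": 136, "patch": 1 }, "downloadUrl": "https://example.com/agent-2.136.1.tar.gz" }
  ]
}
EOF
# Prints: https://example.com/agent-2.136.1.tar.gz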

On to the last block:

if [ -z "$VSTS_AGENT_URL" -o "$VSTS_AGENT_URL" == "null" ]; then
  echo 1>&2 error: could not determine a matching VSTS agent - check that account \'$VSTS_ACCOUNT\' is correct and the token is valid for that account
  exit 1
fi

If the VSTS_AGENT_URL variable doesn’t exist or if it’s null, then the error message gets displayed. At this stage, we were scratching our heads 🤔 We followed the execution flow and it all seemed right.

We decided to double-check whether the token was correct, and guess what, it wasn’t! After being generated, it had been pasted into OneNote, which capitalised its first letter and made it invalid. It was then copied from OneNote into the docker run command, which explains the error we saw.

Two things I’m taking out of this situation:

  • Check the basics — the absolute basics — when encountering an issue. Is the cable plugged in? Is the token valid? Is the laptop connected to the Internet? I know I tend to assume the basics are working as expected and go head first into what I think is a non-trivial problem;
  • I’m still really happy we went on this investigation, because I got a better understanding of how that specific container works. But it took us maybe 30 minutes to figure out that it was the token that was invalid, so another thing I’ll remind myself of is to timebox these deep-dives so I don’t spend too much time when the fix is simple.

Azure App Service connection strings and ASP.NET Core - How?!

Here’s a quick one. You know how in ASP.NET Core there’s this new configuration model where you can get values from different providers? If not, I suggest you read the official documentation on it, which is absolutely great!

A primer

For the purpose of this post, let’s imagine an ASP.NET Core MVC application that reads configuration from these sources:

  • the appsettings.json file; and
  • the environment variables

The order matters here, because if several providers export the same value, the last one wins. In our case, imagine that the JSON file is the following:

 {
   "ConnectionStrings": {
     "SqlConnection": "Data Source=server; Initial Catalog=database; Integrated Security=SSPI"
   }
 }

Let’s also imagine that we have an environment variable called CONNECTIONSTRINGS:SQLCONNECTION with the value Data Source=different-server; Initial Catalog=different-database; Integrated Security=SSPI.

In that case, the value coming from the environment variable wins and will be the one returned from the configuration.
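
Here’s a minimal sketch of that setup in code; the file name and key are the ones from this example, and in a real app the host builder usually wires this up for you:

using Microsoft.Extensions.Configuration;

internal class Program
{
    public static void Main(string[] args)
    {
        var configuration = new ConfigurationBuilder()
            .AddJsonFile("appsettings.json")   // registered first
            .AddEnvironmentVariables()         // registered last, so it wins when keys collide
            .Build();

        // Comes from the CONNECTIONSTRINGS:SQLCONNECTION environment variable if it's set,
        // otherwise from appsettings.json.
        var connectionString = configuration.GetConnectionString("SqlConnection");
    }
}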

On to our interesting case

Azure App Service allows you to specify both application settings and connection strings so that you don’t need to deploy your application again if you want to change some configuration settings.

The documentation states that connection strings will be exposed as environment variables, prefixed based on the type of connection string you create:

Type of connection string    Prefix
SQL Server                   SQLCONNSTR_
MySQL                        MYSQLCONNSTR_
Azure SQL                    AZURESQLCONNSTR_
Custom                       CUSTOMCONNSTR_

My colleague Dom had an ASP.NET Core web application deployed to an Azure App Service. This application was sourcing a connection string from the ConnectionStrings:SqlConnection configuration key.

I was very surprised when he created an Azure SQL connection string named SqlConnection in his App Service and his app used it to connect to his Azure SQL database!

If we follow the docs, the environment variable corresponding to this connection string would be named AZURESQLCONNSTR_SQLCONNECTION. That was indeed the case; we double-checked it in the Kudu console, where you can see all the environment variables of your App Service.

So how did it work?!

I know. Much confusion. My understanding was that only an environment variable named CONNECTIONSTRINGS:SQLCONNECTION would override the one that was present in the appsettings.json configuration file.

What next? Lucky for us, all that configuration code is open-source and available in the aspnet/Configuration repository on GitHub. It contains both the abstractions and several providers: JSON, XML and INI files, environment variables, command line arguments, Azure Key Vault, etc…

The next step was digging into the environment variables provider to see if there was anything of interest. And there is! Looking at the EnvironmentVariablesConfigurationProvider class, it all falls into place.

The provider checks for all the prefixes present in the table above and replaces them with ConnectionStrings: when feeding the data into the configuration model. This means that an environment variable named AZURESQLCONNSTR_SQLCONNECTION is fed into the configuration system under the ConnectionStrings:SQLCONNECTION key, and since configuration keys are case-insensitive, it matches ConnectionStrings:SqlConnection. This explains why creating a connection string in the Azure App Service made the application change its connection string.
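
To make that concrete, here’s a simplified sketch of the substitution; this is not the actual provider code, just an illustration of the behaviour:

using System;

internal static class ConnectionStringPrefixes
{
    private static readonly string[] Prefixes =
    {
        "SQLCONNSTR_",
        "MYSQLCONNSTR_",
        "AZURESQLCONNSTR_",
        "CUSTOMCONNSTR_"
    };

    // AZURESQLCONNSTR_SQLCONNECTION becomes ConnectionStrings:SQLCONNECTION;
    // configuration keys are case-insensitive, so it matches ConnectionStrings:SqlConnection.
    public static string Normalize(string environmentVariableName)
    {
        foreach (var prefix in Prefixes)
        {
            if (environmentVariableName.StartsWith(prefix, StringComparison.OrdinalIgnoreCase))
            {
                return "ConnectionStrings:" + environmentVariableName.Substring(prefix.Length);
            }
        }

        return environmentVariableName;
    }
}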

I’m happy because I learnt something new.

Bonus

I actually learnt something else: double underscores in environment variable names are replaced by the configuration delimiter, :, when fed into the configuration model. That’s shown by the NormalizeKey method. This means that if we were not using Azure App Service, we could override the connection string with an environment variable named either ConnectionStrings:SqlConnection or ConnectionStrings__SqlConnection, the latter being handy on platforms where colons aren’t allowed in environment variable names.
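
As a concrete example, outside of Azure App Service the override could be set from a shell like this (the double underscores stand in for the colon, which most shells don’t accept in variable names):

# Overrides ConnectionStrings:SqlConnection for the app started from this shell.
export ConnectionStrings__SqlConnection="Data Source=another-server; Initial Catalog=another-database; Integrated Security=SSPI"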

How to install VSTS deployment group agents on Azure VMs

I recently got to work on an Azure migration project where we took the lift & shift approach as a first step. This means that the solution, while running in Azure, was still making use of virtual machines.

We decided to create two separate release pipelines:

  • the one that would provision the infrastructure in Azure — this one would be run only once for each environment as we don’t plan on tearing down/bringing up the resources for each application deployment; and
  • the application deployment one, which would update the application bits on the virtual machines created in the first step — this one would be run much more frequently

The second one, which deploys the applications to the virtual machines, runs on a cloud-hosted agent provided by VSTS and uses WinRM to connect to the VMs to perform all the necessary steps, like copying scripts and packages over, configuring IIS, deploying the packages, etc…

When I presented that solution to a few colleagues, one of them asked:

Why didn’t you install VSTS agents on the VMs? It’s more secure since it uses a pull model (instead of a push one), meaning you wouldn’t need to punch holes in the firewall for the cloud agent to connect to the virtual machines.

They had a very good point! I might add another benefit: running the release directly on the VMs would likely speed up the process, as the artifacts would be downloaded automatically on each VM at the start of the release, and the steps in the release wouldn’t each need to set up a WinRM connection to the VM.

So I started looking for a way to do exactly this. We are using the built-in Azure Resource Group Deployment task, and one of its arguments, called Enable Prerequisites, allows you to install the VSTS deployment group agent on all the VMs declared in your ARM template.

What’s this deployment group agent?

Some time ago, VSTS introduced the concept of a deployment group, which is a set of target machines that all have an agent installed and can be assigned tags. I find it similar to the way Octopus Deploy works. When using deployment groups, the release pipeline is made of deployment group phases, where each phase runs on servers with specific tags. This means you could execute different tasks on your database servers and on your web servers, or you could decide to split them based on which application they run. If you’re more interested in this, I suggest you read the official documentation.

Going back to the VSTS task, here’s the property that allows you to install the agent on the virtual machines:

[Image: The setting that drives the installation of the deployment group agent on VMs]

After selecting that option, we’re prompted to fill in a few additional properties:

  • a VSTS service endpoint;
  • a team project within the previously selected VSTS instance;
  • a deployment group that belongs to the selected team project;
  • whether we want to copy the tags from each VM to the associated agent; and finally
  • whether we want to run the VSTS agent service as a different user than the default one

[Image: The settings required to install the deployment group agent on VMs]

This all worked out as expected: going back to my deployment group after the provisioning of the VMs, I could see one agent for each VM that had been created. The next task was to modify the application deployment pipeline so that the process would now run directly on the virtual machines, and to remove the rules that allowed inbound WinRM traffic. It’s also worth noting that the process now needs to contain deployment group phases as opposed to agent phases.

Using this approach has several benefits:

  • increased security, as no inbound traffic is required to the VMs;
  • a quicker release process as there’s no need for WinRM connections for each step;
  • it also handles potential changes in the infrastructure: if we decide to increase the number of VMs for an application to improve reliability, the fact that the application deployment pipeline is based on VM tags means this will be transparent

Going deeper

While the main goal was achieved, I had a few questions in my mind:

  • how does the VSTS task install the VSTS agent on all the VMs?
  • why does the task require a VSTS service endpoint if the agent is to be connected to the same VSTS instance as the one where the release runs?

As all the VSTS tasks are open-source — if you didn’t know, you can find the source code in the Microsoft/vsts-tasks repository on GitHub — I decided to take a look under the hood.

The code for the Azure Resource Group Deployment task is in the Tasks/AzureResourceGroupDeploymentV2 folder.

The task.json file contains metadata about the task, like its name, the different input properties — including the rules around conditional visibility, like showing setting B only when setting A has a specific value — and the execution entry point to invoke when the task needs to run.

After finding the Enable prerequisites property, I traced the execution flow of the task until I landed on DeploymentGroupExtensionHelper.ts, which handles all things related to the installation of the deployment group agent on VMs.

And surprise! The VSTS task delegates the installation to the TeamServicesAgent Azure VM extension, as these two functions show. This answers the second question I had: the VSTS task needs a VSTS service endpoint to generate a PAT to register the agent, as the underlying Azure VM extension requires one.

The good thing about the fact that the agent installation is handled with an Azure VM extension is that we can easily reduce the coupling to this task by deploying the extension ourselves in the ARM template. This means that if we decide to move away from the VSTS task and do the deployment with either PowerShell scripts or the Azure CLI, we won’t be losing anything.
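
For reference, here’s roughly what deploying that extension from an ARM template could look like. Treat it as a sketch: the publisher, type, setting names, API version and all values below are assumptions to be checked against the extension’s documentation rather than something lifted verbatim from the task:

{
  "type": "Microsoft.Compute/virtualMachines/extensions",
  "name": "[concat(parameters('vmName'), '/TeamServicesAgent')]",
  "apiVersion": "2017-03-30",
  "location": "[resourceGroup().location]",
  "dependsOn": [
    "[resourceId('Microsoft.Compute/virtualMachines', parameters('vmName'))]"
  ],
  "properties": {
    "publisher": "Microsoft.VisualStudio.Services",
    "type": "TeamServicesAgent",
    "typeHandlerVersion": "1.0",
    "autoUpgradeMinorVersion": true,
    "settings": {
      "VSTSAccountName": "<tenant-name>",
      "TeamProject": "<team-project>",
      "DeploymentGroup": "<deployment-group>",
      "Tags": "<comma-separated-tags>"
    },
    "protectedSettings": {
      "PATToken": "<personal-access-token>"
    }
  }
}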

How to integrate Autofac in ASP.NET Core generic hosts

ASP.NET Core 2.1 brought a new feature: generic hosts. They allow you to write apps that rely on ASP.NET Core concepts like logging, configuration and built-in DI, but that are not web applications.

I was playing with them yesterday and wanted to see if I could easily integrate the Autofac IoC container with them. After looking at the ASP.NET Core integration page in the Autofac docs, I came up with code that looks like the following:

using System.Threading.Tasks;
using Autofac;
using Autofac.Extensions.DependencyInjection;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.Hosting;

internal class Program
{
    public static async Task Main(string[] args)
    {
        await new HostBuilder()
            .ConfigureServices(services => services.AddAutofac())
            .ConfigureContainer<ContainerBuilder>(builder =>
            {
                // registering services in the Autofac ContainerBuilder
            })
            .UseConsoleLifetime()
            .Build()
            .RunAsync();
    }
}

This all looked pretty straightforward and followed the docs, but at runtime the application threw an exception with the following error message:

System.InvalidCastException: 'Unable to cast object of type 'Microsoft.Extensions.DependencyInjection.ServiceCollection' to type 'Autofac.ContainerBuilder'.'

That’s interesting, given:

  • services.AddAutofac() registers an AutofacServiceProviderFactory instance as an IServiceProviderFactory, as we can see here; and
  • the code tells us that the CreateBuilder method of AutofacServiceProviderFactory returns an instance of ContainerBuilder

So we’re all good, right?! What’s wrong?! Interestingly, I also read Andrew Lock’s post about the differences between web host and generic host yesterday, and thought maybe something was fooling us into thinking we were doing the right thing.

So I cloned the aspnet/Hosting repo, checked out the 2.1.1 tag, opened the solution in Visual Studio, and started reading through the HostBuilder.cs file.

And there it was: the HostBuilder class uses a ServiceProviderAdapter that wraps the IServiceProviderFactory. This means that registering an IServiceProviderFactory in the DI container, like services.AddAutofac() does, conveys no meaning to a HostBuilder.

Luckily, while going through the code, I also found the UseServiceProviderFactory method on the HostBuilder class. The difference is that this one wraps the provided factory within the adapter.

The code then became:

using System.Threading.Tasks;
using Autofac;
using Autofac.Extensions.DependencyInjection;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.Hosting;

internal class Program
{
    public static async Task Main(string[] args)
    {
        await new HostBuilder()
            .UseServiceProviderFactory(new AutofacServiceProviderFactory())
            .ConfigureContainer<ContainerBuilder>(builder =>
            {
                // registering services in the Autofac ContainerBuilder
            })
            .UseConsoleLifetime()
            .Build()
            .RunAsync();
    }
}

And it worked!

I don’t know why the generic host uses an adapter around the service provider factory — I asked the question on Twitter, time will tell if we get the answer.

The moral here is very close to the one in Andrew’s post: don’t assume everything you know about the web host is true or will work with the generic host.

Why, when and how to reinstall NuGet packages after upgrading a project

I was working on a codebase this week and noticed a few build warnings that looked like this:

Some NuGet packages were installed using a target framework different from the current target framework and may need to be reinstalled.
Visit http://docs.nuget.org/docs/workflows/reinstalling-packages for more information.
Packages affected: <name-of-nuget-package>

The docs page is really helpful in understanding the situations in which this can happen, but we’ll focus on the one mentioned in the warning, that is, upgrading a project to target a different framework.

What it looks like before upgrading the project

Let’s imagine we have an existing project targeting .NET 4.5.2 with the Serilog NuGet package installed. If we’re using packages.config and the old .NET project system, our .csproj file will contain something that looks like the following:

<Reference Include="Serilog, Version=2.0.0.0, Culture=neutral, PublicKeyToken=24c2f752a8e58a10, processorArchitecture=MSIL">
  <HintPath>..\packages\Serilog.2.7.1\lib\net45\Serilog.dll</HintPath>
</Reference>

The above snippet shows that the assembly that is being used by the project is the one living in the net45 folder of the NuGet package, which makes sense since we’re targeting .NET 4.5.2.

Upgrading the project

We then decide to upgrade the project to target .NET 4.7.1 through Visual Studio.

Immediately after doing this, we get a build error with the message shown at the beginning of this post. On subsequent builds, though, the error goes away and we get a warning, which is consistent with what’s documented in item #4 of the docs page.

But why?!

Why do we get those warnings?

NuGet analysed all the installed packages and found out that there are more appropriate assemblies for the new target framework than the ones we’re referencing. This is because a NuGet package can contain different assemblies for different target frameworks.

Let’s inspect the content of the lib directory of the Serilog package:

└─lib
  ├─net45
  │   Serilog.dll
  │   Serilog.pdb
  │   Serilog.xml
  │
  ├─net46
  │   Serilog.dll
  │   Serilog.pdb
  │   Serilog.xml
  │
  ├─netstandard1.0
  │   Serilog.dll
  │   Serilog.pdb
  │   Serilog.xml
  │
  └─netstandard1.3
      Serilog.dll
      Serilog.pdb
      Serilog.xml

We can see different assemblies for 4 different target frameworks.

My guess is that those warnings are driven by the requireReinstallation attribute that is added for those packages in the packages.config file:

<packages>
  <package id="Serilog" version="2.7.1" targetFramework="net452" requireReinstallation="true" />
</packages>

How to fix this?

The way I find easiest to do this is to use the Package Manager Console in Visual Studio and run this command:

Update-Package <name-of-nuget-package> -Reinstall -ProjectName <name-of-project>

The most important parameter here is -Reinstall as it instructs NuGet to remove the specified NuGet package and reinstall the same version. This gives NuGet a chance to determine which assembly is most appropriate for the current framework targeted by the project.

Running this command in our sample project would change the .csproj:

<Reference Include="Serilog, Version=2.0.0.0, Culture=neutral, PublicKeyToken=24c2f752a8e58a10, processorArchitecture=MSIL">
  <HintPath>..\packages\Serilog.2.7.1\lib\net46\Serilog.dll</HintPath>
</Reference>

And also the packages.config file:

<packages>
  <package id="Serilog" version="2.7.1" targetFramework="net471" />
</packages>

The project now references the .NET 4.6 assembly of the package, and the build warning is gone. I don’t know how NuGet internally determines which set of assemblies is best suited for a target framework, though. There might be a matrix somewhere that shows this.

We can run the command for every package that is flagged by NuGet to make sure we reference the correct assemblies. Alternatively, if too many packages are affected, we can omit the package name so that NuGet reinstalls every package in the project.
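
If I recall correctly, that looks like this (still from the Package Manager Console; dropping -ProjectName as well would target every project in the solution):

Update-Package -Reinstall -ProjectName <name-of-project>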

Conclusion

We saw that it’s easy to get rid of the warnings that can occur when a project is upgraded to target a different framework.

Do you see those warnings when you build a solution? Does a solution-wide search for requireReinstallation fetch some results? You’re only a few commands away from being in a cleaner state! Fire away!