<p>Peter Ritchie's Blog &middot; Peter Ritchie</p>
<h1><a href="http://blog.peterritchie.com/posts/contributing-to-many-git-repos">A tool to help contributing to many Git repos</a></h1>
<p>2023-11-30</p>
<p><img src="../assets/contributing-to-many-git-repos.jpg" class="img-fluid" alt="Source code from many sources" /></p>
<p>I've contributed to many Git repos over the years. Doing this means I work in a code base for a little while, switch to another, and often eventually switch back.</p>
<h2 id="collaborating-with-others">Collaborating with Others</h2>
<p>In the repos that I work in, many have multiple contributors. The contributions to those repos can be prolific, and if the repo is using a workflow that uses feature or topic branches, branches come and go quite often. <code>git fetch</code> by default (or with no other options) gets all branches so you'll have other team members' branches after a fetch--which can be used to do a deep dive on a PR.</p>
<table class="table">
<thead>
<tr>
<th>You could choose not to use the <code>git fetch</code> defaults and have it only get a particular branch. This can typically be done with <code>git fetch origin main</code> (depending on how you've named your remotes and your branches.)</th>
</tr>
</thead>
</table>
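<p>A quick way to see the default behavior is a throwaway pair of clones; the paths, names, and branch names below are made up for illustration:</p>
<pre><code class="language-shell">set -e
tmp=$(mktemp -d)
git init -q --bare "$tmp/origin.git"
git clone -q "$tmp/origin.git" "$tmp/me"        # my working copy
git clone -q "$tmp/origin.git" "$tmp/teammate"  # a team member's copy
cd "$tmp/teammate"
git -c user.name=t -c user.email=t@example.com commit -q --allow-empty -m 'topic work'
git push -q origin HEAD:feature/topic           # the team member publishes a topic branch
cd "$tmp/me"
git fetch -q                                    # no options: fetches all branches
git branch -r                                   # origin/feature/topic is now available locally
</code></pre>
<p>After the plain <code>git fetch</code>, <code>origin/feature/topic</code> is available locally for that deep dive into a PR.</p>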
<p>I work with many organizations, and rarely is there just one repo (yes, I know, there's this thing called a "monorepo"; but I find that organizations that can make a monorepo work need to be very technically savvy, with products/technologies geared toward developers, and only a few of the organizations I work with are at that level.) With remote work being what it is (I'm often working at a different time than other contributors), when I return to work with an organization's code, I usually need to update several repos.</p>
<table class="table">
<thead>
<tr>
<th style="text-align: center;">Why not do a <code>git pull</code> instead of <code>git fetch</code>?</th>
</tr>
</thead>
<tbody>
<tr>
<td>What I'm contributing to, what I may be reviewing, and whether I'm connected are variable enough that I've built a habit of pulling only when I'm ready to merge and deal with potential conflicts. If I have conflicts, I must resolve them (or abort: <code>git merge --abort</code>, or <code>git fetch origin</code> and <code>git reset --hard origin</code>) before doing anything else. This means I must commit to resolving those conflicts before switching to another branch to review or work with it. (Yes, I could re-clone in a different place, but in terms of effort and risk, frequent fetches beat aborting, which beats re-cloning.)</td>
</tr>
</tbody>
</table>
<h2 id="a-tool-to-help">A Tool to Help</h2>
<p>When I re-start work (or maybe I'm coming off a vacation), going to each repo directory to perform <code>git fetch</code> is tedious. I've developed a PowerShell script to do that. I'll walk through the script after the code (commented code is available <a href="https://github.com/peteraritchie/pri.powershell/blob/main/git/fetch-all.ps1">here</a>.)</p>
<pre><code class="language-powershell">using namespace System.IO;

param (
    [switch]$WhatIf,
    [switch]$Verbose,
    [switch]$Quiet
)

$currentDir = (Get-Location).Path;

if ($Verbose.IsPresent) {
    $VerbosePreference = "Continue";
}

function Build-Command {
    $expression = 'git fetch';
    if ($Quiet.IsPresent) {
        $expression += ' -q';
    }
    if ($Verbose.IsPresent) {
        $expression += ' -v --progress';
    }
    if ($WhatIf.IsPresent) {
        $expression += ' --dry-run';
    }
    $expression += ' origin';
    return $expression;
}

foreach ($item in [Directory]::GetDirectories($currentDir, '.git', [SearchOption]::AllDirectories)) {
    $dir = Get-Item -Force $item;
    Push-Location $dir.Parent;
    try {
        Write-Verbose "fetching in $((Get-Location).Path)...";
        $expression = Build-Command;
        Invoke-Expression $expression;
    } finally {
        Pop-Location;
    }
}
</code></pre>
<p>First, I translate the PowerShell idioms <code>WhatIf</code>, <code>Verbose</code>, and <code>Quiet</code> to the corresponding Git options <code>--dry-run</code>, <code>--verbose</code> (<code>-v</code>), and <code>--quiet</code> (<code>-q</code>). The <code>Build-Command</code> function builds up the expression used to invoke Git. I've included the <code>--progress</code> option with <code>git fetch</code> to display progress when <code>-Verbose</code> is specified. Next, I loop through all directories, looking for a <code>.git</code> directory. I use <code>System.IO.Directory.GetDirectories</code> instead of <code>Get-ChildItem</code> because it's much faster. For each directory that contains a <code>.git</code> subdirectory, <code>git fetch</code> is invoked. This allows me to fetch several Git repos within the hierarchy of the current directory.</p>
<table class="table">
<thead>
<tr>
<th style="text-align: center;">Organizing Code Locally</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: center;">I work with my code (spikes, libraries, experiments, etc.), open-source projects, and multiple clients. All these diverge from one another at one level in my directory structure. E.g., I may have a <code>src</code> subdirectory in my home directory, and <code>oss</code>, <code>experiments</code>, and <code>client</code> subdirectories within <code>src</code>, so I can choose to fetch from all the repos recursively in each of those subdirectories--if I'm returning to work on an OSS project after being away from OSS for a while, I just run <code>fetch-all.ps1</code> within the <code>oss</code> subdirectory.</td>
</tr>
</tbody>
</table>
<p>By default (or with no other options), <code>git fetch</code> does not delete corresponding local branches that have been removed from a remote. So, new branches will be downloaded, but those that were removed will remain.</p>
<table class="table">
<thead>
<tr>
<th>To also remove local remote-tracking branches whose branches were removed from the remote, you can include the prune option with <code>git fetch</code>: <code>git fetch --prune</code> or <code>git fetch -p</code>.</th>
</tr>
</thead>
</table>
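<p>The difference can be sketched with a disposable repo (paths and branch names invented) where a branch is deleted on the remote by someone else: a plain fetch leaves the stale remote-tracking branch behind, and <code>--prune</code> removes it:</p>
<pre><code class="language-shell">set -e
tmp=$(mktemp -d)
git init -q --bare "$tmp/origin.git"
git clone -q "$tmp/origin.git" "$tmp/work"
cd "$tmp/work"
git -c user.name=p -c user.email=p@example.com commit -q --allow-empty -m 'init'
git push -q origin HEAD:main HEAD:feature/old
git -C "$tmp/origin.git" branch -D feature/old  # the branch is removed on the remote
git fetch -q                                    # default: stale origin/feature/old remains
if git rev-parse --verify -q origin/feature/old >/dev/null; then
  echo 'stale remote-tracking branch remains'
fi
git fetch -q --prune                            # prune: the stale tracking branch is removed
if ! git rev-parse --verify -q origin/feature/old >/dev/null; then
  echo 'pruned'
fi
</code></pre>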
<p>If I'm reviewing a PR, I don't necessarily want removed remote branches to be removed locally <em>all the time</em>. So, I like pruning separately from fetching. The following is the script for that; other than <code>Build-Command</code>, it has the same structure and flow as <code>fetch-all.ps1</code>, so I won't walk through this snippet.</p>
<pre><code class="language-powershell">using namespace System.IO;

param (
    [switch]$WhatIf,
    [switch]$Verbose
)

$currentDir = (Get-Location).Path;

if ($Verbose.IsPresent) {
    $VerbosePreference = "Continue";
}

function Build-Command {
    $expression = 'git remote';
    if ($Verbose.IsPresent) {
        $expression += ' -v';
    }
    $expression += ' prune';
    if ($WhatIf.IsPresent) {
        $expression += ' --dry-run';
    }
    $expression += ' origin';
    return $expression;
}

foreach ($item in [Directory]::GetDirectories($currentDir, '.git', [SearchOption]::AllDirectories)) {
    $dir = Get-Item -Force $item;
    Push-Location $dir.Parent;
    try {
        Write-Verbose "pruning in $((Get-Location).Path)...";
        $expression = Build-Command;
        Invoke-Expression $expression;
    } finally {
        Pop-Location;
    }
}
</code></pre>
<p>Separating pruning from fetching also allows me to prune at a wider scope than fetching, e.g., running <code>.\fetch-all.ps1</code> in <code>c:\Users\peter\src\client</code> but <code>.\prune-all.ps1</code> in <code>c:\Users\peter\src</code>.</p>
<p>I look forward to your feedback and comments.</p>
<h1><a href="http://blog.peterritchie.com/posts/entity-framework-in-aspire">Entity Framework in .NET Aspire</a></h1>
<p>2023-11-29</p>
<p><img src="../assets/entity-framework-in-aspire.jpg" class="img-fluid" alt="A path through the infrastructure" /></p>
<blockquote class="blockquote">
<p>.NET Aspire is an opinionated, cloud ready stack for building observable, production ready, distributed applications.</p>
</blockquote>
<p><a href="https://learn.microsoft.com/en-us/dotnet/aspire/get-started/aspire-overview">.NET Aspire</a> is currently in preview and focuses on <em>simplifying the developer experience</em> with <em>orchestration</em> and <em>automatic service discovery</em> features. There's a huge potential for .NET Aspire beyond this initial valuable feature set.</p>
<p>Being in preview, .NET Aspire may not yet support all the scenarios or workloads you may be comfortable with. It's an opinionated framework, which means differences of opinion are natural and expected. Currently, one of those opinions seems to be a focus on containers. The sample solutions that the new <code>dotnet</code> templates provide are a great example of the benefits of containerization. The .NET Aspire starter solution that <code>dotnet new aspire-starter --use-redis-cache --output AspireStarter</code> generates will, out of the box, download, run, and utilize a Docker Redis image when debugged. (I've worked with teams where getting each member productive in a development environment has ended up being days of work.) The AppHost component of a .NET Aspire solution codifies abstract aspects of the architectural decisions, automating the generation and deployment of a development environment.</p>
<p>A container focus is empowered by .NET Aspire's <em>orchestration</em> features. An independent orchestration responsibility enables better separation of <em>release and deploy</em> concerns from <em>build and test</em> concerns, <em>shifting right</em> those decisions that release and deploy depend on. (i.e., the ability to develop, execute, and evaluate solutions is discernibly <em>left</em> of release and operation.) Containers are an established method of componentizing a distributed system with independent servers (sometimes called "tiers.") This provides flexibility to deploy and execute in a development environment even before architectural decisions about a production topology have been considered. For example, debugging the .NET Aspire starter app automatically spins up a Redis container in Docker, but it's extremely unlikely that's how it will be deployed in production. In production, will there be only one Redis instance? If there are many instances, what sort of gateway or reverse proxy to that pool of instances will be utilized? Will it be on-prem or cloud? Will it be Azure, AWS, or Google Cloud? The beauty of Aspire's orchestration feature is that it doesn't matter yet; you can configure orchestration to <em>figure it out at run-time</em>, one environment at a time!</p>
<p>But, with every decision comes compromise. Technologies that depend on the physical resources that come from those decisions (that we're now effectively deferring) introduce some challenges with some existing software development idioms. A chicken-and-egg situation: if how to connect to physical resources may only be known at run-time, what happens to design-time technologies that depend on that connection information?</p>
<p>One popular technology in .NET, Entity Framework, suffers one of those challenges in .NET Aspire (possibly only in <em>code-first</em> scenarios). Many Entity Framework examples detail adding Entity Framework support to an existing component (resource, like a console app, ASP.NET Core web API, Razor app, etc.), creating a circular dependency between its project and the existence of an executing database (i.e., a valid database connection string.) In database-first, you have an existing application with existing physical databases and practices to utilize them in a development environment. With .NET Aspire, developers are <em>shifted left</em> from the decisions that provide the resources that things like <code>dotnet ef migrations add &lt;migration-name&gt;</code> and <code>dotnet ef database update</code> require to function properly.</p>
<p>To be clear, the way .NET Aspire works is that the orchestration (AppHost) executes, figures out the various connection strings, and overrides <code>appsettings</code> by setting environment variables before running the other components. The premise behind this means that at run-time, whatever is in <code>appsettings</code> is ignored. The <code>dotnet ef</code> command doesn't execute at run-time; it effectively runs at design-time and gets its configuration from <code>appsettings</code>, so it's out of sync with reality.</p>
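<p>The override behaves like ordinary environment-variable precedence. A shell sketch of the idea, not .NET's actual configuration API (the variable name mirrors how .NET maps a <code>ConnectionStrings__catalogdb</code> environment variable onto <code>ConnectionStrings:catalogdb</code>; the values are invented):</p>
<pre><code class="language-shell"># Design-time value a developer might have in appsettings.json:
appsettings_value='Host=localhost;Database=design-time'
# What the AppHost would inject into the child process at run-time:
export ConnectionStrings__catalogdb='Host=container-xyz;Database=run-time'
# The environment value wins whenever it is present:
effective="${ConnectionStrings__catalogdb:-$appsettings_value}"
echo "$effective"   # dotnet ef, running at design time, never sees this value
</code></pre>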
<p>The basic guidance is to <em>abstract those types of dependencies as .NET Aspire resources</em>. Nothing new conceptually, but this might be an application of the principles of abstraction at a level where they are less commonly applied. Refining that guidance for Entity Framework: <em>the database should be an independent resource</em>. Independent resources are modeled in .NET as either separate projects or separate solutions. Luckily, a <a href="https://github.com/dotnet/aspire-samples/tree/main/samples/eShopLite">.NET Aspire sample</a> addresses this. Let's look into the details.</p>
<p>The structure of the eShopLite sample overlaps with the .NET Aspire starter <code>dotnet new</code> template. It has a Blazor web frontend, a web API, an Aspire AppHost, and an Aspire service defaults project. Additionally, there is a shopping cart service (BasketService), and the catalog database (CatalogDb) project is an abstraction of the database resource.</p>
<p>The CatalogDb looks very similar to what you'd end up with following <a href="https://learn.microsoft.com/en-us/aspnet/core/tutorials/first-web-api?view=aspnetcore-8.0&tabs=visual-studio">Tutorial: Create a web API with ASP.NET Core</a>: an ASP.NET Core web API that leverages Entity Framework and is effectively a gateway to a backend database (although that tutorial uses Entity Framework's <em>in-memory</em> provider rather than PostgreSQL). The way eShopLite supports Entity Framework is through the CatalogDb project. CatalogDb is like a stub project to the rest of the solution: Aspire doesn't execute it, but CatalogService depends upon it for the database model classes and <code>DbContext</code> (utilized more like a class library.) Nothing connects to the CatalogDb <em>web API</em>. The CatalogDb project contains all the Entity Framework design-time details and references, allowing you to utilize Entity Framework features like <code>dotnet ef migrations add &lt;migration-name&gt;</code> and <code>dotnet ef database update</code>. The target of Entity Framework operations like migration add and database update depends on the configuration in <code>appsettings.json</code>. Initialization/seeding of the data is handled in <code>CatalogDbInitializer</code> within CatalogDb, as are migrations at run-time (startup). CatalogDb <code>appsettings</code> connection strings must be in sync with the run-time values for <code>dotnet ef</code> commands to work.</p>
<p>In summary, if you want to utilize Entity Framework in a basic .NET Aspire application, <a href="https://learn.microsoft.com/en-us/aspnet/core/tutorials/first-web-api?view=aspnetcore-8.0&tabs=visual-studio">adding a project</a> to contain the entity models, context, and Entity Framework references and supporting a database engine container is a recommended place to get started. I suspect this guidance may be refined as .NET Aspire evolves.</p>
<p>I'm still wrapping my head around how .NET Aspire can support other non-containerized workloads like Azure SQL. Still, a containerized design melds nicely with the idea of independent resources (or nodes) in .NET Aspire, and .NET Aspire helps to more clearly delineate concerns like design, build, test, release, and deploy. As with .NET Aspire itself, containerization is an easier starting point for someone interested in distributed applications.</p>
<p>I look forward to how .NET Aspire evolves.</p>
<h1><a href="http://blog.peterritchie.com/posts/etags-in-aspdotnet-core">ETags in ASP.NET Core</a></h1>
<p>2023-06-28</p>
<p><img src="../assets/farside-etags-in-asp-dot-net.jpg" class="img-fluid" alt="lots of things going on at the same time, in the style of Farside" /></p>
<p>A good software architect doesn't just provide expectations of structure; they also work with developers to give feedback and guidance for implementation. It's one thing to say, "Use ETags for entity concurrency control in a Web API"; it's another to empower teams to accomplish the objectives of entity versioning.</p>
<p>To review: entity-tags (ETags) are a method of implementing Optimistic Concurrency Control. Optimistic Concurrency Control is a means to avoid distributed locking in situations where two or more potentially concurrent operations rarely interfere with each other. You see cases like this on the Web when multiple processes or people are not normally working on the same data simultaneously. There are rare situations where a single process or single person can (usually inadvertently) modify data from two places at the same time. It's in rare cases like these that the low overhead of optimistic concurrency can avoid accidental overwrites.</p>
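<p>The mechanic, reduced to a toy file-based sketch (no HTTP; names invented): record the version you read, and refuse the write if the current version no longer matches that basis:</p>
<pre><code class="language-shell">f=$(mktemp)
printf 'v1' > "$f"
basis=$(cat "$f")     # the incarnation we read before editing
printf 'v2' > "$f"    # meanwhile, a concurrent writer updates it
if [ "$(cat "$f")" = "$basis" ]; then
  echo 'no interference: apply the write'
else
  echo 'precondition failed: re-read and retry'
fi
</code></pre>
<p>No lock was ever held; the cost is only a comparison at write time, which is the trade optimistic concurrency makes.</p>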
<p>An entity-tag is a moniker for a particular incarnation of an entity. The tag is opaque, so it shouldn't need to be interpretable by a requestor to your service. With opaque data, you want to make the value itself as unobvious as possible.</p>
<p>The value, of course, could be an incrementing integer if you could reliably and efficiently increment an integer in a distributed environment (remember, we're addressing the possibility of two distributed transactions interfering with one another, the same transactions mechanisms that would be used to increment an integer.) But, before choosing to increment an integer (an ordinal number), consider <a href="https://www.rfc-editor.org/rfc/rfc9110#field.etag">RFC 9110 ETags</a> and why ordinal version numbers are not specified.</p>
<table class="table">
<thead>
<tr>
<th>If you think an ordinal number will work, do you need entity-tags at all?</th>
</tr>
</thead>
</table>
<p>A time-stamp is something to consider; in that case, prefer the <code>Last-Modified</code> header field validator, either on its own or in conjunction with entity-tags. If a time-stamp is reliable, <code>Last-Modified</code> offers better interoperability options than re-inventing the wheel. Also, be thoughtful when considering time-stamps, especially their granularity; per-second time-stamp granularity can only partially solve the problem of concurrent writes.</p>
<p>So, how do you reliably generate an entity-tag value? The first thing to consider is what you want to accomplish. Do you want to prevent accidental overwrites, or do you want entity versioning? If you said, "I want entity versioning," to what end? If a client gets version 1, and another client updates it to version 2, what action do you want to perform when the first client requests to update the entity? You don't need <em>versioning</em> to prevent that first client from updating the entity. If you want to merge with version 2, you probably want versioning; in this case, you can stop reading now; I won't get into detail like that in this post.</p>
<p>If we're interested in preventing accidental overwrites, on the server side, we only really care about the current entity and the basis for the current request to update it. It doesn't matter if the basis is the previous version or ten versions behind; we only care that it's not based on the current version.</p>
<p>Another thing to consider is that entity-tags are also used in HTTP caching, which requires that an entity-tag be unique per encoding (e.g., a gzipped response should have a different entity-tag than a non-gzipped response.) The encoding value is often postfixed to the entity-tag to make it unique per encoding. But be careful to parse that out when checking for semantically identical entities. That's out of the scope of this post.</p>
<p>With an understanding of those constraints, a common method of generating an entity-tag is to use a hash of the entity representation.</p>
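<p>Before the C# example, the idea in its smallest form (a shell sketch in which <code>sha256sum</code> stands in for whatever hash you choose, and the JSON payloads are invented): hashing the representation yields a stable, opaque tag that changes exactly when the entity does:</p>
<pre><code class="language-shell">etag() { printf '%s' "$1" | sha256sum | cut -c1-16; }
v1='{"description":"review","status":"open"}'
v2='{"description":"review","status":"closed"}'
echo "ETag: \"$(etag "$v1")\""
if [ "$(etag "$v1")" = "$(etag "$v1")" ]; then echo 'same representation, same tag'; fi
if [ "$(etag "$v1")" != "$(etag "$v2")" ]; then echo 'changed representation, new tag'; fi
</code></pre>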
<p>Let's look at an example controller that tries to isolate the implementation detail of how the entity-tag is calculated. For my examples, I'm choosing to use controllers over minimal APIs; the controller class attributes make some of what is required easier. For clarity, my examples are stripped of error responses unrelated to conditional requests and exception middleware. For complete source, see <a href="https://github.com/peteraritchie/Examples.Etag">this repo</a>.</p>
<pre><code class="language-csharp">[ApiController]
[Route("[controller]")]
public class AppointmentController : ControllerBase
{
    [HttpGet(Name = "GetAppointmentRequests")]
    [ProducesResponseType(typeof(WebCollectionElement<AppointmentRequestDto>[]),
        StatusCodes.Status200OK, MediaTypeNames.Application.Json)]
    public async Task<IActionResult> GetMany(CancellationToken cancellationToken = default)
    {
        var resource = appointmentRequestService.GetRequests(cancellationToken);
        List<WebCollectionElement<AppointmentRequestDto>> items = new();
        foreach (var (dto, guid, concurrencyToken) in await resource.ToListAsync(cancellationToken: cancellationToken))
        {
            items.Add(
                new WebCollectionElement<AppointmentRequestDto>(dto, Url.Action(nameof(GetById),
                    new { id = guid })!, etag: concurrencyToken));
        }
        return base.Ok(items);
    }

    [HttpGet("{id}", Name = "GetAppointmentRequest")]
    [ProducesResponseType(typeof(AppointmentRequestDto), StatusCodes.Status200OK, MediaTypeNames.Application.Json)]
    [ProducesResponseType(typeof(AppointmentRequestDto), StatusCodes.Status304NotModified)]
    public async Task<IActionResult> GetById(Guid id, [FromHeader(Name = "If-None-Match")] string? ifNoneMatch,
        CancellationToken cancellationToken = default)
    {
        var (resource, concurrencyToken) = string.IsNullOrWhiteSpace(ifNoneMatch)
            ? await appointmentRequestService.GetRequest(id, cancellationToken)
            : await appointmentRequestService.GetRequest(id, ifNoneMatch, cancellationToken);
        HttpContext.Response.Headers.Add(HeaderNames.ETag, concurrencyToken);
        return Ok(resource);
    }

    [HttpPost(Name = "CreateAppointmentRequest")]
    [Consumes(MediaTypeNames.Application.Json)]
    [ProducesResponseType(StatusCodes.Status201Created)]
    public async Task<IActionResult> Create([FromBody] AppointmentRequestDto appointmentRequest,
        CancellationToken cancellationToken = default)
    {
        var (id, concurrencyToken) = await appointmentRequestService.CreateRequest(appointmentRequest, cancellationToken);
        HttpContext.Response.Headers.Add(HeaderNames.ETag, concurrencyToken);
        return CreatedAtAction(nameof(GetById), routeValues: new { id }, value: null);
    }

    [HttpPut("{id}", Name = "ReplaceAppointmentRequest")]
    [Consumes(MediaTypeNames.Application.Json)]
    [ProducesResponseType(StatusCodes.Status204NoContent)]
    public async Task<IActionResult> Replace(Guid id, [FromBody] AppointmentRequestDto appointmentRequest,
        [FromHeader(Name = "If-Match")] string? ifMatch, CancellationToken cancellationToken = default)
    {
        var concurrencyToken = string.IsNullOrWhiteSpace(ifMatch)
            ? await appointmentRequestService.ReplaceRequest(
                id, appointmentRequest, cancellationToken)
            : await appointmentRequestService.ReplaceRequest(
                id, appointmentRequest, ifMatch, cancellationToken);
        HttpContext.Response.Headers.Add(HeaderNames.ETag, concurrencyToken);
        return NoContent();
    }

    [HttpPatch("{id:guid}", Name = "UpdateAppointmentRequest")]
    [ProducesResponseType(typeof(AppointmentRequestDto), StatusCodes.Status200OK, MediaTypeNames.Application.Json)]
    [Consumes("application/json-patch+json")]
    public async Task<IActionResult> Update(Guid id, JsonPatchDocument<AppointmentRequestDto> patchDocument,
        [FromHeader(Name = "If-Match")] string? ifMatch, CancellationToken cancellationToken = default)
    {
        var (result, concurrencyToken) = string.IsNullOrWhiteSpace(ifMatch)
            ? await appointmentRequestService.UpdateRequest(id, patchDocument, cancellationToken)
            : await appointmentRequestService.UpdateRequest(id, patchDocument, ifMatch, cancellationToken);
        HttpContext.Response.Headers.Add(HeaderNames.ETag, concurrencyToken);
        return Ok(result);
    }

    [HttpDelete("{id}", Name = "RemoveAppointmentRequest")]
    [ProducesResponseType(StatusCodes.Status204NoContent)]
    public async Task<IActionResult> Remove(Guid id, [FromHeader(Name = "If-Match")] string? ifMatch,
        CancellationToken cancellationToken = default)
    {
        if (string.IsNullOrWhiteSpace(ifMatch))
            await appointmentRequestService.RemoveRequest(id, cancellationToken);
        else
            await appointmentRequestService.RemoveRequest(id, ifMatch, cancellationToken);
        return NoContent();
    }
}
</code></pre>
<blockquote class="blockquote">
<p>There are inherent complexities in a Web API. It needs to present an interface usable on the Web and utilizes open standards as much as possible. You'll notice that the <code>AppointmentController</code> PATCH implementation uses <code>JsonPatchDocument</code>, an implementation of the <a href="https://jsonpatch.com/">JSON Patch</a> (IETF RFC 6902) standard. This standard is specific to the Web, specific to JSON, and deals with operations intended to be specifically applied to JSON representations equivalent to the model defined in the interface (i.e., the model, not what is represented in the database or an in-memory representation of a domain object.)</p>
</blockquote>
<p>This controller is isolated from the collaboration with the database and delegates that interaction to an Application Service via the <code>appointmentRequestService</code> field (declaration removed for readability). In state-modifying HTTP methods (PUT, DELETE, PATCH), the actions have an <code>ifMatch</code> parameter passed in through the <code>If-Match</code> HTTP request header. When present, it is passed along to the application service for optimistic concurrency. This example shows an <em>optional</em> use of <code>If-Match</code>; it's plausible that another implementation might <em>require</em> the <code>If-Match</code> header and respond with status code 428 Precondition Required.</p>
<p>Of note is that this controller abstracts etag header values as <em>concurrency token</em> text so that nothing else has to deal with HTTP headers.</p>
<p>Let's look at the MVC model (I prefer to refer to it as a Data Transfer Object).</p>
<pre><code class="language-csharp">public class AppointmentRequestDto
{
    [Required]
    public DateTime? CreationDate { get; set; }
    public IEnumerable<string>? Categories { get; set; }
    [Required]
    public string? Description { get; set; }
    public string? Notes { get; set; }
    [Required]
    public AppointmentRequestStatus? Status { get; set; }
    [Required]
    public MeetingDuration? Duration { get; set; }
    [Required]
    public IEnumerable<string>? Participants { get; set; }
    [Required]
    public IEnumerable<DateTime>? ProposedStartDateTimes { get; set; }
}
</code></pre>
<p>Since we're delegating serialization to ASP.NET (which requires writable properties), the properties are nullable but annotated with <code>RequiredAttribute</code> to signal to the framework what properties are required. There is no identifier in the <code>AppointmentRequestDto</code> class because we don't want to duplicate it there and in the resource's URI.</p>
<p>Azure Cosmos has implemented optimistic concurrency control and stores an ETag per document. I'll use Azure Cosmos for the database implementation to show how this can be re-used in your WebAPI.</p>
<h2 id="azure-cosmos-example">Azure Cosmos Example</h2>
<p>In Azure Cosmos, each document has several mandatory properties: <code>id</code>, <code>_rid</code>, <code>_self</code>, <code>_etag</code>, <code>_attachments</code>, and <code>_ts</code>. These are implementation details of the database that we don't want to leak into our API as body content. When we use the <a href="https://www.nuget.org/packages/Microsoft.Azure.Cosmos">Azure Cosmos SDK</a>, we need serialization classes to serialize the data to and from a container. Let's see an example with a fictitious appointment request resource:</p>
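<p>For orientation, a stored document carrying those system properties might look something like the following (every value here is made up for illustration; note that <code>_etag</code> is a quoted string and <code>_ts</code> is an epoch-seconds timestamp):</p>
<pre><code class="language-json">{
  "id": "c2d4a9e2-7c1c-4a4c-9d51-0e5f2a7b3c10",
  "description": "Quarterly review",
  "_rid": "AAAAAAAAAAAAAAAAAAAAAA==",
  "_self": "dbs/AAAAAA==/colls/AAAAAAAAAAA=/docs/AAAAAAAAAAAAAAAAAAAAAA==/",
  "_etag": "\"00000000-0000-0000-0000-000000000000\"",
  "_attachments": "attachments/",
  "_ts": 1687910400
}
</code></pre>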
<pre><code class="language-csharp">public class AppointmentRequestEntity : CosmosEntityBase
{
    [JsonProperty(PropertyName = "id")]
    public Guid Id { get; set; }
    [JsonProperty(PropertyName = "_rid")]
    public string? ResourceId { get; set; }
    [JsonProperty(PropertyName = "_self")]
    public Uri? SelfUri { get; set; }
    [JsonProperty(PropertyName = "_etag")]
    public string? ETag { get; set; }
    [JsonProperty(PropertyName = "_ts")]
    public int? TimestampText { get; set; }
    public DateTime? CreationDate { get; set; }
    public IEnumerable<string>? Categories { get; set; }
    public string? Description { get; set; }
    public string? Notes { get; set; }
    public AppointmentRequestStatus? Status { get; set; }
    public MeetingDuration? Duration { get; set; }
    public IEnumerable<string>? Participants { get; set; }
    public IEnumerable<DateTime>? ProposedStartDateTimes { get; set; }
}
</code></pre>
<p>Notice the first five properties that are necessary to access the Azure Cosmos implementation details. (In <a href="https://github.com/peteraritchie/Examples.Etag">this repo</a>, these are split out into a <code>CosmosEntityBase</code> class.)</p>
<p>For my example, I'm going to draw on Domain-Driven Design patterns and use a Repository implementation for the database collaboration. I want to delegate all the logic related to database-specific details to the repository implementation. This includes encapsulating the use of the database entity <em>serialization</em> class (translation to/from the database entity class), associating an identifier and etag with the resource, etc. To separate the existence of the database entity class from clients of the repository, we'll define a generic interface, <code>IOptimisticallyConcurrentRepository</code>, that works with different types of domain entity classes:</p>
<pre><code class="language-csharp">public interface IOptimisticallyConcurrentRepository<TDomainEntity>
{
    Task<TDomainEntity> Get(Guid id, CancellationToken cancellationToken = default);
    IAsyncEnumerable<TDomainEntity> Get(CancellationToken cancellationToken = default);
    Guid GetId(TDomainEntity entity);
    bool TryGetIfModified(Guid id, string concurrencyToken, out TDomainEntity? entity);
    string GetConcurrencyToken(TDomainEntity entity);
    Task<Guid> Add(TDomainEntity entity, CancellationToken cancellationToken = default);
    Task Remove(Guid id, CancellationToken cancellationToken = default);
    Task Replace(Guid id, TDomainEntity entity, CancellationToken cancellationToken = default);
    Task RemoveIfMatch(Guid id, string token, CancellationToken cancellationToken = default);
    Task ReplaceIfMatch(Guid id, TDomainEntity entity, string token, CancellationToken cancellationToken = default);
}
</code></pre>
<p>Next is a generic repository class to support Azure Cosmos that deals with arbitrary domain (<code>TDomainEntity</code>) and database serialization classes (<code>TDbEntity</code>):</p>
<pre><code class="language-csharp">public class CosmosOptimisticallyConcurrentRepository<TDomainEntity, TDbEntity>
    : IOptimisticallyConcurrentRepository<TDomainEntity>
    where TDomainEntity : class
    where TDbEntity : CosmosEntityBase
{
    private class EntityContext
    {
        public EntityContext(Guid id, string concurrencyToken)
        {
            Id = id;
            ConcurrencyToken = concurrencyToken;
        }

        public Guid Id { get; }
        public string ConcurrencyToken { get; }
    }

    // Associates each materialized domain entity with its id and etag
    // without polluting the domain class.
    private readonly ConditionalWeakTable<TDomainEntity, EntityContext> conditionalWeakTable = new();
    private readonly Container container;
    private readonly ITranslator<TDomainEntity, TDbEntity> dbEntityTranslator;
    private readonly Action<TDbEntity, Guid> setDbEntityId;

    protected CosmosOptimisticallyConcurrentRepository(Container container, ITranslator<TDomainEntity, TDbEntity> dbEntityTranslator,
        Action<TDbEntity, Guid> setDbEntityId)
    {
        this.container = container;
        this.dbEntityTranslator = dbEntityTranslator;
        this.setDbEntityId = setDbEntityId;
    }

    public async Task<Guid> Add(TDomainEntity entity, CancellationToken cancellationToken = default)
    {
        var id = Guid.NewGuid();
        var dbEntity = dbEntityTranslator.ToData(entity);
        setDbEntityId(dbEntity, id);
        try
        {
            var result = await container.CreateItemAsync(dbEntity, new PartitionKey(id.ToString("D")), cancellationToken: cancellationToken);
            conditionalWeakTable.Add(entity, new EntityContext(id, result.ETag));
            return id;
        }
        catch (CosmosException ex) when (ex.StatusCode == HttpStatusCode.PreconditionFailed)
        {
            throw new ConcurrencyException();
        }
    }

    public bool TryGetIfModified(Guid id, string concurrencyToken, out TDomainEntity? entity)
    {
        var idText = id.ToString("D");
        try
        {
            var result = container.ReadItemAsync<TDbEntity>(
                    idText,
                    new PartitionKey(idText),
                    requestOptions: new ItemRequestOptions() { IfNoneMatchEtag = concurrencyToken })
                .Result;
            entity = dbEntityTranslator.ToDomain(result.Resource);
            conditionalWeakTable.Add(entity, new EntityContext(id, result.ETag));
            return true;
        }
        catch (AggregateException aggregateException) when (aggregateException.InnerExceptions.Count == 1 &&
            aggregateException.InnerExceptions.Single() is
                CosmosException { StatusCode: HttpStatusCode.NotModified })
        {
            entity = default;
            return false;
        }
        catch (AggregateException aggregateException) when (aggregateException.InnerExceptions.Count == 1 &&
            aggregateException.InnerExceptions.Single() is
                CosmosException { StatusCode: HttpStatusCode.NotFound })
        {
            throw new EntityNotFoundException(id);
        }
        catch (CosmosException ex) when (ex.StatusCode == HttpStatusCode.NotFound)
        {
            throw new EntityNotFoundException(id);
        }
    }

    public async IAsyncEnumerable<TDomainEntity> Get([EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        var iterator = container.GetItemQueryIterator<TDbEntity>();
        while (iterator.HasMoreResults)
        {
            var set = await iterator.ReadNextAsync(cancellationToken);
            foreach (var e in set)
            {
                var entity = dbEntityTranslator.ToDomain(e);
                conditionalWeakTable.Add(entity, new EntityContext(e.Id, e.ETag!));
                yield return entity;
            }
        }
    }

    public async Task<TDomainEntity> Get(Guid id, CancellationToken cancellationToken = default)
    {
        var idText = id.ToString("D");
        try
        {
            var result = await container.ReadItemAsync<TDbEntity>(idText, new PartitionKey(idText), cancellationToken: cancellationToken);
            var entity = dbEntityTranslator.ToDomain(result.Resource);
            conditionalWeakTable.Add(entity, new EntityContext(id, result.ETag));
            return entity;
        }
        catch (CosmosException ex) when (ex.StatusCode == HttpStatusCode.NotFound)
{
throw new EntityNotFoundException(id);
}
}
public async Task Replace(Guid id, TDomainEntity entity, CancellationToken cancellationToken = default)
{
var dbEntity = dbEntityTranslator.ToData(entity);
setDbEntityId(dbEntity, id);
try
{
_ = await container.UpsertItemAsync(dbEntity, cancellationToken: cancellationToken);
}
catch (CosmosException ex) when(ex.StatusCode == HttpStatusCode.NotFound)
{
throw new EntityNotFoundException(id);
}
}
public async Task ReplaceIfMatch(Guid id, TDomainEntity entity, string token, CancellationToken cancellationToken = default)
{
var idText = id.ToString("D");
var dbEntity = dbEntityTranslator.ToData(entity);
setDbEntityId(dbEntity, id);
var requestOptions = new ItemRequestOptions { IfMatchEtag = token };
try
{
_ = await container.ReplaceItemAsync(dbEntity, idText, new PartitionKey(idText), requestOptions: requestOptions, cancellationToken: cancellationToken);
}
catch (CosmosException ex) when (ex.StatusCode == HttpStatusCode.PreconditionFailed)
{
throw new ConcurrencyException();
}
catch (CosmosException ex) when(ex.StatusCode == HttpStatusCode.NotFound)
{
throw new EntityNotFoundException(id);
}
}
public async Task Remove(Guid id, CancellationToken cancellationToken = default)
{
var idText = id.ToString("D");
try
{
_ = await container.DeleteItemAsync<TDbEntity>(idText, new PartitionKey(idText), cancellationToken: cancellationToken);
}
catch (CosmosException ex) when(ex.StatusCode == HttpStatusCode.NotFound)
{
throw new EntityNotFoundException(id);
}
}
public async Task RemoveIfMatch(Guid id, string token, CancellationToken cancellationToken = default)
{
var idText = id.ToString("D");
var requestOptions = new ItemRequestOptions { IfMatchEtag = token };
try
{
_ = await container.DeleteItemAsync<TDbEntity>(idText, new PartitionKey(idText), requestOptions: requestOptions, cancellationToken: cancellationToken);
}
catch (CosmosException ex) when (ex.StatusCode == HttpStatusCode.PreconditionFailed)
{
throw new ConcurrencyException();
}
catch (CosmosException ex) when(ex.StatusCode == HttpStatusCode.NotFound)
{
throw new EntityNotFoundException(id);
}
}
}
</code></pre>
<p>As a reminder, a "concurrency token" is synonymous with an "etag" in the context of the repository.</p>
<p>The persistence needs of an application are independent of a domain entity, so the domain entity is isolated from web/database identifiers, concurrency tokens, HTTP, etags, etc. So, the repository needs to translate from a domain object to the serialization object, which is performed mostly by an <code>ITranslator<TDomain, TData></code> implementation but also with the assignment of the identifier to the serialization object. To keep the non-domain details isolated from the domain object, I've used the <code>ConditionalWeakTable<TKey, TValue></code> type to associate database persistence details (ID and etag/concurrency token, as abstracted by <code>EntityContext</code>) to the object without too much management logic.</p>
<blockquote class="blockquote">
<p><code>ConditionalWeakTable</code> is like a dictionary that associates a value with another object. It differs from a traditional dictionary in that when the key is no longer referenced, the associated value is freed/destroyed. This allows us to get associated data with minimal memory impact easily.</p>
</blockquote>
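<p>A minimal sketch of that association technique (the <code>Order</code> and <code>PersistenceInfo</code> names here are hypothetical, not types from the repository above):</p>
<pre><code class="language-csharp">using System;
using System.Runtime.CompilerServices;

// Hypothetical domain class; note it carries no persistence details.
public sealed class Order
{
    public string Description { get; set; } = "";
}

// Hypothetical persistence metadata we want associated with an Order
// without adding properties to the domain class.
public sealed class PersistenceInfo
{
    public PersistenceInfo(Guid id, string etag) { Id = id; ETag = etag; }
    public Guid Id { get; }
    public string ETag { get; }
}

public static class PersistenceInfoTable
{
    // Entries are released automatically once the Order key is collected.
    private static readonly ConditionalWeakTable<Order, PersistenceInfo> table = new();

    public static void Attach(Order order, PersistenceInfo info) => table.Add(order, info);

    public static bool TryGet(Order order, out PersistenceInfo? info) => table.TryGetValue(order, out info);
}
</code></pre>
<p>The repository above does the same thing: after each read or write it adds an <code>EntityContext</code> keyed by the domain entity, which can be looked up later without the domain entity ever exposing an ID or etag property.</p>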
<p>An implementation of the repository now just requires the type to use for the database serialization class, the domain entity type, and how to assign an identifier to the Azure Cosmos <code>id</code> property:</p>
<pre><code class="language-csharp">public sealed class CosmosAppointmentRequestRepository : CosmosOptimisticallyConcurrentRepository<AppointmentRequest, AppointmentRequestEntity>
{
public CosmosAppointmentRequestRepository(Container container, ITranslator<AppointmentRequest, AppointmentRequestEntity> appointmentRequestEntityTranslator)
: base(container, appointmentRequestEntityTranslator, (entity, guid) => entity.Id = guid)
{
}
}
</code></pre>
<p>The only remaining part is the implementation of the application/database collaboration, the <em>application service</em>:</p>
<pre><code class="language-csharp">public class AppointmentRequestService
{
private readonly AppointmentRequestDtoTranslator appointmentRequestDtoTranslator;
private readonly IOptimisticallyConcurrentRepository<AppointmentRequest> repository;
public AppointmentRequestService(AppointmentRequestDtoTranslator appointmentRequestDtoTranslator, IOptimisticallyConcurrentRepository<AppointmentRequest> repository)
{
this.appointmentRequestDtoTranslator = appointmentRequestDtoTranslator;
this.repository = repository;
}
public async Task<(Guid, string)> CreateRequest(AppointmentRequestDto appointmentRequest, CancellationToken cancellationToken = default)
{
var entity = appointmentRequestDtoTranslator.AppointmentRequestDtoToAppointmentRequest(appointmentRequest);
var guid = await repository.Add(entity, cancellationToken);
return (guid, repository.GetConcurrencyToken(entity));
}
public async Task<(AppointmentRequestDto, string)> GetRequest(Guid id, CancellationToken cancellationToken = default)
{
var appointmentRequest = await repository.Get(id, cancellationToken);
return (appointmentRequestDtoTranslator.AppointmentRequestToAppointmentRequestDto(appointmentRequest), repository.GetConcurrencyToken(appointmentRequest));
}
public Task<(AppointmentRequestDto, string)> GetRequest(Guid id, string etag, CancellationToken _ = default)
{
if(repository.TryGetIfModified(id, etag, out var appointmentRequest))
{
return Task.FromResult((appointmentRequestDtoTranslator.AppointmentRequestToAppointmentRequestDto(appointmentRequest!), repository.GetConcurrencyToken(appointmentRequest!)));
}
throw new ConcurrencyException();
}
public async IAsyncEnumerable<(AppointmentRequestDto, Guid, string)> GetRequests([EnumeratorCancellation] CancellationToken cancellationToken = default)
{
var result = repository.Get(cancellationToken);
await foreach (var item in result.WithCancellation(cancellationToken))
{
yield return (appointmentRequestDtoTranslator.AppointmentRequestToAppointmentRequestDto(item), repository.GetId(item),
repository.GetConcurrencyToken(item));
}
}
public async Task RemoveRequest(Guid id, CancellationToken cancellationToken = default)
{
await repository.Remove(id, cancellationToken);
}
public async Task RemoveRequest(Guid id, string etag, CancellationToken cancellationToken = default)
{
await repository.RemoveIfMatch(id, etag, cancellationToken);
}
internal async Task<string> ReplaceRequest(Guid id, AppointmentRequestDto appointmentRequest,
CancellationToken cancellationToken = default)
{
var entity = appointmentRequestDtoTranslator.AppointmentRequestDtoToAppointmentRequest(appointmentRequest);
await repository.Replace(id, entity, cancellationToken);
return repository.GetConcurrencyToken(entity);
}
internal async Task<string> ReplaceRequest(Guid id, AppointmentRequestDto appointmentRequest, string etag,
CancellationToken cancellationToken = default)
{
var entity = appointmentRequestDtoTranslator.AppointmentRequestDtoToAppointmentRequest(appointmentRequest);
await repository.ReplaceIfMatch(id, entity, etag, cancellationToken);
return repository.GetConcurrencyToken(entity);
}
public async Task<(AppointmentRequestDto, string)> UpdateRequest(Guid id, JsonPatchDocument<AppointmentRequestDto> patchDocument,
CancellationToken cancellationToken = default)
{
var current = await repository.Get(id, cancellationToken);
var currentDto = appointmentRequestDtoTranslator.AppointmentRequestToAppointmentRequestDto(current);
patchDocument.ApplyTo(currentDto);
await repository.Replace(id, appointmentRequestDtoTranslator.AppointmentRequestDtoToAppointmentRequest(currentDto), cancellationToken);
return (currentDto, repository.GetConcurrencyToken(current));
}
public async Task<(AppointmentRequestDto, string)> UpdateRequest(Guid id, JsonPatchDocument<AppointmentRequestDto> patchDocument,
string etag, CancellationToken cancellationToken = default)
{
var current = await repository.Get(id, cancellationToken);
var currentDto = appointmentRequestDtoTranslator.AppointmentRequestToAppointmentRequestDto(current);
patchDocument.ApplyTo(currentDto);
await repository.ReplaceIfMatch(id, appointmentRequestDtoTranslator.AppointmentRequestDtoToAppointmentRequest(currentDto), etag, cancellationToken);
return (currentDto, repository.GetConcurrencyToken(current));
}
}
</code></pre>
<p><code>AppointmentRequestService</code> contains the interaction logic specific to the application and the repository.</p>
<p>Dealing with translation to and from DTO, domain, and serialization classes is made less of a chore with tools like <a href="https://github.com/riok/mapperly">Mapperly</a>, which generates translation code based on property names. Creating a translator between two types is as easy as creating a partial class marked with a <code>MapperAttribute</code> attribute, containing partial methods that take one type as a parameter and return the other:</p>
<pre><code class="language-csharp">[Mapper]
public partial class AppointmentRequestDtoTranslator
{
public partial AppointmentRequest AppointmentRequestDtoToAppointmentRequest(AppointmentRequestDto dto);
public partial AppointmentRequestDto AppointmentRequestToAppointmentRequestDto(AppointmentRequest entity);
}
</code></pre>
<p><code>AppointmentRequestDtoTranslator</code> translates <code>AppointmentRequestDto</code> instances to/from <code>AppointmentRequest</code> domain entity instances. And to translate to/from <code>AppointmentRequestEntity</code>:</p>
<pre><code class="language-csharp">[Mapper]
public partial class AppointmentRequestEntityTranslator : ITranslator<AppointmentRequest, AppointmentRequestEntity>
{
[MapperIgnoreSource(nameof(AppointmentRequestEntity.Id))]
[MapperIgnoreSource(nameof(AppointmentRequestEntity.ResourceId))]
[MapperIgnoreSource(nameof(AppointmentRequestEntity.ETag))]
[MapperIgnoreSource(nameof(AppointmentRequestEntity.SelfUri))]
[MapperIgnoreSource(nameof(AppointmentRequestEntity.TimestampText))]
public partial AppointmentRequest ToDomain(AppointmentRequestEntity data);
[MapperIgnoreTarget(nameof(AppointmentRequestEntity.Id))]
[MapperIgnoreTarget(nameof(AppointmentRequestEntity.ResourceId))]
[MapperIgnoreTarget(nameof(AppointmentRequestEntity.ETag))]
[MapperIgnoreTarget(nameof(AppointmentRequestEntity.SelfUri))]
[MapperIgnoreTarget(nameof(AppointmentRequestEntity.TimestampText))]
public partial AppointmentRequestEntity ToData(AppointmentRequest domain);
}
</code></pre>
<p>Since <code>AppointmentRequestEntity</code> has some Azure Cosmos implementation details, we use Mapperly's <code>MapperIgnoreTargetAttribute</code> and <code>MapperIgnoreSourceAttribute</code> to tell <a href="https://github.com/riok/mapperly">Mapperly</a> that not all properties need translation.</p>
<p>Dealing with concurrency issues and implementing concurrency control can be intimidating. In this post, I've tried to make it less intimidating by showing an example implementation with ASP.NET Core and Azure Cosmos DB. Additionally, the Domain-Driven Design patterns Repository and Application Service are used to isolate ETag implementation details from the Web API and delegate them to Azure Cosmos DB.</p>
<blockquote class="blockquote">
<p>There are multiple ways of implementing optimistic concurrency; HTTP ETags are but one way. If you can't abide by the expectations set out by the HTTP standards, don't use ETags; nothing forces you to use HTTP precondition header fields. But remember, the means exist in HTTP, and embracing them promotes interoperability and reliability (implementing something different from a mechanism introduced at least 26 years ago fails to recognize the huge amount of validation and verification that has gone into making it correct.)</p>
</blockquote>
<p>In a future post, I will show an example of a repository implementation that uses Entity Framework and its expectations for concurrency tokens.</p>
<p><img src="../assets/farside-etags-in-asp-dot-net.jpg" class="img-fluid" alt="lots of things going on at the same time, in the style of Farside"></p>http://blog.peterritchie.com/posts/http-and-etag-header-fieldsHTTP and ETag Header Fields2023-06-15T00:00:00Z<p><img src="../assets/accidental-overwrite.jpg" class="img-fluid" alt="A stuffed tiger corrupted appearance due to accidental overwrite" /></p>
<p>Update: corrected mention of <code>412</code> in the context of <code>GET</code> and <code>If-Modified-Since</code> to <code>304</code>.</p>
<p>Over the last four-plus years, I have been almost exclusively working on some sort of *-as-a-Service (*aaS)—for example, Mortgage Origination as a Service, Insurance Claims as a Service. I always see a couple of things when implementing Web (HTTP) services: the <em>reinvention of the wheel</em> and recognizing the problem ETags solves after publishing a specification (sometimes both).</p>
<p>With *aaS as a Web API, the intention is to have multiple API clients providing access to representations of shared resources. Early in projects like this, there is an initial (single) client, so the chances of a client having a stale resource representation are slim. When another client starts to use the API and an update gets accidentally overwritten, things get needlessly complicated.</p>
<p>I've seen teams address this problem in a number of ways, often involving a date-time stamp. With multiple clients on an API, scalability is an issue, and a date-time stamp can mean different things to different servers (as we'll see below). You need a single authority for a resource's last modified date-time to avoid exchanging one problem for another. See <a href="https://datatracker.ietf.org/doc/html/rfc7232#section-2.2.2">Last-Modified/Comparison</a> for more details.</p>
<p>The creators of HTTP encountered this issue and added features to HTTP to deal with it (I assume that's why they added these features). I don't know when these features were devised, but they were proposed in 1997, so they've been in the wild for at least 25 years with the entire web as a test bed. Many brilliant people either created or scrutinized the solution; i.e., it's a wheel.</p>
<p>The HTTP features are <em>ETags</em> and <em>conditional requests</em> and enable <em>optimistic concurrency</em>.</p>
<h2 id="etags">ETags</h2>
<p>An <a href="https://datatracker.ietf.org/doc/html/rfc7232#section-2.3">ETag</a> (AKA entity-tag) addresses the "lost update" problem: two clients of an API have each received the representation of a version of an entity, and one client updates the entity before the other; the second update causes the first to be "lost." See the following diagram for a visualization:</p>
<p><img src="./assets/lost-update-sequence.png" class="img-fluid" alt="Lost-update" /></p>
<p>An ETag addresses accidental overwrite by <em>versioning</em> the resource with an entity-tag (a hash of the representation, a version, etc.). When a client requests a resource, the server may include an ETag validator header field with an entity-tag value in the response. The URI of the resource, along with that entity-tag, constitutes an identifier for a particular version of an entity.</p>
<p>When a client requests a change to the entity, it includes the entity-tag as a basis version with a conditional header field (like <code>If-Match</code>.) If the entity has changed since then (the entity-tag no longer matches), the server responds with <code>412 (Precondition Failed)</code>, and the client can retrieve the latest version, re-apply its change, and re-send. See the following diagram for a visualization:</p>
<p><img src="./assets/lost-update-solution-sequence.png" class="img-fluid" alt="Lost-update solution" /></p>
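<p>In C#, for example, a client can attach the entity-tag to a state-changing request via <code>HttpRequestMessage.Headers.IfMatch</code>. This is a sketch (the method and names are illustrative); note that <code>EntityTagHeaderValue</code> expects the quoted form of the tag:</p>
<pre><code class="language-csharp">using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;

public static class ConditionalRequests
{
    // Build a PUT whose precondition is "only apply this change if the
    // resource still matches the entity-tag received earlier".
    public static HttpRequestMessage BuildConditionalPut(Uri uri, string json, string entityTag)
    {
        var request = new HttpRequestMessage(HttpMethod.Put, uri)
        {
            Content = new StringContent(json, Encoding.UTF8, "application/json")
        };
        // entityTag must be quoted, e.g. "\"abc123\"".
        request.Headers.IfMatch.Add(new EntityTagHeaderValue(entityTag));
        return request;
    }
}
</code></pre>
<p>If the server's current entity-tag differs, it responds with <code>412 (Precondition Failed)</code> instead of applying the change.</p>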
<h2 id="falling-back-to-date-and-time">Falling back to date and time</h2>
<p>Even if you use date and time, <strong>HTTP also covers you with other precondition header fields involving modification date</strong>. The <code>If-Unmodified-Since</code> and <code>If-Modified-Since</code> precondition header fields allow you to pass modification date preconditions to make a request conditional. When the precondition isn't met, a <code>412 (Precondition Failed)</code> status code will be in the response, or for <code>GET</code> or <code>HEAD</code>, a <code>304 (Not Modified)</code> status code will be in the response.</p>
<p>The initial GET of a resource that supports modification dates in conditional requests will include a <code>Last-Modified</code> header field validator. The <code>Last-Modified</code> validator is in the form of an <a href="https://datatracker.ietf.org/doc/html/rfc7231#section-7.1.1.1">HTTP-date</a>.</p>
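<p>A conditional <code>GET</code> can carry the modification date the same way; in C#, <code>HttpRequestMessage.Headers.IfModifiedSince</code> serializes as an HTTP-date (a sketch; the URI is illustrative):</p>
<pre><code class="language-csharp">using System;
using System.Net.Http;

public static class CacheValidation
{
    // Build a GET that asks for a representation only if the resource has
    // changed since the Last-Modified value seen previously.
    public static HttpRequestMessage BuildConditionalGet(Uri uri, DateTimeOffset lastModified)
    {
        var request = new HttpRequestMessage(HttpMethod.Get, uri);
        // Serialized on the wire as an HTTP-date, e.g. "Thu, 15 Jun 2023 00:00:00 GMT".
        request.Headers.IfModifiedSince = lastModified;
        return request;
    }
}
</code></pre>
<p>An unchanged resource then yields <code>304 (Not Modified)</code> with no body, avoiding re-transfer.</p>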
<h2 id="being-successful">Being Successful</h2>
<p>RFC 7232, the HTTP/1.1 <em>Conditional Requests</em> specification, describes the <em>entity-tag</em> in section 2.3:</p>
<blockquote class="blockquote">
<p>An entity-tag is an opaque validator for differentiating between multiple representations of the same resource, regardless of whether those multiple representations are due to resource state changes over time, content negotiation resulting in multiple representations being valid at the same time, or both.</p>
</blockquote>
<p>This means that the ETag value depends on the content-type, so two <strong>different representations of the same resource should have different ETag values</strong> (e.g., one gzip encoded, one not.)</p>
<p>This also means that the ETag value is opaque to requestors, but it does point out that one of the intents of ETags is to be <strong>an alternative to date-time stamps, which lack accuracy</strong>.</p>
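<p>One simple way to generate a strong entity-tag that varies with the representation is to hash the exact bytes being sent. This is a sketch of one option (it assumes .NET 5 or later for <code>SHA256.HashData</code> and <code>Convert.ToHexString</code>); a stored version number works just as well:</p>
<pre><code class="language-csharp">using System;
using System.Security.Cryptography;
using System.Text;

public static class EntityTagGenerator
{
    // A strong entity-tag: a quoted string derived from the exact bytes of
    // the representation, so differently-encoded representations get different tags.
    public static string Compute(byte[] representationBytes)
    {
        var hash = SHA256.HashData(representationBytes);
        return "\"" + Convert.ToHexString(hash) + "\"";
    }

    public static string Compute(string representation) =>
        Compute(Encoding.UTF8.GetBytes(representation));
}
</code></pre>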
<h3 id="patch">PATCH</h3>
<p>Using <code>PATCH</code> with something like <a href="https://jsonpatch.com/">JSONPatch</a> may seem to help alleviate conflicts by providing more granularity in what is changing. While technically true, implementing this would be non-trivial. The ETag specifies a tag of that edition of the entire resource, not any one field. Comparing a change against a delta between two editions of a resource (keeping in mind those editions may not be adjacent) might be one technique for dealing with that, but <strong>creating deltas between arbitrary versions of the same resource is non-trivial</strong>. You could introduce that sort of thing; something like event-sourcing might enable it. But remember that there may be interdependencies between properties of a resource, and just because the current request changes a property that hasn't changed since the resource was retrieved doesn't mean there isn't still a conflict.</p>
<h3 id="last-modified">Last-Modified</h3>
<p>Remember that <code>Last-Modified</code> uses <a href="https://datatracker.ietf.org/doc/html/rfc7231#section-7.1.1.1">HTTP-date</a> format, so <strong><code>Last-Modified</code> only supports second granularity</strong>. With multiple origin servers, granularity finer than one second may be needed to be accurate 100% of the time.</p>
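<p>The granularity limit is easy to demonstrate: two timestamps that differ only by milliseconds serialize to the same HTTP-date (in C#, the <code>"r"</code> standard format specifier produces the RFC 1123 format that HTTP-date uses):</p>
<pre><code class="language-csharp">using System;

public static class HttpDate
{
    // "r" (RFC 1123) has no sub-second component; the value must be
    // converted to UTC first because the specifier doesn't convert it.
    public static string Format(DateTimeOffset value) =>
        value.ToUniversalTime().ToString("r");
}
</code></pre>
<p>For example, <code>12:00:00.100</code> and <code>12:00:00.900</code> on the same day both format as <code>... 12:00:00 GMT</code>.</p>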
<h4 id="if-unmodified-since">If-Unmodified-Since</h4>
<p><code>If-Unmodified-Since</code> is used with state-changing methods like PUT, POST, DELETE, and PATCH to avoid accidental overrides (lost updates). <code>If-Unmodified-Since</code> imposes the precondition <em>update this entity only if it hasn't changed since the provided date-time</em>. <strong>Use <code>If-Unmodified-Since</code> to avoid lost update problems when second granularity is not a problem</strong>.</p>
<h4 id="if-modified-since">If-Modified-Since</h4>
<p>When used with <code>GET</code> or <code>HEAD</code>, the <code>If-Modified-Since</code> header field imposes the precondition <em>respond with <code>304 (Not Modified)</code> and not with an entity representation if the modification date of the identified resource is not more recent than the date provided</em>. <strong>Use <code>If-Modified-Since</code> to avoid re-transferring the same data</strong>.</p>
<h3 id="conflict"><code>409 (Conflict)</code></h3>
<p><code>409 (Conflict)</code> may sound like an appropriate response to a conditional PUT/POST/PATCH request, except that <code>412 (Precondition Failed)</code> is expected. <strong>Response status code <a href="https://datatracker.ietf.org/doc/html/rfc7231#section-6.5.8"><code>409</code></a> should be used when something about the current state of the resource means that the server cannot change it</strong>. Also, if you have chosen not to use HTTP precondition features and have included something <em>in the representation of the entity</em> for versioning (like last-modified-date, see above), then <code>409 (Conflict)</code> is appropriate to signify a potential accidental overwrite or lost update.</p>
<h3 id="leveraging-existing-implementations">Leveraging Existing Implementations</h3>
<p>Azure Cosmos DB implements ETags and <code>Last-Modified</code>, which can be leveraged to support the versioning of resources in your Web API. Technically, the ETag is a version of the representation that Cosmos DB provides, so consider generating a new ETag based on what Cosmos DB provides, especially if you support more than one content-type (like XML). If you have the concept of a database DTO or database models different from your MVC models, you should consider custom entity-tag generation based on the Cosmos-supplied entity-tag.</p>
<p>To leverage the Cosmos-supplied entity-tag, retain it and re-send it in any state-changing requests to Cosmos in the <code>If-Match</code> header field. If the entity-tags do not match, Cosmos DB will respond with <code>412</code>, and the Cosmos DB library will throw a <code>CosmosException</code> with <code>StatusCode</code> == <code>HttpStatusCode.PreconditionFailed</code>.</p>
<!--
title Lost Update Problem
participant "Client 1" as Client1
participant "Client 2" as Client2
participant API
Client1->API:""GET /resource/123""
activate Client1
Client1<--API:""200 OK""\n//resource v1 representation//
create Client2
Client2->API:""GET /resource/123""
activate Client2
Client2<--API:""200 OK""\n//resource v1 representation//
Client1->API:""PUT /resource/123""
Client1<--API:""200 OK""\n//resource v2 representation//
deactivateafter Client1
destroyafter Client1
Client2-#red>API:""PUT /resource/123""
note over Client1,API#pink:Client 2 is updating the resource based from **v1**, not **v2**:\n<align:center>the v2 update is "lost" to //Client 2//</align>
Client2<--API:""200 OK""\n//resource v3 representation//
-->
<!--
title Lost Update Solution
participant "Client 1" as Client1
participant "Client 2" as Client2
participant API
Client1->API:""GET /resource/123""
activate Client1
Client1<--API:""200 OK""\n//resource v1 representation//
create Client2
Client2->API:""GET /resource/123""
activate Client2
Client2<--API:""200 OK""\n//resource v1 representation//
Client1->API:""PUT /resource/123\nIf-Match: v1""
Client1<--API:""200 OK""\n//resource v2 representation//
deactivateafter Client1
destroyafter Client1
Client2->API:""PUT /resource/123\nIf-Match: v1""
Client2<--API:<color:#red>""412 Precondition Failed\nBasis version of resource is out of date""</color>
-->
<h2 id="references">References</h2>
<p><img src="../assets/accidental-overwrite.jpg" class="img-fluid" alt="A stuffed tiger corrupted appearance due to accidental overwrite"></p>http://blog.peterritchie.com/posts/Being-Successful-With-Domain-Driven-Design--Minimal-Complexity-Part-3Being Successful with Domain-Driven Design: Minimal Complexity, Part 32023-06-12T00:00:00Z<p><img src="/assets/complex-relationships.jpg" class="img-fluid" alt="complex-relationships" /></p>
<p>With a name like "Domain-Driven Design," it should be no surprise that there is a major focus on the domain and that it has a huge influence on implementation. We've focused mostly on strategic design patterns and practices like Ubiquitous Language, Bounded Context, etc. But I've also covered a bit of tactical design and implementation. I've transitioned from strategic patterns--that deal with being explicit with domain concepts (Ubiquitous Language, Bounded Contexts)--to tactical patterns that focus on directly translating domain concepts into code structure or coding patterns (like Services and Aggregates.)</p>
<p>The concepts and their consistency boundaries are only a couple of things that contribute to the complexity of non-trivial domains. For example, the work required to implement a domain is independent of its concepts and consistency boundaries. Additionally, the system's quality attributes and technical constraints are major influencers on the internal structure of that system. The next set of Domain-Driven Design patterns I'll get into (<em>tactical</em> patterns) aid in this respect. As we get closer to implementation, the focus turns more towards isolating domain complexity from implementation complexities.</p>
<p>As with many things in Domain-Driven Design, <em>x</em> for the sake of <em>x</em> is not the intention. Many things in most methodologies can be regurgitated and used by rote, providing little to no value. The principles and practices in Domain-Driven Design are best utilized with purpose and intent. Architectural layering is a good example. Each layer needs a reason for being (a purpose) with unidirectional independence of its concepts from another layer's concepts. Just any two groupings won't do; without the purposeful intent of having two layers with a unidirectional dependency, you'll never gain the benefits of layering. You end up with the added burden of managing a structure that does not give you any layered benefits.</p>
<p>The minimum complexity for layers is two groups of concepts (contexts) where one depends uni-directionally on the other. In Domain-Driven Design, those layers focus on isolating the concerns of a User Interface, Application, Domain, and Infrastructure.</p>
<p>The Application layer may seem unique to Domain-Driven Design; few patterns/methodologies isolate the concern that the Application layer deals with. Ports and Adapters (Hexagonal) and, by extension, Clean Architecture recognize and isolate high-level <em>use cases</em> from both the domain and the implementation details of a UI. This is the role of the Application layer in Domain-Driven Design: to further isolate the domain from how any use case uses the domain (a use case <em>applies</em> or <em>realizes</em> the domain). In Ports and Adapters, use cases (as Cockburn describes them) are sequences of interactions between the system and users/actors. With the recognition of these interactions, they can now be isolated as collaborations within the Application layer.</p>
<p>There are other patterns that isolate interaction behavior structurally within collaborations like the Adapter pattern, but I'll save that for another day.</p>
<p>A UI may have to deal with different form factors, communication protocols, execution contexts, etc. A loosely coupled UI involves designing an interface that takes all of those things into consideration to be successful. A web-based UI requires a backend that supports open protocols and standards. Protocols and standards relating to implementation or delivery are merely constraints on how a system is implemented. At some level, the domain needs to operate correctly regardless of those constraints.</p>
<p>Layers are like different team roles, all working together simultaneously to accomplish specific types of goals. Bounded contexts are also like multiple teams, sometimes like a night shift and a day shift or an on-shore and an off-shore team. These types of teams work with some level of independence: shifts may never work together simultaneously, and on- and off-shore teams only work together for a brief time with much more structured communication.</p>
<p>Recognizing and planning for how teams contribute to the same goals is key for these teams to be effective. It's the recognition that different parts of a larger system need different levels of independence. With teams, this is to utilize resources effectively: like how shifts can use limited resources (human skills) across more of the day (e.g., 24 hours instead of 8.) Conway's law is just an observation (i.e., a reality). A team (or teams) structure imposes a means and cadence of communication. How often and in what way inter-team communication occurs implicitly limits that communication.</p>
<p>Recognizing and working with that communications structure can make teams much more successful. Domain-Driven Design makes domains and sub-domains first-class citizens within the practices. Many aspects of architectural and social boundaries can affect the release of a product. A Bounded Context is more than a consistency boundary or scope of a domain model. A Bounded Context also involves work products (deployments, deliverables) and team organization.</p>
<p>For example, the consistency boundary of a mortgage loan application becoming complete and submitted is a fairly obvious boundary and context. Still, the amount of work involved to support that might be fairly large. The number of people implementing and supporting that context might amount to several teams. The complexity of dealing with several teams of people to deliver parts of the same system can be enormous. Domain-Driven Design also gives us some patterns and practices to address those complexities. You may need to split a domain into more bounded contexts because one context is too complex for a single team to manage. When we start to talk about separating work across teams, we're still talking about bounded contexts. For similar reasons, you may need to split a domain into more bounded contexts (and thus "sub-domains") simply because of an existing team or reporting structure.</p>
<p>For delivery to be more successful, it's important to recognize the different teams, reporting structures, team motivations, and missions within the strategic design of the Bounded Contexts. How two teams and how the work product of those two teams interact is unique. Fortunately, there are some patterns to address the dependencies between two teams and their work products that help us address their inherent complexities. I'm assuming there's always some degree of interdependency and independence, and I'm ignoring mutually independent (Separate Ways) and Big Ball of Mud relationships/structures.</p>
<blockquote class="blockquote">
<p>It's worth noting that as soon as two contexts are recognized, the need to translate between the two becomes a reality. As contexts become complex to the point of being bounded contexts, so too does the need to recognize and isolate translation. Much of what we do in Domain-Driven Design is the isolation of concepts, concerns, responsibilities, etc. The need for a translation layer is no different.</p>
</blockquote>
<p>There is a spectrum to the degree of independence of two teams and/or the independence of their work products. At one end of the spectrum, the teams are very dependent; at the other, extremely independent. With Domain-Driven Design, very dependent teams exhibit a lot of domain overlap. With a lot of domain overlap, you can have an interdependence where teams work as equals or partners. This partnership can manifest in an early re-org of people working on existing or legacy systems. That partnership may start with different teams working on separate parts of the codebase. This partnership may only be one step in the evolution of the teams; the next step is often to organize teams toward the Shared Kernel model.</p>
<blockquote class="blockquote">
<p>A spectrum of options is a synonym for <em>infinite combinations</em>. It's nice to have flexibility, but an endless set of possibilities is hard to map to a finite set of patterns, and it's hard to use established practices if every situation is novel. There are some ideas and structures that Domain-Driven Design details to add some granularity to the domain we're modeling so that we can more easily map complexity to the patterns and practices that address them.</p>
</blockquote>
<p>In the Shared Kernel model, the team carves off a separate shared codebase or a shared component to contain all the things that two or more teams will always or almost always mutually require. A Shared Kernel model involves organizational behavior, like specific responsibilities, code areas, accountability, etc. But Shared Kernel is a fairly casual relationship. With more formality between two teams or components, you usually see an Upstream/Downstream relationship form. It is easy to view the users of the Shared Kernel as downstream dependents and evolve to a more formal Upstream/Downstream relationship. A Customer/Supplier model may emerge in cases like this.</p>
<blockquote class="blockquote">
<p>A shared codebase is more casual than a shared component, but a shared component promotes more independence. At the component level, it's important to ensure that autonomy hasn't allowed the teams to deviate from a shared plot, which Continuous Integration is intended to address. The component should be integrated with client code at every opportunity. The intent of a Published Language is for all contexts to be on the same page in understanding that domain. It's not that all contexts will adopt the published language as their domain, but they know how to translate in and out of their domain.</p>
</blockquote>
<p>In the Customer/Supplier model, one team owns a component the other uses as the consumer of the component's capabilities. The team that owns the component is the Supplier, and the team that uses it is the Customer. With this model comes organizational behavior with more specific responsibilities, more planning, and scheduling. With this increased independence, the supplier team has very specific goals in which the customer team has a stake and influence, represented in a release cadence and a roadmap. The Customer is usually the driver of what capabilities the component provides next. The integration model of Customer/Supplier is usually a web service.</p>
<blockquote class="blockquote">
<p>In an Upstream/Downstream model, there will almost always be some form of Published Language--usually more formal than just a description, often a specification. Translation becomes more formal in an Upstream/Downstream model, often resulting in a translation layer. If the Upstream/Downstream relationship is between two Bounded Contexts with a high degree of independence, an Anticorruption Layer is used on one side to manage the differences when communicating between the two domains.</p>
</blockquote>
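<p>As an illustrative sketch (all names and status codes invented for the example), an Anticorruption Layer is essentially a translation boundary: downstream code calls the layer, and only the layer knows the shape of the upstream model:</p>

```python
# Upstream (supplier) model -- shaped by the other team's context.
# This dict stands in for whatever the upstream service returns.
upstream_record = {
    "loanId": "A-42",
    "status_cd": 3,              # upstream's numeric status codes
    "principal_cents": 1_250_000,
}


class ServicedLoan:
    """Downstream domain model, expressed in this context's language."""

    def __init__(self, loan_id, approved, principal):
        self.loan_id = loan_id
        self.approved = approved
        self.principal = principal  # dollars, not cents


class LoanAnticorruptionLayer:
    """Translates the upstream representation into the downstream
    domain model, so upstream concepts never leak across the boundary."""

    _APPROVED_CODE = 3  # assumption: upstream's code for "approved"

    def to_domain(self, record):
        return ServicedLoan(
            loan_id=record["loanId"],
            approved=record["status_cd"] == self._APPROVED_CODE,
            principal=record["principal_cents"] / 100,
        )


loan = LoanAnticorruptionLayer().to_domain(upstream_record)
print(loan.loan_id, loan.approved, loan.principal)  # A-42 True 12500.0
```

<p>The point is the isolation: if the upstream team renames <code>status_cd</code> or changes its codes, only the layer changes; the downstream domain model is untouched.</p>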
<p>Shared Kernel and two-team Customer/Supplier relationships often exist because of an existing reporting structure, or because a reporting structure was created to split work across two teams. In a more product-focused organization, you may have a Customer/Supplier model with more than one customer. Multi-customer relationships can be witnessed in larger organizations with things like shared libraries. The customers are still driving the capabilities that the component and team provide them, but it can become more formal to manage the unique requirements of different customers. The supplier team is often more organized or formal and may have more of a product strategy with a product vision and mission that helps guide their work.</p>
<blockquote class="blockquote">
<p>With more formal relationships come more formal expectations. Those expectations may come in the form of specifications and processes. Continuous Integration is an example of a process that continuously validates integrability. Potentially less formal than a specification may be a Published Language--which in its simplest form is a description of the concepts of a domain. (more complex forms would be varying degrees of specifications.)</p>
</blockquote>
<p>Communications with a customer/supplier model within the same organization can be informal. What the team is working on and how they interact with customers might be more like partnerships; teams may work closely together to implement and integrate components. The number of customers or distance from customers can impact this informality. The further away a customer is (different division, different organization, different company), the more formality may be imposed on the relationship. The work product of the supplier team may be viewed more like a product. And while customers may drive that product, it may be much more formal to the point where the component is independent of any single customer. This type of relationship may be structured more like a service with a very specific or well-specified interface. Moving towards a well-specified interface is the intent of an Open Host Service, where the component is remotely accessed (a service) with a specified protocol and interface.</p>
<p>As a customer has less influence on a service, they may become completely dependent on the supplier team to provide the capabilities they require. They accept the risk that the supplier team may not provide the necessary capabilities in the future. This is extreme; in practice, either no organization would accept this risk, or the relationship is temporary. In reality, the different models aren't mutually exclusive, but there is a tendency towards one of them. E.g., a relationship tends to be less like a customer/supplier relationship and more like one context completely conforming to another. The recognition of this relationship is called the Conformist model.</p>
<blockquote class="blockquote">
<p>Recognize change will happen, but don't try to create a design that accommodates all change.</p>
</blockquote>
<p><img src="/assets/complex-relationships.jpg" class="img-fluid" alt="complex-relationships"></p>http://blog.peterritchie.com/posts/Being-Successful-With-Domain-Driven-Design--Minimal-Complexity-Part-2Being Successful with Domain-Driven Design: Minimal Complexity, Part 22023-05-29T00:00:00Z<p>In part one, I talked about the complexity in the language used to communicate the domain. Domain-Driven Design (DDD) deals with that complexity by isolating the concepts in a clear language that domain experts understand. Ubiquitous Language helps form the basis of all the other patterns and practices in Domain-Driven Design through the clear isolation of domain <em>concepts</em>. The DDD pattern language context map provides a good example of isolating concepts (in this case, Domain-Driven Design concepts):</p>
<p><img src="/assets/ddd-pattern-language.png" class="img-fluid" alt="ddd-pattern-language" /></p>
<p>Isolating individual concepts, naming them, and detailing how they relate allows each to be thought about independently. We can focus on parts of "Domain-Driven Design" because a Context Map details that isolation.</p>
<p>I'll dig deeper into the Aggregate and Service patterns in this part two. Aggregate and Service enable and embody major domain concepts.</p>
<p>An aggregate is the realization of a logical consistency boundary and the operations contributing to that consistency. An aggregate is a composition of several domain objects: At least one entity object and usually several value objects. Each domain object maintains its own consistency (it has invariants.) A date is a composition of a day, month, and year, but February 31, 1981 is not a valid date. We differentiate an aggregate from any grouping of domain objects because of the invariants and rules beyond that of simply a collection of consistent domain objects. An aggregate models that cross-object consistency requirement.</p>
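<p>Most standard libraries already treat a date as exactly this kind of self-consistent composition; Python's <code>datetime.date</code>, for example, rejects the inconsistent state at construction:</p>

```python
from datetime import date

# A date is a composition of year, month, and day, but not every
# combination is consistent; the value object enforces its own
# invariants at construction time.
valid = date(1981, 2, 28)   # consistent: February 1981 has 28 days

try:
    date(1981, 2, 31)       # February 31, 1981 violates the invariant
except ValueError as error:
    print(f"rejected: {error}")
```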
<p>Aggregates map to major domain concepts and significant domain behavior associated with a particular domain object (the root). The root is the object of behavior and acts as a gateway to the other objects the aggregate comprises. The root takes on the responsibility of maintaining the consistency of the entire aggregate. The complexity of that composition and the consistency are separated from complexities outside the natural boundary of the aggregate.</p>
<p>An aggregate is not a design choice; it is a natural role that a major domain entity plays in the domain that requires recognition in a domain model. Some domain entities naturally take on certain behavior that affects other domain objects. All the behavior must happen in certain ways and have expectedly consistent results. Those consistent results (or state) abide by rules and invariants--it's consistent <em>because</em>... With a loan application, for example, there's no such thing as having negative assets (particularly: it's not a consistent loan application when it contains assets with negative value.) If you <em>owe</em> money, that's a liability.</p>
<p>When understanding and modeling a domain, I like to accurately map behavior to domain concepts that logically have that (or any) behavior. Sometimes it's easy to mis-associate behavior with static concepts when those are the major concepts in the domain. A loan, for example, is a major concept in many financial domains, but it does not exhibit behavior; it's static. A loan is the subject of many behaviors in a financial domain but is just a contract (or a specification.) We know we have complexity to deal with. Mis-associating behavior reduces clarity, making things needlessly more complex.</p>
<p>Starting out understanding a domain, I find focusing primarily on behavior and activities useful. Everything else in a domain is ultimately the subject of a behavior or activity, so I don't model them explicitly. For example, the role of underwriter <em>approves</em> a loan application. A loan application can be approved when the following rules are satisfied: a) the Debt-To-Income Ratio is below 43%, and b) etc... Debt-To-Income Ratio is modeled as an attribute of a loan application and covered in the activity of Approving a Loan. Information that isn't the object of a behavior also adds needless complexity.</p>
<p>When certain logic doesn't belong to a single object, or requires <em>particular objects</em> but no <em>particular state</em> of its own, it might not be accurate to model that logic as part of an <em>aggregate</em>.</p>
<p>A Service is the realization of a collaboration between several objects. The concept of that service exists because it is not the natural behavior of a domain <em>entity</em>. A Domain Service is the realization of a collaboration between one or more domain entities. It involves business logic and/or business rules that may affect the state of those domain entities. An application service is the realization of a collaboration between one or more domain services, domain objects, and an infrastructure service. And an infrastructure service is a collaboration with a framework or the "outside." The Service pattern recognizes complexity by isolating logic that is otherwise unrelated.</p>
<p>Look again at the Domain-Driven Design pattern language. If we always had to deal with all the complexities detailed in that diagram, it would be much harder to get things done promptly. The fact that the concepts are delineated and we can deal with them in isolation simplifies working with them. Separating interaction-only logic into a separate service from the object behavior affecting their state means we can think about and work with those concepts in isolation. Those concepts are now more loosely-coupled, and we obtain the benefits of loose coupling.</p>
<p>These definitions are easy enough to understand but can be confusing when it comes time to put them into practice. Sometimes the confusion stems from a backward approach to implementing software systems. Teams work backward from patterns when looking for opportunities to use patterns. It's more successful to understand the domain first and then match domain concepts to patterns and practices. For example, I've seen people approach a domain with questions like "What are all the entities?" or "What are all the services?" While we might be able to answer those questions after we've understood the domain and started to design solutions, approaching it from that perspective at that stage of understanding can pervert the interpretation of the domain. Services can be hard to recognize and implement when the domain concepts are not yet clearly isolated.</p>
<p>Sometimes it can be easy to delineate interaction logic in a collaboration from the business logic; often, it is not. In a financial domain, <em>transferring funds</em> as a capability can be easily viewed as the behavior of an account, for example. It involves accounts, obviously, so why would it not be an <em>account behavior</em>? But what happens when transferring funds? The situation's complexity comes from the fact that more than one account is involved, and the consistency of each account needs to be managed independently of the other(s). A funds transfer succeeds or fails, but has no state of its own--any change in state is encapsulated in the objects participating in the collaboration.</p>
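<p>A minimal sketch of that distinction (illustrative names only): the transfer is a stateless domain service coordinating two account aggregates, each of which guards its own consistency:</p>

```python
class Account:
    """Aggregate: maintains its own consistency (no overdrafts here)."""

    def __init__(self, balance):
        self.balance = balance

    def withdraw(self, amount):
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount

    def deposit(self, amount):
        self.balance += amount


def transfer_funds(source, destination, amount):
    """Domain service: pure collaboration logic. It succeeds or fails
    but holds no state of its own; all state change is encapsulated in
    the participating accounts."""
    source.withdraw(amount)
    destination.deposit(amount)


checking = Account(100)
savings = Account(0)
transfer_funds(checking, savings, 60)
print(checking.balance, savings.balance)  # 40 60
```

<p>Note that making <code>transfer_funds</code> a method on <code>Account</code> would force one account to reach into another's state; the service keeps each aggregate's consistency concerns separate from the interaction logic.</p>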
<p>A clear understanding of the domain concepts is vital to project the isolation of those concepts into design elements more accurately. Modeling domain knowledge requires the delineation and understanding of the concepts as well as correctly associating all the behavior and attributes of those concepts. Maintaining this model is a process of managing complexity, but it only happens in stages. Managing these complexities is an ongoing and iterative process. The more complex a domain is, the more likely knowledge is fragmented amongst domain experts. It's highly unlikely that a single person completely understands the domain. This knowledge fragmentation is one reason we model the domain iteratively, recognizing that understanding evolves over time.</p>
<p>The Aggregate and Service patterns model parts of the domain similarly but provide a means to recognize separate parts of the domain: independent of the level at which they apply as well as how they affect state. Service operates at a higher level to model an activity, a collaboration of objects. An aggregate is the composition of several domain objects that must abide by the same invariants and be consistent in the presence of each other.</p>
http://blog.peterritchie.com/posts/Being-Successful-With-Domain-Driven-Design--Minimal-Complexity-Part-1Being Successful With Domain-Driven Design: Minimal Complexity, Part 12023-05-19T00:00:00Z<p><img src="/assets/concepts-contexts-and-boundaries.jpg" class="img-fluid" alt="Concepts, Context, and Boundaries. Abstract Thought" />
The Domain-Driven Design book (the "Blue Book") includes "Tackling complexity at the heart of software" in the title. While "complexity" can be subjective, the takeaway is that Domain-Driven Design intends to address complex software systems. The principles and practices in Domain-Driven Design have their complexities, so for Domain-Driven Design to add value, it needs to address existing/expected complexity and attempt to be net-positive for simplicity.</p>
<p>Interestingly, the title of the Blue Book alludes to questions about <em>what</em> complexity is at the heart of software. That's admittedly subjective, but we can look to the intent of some of the patterns and practices to deduce some types of complexity that Domain-Driven Design adequately addresses in the design of software systems.</p>
<p>For this series, I'll tranche away some of the patterns as essential complexity (essential complexity of <em>both</em> Domain-Driven Design and almost all software design): Entities, Value Objects, Modules, Layered Architecture, Factories, and Repositories. Any modular software system must deal with identity, value, creation, and storage. There's nothing new about layered architecture, but Domain-Driven Design does detail the isolation of specific responsibilities (like Domain and Infrastructure) that I'll cover. Practices like Side-Effect Free Functions, Standalone Classes, Intention-Revealing Interfaces, Continuous Integration, Assertions, and Declarative Style are aspects of long-championed techniques like cohesion, loose-coupling, naming standards, or functional programming (in my opinion).</p>
<p>None of what I've tranched off are unimportant, but they have been tried and true before Domain-Driven Design, and I want to focus on added value in Domain-Driven Design. To that end, I'll focus on Ubiquitous Language, Bounded Context, Context Map, Aggregates, Services, Domain Layer, Generic Subdomains, Segregated Core, Anti-Corruption Layer, and Core Domain; and touch on Evolving Order.</p>
<h3 id="clean-concepts-contexts-and-boundaries">Clean Concepts, Contexts, and Boundaries</h3>
<p>If I had to distill the intent of Domain-Driven Design to a single statement, it might be "be explicit." Or, more explicitly: "Be explicit with boundaries." Software systems are not the only source (or victim) of complexity. There's a whole science devoted to it: Complex Adaptive Systems. Not to oversimplify Complex Adaptive Systems, but systems with sufficient complexity are inherently unpredictable and exhibit <em>emergent behavior</em>, among other things. Meaning that complex systems will do what they're going to do, and we can only sometimes predict what they will do. Sometimes that emergent behavior is beneficial; sometimes, it isn't. To make systems more predictable (and get the benefits that provides), we have to reduce complexity. The complexity of complex systems arises from the number of dependencies, relationships, and interactions. Each unbounded interconnection increases complexity exponentially.</p>
<p>Complex adaptive systems theory is why explicit boundaries are a major aspect of how Domain-Driven Design combats complexity in software to produce more reliable and robust systems. Explicitness is important here; we're not looking for any-old boundaries. We're looking to constrain and isolate areas of the system based on purpose, meaning, and intent. We could chalk this up as simply an exercise in cohesion, but Domain-Driven Design focuses on getting to and clarifying that purpose, meaning, and intent.</p>
<p>Explicitness starts with unambiguous concepts, descriptions, and terms. If people aren't communicating the same concept, things aren't going to get simpler. I speak about <em>Naming Things</em>, and part of what makes that difficult, I've decided, is language (or English). Stemming from human nature, we try to classify an ever-increasing set of concepts with a finite set of words, syntax, and semantics. The first step to explicit boundaries is the agreement on what they are: agreement on the concepts and to which explicit context they apply—the Ubiquitous Language.</p>
<p>The value of Ubiquitous Language isn't just that there is an agreed-upon vocabulary. The added value to a Ubiquitous Language is what it accounts for. The Ubiquitous Language recognizes classifications of concepts, classifications common to most software systems. Classifications that the Ubiquitous Language fosters and isolates: individuals, invariants and consistency rules, operations, processes, collaborations, commands, events, views, and values/properties. By "individuals," I don't just mean people, but anything that exhibits individuality (aka "entity.")</p>
<p>Imagine an amorphous "loan" concept in the financial industry. People apply for loans, obtain loans, and pay back loans. Getting a loan involves evaluating personal information (credit rating, assets, liabilities, etc.). Paying back a loan consists of a term, an interest rate, a payment schedule, etc. Credit rating, assets, liabilities, term, interest rate, and payment schedule are six interconnected concepts. With six concepts (each having five interconnections to the others), there are 30 interconnections. Or 30 complexity points. But, if we think of these six concepts as two different semi-independent contexts: "loan application" and "loan servicing," we end up with two contexts of three concepts each (six interconnections per context) joined by one interconnection, totaling 13. We've gone from 30 complexity points to 13 simply by defining the actual contexts better. In other words, we're being more explicit.</p>
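<p>Counting each directed interconnection (each of <em>n</em> concepts connects to the other <em>n - 1</em>), the reduction can be sketched in a few lines (illustrative only):</p>

```python
def interconnections(concepts):
    # Each of n concepts has n - 1 interconnections to the others.
    return concepts * (concepts - 1)

# One amorphous context with six fully interconnected concepts:
print(interconnections(6))          # 30

# Two bounded contexts of three concepts each, joined by one
# explicit relationship between the contexts:
print(2 * interconnections(3) + 1)  # 13
```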
<p>Explicitness like this--delineating elements of a set into two groups connected by a specific relationship--is creating a boundary between two contexts. This is a very simple example of Bounded Context. As the name implies, Bounded Context is an explicit contextual boundary: where one context ends and another begins. This is a domain's macro level, recognizing that Loan Servicing can only happen after Loan Application is successful. This particular boundary is based on a temporal or procedural boundary. Phases or steps are a good way of organizing domains into bounded contexts.</p>
<p>In these two contexts, the word "loan" exists in both. The word "loan" is used in the application context as well as in the servicing context. But, in the application context, the meaning is really "loan application," and in the servicing context, it really means "serviced loan". Understanding that servicing depends on an approval event in a loan application phase (or activity) allows us to realize an explicit boundary. Sometimes it's as easy as this, but often it's not. There are other ways of teasing out boundaries (or contexts), almost always involving vocabulary elements.</p>
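<p>One way to make that explicit in code (names invented for illustration) is to model "loan" separately in each context and translate at the approval event:</p>

```python
from dataclasses import dataclass


# Application context: here, "loan" really means "loan application".
@dataclass
class LoanApplication:
    applicant: str
    amount: float
    approved: bool = False


# Servicing context: here, "loan" really means "serviced loan".
@dataclass
class ServicedLoan:
    borrower: str
    principal: float


def on_application_approved(application: LoanApplication) -> ServicedLoan:
    """The approval event marks the boundary: servicing can only begin
    after a successful application, and translation happens here."""
    assert application.approved
    return ServicedLoan(borrower=application.applicant,
                        principal=application.amount)


application = LoanApplication("Ada", 250_000.0, approved=True)
loan = on_application_approved(application)
print(loan.borrower, loan.principal)  # Ada 250000.0
```

<p>Each context keeps its own meaning of "loan," and the single translation function is the only place that needs to know both.</p>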
<p>Sometimes you've got overloaded terms like "loan"; sometimes, you have different terms like different <em>rules</em>. Different rules are typically applied in different scenarios or involve different parameters. Different rules offer a window into recognizing different contexts with a boundary in-between. You may recognize concepts like this (events, operations, rules/invariants) from the larger list I mentioned above. A Ubiquitous Language can also account for individuals and entities, commands (often related to an activity), views (reports, screens, results), collaborations, and attributes or properties attributed to individuals and entities. Additionally, attributes or properties can be involved in criteria, and categories or subtypes may group individuals and entities.</p>
<p>Working towards a Ubiquitous Language is working towards concepts more independent from each other. Independent concepts are themselves individual contexts. Any defined concept has a defined context with understandable boundaries. Keeping the complexity of one context bound from others keeps the essential complexity within that context and reduces the accidental complexity that arises from blended contexts.</p>
<p>Domain-Driven Design adds value when you have at least a minimal level of complexity: when a subject matter has multiple terms per classification. Terms can be classified as entities, processes, phases, events, rules, views, etc. The focus of this post was a level of complexity where boundaries are recognizable in the nuances of the vocabulary. In future posts, I'll dig deeper into the different subject matter (or domain) classifications, how you can isolate the complexities of each, and the parts of Domain-Driven Design that apply.</p>
http://blog.peterritchie.com/posts/installing-dotnet-framework-4-5-targeting-packInstalling .NET Framework 4.5 Targeting Pack2023-02-05T00:00:00Z<p><img src="/assets/DALL%C2%B7E-2023-02-05-13.14.52--ludites-frustration-with-errors-in-integrated-development-environments-(IDE)-pencil-and-watercolor.png" class="img-fluid" alt="When working in an IDE seems like working with crayons" /></p>
<p>Something came up with a client around Live Dependency Validation in Visual Studio recently. Digging into it I ran into several issues, one of which was the error:</p>
<pre><code>Severity Code Description Project File Line Suppression State
Error The reference assemblies for .NETFramework,Version=v4.5 were not found. To resolve this, install the Developer Pack (SDK/Targeting Pack) for this framework version or retarget your application. You can download .NET Framework Developer Packs at https://aka.ms/msbuild/developerpacks DependencyValidation C:\Program Files\Microsoft Visual Studio\2022\Enterprise\MSBuild\Current\Bin\amd64\Microsoft.Common.CurrentVersion.targets 1229
</code></pre>
<p>.NET Framework 4.5 has been out of support since 2016, so its targeting pack isn't available for download. I found a couple of blog posts about editing the modelproj file to add things like <code>ResolveAssemblyReferenceIgnoreTargetFrameworkAttributeVersionMismatch</code> or a <code>PackageReference</code> to <code>microsoft.netframework.referenceassemblies.net45</code>, but neither worked.</p>
<p>One of the features of the Visual Studio Installer is that it can be a one-stop shop for all the things you're going to need to develop software (with or without Visual Studio)--including installing .NET targeting packs! Although the latest version of Visual Studio doesn't include out-of-support components, prior versions of Visual Studio are available. Visual Studio 2019 came out before .NET Framework 4.5 was completely unsupported (i.e., it still had the option of paid support), so it offers the ability to install some targeting packs that are currently out of support.</p>
<p>You can download older versions of Visual Studio via <a href="https://bit.ly/vs-old">https://visualstudio.microsoft.com/vs/older-downloads/</a>, which seems to redirect you eventually to Visual Studio Subscriptions downloads. For our purposes, Visual Studio Community Edition works fine.</p>
<p>To install, run the Visual Studio installer that you've downloaded (if you already have 2019 installed, run the already installed Visual Studio Installer and click <strong>Modify</strong>), then click <strong>Continue</strong> to go past the <em>set up a few things</em> dialog. (If you have VS 2022 installed, this seems to do nothing.)</p>
<table class="table">
<thead>
<tr>
<th><strong>Note</strong></th>
</tr>
</thead>
<tbody>
<tr>
<td>Make sure you don't change any of the <em>workloads</em> (if you have Visual Studio 2019 installed already, some may be checked--don't uncheck them; that will uninstall them).</td>
</tr>
</tbody>
</table>
<p><img src="/assets/vs2019-installer-45-targetting-pack.png" class="img-fluid" alt="Visual Studio Installer" /></p>
<p>Click on the <strong>Individual Components</strong> tab at the top (to the right of <em>Workloads</em> and to the left of <em>Language packs</em>.) In the .NET section, find and check <strong>.NET Framework 4.5 targeting pack</strong>.</p>
<p>Click <strong>Install</strong> or <strong>Install while downloading</strong>.</p>
<p>Once completed, you now have the .NET Framework 4.5 targeting pack. If you're doing this in response to a .NET Framework 4.5 targeting pack error message in Visual Studio, exit and re-start Visual Studio--the error should go away (it does with the modelproj error.)</p>
<hr />
<p>Incidentally, the other issues I encountered are:</p>
<pre><code>Full solution analysis for C# is currently disabled. You may not be seeing all possible dependency validation issues in C# projects. Options... Don't show again
</code></pre>
<p>... with no way to enable full solution analysis in a way that this notice recognizes and goes away.</p>
<p>I'd appreciate any advice on resolving that other than clicking <strong>Don't show again</strong>.</p>
http://blog.peterritchie.com/posts/things-i-learned-attempting-azure-administrator-associate-part-2Things I Learned Attempting Azure Administrator Associate - Part 2 - Storage2022-12-22T00:00:00Z<p><img src="/assets/DALL%C2%B7E-2022-12-22-17.03.40--distributed-cloud-data-storage-in-the-style-of-salvator-dali.png" class="img-fluid" alt="distributed cloud data storage" /></p>
<p>Azure Administrator Associate certification is about the skills required to be an Azure account, subscription, tenant, etc., administrator. If your end goal is to develop applications on Azure, that involves a lot of <em>administration</em> of Azure resources. Regardless of your plan, storage administration is nuanced. This post focuses on some of those nuances, nuances that may not be apparent in the documentation.</p>
<h3 id="overview">Overview</h3>
<!--capabilities-->
<p>Azure provides storage services for Files, Blobs, Queues, and Tables. Files are blobs that support access via the SMB protocol, AKA File Shares. Blobs are web resources that support access via a URI (HTTP). Blob Storage supports two types of blobs: block blobs and page blobs. Table Storage supports key-based access to structured, non-relational (NoSQL) data. Queues support access to ephemeral messages.</p>
<p>Azure Files has a File Sync feature that supports file-level replication across Windows Servers. The Azure File endpoint is also called the Cloud Endpoint and is part of a Sync Group that includes one or more Windows Server file shares.</p>
<p>There are two performance options for Storage Accounts: Standard (general purpose v2) and Premium (for low latency.)</p>
<!--tiers/skus-->
<p>Storage has a couple of storage tiers: Standard and Premium. Storage tiers provide different functionality at different costs. Blob Storage has several access tiers: Hot, Cool, and Archive. Access tiers offer a way to communicate the frequency and type of data access to reduce storage costs. Access tiers can be used to implement a lifecycle for data, moving to lower-cost tiers over time to reduce cost.</p>
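<p>Such a data lifecycle is typically expressed as a lifecycle management policy attached to the storage account; a minimal sketch of one (the rule name and prefix are examples) might look like:</p>

```python
import json

# Sketch of a Blob Storage lifecycle management policy: blobs stay Hot
# for 30 days, move to Cool, move to Archive at 90 days, and are
# deleted after a year.
lifecycle_policy = {
    "rules": [
        {
            "enabled": True,
            "name": "age-out-logs",            # example rule name
            "type": "Lifecycle",
            "definition": {
                "filters": {
                    "blobTypes": ["blockBlob"],
                    "prefixMatch": ["logs/"],  # example container/prefix
                },
                "actions": {
                    "baseBlob": {
                        "tierToCool": {"daysAfterModificationGreaterThan": 30},
                        "tierToArchive": {"daysAfterModificationGreaterThan": 90},
                        "delete": {"daysAfterModificationGreaterThan": 365},
                    }
                },
            },
        }
    ]
}

print(json.dumps(lifecycle_policy, indent=2))
```

<p>The JSON this produces is the shape the portal and CLI accept for lifecycle management; the policy engine evaluates the rules daily against matching blobs.</p>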
<!--durability/high-availability-->
<p>Storage supports data redundancy that makes copies of data to avoid loss due to infrastructure failure. There are several options: Locally-Redundant Storage (LRS), Zone-Redundant Storage (ZRS), Geo-Redundant Storage (GRS), and Geo-Zone-Redundant Storage (GZRS). LRS stores three copies of the data synchronously within a single data center. ZRS stores three copies synchronously across three availability zones (clusters) in a region. GRS copies the LRS data asynchronously to a single zone in a secondary region, where it is again stored as LRS. GZRS combines ZRS in the primary region with an asynchronous LRS copy in the secondary region.</p>
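<p>The redundancy option is chosen via the SKU when creating the storage account. A sketch with the Azure CLI (the account and resource group names are hypothetical):</p>
<pre><code class="language-PowerShell"># Standard general-purpose v2 account with geo-zone-redundant storage
az storage account create --name mystorageacct --resource-group my-rg --location eastus --kind StorageV2 --sku Standard_GZRS
</code></pre>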
<!--data protection-->
<p>Recovery Services is the service responsible for storing backups and recovery points. Recovery Services stores data within Recovery Services Vaults.</p>
<p>Encryption scopes logically group blobs or containers and assign an encryption key specific to that scope.</p>
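<p>As a sketch (the names are hypothetical), an encryption scope can be created and then assigned as a container's default with the Azure CLI:</p>
<pre><code class="language-PowerShell">az storage account encryption-scope create --account-name mystorageacct --resource-group my-rg --name tenant-a-scope
az storage container create --account-name mystorageacct --name tenant-a --default-encryption-scope tenant-a-scope --prevent-encryption-scope-override true
</code></pre>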
<!--access control-->
<p>There are a couple of options for controlling access to data: Azure AD accounts/groups or Shared Access Signatures (SAS). Azure AD groups provide a more manageable way to control Azure AD account access to data than individual Azure AD accounts. SAS provides a granular means of delegating access to external entities.</p>
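<p>For example (hypothetical names and expiry), a read-only SAS for a container can be generated with the Azure CLI and handed to an external entity:</p>
<pre><code class="language-PowerShell">az storage container generate-sas --account-name mystorageacct --name reports --permissions r --expiry 2024-01-01T00:00Z --https-only
</code></pre>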
<h3 id="notable-information">Notable information <!--TIL--></h3>
<ul>
<li>LRS protects against rack-level hardware failure (so, if you want data-center-wide failure protection, LRS is insufficient).</li>
<li>LRS is supported for Standard File and Standard Block Blob account types (otherwise, GRS is the default.)</li>
<li>ZRS protects against data loss due to data center failure (so, if you want region-wide failure protection, ZRS is insufficient)</li>
<li>GRS protects against region failure.</li>
<li>GRS and GZRS secondary regions are pre-defined, forcing your data into a specific region.</li>
<li>GZRS protects against region failure and simultaneous data center failure in the secondary region.</li>
<li>When defining a data lifecycle, in the case of a tie, the option that results in the least cost will be chosen.</li>
<li>Migrating from LRS to GRS is supported with a feature called "Live Migration." Migration from LRS in other scenarios (e.g. to ZRS) must be done manually. Since Premium Storage accounts do not support GRS, Live Migration does not support Premium Storage accounts.</li>
<li>Live Migration also supports recovering from a failure of GRS-replicated data: GRS is effectively a second LRS copy in a secondary region, so if a region fails, GRS reduces to LRS, and restoring geo-redundancy means using Live Migration.</li>
<li>Durability is not backup; it provides access to data when infrastructure recoverability isn't an option. Apart from Live Migration, <em>restoration</em> is limited to manually copying live data when needed.</li>
<li>Durability does not protect against application-level failure; use backups or custom (application-level) durability in those scenarios.</li>
<li>Encryption scopes are useful for providing logical data tenancy.</li>
<li>Files added/modified in a File Share are only detected and replicated to the Windows Server file shares once every 24 hours (i.e., changes may not be visible on the servers for up to 24 hours).</li>
<li>Adding a file share to a Sync Group acts like all the files and folders within the file share were just added, replicating to the cloud endpoint and any other file shares.</li>
<li>When applying <em>least privilege</em> to storage accounts, the <strong>Reader</strong> role is also required on the Azure AD account if the Azure AD account needs to navigate storage resources in the Azure Portal.</li>
<li>Asynchronous data redundancy options introduce the possibility of data loss. If the asynchronous duplication to the secondary region did not complete, it is out of sync with the last state of the primary region. Application-level logic is required to prevent loss of data in this scenario.</li>
<li>The Archive tier does not have immediate access; it must be <em>rehydrated</em> to a cool/hot tier first (usually with a Copy Blob operation of up to 15 hours completion time).</li>
<li>File Share storage may be backed up to Recovery Services vaults, but Blob Storage may not.</li>
</ul>
<p>This table summarizes the types of storage accounts and the features/redundancy that each support.</p>
<table class="table">
<thead>
<tr>
<th>Account</th>
<th>Redundancy</th>
<th style="text-align: center;">block blob</th>
<th style="text-align: center;">page blob</th>
<th style="text-align: center;">append blob</th>
<th style="text-align: center;">file share</th>
<th style="text-align: center;">queue</th>
<th style="text-align: center;">table</th>
</tr>
</thead>
<tbody>
<tr>
<td>Standard account</td>
<td>LRS</td>
<td style="text-align: center;">☑</td>
<td style="text-align: center;">☑</td>
<td style="text-align: center;">☑</td>
<td style="text-align: center;">☑</td>
<td style="text-align: center;">☑</td>
<td style="text-align: center;">☑</td>
</tr>
<tr>
<td>Standard account</td>
<td>GRS</td>
<td style="text-align: center;">☑</td>
<td style="text-align: center;">☑</td>
<td style="text-align: center;">☑</td>
<td style="text-align: center;">☑</td>
<td style="text-align: center;">☑</td>
<td style="text-align: center;">☑</td>
</tr>
<tr>
<td>Standard account</td>
<td>GZRS</td>
<td style="text-align: center;">☑</td>
<td style="text-align: center;">☑</td>
<td style="text-align: center;">☑</td>
<td style="text-align: center;">☑</td>
<td style="text-align: center;">☑</td>
<td style="text-align: center;">☑</td>
</tr>
<tr>
<td>Standard account</td>
<td>RA-GZRS</td>
<td style="text-align: center;">☑</td>
<td style="text-align: center;">☑</td>
<td style="text-align: center;">☑</td>
<td style="text-align: center;">☑</td>
<td style="text-align: center;">☑</td>
<td style="text-align: center;">☑</td>
</tr>
<tr>
<td>Premium Block blobs account</td>
<td>LRS</td>
<td style="text-align: center;">☑</td>
<td style="text-align: center;">☐</td>
<td style="text-align: center;">☑</td>
<td style="text-align: center;">☐</td>
<td style="text-align: center;">☐</td>
<td style="text-align: center;">☐</td>
</tr>
<tr>
<td>Premium Block blobs account</td>
<td>ZRS</td>
<td style="text-align: center;">☑</td>
<td style="text-align: center;">☐</td>
<td style="text-align: center;">☑</td>
<td style="text-align: center;">☐</td>
<td style="text-align: center;">☐</td>
<td style="text-align: center;">☐</td>
</tr>
<tr>
<td>Premium File shares account</td>
<td>LRS</td>
<td style="text-align: center;">☐</td>
<td style="text-align: center;">☐</td>
<td style="text-align: center;">☐</td>
<td style="text-align: center;">☑</td>
<td style="text-align: center;">☐</td>
<td style="text-align: center;">☐</td>
</tr>
<tr>
<td>Premium File shares account</td>
<td>ZRS</td>
<td style="text-align: center;">☐</td>
<td style="text-align: center;">☐</td>
<td style="text-align: center;">☐</td>
<td style="text-align: center;">☑</td>
<td style="text-align: center;">☐</td>
<td style="text-align: center;">☐</td>
</tr>
<tr>
<td>Premium Page blobs account</td>
<td>LRS</td>
<td style="text-align: center;">☐</td>
<td style="text-align: center;">☑</td>
<td style="text-align: center;">☐</td>
<td style="text-align: center;">☐</td>
<td style="text-align: center;">☐</td>
<td style="text-align: center;">☐</td>
</tr>
<tr>
<td>Premium Page blobs account</td>
<td>ZRS</td>
<td style="text-align: center;">☐</td>
<td style="text-align: center;">☑</td>
<td style="text-align: center;">☐</td>
<td style="text-align: center;">☐</td>
<td style="text-align: center;">☐</td>
<td style="text-align: center;">☐</td>
</tr>
</tbody>
</table>
<p><img src="/assets/DALL%C2%B7E-2022-12-22-17.03.40--distributed-cloud-data-storage-in-the-style-of-salvator-dali.png" class="img-fluid" alt="distributed cloud data storage"></p>http://blog.peterritchie.com/posts/things-i-learned-attempting-azure-administrator-associateThings I Learned Attempting Azure Administrator Associate - Part 12022-12-20T00:00:00Z<p><img src="/assets/DALL%C2%B7E-2022-12-20-15.03.33---A-woman-going-through-the-process-of-certifying-knowledge.png" class="img-fluid" alt="person going through the process of certifying knowledge" /></p>
<p>I recently earned certification for Azure Administrator Associate. My goal is to make my experience and skills more verifiable in areas like application solution architecture. Azure Administrator Associate is a prerequisite for Azure Solutions Architect Expert and DevOps Engineer Expert (I imagine it's a prerequisite for all Azure * {Expert|Associate} certs.)</p>
<p>Certifications aren't perfect; "certification" means different things depending on the observer and the certification itself. Most certifications bring with them an expected minimum understanding of the subject. Does it mean the earner will do everything perfectly with the subject? Of course not, but it gives the person a certain vocabulary to communicate more efficiently on the subject.</p>
<p>The road to Azure Administrator Associate was interesting, and I think sharing some notable information will be helpful for others.</p>
<h2 id="making-the-implicit-explicit">Making The Implicit Explicit</h2>
<p>The key to good communication is clearly understanding a subject and eliminating assumptions and misunderstandings. While learning what is expected of a certified Azure Administrator Associate, I noticed some knowledge that is typically implicit. Another way of looking at the following is that each starts with "It may seem obvious, but...".</p>
<p>Implicit knowledge is knowledge obtained through incidental activities; knowledge gained without awareness that learning is occurring.</p>
<h3 id="line-of-business-lob-applications">Line of Business (LoB) Applications</h3>
<p>The term Line of Business (LoB) application is ubiquitous in the computing industry. Everyone knows what it <em>means</em>, but if you ask two people to define it, you'll get more than one answer. While agreement/standardization on what an LoB application is isn't going to happen any time soon, there are certain truths about LoB applications:</p>
<ul>
<li>An in-house, custom web application</li>
<li>Not accessible via the Internet, either behind a firewall or strict access control (authentication and authorization)</li>
<li>Access <em>may</em> occur via an application gateway or load balancer</li>
<li>Specific to the company, business area, or industry</li>
</ul>
<h3 id="azcopy">AzCopy</h3>
<p>AzCopy works with Azure Storage, but only with Azure Blob Storage and Azure Files.</p>
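<p>For example (the account, container, and paths here are hypothetical), uploading a local folder to Blob Storage with a SAS token looks like this:</p>
<pre><code class="language-PowerShell">azcopy copy "C:\data\reports" "https://mystorageacct.blob.core.windows.net/reports?<SAS>" --recursive
</code></pre>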
<h2 id="conclusion">Conclusion</h2>
<p>There are many areas of clarification with Azure Administrator Associate. Future posts on the subject will address clarifications involving important explicit limits, restrictions, constraints, rules, etc.</p>
<p>Are there other implicit aspects of Azure administration that can be made explicit?</p>
<p><img src="/assets/DALL%C2%B7E-2022-12-20-15.03.33---A-woman-going-through-the-process-of-certifying-knowledge.png" class="img-fluid" alt="person going through the process of certifying knowledge"></p>http://blog.peterritchie.com/posts/fundamental-webapi-integration-testsFundamental ASP.Net Minimal API Integration Tests2022-11-03T00:00:00Z<p><img src="/assets/robotic%20computing%20testing.png" class="img-fluid" alt="Robotic computing testing" /></p>
<p>I've been involved with some fairly large projects that involved RESTful APIs. When dealing with multiple team members, multiple teams, and OpenAPI specs, there can be many risks. Even when an OpenAPI specification is generated from source code, what the code does can easily become unaligned with the spec. Luckily, the spec is a machine-readable contract of the <em>intent and purpose</em> of the API.</p>
<p>Automated testing to the rescue! With ASP.NET, you can inject into and observe the middleware pipeline. ASP.NET integration tests are a common way of verifying the pipeline and how it is used. We can create integration tests that process the OpenAPI spec and verify operations are working as expected in various ways. This article dives into a couple of these ways.</p>
<h2 id="fundamental-api-integration-tests">Fundamental API Integration Tests</h2>
<p>With a functioning Web API and an OpenAPI specification that describes it, there are some fundamental things we can verify:</p>
<ul>
<li>The generated OpenAPI document is valid</li>
<li>The paths have endpoints implemented</li>
<li>The operations respond with the correct type of response</li>
</ul>
<p>First, let's set up our solution, projects, and integration testing scaffolding.</p>
<h2 id="setting-up-the-solution-and-projects">Setting Up the Solution and Projects</h2>
<p>We're dealing with a Web API and integration tests, so let's create a Web API project and make the <code>Program</code> class <code>public</code>. You can do that manually in Visual Studio, but for consistency, I'll use the CLI (I'm being intentional with framework versions and some configuration options--appending <code>public partial class Program { }</code> to Program.cs to make the class public):</p>
<pre><code class="language-PowerShell">dotnet new solution
dotnet new webapi -o WebApi --use-minimal-apis true --framework net6.0 --use-program-main false
echo "public partial class Program { }" >> WebApi\Program.cs
dotnet sln add WebApi\WebApi.csproj
</code></pre>
<p>Next, we want to add a test project. xUnit is my go-to, so we'll use that and add a reference to the Web API project. Again, in the CLI:</p>
<pre><code class="language-PowerShell">dotnet new xunit -o IntegrationTests --framework net6.0
del IntegrationTests\UnitTest1.cs
dotnet add IntegrationTests\IntegrationTests.csproj reference WebApi\WebApi.csproj
dotnet sln add IntegrationTests\IntegrationTests.csproj
</code></pre>
<p>For ASP.Net integration tests, we will use <code>WebApplicationFactory<T></code>, which requires a reference to <code>Microsoft.AspNetCore.Mvc.Testing</code>. In addition, to process OpenAPI documents, we'll need the <code>Microsoft.OpenApi.Readers</code> package. Again, via the CLI:</p>
<pre><code class="language-PowerShell">dotnet add IntegrationTests\IntegrationTests.csproj package Microsoft.OpenApi.Readers
dotnet add IntegrationTests\IntegrationTests.csproj package Microsoft.AspNetCore.Mvc.Testing
</code></pre>
<h2 id="integration-test-scaffolding">Integration Test Scaffolding</h2>
<p>I got into some of the scaffolding of ASP.NET 6 integration tests in <a href="#setting-up-the-solution-and-projects">Setting Up the Solution and Projects</a> concerning the required package references. The <code>Microsoft.AspNetCore.Mvc.Testing</code> package is required so that we may use the <a href="https://learn.microsoft.com/en-us/dotnet/api/microsoft.aspnetcore.mvc.testing.webapplicationfactory-1?view=aspnetcore-6.0"><code>WebApplicationFactory<TEntryPoint></code></a> class--which allows us to bootstrap a web application in memory, specifically for testing.</p>
<p>We'll use <code>WebApplicationFactory</code> to create an instance of an <code>HttpClient</code> test fake that works with our in-memory host. In addition, we'll override <code>WebApplicationFactory</code> to get at some of the Swashbuckle details from the pipeline. We're interested in the generated OpenAPI document for processing and the name of that document to generate the OpenAPI specification URI for verification. Here's an example of a <code>WebApplicationFactory</code> implementation that does what we need:</p>
<pre><code class="language-csharp">public class MyWebApplicationFactory : WebApplicationFactory<Program>
{
public OpenApiDocument? OpenApiDocument { get; private set; }
public string OpenApiDocumentName { get; private set; } = string.Empty;
protected override IHost CreateHost(IHostBuilder builder)
{
var host = base.CreateHost(builder);
using var scope = host.Services.CreateScope();
var sp = scope.ServiceProvider;
var swaggerGeneratorOptions = sp.GetRequiredService<IOptions<SwaggerGeneratorOptions>>().Value;
OpenApiDocumentName = swaggerGeneratorOptions.SwaggerDocs.First().Key ?? string.Empty;
var swaggerProvider = sp.GetRequiredService<ISwaggerProvider>();
OpenApiDocument = swaggerProvider.GetSwagger(OpenApiDocumentName);
return host;
}
}
</code></pre>
<p>The important parts are the <code>OpenApiDocument</code> and <code>OpenApiDocumentName</code> properties.</p>
<p>Now that we've got integration testing scaffolded let's create a test base class to make creating multiple integration tests clean and tidy.</p>
<h2 id="some-test-conventions">Some Test Conventions</h2>
<p>Automated testing classes and methods offer an opportunity to isolate and categorize tests to reduce work and clarify what is being tested (more importantly, what isn't passing). I tend towards a given/when/then structure when designing tests. The test class encapsulates the given/when (as well as the <em>arrange</em> from arrange/act/assert) whose name is suffixed with "Should." Each test method in the class is then given a name that describes the <em>then</em> condition. I try to ensure that there is one condition and thus one assert per method. YMMV.</p>
<p>For the tests I want to describe in this article, I've created a base class to encapsulate related given/when scenarios (or <em>shoulds</em>) that require the details we're accessing with the <code>WebApplicationFactory<Program></code> implementation. Naming is hard, so I'm starting simple with a <code>WebApiShouldBase</code> class that encapsulates the parts we're getting with <code>MyWebApplicationFactory</code> and an ability to get a stream to the "live" OpenAPI spec document (JSON). It also deals with the responsibility of owning those things (e.g., disposal):</p>
<pre><code class="language-csharp">public class WebApiShouldBase : IDisposable
{
private readonly string openApiSpecUriText;
protected readonly HttpClient WebApiClient;
protected OpenApiDocument? OpenApiDocument { get; }
protected Task<Stream> GetOpenApiDocumentStreamAsync() => WebApiClient.GetStreamAsync(openApiSpecUriText);
protected WebApiShouldBase()
{
var factory = new MyWebApplicationFactory();
WebApiClient = factory.CreateClient();
OpenApiDocument = factory.OpenApiDocument;
this.openApiSpecUriText = $"/swagger/{factory.OpenApiDocumentName}/swagger.json";
}
protected virtual void Dispose(bool isDisposing)
{
if (isDisposing)
{
WebApiClient.Dispose();
}
}
public void Dispose()
{
Dispose(true);
GC.SuppressFinalize(this);
}
}
</code></pre>
<p>The important parts are the <code>OpenApiDocument</code> property, which re-surfaces <code>MyWebApplicationFactory.OpenApiDocument</code> to implementors, the <code>WebApiClient</code> property to access the API, and the <code>GetOpenApiDocumentStreamAsync</code> method that retrieves the OpenAPI spec document that the API provides. This class hides things like the URI to the swagger.json, the use of <code>MyWebApplicationFactory</code>, disposal, etc.</p>
<p>With that, let's start doing some tests!</p>
<h2 id="verifying-the-generated-openapi-is-valid">Verifying The Generated OpenAPI Is Valid</h2>
<p>"Valid" is subjective with OpenAPI. An OpenAPI spec is very <em>forgiving</em> in allowing for many opinions on what a <em>good</em> API looks like. I'm not going to go deep on what <em>good</em> might mean; just dive into facilitating validation of that generated document. The fact that there is an OpenApiDocument instance, and a raw OpenAPI specification, is an implementation detail. We'll use that OpenApiDocument instance shortly, but I want to ensure that the raw document meets some minimum requirements. For this example, the OpenAPI document is processed, not errors we detected, and there <em>are</em> paths. Very simple:</p>
<pre><code class="language-csharp"> [Fact]
public async Task ProduceValidOpenApi()
{
var readerResult = await new OpenApiStreamReader()
.ReadAsync(await GetOpenApiDocumentStreamAsync().ConfigureAwait(false)).ConfigureAwait(false);
Assert.NotNull(OpenApiDocument);
Assert.NotEmpty(readerResult.OpenApiDocument.Paths);
Assert.Empty(readerResult.OpenApiDiagnostic.Errors);
}
</code></pre>
<p>Client requirements can be less strict than development requirements (development objectives), and there may be different subsets of requirements in the case of multiple clients. This example doesn't implement that specifically but does provide the means to do it (by adding distinct test methods.)</p>
<p>OpenAPI.NET can do very complex verification and validation, but I expect that sort of testing to be performed at a different level--I want to make sure client-oriented tests are handled here.</p>
<h2 id="verifying-the-paths-have-endpoints-implemented">Verifying The Paths Have Endpoints Implemented</h2>
<p>Publishing an API specification with paths and operations while hosting an API that hasn't implemented those operations is silly. So the next test verifies they are implemented (at least the GET operations) as specified:</p>
<pre><code class="language-csharp"> [Fact]
public async Task EndpointsRespondOkToGet()
{
Assert.NotNull(OpenApiDocument);
var pathsWithGetOperations = OpenApiDocument.Paths.Where(w => w.Value.Operations.ContainsKey(OperationType.Get));
foreach (var (requestUriText, _) in pathsWithGetOperations)
{
var response = await WebApiClient.GetAsync(requestUriText).ConfigureAwait(false);
Assert.True(response.IsSuccessStatusCode);
}
}
</code></pre>
<p>GET operations are <em>easy</em>; they shouldn't have a request body and almost always have a success response specified. In the future, I can dive into other types of operations like POST, how to extract samples from the OpenAPI specification, and how to verify operations with request data and or error responses.</p>
<h2 id="verifying-the-operations-respond-with-the-correct-type-of-response">Verifying The Operations Respond With The Correct Type Of Response</h2>
<p>HTTP, and thus OpenAPI, doesn't enforce that any operation respond with anything in particular. But if you're reading <em>this</em> blog, you are probably of the opinion that, given the opportunity to specify behavior, you should be detailed in specifying the type and schema of the responses. I'll leave out validating response schema in this article, but I will show verifying that each request responds with the correct media type. For example:</p>
<pre><code class="language-csharp"> [Fact]
public async Task EndpointsRespondWithCorrectMediaTypeToGet()
{
Assert.NotNull(OpenApiDocument);
var pathsWithGetOperations = OpenApiDocument.Paths.Where(w => w.Value.Operations.ContainsKey(OperationType.Get));
foreach (var (requestUriText, pathItem) in pathsWithGetOperations)
{
const string OkResponseCodeText = "200"; // the success status code key in the OpenAPI spec
var responseContentType = pathItem.Operations[OperationType.Get]
.Responses[OkResponseCodeText]
.Content
.Single().Key;
var request = new HttpRequestMessage
{
Method = HttpMethod.Get,
RequestUri = new Uri(requestUriText, UriKind.Relative),
Headers =
{
{
HttpRequestHeader.Accept.ToString(),
responseContentType
}
}
};
var response = await WebApiClient.SendAsync(request).ConfigureAwait(false);
Assert.Equal(responseContentType,
response.Content.Headers.ContentType?.MediaType);
}
}
</code></pre>
<h2 id="caveats">Caveats</h2>
<p>Of course, you can create an OpenAPI spec that does little more than document endpoints, ignoring that those endpoints have operations and that those operations do specific things.</p>
<p>This article is an overview. I recognize that Swashbuckle and <del>Swagger</del>OpenAPI support in ASP.NET is powerful, but this article doesn't take into account many things you can do with it (like multiple OpenAPI documents.)</p>
<p>I also recognize that operations that take no parameters are rare, but I trust that my readers are good with taking that on as an exercise. Or, at least let me know if that's a detail I should post about in the future.</p>
<h2 id="summary">Summary</h2>
<p>This article provides a very high-level overview of integration testing ASP.NET minimal APIs. We then got into some details of general Web API integration tests that focus on OpenAPI specification aspects of the Web API middleware.</p>
<p>What sort of automated testing of an API specification do you see as beneficial to your projects?</p>
<h2 id="references">References</h2>
<ul>
<li><a href="https://learn.microsoft.com/en-us/aspnet/core/test/integration-tests?view=aspnetcore-6.0">Integration tests in ASP.NET Core</a></li>
<li><a href="https://learn.microsoft.com/en-us/dotnet/api/microsoft.aspnetcore.mvc.testing.webapplicationfactory-1?view=aspnetcore-6.0"><code>WebApplicationFactory<TEntryPoint></code></a></li>
</ul>
<p>The source for the examples, including the creation scripts can be found at <a href="https://github.com/peteraritchie/fundamental-webapi-integration-testing">https://github.com/peteraritchie/fundamental-webapi-integration-testing</a></p>
<p><img src="/assets/robotic%20computing%20testing.png" class="img-fluid" alt="Robotic computing testing"></p>http://blog.peterritchie.com/posts/visual-studio-defender-performanceVisual Studio Performance with Microsoft Defender2022-10-27T00:00:00Z<p><img src="/assets/antivirus%20exclusion%20developer.png" class="img-fluid" alt="Antivirus Exclusion Developer Dall-E image" /></p>
<p>Steve Smith posted about <a href="https://ardalis.com/speed-up-visual-studio-build-times/?utm_sq=h3m43zzlkm">speeding up build times in Visual Studio</a> by configuring Windows Defender. That was in 2016, and to say things have changed a bit is probably an understatement. Configuring a new laptop, I thought I'd revisit this briefly.</p>
<p>Before changing anything in Windows Virus & Threat Protection, go ahead and run a scan to make sure we're starting with a clean slate. <del>Go to <a href="ms-settings:windowsdefender">Windows Security</a> and click <strong>Virus & threat protection</strong> then click the <strong>Quick scan</strong> button.</del> I've been advocating scripting all-the-things; to run a quick scan in an administrator PowerShell terminal, run <code>Start-MpScan -ScanType QuickScan</code>. You can also run a full scan, if that makes you more comfortable: <code>Start-MpScan -ScanType FullScan</code>.</p>
<p>Once that's complete, we can configure Windows Virus and Threat Protection to "trust" (exclude) Visual Studio. To do that in PowerShell you can use the <a href="https://learn.microsoft.com/en-us/powershell/module/defender/add-mppreference?view=windowsserver2022-ps"><code>Add-MpPreference</code> cmdlet</a> (as well as see what's already configured with the <a href="https://learn.microsoft.com/en-us/powershell/module/defender/get-mppreference?view=windowsserver2022-ps"><code>Get-MpPreference</code> cmdlet</a>). Some examples:</p>
<p>With Visual Studio 2022 Enterprise:</p>
<pre><code class="language-PowerShell">Add-MpPreference -ExclusionProcess "$Env:ProgramFiles\Microsoft Visual Studio\2022\Enterprise\Common7\IDE\devenv.exe"
</code></pre>
<p>With Visual Studio 2022 Professional:</p>
<pre><code class="language-PowerShell">Add-MpPreference -ExclusionProcess "$Env:ProgramFiles\Microsoft Visual Studio\2022\Professional\Common7\IDE\devenv.exe"
</code></pre>
<p>With Visual Studio 2022 Community:</p>
<pre><code class="language-PowerShell">Add-MpPreference -ExclusionProcess "$Env:ProgramFiles\Microsoft Visual Studio\2022\Community\Common7\IDE\devenv.exe"
</code></pre>
<p>And, with Visual Studio 2022 Preview:</p>
<pre><code class="language-PowerShell">Add-MpPreference -ExclusionProcess "$Env:ProgramFiles\Microsoft Visual Studio\2022\Preview\Common7\IDE\devenv.exe"
</code></pre>
<p>You can also do that for Visual Studio Code:</p>
<pre><code class="language-PowerShell">Add-MpPreference -ExclusionProcess "$Env:LocalAppData\Programs\Microsoft VS Code\code.exe"
</code></pre>
<p>You can also exclude the location of where you store your source code. The default location is <code>C:\Users\<user-name>\source\repos</code> for Visual Studio. So, in PowerShell, you can add a path exclusion:</p>
<pre><code class="language-PowerShell">Add-MpPreference -ExclusionPath "$Env:USERPROFILE\source\repos"
</code></pre>
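<p>To confirm what Defender is currently excluding (and to audit the exclusions later), you can list the configured process and path exclusions:</p>
<pre><code class="language-PowerShell">(Get-MpPreference).ExclusionProcess
(Get-MpPreference).ExclusionPath
</code></pre>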
<table class="table">
<thead>
<tr>
<th>Note:</th>
</tr>
</thead>
<tbody>
<tr>
<td>If you're working with Git repositories that you're unsure of what they contain, you may want to separate where you clone those repos from where you exclude.</td>
</tr>
</tbody>
</table>
<p>Or, if you want a PowerShell script to just do all these things, see <a href="https://gist.githubusercontent.com/peteraritchie/d6025591566821b4ae5995eb831b6e8d/raw/912b5b20b749d506562437f40e169e6a3e24d279/optimize-defender.ps1">optimize-defender.ps1</a></p>
<p>Other processes to consider:</p>
<pre><code class="language-PowerShell">Add-MpPreference -ExclusionProcess "$Env:ProgramFiles\dotnet\dotnet.exe"
</code></pre>
<p>Any other processes or paths that you'd consider for exclusion?</p>
<p><img src="/assets/antivirus%20exclusion%20developer.png" class="img-fluid" alt="Antivirus Exclusion Developer Dall-E image"></p>http://blog.peterritchie.com/posts/By-Reference-in-csharpBy Reference in C#2022-09-28T00:00:00Z<p>I became aware recently that there were many C# compiler errors that do not have a corresponding documentation page. That documentation is open-source and I chose to spend some time contributing some pages for the community. Looking at a language feature from the perspective of its compile-time errors is rather enlightening, so I'd though I'd write a bit about these features in hopes of offering a better understanding for my readers.</p>
<p>C# compiler errors can be categorized (arbitrarily) by different areas of C# syntax, and I started to focus on one category at a time. One of those areas involves referenced variables. C# has always had <code>ref</code> arguments, but <code>ref</code> returns, <code>ref</code> locals, <code>ref</code> structs, and <code>ref</code> fields are later additions to the syntax.</p>
<p>The declaration of a variable in C# influences its syntax in a couple of ways: binding and accessibility. Accessibility is whether an identifier is <em>visible</em> at compile-time in a given context. Binding is how an identifier or name is bound at run-time to resources like data and code. Binding uniquely affects the compile-time correctness of any particular usage of a <code>ref</code> variable.</p>
<p>Binding affects the compile-time usage of an identifier because of the run-time lifetime of the resources it is bound to. You're probably familiar with a static method accessing instance data and the errors caused in this context. <code>ref</code> variables have a similar context when they are stack allocated. Heap-allocated objects (objects bound to the heap) can have their lifetime extended to be long-lived because the heap shares the same lifetime as the application. Variables bound to stack-allocated resources cannot have their lifetime extended beyond a specific scope. The stack is a sequential collection of elements with elements implicitly partitioned by a shared scope. The most recognized scope is probably a method call or method/lambda body. Local variables bound to stack elements do not have a lifetime beyond the method call. A reference to a stack object cannot be assigned to a variable or expression with a broader scope.</p>
<p>How far the value of an expression can leave the confines of its declaration scope is called "escape scope". Sometimes the escape scope is the same as the declaration scope. The compiler verifies compatible escape scopes during assignment. For example:</p>
<pre><code class="language-csharp"> void M(ref int ra)
{
int number = 0;
ref int rl = ref number;
if (ra == 0)
{
int x = number;
rl = ref x;
}
}
</code></pre>
<p><code>x</code> is local to the <code>if</code> body, it is bound to the stack, and its escape scope is narrower than <code>ref rl</code> because <code>ref rl</code> is declared in the outer scope. Since <code>ref rl</code> is an alias to another variable, it cannot reference a variable bound to a resource that will go out of scope before it does. <code>rl = ref x</code> results in a compiler error. If <code>rl</code> were not a reference to a value type, the assignment would be okay because <code>x</code> would be bound to heap and have a broader escape scope.</p>
<p>The compiler also verifies compatible escape scopes when returning values. For example:</p>
<pre><code class="language-csharp">ref int M(ref int ra)
{
    int number = 0;
    ref int rl = ref number;
    if (ra == 0)
    {
        ref int x = ref number;
        return ref x;
    }
    return ref ra;
}
</code></pre>
<p><code>return ref x</code> results in a compiler error because the escape scope of <code>x</code> is local to the method. The error message here may not be as clear as the first example's because it doesn't mention the narrower escape scope.</p>
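<p>By contrast, here is a sketch of my own (not from the original post) of a <code>ref</code> return that compiles, because the returned reference targets the caller's own variables, whose escape scope already includes the calling method:</p>

```csharp
using System;

// Returns a reference to whichever of the caller's variables is larger.
static ref int Max(ref int a, ref int b)
{
    if (a >= b)
    {
        return ref a; // OK: a's escape scope is the calling method
    }
    return ref b;     // OK: b's escape scope is the calling method
}
```

<p>The caller can even write through the returned reference: <code>Max(ref x, ref y) = 10;</code> assigns 10 to whichever variable was larger.</p>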
<p>There are three basic escape scopes: the calling method scope, the current method scope, and the return-only scope.</p>
<p>The calling method scope is a scope outside of the containing method/lambda. References can reach this scope via either a <code>ref</code> parameter or a return.</p>
<p>The current method scope is a scope within a containing method/lambda.</p>
<p>The return-only scope is a special case for <code>ref struct</code> types that can only leave the method scope via a return and not through a <code>ref</code> or <code>out</code> parameter.</p>
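<p>As a sketch of return-only scope (my own example, not from the original post; the behavior described requires C# 11 / .NET 7+, and <code>WrapFirst</code> is a name I made up): a <code>Span&lt;int&gt;</code> created over a <code>ref</code> parameter may leave the method via the return value, but not via an <code>out</code> parameter:</p>

```csharp
using System;
using System.Runtime.InteropServices;

// OK: the span over the ref parameter escapes via the return (return-only scope).
static Span<int> WrapFirst(ref int x)
{
    return MemoryMarshal.CreateSpan(ref x, 1);
}

// The equivalent via an out parameter does not compile:
// static void WrapFirst(ref int x, out Span<int> s)
// {
//     s = MemoryMarshal.CreateSpan(ref x, 1); // error: may only escape via return
// }
```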
<p>I became aware recently that there were many C# compiler errors that do not have a corresponding documentation page. That documentation is open source, and I chose to spend some time contributing some pages for the community. Looking at a language feature from the perspective of its compile-time errors is rather enlightening, so I thought I'd write a bit about these features in hopes of offering a better understanding for my readers.</p>http://blog.peterritchie.com/posts/the-fundamental-quality-attributes-of-technology-systemsFundamental Quality Attributes of Technology Systems2022-09-16T00:00:00Z<p>What are quality attributes? The term "non-functional requirements" has been more prevalent, but that is a technologist's term. The first time you bring up "non-functional requirements" with a customer, there's always confusion, then concern. I've heard more than once, "we want something functional."</p>
<p>The most important attribute of a system is its functionality. Functionality is whether a system is <em>fit for purpose</em>. A system's functionality is what makes it unique, so I'll defer detail on functional attributes for another time. In this post I'll focus on the cross-cutting quality attributes that permeate all aspects of a solution.</p>
<p>Quality attributes are the characteristics a system needs to exhibit--qualifications of the system's desired functionality. Quality attributes address customer concerns regarding the degree of success of a system. A customer's concerns about a system are unique and thus preclude having a universal prioritized list of quality attributes.</p>
<p>It is simple to describe the characteristics a customer expects of a system that provides the features they need: Customers expect features that operate without fault or error, operate consistently within expectations, operate within resource constraints, and protect from unauthorized access. Translating that into a collection of system-specific measures is an enormous undertaking that cannot be taken lightly.</p>
<p>Many philosophies about quality attributes (usually termed "quality models") exist, like FURPS, ISO/IEC 9126/25010, McCall, etc. These models detail several categories that organize the many adjectives that can apply to software systems. Some common categories may include Reliable, Efficient, Maintainable, Secure, etc. I view quality attributes as a palette of possible adjectives; no one list is perfect for every situation. There are top-/high-level categorizations that can apply more broadly. When discussing quality attributes, we use the noun form (Reliability vs. Reliable.) I've landed on the following top-level categories (in no particular order): Performance, Operability, Security, and Dependability.</p>
<h2 id="dependability-of-features">Dependability (of features)</h2>
<ul>
<li>To function without fault or error.</li>
</ul>
<p>Dependability involves concerns such as:</p>
<ul>
<li>availability — readiness for usage</li>
<li>reliability — continuity of service</li>
<li>safety — non-occurrence of catastrophic consequences on the environment</li>
<li>confidentiality — non-occurrence of unauthorized disclosure of information</li>
<li>integrity — non-occurrence of improper alterations of information</li>
<li>maintainability — aptitude to undergo repairs and evolution</li>
</ul>
<h2 id="performance-of-execution">Performance (of execution)</h2>
<ul>
<li>To function within resource (time, compute, storage, memory, network) constraints.</li>
</ul>
<p>Performance involves concerns such as:</p>
<ul>
<li>latency — the degree of responsiveness</li>
<li>throughput — the rate at which work can be performed</li>
<li>capacity — the amount of work that can be performed</li>
</ul>
<h2 id="operability-of-function">Operability (of function)</h2>
<ul>
<li>To become and remain operable.</li>
</ul>
<p>Operability involves concerns such as:</p>
<ul>
<li>deployability — the ability of a system to be put into production</li>
<li>monitorability — the ability of a system's health and operation to be monitored</li>
<li>configurability — the ability of a system's behavior to be customized</li>
</ul>
<h2 id="security">Security</h2>
<ul>
<li>To ensure authorized usage.</li>
</ul>
<p>Security involves concerns such as:</p>
<ul>
<li>confidentiality — the quality of a system to restrict access to information</li>
<li>integrity — the quality of a system to adhere to accuracy and consistency of information and behavior</li>
<li>availability — the quality of a system to provide access to information to those authorized</li>
<li>accountability — the quality of a system to account for its actions when fulfilling its responsibilities</li>
</ul>
<p>Because quality attributes address customer concerns, there are overlaps between categories. For example, there's a dependability concern that a loss of data integrity does not impact functionality; there's also a security concern that improper alteration of data doesn't result in data loss. Don't let the overlap distract you from what's best for your solution. Even if you could pre-define the complex structure of quality attributes most suitable for your solution, it would change. Needs change, priorities change, and quality attributes are philosophical influencers of a solution that requires nurturing.</p>
http://blog.peterritchie.com/posts/naming-things-events-and-actionNaming Things - Common Actions and Events2022-09-01T00:00:00Z<p>In this multi-part series on Naming Things, I dig into the benefits of having a clear understanding of common terms and concepts--in this case, common actions and events.</p>
<p>What does Deleted mean? Is it the same as Removed or Destroyed? What if you want to support soft delete as well as hard delete?</p>
<p>I want to be clear; these aren't developer decisions. They're developer problems based on a lack of clarity in the customer's domain. A customer likely won't use terms like "soft delete" and "hard delete." The customer will probably refer to the most common form of delete as "delete." An architect is responsible for teasing out the nuances of meaning into terms that the subject matter experts agree upon and getting consensus on usage with the development team.</p>
<p>Every system involves mutating data and information, yet it can be a common source of confusion regarding naming things. There are multiple types of data changes. Systems can create new data and may add that new data to a collection--physical or logical. Systems may change data, or designers may change the structure of data. Data is often removed--from a particular view or existence.</p>
<p>English allows us to reuse terms to mean many things. "Delete" and "changed," for example. There are well-known terms that enable us to communicate intent and consequences easily. But, when we reuse these terms across different contexts with different intents and consequences, we introduce the possibility of confusion, making naming things seem difficult.</p>
<p>It's important to understand the different intents and resulting consequences to data and attempt to get consensus on names and terms that adequately and uniquely represent these situations.</p>
<p>"Delete" is a common point of confusion. There may be a need for <em>soft deletes</em> and <em>hard deletes</em>; both of which make data inaccessible in a context. But, data may also be moved from one context to another, changing its accessibility but not making it <em>inaccessible</em>. Using a single term like "delete" for all of these situations leads to confusion and issues in naming things.</p>
<!--
An important aspect of naming: name the consistency boundary. What is a date? A year, month, and day. If we included time, is it still a date? Typically that would be called date/time.
-->
<p>Each domain can be different, but the situations I've just described are very common. For those, I start with unique terms for each and work with the subject matter experts to refine them (if needed):</p>
<ul>
<li><strong>Deleted</strong> means something is no longer accessible in some context.</li>
<li><strong>Destroyed</strong> means removed from existence; no possible way to ever get it back.</li>
<li><strong>Removed</strong> means a thing has been moved out of or removed from a container/collection.</li>
</ul>
<p>Inverse terms:</p>
<ul>
<li><strong>Created</strong> signifies something new has come into existence (rather than Added).</li>
<li><strong>Added</strong> signifies something has been added to a container or collection.</li>
</ul>
<p>Mutating information seems like such a simple concept. But, we often need to know if data changes within the context of other data. Unique data mutation terms I start out with when working with subject matter experts:</p>
<ul>
<li><strong>Updated</strong> signifies the value properties or attributes of an existing thing (entity) have been changed.</li>
<li><strong>Changed</strong> signifies an entire thing (entity) has been replaced with another.</li>
</ul>
<h2 id="word-form">Word Form</h2>
<p>Actions are verbs, and events are past participles constructed from verbs. Most event names are constructed from regular verbs by adding the suffix "-ed". Delete + ed: deleted; create + ed: created. Sometimes events are constructed from irregular verbs and don't end in "-ed". Bind: bound; drive: driven; sleep: slept.</p>
<p>Events are not simply verbs in past-tense form. An event's context is that it is related to a subject. For example, the event "deleted" involves a subject and is used to describe the current state as a result of a past action. In grammar, this is <em>present perfect tense</em> and implies an auxiliary verb of "has been." E.g., <em>The customer record has been deleted</em> or <em>The customer record is deleted</em>. Since this details the subject somehow, it is also in a <em>present indicative</em> form. This detail is important, but normally only for edge cases. Normal domain narratives should align with this because that's normal language in these scenarios.</p>
<ul>
<li>Match events to actions; don't define events when no action would result in that event.</li>
<li>Don't assume events always end in "ed."</li>
<li>Events are present indicative, past participles, and in the present perfect tense.</li>
</ul>
<p>Do you have other actions and events that you commonly encounter?</p>
http://blog.peterritchie.com/posts/Message-oriented-Minimal-APIs-in-ASP.NETMessage-oriented Minimal APIs in ASP.NET Core2022-08-28T00:00:00Z<p>TL;DR - <a href="#implementation-details">go to the implementation details</a></p>
<p>First of all, what is message-oriented? Like many things in technology and life, terms like "oriented" are frequently used even though their meaning may not be immediately clear from their usage. Object-oriented, message-oriented, aspect-oriented, etc., have a vague meaning when used, which can sometimes introduce a lack of clarity.</p>
<p>*-Oriented means that a particular concept is always taken into consideration and utilized in the specified circumstances. Message-oriented means that interactions between concepts <em>at a certain layer</em> involve messages. This is sometimes referred to as Message Passing; that is, communication between layer-specific concepts is done by passing messages to each other.</p>
<p>Things can be oriented in many ways at the same time. A system may be simultaneously message-oriented and object-oriented, for example--which usually means that what produces and consumes messages are implemented as <em>objects</em>.</p>
<!--Some may say that the Command Pattern is message-oriented. For the purposes of this article, that's not the complete story. A command can (and should) be a message; but a message on it's own does not constitute message-orientation. Falling back to Message Passing, it's how the command is communicated is what makes a component message-oriented.
## Is HTTP Message-Oriented?
Yes, HTTP is message-oriented. Underlying layers (like TCP) implement HTTP via frames and streams (and packets, datagrams, etc.) but at the HTTP layer, communication between major concepts is done with what is called "messages".
-->
<h2 id="being-message-oriented">Being Message-Oriented</h2>
<p>Message-orientation demands a level of loose coupling. In object-oriented message-orientation, objects don't communicate with each other directly; a third party transports messages from the sender (or producer) to the receiver (or consumer). There are many ways that can happen: queues (or simply collections), mediators, buses, etc. The type of the third-party component depends on the degree of loose coupling and how much work the third party takes on to transport those messages. For the purposes of this article I'll focus on <em>Bus Architecture</em>. Bus Architecture is a combination of a common data model, a common command set, and an infrastructure that provides a shared set of interfaces to transport messages.</p>
<h2 id="being-successfully-message-oriented">Being Successfully Message-Oriented</h2>
<p>As with any layered architecture, implementation details at different layers allow us to compartmentalize different concerns to ease understanding and simplify implementation. Message-oriented systems often classify messages to better implement and support a subdomain. Messages are often classified as commands, events, or documents to better support common subdomain and communication scenarios. Commands are messages that communicate a request or imperative intent. Events are messages that communicate or encapsulate a change in state. And documents are messages that contain data independent of a command or an event.</p>
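<p>A minimal sketch of those classifications as marker interfaces (my own illustration; it only mirrors the <code>ICommand</code>/<code>IEvent</code> shapes used later in this post, and <code>IMessage</code>/<code>IDocument</code> are assumed names):</p>

```csharp
using System;

public interface IMessage
{
    string CorrelationId { get; }        // correlates related messages
}

public interface ICommand : IMessage { } // a request or imperative intent

public interface IEvent : IMessage       // communicates a change in state
{
    DateTime OccurredDateTime { get; }
}

public interface IDocument : IMessage { } // data independent of command/event
```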
<h2 id="why-minimal-apis">Why Minimal APIs?</h2>
<p>You may have read articles like "<a href="https://ardalis.com/mvc-controllers-are-dinosaurs-embrace-api-endpoints/">MVC Controllers are Dinosaurs - Embrace API Endpoints</a>" that suggest that modern MVC implementations have controllers that don't actually "control" anything. MVC details that Models, Views, and Controllers are decoupled from one another and cohesive in and of themselves. Controllers should interpret input and convert it into invocations upon the model and the view. Models are dynamic data structures that directly manage data, logic, and rules for a given context. When MVC was devised, the controller was far closer to the user and took on more responsibility to imperatively translate and route data. With modern systems and technologies like JSON and programming-language syntax, much of that translation and routing can be declarative--wiring up a request, its route, and the receiver of the command directly.</p>
<table class="table">
<thead>
<tr>
<th style="text-align: center;">As an aside, many have argued that <em>model</em> and <em>view</em> have been muddied and that <em>view</em> doesn't exist with RESTful APIs, questioning "MVC" implementations altogether.</th>
</tr>
</thead>
</table>
<p>When you don't really have a controller and data translation occurs under the hood, going through the motions of controllers and models with core subdomain objects is viewed as needless ceremony.</p>
<h2 id="message-oriented-frameworks">Message-Oriented Frameworks</h2>
<p>I've been working with a simple-but-no-simpler messaging library for several years. It's a set of libraries that I maintain called <a href="https://github.com/peteraritchie/Messaging"><code>PRI.Messaging</code></a>. It consists of primitive types (abstractions) (<code>PRI.Messaging.Primitives</code>) and pattern implementations (<code>PRI.Messaging.Patterns</code>). It makes ideas like consumers, producers, and buses first-class concepts. <code>PRI.Messaging.Patterns</code> includes a bus implementation that assumes the role of dependency injection and message routing, allowing you to simply create message producers and message consumers and have them automatically wired up and messages routed appropriately.</p>
<p>I'll be using these libraries to implement message-oriented minimal APIs in ASP.NET Core 6+.</p>
<h2 id="implementation-details">Implementation Details</h2>
<p>For simplicity, I'll show making the default project created for minimal APIs message-oriented; get ready for some weather forecasting.</p>
<p>Starting with creating the default project:</p>
<pre><code class="language-powershell">dotnet new webapi -minimal -o WebApi
dotnet new sln -n example
dotnet sln example.sln add WebApi
</code></pre>
<p>This gives us OpenAPI (Swagger) and HTTPS support along with a single <code>weatherforecast</code> endpoint and a <code>WeatherForecast</code> response model (message).</p>
<p>The important code from Program.cs:</p>
<pre><code class="language-csharp">var summaries = new[]
{
    "Freezing", "Bracing", "Chilly", "Cool", "Mild", "Warm", "Balmy", "Hot", "Sweltering", "Scorching"
};

app.MapGet("/weatherforecast", () =>
{
    var forecast = Enumerable.Range(1, 5).Select(index =>
        new WeatherForecast
        (
            DateOnly.FromDateTime(DateTime.Now.AddDays(index)),
            Random.Shared.Next(-20, 55),
            summaries[Random.Shared.Next(summaries.Length)]
        ))
        .ToArray();
    return forecast;
})
.WithName("GetWeatherForecast")
.WithOpenApi();

app.Run();

record WeatherForecast(DateOnly Date, int TemperatureC, string? Summary)
{
    public int TemperatureF => 32 + (int)(TemperatureC / 0.5556);
}
</code></pre>
<p>To become message-oriented we need to first create explicit messages to represent the interactions. For this I've created a <code>GetWeatherForecastCommand</code> command and a <code>WeatherForecastedEvent</code> event:</p>
<pre><code class="language-csharp">public class GetWeatherForecastCommand : ICommand
{
    public string CorrelationId { get; set; } = Guid.NewGuid().ToString();
}
</code></pre>
<pre><code class="language-csharp">public class WeatherForecastedEvent : IEvent
{
    public WeatherForecast[] Forecasts { get; }

    public WeatherForecastedEvent(WeatherForecast[] forecasts)
    {
        Forecasts = forecasts;
    }

    public DateTime OccurredDateTime { get; set; } = DateTime.UtcNow;
    public string CorrelationId { get; set; } = Guid.NewGuid().ToString();
}
</code></pre>
<p>Now we need something that explicitly handles (consumes) the <code>GetWeatherForecastCommand</code> command and produces the <code>WeatherForecastedEvent</code> event. I prefer to call these types of things "Command Handlers." So:</p>
<pre><code class="language-csharp">public class GetWeatherForecastCommandHandler : IConsumer<GetWeatherForecastCommand>, IProducer<WeatherForecastedEvent>
{
    private IConsumer<WeatherForecastedEvent> consumer = new ActionConsumer<WeatherForecastedEvent>((_) => { });

    public void AttachConsumer(IConsumer<WeatherForecastedEvent> consumer)
    {
        this.consumer = consumer;
    }

    private readonly string[] summaries = new[]
    {
        "Freezing", "Bracing", "Chilly", "Cool", "Mild", "Warm", "Balmy", "Hot", "Sweltering", "Scorching"
    };

    public void Handle(GetWeatherForecastCommand message)
    {
        var forecasts = Enumerable.Range(1, 5).Select(index =>
            new WeatherForecast
            (
                // WeatherForecast takes a DateOnly, so convert from DateTime
                DateOnly.FromDateTime(DateTime.Now.AddDays(index)),
                Random.Shared.Next(-20, 55),
                summaries[Random.Shared.Next(summaries.Length)]
            ))
            .ToArray();
        consumer.Handle(new WeatherForecastedEvent(forecasts));
    }
}
</code></pre>
<p>As you'll see in this command handler, I've moved the implementation details from Program.cs and encapsulated them into this class (i.e., <code>summaries</code> and the creation of the <code>WeatherForecast</code> array).</p>
<p>Returning to Program.cs, we now need to inject a <code>Bus</code> service and update the route endpoint to accept a bus instance, translate to our command, and send it to the bus.</p>
<pre><code class="language-csharp">app.MapGet("/weatherforecast", async (IBus bus) =>
{
    WeatherForecastedEvent result =
        await bus.RequestAsync<GetWeatherForecastCommand, WeatherForecastedEvent>(
            new GetWeatherForecastCommand());
    return Results.Ok(result.Forecasts);
})
.WithName("GetWeatherForecast")
.WithOpenApi();
</code></pre>
<p>The <code>IBus</code> <code>RequestAsync</code> extension method implements the asynchronous request-reply pattern.</p>
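<p>For readers curious how that pattern works, here is a simplified sketch of my own (it is <em>not</em> <code>PRI.Messaging</code>'s actual implementation, and it omits correlation-ID matching for brevity): attach a one-shot reply consumer, send the request, and let a <code>TaskCompletionSource</code> bridge the reply back to the awaiting caller:</p>

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical request-reply helper; 'send' dispatches the request message and
// 'subscribe' attaches a consumer for the reply message.
static async Task<TReply> RequestAsync<TRequest, TReply>(
    Action<TRequest> send,
    Action<Action<TReply>> subscribe,
    TRequest request)
{
    var tcs = new TaskCompletionSource<TReply>();
    subscribe(reply => tcs.TrySetResult(reply)); // one-shot reply consumer
    send(request);                               // dispatch the request
    return await tcs.Task;                       // await the reply
}
```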
<p>With our messages, consumers, and producers we can now create a message bus singleton and have it wire-up the producers and the consumers. This is simply done by invoking the <code>IBus.AddHandlersAndTranslators</code> method in addition to registering a singleton bus:</p>
<pre><code class="language-csharp">IBus bus = new Bus();
bus.AddHandlersAndTranslators(
    Path.GetDirectoryName(typeof(Program).Assembly.Location)!,
    Path.GetFileName(typeof(Program).Assembly.Location),
    "");
builder.Services.AddSingleton(bus);
</code></pre>
<h2 id="summary">Summary</h2>
<p>As you can see, the route endpoint has an <code>IBus</code> injected into it (i.e., Dependency Injection) and is only concerned with sending a <code>GetWeatherForecastCommand</code> message and receiving a <code>WeatherForecastedEvent</code> message. Where that command goes and where the event comes from (and how it gets created) are irrelevant (i.e., the endpoint neither knows nor cares about <code>GetWeatherForecastCommandHandler</code>). With the implementation details of weather forecasting moved out into <code>GetWeatherForecastCommandHandler</code>, those details are no longer directly coupled to a web API. <code>GetWeatherForecastCommandHandler</code> exists in its own library and can be used by several types of applications. <code>GetWeatherForecastCommandHandler</code> could be used, as is, within a console application, a PowerShell CmdLet, etc. As with any well-designed, loosely coupled system, it's just a matter of correctly setting up a service container for the specific circumstances.</p>
<p>How will you use message-orientation?</p>
<h2 id="references">References</h2>
<ul>
<li><a href="https://ardalis.com/mvc-controllers-are-dinosaurs-embrace-api-endpoints/">MVC Controllers are Dinosaurs - Embrace API Endpoints | Ardalis Blog</a></li>
<li><a href="https://github.com/peteraritchie/Messaging">PRI.Messaging - GitHub</a></li>
<li><a href="https://www.nuget.org/packages/PRI.Messaging.Primitives/3.0.0-beta">PRI.Messaging.Primitives - Nuget</a></li>
<li><a href="https://www.nuget.org/packages/PRI.Messaging.Patterns/3.0.0-beta">PRI.Messaging.Patterns - Nuget</a></li>
<li><a href="https://docs.microsoft.com/en-us/azure/architecture/patterns/async-request-reply">Asynchronous Request-Reply Pattern - Azure Architecture Center</a></li>
<li><a href="https://www.enterpriseintegrationpatterns.com/RequestReply.html">Request-Reply Messaging Pattern - Enterprise Integration Patterns</a></li>
</ul>
http://blog.peterritchie.com/posts/data-urls-in-markdown Data URLs in Markdown2022-06-27T00:00:00Z<ul>
<li>Data URLs embed data within the URI instead of being a link</li>
<li>Data URLs can be used to embed images into a web page</li>
<li>Data URLs can be used for images in markdown</li>
</ul>
<p><a href="#data-urls">TL;DR</a>⮷</p>
<h2 id="background">Background</h2>
<h3 id="links-in-markdown">Links in Markdown</h3>
<p>URLs can be used in markdown as a hyper-link to another location (another page, or an anchor in the current page, or both.) These links in markdown come in two varieties: "conventional" and reference-style. Conventional links have the format <code>[text](url)</code>.</p>
<p>Reference-style links have a two-part format. The first is the reference declaration which consists of a name and the URL. For example:</p>
<pre><code class="language-markdown">[twitter]: https://twitter.com/peterritchie
</code></pre>
<p>Typically the reference declarations appear at the end of the markdown file.</p>
<p>The second part of the format is slightly different from a conventional link: <code>[Visible Text][reference-name]</code> (notice that both parts are enclosed in square brackets).</p>
<p>For example: <code>[Me@Twitter][twitter]</code>. Which would result in <a href="https://twitter.com/peterritchie">Me@Twitter</a>.</p>
<p>That reference can be used any number of times within markdown by referencing the reference name.</p>
<h3 id="images-in-markdown">Images in Markdown</h3>
<p>An image in markdown is a special kind of link; it follows the same format as other links <em>except</em> that it starts with an exclamation mark (!) and has an optional title-text enclosed in quotes. For example <code>![alt-text](url "title-text")</code>. Since the image is what is visible, the title text portion of the link is text shown when hovering over the image and the alternative text is used by accessibility features.</p>
<h2 id="data-urls">Data URLs</h2>
<p>There exists the ability to encode content within a URL (so the URL is actually the "response" in the conventional URI request scenario). <a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/Data_URLs">Data URLs</a> (formally known as Data <em>URIs</em>) use the <code>data:</code> URI scheme, followed by an optional media type and an optional base64 extension (<code>;base64</code>), followed by the data (if the <code>base64</code> extension is used, the data is binary and is base-64 encoded).</p>
<p>For example, with binary data for a red dot png a data URL may look like this:</p>
<pre><code>data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAUAAAAFCAYAAACNbyblAAAAHElEQVQI12P4//8/w38GIAXDIBKE0DHxgljNBAAO9TXL0Y4OHwAAAABJRU5ErkJggg==
</code></pre>
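<p>Constructing a data URL from raw bytes is mechanical; here's a sketch in C# (my own illustration; the <code>ToDataUrl</code> name is assumed):</p>

```csharp
using System;

// Builds a data URL: "data:" + media type + ";base64," + base-64 payload.
static string ToDataUrl(byte[] bytes, string mediaType) =>
    $"data:{mediaType};base64,{Convert.ToBase64String(bytes)}";
```

<p>Passing a PNG's bytes with media type <code>image/png</code> yields a URL like the red-dot example above.</p>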
<h3 id="images-data-urls-and-reference-style-markdown-links">Images, Data Urls, and Reference-Style Markdown Links</h3>
<p>Data URLs may be used in markdown image links. With image markdown format and a data URL:</p>
<pre><code class="language-text">![a red dot](data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAUAAAAFCAYAAACNbyblAAAAHElEQVQI12P4//8/w38GIAXDIBKE0DHxgljNBAAO9TXL0Y4OHwAAAABJRU5ErkJggg==)
</code></pre>
<p>... and result in: <img src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAUAAAAFCAYAAACNbyblAAAAHElEQVQI12P4//8/w38GIAXDIBKE0DHxgljNBAAO9TXL0Y4OHwAAAABJRU5ErkJggg==" class="img-fluid" alt="a red dot" title="The Image" /></p>
<p>...or with title text:</p>
<pre><code class="language-text">![a red dot](data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAUAAAAFCAYAAACNbyblAAAAHElEQVQI12P4//8/w38GIAXDIBKE0DHxgljNBAAO9TXL0Y4OHwAAAABJRU5ErkJggg== "The Image")
</code></pre>
<p>... and result in: <img src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAUAAAAFCAYAAACNbyblAAAAHElEQVQI12P4//8/w38GIAXDIBKE0DHxgljNBAAO9TXL0Y4OHwAAAABJRU5ErkJggg==" class="img-fluid" alt="a red dot" title="The Image" /></p>
<p>Or, with a reference-style image link:</p>
<pre><code class="language-markdown">![a red dot][red-dot]
[red-dot]: data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAUAAAAFCAYAAACNbyblAAAAHElEQVQI12P4//8/w38GIAXDIBKE0DHxgljNBAAO9TXL0Y4OHwAAAABJRU5ErkJggg==
</code></pre>
<p>... and result in: <img src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAUAAAAFCAYAAACNbyblAAAAHElEQVQI12P4//8/w38GIAXDIBKE0DHxgljNBAAO9TXL0Y4OHwAAAABJRU5ErkJggg==" class="img-fluid" alt="a red dot" /></p>
<p>Data URLs are a handy way to reduce the number of files involved in a page. Like any feature, they can be overused, so the value comes when working with small chunks of data (like a red dot image).</p>
<h2 id="references">References</h2>
<p><a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/Data_URLs">Data URLs</a></p>
<p><a href="https://en.wikipedia.org/wiki/Data_URI_scheme">Data URI Scheme (wikipedia)</a></p>
http://blog.peterritchie.com/posts/agile-off-the-railsAs a Delivery Team Member, I Want To Know if My Organization's Agile Initiative Is off the Rails2022-06-19T00:00:00Z<p><img src="/assets/adult-gd72730acb_1920.jpg" class="img-fluid" alt="concentration" /></p>
<p>Image by <a href="https://pixabay.com/users/pexels-2286921/?utm_source=link-attribution&utm_medium=referral&utm_campaign=image&utm_content=1850268">Pexels</a> from <a href="https://pixabay.com/?utm_source=link-attribution&utm_medium=referral&utm_campaign=image&utm_content=1850268">Pixabay</a></p>
<p>Or, <em>as a delivery team member, I want to know if my organization's agile initiative is off the rails, so that I may compensate for it</em>.</p>
<p>I have been an agile team member (delivery, engineering) in many organizations. There is a spirit to any defined agile process, a spirit that addresses known time-to-market and quality fallacies. Agile processes are like any good guidance; they are based on experience and techniques proven to address known problems.</p>
<p>First, some context:</p>
<p><strong>Excellence</strong> is not about being perfect but about recognizing and exploiting opportunity.</p>
<p>A <strong>Project</strong> is a temporary, planned effort to achieve a particular aim.</p>
<p>A <strong>Sprint</strong> is a time-boxed period where a team works to deliver usable functionality to stakeholders.</p>
<p>Goals and objectives are often <strong>Means Goals</strong> and <strong>Means Objectives</strong>, signifying they are a catalytic end to realize another end.</p>
<p>Objectives that are meant to accomplish other objectives exist in a continuum of objectives called <strong>Cascading Objectives</strong>.</p>
<p>A <strong>Key Result</strong> is not qualitatively measured ("done," "improved," etc.); it is measured quantitatively ("improved by 25%," "decreased by a factor of 2," etc.).</p>
<p>Agile methodologies embody a continuum of purpose, motivation, and improvement. Purpose, motivation, and improvement are not team-specific concepts but are organization-wide constructs. This continuum is the result of the act of leadership.</p>
<p>Over the years, I have witnessed many patterns of behavior that have resulted in failed agile delivery. Following are some of those common practices.</p>
<h2 id="agile-is-the-only-process">Agile Is the Only Process</h2>
<p>Agile's raison d'être is to deliver value to the stakeholders. Agile is a project management technique; no enterprise devotes 100% of its resources to projects. An enterprise has a purpose for existing (their <em>why</em>, the vision) and has a current means to achieve that purpose (their mission). Any effort not gauged by whether it satisfies the overall mission (and thus aligns with the purpose) can only succeed accidentally. Planning to succeed accidentally is not planning, and that sort of "planning" is a waste of time. Agile organizations put effort into addressing assumptions before planning value delivery.</p>
<p>You know an organization is working against itself and pretending to be agile when:</p>
<ul>
<li>100% of engineering time is devoted to "sprints."</li>
<li>Sprints are planned correctly 100% of the time and are never canceled due to change.</li>
<li>A project plan does not focus on an operational outcome.</li>
<li>Spikes are exceedingly rare.</li>
<li>No stakeholder has communicated what they value.</li>
<li>Stakeholders do not declare a sprint's usable functionality; the delivery team declares it.</li>
</ul>
<h2 id="no-one-has-okr-training-or-expertise">No One Has OKR Training or Expertise</h2>
<p>Goals and objectives are easily understood conceptually but are hard to implement in reality. One impetus behind OKRs is to recognize and address that. Objectives and deliverables are consequents of goals; they are the means to a larger end while still being an end in and of themselves. In isolation, objectives and deliverables are meaningless and, like any other misguided activities, detract from the purpose of accomplishing them. OKRs attempt to associate goal-oriented <em>key results</em> with individual objectives, objectives that cascade from higher-level objectives.</p>
<p>You know OKRs are in name only when:</p>
<ul>
<li>Objectives do not cascade from higher-level objectives</li>
<li>Key results are an action, not an event</li>
<li>Key results are not measurable</li>
</ul>
<p>To be honest, I've only ever seen failed implementations of OKRs--OKRs more often address perceived delivery failures rather than leadership failures. That is, they are a management technique rather than a result of leadership.</p>
<h2 id="leadership-in-name-only">Leadership in Name Only</h2>
<p>A manager creates and judges the attainment of goals (doing things right). A leader communicates and cultivates purpose and vision (doing the right things).</p>
<p>You know leadership is in name only when:</p>
<ul>
<li>There are only "leaders" and no "managers."</li>
<li>"Leaders" who are late to every meeting.</li>
<li>Activities are judged, not outcomes.</li>
<li>Goals and objectives are only ever qualitative, not quantitative.</li>
<li>Personal improvement is not a measured performance metric.</li>
</ul>
<p>Of course, I could go on. There are many more examples and many bad practices. I'd love to hear about what you've witnessed and your thoughts on these and other bad practices.</p>
<p><img src="/assets/adult-gd72730acb_1920.jpg" class="img-fluid" alt="concentration"></p>http://blog.peterritchie.com/posts/Environment-Variables-with-CSharp-Conditional-Compilation-SymbolsEnvironment Variables with C# Conditional Compilation Symbols2019-12-12T00:00:00Z<p>Have you ever thought it would be nice to have a symbol like <code>PETERRIT</code>, unique to your domain account, that you could use for code that YOU may be working on but don't want to break the build?</p>
<p>I occasionally think I would like to do this:</p>
<pre><code class="language-csharp">#if PETERRIT
public class VolatileExperiment
{
    //...
}
#endif
</code></pre>
<p>When I think of this, I go look at the docs or on Stack Overflow, but I never find anything that allows me to do it.</p>
<p>I had that thought recently and poked around in the Project Settings for a few minutes to see what's going on. Interestingly, <code>"%USERNAME%"</code> causes an error, but doesn't break the build.</p>
<p>Damn, I thought. But % is so... DOS; maybe they use a different delimiter. So I stuck in <code>${USERNAME}</code>. Nope. Then I thought: wait, macros in build events have a specific format! I entered <code>$(USERNAME)</code> and, lo and behold, <strong>it worked!</strong></p>
<p>It's a little wonky, though: in the Project Settings it shows the expanded variable (<code>PETERRIT</code>), but in the project file it shows the macro reference (<code>$(USERNAME)</code>). I can see the macro reference getting overwritten from time to time.</p>
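<p>For reference, here's a minimal sketch of what the project file ends up containing; the <code>Condition</code> and surrounding layout are illustrative assumptions, not copied from an actual project:</p>
<pre><code class="language-xml">&lt;PropertyGroup Condition="'$(Configuration)' == 'Debug'"&gt;
  &lt;!-- $(USERNAME) expands at build time, so each developer gets
       a conditional compilation symbol matching their account name --&gt;
  &lt;DefineConstants&gt;$(DefineConstants);$(USERNAME)&lt;/DefineConstants&gt;
&lt;/PropertyGroup&gt;
</code></pre>
<p>One caveat: compilation symbols must be valid identifiers, so this only works cleanly for usernames without spaces or punctuation.</p>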
<p>Enjoy!</p>
http://blog.peterritchie.com/posts/RESTful-VersioningRESTful Versioning2019-11-12T00:00:00Z<p>Versioning is not new. Versioning seems to be one of those things that people find hard to do or difficult to fully understand, especially with services and APIs. RESTful versioning seems to be in the realm of Tabs v Spaces, but I want to detail my related observations (mostly of others' writings, but with some added color).</p>
<h2 id="what-is-a-version">What is a Version?</h2>
<p>Before going further, I find it helpful to define terms so their meaning is explicit and understood. <em>Version</em> is no exception.</p>
<p>A <em>version</em> recognizes a change to <em>something</em> already established and assigns it a unique identity. That identity serves as a moniker for <em>what</em> changed so that, when the <em>something</em> that changed is processed, it can be differentiated from other <em>somethings</em> of different versions.</p>
<h2 id="why-do-we-need-a-version">Why Do we Need a Version?</h2>
<p>Based on what a version <em>is</em>, it may seem easy to at least deduce the <em>why</em>. That deduction is usually <em>to differentiate different versions of things</em>. But this is <em>what</em>, not <em>why</em>; and this is the part that many people seem to dismiss or let slip by. In this context the knee-jerk response is "different versions of the API" (API Versioning). But this simply restates what versioning is. It's similar to defining "version" as</p>
<blockquote class="blockquote">
<p>A version is the version of something in relation to other versions of the same thing.</p>
</blockquote>
<p>Yeah, using the word you're defining in the definition is <em>helpful</em>. "Version" must provide:</p>
<h3 id="support-for-past-representations">Support For Past <em><strong>Representations</strong></em></h3>
<p>The major reason versioning comes into play is because any one <em>representation</em> of something evolves over time. Needs change, understanding improves, technology evolves, imperfections are found, etc. and how something is stored or communicated needs to change to accommodate that evolution.</p>
<p>"Requirements" are an obvious agent of change, and it would be easy to provide a trivial requirements example, but a <em>fixing an imperfection</em> example is more persuasive. Humans like to be open-minded, but inherently we live in our own worlds (our own mental models of the world). Some of us are empathic and recognize parts of the worlds of the people around us. Or we know about a set of archetypes for which we can optimize interaction. But in reality there are 8+ billion other worlds out there, and it's simply not humanly possible to know the intricacies of each. Which means we make assumptions and trade-offs about what is acceptable to all of those other worlds. Usually our audience isn't all 8+ billion people, so we're generally more correct than incorrect in our assumptions. But being incorrect is inevitable and expected. Much like we need to support many personalities, preferences, and needs, we also need to:</p>
<h3 id="support-multiple-representations-of-concepts">Support Multiple <em>Representations</em> of Concepts</h3>
<p>An example of this type of imperfection is date/time representations. We live in our own <em>locus</em> (which is like a personal <em>locale</em>) and take for granted things we use or do in our locus from day to day, like date/time representations. Local time has worked for each of us all our lives, so we take it for granted and use it in a representation without thinking.</p>
<p>There are many things that make this problematic and error-prone. I won't detail what <em>all of those</em> may be (a blog isn't the place for a tome like that). The <em>fix</em> is, of course, to use a <em>different representation</em>. Implementing that fix while supporting existing representations of complex data means we need to be able to tell different representations of the same data apart.</p>
<p>This allows us to know how to translate each representation to the same in-memory structure when we (i.e., our code) encounter these representations; reinforcing that <em>representations</em> differ from the <em>conceptual</em> resource <em>and</em> from the <em>implementation</em> translation of the representation.</p>
<p>We need the ability to translate <em>multiple representations</em> because there will be instances of differing representations <em>in the wild</em> at any given time.</p>
<h3 id="support-multiple-active-representation-versions">Support Multiple <em>Active</em> Representation Versions</h3>
<p>Increasingly we work in asynchronous environments where communication of data is off-loaded to asynchronous communication technology (like queues, topics, and even threads). And with the drive to 100% availability, this means that we perform software updates <em>in situ</em>, without taking our systems completely off-line (or unavailable). For example, <em>indirect</em> communication through a queue makes communication of the message independent of the processes: until it reaches and is consumed by the destination, the message can sit in a queue with neither process executing. This requires components to support <strong>at least</strong> two versions of a representation <em>at the same time</em>.</p>
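<p>To make that concrete, here's a hedged sketch in C#; the <code>QueueMessage</code>, <code>Person</code>, and <code>FromV1</code>/<code>FromV2</code> names are hypothetical, not from any real system:</p>
<pre><code class="language-csharp">// A consumer must translate whichever representation version happens
// to be sitting in the queue into the single in-memory structure the
// rest of the code uses.
public Person Translate(QueueMessage message) =>
    message.Version switch
    {
        "1.0.0" => FromV1(message.Body), // older shape, e.g. local-time birth date
        "2.0.0" => FromV2(message.Body), // newer shape, e.g. UTC birth date
        _ => throw new NotSupportedException(
            $"Unknown representation version: {message.Version}"),
    };
</code></pre>
<p>During a rolling update, both versions are legitimately in flight, so removing the old case is only safe once the queue is known to be drained of older messages.</p>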
<h2 id="attributes-of-rest">Attributes of REST</h2>
<p>Since we're in the context of <strong>Re</strong>presentational <strong>S</strong>tate <strong>T</strong>ransfer, where resource <em>representation</em> is front-and-center as well as the <em>state</em> of that resource, the following is a review of the main features of RESTful services:</p>
<ul>
<li>REST is not a distributed object style <a href="https://www.ics.uci.edu/%7Efielding/pubs/dissertation/rest_arch_style.htm#sec_5_3_1">3</a> <!--(5.2.1-1)--></li>
<li>A resource identifier (URI/URN) is a reference to a particular conceptual resource, not to a particular representation of it. <a href="https://www.ics.uci.edu/%7Efielding/pubs/dissertation/rest_arch_style.htm#sec_5_2_1_1">1</a> <!--(5.2.1.1-4)--></li>
<li>A representation of a resource is transferred between REST components. <a href="https://www.ics.uci.edu/%7Efielding/pubs/dissertation/rest_arch_style.htm#sec_5_2_1_1">1</a> <!--(5.2.1.1-4)--></li>
<li>A resource maps to a set of entities that varies over time, not just the representation at the moment <a href="https://www.ics.uci.edu/%7Efielding/pubs/dissertation/rest_arch_style.htm#sec_5_2_1_1">2</a> <!--(5.2.1.1-2)--></li>
<li>The set of entities that are mapped to a resource are considered equal (by resource identifier and/or representation). <a href="https://www.ics.uci.edu/%7Efielding/pubs/dissertation/rest_arch_style.htm#sec_5_2_1_1">2</a> <!--(5.2.1.1-2)--></li>
<li>The semantics of mapping a resource to an entity distinguishes one resource from another and is constant. <a href="https://www.ics.uci.edu/%7Efielding/pubs/dissertation/rest_arch_style.htm#sec_5_2_1_1">2</a> <!--(5.2.1.1-2)--></li>
<li>Representations are late-bound and based on characteristics of the request. <a href="https://www.ics.uci.edu/%7Efielding/pubs/dissertation/rest_arch_style.htm#sec_5_2_1_1">2</a> <!--(5.2.1.1-4)--></li>
<li>An identifier may exist without, or before, any realized representations. <a href="https://www.ics.uci.edu/%7Efielding/pubs/dissertation/rest_arch_style.htm#sec_5_2_1_1">2</a> <!--(5.2.1.1-2)--></li>
</ul>
<h2 id="restful-versioning-options"><em>RESTful</em> Versioning Options</h2>
<p>A quick review of objectives, for any given resource representation:</p>
<ul>
<li>we need to differentiate change independently from unrelated representations</li>
<li>we need to differentiate different changes to related representations at the same time</li>
</ul>
<p>There are really only two fundamental options for <em>API Versioning</em> (I didn't use <em>RESTful Versioning</em> for reasons I hope will become clear):</p>
<ul>
<li>Version moniker in the URI/URN, or</li>
<li>Version moniker in the headers (or media-type)</li>
</ul>
<h3 id="uriurn">URI/URN</h3>
<p>In REST, a URL/URI <strong>only</strong> identifies a resource, it is not a content-type identifier. One reason for this is <em>Content Negotiation</em>. Content negotiation details that in the <strong>request for any particular resource</strong>, the <em>representation</em> of the resource (the response) can be negotiated through headers and responses. That negotiation occurs through the single URL/URI.</p>
<p>That is, the response format does not need to be consistent per URL/URI. <a href="https://tools.ietf.org/html/rfc7231#section-3.4">4</a> If it's not clear, supporting multiple representations means there can be <em>many</em> response formats for any single URI/URN. Since we've already shown that each representation is independent of the others, <strong>multiple versions need to be represented per URL</strong>. This makes something like <code>example.com/picture/v2</code> for the SVG format meaningless. Therefore URI/URN versioning doesn't support some fundamental REST features.</p>
<h3 id="media-types-headers">Media Types (Headers)</h3>
<p>Media types are monikers for a particular representation. XML and JSON, or JPEG and PNG, are examples of particular representations of the same resource. But media types are more complex than that.
Media types can consist of a registered type (<code>application</code>, <code>audio</code>, <code>example</code>, <code>font</code>, <code>image</code>, <code>message</code>, <code>model</code>, <code>multipart</code>, <code>text</code>, and <code>video</code>), a subtype (the registered format in the standards tree, or a dot-delimited subtype tree), a suffix (prefixed with <code>+</code>), and optional parameters (key/optional-value pairs prefixed with <code>;</code>). Suffixes can be used to specify the underlying <em>structure</em> of a type/subtype, e.g. JSON and XML. Formats like SVG can be either textual or binary, so although SVG is an image, simply specifying <code>image/svg</code> is not enough to cover both of those structures. The media type for the XML format of SVG ends up being <code>image/svg+xml</code>. Application-specific types use the <code>application</code> type and a subtype in the vendor tree (<code>vnd</code>). A custom application format for a <em>person</em> resource that uses XML would have a media type of <code>application/vnd.person+xml</code>. If the service also supports JSON, it would have another media type, <code>application/vnd.person+json</code>. <code>charset</code> is a reserved parameter; other parameters have unique meaning defined within the type/subtype. For example, <code>text/html; charset=UTF-8</code> and <code>application/vnd.person+json; version=2.0.0</code>.</p>
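<p>Tying this back to versioning, a hypothetical exchange (the host, path, and vendor media type are invented for illustration) negotiates the representation version entirely through headers; the resource identifier never changes:</p>
<pre><code class="language-http">GET /clients/42 HTTP/1.1
Host: api.example.com
Accept: application/vnd.person+json; version=2.0.0

HTTP/1.1 200 OK
Content-Type: application/vnd.person+json; version=2.0.0

{ "name": "Pat", "birthDateUtc": "1980-01-01T05:00:00Z" }
</code></pre>
<p>A client that still needs the older representation simply sends <code>version=1.0.0</code> in its <code>Accept</code> header against the same URI.</p>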
<p>The takeaway is that <strong>media types are independent of the endpoint</strong>.</p>
<h2 id="wrapping-up">Wrapping Up</h2>
<p>It may make sense to think that the resource is changing, but in reality it is the representation that changes. The resource is abstract, like <em>client</em>. Changing Birth Date from local time to UTC doesn't change the fact that the <strong>resource is still a <em>client</em></strong>. If the resource fundamentally changes, that's when you change the URL/URI; not with a version identifier, but with a new <em>resource</em>. If something previously considered a "client" changes so that it is conceptually no longer a "client," then a new URI/URN should be used (like "client" to "lead"). <em>We wouldn't have different versions of clients, merely different representations of client information</em>.</p>
<h2 id="tldr">TL;DR</h2>
<p>A URI/URN is a reference to a single conceptual resource, not to a particular representation. Since media types <em>are</em> the data format of the representation, and the same conceptual resource has representations that can change, <strong>media types must be used to specify the different representations that a single URI/URN supports</strong>.</p>