
The Strangler Fig Pattern: Safely Migrating Legacy Systems

February 22, 2024·15 min read·by Bishwambhar Sen
A legacy monolith with a routing facade in front and a growing modern service cluster around its edges

The strangler fig is a tropical vine that germinates in the canopy of a host tree, grows its roots downward, and over years gradually encloses and replaces the host. The host may remain structurally present inside the fig even after the fig has taken over all the load-bearing function. Martin Fowler borrowed this metaphor in 2004 to describe a migration strategy for legacy systems: rather than attempting a big-bang rewrite, you wrap the legacy system in a facade, route new features to the replacement system, and progressively migrate existing features, allowing the old system to be strangled while the new one grows.

The metaphor is useful. What the metaphor glosses over is that strangling a real production system involves routing decisions, dual-write synchronization, schema translation, and anti-corruption layers that can themselves become the next legacy problem if not engineered carefully. This post works through the mechanics of each.

Concept

The Routing Facade

The facade sits in front of both the legacy system and the new system and makes a routing decision for each request. In its simplest form, the facade is a reverse proxy (NGINX, an API gateway, or custom ASP.NET Core middleware) that routes requests by feature flag, path prefix, customer segment, or explicit routing rules.

The facade is the only component that is aware of both systems simultaneously. The legacy system and the new system do not reference each other. They coexist behind a common interface that the facade presents to all callers, maintaining a stable external API surface while the internal implementation changes.

There are two routing strategies:

Feature-first routing: Each capability is assigned to either the legacy or new system. Requests for migrated features go to the new system; requests for unmigrated features go to legacy. The migration progresses feature by feature.

User-segment routing: A percentage or cohort of users is routed to the new system. All of their requests — migrated and unmigrated — hit the new system. The new system must be able to handle unmigrated features (typically by proxying them through to legacy or by having completed the migration of all features required by the segment).

Feature-first routing is operationally simpler but requires that the two systems can be used independently per feature. User-segment routing is more powerful for UI-heavy migrations where the user experience must be coherent, but requires the new system to be more complete before it can receive any traffic.
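
The difference between the two strategies comes down to which input drives the decision. The following is a minimal sketch, not the router used later in this post: `RoutingRules`, `StranglerRouter`, and the percentage-based cohort are assumptions made for illustration. It assumes .NET 5 or later for `SHA256.HashData` and `IReadOnlySet<T>`.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;

public enum RoutingTarget { Legacy, NewSystem }

// Hypothetical rule set: which features are migrated, and what share of users
// is sent to the new system regardless of feature.
public sealed record RoutingRules(
    IReadOnlySet<string> MigratedFeatures,
    int NewSystemUserPercent);

public static class StranglerRouter
{
    // Feature-first: the decision depends only on the capability being requested.
    public static RoutingTarget RouteByFeature(RoutingRules rules, string featureKey) =>
        rules.MigratedFeatures.Contains(featureKey)
            ? RoutingTarget.NewSystem
            : RoutingTarget.Legacy;

    // User-segment: the decision depends only on the user, so every request
    // from the same user lands on the same system.
    public static RoutingTarget RouteByUser(RoutingRules rules, string userId) =>
        Bucket(userId) < rules.NewSystemUserPercent
            ? RoutingTarget.NewSystem
            : RoutingTarget.Legacy;

    // Maps a user id to a stable bucket in [0, 100). string.GetHashCode is
    // randomized per process, so a cryptographic hash is used for stability.
    public static int Bucket(string userId)
    {
        byte[] hash = SHA256.HashData(Encoding.UTF8.GetBytes(userId));
        return (int)(BitConverter.ToUInt32(hash, 0) % 100);
    }
}
```

Because the bucket is derived from the user id rather than from a random draw, raising `NewSystemUserPercent` only ever moves users from legacy to the new system; nobody who is already on the new system is moved back.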

The Anti-Corruption Layer (ACL)

When the new system needs data or behavior from the legacy system, it must not model that data using the legacy system's vocabulary. The legacy system has its own domain model — often one shaped by years of organic growth, denormalized tables, and business rules encoded in stored procedures — and importing that model into the new system is how the strangler fig becomes the new legacy system before the migration is complete.

The ACL translates between the legacy domain model and the new bounded context's domain model. It is a set of services, mappers, and translators that live in the new system's integration layer and speak to the legacy system through whatever API the legacy exposes — a REST endpoint, a SOAP service, direct database queries, or message queue events.

The key discipline is that the ACL is the only place where legacy types are referenced. The new domain's aggregates, entities, value objects, and use cases reference only the new model's types. The ACL is responsible for all translation, and that translation is tested independently.

Migration Sequencing

Features cannot be migrated in an arbitrary order. Dependencies between features form a directed acyclic graph that constrains the sequence: feature B may need to be migrated before feature A because A's new implementation reads data that only B maintains in the legacy system.

The right migration order is determined by data ownership: migrate the feature that owns a piece of data before migrating features that depend on it. If the Customer Preferences feature in the new system owns the user's notification settings, it must be migrated before the Notification Dispatch feature can be pointed at the new system.
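
Once data ownership is written down as "feature X reads data owned by feature Y", a valid migration order falls out of a topological sort. The sketch below uses Kahn's algorithm; the `MigrationPlanner` class and the dictionary shape are hypothetical and chosen only to illustrate the idea.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MigrationPlanner
{
    // dependsOn maps each feature to the features whose data it reads.
    // Returns an order in which every data owner precedes its dependents
    // (Kahn's algorithm). Throws if the dependencies contain a cycle.
    public static IReadOnlyList<string> PlanOrder(
        IReadOnlyDictionary<string, string[]> dependsOn)
    {
        var features = dependsOn.Keys
            .Concat(dependsOn.Values.SelectMany(v => v))
            .Distinct()
            .ToList();

        var remaining = features.ToDictionary(
            f => f,
            f => dependsOn.TryGetValue(f, out var deps)
                ? new HashSet<string>(deps)
                : new HashSet<string>());

        var order = new List<string>();
        while (remaining.Count > 0)
        {
            // A feature is ready once every owner it depends on has been migrated.
            var ready = remaining
                .Where(kv => kv.Value.Count == 0)
                .Select(kv => kv.Key)
                .OrderBy(f => f, StringComparer.Ordinal)
                .ToList();

            if (ready.Count == 0)
                throw new InvalidOperationException(
                    "Cyclic data dependency: " + string.Join(", ", remaining.Keys));

            foreach (var feature in ready)
            {
                order.Add(feature);
                remaining.Remove(feature);
            }
            foreach (var deps in remaining.Values)
                deps.ExceptWith(ready);
        }
        return order;
    }
}
```

For the example above, passing `{ "NotificationDispatch": ["CustomerPreferences"] }` yields `CustomerPreferences` followed by `NotificationDispatch`. A cycle in the input means two features each own data the other needs, which is usually a sign that they have to be migrated together.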

Data Synchronization

During the migration window — which can last months for large systems — the same data may be written by both the legacy and new systems. A customer who is in the legacy-routed cohort updates their shipping address through the legacy system. A customer who is in the new-system cohort updates their address through the new system. Both updates must eventually be reflected correctly in both systems during the transition.

This dual-write problem is the hardest part of a strangler fig migration. Three approaches are common:

Change Data Capture (CDC): Attach a CDC tool (Debezium, AWS DMS) to the legacy database's transaction log. Every write to the legacy database produces an event that is consumed by the new system, which applies the update to its own data store. This provides near-real-time synchronization without modifying the legacy application.

Event bridge: If the legacy system already emits events or can be modified to do so, a translation layer consumes legacy events and converts them into new system commands.

Shared database (temporary): During the migration, both systems read and write to the same physical tables, with the new system's ORM mapped to the existing schema. This is the lowest-risk approach for data consistency but the highest-risk approach for architectural independence, because the new system becomes coupled to the legacy schema.
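
To make the event bridge approach concrete, here is a minimal sketch of a translator that turns a legacy event into a new-system command. Every type name and field here (`LegacyAddrChg`, `ChangeShippingAddress`, `AddressStore`) is hypothetical. The last-writer-wins guard is one possible way to make replayed or out-of-order events harmless; it is an assumption of this sketch, not a requirement of the pattern.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical legacy event shape, using the legacy system's vocabulary.
public sealed record LegacyAddrChg(int CustNo, string AddrLine, DateTime ModTsUtc);

// Hypothetical command in the new system's vocabulary.
public sealed record ChangeShippingAddress(
    string CustomerId, string RawAddress, DateTimeOffset ChangedAt);

public static class LegacyEventBridge
{
    // Translation is a pure function, so it can be unit-tested without either system.
    public static ChangeShippingAddress Translate(LegacyAddrChg legacyEvent) =>
        new(
            CustomerId: legacyEvent.CustNo.ToString(CultureInfo.InvariantCulture),
            RawAddress: legacyEvent.AddrLine.Trim(),
            ChangedAt: new DateTimeOffset(
                DateTime.SpecifyKind(legacyEvent.ModTsUtc, DateTimeKind.Utc)));
}

// In-memory stand-in for the new system's store, with a last-writer-wins guard
// so that replayed or out-of-order events do not overwrite newer data.
public sealed class AddressStore
{
    private readonly Dictionary<string, ChangeShippingAddress> _latest = new();

    // Returns true if the command was applied, false if it was stale or a duplicate.
    public bool Apply(ChangeShippingAddress command)
    {
        if (_latest.TryGetValue(command.CustomerId, out var current) &&
            current.ChangedAt >= command.ChangedAt)
        {
            return false;
        }
        _latest[command.CustomerId] = command;
        return true;
    }
}
```

Last-writer-wins depends on the two systems having comparable clocks. Where that cannot be assumed, a per-record version number maintained by the owning system is a common alternative.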

Constraints

Rollback fidelity: If the new system writes data that the legacy system cannot read (different schema, different format), rolling back to the legacy system will produce data loss or corruption. Every migration decision must include a rollback analysis: can the legacy system read everything the new system has written? If not, the migration is one-directional and rollback risk must be explicitly accepted.

Routing consistency: A user who starts a transaction in the legacy system must not be mid-routed to the new system before the transaction is complete. Shopping carts, multi-step wizards, and any stateful session must be pinned to one system for the duration of the session.

Legacy system availability: During the migration, the legacy system continues to serve production traffic. Any performance degradation in the legacy system affects the entire user base. CDC tools that read the transaction log can increase legacy DB I/O. Direct DB queries from the ACL can add connection pool pressure.

ACL erosion: Over time, the anti-corruption layer accumulates workarounds, special cases, and direct legacy database queries. Without active maintenance, the ACL becomes as complex as the legacy system it was supposed to insulate. Code review policies and bounded complexity metrics are necessary to keep it under control.
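
The routing consistency constraint can be enforced by recording the first routing decision of a session and reusing it until the session ends. The sketch below keeps the decision logic free of any web framework so it can be tested in isolation; `SessionPinning` and the cookie name are hypothetical, and `RoutingTarget` is redeclared here only to keep the example self-contained.

```csharp
using System;

public enum RoutingTarget { Legacy, NewSystem }

public static class SessionPinning
{
    // Hypothetical cookie name under which the pinned decision is stored.
    public const string CookieName = "routing-pin";

    // Returns the pinned target if the session already carries a valid one.
    // Otherwise returns the freshly computed decision and signals that the
    // caller must persist it for the rest of the session.
    public static (RoutingTarget Target, bool IsNewPin) Resolve(
        string? pinnedValue, RoutingTarget freshDecision)
    {
        // TryParse accepts numeric strings such as "7", so also check that the
        // parsed value is a declared member of the enum.
        if (Enum.TryParse<RoutingTarget>(pinnedValue, ignoreCase: true, out var pinned) &&
            Enum.IsDefined(pinned))
        {
            return (pinned, false);
        }
        return (freshDecision, true);
    }
}
```

In ASP.NET Core middleware, the pinned value could be read from `context.Request.Cookies[SessionPinning.CookieName]` and written with `context.Response.Cookies.Append(...)` when `IsNewPin` is true. A session cookie expires with the browser session, which gives a natural point at which the user can be re-evaluated.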

Trade-offs

The strangler fig pattern is almost always the right choice over a big-bang rewrite for systems with more than six months of accumulated business logic. Big-bang rewrites routinely underestimate the tacit knowledge embedded in legacy behavior: the edge cases, the data quality handling, and the business rules that exist only as a statement in a stored procedure, not in any requirements document. The strangler fig forces you to confront this tacit knowledge incrementally, in each migrated feature, rather than all at once at go-live.

The cost is extended duration. A big-bang rewrite at least has a planned end date. A strangler fig migration is open-ended by nature, and an organization that lacks executive commitment to finishing it can end up with a facade that routes 40% of traffic to the new system indefinitely while both systems are maintained in perpetuity, which is the worst of both worlds.

Code

Routing Facade as ASP.NET Core Middleware

// RoutingFacadeMiddleware.cs
// Examines each incoming request and routes it to the legacy or new system
// based on a feature flag and the request path
public sealed class RoutingFacadeMiddleware
{
    private readonly RequestDelegate _next;
    private readonly IFeatureRouter _featureRouter;
    private readonly IHttpClientFactory _httpClientFactory;
    private readonly ILogger<RoutingFacadeMiddleware> _logger;

    public RoutingFacadeMiddleware(
        RequestDelegate next,
        IFeatureRouter featureRouter,
        IHttpClientFactory httpClientFactory,
        ILogger<RoutingFacadeMiddleware> logger)
    {
        _next = next;
        _featureRouter = featureRouter;
        _httpClientFactory = httpClientFactory;
        _logger = logger;
    }

    public async Task InvokeAsync(HttpContext context)
    {
        var featureKey = _featureRouter.ResolveFeature(context.Request.Path);
        var userId = context.User.FindFirst("sub")?.Value ?? "anonymous";

        var routingDecision = await _featureRouter.RouteAsync(featureKey, userId);

        if (routingDecision == RoutingTarget.NewSystem)
        {
            // Let the request fall through to the new system's own controllers
            await _next(context);
            return;
        }

        // Proxy to the legacy system
        _logger.LogDebug(
            "Routing {Path} for user {UserId} to legacy system (feature: {Feature})",
            context.Request.Path, userId, featureKey);

        var legacyClient = _httpClientFactory.CreateClient("legacy-monolith");
        var proxyRequest = BuildLegacyRequest(context.Request, legacyClient.BaseAddress!);

        // ResponseHeadersRead streams the body instead of buffering it, so the
        // response must be disposed once it has been copied to the caller
        using var legacyResponse = await legacyClient.SendAsync(
            proxyRequest, HttpCompletionOption.ResponseHeadersRead, context.RequestAborted);

        await WriteProxyResponse(context, legacyResponse);
    }

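    private static HttpRequestMessage BuildLegacyRequest(
        HttpRequest incomingRequest, Uri legacyBase)
    {
        var proxyUri = new Uri(legacyBase, incomingRequest.Path + incomingRequest.QueryString);
        var proxyRequest = new HttpRequestMessage(
            new HttpMethod(incomingRequest.Method), proxyUri);

        // Attach the body first so that content headers have somewhere to go.
        // ContentLength is null for chunked uploads, so check Transfer-Encoding too.
        if (incomingRequest.ContentLength > 0 ||
            incomingRequest.Headers.ContainsKey("Transfer-Encoding"))
        {
            proxyRequest.Content = new StreamContent(incomingRequest.Body);
        }

        // Copy headers, excluding hop-by-hop headers and Host (HttpClient derives
        // Host from the target URI). Content headers such as Content-Type are
        // rejected by the request header collection, so fall back to the
        // content header collection for those.
        foreach (var header in incomingRequest.Headers
            .Where(h => !HopByHopHeaders.Contains(h.Key) &&
                        !string.Equals(h.Key, "Host", StringComparison.OrdinalIgnoreCase)))
        {
            if (!proxyRequest.Headers.TryAddWithoutValidation(header.Key, header.Value.ToArray()))
                proxyRequest.Content?.Headers.TryAddWithoutValidation(header.Key, header.Value.ToArray());
        }

        return proxyRequest;
    }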

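    private static readonly HashSet<string> HopByHopHeaders =
        new(StringComparer.OrdinalIgnoreCase)
        {
            "Connection", "Transfer-Encoding", "Keep-Alive", "Upgrade",
            "Proxy-Authenticate", "Proxy-Authorization", "TE", "Trailer"
        };

    private static async Task WriteProxyResponse(HttpContext context, HttpResponseMessage response)
    {
        context.Response.StatusCode = (int)response.StatusCode;

        // Content-Type, Content-Length, and similar headers live on
        // response.Content.Headers, so both collections must be copied.
        // Hop-by-hop headers are skipped; the server sets its own framing.
        foreach (var header in response.Headers.Concat(response.Content.Headers))
        {
            if (!HopByHopHeaders.Contains(header.Key))
                context.Response.Headers[header.Key] = header.Value.ToArray();
        }

        await response.Content.CopyToAsync(context.Response.Body, context.RequestAborted);
    }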
}

Anti-Corruption Layer: Legacy-to-New Model Translation

// Integration/LegacyCustomerAcl.cs
// The ACL translates the legacy customer model into the new bounded context's model
// The new domain's types are used exclusively beyond this boundary
public sealed class LegacyCustomerAcl : ICustomerProfilePort
{
    private readonly ILegacyCustomerApiClient _legacyApi;
    private readonly ILogger<LegacyCustomerAcl> _logger;

    public LegacyCustomerAcl(
        ILegacyCustomerApiClient legacyApi,
        ILogger<LegacyCustomerAcl> logger)
    {
        _legacyApi = legacyApi;
        _logger = logger;
    }

    public async Task<CustomerProfile?> GetProfileAsync(
        CustomerId customerId, CancellationToken ct)
    {
        // Call the legacy system through its own client type
        LegacyCustomerDto? legacyCustomer = await _legacyApi.GetCustomerAsync(
            customerId.Value.ToString(), ct);

        if (legacyCustomer is null)
            return null;

        // Translate legacy field names and enum representations to new domain types
        // Legacy uses integer codes for tier; new system uses a value object
        var tier = legacyCustomer.SubscriptionCode switch
        {
            0 => CustomerTier.Free,
            1 or 2 => CustomerTier.Standard,
            >= 3 => CustomerTier.Premium,
            _ => throw new AclTranslationException(
                $"Unknown legacy subscription code: {legacyCustomer.SubscriptionCode}")
        };

        // Legacy stores address as a flat string; new domain uses a structured value object
        var shippingAddress = ParseLegacyAddress(legacyCustomer.ShippingAddressRaw);

        return new CustomerProfile(
            Id: customerId,
            FullName: PersonName.From(legacyCustomer.FirstName, legacyCustomer.LastName),
            Email: Email.From(legacyCustomer.EmailAddress),
            Tier: tier,
            ShippingAddress: shippingAddress,
            ProfileSource: DataSource.Legacy); // Metadata so new system knows the origin
    }

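    private static ShippingAddress ParseLegacyAddress(string? raw)
    {
        // Legacy format: "123 Main St, Springfield, IL 62701, USA"
        // Legacy rows may have no address at all
        if (string.IsNullOrWhiteSpace(raw))
            return ShippingAddress.Unknown;

        var parts = raw.Split(',').Select(p => p.Trim()).ToArray();
        if (parts.Length < 4)
            return ShippingAddress.Unknown;

        // The third segment holds "<state> <postal code>"; tolerate repeated spaces
        var stateAndPostal = parts[2].Split(' ', StringSplitOptions.RemoveEmptyEntries);

        return new ShippingAddress(
            Street: parts[0],
            City: parts[1],
            StateOrRegion: stateAndPostal.ElementAtOrDefault(0) ?? string.Empty,
            PostalCode: stateAndPostal.ElementAtOrDefault(1) ?? string.Empty,
            Country: parts[3]);
    }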
}
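
CDC Consumer: Syncing Legacy Writes to the New System

// CDC Consumer: sync legacy writes to the new system's data store
// Debezium publishes change events to a Kafka topic; this consumer applies them
using System.Text.Json;
using Confluent.Kafka;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;

public sealed class LegacyCustomerChangeConsumer : BackgroundService
{
    private readonly IConsumer<string, string> _kafkaConsumer;
    private readonly ICustomerSyncService _customerSync;
    private readonly ILogger<LegacyCustomerChangeConsumer> _logger;

    public LegacyCustomerChangeConsumer(
        IConsumer<string, string> kafkaConsumer,
        ICustomerSyncService customerSync,
        ILogger<LegacyCustomerChangeConsumer> logger)
    {
        _kafkaConsumer = kafkaConsumer;
        _customerSync = customerSync;
        _logger = logger;
    }

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        // Consume() blocks, so yield first to avoid stalling host startup
        await Task.Yield();

        _kafkaConsumer.Subscribe("legacy.customers.cdc"); // Debezium CDC topic

        while (!stoppingToken.IsCancellationRequested)
        {
            var result = _kafkaConsumer.Consume(TimeSpan.FromMilliseconds(200));
            if (result is null) continue;

            // Debezium follows a delete with a tombstone record whose value is null
            if (result.Message.Value is not null)
            {
                var cdcEvent = JsonSerializer.Deserialize<DebeziumChangeEvent<LegacyCustomerRow>>(
                    result.Message.Value)!;

                switch (cdcEvent.Op)
                {
                    case "c" or "u" or "r": // create, update, or snapshot read
                        await _customerSync.UpsertFromLegacyAsync(cdcEvent.After!, stoppingToken);
                        break;
                    case "d": // delete
                        await _customerSync.MarkDeletedAsync(cdcEvent.Before!.CustomerId, stoppingToken);
                        break;
                }
            }

            // Store the offset only after the change has been applied.
            // This assumes EnableAutoOffsetStore = false in the consumer configuration.
            _kafkaConsumer.StoreOffset(result);
        }
    }
}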
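
The consumer relies on a `DebeziumChangeEvent<T>` envelope type. A minimal sketch of what it might look like follows, assuming the Kafka Connect JSON converter is configured with schemas disabled so that the message value is the bare envelope rather than a `schema`/`payload` wrapper. The `op`, `before`, `after`, and `ts_ms` field names come from the Debezium envelope; the column names on `LegacyCustomerRow` are hypothetical.

```csharp
#nullable enable
using System.Text.Json;
using System.Text.Json.Serialization;

// Envelope for a Debezium change event. System.Text.Json is case-sensitive by
// default, so the lowercase JSON field names are mapped explicitly.
public sealed class DebeziumChangeEvent<T> where T : class
{
    [JsonPropertyName("op")]
    public string Op { get; init; } = "";

    // Row state before the change; null for creates and snapshot reads.
    [JsonPropertyName("before")]
    public T? Before { get; init; }

    // Row state after the change; null for deletes.
    [JsonPropertyName("after")]
    public T? After { get; init; }

    [JsonPropertyName("ts_ms")]
    public long TimestampMs { get; init; }
}

// Hypothetical row shape; the JSON names stand in for legacy column names.
public sealed class LegacyCustomerRow
{
    [JsonPropertyName("cust_id")]
    public int CustomerId { get; init; }

    [JsonPropertyName("email_addr")]
    public string EmailAddress { get; init; } = "";
}
```

If the connector is configured with schemas enabled instead, the same type can be reused by deserializing the `payload` property of the outer document.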

Further Reading