September 4, 2025

Fortune 500 Logistics Provider — Tender Orchestration Gateway

Tender orchestration gateway on Azure — reference architecture

Summary

Built an API-led Tender Orchestration Gateway on Azure APIM + AKS that normalizes partner tender requests, returns immediate acknowledgements, and delivers final results asynchronously via secure callbacks. Per-customer mappings are externalized (no code changes), and onboarding is governed through subscription keys and OAuth2 — enabling faster partner activation and steadier fulfillment.


Problem

  • Partner/aggregator payloads varied widely; small changes broke integrations and created tender fallout.
  • Some partners required async response patterns; teams lacked a clean way to acknowledge immediately and deliver results later without timeouts.
  • Onboarding new partners was slow: each needed subscription keys, customer-specific mappings, and environment-specific endpoints.

Solution Mechanics

Primary pattern: API-led orchestration (Java + Spring Boot on AKS).

  • Entry & security

    • Azure API Management front door with OAuth2 client credentials and subscription key headers, enforced per environment (external managed gateway).
    • Immediate ACK on tender create/cancel; long-running work shifts to async flow.
  • API Orchestration Layer (AKS / Spring Boot)

    • Tender Gateway: validates request against published OpenAPI, routes create/cancel, and correlates responses.
    • Mapping Engine: loads per-customer Jolt specs (request/response/error) keyed by subscription; mapping files live in Blob Storage so changes don’t require redeploys.
    • Callback Handler: exposes secure notify endpoints for final tender results; correlates and persists outcomes.
  • Downstream & data

    • Calls internal Tender/TMS APIs; where required, JSON→XML translation is applied before hitting the transportation system.
    • Azure SQL for audit, correlation, and idempotency records.
    • Azure Service Bus for reliable fan-out to CRM/analytics and for callback retries (DLQ + replay).
    • Azure Monitor / App Insights for logs/metrics across namespaces and clusters.
  • Onboarding & governance

    • Partners/aggregators are onboarded with Liable Party IDs and APIM subscription keys; requests must carry the key in headers.
    • Contract-first API; per-partner differences live in mapping specs (request/response/error) rather than code.
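
The Mapping Engine lookup described above can be sketched as follows. This is a minimal illustration, not the production service: `MappingSpecResolver`, `SpecKind`, the `mappings/<subscriptionId>/<kind>.json` path layout, and the injected loader (a stand-in for a Blob Storage read) are all hypothetical names chosen for the example. Applying the Jolt spec itself is left out.

```java
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** The three spec kinds named in the text: request, response, and error. */
enum SpecKind { REQUEST, RESPONSE, ERROR }

/**
 * Hypothetical resolver for per-customer mapping specs.
 * The loader stands in for a Blob Storage read and returns null when no blob exists.
 */
final class MappingSpecResolver {
    private final Function<String, String> blobLoader;            // blob path -> spec JSON
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    MappingSpecResolver(Function<String, String> blobLoader) {
        this.blobLoader = blobLoader;
    }

    /** Assumed layout: mappings/<subscriptionId>/<kind>.json */
    static String blobPath(String subscriptionId, SpecKind kind) {
        return "mappings/" + subscriptionId + "/" + kind.name().toLowerCase(Locale.ROOT) + ".json";
    }

    /** Returns the cached spec, loading it once on first use; fails fast if none exists. */
    String specFor(String subscriptionId, SpecKind kind) {
        String path = blobPath(subscriptionId, kind);
        String spec = cache.computeIfAbsent(path, blobLoader);
        if (spec == null) {
            throw new IllegalStateException("No mapping spec at " + path);
        }
        return spec;
    }

    /** Drops cached specs for one customer so an updated blob is picked up without a redeploy. */
    void invalidate(String subscriptionId) {
        String prefix = "mappings/" + subscriptionId + "/";
        cache.keySet().removeIf(path -> path.startsWith(prefix));
    }
}
```

Caching by blob path keeps per-request latency low, while `invalidate` is one possible way to pick up a changed mapping file without restarting the service.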

Diagram 1 - Context Diagram — Tender orchestration gateway on Azure

Diagram 2 - Sequence — Tender create/cancel with async notify

Diagram 3 - Config & Onboarding — Keys and mapping specs


Process Flow

  1. Partner/aggregator sends a tender create/cancel request to Azure APIM with an OAuth2 token and subscription key; APIM validates the credentials, and the request is acknowledged immediately instead of being held open for the final result.
  2. APIM forwards to Tender Gateway (AKS). The gateway validates schema, assigns a correlation ID, and persists intake metadata.
  3. Mapping Engine fetches the customer’s Jolt mapping JSONs (request/response/error) based on the subscription key; request is transformed to the internal tender format.
  4. For create, gateway calls internal Tender/TMS API; for cancel, it invokes the cancel endpoint with identifiers (e.g., SCAC + tenderId).
  5. The call flow then becomes asynchronous: final tender result is produced later by the back-end process.
  6. The back-end posts to a notify endpoint (Callback Handler). Handler verifies headers, correlates to the intake record, and saves the result.
  7. Handler publishes events to Azure Service Bus for CRM/analytics; failures land in DLQ for replay.
  8. Observability: teams review logs/metrics in Azure Monitor/App Insights and AKS logs; environment URLs follow the APIM/AKS conventions.
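
The correlation at the heart of steps 2 and 6 can be sketched as below. This is an assumption-laden illustration: `TenderCorrelationStore` is a hypothetical in-memory stand-in for the Azure SQL correlation table, and the field names and status values are invented for the example.

```java
import java.util.Map;
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory stand-in for the correlation/idempotency records kept in Azure SQL. */
final class TenderCorrelationStore {
    enum Status { ACKNOWLEDGED, COMPLETED }

    /** One intake record; result stays null until the callback arrives. */
    record Intake(String correlationId, String subscriptionId, String operation,
                  Status status, String result) {}

    private final Map<String, Intake> byCorrelationId = new ConcurrentHashMap<>();

    /** Step 2: persist intake metadata and return the correlation ID carried in the ACK. */
    String acknowledge(String subscriptionId, String operation) {
        String id = UUID.randomUUID().toString();
        byCorrelationId.put(id, new Intake(id, subscriptionId, operation, Status.ACKNOWLEDGED, null));
        return id;
    }

    /**
     * Step 6: attach the final result to the intake record.
     * Returns true only for the first callback, so a duplicate neither overwrites
     * the stored outcome nor triggers a second round of downstream events.
     */
    boolean complete(String correlationId, String result) {
        boolean[] applied = {false};
        Intake updated = byCorrelationId.computeIfPresent(correlationId, (id, current) -> {
            if (current.status() == Status.COMPLETED) {
                return current;                       // duplicate callback: keep first outcome
            }
            applied[0] = true;
            return new Intake(id, current.subscriptionId(), current.operation(),
                              Status.COMPLETED, result);
        });
        if (updated == null) {
            throw new IllegalArgumentException("Unknown correlation ID: " + correlationId);
        }
        return applied[0];
    }

    Optional<Intake> find(String correlationId) {
        return Optional.ofNullable(byCorrelationId.get(correlationId));
    }
}
```

In this sketch the Callback Handler would publish to Service Bus (step 7) only when `complete` returns true.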

Outcomes

  • Reduced tender fallout via contract-first intake, enforced headers, and per-customer mapping outside code. (Proxy; mapping defects are caught before deployment rather than at runtime.)
  • Predictable partner experience: immediate ACKs and async callbacks prevent gateway timeouts and replays. (Verified in environments using notify endpoints.)
  • Faster onboarding: subscription-key onboarding + Blob-stored mappings shorten partner activation cycles. (Proxy; onboarding steps codified in APIM docs and mapping procedure.)

Strategic Business Impact

  • Steadier fulfillment pipeline (Proxy): fewer dropped/corrupted tenders improve downstream planning.
  • Partner onboarding speed-up (Modeled): contract discipline + externalized mappings reduce lead time for new partners.
  • Lower support load (Proxy): immediate ACK + notify pattern cuts “where is my tender?” tickets.

Method tags: Verified (observed in environment tests), Modeled (capacity/flow simulations), Proxy (leading indicators: mapping errors found pre-deploy, callback success rate).


Role & Scope

Owned architecture for APIM policies, AKS services (Gateway, Mapping Engine, Callback Handler), mapping governance (Blob), SQL schema for audit, Service Bus topics/queues, and observability; aligned onboarding steps and headers with platform guidance.


Key Decisions & Trade-offs

  • API-led front door vs direct partner→TMS integration: contract stability & security at the cost of an extra hop.
  • Externalized mappings (Jolt in Blob) vs code transforms: faster changes, but requires strong versioning and tests.
  • Async notify vs synchronous blocking: resilient under partner/TMS latency, but demands correlation and idempotency.
  • Azure Service Bus for fan-out/retries vs direct writes: operational safety with DLQs, traded for extra components.
  • Environment URL discipline for APIM/AKS to reduce drift across UNT/UAT/PRD.
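
The retry-then-dead-letter behaviour behind the Service Bus decision can be illustrated in memory. The sketch below does not use the Azure Service Bus SDK; `RetryingDispatcher` and its `maxDeliveryCount` parameter are hypothetical, and the handler is assumed to return false on failure rather than throw. It is single-threaded by design.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Predicate;

/** In-memory illustration of retry, dead-lettering, and replay; not the Azure Service Bus SDK. */
final class RetryingDispatcher {
    private final int maxDeliveryCount;                       // assumed per-message attempt limit
    private final Deque<String> deadLetters = new ArrayDeque<>();

    RetryingDispatcher(int maxDeliveryCount) {
        if (maxDeliveryCount < 1) {
            throw new IllegalArgumentException("maxDeliveryCount must be at least 1");
        }
        this.maxDeliveryCount = maxDeliveryCount;
    }

    /** Tries the handler up to maxDeliveryCount times; parks the message if every attempt fails. */
    boolean dispatch(String message, Predicate<String> handler) {
        for (int attempt = 1; attempt <= maxDeliveryCount; attempt++) {
            if (handler.test(message)) {
                return true;
            }
        }
        deadLetters.addLast(message);
        return false;
    }

    /** Replays parked messages once; failures return to the dead-letter queue. Returns successes. */
    int replay(Predicate<String> handler) {
        int succeeded = 0;
        int pending = deadLetters.size();
        for (int i = 0; i < pending; i++) {
            String message = deadLetters.pollFirst();
            if (handler.test(message)) {
                succeeded++;
            } else {
                deadLetters.addLast(message);
            }
        }
        return succeeded;
    }

    int deadLetterCount() {
        return deadLetters.size();
    }
}
```

The point of the pattern is that a failing consumer never loses a message: it is either delivered or held for an explicit replay.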

Risks & Mitigations

  • Missing/incorrect mapping specs → schema validation in CI, sample payload tests, and blue/green mapping rollout.
  • Partner headers misconfigured (OAuth2/keys) → APIM policy checks + clear 4xx with remediation hints.
  • Callback lost or duplicated → signed callbacks, idempotent upserts, Service Bus DLQ + replay tooling; alert on gaps.
  • TMS latency/format mismatch → timeouts and retries set per adapter; JSON↔XML validation before dispatch.
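
One way to realize the "signed callbacks" mitigation is an HMAC over the callback body. The source does not state which signing scheme is used, so HMAC-SHA256 with a shared secret and a Base64-encoded signature header are assumptions here, and `CallbackSignature` is a hypothetical helper.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical helper: HMAC-SHA256 signing and verification of a callback body. */
final class CallbackSignature {
    private static final String ALGORITHM = "HmacSHA256";

    private CallbackSignature() {}

    /** Computes the Base64-encoded HMAC of the body using the shared secret. */
    static String sign(byte[] secret, String body) {
        try {
            Mac mac = Mac.getInstance(ALGORITHM);
            mac.init(new SecretKeySpec(secret, ALGORITHM));
            byte[] digest = mac.doFinal(body.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(digest);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC-SHA256 unavailable or key invalid", e);
        }
    }

    /** Recomputes the signature and compares it to the header value in constant time. */
    static boolean verify(byte[] secret, String body, String signatureHeader) {
        if (signatureHeader == null) {
            return false;
        }
        byte[] expected = sign(secret, body).getBytes(StandardCharsets.UTF_8);
        byte[] actual = signatureHeader.getBytes(StandardCharsets.UTF_8);
        return MessageDigest.isEqual(expected, actual);
    }
}
```

Verification must run on the raw request body before any parsing or mapping, because re-serialized JSON may not match the signed bytes.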

Suggested Metrics (run-time SLOs)

  • Intake p95 (APIM→Gateway ACK) and error rate by partner.
  • Callback delivery success and end-to-end tender turnaround p95.
  • Mapping failure rate (request/response/error), Blob version drift incidence.
  • Service Bus: queue depth, DLQ size, retry success.
  • Environment conformance: % traffic using correct APIM/AKS base URLs.
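
For the p95 figures above, a nearest-rank percentile over collected latency samples is one simple definition. In practice these values would more likely come from Azure Monitor / App Insights queries; the hypothetical `LatencyPercentile` helper below only shows the arithmetic.

```java
import java.util.Arrays;

/** Hypothetical helper: nearest-rank percentile over latency samples in milliseconds. */
final class LatencyPercentile {
    private LatencyPercentile() {}

    /**
     * Returns the smallest sample such that at least p percent of samples
     * are less than or equal to it (nearest-rank method).
     */
    static long percentile(long[] samplesMillis, int p) {
        if (samplesMillis.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (p < 1 || p > 100) {
            throw new IllegalArgumentException("p must be between 1 and 100");
        }
        long[] sorted = samplesMillis.clone();
        Arrays.sort(sorted);
        // Integer form of ceil(p * n / 100); avoids floating-point rounding.
        int rank = (p * sorted.length + 99) / 100;
        return sorted[rank - 1];
    }
}
```

With ten samples of 10, 20, ..., 100 ms, the rank for p95 is 10, so the p95 is 100 ms; the p50 is the fifth sample, 50 ms.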

Closing principle

Contract-first gateway, async by default. Lock payloads and headers at the edge, keep mappings outside code, and deliver results via reliable callbacks.
