The most common conflict resolution strategy I see is also the worst: compare updated_at timestamps and let the newer row overwrite the older one.
Using wall-clock timestamps as your conflict resolver works in the demo because your laptop clock is the only clock in the room. It fails in production because APIs, bulk jobs, and database replicas each keep their own time.
It is the default because it requires zero design work, and it fails silently in production because clocks lie. APIs return timestamps in different timezones, sometimes without offsets. Bulk backfill scripts refresh every timestamp they touch. A scheduled import that re-imports unchanged data can stamp every row with a "newer" time, obliterating human edits made minutes earlier in the other system.
If your conflict strategy is "whichever timestamp is larger," you do not have a strategy. You have a lottery.
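To see the lottery in action, here is a minimal sketch of last-writer-wins resolution. The record shapes, the dates, and the `lastWriterWins` helper are illustrative assumptions, not part of any real API.

```javascript
// Hypothetical last-writer-wins resolver: the larger updated_at wins.
function lastWriterWins(a, b) {
  return Date.parse(a.updated_at) >= Date.parse(b.updated_at) ? a : b;
}

// A human corrects the phone number in the support tool at 10:00 UTC.
const humanEdit = {
  phone: '+1-555-0199',
  updated_at: '2024-05-01T10:00:00Z',
};

// A scheduled import rewrites the stale CRM value at 10:05 UTC.
// Nothing changed in the CRM, but the import refreshed the timestamp.
const staleReimport = {
  phone: '+1-555-0100',
  updated_at: '2024-05-01T10:05:00Z',
};

const winner = lastWriterWins(humanEdit, staleReimport);
console.log(winner.phone); // the stale value wins and the human edit is lost
```

The resolver is doing exactly what it was told. The problem is that the timestamp records when a row was touched, not whether anyone meant to change it.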
The right way to handle conflicts is to decide who wins before the conflict happens. I use three approaches, ranked from best to worst:
The edge case everyone asks about is: what if you need true bidirectional equality with no master? You do not. Someone in the business always cares more about the canonical value. Find that system, write it down, and enforce it in code.
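One way to write the decision down and enforce it in code is a per-field ownership table. The sketch below is an assumption about how that could look: the field names, the system names (`crm`, `support`), and the `resolveChangeSet` helper are all hypothetical.

```javascript
// Hypothetical ownership map: which system is canonical for each field.
const FIELD_OWNER = new Map([
  ['phone', 'crm'],
  ['email', 'crm'],
  ['priority', 'support'],
  ['tags', 'support'],
]);

// Decide whether a change to `field` arriving from `sourceSystem` may propagate.
// 'apply'  : the owning system made the change
// 'reject' : a non-owner made the change
// 'review' : nobody has written down an owner for this field yet
function resolveField(field, sourceSystem) {
  const owner = FIELD_OWNER.get(field);
  if (owner === undefined) return 'review';
  return owner === sourceSystem ? 'apply' : 'reject';
}

// Split an incoming change set into the three buckets.
function resolveChangeSet(changeSet, sourceSystem) {
  const result = { apply: {}, reject: {}, review: {} };
  for (const [field, value] of Object.entries(changeSet)) {
    result[resolveField(field, sourceSystem)][field] = value;
  }
  return result;
}

const decision = resolveChangeSet(
  { phone: '555', priority: 'high', nickname: 'Al' },
  'crm'
);
console.log(decision);
```

Here a change set arriving from the CRM gets its `phone` applied, its `priority` rejected because support owns that field, and its `nickname` sent to review because no owner is recorded.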
State mirroring is the silent killer of two-way sync architectures. The beginner pattern is to fetch every record from System A, iterate through them, and upsert each one into System B. Then fetch every record from System B and upsert back into System A. This treats every row as a hot potato. It destroys custom fields that exist only in System B because they get overwritten by A's nulls. It burns API rate limits shuttling identical data back and forth. It turns every sync into a full-table rewrite.
Sync only the fields that changed, not the full record. Maintain a ledger of hashes so you know what you already wrote. This turns sync from an O(n) full-table scan into an O(changes) operation.
Before I push anything anywhere, I calculate what actually changed. If a record in A is byte-for-byte identical to the last time I looked, I skip it. If only the phone number changed, I send only the phone number. If nothing changed, I make zero API calls.
To do this you need a ledger. For toy workflows you can use workflow static data, but in production I keep a small table — sync_ledger — with columns like record_key, source_system, field_hash, and synced_at. When a record arrives from a source, I hash the fields I care about (using a deterministic JSON serialization with sorted keys), compare that hash to the ledger, and only proceed if they differ.
    // Mode: Run Once for All Items
    // hashRecord, lookupLedger, and buildDiff are helpers you supply.
    const incoming = $input.all();
    const output = [];

    for (const item of incoming) {
      const record = item.json;
      const key = record.external_id;
      const fingerprint = hashRecord(record); // deterministic, sorted keys
      const last = await lookupLedger(key, 'crm'); // your DB node or helper

      // Emit the record only if it is new or its hash changed.
      if (!last || last.field_hash !== fingerprint) {
        output.push({
          json: {
            record_key: key,
            change_set: buildDiff(record, last?.snapshot),
            new_hash: fingerprint,
            full_record: record
          }
        });
      }
    }

    return output;
This also stops null fields in one system from erasing rich data in the other. I have seen this single pattern reduce API call volume by more than 80 percent on the first day.
The change-set pattern forces you to answer a useful question: what do you actually care about? If System A has 30 fields and System B only stores 12, your diff engine should map and filter before the hash. The ledger stores the canonical 12-field snapshot, not the raw payload.
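The `hashRecord` and `buildDiff` helpers that the Code node calls are left undefined in the text. The following is one possible sketch, assuming a hypothetical `SYNCED_FIELDS` list and Node's built-in `crypto` module (a self-hosted n8n Code node may need built-in modules enabled before `require` works).

```javascript
const crypto = require('node:crypto');

// Hypothetical list of fields both systems care about.
const SYNCED_FIELDS = ['email', 'phone', 'first_name', 'last_name'];

// Map and filter before hashing: keep only the synced fields, and treat
// missing values as null so "absent" and "null" fingerprint the same.
function project(record, fields = SYNCED_FIELDS) {
  const snapshot = {};
  for (const field of [...fields].sort()) {
    snapshot[field] = record[field] ?? null;
  }
  return snapshot;
}

// Serialize with sorted keys at every nesting level so key order never
// changes the hash.
function stableStringify(value) {
  if (value === undefined) return 'null';
  if (Array.isArray(value)) {
    return `[${value.map(stableStringify).join(',')}]`;
  }
  if (value !== null && typeof value === 'object') {
    const body = Object.keys(value)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${stableStringify(value[k])}`)
      .join(',');
    return `{${body}}`;
  }
  return JSON.stringify(value);
}

// SHA-256 over the canonical snapshot, not the raw payload.
function hashRecord(record, fields = SYNCED_FIELDS) {
  return crypto
    .createHash('sha256')
    .update(stableStringify(project(record, fields)))
    .digest('hex');
}

// Return only the fields whose value differs from the previous snapshot.
// With no previous snapshot, the whole canonical snapshot is the change set.
function buildDiff(record, previousSnapshot, fields = SYNCED_FIELDS) {
  const current = project(record, fields);
  if (!previousSnapshot) return current;
  const changes = {};
  for (const field of Object.keys(current)) {
    const before = previousSnapshot[field] ?? null;
    if (stableStringify(current[field]) !== stableStringify(before)) {
      changes[field] = current[field];
    }
  }
  return changes;
}
```

Because `project` runs before the hash, a field that only one system stores can change freely without producing a new fingerprint, and key order in the source payload has no effect.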
In my experience, half of all sync loops start with a bad write that the receiving system "fixes" during ingestion, creating a detectable diff on the next pass back. This is where validation layers earn their keep.
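A validation layer can be as small as a function that returns a list of problems. The sketch below is a rough example under assumed rules: the field names and the regular expressions are hypothetical and deliberately loose, so adjust them to whatever the receiving system would otherwise "fix".

```javascript
// Hypothetical validation rules for an outbound contact payload.
// Returns a list of problems; an empty list means the payload may be written.
function validateContact(payload) {
  const errors = [];
  if (typeof payload.external_id !== 'string' || payload.external_id.trim() === '') {
    errors.push('external_id is required');
  }
  if (payload.email != null && !/^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(payload.email)) {
    errors.push('email is not a plausible address');
  }
  if (payload.phone != null && !/^\+?[0-9][0-9\s().-]{5,}$/.test(payload.phone)) {
    errors.push('phone contains unexpected characters');
  }
  return errors;
}

// Partition records so only clean ones reach the writer.
function partitionByValidity(records) {
  const valid = [];
  const rejected = [];
  for (const record of records) {
    const errors = validateContact(record);
    if (errors.length === 0) valid.push(record);
    else rejected.push({ record, errors });
  }
  return { valid, rejected };
}
```

Rejected records carry their error list with them, which makes it easy to route them to a review queue instead of silently dropping them.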
The scariest failure mode in two-way sync is the echo chamber. System A fires a webhook. Your workflow updates System B. System B's webhook fires. Your workflow updates System A. System A's webhook fires again. Within minutes you have consumed your API quota, filled your execution log with garbage, and possibly triggered anti-abuse rate limits that disable your integration entirely.
Never propagate a change unless you can prove it did not originate from your own sync. If you cannot name the original author, you are looking at an echo.
There are three ways to prove it, from strongest to weakest:
sync_source: 'automation' and a timestamp. Your webhook listener on System B checks for that stamp. If it sees its own footprints, it drops the event.
Two-way sync is not a mirror. It is a pair of one-way gates, and each gate needs a bouncer. If you cannot name the original author of a change, you do not have two-way sync. You have a feedback loop wearing integration clothing.
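The sentinel check can be sketched in a few lines. The `sync_source` field and its `'automation'` value come from the text; the `sync_stamped_at` field name and the 60-second freshness window are assumptions made for illustration.

```javascript
// Sentinel the sync writes on every update it makes.
const SENTINEL = 'automation';
const ECHO_WINDOW_MS = 60_000; // assumed freshness window

// Attach the stamp to an outbound payload before writing it to the target.
function stampOutbound(fields, now = Date.now()) {
  return {
    ...fields,
    sync_source: SENTINEL,
    sync_stamped_at: new Date(now).toISOString(), // hypothetical field name
  };
}

// True when an inbound webhook event looks like our own write coming back.
function isEcho(event, now = Date.now()) {
  if (event.sync_source !== SENTINEL) return false;
  const stampedAt = Date.parse(event.sync_stamped_at);
  if (Number.isNaN(stampedAt)) return false;
  return now - stampedAt <= ECHO_WINDOW_MS;
}
```

The freshness window exists because a stamp can linger on the record after a human edits it later. The trade-off is that a human edit made within the window, on a record that still carries the stamp, would also be dropped, so keep the window short.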
In a one-way flow, idempotency means "running twice does not create two records." In a bidirectional flow, idempotency means "running twice does not create a closed loop." The standard patterns — upserts, check-then-act, idempotency keys — are necessary but not sufficient.
You need directional idempotency. An update from A→B must carry enough identity that if B echoes it back, you recognize it as an echo, not a new edit.
I build idempotency keys from business identity, not execution metadata. An execution ID changes every time the workflow runs, so it can deduplicate retries within a single run but does nothing to stop a loop. A key like contact_4829_phone_from_crm tells me exactly what I pushed. If the support system fires back with the same phone value before any other field changes, the key lets me recognize the round-trip and drop it.
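A minimal sketch of that key scheme follows. The key format mirrors the contact_4829_phone_from_crm example; the function names and the in-memory `pushed` map (a stand-in for a durable store) are assumptions.

```javascript
// Build a key from business identity: entity, id, field, and origin.
function buildIdempotencyKey({ entity, id, field, source }) {
  return [entity, id, field, 'from', source].join('_');
}

// Remember the value last pushed under each key.
// In-memory stand-in; a real sync would persist this.
const pushed = new Map();

function recordPush(key, value) {
  pushed.set(key, value);
}

// An inbound change is a round-trip if it carries the exact value we pushed.
function isRoundTrip(key, inboundValue) {
  return pushed.has(key) && pushed.get(key) === inboundValue;
}

const key = buildIdempotencyKey({
  entity: 'contact',
  id: 4829,
  field: 'phone',
  source: 'crm',
});
recordPush(key, '+1-555-0199');
console.log(key); // contact_4829_phone_from_crm
```

When the support system's webhook arrives with `+1-555-0199` for the same contact, `isRoundTrip(key, '+1-555-0199')` is true and the event can be dropped. Any other value is a genuine edit.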
My ledger table helps here too. The composite lookup is (record_key, field_hash, direction). If that tuple already exists and the hash matches, the write is a no-op even if the target system lacks rich metadata fields.
    -- Ledger table structure
    CREATE TABLE sync_ledger (
      record_key    TEXT NOT NULL,
      source_system TEXT NOT NULL,
      target_system TEXT NOT NULL,
      field_hash    TEXT NOT NULL,
      direction     TEXT NOT NULL,
      synced_at     TIMESTAMPTZ DEFAULT NOW(),
      PRIMARY KEY (record_key, direction, field_hash)
    );
Before every write, I check the ledger. After every successful write, I insert the tuple. This is stricter than a simple updated_at watermark because it captures the exact shape of the data, not just the wall clock.
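The check-then-write-then-record sequence can be sketched with an in-memory stand-in for the table. `SyncLedger`, `guardedWrite`, and `writeFn` are hypothetical names; in a real workflow the `has` and `insert` calls would be queries against sync_ledger.

```javascript
// In-memory stand-in for the sync_ledger table, keyed the same way as its
// primary key: (record_key, direction, field_hash).
class SyncLedger {
  constructor() {
    this.rows = new Set();
  }

  static tupleKey(recordKey, direction, fieldHash) {
    return JSON.stringify([recordKey, direction, fieldHash]);
  }

  has(recordKey, direction, fieldHash) {
    return this.rows.has(SyncLedger.tupleKey(recordKey, direction, fieldHash));
  }

  insert(recordKey, direction, fieldHash) {
    this.rows.add(SyncLedger.tupleKey(recordKey, direction, fieldHash));
  }
}

// Check before the write, record after it succeeds.
// `writeFn` is a hypothetical async function that performs the API call.
async function guardedWrite(ledger, { recordKey, direction, fieldHash }, writeFn) {
  if (ledger.has(recordKey, direction, fieldHash)) {
    return { written: false, reason: 'already synced' };
  }
  await writeFn(); // if this throws, nothing is recorded and a retry can proceed
  ledger.insert(recordKey, direction, fieldHash);
  return { written: true };
}
```

Recording the tuple only after the write succeeds means a failed API call leaves the ledger untouched, so the next run tries again instead of believing the target is up to date.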
For APIs that support native idempotency keys, such as Stripe's Idempotency-Key request header, use them. Combine the business key with a directional suffix: crm-to-billing-invoice-4921. If the API call retries due to a network blip, the key prevents a duplicate charge. If the same value bounces back through a webhook, the key in your ledger prevents the loop.
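As a rough illustration, the directional key can be built once and attached to the outbound request. The `Idempotency-Key` header name is Stripe's; the helper names, the body fields, and the token below are placeholders, not a real endpoint contract.

```javascript
// Build a directional key such as "crm-to-billing-invoice-4921".
function directionalKey(source, target, entity, id) {
  return `${source}-to-${target}-${entity}-${id}`;
}

// Assemble fetch() options for a POST that carries the key.
// The body shape here is a placeholder; check the target API's reference.
function buildRequest(key, body, apiToken) {
  return {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${apiToken}`,
      'Content-Type': 'application/x-www-form-urlencoded',
      'Idempotency-Key': key,
    },
    body: new URLSearchParams(body).toString(),
  };
}

const invoiceKey = directionalKey('crm', 'billing', 'invoice', 4921);
const request = buildRequest(invoiceKey, { customer: 'cus_123' }, 'sk_test_placeholder');
console.log(request.headers['Idempotency-Key']); // crm-to-billing-invoice-4921
```

The same key string can be stored in the ledger, so the retry guard and the loop guard share one identifier.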
Do not build this as a monolithic 40-node workflow. Split it into sub-workflows with narrow, explicit interfaces. The parent workflow should be a thin router; the heavy lifting lives in reusable blocks.
I structure production syncs as four stages:
create, update, or delete, with only the fields that changed.
Parent Workflow: "Sync - CRM to Support"
    [Webhook: CRM Record Updated]
        |
    [Execute Sub-Workflow: "Ingest - Normalize CRM Record"]
        |
    [Execute Sub-Workflow: "Diff Engine - Compute Changes"]
        |
    [IF: conflict detected?]
        +-> Yes: [Execute Sub-Workflow: "Router - Human Review Queue"]
        +-> No:  [Execute Sub-Workflow: "Writer - Update Support Ticket"]
                     |
                 [Postgres: Update sync_ledger]
When the reverse direction runs — Support to CRM — it reuses the same Diff Engine and Writer sub-workflows. The only thing that changes is the directional parameter. This keeps behavior consistent and lets me test the core logic with pinned data without touching live APIs.
Workflow static data is convenient for a proof of concept, but n8n saves it only when an active workflow runs from a trigger or webhook, not during manual test runs. Sync state should survive container restarts, version upgrades, and manual executions. Use Postgres for the ledger and Redis only for short-lived suppression locks. Your sync is only as durable as the book you keep.
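A short-lived suppression lock is simple enough to sketch in memory. The `SuppressionWindow` class and its 60-second default are assumptions; in production the same idea would sit in a store with key expiry, such as the Redis instance mentioned above.

```javascript
// In-memory stand-in for a short-lived suppression lock.
class SuppressionWindow {
  constructor(ttlMs = 60_000) {
    this.ttlMs = ttlMs;
    this.expiresAt = new Map();
  }

  // Call right after the sync writes to a record.
  suppress(recordKey, now = Date.now()) {
    this.expiresAt.set(recordKey, now + this.ttlMs);
  }

  // Call when a webhook arrives; true means "drop it, we just wrote this".
  isSuppressed(recordKey, now = Date.now()) {
    const expiry = this.expiresAt.get(recordKey);
    if (expiry === undefined) return false;
    if (now >= expiry) {
      this.expiresAt.delete(recordKey); // lock has lapsed, clean it up
      return false;
    }
    return true;
  }
}
```

Losing these locks on restart is acceptable because they only cover the seconds after a write. The ledger cannot be lost in the same way, which is why it belongs in Postgres.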
You do not need to rebuild everything this week. You need to stop the bleeding and introduce structure.
Open your sync workflow. Are you fetching full records and overwriting entire rows? If yes, you are running a state snapshot, not a sync. Introduce a hash check before every write. If the target already matches, skip the API call.
Pick your conflict resolution strategy per record type and document it on the canvas in a sticky note. Example: "CRM wins phone and email. Support wins priority and tags. Human review on collisions." If you cannot explain the rule in one sentence, your workflow is guessing.
Place a Code node or validation sub-workflow between every ingest and every writer. Reject malformed payloads before they touch the target system. Bad data causes corrective webhooks, and corrective webhooks cause loops.
If your target system supports custom fields, start writing a sentinel value on every update. If it does not, implement a write-hash guard or a 60-second suppression window for alternating edits on the same record.
Extract your diff engine and your writers into standalone workflows. Test them with pinned data. When the API changes, you now have one place to update.
Before you refactor, save the current JSON to Git or trigger an n8n history snapshot. Two-way sync refactors are hard to roll back without a known-good state.
Two-way sync will never be trivial. But it can be boring, and boring is the goal. Build the diff, guard the loop, and pick a winner before the conflict happens.