Lesson 30 of 83 advanced

Offline-First Sync, Caching, Conflict Resolution

The senior Android differentiator: build apps that work perfectly without internet and sync reliably when connected


Real-world analogy

Offline-first is like a doctor's paper chart system in a hospital with spotty WiFi. The doctor writes everything on the paper chart immediately (Room) — no waiting for the network. When WiFi comes back, a nurse batch-uploads all charts to the central system (server sync). If two doctors updated the same patient chart offline (conflict), there's a triage protocol: newest timestamp wins for vital signs, but a human must resolve conflicting diagnoses (manual merge). The hospital never stops working because of internet problems.

What is it?

Offline-first architecture makes Room the single source of truth — all reads come from Room, all writes go to Room first, and background sync reconciles local and remote state. Push sync queues local changes for upload; pull sync fetches server changes via delta/timestamp. Conflict resolution strategies (last-write-wins, merge, manual) handle diverged state. WorkManager provides guaranteed background sync. This architecture is the key differentiator between junior and senior Android engineers in enterprise contexts.
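The pull half of that loop can be sketched without Room or Retrofit. In this minimal, in-memory illustration, `OrderRecord`, `FakeServer`, `LocalStore`, and `pullChanges` are hypothetical names, and a `MutableMap` stands in for the Room table. It shows only one idea: ask for everything after the last version seen, upsert it, and remember the new high-water mark.

```kotlin
// Hypothetical record shape: the server stamps every change with an increasing version.
data class OrderRecord(val id: String, val title: String, val version: Long)

// Stand-in for the backend (assumption: it can filter records by version).
class FakeServer {
    private val records = mutableMapOf<String, OrderRecord>()
    private var nextVersion = 1L

    fun save(id: String, title: String) {
        records[id] = OrderRecord(id, title, nextVersion++)
    }

    // Plays the role of GET /orders?since_version=N
    fun changesSince(version: Long): List<OrderRecord> =
        records.values.filter { it.version > version }.sortedBy { it.version }
}

// Stand-in for the Room table plus the stored sync cursor.
class LocalStore {
    val orders = mutableMapOf<String, OrderRecord>()
    var lastSyncedVersion = 0L
}

// Pull only what changed, upsert it by primary key, then advance the cursor.
fun pullChanges(server: FakeServer, local: LocalStore): Int {
    val changes = server.changesSince(local.lastSyncedVersion)
    for (record in changes) local.orders[record.id] = record
    local.lastSyncedVersion = changes.maxOfOrNull { it.version } ?: local.lastSyncedVersion
    return changes.size
}
```

Calling `pullChanges` twice in a row transfers nothing the second time, which is the point of delta sync. Server-side deletions are not covered by this sketch; they would need a tombstone such as an is_deleted flag so that a deleted record still shows up as a change.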

Real-world relevance

BRAC's field operations app serves 10,000+ field workers across Bangladesh — many work in areas with no connectivity for hours. The offline-first architecture means a field worker can submit 50 work orders completely offline. Each order is written to Room instantly (UI shows it as 'pending sync'). A Room sync_queue table holds the pending POST requests. When connectivity returns, WorkManager flushes the queue with retry logic. If the server rejects an order (validation error), the order is flagged in Room and the worker is notified. Tixio's real-time workspace sync uses optimistic updates — dragging a card updates Room immediately, then POSTs to the server; if the WebSocket confirms the change, the optimistic update stands; if there's a conflict (another user moved the same card), the server version wins and the card snaps to the server position.
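The card-move flow described above can be sketched in plain Kotlin. This is an illustration under stated assumptions, not Tixio's implementation: `ServerReply`, `Board`, and `moveCard` are hypothetical names, a map stands in for Room, and the `send` lambda stands in for the network call (which would be a suspend function in a real app).

```kotlin
// Hypothetical outcomes of the server call.
sealed class ServerReply {
    object Confirmed : ServerReply()
    data class Conflict(val serverColumn: String) : ServerReply()
    object Failed : ServerReply()
}

// In-memory stand-in for the Room-backed board: cardId -> column name.
class Board {
    val cardColumns = mutableMapOf<String, String>()

    // Apply the move locally first, then reconcile with whatever the server says.
    fun moveCard(
        cardId: String,
        toColumn: String,
        send: (cardId: String, toColumn: String) -> ServerReply
    ): String? {
        val previous = cardColumns[cardId]
        cardColumns[cardId] = toColumn // optimistic write: the UI would update here
        when (val reply = send(cardId, toColumn)) {
            is ServerReply.Confirmed -> Unit // keep the optimistic state
            is ServerReply.Conflict -> {
                cardColumns[cardId] = reply.serverColumn // server version wins
            }
            is ServerReply.Failed -> {
                // Roll back to the state from before the move
                if (previous != null) {
                    cardColumns[cardId] = previous
                } else {
                    cardColumns.remove(cardId)
                }
            }
        }
        return cardColumns[cardId]
    }
}
```

The important detail is that `previous` is captured before the optimistic write; without it there is nothing to roll back to when the request fails.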

Key points

Offline-first principle — The app reads from and writes to local storage (Room) first — the network is an optimisation, not a requirement. UI always reflects local state. Network sync happens in the background, asynchronously. Users can create, update, and delete data with zero connectivity — changes are queued and synced when connection is available. This is the defining feature of enterprise field apps.
Single source of truth (SSOT) — Room is the SSOT — the UI ONLY reads from Room, never directly from the network response. The sync engine writes API responses to Room; Room's Flow automatically pushes updates to the UI. This architecture means the UI code doesn't need to know anything about network state — it just observes Room.
Pull sync strategy — App periodically requests all changes from the server since the last sync timestamp (delta sync). GET /orders?updated_since=1710000000. Server returns only changed records. Client upserts them into Room. Efficient for read-heavy data where the server is the authority. Used in Hazira Khata for school schedule sync — teachers pull the latest timetable on app open.
Push sync strategy — App immediately POSTs local changes to the server when connectivity is available. Changes are queued in a local sync_queue table when offline. WorkManager processes the queue when connectivity is restored. Used in BRAC field ops — field workers submit work orders offline; the queue ensures zero data loss even with hours of offline time.
Delta sync with sync_version or timestamp — Server maintains a monotonically increasing sync_version or last_updated_at per record. Client stores the last successful sync_version. On next sync, sends ?since_version=. Server returns only records changed after that version. Avoids downloading the entire dataset on every sync. Critical for 10K+ record datasets.
Optimistic updates — Immediately reflect the user's action in Room (and thus the UI) before the network request completes. If the server confirms: keep the local state. If the server fails: rollback the local state and show an error. Creates a snappy UX — the user sees instant feedback. Used in Tixio for workspace card moves — the card moves instantly; if sync fails, it snaps back with an error toast.
Conflict resolution — last-write-wins — Each record has a server_updated_at timestamp. When a conflict is detected (local version and server version both changed), the record with the newer timestamp wins. Simple, deterministic, but can silently discard valid offline changes. Appropriate for most non-critical preferences and non-financial data.
Conflict resolution — merge strategy — Merge non-conflicting fields from both versions — the server version wins for fields it changed, the local version wins for fields only changed locally. Requires field-level change tracking (dirty flags per field or operation log). Complex to implement but preserves maximum user intent. Used in collaborative document editing.
Conflict resolution — manual resolution — When automatic resolution is unsafe (e.g., one side deleted a record that the other side edited, or both changed the same financial amount), surface the conflict to the user. Show 'Your version' vs 'Server version' with a 'Choose one' UI. Appropriate for financial transactions, healthcare records, legal documents.
WorkManager for background sync — WorkManager is the recommended API for deferrable background work that must still run if the app exits or the device restarts. Use Constraints with setRequiredNetworkType(NetworkType.CONNECTED) to wait for connectivity. Periodic sync: PeriodicWorkRequest with a 15-minute minimum interval. One-time queue flush: OneTimeWorkRequest enqueued when a ConnectivityManager NetworkCallback reports that the network is back. Chain work: sync → notify → update UI via LiveData/Flow from Room.
Network connectivity monitoring — Call ConnectivityManager.registerNetworkCallback() for networks with NetworkCapabilities.NET_CAPABILITY_INTERNET, and also check NET_CAPABILITY_VALIDATED. INTERNET only says the network is meant to provide internet access; VALIDATED means the system has confirmed that it does, so a captive portal or a dead WiFi access point does not count as connected. Emit connectivity state as a Flow using callbackFlow. When connectivity is restored, trigger a WorkManager one-time sync job. Handle the initial state, since the app may start offline.
Sync queue table pattern — Create a pending_sync table in Room: (id, entityType, entityId, operation [INSERT/UPDATE/DELETE], payload JSON, attempts, created_at). When offline, write changes to both the entity table (for UI) and the pending_sync table (for later upload). WorkManager processes pending_sync on connectivity, retries with exponential backoff, marks entries as synced on success.
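The merge strategy in the key points is the one the lesson's code example does not cover, so here is a minimal sketch of it. It assumes a three-way merge against the last synced copy (`base`) and represents a record as a field-name-to-value map; `MergeResult` and `mergeFields` are hypothetical names, and a real implementation would derive `base` from dirty flags or an operation log.

```kotlin
// Result of a field-level merge: the merged record plus the fields both sides edited differently.
data class MergeResult(val merged: Map<String, String?>, val contested: Set<String>)

// base   = last version both sides agreed on (the last synced copy)
// local  = current on-device version
// server = current server version
fun mergeFields(
    base: Map<String, String?>,
    local: Map<String, String?>,
    server: Map<String, String?>
): MergeResult {
    val merged = mutableMapOf<String, String?>()
    val contested = mutableSetOf<String>()
    for (key in base.keys + local.keys + server.keys) {
        val localChanged = local[key] != base[key]
        val serverChanged = server[key] != base[key]
        merged[key] = when {
            serverChanged -> server[key] // server wins for every field it changed
            localChanged -> local[key]   // local wins for fields only changed locally
            else -> base[key]
        }
        // Both sides edited the same field to different values: report it to the caller,
        // which can accept the server value or escalate to manual resolution.
        if (localChanged && serverChanged && local[key] != server[key]) contested += key
    }
    return MergeResult(merged, contested)
}
```

If a field worker marks an order as done offline while the office reassigns it, both edits survive because they touch different fields. Only fields listed in `contested` lose a local edit, and those are the candidates for the manual 'Your version' vs 'Server version' screen.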

Code example

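import android.content.Context
import android.net.ConnectivityManager
import android.net.Network
import android.net.NetworkCapabilities
import android.net.NetworkCapabilities.NET_CAPABILITY_INTERNET
import android.net.NetworkRequest
import androidx.hilt.work.HiltWorker
import androidx.room.ColumnInfo
import androidx.room.Entity
import androidx.room.PrimaryKey
import androidx.work.BackoffPolicy
import androidx.work.Constraints
import androidx.work.CoroutineWorker
import androidx.work.NetworkType
import androidx.work.OneTimeWorkRequest
import androidx.work.OneTimeWorkRequestBuilder
import androidx.work.PeriodicWorkRequest
import androidx.work.PeriodicWorkRequestBuilder
import androidx.work.WorkerParameters
import androidx.work.workDataOf
import dagger.assisted.Assisted
import dagger.assisted.AssistedInject
import dagger.hilt.android.qualifiers.ApplicationContext
import java.util.UUID
import java.util.concurrent.TimeUnit
import javax.inject.Inject
import kotlinx.coroutines.channels.awaitClose
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.callbackFlow
import kotlinx.coroutines.flow.distinctUntilChanged
import kotlinx.coroutines.flow.map
import kotlinx.serialization.Serializable
import kotlinx.serialization.decodeFromString
import kotlinx.serialization.encodeToString
import kotlinx.serialization.json.Json

// Assumed to be defined elsewhere in the project: WorkOrder, WorkOrderDao, PendingSyncDao,
// WorkOrderApiService, UserPreferencesRepository, SyncResult, ConflictException,
// NoConnectivityException, toDomain(), toEntity(), toApiModel(), triggerSync()

// 1. Room entity with sync metadata
@Serializable // required because the entity is encoded to JSON for the pending_sync payload
@Entity(tableName = "work_orders")
data class WorkOrderEntity(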
    @PrimaryKey val id: String = UUID.randomUUID().toString(),
    val title: String,
    val status: String,
    val assignedToId: String?,
    @ColumnInfo(name = "server_updated_at") val serverUpdatedAt: Long = 0L,
    @ColumnInfo(name = "local_updated_at") val localUpdatedAt: Long = System.currentTimeMillis(),
    @ColumnInfo(name = "sync_status") val syncStatus: String = SyncStatus.PENDING_UPLOAD.name,
    @ColumnInfo(name = "is_deleted") val isDeleted: Boolean = false
)

enum class SyncStatus { SYNCED, PENDING_UPLOAD, PENDING_DELETE, CONFLICT }

// 2. Pending sync queue table
@Entity(tableName = "pending_sync")
data class PendingSyncEntity(
    @PrimaryKey val id: String = UUID.randomUUID().toString(),
    @ColumnInfo(name = "entity_type") val entityType: String,
    @ColumnInfo(name = "entity_id") val entityId: String,
    val operation: String, // "INSERT", "UPDATE", "DELETE"
    val payload: String,   // JSON of the entity
    val attempts: Int = 0,
    @ColumnInfo(name = "created_at") val createdAt: Long = System.currentTimeMillis()
)

// 3. Repository — offline-first pattern
class WorkOrderRepository @Inject constructor(
    private val dao: WorkOrderDao,
    private val syncDao: PendingSyncDao,
    private val api: WorkOrderApiService,
    private val connectivityMonitor: ConnectivityMonitor
) {
    // UI always reads from Room — SSOT
    fun observeOrders(): Flow<List<WorkOrder>> =
        dao.observeAll()
            .map { entities -> entities.filter { !it.isDeleted }.map(::toDomain) }

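    // Optimistic create: write to Room first, then queue for sync.
    // Both writes should run in one Room transaction (for example a @Transaction DAO method)
    // so a crash between them cannot leave an order without its queue entry.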
    suspend fun createOrder(order: WorkOrder) {
        val entity = toEntity(order).copy(syncStatus = SyncStatus.PENDING_UPLOAD.name)
        dao.upsert(entity)
        syncDao.insert(
            PendingSyncEntity(
                entityType = "work_order",
                entityId = entity.id,
                operation = "INSERT",
                payload = Json.encodeToString(entity)
            )
        )
        // If online, trigger immediate sync
        if (connectivityMonitor.isConnected()) {
            triggerSync()
        }
    }

    // Soft delete — mark for deletion, filter in observeOrders()
    suspend fun deleteOrder(orderId: String) {
        dao.markDeleted(orderId, SyncStatus.PENDING_DELETE.name)
        syncDao.insert(
            PendingSyncEntity(
                entityType = "work_order",
                entityId = orderId,
                operation = "DELETE",
                payload = orderId
            )
        )
    }

    // Pull sync — delta fetch from server
    suspend fun syncFromServer(lastSyncTimestamp: Long): SyncResult {
        return try {
            val response = api.getOrders(updatedSince = lastSyncTimestamp)
            dao.upsertAll(response.orders.map(::toEntity))
            SyncResult.Success(response.serverTimestamp)
        } catch (e: NoConnectivityException) {
            SyncResult.Skipped
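        } catch (e: kotlinx.coroutines.CancellationException) {
            throw e // never swallow coroutine cancellation
        } catch (e: Exception) {
            SyncResult.Error(e)
        }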
    }

    // Push sync — flush pending queue to server
    suspend fun flushPendingSync(): Int {
        val pending = syncDao.getPending(limit = 50)
        var successCount = 0
        pending.forEach { syncItem ->
            try {
                when (syncItem.operation) {
                    "INSERT", "UPDATE" -> {
                        val entity = Json.decodeFromString<WorkOrderEntity>(syncItem.payload)
                        val serverOrder = api.upsertOrder(entity.toApiModel())
                        dao.upsert(entity.copy(
                            syncStatus = SyncStatus.SYNCED.name,
                            serverUpdatedAt = serverOrder.updatedAt
                        ))
                        syncDao.delete(syncItem.id)
                        successCount++
                    }
                    "DELETE" -> {
                        api.deleteOrder(syncItem.entityId)
                        dao.hardDelete(syncItem.entityId)
                        syncDao.delete(syncItem.id)
                        successCount++
                    }
                }
            } catch (e: ConflictException) {
                // Server has a newer version — resolve conflict
                resolveConflict(syncItem, e.serverEntity)
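            } catch (e: kotlinx.coroutines.CancellationException) {
                // Never swallow coroutine cancellation: it is how WorkManager stops the worker
                throw e
            } catch (e: Exception) {
                // Increment attempts, will retry on next sync
                syncDao.incrementAttempts(syncItem.id)
            }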
        }
        return successCount
    }

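    // Timestamp-based conflict resolution: the server copy wins when it is newer;
    // a newer local copy is flagged for user review instead of overwriting the server.
    // Note: localUpdatedAt comes from the device clock, so clock skew can distort this comparison.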
    private suspend fun resolveConflict(local: PendingSyncEntity, serverEntity: WorkOrderEntity) {
        val localEntity = Json.decodeFromString<WorkOrderEntity>(local.payload)
        if (serverEntity.serverUpdatedAt > localEntity.localUpdatedAt) {
            // Server wins — overwrite local
            dao.upsert(serverEntity.copy(syncStatus = SyncStatus.SYNCED.name))
            syncDao.delete(local.id)
        } else {
            // Local is newer — flag as conflict for user review
            dao.updateSyncStatus(localEntity.id, SyncStatus.CONFLICT.name)
        }
    }
}

// 4. WorkManager sync worker
@HiltWorker
class SyncWorker @AssistedInject constructor(
    @Assisted context: Context,
    @Assisted params: WorkerParameters,
    private val repository: WorkOrderRepository,
    private val prefs: UserPreferencesRepository
) : CoroutineWorker(context, params) {

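    override suspend fun doWork(): Result {
        return try {
            val lastSync = prefs.getLastSyncTimestamp()
            val pullResult = repository.syncFromServer(lastSync)
            val pushCount = repository.flushPendingSync()

            when {
                pullResult is SyncResult.Success -> {
                    prefs.setLastSyncTimestamp(pullResult.serverTimestamp)
                    Result.success(workDataOf("pushed" to pushCount))
                }
                // syncFromServer() returns errors as values instead of throwing,
                // so check them here or a failed pull would be reported as success
                pullResult is SyncResult.Error ->
                    if (runAttemptCount < 3) Result.retry() else Result.failure()
                // Skipped (no connectivity): nothing was pulled, the push count is still valid
                else -> Result.success(workDataOf("pushed" to pushCount))
            }
        } catch (e: Exception) {
            if (runAttemptCount < 3) Result.retry() else Result.failure()
        }
    }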

    companion object {
        fun buildPeriodicRequest(): PeriodicWorkRequest =
            PeriodicWorkRequestBuilder<SyncWorker>(15, TimeUnit.MINUTES)
                .setConstraints(
                    Constraints.Builder()
                        .setRequiredNetworkType(NetworkType.CONNECTED)
                        .build()
                )
                .setBackoffCriteria(BackoffPolicy.EXPONENTIAL, 1, TimeUnit.MINUTES)
                .build()

        fun buildImmediateRequest(): OneTimeWorkRequest =
            OneTimeWorkRequestBuilder<SyncWorker>()
                .setConstraints(
                    Constraints.Builder()
                        .setRequiredNetworkType(NetworkType.CONNECTED)
                        .build()
                )
                .build()
    }
}

// 5. Connectivity monitoring as Flow
class ConnectivityMonitor @Inject constructor(
    @ApplicationContext private val context: Context
) {
    private val connectivityManager =
        context.getSystemService(ConnectivityManager::class.java)

    fun observeConnectivity(): Flow<Boolean> = callbackFlow {
        val callback = object : ConnectivityManager.NetworkCallback() {
            override fun onCapabilitiesChanged(
                network: Network,
                caps: NetworkCapabilities
            ) {
                trySend(caps.hasUsableInternet())
            }
            override fun onLost(network: Network) { trySend(false) }
        }
        val request = NetworkRequest.Builder()
            .addCapability(NetworkCapabilities.NET_CAPABILITY_INTERNET)
            .build()
        connectivityManager.registerNetworkCallback(request, callback)
        // Emit initial state
        trySend(isConnected())
        awaitClose { connectivityManager.unregisterNetworkCallback(callback) }
    }.distinctUntilChanged()

    fun isConnected(): Boolean {
        val caps = connectivityManager
            .getNetworkCapabilities(connectivityManager.activeNetwork)
        return caps?.hasUsableInternet() == true
    }

    // INTERNET means the network is meant to reach the internet; VALIDATED means
    // the system has confirmed that it actually does (this excludes captive portals)
    private fun NetworkCapabilities.hasUsableInternet(): Boolean =
        hasCapability(NetworkCapabilities.NET_CAPABILITY_INTERNET) &&
            hasCapability(NetworkCapabilities.NET_CAPABILITY_VALIDATED)
}

Line-by-line walkthrough

1. WorkOrderEntity includes syncStatus and isDeleted columns — these metadata fields drive the sync state machine without affecting business logic columns.
2. PendingSyncEntity stores the full JSON payload — this allows the sync worker to reconstruct the exact state of the entity at the time the change was made, even if the entity was further modified before sync ran.
3. observeOrders() filters isDeleted=true records before mapping to domain — soft-deleted items are hidden from the UI but preserved in Room until the server confirms deletion.
4. createOrder() writes to Room first (dao.upsert), then enqueues in pending_sync, then optionally triggers immediate sync — the UI sees the new order instantly regardless of connectivity.
5. flushPendingSync() processes a batch of 50 at a time — bounded batch size prevents the worker from running too long and being killed by the OS on resource-constrained devices.
6. ConflictException handling in flushPendingSync calls resolveConflict — the conflict resolution strategy is isolated in one function, making it easy to swap from last-write-wins to merge.
7. resolveConflict compares serverUpdatedAt (from server) vs localUpdatedAt (set at creation time) — if the server is newer, it wins silently; if local is newer, it's flagged for user review.
8. @HiltWorker with @AssistedInject enables Hilt injection into WorkManager workers — without this, you cannot inject repository dependencies into the worker.
9. Result.retry() with runAttemptCount < 3 gives WorkManager permission to reschedule the worker with exponential backoff — after 3 failures it returns Result.failure() to stop retrying.
10. callbackFlow with registerNetworkCallback/unregisterNetworkCallback in awaitClose — the callback is automatically unregistered when the Flow collector is cancelled, preventing memory leaks.
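To make the retry timing in step 9 concrete, here is a small sketch of an exponential backoff schedule. It assumes the delay doubles on every attempt starting from the 1-minute value passed to setBackoffCriteria above, and that delays are capped; `backoffDelayMs` is a hypothetical helper, and the 5-hour default cap is an assumption chosen for illustration rather than a statement about WorkManager internals.

```kotlin
// Delay before retry number `attempt` (1-based): doubles each time, capped at maxDelayMs.
fun backoffDelayMs(
    attempt: Int,
    initialDelayMs: Long = 60_000L,   // 1 minute, matching the setBackoffCriteria call above
    maxDelayMs: Long = 18_000_000L    // 5 hours (assumed cap for this sketch)
): Long {
    require(attempt >= 1) { "attempt is 1-based" }
    // Clamp the shift so very large attempt counts cannot overflow a Long
    val factor = 1L shl (attempt - 1).coerceAtMost(30)
    return (initialDelayMs * factor).coerceAtMost(maxDelayMs)
}
```

Under these assumptions, the three retries allowed by runAttemptCount < 3 would wait roughly 1, 2, and 4 minutes. Actual scheduling also depends on constraints and system battery optimisations, so treat the numbers as lower bounds.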

Spot the bug

// Find 5 offline-first architecture bugs
class BrokenOrderRepository(
    private val dao: OrderDao,
    private val api: OrderApiService
) {
    fun observeOrders(): Flow<List<Order>> {
        return flow {
            val orders = api.getOrders()  // Bug 1
            emit(orders)
        }
    }

    suspend fun createOrder(order: Order) {
        val response = api.createOrder(order)  // Bug 2
        dao.insert(response.toEntity())
    }

    suspend fun deleteOrder(orderId: String) {
        dao.hardDelete(orderId)              // Bug 3
        api.deleteOrder(orderId)
    }

    suspend fun syncFromServer() {
        val orders = api.getOrders()        // Bug 4
        dao.deleteAll()
        dao.insertAll(orders.map { it.toEntity() })
    }

    suspend fun flushPending() {
        val pending = dao.getPending()
        pending.forEach { item ->
            api.upsertOrder(item.toApiModel())  // Bug 5
            dao.markSynced(item.id)
        }
    }
}

Hint

Look at the data source for observeOrders, the order of operations in createOrder and deleteOrder, the sync strategy's approach to deletion, and error handling in flushPending.

Answer

Bug 1: observeOrders() reads from the API directly, which is not offline-first. If the device is offline, the Flow fails immediately and the UI never shows cached data. Fix: return dao.observeAll().map { entities -> entities.map(::toDomain) } so reads always come from Room. A separate sync function fetches from the API and writes to Room.

Bug 2: createOrder() calls the API first and only writes to Room if the API succeeds, which is online-first. If the device is offline, the order is never saved anywhere. Fix: write to Room first with PENDING_UPLOAD syncStatus, add an entry to the pending_sync queue, then optionally trigger sync. The order must survive an app kill even if sync has not run yet.

Bug 3: deleteOrder() hard-deletes from Room before the API call. If the device is offline or the API call fails, the order is gone locally but was never deleted on the server, leaving a ghost record there (which the next pull sync will bring back). Fix: soft delete (mark isDeleted=true in Room), add a DELETE entry to the pending_sync queue, and hard delete from Room only after the server confirms the deletion.

Bug 4: syncFromServer() calls dao.deleteAll() before inserting new data, which is a destructive full resync with no delta. Between deleteAll and insertAll, any observer (the UI) sees an empty list and flickers. More critically, any PENDING_UPLOAD records that have not synced yet are permanently deleted. Fix: upsert the server records (for example with a Room @Upsert DAO method) instead of deleteAll + insertAll, use delta sync with a timestamp, and never delete records that have unsynced local changes.

Bug 5: flushPending() calls api.upsertOrder() and immediately marks the item as synced, with no error handling. If the API call throws, the exception propagates out of forEach, so every remaining item in the batch is skipped for this run and nothing records the failed attempt. One permanently failing item therefore blocks everything queued behind it. Fix: wrap each item's sync in try-catch; on success mark it synced, on failure increment attempts and let WorkManager retry on the next execution.
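The fix for Bug 5 is the easiest one to get subtly wrong, so here is a minimal in-memory sketch of per-item error handling. `PendingItem`, `FlushReport`, and this `flushPending` signature are hypothetical, a `MutableList` stands in for the pending_sync table, and the `maxAttempts` parking rule is an added assumption that is not part of the exercise.

```kotlin
// Hypothetical queue row, mirroring the id / payload / attempts columns of pending_sync.
data class PendingItem(val id: String, val payload: String, var attempts: Int = 0)

class FlushReport(val synced: List<String>, val failed: List<String>)

// Each item succeeds or fails on its own; one bad item never blocks or drops the rest.
fun flushPending(
    queue: MutableList<PendingItem>,
    maxAttempts: Int = 5,
    upload: (PendingItem) -> Unit // throws on failure, as a network call would
): FlushReport {
    val synced = mutableListOf<String>()
    val failed = mutableListOf<String>()
    for (item in queue.toList()) { // iterate over a copy so removal is safe
        if (item.attempts >= maxAttempts) continue // parked for inspection, not retried forever
        try {
            upload(item)
            queue.remove(item) // remove only after the server accepted it
            synced += item.id
        } catch (e: Exception) {
            item.attempts++ // stays queued for the next run
            failed += item.id
        }
    }
    return FlushReport(synced, failed)
}
```

In a suspend version of this function, a catch for CancellationException that rethrows should come before the generic catch, so that cancelling the worker is not counted as a failed attempt.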

Explain like I'm 5

Imagine you're a delivery driver with a notebook and no phone signal. You write down every delivery in your notebook (Room) — you never wait for signal to do your job. When you get back to the depot with signal (WiFi), you send all your notes to the office (sync). If the office already updated a delivery record while you were offline (conflict), your manager checks who wrote it down more recently and uses that version. The app works the same way — it writes everything locally first, then figures out the internet stuff later.

Fun fact

Facebook acquired WhatsApp for about $19 billion in 2014. WhatsApp queues messages sent while offline on the device and delivers them when connectivity returns. Doing that without losing or duplicating messages is a deceptively hard problem, which is why message queueing is a common reference point for how offline-first should work.

Hands-on challenge

Design and implement a complete offline-first sync system for a task management app: (1) Room schema with sync_status and pending_sync table, (2) repository with optimistic create/update/delete, (3) WorkManager worker that pushes pending changes and pulls server updates with delta sync, (4) ConnectivityMonitor Flow that triggers immediate sync on reconnection, (5) last-write-wins conflict resolution. Identify which parts would need to change if you upgraded to merge-based conflict resolution.

