Saturday, September 28, 2019

REST idempotency for long-running jobs

scenarios:
- POST 202
  GET  404, 200

- POST 409

- POST 202
  GET  500
  retry POST

- POST 500
  retry POST

Because accepting a request involves two operations (a cache insert and a queue send), an inconsistent state is possible: the cache insert succeeds but the queue send fails. One way to deal with this is a compensating operation that undoes the cache insert.

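A minimal Python sketch of that compensation, covering only the first-time path of accept. The in-memory cache, the flaky queue, the `QueueUnavailable` exception, the `delete` method and the 500 response on failure are all assumptions made for illustration; they are not part of the notes.

```python
class QueueUnavailable(Exception):
    """Raised by the hypothetical queue when a send fails."""


class InMemoryCache:
    """Stand-in for the cache; delete() is the assumed 'cache undo'."""

    def __init__(self):
        self.items = {}

    def get(self, id):
        return self.items.get(id)

    def insert(self, item):
        self.items[item["id"]] = item

    def delete(self, id):
        self.items.pop(id, None)


class FlakyQueue:
    """Stand-in queue that fails the first `failures` sends."""

    def __init__(self, failures=0):
        self.failures = failures
        self.messages = []

    def send(self, request):
        if self.failures > 0:
            self.failures -= 1
            raise QueueUnavailable("send failed")
        self.messages.append(request)


def accept_new(cache, queue, request, request_hash):
    """First-time path of accept(): insert, enqueue, undo on failure."""
    cache.insert({"id": request["id"], "hash": request_hash})
    try:
        queue.send(request)
    except QueueUnavailable:
        # Compensation: remove the entry so a retried POST starts clean.
        cache.delete(request["id"])
        return {"statusCode": 500}
    return {"statusCode": 202, "location": f"{request['id']}/status"}
```

With `FlakyQueue(failures=1)`, the first call returns 500 and leaves the cache empty, and the retried POST returns 202. The undo can itself fail, so this narrows the inconsistent window rather than closing it.
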
// post
response accept(request):
    item = cache.get(request.id)
    hash = hasher(request.id, request.body,
                  request.header1, request.header2, ...)
    if item == null:
        cache.insert({ id: request.id,
                       hash: hash })
        queue.send(request)
        return { statusCode: 202,
                 location: "{request.id}/status" }
    if hash != item.hash:
        return { statusCode: 409 }
    if item.statusCode == 5xx:
        updateItem(item)
        queue.send(request)
    return { statusCode: 202,
             location: "{request.id}/status" }

updateItem(item):
    item.statusCode = null
    cache.update(item)

// get
response status(id):
    item = store.get(id)
    if item == null:
        // no result stored for this id yet
        return { statusCode: 404 }
    return { statusCode: 200,
             code: item.statusCode,
             body: item.body }

// cache ttl exceeds possible retry window
cache:
    item get(id):
    insert(item):
    update(item):

store:
    item get(id):

hasher(id, body, header1, header2, ...):
    // sha1 hash

The variant below moves the cache insert to the backend. This deals with the inconsistent state mentioned above but cannot handle 409 conflicts in the backend. The chance of an accidental id collision is quite low; it is mainly an attack vector (an attacker reuses the same id with different bodies in quick succession).

// post
response accept(request):
    item = cache.get(request.id)
    if item == null || item.statusCode == 5xx:
        queue.send(request)
        return { statusCode: 202,
                 location: "{request.id}/status" }
    hash = hasher(request.id, request.body,
                  request.header1, request.header2, ...)
    if hash != item.hash:
        return { statusCode: 409 }
    return { statusCode: 202,
             location: "{request.id}/status" }

// backend
process(request):
    item = cache.get(request.id)
    if item != null && item.processing:
        // drop message
        return
    if item == null:
        hash = hasher(request.id, request.body,
                      request.header1, request.header2, ...)
        cache.insert({ id: request.id,
                       hash: hash,
                       processing: true })
    else if item.statusCode == 5xx:
        // retry after a failed run; can't deal with 409 here
        item.statusCode = null
        item.processing = true
        cache.update(item)
    processInner(request)

// get
response status(id):
    item = store.get(id)
    if item == null:
        // no result stored for this id yet
        return { statusCode: 404 }
    return { statusCode: 200,
             code: item.statusCode,
             body: item.body }

// cache ttl exceeds possible retry window
cache:
    item get(id):
    insert(item):
    update(item):

store:
    item get(id):

hasher(id, body, header1, header2, ...):
    // sha1 hash
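
The hasher stub could be filled in along these lines. This is a sketch in Python, assuming the id, body and header values are all strings; the length prefix on each field is an addition not in the notes, there so that field boundaries cannot be shifted to produce the same input.

```python
import hashlib


def hasher(request_id, body, *headers):
    """SHA-1 fingerprint of the id, body and selected header values."""
    digest = hashlib.sha1()
    for part in (request_id, body, *headers):
        data = part.encode("utf-8")
        # Length prefix keeps ("ab", "c") distinct from ("a", "bc").
        digest.update(len(data).to_bytes(8, "big"))
        digest.update(data)
    return digest.hexdigest()
```

Two requests with the same id but different bodies get different fingerprints, which is what the 409 check compares. Since the concern here is a deliberate attacker, `hashlib.sha256` may be a safer drop-in than SHA-1.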
