Friday, March 27, 2015

rank for any x in a stream of numbers

Use a binary tree and store size of left subtree.

Calculate rank with dfs as:
x < node.value: // do nothing
x > node.value: + 1 + node.leftsize
x = node.value: + node.leftsize
//example
     43
  21    61
10  30 50 70
When x is not present, rank is still calculated properly.
i.e.
get-rank(5.5) = 5
get-rank(4.5) = 4

For duplicates, we break out on first occurrence.
node:
    value
    left
    right
    leftsize // size of left subtree (excluding this node)

root = null
process-stream(stream):
    foreach x in stream:
        insert(root, new node(x)):

insert(node, x):
    if x == null:
        throw
    if node == null:
        root = x
        return
    if x <= node.value:
        node.leftsize++
        if node.left != null:
            insert(node.left, x)
        else:
            node.left = x
    else:
        if node.right != null:
            insert(node.right, x)
        else:
            node.right = x

get-rank(x):
    rank = 0
    node = root
    while node != null:
        if x < node.value:
            // no change in rank
            node = node.left
        else if node.value < x:
            rank += 1 + node.leftsize
            node = node.right
        else: // x == node.value
            rank += node.leftsize
            break
    return rank

[Hat tip to ctci]

No comments:

Post a Comment