Monday, June 27, 2016

find missing integer(s) with limited memory

Find the missing integers in 32k integers with no duplicates (range 0..32k) and with 4KB available memory?

With only 1 integer missing:
// using sum(0..n) = n*(n+1)/2, and overflow optimization
(a):
    n = a.length()
    for x in a:
        sum += x
    missing = 0
    // to avoid overflow
    if n &1 == 0:
        // n even
        missing = (n/2)*(n+1) - sum
    else:
        // n+1 even
        missing = (n)*((n+1)/2) - sum
    return missing

32k integers = 2^(10+5) integers
bool[] size = 2^(10+5) x 1B = 32KB
bitset size = 2^(10+5) KB / 2^3 = 2^12 KB = 4KB
bitset:
    bool has(x):
    set(x):
    clearall():

// 1 pass
find(file, max = 2^(10+5)):
    list missing = new
    bitset = new bitset(max)
    using stream = create(file):
        while stream.hasnext():
            x = stream.next()
            bitset.set(x)
    for x = 0, x <= max, x++:
        if !bitset.has(x):
            missing.add(x)
    return missing

With 2KB available memory?
// 1+ passes
find(file, max = 2^(10+5), available = 2^(11)):
    list missing = new
    bitset = new bitset(chunksize)
    chunksize = available << 3      // 2KB = 2K x 8 bits
    passes = (max / chunksize) +1   // +1 is needed
    for pass = 0, pass < passes, pass++:
        start = pass *chunksize
        end = start +chunksize
        using stream = create(file):
            while stream.hasnext():
                x = stream.next()
                if !(start <= x < end):
                    continue
                bitset.set(x -start)
        for x = 0, x < chunksize, x++:
            if !bitset.has(x):
                missing.add(x +start)
        bitset.clearall()           // cannot new as limited space
    return missing

XOR method: For only 1 integer missing in n-1 integers in range 1..n and no duplicates.
// xor method
find(file, max = 2^(10+5)):
    xor1 = 0
    using stream = create(file):
        while stream.hasnext():
            x = stream.next()
            xor1 ^= x
    xor2 = 0
    for i = 0, i <= max, i++:
        xor2 ^= i
    missing = xor1 ^ xor2
    return missing
[Hat tip to PC, GFG]

No comments:

Post a Comment