Um... I don't want to rain on anyone's parade, but this isn't that insightful. It even has a name: this is called a "high-pass filter". Over time it settles toward a constant input level (the "DC bias", in the parlance) while allowing the instantaneous signal excursions (strictly: their derivative) to pass through. It's a useful gadget both in analog electronics and in software, for tricks like this one.
It is not, however, an "averager". The output depends on the order in which the inputs are delivered; it seeks to an average only given constant input. Think of the output for a continuously alternating stream of 1's and 0's, versus giving all the zeros first and then all the ones.
Isn't this similar to a low-pass filter? It differs in that the filter constant grows as time goes on, rather than having a fixed time constant.
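For concreteness, here's a minimal sketch of the contrast (function names are mine): a fixed-constant filter, i.e. a classic exponential moving average, versus the growing-constant running mean being discussed.
function make_ema (alpha)          -- fixed time constant: recent samples dominate
  local avg = 0
  return function (x)
    avg = avg + alpha * (x - avg)
    return avg
  end
end
function make_running_mean ()      -- "filter constant" 1/n shrinks over time
  local avg, n = 0, 0
  return function (x)
    n = n + 1
    avg = avg + (x - avg) / n      -- every sample ends up weighted equally
    return avg
  end
end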
What leads you to believe that the order of the inputs is significant? The derivation makes sense, the final form makes sense, and I tried a few inputs.
(also, a high-pass filter does not necessarily settle to its input over time)
Isn't this the obvious solution to the problem? I used to use this method to compute test averages in my head in elementary/middle school. Teachers were often willing to read off the (anonymized) scores of a test, but weren't willing to do the math and figure out the average. I also used it in college to figure out what grades I needed to stay above a 3.0 GPA, and on the SATs and AHSME to figure out my margin of safety.
IMHO, a more interesting interview question is:
"How would you efficiently compute a sliding-window average, where each number has an associated time and each time a number comes in, you want to quickly compute the average of all numbers whose timestamps fall within the last 5 minutes?"
This type of problem comes up all the time in the financial industry - I had to implement it for my last employer. The best algorithm I can think of takes amortized constant time and linear (proportional to window size) space. Anyone able to think of a better one? Wall Street wants you. ;-)
It's easy to prove that linear space is optimal: every value still inside the window can affect a future answer, so the algorithm has to retain information about all of them -- you can't store N values' worth of information in less than N values' worth of space.
Well, here's a naive one written in Lua. It keeps a FIFO list of {timestamp, number} pairs, implemented as a double-ended queue held in a closure. The averaging function takes a new number and the current time. It pops off the pairs that fall outside our sliding window and subtracts their values from a tally, pushes on the new value and adds it to the tally, then returns the tally divided by the size of the queue. This takes linear time with respect to the rate of transactions (i.e., how many entries expire per call) and linear space with respect to window size (how many entries fit in the sliding window).
function create_sliding_averager (window_span, debug)
  -- window_span is in seconds; all state lives in this closure
  local tally = 0
  local fifo = { tail = 1, head = 0 }
  return function (amount, current_time)
    -- push the new entry and add it to the running tally
    local head = fifo.head + 1
    local tail = fifo.tail
    fifo[head] = { timestamp = current_time, amount = amount }
    tally = tally + amount
    local cutoff, old_time = current_time - window_span, nil
    if fifo[tail] then
      old_time = fifo[tail].timestamp
    end
    if debug then
      print ('cutoff:', cutoff, 'old time:',
             old_time, 'current:', current_time)
    end
    -- pop entries that have aged out of the window, removing each
    -- one's contribution from the tally as we go (safe: the scan is
    -- bounded by the just-pushed entry, whose timestamp >= cutoff)
    while old_time ~= nil and old_time < cutoff do
      tally = tally - fifo[tail].amount
      if debug then
        print ('expiring:', fifo[tail].timestamp, fifo[tail].amount)
      end
      fifo[tail] = nil
      tail = tail + 1
      old_time = fifo[tail].timestamp
    end
    fifo.tail = tail
    fifo.head = head
    if debug then print(tail, head) end
    return 'avg: '..(tally / (head - tail + 1)), 'size: '..(head - tail + 1)
  end
end
averager = create_sliding_averager(10) -- 10 second average
while true do
  local rand = math.random()
  local time = os.time()
  print (averager(rand, time))
end
I would usually use a small library implementing push()/pop()/peek() on the fifo, but this example will run in a stock interpreter. On my system the average for math.random() hovers around 0.5, so it passes my initial smell test. The queue hovers around 80k entries in a busy loop with a 10-second window.
This is a pretty brute-force solution. I'll see if I can think of something more elegant.
That was the algorithm I used, though of course it was in Java. It's actually amortized constant time (assuming a sensible deque implementation), because each number you push on can be popped off only once. If you end up popping lots of entries at once, it means you went a long time without popping anything while the deque was being populated.
What an amateur: the algorithm described here suffers from the same problem as keeping a running sum, only in the other direction. It will stop working when the values you send it, divided by the count, are smaller than the precision of floats on your system...
Actually, depending on the values you give it, it might perform a lot worse than the running-sum version.
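A quick demonstration of that failure mode, with made-up numbers: once the count is large enough, the per-sample correction falls below the spacing of representable doubles around the average and is silently lost.
-- Hypothetical numbers: at count = 2^54 the correction (x - avg) / count
-- is ~2.8e-17, smaller than the gap between doubles near 0.5 (~1.1e-16).
local avg, count = 0.5, 2^54
local x = 1.0
local updated = avg + (x - avg) / count
print(updated == avg)  -- true: the new sample had no effect on the average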
Surely a better method would be to detect the point at which the sum will overflow, and at that point halve both the sum and the count (a right shift)? Sure, it's still not exact, but I think it's a far better long-term solution.
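A rough sketch of that idea (the threshold and names are my own; Lua numbers are IEEE doubles, so I treat 2^52 as the point where exact integer arithmetic would start to erode):
local LIMIT = 2^52  -- assumed threshold; pick whatever suits your number type
local sum, count = 0, 0
function add_sample (x)  -- assumes non-negative samples for the simple check
  if sum + x >= LIMIT or count + 1 >= LIMIT then
    sum, count = sum / 2, count / 2  -- the "shift right": sum/count unchanged
  end
  sum = sum + x
  count = count + 1
  return sum / count
end
After a halving, older samples effectively carry half the weight of newer ones, which is why it's approximate rather than exact.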
raganwald, you'd ask that in an interview? I think it'd be a bit tough. Though I've never had a chance to really grill a candidate -- they usually get nervous with open-ended questions.
I'm kinda notorious for stating that I don't ask questions like this in interviews. My basic feeling is that asking for any working function that computes the average of 100 numbers will eliminate 1990 out of 2000 blind submissions, and you need something (be that an interview question or a conversation or whatever) a lot deeper to differentiate the remaining ten competent people from each other.
But that doesn't mean that you won't run into someone out there asking a question like this.
Another advantage of this method is that for data where the long-running average is meaningful (i.e., not a completely bizarre distribution) it is more numerically stable.
I might be making a mistake (should go to sleep), but shouldn't the alternate formula be ((average * count) + x) / ++count, that is, ++count instead of count++? Then I arrive at average += (x - average) / (count + 1).
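A quick sanity check of that form against the direct sum (throwaway sketch, sample values are arbitrary):
local xs = { 3, 1, 4, 1, 5, 9, 2, 6 }
local avg, count, sum = 0, 0, 0
for _, x in ipairs(xs) do
  avg = avg + (x - avg) / (count + 1)  -- count = samples seen before x
  count = count + 1
  sum = sum + x
  print(avg, sum / count)  -- the two columns agree (up to rounding)
end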