Um... I don't want to rain on anyone's parade, but this isn't that insightful. It even has a name: this is called a "high-pass filter". Over time it settles toward a constant input level (the "DC bias", in the parlance) while allowing the instantaneous signal excursions (strictly: their derivative) to pass through. It's a useful gadget both in analog electronics and in software, for tricks like this one.
It is not, however, an "averager". The output depends on the order in which the inputs are delivered; it seeks to an average only given constant input. Think of the output for a continuously alternating stream of 1's and 0's, versus giving all the zeros first and then all the ones.
Isn't this similar to a low-pass filter? It differs in that the filter constant grows as time goes on, rather than having a fixed time constant.
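For concreteness, here's a minimal sketch of the contrast (function names are mine): a fixed-constant filter, i.e. a classic exponential moving average, versus the growing-constant running mean being discussed.
function make_ema (alpha)          -- fixed time constant: recent samples dominate
  local avg = 0
  return function (x)
    avg = avg + alpha * (x - avg)
    return avg
  end
end
function make_running_mean ()      -- "filter constant" 1/n shrinks over time
  local avg, n = 0, 0
  return function (x)
    n = n + 1
    avg = avg + (x - avg) / n      -- every sample ends up weighted equally
    return avg
  end
end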
What leads you to believe that the order of the inputs is significant? The derivation makes sense, the final form makes sense, and I tried a few inputs.
(also, a high-pass filter does not necessarily settle to its input over time)
Isn't this the obvious solution to the problem? I used to use this method to compute test averages in my head in elementary/middle school. Teachers were often willing to read off the (anonymized) scores of a test, but weren't willing to do the math and figure out the average. I also used it in college to figure out what grades I needed to stay above a 3.0 GPA, and on the SATs and AHSME to figure out my margin of safety.
IMHO, a more interesting interview question is:
"How would you efficiently compute a sliding-window average, where each number has an associated time and each time a number comes in, you want to quickly compute the average of all numbers whose timestamps fall within the last 5 minutes?"
This type of problem comes up all the time in the financial industry - I had to implement it for my last employer. The best algorithm I can think of takes amortized constant time and linear (proportional to window size) space. Anyone able to think of a better one? Wall Street wants you. ;-)
It's easy to prove that linear space is optimal: every value still inside the window can affect a future answer, so the algorithm has to retain information about all of them -- you can't store N values' worth of information in less than N values' worth of space.
Well, here's a naive one written in Lua. It keeps a FIFO list of {timestamp, number} pairs, implemented as a double-ended queue held in a closure. The averaging function takes a new number and the current time. It pops off the pairs that fall outside our sliding window and subtracts their values from a tally, pushes on the new value and adds it to the tally, then returns the tally divided by the size of the queue. This takes linear time with respect to the rate of transactions (i.e., how many entries expire per call) and linear space with respect to window size (how many entries fit in the sliding window).
function create_sliding_averager (window_span, debug)
  -- window_span is in seconds; all state lives in this closure
  local tally = 0
  local fifo = { tail = 1, head = 0 }
  return function (amount, current_time)
    -- push the new entry and add it to the running tally
    local head = fifo.head + 1
    local tail = fifo.tail
    fifo[head] = { timestamp = current_time, amount = amount }
    tally = tally + amount
    local cutoff, old_time = current_time - window_span, nil
    if fifo[tail] then
      old_time = fifo[tail].timestamp
    end
    if debug then
      print ('cutoff:', cutoff, 'old time:',
             old_time, 'current:', current_time)
    end
    -- pop entries that have aged out of the window, removing each
    -- one's contribution from the tally as we go (safe: the scan is
    -- bounded by the just-pushed entry, whose timestamp >= cutoff)
    while old_time ~= nil and old_time < cutoff do
      tally = tally - fifo[tail].amount
      if debug then
        print ('expiring:', fifo[tail].timestamp, fifo[tail].amount)
      end
      fifo[tail] = nil
      tail = tail + 1
      old_time = fifo[tail].timestamp
    end
    fifo.tail = tail
    fifo.head = head
    if debug then print(tail, head) end
    return 'avg: '..(tally / (head - tail + 1)), 'size: '..(head - tail + 1)
  end
end
averager = create_sliding_averager(10) -- 10 second average
while true do
  local rand = math.random()
  local time = os.time()
  print (averager(rand, time))
end
I would usually use a small library implementing push()/pop()/peek() on the fifo, but this example will run in a stock interpreter. On my system the average for math.random() hovers around 0.5, so it passes my initial smell test. The queue hovers around 80k entries in a busy loop with a 10-second window.
This is a pretty brute-force solution. I'll see if I can think of something more elegant.
That was the algorithm I used, though of course it was in Java. It's actually amortized constant time (assuming a sensible deque implementation), because each number you push on can be popped off only once. If you end up popping lots of entries at once, it means you went a long time without popping anything while the deque was being populated.
What an amateur: the algorithm described here suffers from the same problem as keeping a running sum, only in the other direction. It will stop working when the values you send it, divided by the count, are smaller than the precision of floats on your system...
Actually, depending on the values you give it, it might perform a lot worse than the running-sum version.
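A quick demonstration of that failure mode, with made-up numbers: once the count is large enough, the per-sample correction falls below the spacing of representable doubles around the average and is silently lost.
-- Hypothetical numbers: at count = 2^54 the correction (x - avg) / count
-- is ~2.8e-17, smaller than the gap between doubles near 0.5 (~1.1e-16).
local avg, count = 0.5, 2^54
local x = 1.0
local updated = avg + (x - avg) / count
print(updated == avg)  -- true: the new sample had no effect on the average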
Surely a better method would be to detect the point at which the sum will overflow, and at that point halve both the sum and the count (a right shift)? Sure, it's still not exact, but I think it's a far better long-term solution.
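A rough sketch of that idea (the threshold and names are my own; Lua numbers are IEEE doubles, so I treat 2^52 as the point where exact integer arithmetic would start to erode):
local LIMIT = 2^52  -- assumed threshold; pick whatever suits your number type
local sum, count = 0, 0
function add_sample (x)  -- assumes non-negative samples for the simple check
  if sum + x >= LIMIT or count + 1 >= LIMIT then
    sum, count = sum / 2, count / 2  -- the "shift right": sum/count unchanged
  end
  sum = sum + x
  count = count + 1
  return sum / count
end
After a halving, older samples effectively carry half the weight of newer ones, which is why it's approximate rather than exact.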
raganwald, you'd ask that in an interview? I think it'd be a bit tough. Though I've never had a chance to really grill a candidate -- they usually get nervous with open-ended questions.
I'm kinda notorious for stating that I don't ask questions like this in interviews. My basic feeling is that asking for any working function that computes the average of 100 numbers will eliminate 1990 out of 2000 blind submissions, and you need something (be that an interview question or a conversation or whatever) a lot deeper to differentiate the remaining ten competent people from each other.
But that doesn't mean that you won't run into someone out there asking a question like this.
Another advantage of this method is that for data where the long-running average is meaningful (i.e., not a completely bizarre distribution) it is more numerically stable.
I might be making a mistake (should go to sleep), but shouldn't the alternate formula be ((average * count) + x) / ++count, that is, ++count instead of count++? Then I arrive at average += (x - average) / (count + 1).
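A quick sanity check of that form against the direct sum (throwaway sketch, sample values are arbitrary):
local xs = { 3, 1, 4, 1, 5, 9, 2, 6 }
local avg, count, sum = 0, 0, 0
for _, x in ipairs(xs) do
  avg = avg + (x - avg) / (count + 1)  -- count = samples seen before x
  count = count + 1
  sum = sum + x
  print(avg, sum / count)  -- the two columns agree (up to rounding)
end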