First, let me say up front that this is a very neat project for a high schooler; I'm not trying to downplay what he did. He figured out how to overlay an image over video in real time, which is neat.
However, this is only a small part of what is actually done to present the final image. The more complicated part is the computer vision, which allows for things like compositing the image so that the swimmers are always above the labels (i.e. the superimposed labels appear to be at the bottom of the pool). They are also using computer vision to recognize where the labels should go, rather than some kind of positional mount on the camera as he suggests. The image is processed by a computer that can recognize the lane lines, the boundaries of the pool, and the bottom of the pool. This way, when the camera moves, the image is updated automatically.
Again, I'm not trying to downplay what was done here; it is a very nice hack.
The NFL strictly uses positional cameras. The first down markers, ads, and other on-field graphics are only available from specific views. It's actually a very low-tech approach.
College and NFL football broadcasts typically have 3 to 4 stationary cameras mounted at a ~30 degree angle. These cameras are bolted down and do not move from their positions. They transmit their zoom, angle, and pitch very precisely, which lets the production team know exactly where they are pointing at all times. During the broadcast the team can choose the exact location for the first down markers, etc., and all the high-angle cameras will add it in. To do the overlay they choose a swatch of the field to use as a green-screen. The only trouble that arises is when a team's uniforms are very close to the field color.
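To make the swatch idea concrete, here is a toy sketch in Python. It is my own illustration, not the broadcasters' code: the names `key_overlay` and `color_distance`, the tolerance of 30, and the RGB values are all assumptions. A pixel gets painted only if it is on the line and its color is close to a color sampled from the field.

```python
def color_distance(a, b):
    """Largest per-channel difference between two RGB tuples."""
    return max(abs(x - y) for x, y in zip(a, b))


def key_overlay(frame, line_mask, paint, swatch, tol=30):
    """Paint `paint` where line_mask is True AND the pixel looks like field.

    frame     -- rows of (r, g, b) tuples
    line_mask -- rows of booleans marking where the line should appear
    swatch    -- reference colors sampled from the field (assumed input)
    tol       -- hypothetical per-channel tolerance
    """
    out = []
    for row, mask_row in zip(frame, line_mask):
        new_row = []
        for pixel, wanted in zip(row, mask_row):
            is_field = any(color_distance(pixel, ref) <= tol for ref in swatch)
            new_row.append(paint if wanted and is_field else pixel)
        out.append(new_row)
    return out


GRASS = (40, 140, 50)
RED_JERSEY = (200, 30, 30)
GREEN_JERSEY = (45, 150, 60)  # deliberately close to the grass color
YELLOW = (255, 220, 0)

frame = [[GRASS, RED_JERSEY, GREEN_JERSEY]]
line_mask = [[True, True, True]]
result = key_overlay(frame, line_mask, YELLOW, swatch=[GRASS])
```

In this toy frame the red jersey is left alone, but the green jersey is within tolerance of the grass and gets painted over, which is exactly the uniform problem mentioned above.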
They don't use green screen. Before the game with the field empty they do a pan and scan of the entire field from every camera. That gives the system a baseline for change detection that works regardless of team colors.
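For what it's worth, here is roughly what I mean by change detection, as a toy Python sketch. This is an illustration under my own assumptions, not the real system: `foreground_mask`, `draw_line`, the threshold of 25, and the colors are made up. Each live pixel is compared against the same pixel in the empty-field scan, and the line is drawn only where nothing has changed.

```python
def foreground_mask(baseline, frame, threshold=25):
    """True where a live pixel differs from the empty-field baseline."""
    return [
        [max(abs(p - q) for p, q in zip(px, base_px)) > threshold
         for px, base_px in zip(row, base_row)]
        for row, base_row in zip(frame, baseline)
    ]


def draw_line(frame, baseline, line_mask, paint, threshold=25):
    """Paint the line only on pixels that still match the baseline."""
    fg = foreground_mask(baseline, frame, threshold)
    return [
        [paint if wanted and not blocked else px
         for px, wanted, blocked in zip(row, mask_row, fg_row)]
        for row, mask_row, fg_row in zip(frame, line_mask, fg)
    ]


GRASS = (40, 140, 50)
CHALK = (240, 240, 240)         # painted yard line on the field
WHITE_JERSEY = (235, 235, 235)  # nearly the same color as the chalk
YELLOW = (255, 220, 0)

baseline = [[GRASS, CHALK, GRASS]]       # scan of the empty field
frame = [[GRASS, CHALK, WHITE_JERSEY]]   # live frame, player on the right
line_mask = [[True, True, True]]
result = draw_line(frame, baseline, line_mask, YELLOW)
```

The white jersey is detected because it differs from what was at that pixel in the empty scan, even though its color matches the chalk elsewhere on the field. A real system would presumably also have to keep the baseline aligned with camera motion and cope with changing light, which this sketch ignores.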
By 'green-screen' I meant that they take a small snapshot of the field and use those colors as a replacement. A scan before the game would do them very little good, especially in open-air stadiums where game conditions can change immediately.
Well, apparently it does them a lot of good, because that is what they do. Check out the Wikipedia article on First and Ten I posted further up in the thread.