3Gear Systems - http://threegear.com - San Francisco, CA
Contact: jobs@threegear.com
Bringing the 'Minority Report' user interface to reality, without the gorilla-arm. We're a team of three research engineers developing fundamental, finger-precise hand-tracking and gesture recognition technology. We're looking for two more engineers to join us with experience in some of the following:
Computer graphics engineer:
- Solid understanding of the practical aspects of the computer graphics pipeline, shaders
- Comfortable with 3D math: vectors, matrices, rotations, projection, etc.
- Solid understanding of computer systems: caches, low-level optimization
- Game development background ideal, user interaction design a plus
- Comfortable with C/C++
Computer graphics / computer vision research engineer:
- Strong optimization or machine learning background
- Experience implementing algorithms on 3D geometry or 2D images
- Solid understanding of computer systems
- Research experience ideal
- Comfortable with C/C++
Very true. I work at a company building an alternative gestural input device (http://threegear.com). Here's how we have tried to address your points.
1. Gorilla arm -- keep your hands low. We support tracking and interactions literally 1cm above the keyboard / desk. We're mounting the camera above the monitor to achieve this.
2. We use gestures with built-in physical feedback. For instance, our click mechanism is a "pinch" which brings the thumb and index finger tips together. You can "feel" the physical touch event between your fingers when you trigger a command.
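The pinch "click" described above can be sketched in a few lines. This is an illustrative reconstruction, not 3Gear's actual code: the fingertip-distance thresholds are made-up values, and the hysteresis (separate on/off thresholds) is a standard trick so one pinch fires exactly one event instead of one per frame.

```python
import math

# Hypothetical thresholds (cm) -- not 3Gear's real values.
PINCH_ON = 2.0   # fingertips closer than this triggers a "click"
PINCH_OFF = 3.0  # fingertips must separate past this before re-triggering

def distance(a, b):
    """Euclidean distance between two 3D fingertip positions."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

class PinchDetector:
    """Turns per-frame thumb/index fingertip positions into click events,
    with hysteresis so a held pinch doesn't repeat."""

    def __init__(self):
        self.pinched = False

    def update(self, thumb_tip, index_tip):
        d = distance(thumb_tip, index_tip)
        if not self.pinched and d < PINCH_ON:
            self.pinched = True
            return "click"          # pinch just closed
        if self.pinched and d > PINCH_OFF:
            self.pinched = False    # pinch released; ready for the next click
        return None
```

The two-threshold design matters: with a single threshold, sensor noise near the boundary would make one physical pinch register as several clicks.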
Actually, it's quite possible to build a "clicker" on the Kinect. It involves mounting the sensor from above, and building completely different software that tracks the hands and fingers well.
Video of hands (and fingers) doesn't seem like the hard part here, though (relatively speaking -- I'm sure it's plenty hard). The hard part is recognizing the input the user wants. Tracking the finger is easy; figuring out when that finger motion is a click is the hard part.
The video on your website is significantly more impressive than the linked youtube video. That being said, I still didn't see any "clicking". Perhaps I missed it.
A big problem with the LEAP is that there isn't an effective way to click / select something. Pushing forward with your index finger isn't very accurate when that same finger tip is also controlling the cursor position, so you always seem to miss where you intend to click. Good selection is a pretty important piece of almost any useful application.
Disclaimer: I work at 3Gear Systems (http://threegear.com), developing technology that possibly competes with the LEAP. We solve clicking by tracking the entire hand -- not just the straight finger.
This is very interesting, doubly so because at my last gig (retailnext.net) we were looking into shopper tracking via ToF technology.
Fine gestures like clicking do indeed work terribly on the Leap. Their own app (Touchless) uses an awkward "poke" gesture for clicking rather than some coarser, more easily detectable gesture.
The one thing the Leap has going for it, though, is that it is small and positioned under the wrist, which makes installation super easy, and at some point it will be built into laptops and keyboards. This means it will likely do better in the consumer space (assuming they can fix all the bugs, that is).
However, if your technology can detect more gestures robustly, it will do WAY better in professional environments where ease of installation is not such a big deal (e.g. the operating room, animation studios, etc.). I'm sure you already know this, but just typing out my thoughts :)
A couple medical device companies and a hospital are evaluating our system right now. :-)
We're actively working on supporting smaller / shorter-range sensors as well. You probably already know that in addition to the Kinect, a lot more depth cameras are on the market now: PrimeSense's Capri, SoftKinetic, PMD, Inuitive Tec, etc. All of these companies have introduced gum-stick sized sensors that can be embedded in a laptop or monitor.
I actually had no problem with clicking accurately. Instead of pushing "forward", I twitched my finger downwards, like you would normally do when clicking a mouse. My problem was that fingers would randomly appear and disappear, even though my hands hadn't moved (or at least not enough for me to notice). I attempted to play Cut the Rope and was completely unable to get it to reliably track a single finger.
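One common mitigation for this kind of flicker (not something the Leap necessarily does, just a standard technique) is temporal debouncing: only report a finger as present if it was detected in a majority of recent frames, trading a few frames of latency for stability. A minimal sketch, with an assumed five-frame window:

```python
from collections import deque

class FingerDebouncer:
    """Suppresses frame-to-frame detection flicker by majority vote
    over a sliding window of recent frames."""

    def __init__(self, window=5):
        self.history = deque(maxlen=window)

    def update(self, detected_this_frame):
        """Record one frame's raw detection; return the smoothed result."""
        self.history.append(detected_this_frame)
        return sum(self.history) > len(self.history) // 2
```

A single dropped frame in the raw detector no longer makes the finger vanish, because the surrounding frames outvote it.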
When I first arrived at MIT, I was handed a book on "How to Get Around MIT." I was impressed with the section on hacking, which included the following story about the Harvard-Yale hack:
"DKE has tried to hack the game before, most memorably in the late 1940s when they buried explosive cord in a pattern that would spell out "MIT''. Unfortunately, Harvard discovered the hack and set up a trap. They arrested several students wearing coats lined with batteries. A dean, who had been informed about the hack after the arrest, went down to bail the students out. He pointed out to the detective that the battery-lined coats were only circumstantial evidence. At this point the dean opened his own battery-lined coat and declared "all Tech men carry batteries.''"
My point is that MIT presents itself as a place that defends hacking, and it has at least been lenient in the past.
It's hard to specify our exact workspace, because it's the intersection of the two camera frustums. Here's a rough estimate: 2 feet wide x 1.75 feet deep x 1.25 feet high.
What's great about the Kinect is that it lets developers go after "3D computer vision" problems rather than "2D vision."
There's a wealth of techniques from computer graphics on dealing with 3D point clouds, whereas even basic things like background subtraction are still hard (and not completely robust) in the 2D vision world.
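To illustrate why depth makes this easier: with a depth map, background subtraction can be a simple per-pixel comparison against a captured background, with no color or lighting model required. A toy sketch -- the 5 cm margin is an assumed noise threshold for illustration, not a value from any real system:

```python
import numpy as np

def foreground_mask(depth, background, margin_cm=5.0):
    """Mark pixels measurably closer to the camera than the background.
    Zero depth values (no sensor reading) are treated as background."""
    valid = depth > 0
    return valid & (depth < background - margin_cm)

# Toy scene: a flat desk 100 cm away, with a "hand" 60 cm away.
background = np.full((4, 4), 100.0)
frame = background.copy()
frame[1:3, 1:3] = 60.0
mask = foreground_mask(frame, background)
```

The equivalent problem in 2D color imagery needs per-pixel statistical models (and still breaks under shadows and lighting changes), which is the point of the comparison above.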
I've just replied to the parent post with some discussion about how we differ in terms of features and technical approach.
We're also certainly cognizant of the price of our SDK. We hope that not too far down the road, we'll be able to spec out some cheaper hardware for our users.
It's indeed similar, but we think there are three important differences.
Because our cameras are top mounted, our system is more comfortable / ergonomic. You don't have to lift your hands very high to interact, and your arms are always supported comfortably by the desk. We can enable convenient interactions right above the keyboard or even turn the desk itself into a touchscreen. Basically, our system is designed to be comfortable enough to use all day.
The second difference is more speculative, because it's hard to tell exactly what the Leap can or cannot do based off their video. Our system captures the entire hand rather than just the finger tips. This lets you use more natural gestures. For example, you can spin a virtual object by rotating your hand as if you were holding it (like in the sword-waving example in our video).
Finally, we're starting with commodity hardware you can get immediately. It may look a little clunky right now, but we're experimenting with different cameras and different frames. Further down the road, it'll be a monitor / laptop clip-on or even built in. Today, we're just releasing an SDK to let developers get started.
Thanks for the feedback! All of us at 3Gear spend more time than we should reading HN.
I know from certain sources that the next Kinect version is going to be at least as accurate as the Leap and probably more so. Curious whether your project will adapt to provide a Mac/Linux-compatible API layer for the new hardware.
We're looking forward to working with all the new 3D camera technology coming down the pipe, and as soon as we have the resources to do so, we'll start supporting Mac and Linux too.