Short summary: They used machine learning and lots of training data to come up with an algorithm that converts moving images into moving joint skeletons.
Given the huge variation in human body shapes and clothing, it will be interesting to see how well this performs in practice.
I bet people will have a lot of fun aiming the Natal camera at random moving objects to see what happens. I can already imagine the split-screen YouTube videos we'll see of Natal recognizing pets, prerecorded videos of people, puppets and marionettes.
Oh, and of course, people will point Natal back at the TV screen and see if the video game character can control itself.