Well, about Wednesday night I was looking at one of the SIFT-annotated images output by my vision system and I finally snapped. It just looked wrong. It had been bothering me for a while that it didn’t seem to be selecting very interesting features and was taking an unreasonable amount of interest in the bedroom wall, so I broke out the example binary provided by Professor Lowe himself (the inventor of the SIFT algorithm) for comparison.
As I suspected, running the two SIFT implementations against identical images yielded completely different results. Lowe’s implementation found many more, and far more interesting, features than the version I had been using for months.
This discovery was simultaneously bad and good news. Bad in that it meant I had a bunch of work to do, but good in that I had finally discovered the source of all my frustration.
My new plan of attack has been to write a bunch of C# code to invoke Lowe’s binary out of process and parse the resulting key file back into my program. It’s all a bit messy really, but this whole project is still just a proof of concept, so I’m not too stressed about performance at the moment. I have little doubt that C# is considerably slower than C for this sort of work, but development time is significantly shorter, and for a prototype that’s what matters.
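For anyone curious what "invoke the binary and parse the key file" looks like, here is a minimal sketch. It is written in Python for brevity and is not the C# code described above. It assumes the key file layout described in the README that ships with Lowe’s demo (a header with the keypoint count and descriptor length, then for each keypoint four floats for row, column, scale and orientation, followed by the descriptor integers), and it assumes the binary reads a PGM image on standard input and writes the keys to standard output. The names `Keypoint`, `parse_keys` and `run_sift` are made up for illustration.

```python
import subprocess
from dataclasses import dataclass
from typing import List


@dataclass
class Keypoint:
    row: float
    col: float
    scale: float
    orientation: float  # radians
    descriptor: List[int]


def parse_keys(text: str) -> List[Keypoint]:
    """Parse key file text (assumed layout: header, then one record per keypoint)."""
    # The format is whitespace separated, so line breaks carry no meaning.
    tokens = text.split()
    count, length = int(tokens[0]), int(tokens[1])
    keypoints = []
    pos = 2
    for _ in range(count):
        row, col, scale, orientation = (float(t) for t in tokens[pos:pos + 4])
        descriptor = [int(t) for t in tokens[pos + 4:pos + 4 + length]]
        if len(descriptor) != length:
            raise ValueError("key file ended before all descriptors were read")
        keypoints.append(Keypoint(row, col, scale, orientation, descriptor))
        pos += 4 + length
    return keypoints


def run_sift(binary_path: str, pgm_path: str) -> List[Keypoint]:
    """Run the external SIFT binary out of process and parse what it prints.

    Assumption: the binary takes the PGM image on stdin and emits keys on stdout.
    """
    with open(pgm_path, "rb") as image:
        result = subprocess.run(
            [binary_path], stdin=image, stdout=subprocess.PIPE, check=True
        )
    return parse_keys(result.stdout.decode("ascii"))
```

Splitting on whitespace keeps the parser indifferent to how the descriptor values are wrapped across lines, which is the messy part of reading the file line by line.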
I still have a bunch of work to do before I fully understand stereo vision and the associated camera calibration requirements, but I am getting there.