A new server, a new look and a return to regular service after a bit of a hiatus from blogging.
I’ve returned to my computer vision project with a vengeance and am close to having the first part working. This includes feature location and tracking between frames, 3D reconstruction and ego motion estimation. The final piece of the puzzle is my old nemesis – non-linear least squares minimisation. For this I am using the levmar library implementation of the Levenberg-Marquardt algorithm. So far I have a partial solution in that it works well for translations, but not so well for rotations, so I need to do a bit more work on that yet. Once that’s sorted, the last step will be implementing a Kalman filter to merge the odometry data from the wheel encoders with the ego motion estimation from the vision system.
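As a back-of-the-envelope sketch of that fusion step (not my actual code), a one-dimensional Kalman update blending a wheel-odometry displacement with a visual ego-motion measurement looks like this. All the variances are invented numbers, purely for illustration:

```python
# Minimal 1-D Kalman-style fusion of two displacement estimates.
# The noise variances here are made up for illustration only.

def kalman_fuse(x, p, z, r):
    """Update estimate x (variance p) with measurement z (variance r)."""
    k = p / (p + r)          # Kalman gain: how much to trust z
    x_new = x + k * (z - x)  # corrected estimate
    p_new = (1 - k) * p      # fused estimate is more certain than either
    return x_new, p_new

# predict: wheel encoders report 1.00 m travelled (variance 0.04)
x, p = 1.00, 0.04
# correct: vision ego-motion reports 1.10 m (variance 0.01)
x, p = kalman_fuse(x, p, 1.10, 0.01)
# the fused estimate lands between the two, closer to the less noisy one
```

The real filter has a full state vector (pose plus velocities) and matrix covariances, but the gain/update structure is the same.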
I’m simultaneously resurrecting AMI (my robotics research platform) and am working on a few upgrades for that. Getting the environment up and running was easy this time: only a few hours to get the cross toolchain built and various other tools sorted, and I was loading s19 records into the 68332 and seeing “Hello world!”.
I’m off to TechEd next week so that should be interesting!
On the train on the way back from a weekend in Stratford-upon-Avon, I decided I have been remiss in neglecting my blog for so long. I often get overwhelmed by the pace of life in London, and things like blogging quickly drop off the list of priorities. However, the other day while reading back over some of my past entries I realised that this blog is an important record for me as well as a way of keeping in contact with distant friends and family.
Anthea and I have certainly been very busy, and when I reflect back on what we have done I realise this has probably been the most social year of our lives, combined with the most travelling we have ever done – and for me at least – one of the most challenging working years of my career. I am very proud of what we have achieved this year, and although I am very tired and desperately looking forward to our Christmas break, I wouldn’t have had it any other way. My goal now is to figure out how to fit even more into next year!
I have been neglecting my vision project over the last few months, making small progress by picking at it whenever I’ve had some spare time. What hasn’t helped is that a couple of months ago I decided to build a MythTV box to record the hockey (since most of it is screened in the middle of the night). This has turned out to be a far, far more challenging task than I anticipated and has consequently stolen time from my other projects. The good news is that the end is finally in sight and I will hopefully have a stable system running in the next week or two.
I have spent some of this weekend reading the chapter on Structure Computation in my computer vision text – essentially working out how to reconstruct a 3D scene from stereoscopic information. I think I understood most of it on first reading, which testifies to the fact that a lot of the revision work I have been doing on linear algebra has paid off. My plan is to implement a naive algorithm first, and once that is working, refine it with an optimal (and much more challenging) solution. It is still my hope that once I have the reconstruction phase working, the rest of the project will progress very quickly. I am still amazed at how difficult even the elementary concepts of computer vision are. I really had no idea what I was getting myself into when I started this project.
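The naive method in question is essentially the standard linear (DLT) triangulation: stack the projection equations from both cameras into a homogeneous system and take its null vector via SVD. A sketch with two toy projection matrices (not my actual calibration values):

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one stereo correspondence."""
    # each image point contributes two linear constraints on the 3D point X
    A = np.array([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]                 # null vector = homogeneous 3D point
    return X[:3] / X[3]

# toy rig: identity camera plus a 0.1 m baseline along x
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-0.1], [0.0], [0.0]])])
X_true = np.array([0.2, 0.1, 2.0])
x1 = P1 @ np.append(X_true, 1.0); x1 = x1[:2] / x1[2]
x2 = P2 @ np.append(X_true, 1.0); x2 = x2[:2] / x2[2]
X_rec = triangulate(P1, P2, x1, x2)   # recovers X_true exactly (no noise)
```

With noisy matches this minimises an algebraic rather than geometric error, which is exactly why the optimal solution is so much harder.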
That’s me! I just can’t seem to get into the habit of blogging on a regular basis. Quite a bit has happened since my last post. I’ve just delivered a pretty major project at work, which I think went well. It was a massive amount of work though – long, draining days – which has left me desperately in need of a holiday. Luckily Anthea and I have the next two weeks off to trip around northern Italy! We are both really looking forward to it and have been hearing amazing stories from our friends about places to visit.
I have also returned to my computer vision project. I got talking about it to a mate a few weeks back and found myself getting really enthused again. Of course the reality of the situation is that it is still bloody hard work, but I am taking a much more systematic approach this time and taking more time to fully understand the math before coding. My first goal is to reliably find the fundamental matrix. Once I have that nailed, 3D reconstruction shouldn’t be too hard.
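For my own reference, the classic route is the eight-point algorithm: each correspondence gives one linear constraint on the nine entries of F, and the rank-2 property is enforced afterwards. A sketch on synthetic, noise-free coordinates (real pixel data also needs Hartley normalisation, which I’ve left out):

```python
import numpy as np

def eight_point(pts1, pts2):
    """Estimate F from >= 8 correspondences (no normalisation step)."""
    A = np.array([[x2 * x1, x2 * y1, x2, y2 * x1, y2 * y1, y2, x1, y1, 1.0]
                  for (x1, y1), (x2, y2) in zip(pts1, pts2)])
    _, _, Vt = np.linalg.svd(A)        # f = null vector of A
    F = Vt[-1].reshape(3, 3)
    U, S, Vt = np.linalg.svd(F)        # enforce the rank-2 constraint
    S[2] = 0.0
    return U @ np.diag(S) @ Vt

def proj(P, X):
    h = P @ np.append(X, 1.0)
    return h[:2] / h[2]

# synthetic rig: second camera rotated 0.1 rad about y, centre at x = 0.5
c, s = np.cos(0.1), np.sin(0.1)
R = np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])
t = np.array([0.5, 0.0, 0.0])
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([R, (-R @ t).reshape(3, 1)])
X = [[0.1, 0.2, 4.0], [1.0, -0.3, 5.0], [-0.8, 0.5, 4.5],
     [0.4, 1.1, 6.0], [-0.2, -0.7, 5.5], [0.9, 0.8, 4.2],
     [-1.1, 0.1, 6.3], [0.3, -1.0, 4.8], [0.7, 0.4, 5.7]]
pts1 = [proj(P1, Xi) for Xi in X]
pts2 = [proj(P2, Xi) for Xi in X]
F = eight_point(pts1, pts2)   # every pair should satisfy x2' F x1 ~ 0
```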
I got another LG 19″ monitor this week so I now have a cool dual head setup. I’ll take a pic and post it sometime soon.
So it turns out this 3D reconstruction stuff is a lot more complicated than I initially thought. I am trying to compute the essential matrix for my camera system, but I am having real trouble getting something that makes sense. I have code that produces an output; it’s just wrong.
It would help if I had a really good text that explained this stuff in baby language so that someone as mathematically challenged as me could understand it, but unfortunately I don’t at the moment, so I am doing a lot of googling. I have ordered Multiple View Geometry in Computer Vision from Amazon, as this seems to be the authoritative text; however, it’s got a 4–6 week delivery time.
Looks like this could take a while.
Well, I was really hoping that my next post regarding my vision project would announce that I have phase one working correctly. Alas, that is not the case. After some hard debugging over the last couple of weeks I found and fixed a number of bugs and problems.
I am now much more confident in my matching between stereo images, and also in my interframe matching. I rewrote this stuff to be much more in line with the work of Se et al. During the testing process I found a major bug in my kd-tree range search implementation which must have been having a major impact on the number of good quality matches I was getting.
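To illustrate the kind of bug a range search can hide, here is a bare-bones 2-D k-d tree sketch (the real code indexes SIFT descriptors, not 2-D points, and none of these names are from my actual implementation):

```python
import random

# Toy k-d tree over 2-D points with a radius range search.

def build(points, depth=0):
    if not points:
        return None
    axis = depth % 2
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {"point": points[mid], "axis": axis,
            "left": build(points[:mid], depth + 1),
            "right": build(points[mid + 1:], depth + 1)}

def range_search(node, centre, radius, found):
    if node is None:
        return
    px, py = node["point"]
    if (px - centre[0]) ** 2 + (py - centre[1]) ** 2 <= radius ** 2:
        found.append(node["point"])
    diff = centre[node["axis"]] - node["point"][node["axis"]]
    near, far = ("left", "right") if diff < 0 else ("right", "left")
    range_search(node[near], centre, radius, found)
    # the far subtree may only be pruned when the splitting plane is
    # further away than the search radius -- getting this test wrong
    # silently drops matches, which is easy to miss without automation
    if abs(diff) <= radius:
        range_search(node[far], centre, radius, found)

random.seed(1)
pts = [(random.uniform(0, 10), random.uniform(0, 10)) for _ in range(200)]
tree = build(pts)
found = []
range_search(tree, (5.0, 5.0), 2.0, found)
```

Checking the result against a brute-force scan over the same points is a cheap automated test for exactly this class of bug.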
I find it quite difficult to test this stuff as I can’t think of a good way of automating the testing of the quality of matching between images. At the moment I simply annotate the images with the SIFT feature locations and then check for matches by eye.
I got really excited about a week ago when I thought I had finally cracked it and was getting reliable interframe matching and ego motion estimation. In fact, I still think that for the most part it is working correctly. However, when I checked the 3D world coordinates the system was calculating for the observed features, I realised they were completely wrong. Something that was in reality 1 metre away from the centre of the camera system was registering as being nearly 3 metres away, and something that was in fact over 2 metres away came back as less than a metre. I think the problem is in the way I am calculating the disparity: I am not taking all the camera intrinsic and extrinsic parameters into account properly.
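For reference, the textbook relationship for a rectified pair is Z = f·B/d, so any error in the focal length or baseline rescales every depth. A toy sketch with invented numbers:

```python
# Depth from disparity for a rectified stereo pair: Z = f * B / d.
# Focal length (pixels) and baseline (metres) are invented values.

def depth_from_disparity(f_px, baseline_m, disparity_px):
    if disparity_px <= 0:
        raise ValueError("non-positive disparity: bad match or point at infinity")
    return f_px * baseline_m / disparity_px

z = depth_from_disparity(f_px=700.0, baseline_m=0.12, disparity_px=42.0)
# note how a mis-estimated baseline or focal length rescales every depth
```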
To that end I am now working on calculating the essential matrix from the fundamental matrix, and then I should be able to much more accurately calculate the relative position of the observed features.
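The conversion itself is a one-liner once the intrinsics are known: E = K2^T F K1. A round-trip sanity check with invented intrinsics and an idealised pure-translation rig:

```python
import numpy as np

def essential_from_fundamental(F, K1, K2):
    """E = K2^T . F . K1, with K1, K2 the camera intrinsic matrices."""
    return K2.T @ F @ K1

# pure x-translation gives E = [t]x, a skew-symmetric matrix,
# and the corresponding fundamental matrix is F = K2^-T E K1^-1
E_true = np.array([[0.0, 0.0, 0.0],
                   [0.0, 0.0, -1.0],
                   [0.0, 1.0, 0.0]])
K1 = np.array([[700.0, 0.0, 320.0],   # invented intrinsics
               [0.0, 700.0, 240.0],
               [0.0, 0.0, 1.0]])
K2 = K1.copy()
F = np.linalg.inv(K2).T @ E_true @ np.linalg.inv(K1)
E = essential_from_fundamental(F, K1, K2)   # recovers E_true
```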
Time will tell.
Tonight I managed to get a clunky little UI working. It took a bit of work as it calls out to a C library I wrote via interop. The UI was designed using Glade and uses the Gtk widget set. The Glade/Gtk/Mono combination makes it easy to build decent GUI applications in no time at all. If I had a choice, I would love to develop in Gtk instead of WinForms in my professional life too!
So, here’s the screen shot:
The text you can see next to the buttons is the fundamental matrix calculated for the chess board points found in the stereo image pair. The UI needs a bunch more work, but it’s really only a test harness and a debugging tool for me while I get the application working properly. Once the system is put on a real mobile platform it won’t have a GUI at all.
The next step is to test if the epipolar constraints I was talking about in my last post actually improved the accuracy of the matching process.
Anthea and I are off to Germany tomorrow to visit some friends and see some Christmas markets. Should be a great time and hopefully I might find some good photo opportunities too.
The epipolar geometry turned out to be much harder than I anticipated. I now have enough of it done to be able to check the epipolar constraint successfully. I am now in the process of reworking my stereo matching algorithm to include the epipolar constraint along with disparity, orientation, scale and uniqueness constraints. I’m hoping this will improve my matching algorithm and reduce any false positives. That in turn should improve my localisation accuracy as there will be fewer outliers and less error to minimise.
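A sketch of how that constraint stack might look as a single match predicate – the thresholds and feature fields here are all invented, and the F in the example corresponds to an idealised rectified rig, not my cameras:

```python
import numpy as np

def passes_constraints(f1, f2, F, max_epi=0.01, max_disp=100.0,
                       max_dori=0.3, max_scale_ratio=1.5):
    """Accept a candidate stereo match only if every constraint holds."""
    x1 = np.array([f1["x"], f1["y"], 1.0])
    x2 = np.array([f2["x"], f2["y"], 1.0])
    if abs(x2 @ F @ x1) > max_epi:             # epipolar constraint
        return False
    d = f1["x"] - f2["x"]                      # disparity constraint
    if not (0.0 < d < max_disp):
        return False
    if abs(f1["ori"] - f2["ori"]) > max_dori:  # orientation agreement
        return False
    r = f1["scale"] / f2["scale"]              # scale agreement
    if not (1.0 / max_scale_ratio < r < max_scale_ratio):
        return False
    return True

# idealised rectified rig: the epipolar check reduces to y1 == y2
F = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]])
good = passes_constraints({"x": 100.0, "y": 50.0, "ori": 0.5, "scale": 2.0},
                          {"x": 90.0, "y": 50.0, "ori": 0.55, "scale": 2.1}, F)
bad = passes_constraints({"x": 100.0, "y": 50.0, "ori": 0.5, "scale": 2.0},
                         {"x": 90.0, "y": 55.0, "ori": 0.55, "scale": 2.1}, F)
```

Uniqueness (one-to-one matching) needs a pass over all candidates rather than a per-pair test, so it is left out of the sketch.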
It’s been pretty hard to find time to work on this stuff recently as we have been so busy socially. I guess it’s that time of year with Christmas parties, catching up with friends etc. We are going to Germany this weekend and are off to India for two weeks from the 25th of Dec so I won’t really get time to do any decent hacking on this for the next month or so. I am really hoping that I have enough of the pieces in place that I’ll be able to make some good progress in the new year and actually be able to report some reasonable results.
I have been doing a little bit of reading about FPGA design lately and am getting quite excited about that stage of the project. I expect that will be immensely difficult but should be quite interesting too.
I’m finally making progress on my Vision project again. After spending a couple of months trying to figure out how to do nonlinear least squares minimisation I think it is now working. I still have quite a bit of testing to do before I am sure it is giving sensible results but so far it looks good. I couldn’t have gotten this working without the tireless help of Robin Hewitt who (via the Seattle Robotics Society mailing list) coached me through the process.
I have now returned to the beginning of the project. Before I go any further I have decided to rework what I have done so far. My understanding of computer vision concepts has come a long way since I started this project and I am sure I can improve my original code.
To that end I am writing a camera calibration application that will more accurately calibrate my camera system. This program will be able to calculate the intrinsic parameters of each camera and save and load these parameters from file. It will also be able to calculate the essential matrix of the stereo system, and hence will enable me to match SIFT features far more accurately, as I will be able to include the epipolar constraint, which I am currently not using.
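The intrinsic parameters in question are the ones that make up the usual K matrix – focal lengths, principal point and (optionally) skew. A sketch with invented calibration values:

```python
import numpy as np

def make_K(fx, fy, cx, cy, skew=0.0):
    """Assemble the intrinsic matrix from focal lengths and principal point."""
    return np.array([[fx, skew, cx],
                     [0.0, fy, cy],
                     [0.0, 0.0, 1.0]])

def project(K, X_cam):
    """Project a 3D point in camera coordinates to pixel coordinates."""
    x = K @ X_cam
    return x[:2] / x[2]

K = make_K(700.0, 700.0, 320.0, 240.0)      # invented values
px = project(K, np.array([0.0, 0.0, 2.0]))  # a point on the optical axis
# lands on the principal point (320, 240)
```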
I am hoping this calibration work will have a drastic improvement on the accuracy of the system, but only time will tell I guess.
Finally managed to get an hour or so of hacking in tonight. I’m trying to figure out how to use the nonlinear least squares minimisation in the GNU Scientific Library. It’s pretty tricky and I think it may take a while before I get it sussed out.
I’m hoping to get some time to finish hacking on my changes to the SourceEditorDisplayBinding class in MonoDevelop. This will add the drop down select boxes for jumping to a method definition in code.
Maybe I might get some time for this on Sunday…
I really need to post more often!
I now think I understand where the errors I discussed in my last post come from. From what I have read, I am trying to use SVD as if I were doing a rigid body transformation – which I’m not. The fact that I am using a linear method to try and find errors in a nonlinear system means that I am getting a less exact answer than I would like.
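For contrast, this is the SVD-based rigid alignment (the Kabsch/Procrustes solution) that the linear approach amounts to – exact for a genuine rigid body motion between noise-free point sets, but only an approximation once the points carry nonlinear reconstruction error. Synthetic data, not my real features:

```python
import numpy as np

def rigid_align(P, Q):
    """Least-squares R, t with R @ P + t ~ Q (Kabsch / Procrustes)."""
    cp = P.mean(axis=1, keepdims=True)
    cq = Q.mean(axis=1, keepdims=True)
    H = (P - cp) @ (Q - cq).T                  # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T                         # guard against reflections
    t = cq - R @ cp
    return R, t

# synthetic check: a known rotation about z plus a translation
th = 0.3
R_true = np.array([[np.cos(th), -np.sin(th), 0.0],
                   [np.sin(th), np.cos(th), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([[0.1], [-0.2], [0.5]])
P = np.array([[0.0, 1.0, 0.0, 0.0, 1.0],
              [0.0, 0.0, 1.0, 0.0, 2.0],
              [0.0, 0.0, 0.0, 1.0, 3.0]])
Q = R_true @ P + t_true
R, t = rigid_align(P, Q)   # recovers R_true, t_true for noise-free data
```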
As a result I’ve spent the last week trying to understand how nonlinear least squares minimisation works. With the help of some great people on the Seattle Robotics Society mailing list I think I am now getting a handle on it. The next step is trying to figure out the practical application of the mathematical theory to my work. This mostly entails trying to decipher how to call the appropriate functions in the GNU Scientific Library.
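The iteration a nonlinear least squares solver performs can be sketched in a few lines on a toy problem (fitting y = a·exp(b·x)). This is not the GSL API itself, just the damped Gauss-Newton loop that sits underneath it:

```python
import numpy as np

def lm_fit(xs, ys, a, b, lam=1e-3, iters=50):
    """Toy Levenberg-Marquardt fit of y = a * exp(b * x)."""
    p = np.array([a, b], dtype=float)
    for _ in range(iters):
        model = p[0] * np.exp(p[1] * xs)
        r = ys - model                                        # residuals
        J = np.column_stack([np.exp(p[1] * xs),               # d model / d a
                             p[0] * xs * np.exp(p[1] * xs)])  # d model / d b
        # damped normal equations: (J^T J + lam I) step = J^T r
        step = np.linalg.solve(J.T @ J + lam * np.eye(2), J.T @ r)
        trial = p + step
        r_trial = ys - trial[0] * np.exp(trial[1] * xs)
        if r_trial @ r_trial < r @ r:
            p, lam = trial, lam * 0.5   # accept step, reduce damping
        else:
            lam *= 10.0                 # reject step, damp harder
    return p

xs = np.linspace(0.0, 1.0, 20)
ys = 2.0 * np.exp(1.5 * xs)            # synthetic data with a known answer
a_fit, b_fit = lm_fit(xs, ys, a=1.0, b=1.0)
```

In my problem the parameters are the six pose components rather than a and b, and the residuals are reprojection errors, but the structure of the loop is the same.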
I also figured out that there should be no error accumulation. That was due to my misunderstanding of how the ego motion is calculated: I was transforming my SIFT feature coordinates to world coordinates one step too early and was accidentally accumulating the error over each subsequent frame.
Hopefully with the fix to the latter problem and the incorporation of a nonlinear method to calculate the ego motion I should be getting much more accurate results.
In other news, I had a great weekend watching NZ win the Tri-nations with an exciting win over Australia. Had a great BBQ at our place on Saturday night. And to top it all off, watched my Hockey team gain a convincing win in their first game of the season (GO RACERS!) 😉