Tuesday, May 31, 2005

Through the "eyes" of my program

I finally have results from my program. It can now detect "major" foreground objects by learning the background. The image below shows three panels: the background (top left), the foreground (the colored balls) on the background (top right), and what the program has detected (bottom left).

Right now, I am using a single Gaussian distribution as the learning function. The results are not that great on real humans walking through the scene, so I might need a more complicated Gaussian model. I am thinking of implementing the background subtraction technique by Yamada et al. They use the intensity of each pixel (a linear function of R, G and B) to fit a single Gaussian distribution. To separate the foreground from the background, they then compare that Gaussian model against the intensity of each pixel in the image being searched.
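
A minimal sketch of the single-Gaussian idea for one pixel, not the actual code from my program or from the paper. The names (intensity, PixelGaussian, learnBackground, isForeground), the equal R/G/B weights, the threshold k and the minimum standard deviation are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Intensity as a linear function of R, G and B.
// Assumption: equal weights; the real weights may differ.
inline double intensity(double r, double g, double b) {
    return (r + g + b) / 3.0;
}

// Hypothetical per-pixel background model: one Gaussian over intensity.
struct PixelGaussian {
    double mean;
    double variance;
};

// Learn the model from the intensities seen at ONE pixel position
// across several background-only frames.
PixelGaussian learnBackground(const std::vector<double>& samples) {
    assert(!samples.empty());
    double sum = 0.0;
    for (std::size_t i = 0; i < samples.size(); ++i) {
        sum += samples[i];
    }
    const double mean = sum / samples.size();

    double squared = 0.0;
    for (std::size_t i = 0; i < samples.size(); ++i) {
        const double d = samples[i] - mean;
        squared += d * d;
    }
    PixelGaussian model = { mean, squared / samples.size() };
    return model;
}

// A pixel counts as foreground when its intensity lies more than
// k standard deviations from the learned mean (k is an assumed threshold).
bool isForeground(const PixelGaussian& model, double value, double k) {
    const double minStdDev = 1.0;  // assumed floor, avoids a zero-width model
    double stdDev = std::sqrt(model.variance);
    if (stdDev < minStdDev) {
        stdDev = minStdDev;
    }
    return std::fabs(value - model.mean) > k * stdDev;
}
```

In use, learnBackground would run once per pixel over the background frames, and isForeground would then be called for the same pixel in each new frame, with something like k = 2.5 as a starting point to tune.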

Monday, May 30, 2005

Selecting multiple files using an open file dialog

It's a pain to program using MFC (the Microsoft Foundation Classes). Even people at Microsoft use the Win32 API. MFC is a wrapper around the Win32 API, and it is meant to be easier to use. I haven't really had the chance to look at the Win32 API itself.

My task was to implement a multiple-select open file dialog. I found this forum post on the net, and it is the simplest way I have found so far. Other implementations involve extending the CDocManager class, which can get quite complicated, especially if you are not familiar with MFC.

Earlier, I was unable to properly construct a multiple file open dialog. My file open dialog would only open a limited number of files (usually 5-6). I soon realized that the limit came from the size of the buffer that receives the selected file names. Supplying a larger buffer, declared as an array of TCHAR, solves the problem. TCHAR is a character type, much like char. The only difference is that if you compile with Unicode enabled (compiler options set to Unicode), a TCHAR becomes a wchar_t (used for i18n). MSDN's documentation on TCHAR is here.
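
Once the dialog writes into a buffer you supply, a multiple selection comes back as one packed string: for Explorer-style dialogs, the directory first, then each file name, separated by null characters, with an extra null after the last name. A single selection comes back as one full path. The sketch below unpacks that layout. It is plain C++ using char instead of TCHAR so that it compiles anywhere, and the function name splitMultiSelectBuffer is made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Unpack "dir\0file1\0file2\0\0" into full paths.
// Assumption: char stands in for TCHAR to keep the sketch portable.
std::vector<std::string> splitMultiSelectBuffer(const char* buffer) {
    assert(buffer != NULL);

    // Step 1: cut the buffer at every null until the empty string
    // that marks the end of the list.
    std::vector<std::string> parts;
    const char* p = buffer;
    while (*p != '\0') {
        parts.push_back(std::string(p));
        p += std::strlen(p) + 1;
    }

    // Nothing selected, or a single file: the only entry is a full path.
    if (parts.size() <= 1) {
        return parts;
    }

    // Step 2: the first entry is the directory; prepend it to each name.
    const std::string& dir = parts[0];
    const bool needsSeparator = !dir.empty() && dir[dir.size() - 1] != '\\';
    std::vector<std::string> paths;
    for (std::size_t i = 1; i < parts.size(); ++i) {
        paths.push_back(dir + (needsSeparator ? "\\" : "") + parts[i]);
    }
    return paths;
}
```

The buffer has to be large enough for the directory plus every selected name, which is why a small default buffer only had room for a handful of files.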

GNU MP - arithmetic without limitations

GNU MP (a.k.a. GMP) is a library written in C for arbitrary-precision arithmetic. I was able to compile version 4.1 of GMP successfully in the Visual C++ 6.0 environment, using its cl.exe compiler.

I followed the instructions that I found on this site. You also need to download the core library from that site, where you will find the extra files (such as a .dsw workspace) that let you open the project in Visual C++ and compile it from there. To link your application to the library, you need to include the header and some library files that are available through a download of the static GMP library (see site).

Unfortunately, the C++ wrapper wouldn't work. To my disappointment, it wouldn't compile. However, it is safe to call GMP's C functions from C++. I attempted to overload the arithmetic operators by defining my own wrapper class around a GMP float, but I abandoned that effort and am currently using the mpf_xxxx functions directly. GMP's functions are quite cumbersome to use, so operator overloading might become necessary as the number of calculations grows.

links:
GNU MP's official site
GNU MP's documentation - priceless if you are thinking of integrating GMP into your application.

Saturday, May 28, 2005

Stuck with computations that don't fit

It's been good progress lately. I have implemented most of the code that was required to open multiple images. I spent some time trying to implement the multiple file open dialog in MFC, and it was a pain. There are different implementations on the web, such as the ones that extend the CDocManager class and write their own methods.

I came across a much simpler version at: click here and implemented that. However, it uses a CStringList to store the filenames, which seems to limit how many file paths it can hold. A reader posted that it can store about 40 files on average, depending on the length of the file names. An alternative solution is to use a TCHAR buffer. I haven't implemented that yet, so my multiple file select dialog is still limited in functionality.

I am able to read pixels off multiple images, which was the goal I had set for the weekend, and I think I have achieved it. I also created a GaussianModeler class that builds a single Gaussian distribution out of a list of red, green or blue values, so there is one single-Gaussian model for each of the color channels. The constructor reads its values from a vector, which I also spent some time implementing. It took me a while to figure out how to iterate through a vector when all you have is a pointer to it: you use an iterator.
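
For reference, here is roughly what iterating through a vector via a pointer looks like. This is a simplified sketch and not my actual GaussianModeler: the int element type, the member names and the accessors are assumptions for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Simplified, hypothetical stand-in for the GaussianModeler class:
// one instance models ONE color channel (red, green or blue).
class GaussianModeler {
public:
    // The constructor only receives a pointer to the vector of channel values.
    explicit GaussianModeler(const std::vector<int>* values)
        : mean_(0.0), variance_(0.0) {
        assert(values != NULL && !values->empty());

        // With a pointer, begin() and end() are reached through ->,
        // and the iterator is dereferenced with * to get each value.
        double sum = 0.0;
        for (std::vector<int>::const_iterator it = values->begin();
             it != values->end(); ++it) {
            sum += *it;
        }
        mean_ = sum / values->size();

        double squared = 0.0;
        for (std::vector<int>::const_iterator it = values->begin();
             it != values->end(); ++it) {
            const double d = *it - mean_;
            squared += d * d;
        }
        variance_ = squared / values->size();
    }

    double mean() const { return mean_; }
    double variance() const { return variance_; }
    double stdDev() const { return std::sqrt(variance_); }

private:
    double mean_;
    double variance_;
};
```

One modeler would be constructed per channel, for example GaussianModeler red(&redValues); where redValues is a std::vector<int> of the red values collected from the images.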

Right now, I am stuck trying to accommodate the numbers that arise in the computations my program performs. I need data types that can hold some 20-30 digits, but an unsigned long int can only hold values from 0 to 4,294,967,295. When I asked Haseeb, he pointed me to a few arbitrary-precision arithmetic libraries that place no fixed limit on the number of digits.
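
A quick way to check how many decimal digits a built-in type can safely hold is std::numeric_limits. The sketch below is only an illustration, and the helper names fitsInUnsignedLong and decimalDigits are made up for it.

```cpp
#include <cassert>
#include <limits>

// True if EVERY number with this many decimal digits fits in an unsigned long.
// digits10 is 9 for a 32-bit unsigned long and 19 for a 64-bit one.
bool fitsInUnsignedLong(int decimalDigitCount) {
    assert(decimalDigitCount > 0);
    return decimalDigitCount <= std::numeric_limits<unsigned long>::digits10;
}

// Number of decimal digits needed to write a value.
int decimalDigits(unsigned long value) {
    int count = 1;
    while (value >= 10) {
        value /= 10;
        ++count;
    }
    return count;
}
```

The 32-bit maximum of 4,294,967,295 has only 10 digits, and even a 64-bit unsigned long cannot hold every 20-digit number, so values of 20-30 digits need a different representation.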