Translation, rotation, and scaling (TRS) are the simplest transformations of spatial coordinates. TRS, sometimes called a similarity transform, can be described as
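$$
\begin{aligned}
u &= a_0 + a_1 x - b_1 y, \\
v &= b_0 + b_1 x + a_1 y,
\end{aligned}
$$

or, equivalently, with $a_1 = s\cos\alpha$ and $b_1 = s\sin\alpha$,

$$
\begin{aligned}
u &= a_0 + s\,x\cos\alpha - s\,y\sin\alpha, \\
v &= b_0 + s\,x\sin\alpha + s\,y\cos\alpha,
\end{aligned}
$$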
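where $(a_0, b_0)^T$ is a translation vector, $s$ is a positive uniform scaling factor, and $\alpha$ is the angle of rotation. Complex moments are suitable for creating TRS invariants. The normalized central complex moment of order $(p+q)$ of the image $f$ is defined as

$$
c_{pq} = \frac{1}{m_{00}^{w}} \iint (\hat{x} + i\hat{y})^{p} (\hat{x} - i\hat{y})^{q} f(x,y)\,dx\,dy,
$$

where $\hat{x} = x - x_t$ and $\hat{y} = y - y_t$ are coordinates translated so that the centroid $(x_t, y_t)$ of the image lies at the origin, $w = (p+q)/2 + 1$ provides the correct normalization to scaling, $i$ is the imaginary unit, and $x_t = m_{10}/m_{00}$, $y_t = m_{01}/m_{00}$. Here $m_{00}$, $m_{10}$, and $m_{01}$ are the geometric moments

$$
m_{pq} = \iint x^{p} y^{q} f(x,y)\,dx\,dy.
$$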
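As a minimal sketch, the normalized central complex moment can be approximated for a digital image by replacing the integrals with sums over pixels. This assumes the usual definition $c_{pq} = m_{00}^{-w} \iint (\hat{x} + i\hat{y})^p (\hat{x} - i\hat{y})^q f\,dx\,dy$ and uses integer pixel indices as coordinates. The function name `complex_moment` is hypothetical and chosen only for illustration.

```python
import numpy as np

def complex_moment(img, p, q):
    """Discrete approximation of the normalized central complex moment c_pq.

    Assumption: pixel (row, col) indices serve as the (y, x) coordinates
    and the integrals are replaced by plain sums over all pixels.
    """
    img = np.asarray(img, dtype=float)
    y, x = np.mgrid[:img.shape[0], :img.shape[1]].astype(float)

    m00 = img.sum()                    # geometric moment m00
    xt = (x * img).sum() / m00         # centroid: m10 / m00
    yt = (y * img).sum() / m00         # centroid: m01 / m00

    z = (x - xt) + 1j * (y - yt)       # centered complex coordinate
    w = (p + q) / 2 + 1                # scaling normalization exponent
    return (z**p * np.conj(z)**q * img).sum() / m00**w

# Illustration on a random test image (not data from the experiment).
rng = np.random.default_rng(0)
image = rng.random((7, 5))
rotated = np.rot90(image)              # exact rotation by 90 degrees

before = complex_moment(image, 2, 0) * complex_moment(image, 0, 2)
after = complex_moment(rotated, 2, 0) * complex_moment(rotated, 0, 2)
print(abs(before - after))             # difference is at rounding-error level
```

By construction $c_{00} = 1$ and $c_{10} = 0$ for any image, so neither carries shape information. The product $c_{20}c_{02}$ keeps its value under the 90-degree rotation because the phase factors of the two moments cancel.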
The TRS invariants are then constructed as suitable products of these normalized moments.
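One standard construction, given here as a sketch under assumptions rather than as the exact formula intended above, follows from how $c_{pq}$ changes under the transform. Translation and scaling are already removed by the centering and by the factor $m_{00}^{w}$, so only rotation by $\alpha$ remains. It multiplies each moment by a phase factor, and any product of powers in which these phases cancel is invariant. The indices $k_j$, $p_j$, $q_j$ (non-negative integers) and the symbol $I$ are notation introduced for this illustration.

```latex
\begin{align*}
c'_{pq} &= e^{i(p-q)\alpha}\, c_{pq}, \\
I &= \prod_{j=1}^{n} c_{p_j q_j}^{\,k_j}, \qquad \sum_{j=1}^{n} k_j\,(p_j - q_j) = 0, \\
I' &= e^{\,i\alpha \sum_{j} k_j (p_j - q_j)}\, I = I .
\end{align*}
```

For example, $c_{40}c_{04}$ has $k_1 = k_2 = 1$ and $(4-0) + (0-4) = 0$. For a real-valued image $c_{04}$ is the complex conjugate of $c_{40}$, so this product equals $|c_{40}|^2$. The sign of the exponent in the first line depends on the orientation convention and does not affect the result.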
We should use only complete and independent sets of invariants, forming a so-called basis. Special attention must be paid to the recognition of symmetric objects, for which many complex moments are zero.
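The effect of symmetry can be checked numerically. The sketch below relies on a known property that is not stated in the text above: for a shape with $N$-fold rotational symmetry, $c_{pq}$ vanishes unless $p - q$ is a multiple of $N$. The helper `complex_moment` is a hypothetical discrete implementation (integrals replaced by pixel sums), and the filled square is a synthetic test shape, not one of the toy objects.

```python
import numpy as np

def complex_moment(img, p, q):
    """Discrete normalized central complex moment c_pq (illustrative helper)."""
    img = np.asarray(img, dtype=float)
    y, x = np.mgrid[:img.shape[0], :img.shape[1]].astype(float)
    m00 = img.sum()
    z = (x - (x * img).sum() / m00) + 1j * (y - (y * img).sum() / m00)
    return (z**p * np.conj(z)**q * img).sum() / m00 ** ((p + q) / 2 + 1)

# A filled square has 4-fold rotational symmetry (N = 4).
square = np.zeros((16, 16))
square[4:12, 4:12] = 1.0

# Only moments with (p - q) divisible by 4 can be nonzero.
for p, q in [(2, 0), (3, 0), (2, 1), (4, 0)]:
    value = abs(complex_moment(square, p, q))
    print(f"|c{p}{q}| = {value:.6f}")
```

Here $c_{20}$, $c_{30}$, and $c_{21}$ vanish up to rounding error, while $c_{40}$ does not, which is why invariants built from low-order moments carry no information for such shapes.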
To illustrate this, we carried out the following experiment. We used a popular baby toy that is also commonly used for testing computer vision algorithms and robotic systems. The toy consists of a hollow sphere with twelve holes and twelve objects of various shapes. Each object matches one particular hole. The baby (or the algorithm) is supposed to assign the objects to the corresponding holes and insert them into the sphere. The baby can use both color and shape information; in our experiment, however, we completely disregarded the colors to make the task more difficult. First, we binarized the pictures of the holes (one picture per hole) by simple thresholding. Binarization was the only pre-processing; no sphere-to-plane corrections were applied.
Since all the objects have some type of symmetry, we did not use the low-order invariants, which would be zero or almost zero. Instead, we used the following three invariants: $c_{60}c_{06}$, $c_{50}c_{05}$, and $c_{40}c_{04}$. They served as the features of a minimum-distance classifier. We used a weighted Euclidean distance, with the weights set to normalize the dynamic range of the invariants. The invariants of the holes were used as representatives of the classes. Then we took five pictures of each object with random rotations, binarized them, and ran the classification. This task is not as easy as it might appear, because the holes are slightly larger than the objects, and this relation is morphological rather than linear, so it does not preserve the shapes exactly. Nevertheless, all 60 unknown objects were recognized correctly and assigned to the proper holes.
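The classification step can be sketched as follows. The exact weights used in the experiment are not given, so this example assumes one plausible choice: each feature is divided by its range over the class representatives before the Euclidean distance is taken. The function name `classify_min_distance` and the numeric values are hypothetical and serve only as illustration.

```python
import numpy as np

def classify_min_distance(features, representatives):
    """Assign each feature vector to the nearest class representative.

    Assumption: the weighting normalizes the dynamic range by dividing
    every feature by its range (max - min) over the representatives.
    """
    features = np.atleast_2d(np.asarray(features, dtype=float))
    reps = np.asarray(representatives, dtype=float)

    span = reps.max(axis=0) - reps.min(axis=0)
    span[span == 0] = 1.0                       # avoid division by zero

    # Shape (n_samples, n_classes, n_features) via broadcasting.
    diff = (features[:, None, :] - reps[None, :, :]) / span
    dist = np.sqrt((diff**2).sum(axis=2))       # weighted Euclidean distance
    return dist.argmin(axis=1)                  # index of the nearest class

# Two made-up classes whose features differ greatly in magnitude.
reps = np.array([[0.0, 0.0],
                 [1.0, 1000.0]])
sample = np.array([[0.9, 400.0]])
labels = classify_min_distance(sample, reps)
print(labels)
```

Without the weighting, the second feature would dominate the distance and the sample would be assigned to the first class; with the range normalization both features contribute comparably and the second class is chosen.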
The toy set and the feature space are shown in the following figures.
Relevant publications by other authors: