I have been working on a fun project for a while: I am building a robot that makes a jigsaw puzzle. Earlier, I wrote a blog post about the overall idea and one about the first step (detecting a jigsaw piece from a picture).  Starting from a photo of a jigsaw piece (see below), I now have a boundary (in green) around the piece.


Initially, I expected that I would be able to start matching parts of boundaries to see what sides of what pieces fit together. However, working with the boundary was very tedious. The boundary is an array of pixels, and I don’t know exactly which stretch of pixels correspond to which side of the piece. As a result, I kept having to shift the boundary when comparing it to other boundaries, and that became a bit annoying. So, plan B: let’s find out which pixels belong to which part of the boundary

The algorithm

To illustrate what I am looking for, I guessed the exact pixels of each boundary and colored the four sides of the piece I showed earlier. The top (T) boundary is blue, the right (R) boundary green, the bottom (B) boundary red, and the left boundary (L) cyan.


Prior to starting I had several rough ideas that I could work with:

  1. For each pixel on the boundary find the closest edge of the image, and assign the appropriate label (T,R,B,L). This won’t work near the corners of the image.
  2. Look for sharp turns in the boundary in a relatively small section of it. This indicates either a corner or one of the turns near the ‘Male’ protrusions and ‘Female’ inlets. Problem: my boundary is not very accurate, especially near the corners (they get rounded a bit).
  3. Perhaps I can use the concept of ‘normal vectors’ to find which direction a section of the boundary is facing. This is problematic for the inlets and protrusions.
  4. Combine 1 and 3?
  5. Calculate the distance to the center of the piece for each pixel, perhaps plot these in a graph, and find the peaks. The areas between peaks should indicate where TRBL are.

Out of these, the last one seemed the most likely to succeed. The first thing I did is think about what the ‘center’ would be. Is it the center of the bounding box? Is it the center of gravity of the boundary? The center of the largest inscribed rectangle? To keep things simple I used the center of the bounding box, i.e., the average x- and y-coordinate of the points on the boundary. Next I transformed the boundary to polar coordinates (more on this later) and plotted the ‘rho’-coordinate (the distance to the center) and marked out where the boundaries start:


The boundaries indeed start at certain peaks and it is also possible to distinguish the inlets (deep dips) and the protrusions (‘W’-shapes). Perhaps you can imagine that a straight edge of a piece is also a dip, but much shallower than an inlet. Looking at the figure above, the distance-based approach to finding TRBL seems promising, and all I need to do is separate the male protrusions (large peaks in the figure) from the corners of the piece (smaller peaks in the figure). I’m a bit reluctant to separate them based on distance from the center, because I may have another jigsaw piece where the distance to the tip of the male protrusion is similar (or even smaller) than the distance to one of the corners. However, if I assume that the picture of the jigsaw piece is not rotated too much, then the corners of the piece will be closer to the corners of the picture than the male protrusions are.

Before trying the approach outlined above, let’s summarize it:

  1. Take the boundary, find the center of its bounding box, and substract the center from each point on the boundary.
  2. Transform the points on the (centered) boundary to polar coordinates.
  3. Find all the peaks by inspecting the derivative.
  4. Each peak corresponds to a point on the boundary. Then, for each corner of the picture, find the peak/point that is closest to it.
  5. This yields four points on the boundary that mark the separation of TRBL and these points are thus the corners of the jigsaw piece.
  6. Bonus: I can also find the type of boundary from the polar coordinates: (M)ale protrusion, (F)emale, (S)traight (and yes, I am assuming that those are the only possibilities).
  7. Rotate the boundary so that the first element on the boundary is the left-corner of the top boundary (remember how I started out aiming for this).

Test on six pieces

I’m going to try this procedure on several jigsaw pieces. The first two pieces are the ones we saw in the previous blog post. Pieces 3-5 are new piece from the same puzzle, and piece 6 is constructed by me to form a shape that I want to deal with but that is not available in my puzzle.


Detecting the boundary as in the previous post yields:


And applying the algorithm to detect the four sides gives:


The figure above demonstrates that algorithm does indeed find the sides of the jigsaw piece, and also identifies the correct class (FMS).

Thoughts about the center

I used the mean x- and y-coordinate as the center for my distance-based approach and it seems to work sufficiently well. However, it may turn out that the resulting corners of the piece are not accurate enough. As an example, look at piece 6 in the previous figure. The two corners of the bottom (red) boundary seem to include a bit too much of the rounded corner of the piece. This is caused by my choice of center – in piece 6 it is bit low because there is a straight edge on top and a male protrusion on the bottom. This in turn makes the red boundary just a bit longer than I would like. I didn’t pay too much further attention to this, because (as I mentioned in the previous post) the boundary detection algorithm is inaccurate near the corners of the piece, and the rounded corners should in fact be sharp. So I’ll wait with fixing stuff until I run into problems.

Thoughts about polar coordinates

Initially, before writing this post, I intended to start matching boundaries together. I wanted to used polar coordinates for that, since minor rotations of the piece on the picture could be resolved by adding or substracting from the theta-coordinate. Similarly, I could handle small differences in scale by scaling the rho-coordinate. When I got stuck and decided to look for the TRBL first, I kept the polar coordinate out of lazyness, but I could have just used Euclidian distance directly.

Also, I didn’t show the rho polar coordinate of each piece above, because I was too lazy to manually guess the corners (for coloring it). Now that I have run the algorithm, I can show it with the colors:


Note that the boundaries have been shifted so that the first index is the start of the top (blue) boundary. Furthermore, pieces 3-6 show shallow dips that correspond to the straight edges.

Thoughts about the process so far

I realize that I am making assumptions during the process, that there is not much testing yet, and that I do not have a complete view of the final solution. Many things will still go wrong, and there likely are examples where my ideas so far will fail. Also, I wouldn’t be surprised if I find simplifications to previous steps in the future. I like approaching algorithmic challenges in a quick, dirty, and approximate manner. This teaches me a lot about the problem, and avoids that I get overwhelmed with the scale and complexity of the problem. Also, I would be perfectly happy if I end up with a robot that can make one particular non-trivial puzzle, even if it fails on other puzzles.

Closing thoughts

So that’s another complication fixed, without making any too restrictive assumptions. The next step is to start matching parts of boundaries together so that I can say how well two edges of two pieces fit together. Supercool! It will take me some time though, because I expect that I will also have to revisit some parts of the process so far.