Thursday, August 25, 2011

XI

Let's be sonically active. This activity challenges us to combine all the image processing tricks we have learned to convert an image of sheet music into actual sound.

Figure 1. First 2 measures of "London Bridge Is Falling Down"

Fig. 1 shows the sheet excerpt used for this activity. These two bars contain four elements: the G-clef, the time signature, quarter notes, and quarter rests. We need not concern ourselves with the first two, since I will only be extracting a simple monophonic sound and not a full-blown rendition complete with timbre and accent. To distinguish the quarter notes from the quarter rests, I used template matching by correlation, which we tackled in Activity 6.

Figure 2. Thresholded image after correlation with a quarter note image

Figure 3. Thresholded image after correlation with a quarter rest image


Figure 4. Combined image of the quarter notes and rests positions

After the correlation, I thresholded the resulting images so that only the brightest spots remain. This means I am selecting the regions most correlated with my pattern image. Figs. 2 and 3 show the results for the quarter notes and rests, and Fig. 4 shows the combined image of note and rest positions. The sequence in Fig. 4 is reversed, which means I have to rotate it by 180 degrees to align it with Fig. 1. This is due to the fftshift(), a fact that is taken into account in all the sorting done in this work.
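As a concrete illustration, the correlate-then-threshold step can be sketched in Python/NumPy (the original work was presumably done in Scilab; the function name and the 0.9 relative threshold below are my own assumptions, not values from the original code):

```python
import numpy as np

def match_template(image, template, rel_thresh=0.9):
    """FFT-based correlation of an image with a template,
    keeping only the brightest peaks."""
    # Zero-pad the template to the image size
    pad = np.zeros(image.shape)
    pad[:template.shape[0], :template.shape[1]] = template
    # Cross-correlation theorem: corr = IFFT( F(image) * conj(F(template)) )
    corr = np.real(np.fft.ifft2(np.fft.fft2(image) * np.conj(np.fft.fft2(pad))))
    # fftshift() recenters the correlation plane, which is why the detected
    # positions come out displaced relative to the sheet
    corr = np.fft.fftshift(corr)
    # Threshold: keep only spots close to the global maximum
    return corr > rel_thresh * corr.max()
```

The coordinates of the surviving bright spots (e.g. via argwhere on the mask) then give the note and rest positions.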

From this, I properly identified the coordinates of the spots that correspond to quarter notes and quarter rests. This showcases the ability of the code to distinguish the entities found in the music sheet. In assigning the specific notes, I loop over the various x ranges (heights in the image), because the vertical position identifies the pitch of each quarter note. For the rests, the height range is not important since they carry no frequency value, only a time value: a rest depends only on its left-right placement, unlike a note, which needs both. To simulate a pause, I use a 44 kHz tone, since this is well beyond the range of human hearing. Sequencing the results finally generates the melody:
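A minimal sketch of the sequencing step, assuming a pitch lookup table, a 22.05 kHz sampling rate, and half-second quarter notes (none of these values are from the post); for rests, this sketch substitutes plain silence in place of the inaudible 44 kHz tone:

```python
import numpy as np

FS = 22050       # sampling rate in Hz (assumed)
QUARTER = 0.5    # quarter-note duration in seconds (assumed)

# Hypothetical lookup from identified note names to frequencies in Hz
PITCHES = {"G4": 392.00, "A4": 440.00, "B4": 493.88, "D5": 587.33}

def tone(freq, duration=QUARTER, fs=FS):
    """Pure sine tone; freq = 0 yields silence (used here for rests)."""
    t = np.arange(int(fs * duration)) / fs
    return np.sin(2 * np.pi * freq * t)

def melody(sequence):
    """Concatenate tones for a left-to-right sorted sequence of symbols;
    unknown symbols (e.g. "rest") map to 0 Hz, i.e. silence."""
    return np.concatenate([tone(PITCHES.get(sym, 0.0)) for sym in sequence])
```

The resulting array can then be written out or played back with any audio library.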


It is now easy to improve this code since I have already generalized its identification capabilities. Further work is needed in automating the x-range selection for the note values.

Self-Assessment: 10/10   

Saturday, August 13, 2011

X

Binary operations can be used for size estimation because they do not depend on fine image details. Regions of interest (ROIs) are separated by edge detection methods, and morphological operations are used to enhance the binarized image. This improves the information that can be obtained from the data.

This activity combines various techniques that we have learned before to estimate the area of "cells", enabling us to sort out possible "cancerous cells" via image manipulation.

Figure 1. Image of sample "cells"

Fig. 1 shows our test image, a snapshot of paper cut into circles. The main idea is to binarize it using its histogram and then perform the opening and closing transforms.
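The binarization step can be sketched with an iterative, histogram-driven threshold (a Ridler–Calvard-style scheme; this automation is my own substitution for reading the threshold off the histogram by hand):

```python
import numpy as np

def binarize(gray):
    """Iteratively place the threshold midway between the means of the
    background and foreground pixels; assumes a roughly bimodal image."""
    t = gray.mean()
    for _ in range(50):
        lo, hi = gray[gray <= t], gray[gray > t]
        new_t = (lo.mean() + hi.mean()) / 2
        if abs(new_t - t) < 1e-3:
            break
        t = new_t
    return gray > t
```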

A • B = (A ⊕ B) ⊖ B     (1)

A ∘ B = (A ⊖ B) ⊕ B     (2)

Closing (Eq. 1) is done by dilating matrix A with structuring element B, then eroding the result with B. Reversing the order, erosion first and then dilation, constitutes opening (Eq. 2).
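Eqs. 1 and 2 can be sketched directly with NumPy shifts (a toy stand-in for Scilab's erode()/dilate(), assuming a symmetric structuring element):

```python
import numpy as np

def dilate(img, se):
    """Binary dilation: OR of the image over all shifts in the
    (centered, symmetric) structuring element."""
    cy, cx = se.shape[0] // 2, se.shape[1] // 2
    out = np.zeros(img.shape, dtype=bool)
    for dy, dx in np.argwhere(se):
        out |= np.roll(img, (dy - cy, dx - cx), axis=(0, 1))
    return out

def erode(img, se):
    """Binary erosion: AND over the same shifts (the dual of dilation)."""
    cy, cx = se.shape[0] // 2, se.shape[1] // 2
    out = np.ones(img.shape, dtype=bool)
    for dy, dx in np.argwhere(se):
        out &= np.roll(img, (dy - cy, dx - cx), axis=(0, 1))
    return out

def closing(img, se):   # Eq. 1: dilation first, then erosion
    return erode(dilate(img, se), se)

def opening(img, se):   # Eq. 2: erosion first, then dilation
    return dilate(erode(img, se), se)
```

Opening removes specks smaller than the structuring element, while closing fills comparably small holes.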

Operating these two methods on Fig. 1 yielded Fig. 2:

Figure 2. Result of performing opening and closing on the image in Fig. 1.
Note that the image was divided into 7 segments to obtain different ROIs.


Fig. 2 was generated using a circle (r = 10 px) as the erode()/dilate() pattern. Fig. 2 shows that opening is the more viable operation for our image, since we want to widen the gaps and get a better view of the stacked cells. It also removed some deformed blobs, because erosion is the first operator in opening. Closing, on the other hand, could possibly repair cut segments. To find the average area of the cells, I selected rows 1, 3, and 4 of the opened images, since there the separation between the cells is, for the most part, well defined. Using bwlabel() to separate connected clusters, the averaged value is 520 px per cell.
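A stand-in for the bwlabel() step, assuming 4-connectivity (the helper names below are mine, not from the original code):

```python
import numpy as np

def label(binary):
    """Label 4-connected clusters via flood fill, like Scilab's bwlabel()."""
    labels = np.zeros(binary.shape, dtype=int)
    n = 0
    for y, x in np.argwhere(binary):
        if labels[y, x]:
            continue               # already part of an earlier cluster
        n += 1
        stack = [(y, x)]
        while stack:
            i, j = stack.pop()
            if (0 <= i < binary.shape[0] and 0 <= j < binary.shape[1]
                    and binary[i, j] and not labels[i, j]):
                labels[i, j] = n
                stack += [(i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)]
    return labels, n

def mean_area(binary):
    """Average cluster area in pixels, as used for the px-per-cell estimate."""
    labels, _ = label(binary)
    areas = np.bincount(labels.ravel())[1:]   # drop the background count
    return areas.mean()
```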


Finally, we tackle the second image for this activity. This time, our image has 5 cells that are bigger than the rest. To separate these, we implement this process:

  1. Convert the image to black and white.
  2. Perform the opening transform.
  3. Using bwlabel(), obtain the average cell size.
  4. Zero out the clusters that exceed the average value.
This method flags the oversized "cancerous" cells, which can then be easily identified by comparing against the original image.
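Steps 3 and 4 above can be sketched as follows, given a bwlabel()-style integer label map (the function name and the size criterion relative to the mean are assumptions):

```python
import numpy as np

def flag_oversized(labels, factor=1.0):
    """Zero out clusters whose area exceeds `factor` times the mean cluster
    area; the removed (oversized) blobs are the "cancerous" candidates."""
    areas = np.bincount(labels.ravel())
    areas[0] = 0                                  # ignore the background
    mean = areas[areas > 0].mean()
    big_ids = np.nonzero(areas > factor * mean)[0]
    big = np.isin(labels, big_ids)                # mask of oversized blobs
    kept = np.where(big, 0, labels)               # label map with them removed
    return kept, big
```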

Figure 3. Image (a) after and (b) before the opening operation. The pattern was a circle with a radius of 14 px.


Fig. 3 shows the transformed image. The transformation actually cleaned up the black-and-white converted image, since thresholding had left some remnant white blots.

Figure 4. (a) Before and (b) after filtering the opened image.

Fig. 4 shows the result of the filtering. The colored blobs in Fig. 4a indicate those that were removed, as can be observed in Fig. 4b. The red ones are the supposed "cancerous" cells, while the green ones were removed due to overlapping. The filtering was a partial success: it sorted out all 5 of the target "big, cancerous" cells, but the overlapping blobs were also removed. In the future, edge detection could be used to discriminate individual cells within a collection of overlapping ones.

  Self-Assessment: 9/10