Enhance!

The questions below are due on Monday May 03, 2021; 10:00:00 PM.
 

The check-in for this lab is due on Friday, 30 Apr at 2pm Eastern. The other questions are due with the rest of the p-set on Monday, 3 May at 10pm.

Code

Download the zip archive for this problem here and unzip it so you have a local copy of the files contained therein.

1) Half-Toning

The picture cat_halftone.png in the code distribution is a type of image known as a "half-tone" image: it contains only pure black and pure white tones, but it conveys the illusion of containing many intermediate shades. This technique is commonly used, for example, in newspaper printing to produce greyscale images.

To begin, look closely at the cat_halftone.png image: it consists of a pattern that repeats roughly every 12 pixels in either dimension.

Images of this form are made by adding a doubly-periodic waveform to the input image. In the olden days, this waveform was a transparency mask/screen (e.g., film negative). In the digital world, it is a "screen function" that is applied to the pixels of an input image. After applying the screen function, we threshold these values; values above the threshold become white in the output image, and values below the threshold become black.
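As a toy illustration of this screen-then-threshold pipeline, here is a minimal 1-D sketch (the actual screen in the distribution is 2-D, and the ramp "image," sinusoidal screen, and 12-sample period below are all assumptions made purely for illustration):

```python
import numpy as np

# Hypothetical grayscale "image": a smooth ramp from dark to light.
image = np.linspace(0.0, 1.0, 144)

# A doubly-periodic screen would be 2-D; a 1-D sinusoid with a
# 12-sample period stands in for it here.
n = np.arange(image.size)
screen = 0.25 * np.sin(2 * np.pi * n / 12)

screened = image + screen                              # the "screened" image
halftone = (screened > screened.mean()).astype(float)  # the threshold step

# halftone now contains only 0.0 (black) and 1.0 (white) values,
# but locally its average tracks the brightness of the original ramp.
```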

In this exercise, we'll consider the problem of inverse half-toning: trying to reconstruct the original image from a half-tone pattern. We have included two images in this week's distribution related to this task:

  • cat_screen.png, which is an original input image with a "screen function" applied (but without thresholding), and

  • cat_halftone.png, which was created by thresholding the contents of cat_screen.png to produce a black-and-white image.

    In particular, this image was created with the following code:

    import numpy

    # png_read and png_write are provided in the code distribution.
    C = png_read('cat_screen.png')
    png_write((C > numpy.mean(C)).astype(float), 'cat_halftone.png')
    

We'll start by considering only cat_screen.png. If you view this image, you can clearly see both the underlying image and the screen function. We'll start by trying to invert the screening operation alone, before turning our attention to inverting the overall half-toning process.

1.1) Blur

One strategy for removing the screen function involves blurring the screened image.

In filters.py, we have provided two functions to create filters: circle_filter and gauss_filter. Each of these functions returns a frequency-domain representation (i.e., DFT coefficients) of a filter.

Both of these filters are imported into cat.py, where you should write your code for this lab. To start, write a small piece of code that visualizes the frequency-domain representations of each of these filters. How do the parameters to the functions affect these frequency-domain representations?
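The exact signatures of circle_filter and gauss_filter are in filters.py; as a rough stand-in for building intuition, a Gaussian-shaped frequency response might be constructed like this (the function name, signature, and parameterization below are assumptions, not the course's actual implementation):

```python
import numpy as np

def gauss_like_filter(n, sigma):
    """Hypothetical Gaussian low-pass frequency response on an n-by-n
    grid, with gain 1 at DC (frequency index (0, 0)).  A stand-in for
    the provided gauss_filter, whose actual signature may differ."""
    fx = np.fft.fftfreq(n)            # frequency coordinates, cycles/sample
    FX, FY = np.meshgrid(fx, fx)
    return np.exp(-(FX**2 + FY**2) / (2 * sigma**2))

H = gauss_like_filter(32, 0.1)
# Larger sigma -> wider passband (gentler blur); smaller sigma ->
# narrower passband (stronger blur).
```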

1.1.1) Applying Filters

We can apply these filters to the cat image by:

  • computing the DFT of the cat image,
  • multiplying element-wise by the result of calling circle_filter or gauss_filter, and
  • computing the inverse DFT of the result.
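The three steps above can be sketched directly with numpy's FFT routines. In this sketch, a random array stands in for the cat image (the real code would load it with the provided png_read), and an all-pass filter of ones stands in for the output of circle_filter or gauss_filter:

```python
import numpy as np

# Stand-in for the cat image (assumption: real code would load
# cat_screen.png with the provided png_read).
rng = np.random.default_rng(0)
image = rng.random((32, 32))

# Stand-in frequency-domain filter: all ones (an all-pass filter),
# in place of the array returned by circle_filter / gauss_filter.
filt = np.ones((32, 32))

X = np.fft.fft2(image)         # step 1: DFT of the image
Y = X * filt                   # step 2: element-wise multiply by the filter
result = np.fft.ifft2(Y).real  # step 3: inverse DFT (drop tiny imaginary part)
```

With the all-pass stand-in, the output matches the input; a real low-pass filter would instead attenuate the high-frequency content.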

Try applying each of these filters to cat_screen.png, and save the results. Then experiment to find values of cutoff and sigma that produce the best possible results. You should be prepared to discuss these results during your checkoff.

1.1.2) Spatial-domain Filtering

We could also have applied these filters by convolving with something in the spatial domain, rather than multiplying in the frequency domain. Save images of the spatial-domain kernels we would need to convolve with in order to produce the same results from above. How do the parameters to the filter functions affect the spatial-domain representations? Does this relationship make sense?
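One way to obtain the spatial-domain kernel corresponding to a frequency-domain filter is to take its inverse DFT (and fftshift the result so the kernel is centered for viewing). A minimal sketch, again using an all-pass filter of ones as a stand-in for the provided filters:

```python
import numpy as np

# Stand-in frequency-domain filter (all ones = all-pass); in practice
# this would be the array returned by circle_filter or gauss_filter.
filt = np.ones((32, 32))

# The inverse DFT gives the spatial-domain kernel we would convolve
# with; fftshift moves the kernel's center to the middle of the array.
kernel = np.fft.fftshift(np.fft.ifft2(filt).real)

# For the all-pass case the kernel is a single unit impulse at the
# center: convolving with it leaves the image unchanged.
```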

During the checkoff, be prepared to discuss these shapes (why do they look the way they do?), as well as how they explain the differences between the results of applying the filters as above.

2) A Better Filter

While the strategy of blurring the image did remove a number of the artifacts associated with the screen function, it also blurred sharp edges in the original image (in particular, the cat's whiskers no longer show up as sharp lines!). Now let's try to do better.

Note that the screen is a very regular pattern; we see it as something that repeats roughly every 12 pixels in either dimension. How will this pattern manifest in the frequency domain?
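A useful fact here: a periodic signal concentrates its energy at isolated frequency-domain bins, at multiples of its fundamental frequency. A 1-D sketch (the sinusoidal pattern and length-144 signal are illustrative assumptions, not the actual 2-D screen):

```python
import numpy as np

N = 144                                # signal length divisible by the period
n = np.arange(N)
screen = np.sin(2 * np.pi * n / 12)    # period-12 pattern (1-D analogue)

X = np.fft.fft(screen)
peaks = np.flatnonzero(np.abs(X) > 1e-6)
# Energy concentrates entirely in bins 12 and 144 - 12 = 132: a periodic
# pattern shows up as isolated spikes in the frequency domain, rather
# than being spread across many coefficients like a natural image.
```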

Save an image of the DFT magnitudes. Which parts correspond to the screen pattern (as opposed to the cat)? It may be helpful to plot the DFT magnitudes on a log scale (which can be done by providing color_scale='log' as an argument to the show_dft function).

Use this idea to formulate a plan for removing the screen by operating purely in the frequency domain. You do not need to have implemented your plan before the check-in, but you should be prepared to discuss your plan at the check-in.

3) Check-in

Before you ask for a check-in, be sure you are prepared to:

  • Show the frequency-domain and spatial-domain equivalents of the two filters from the first section (that result from calling circle_filter and gauss_filter), and explain the results of filtering the cats, including differences you noticed between the cats filtered with these two different filters.

  • Show the DFT magnitudes of the cat image, and explain which portions are related to the cat, versus the screen.

  • Describe (in detail) a strategy for removing the screen by working only in the frequency domain.

When you are ready, use the Help Queue to sign up for a check-in.

4) Removing the Screen

Now, by operating in the frequency domain directly, develop and implement a strategy for removing the screen function from the image. It should be possible to produce an image that closely resembles the original image (including the whiskers showing up as sharp lines).

In the box below, upload a single zip file including:

  • the results of applying your new filter to cat_screen.png,
  • the code you wrote to perform this filtering, and
  • a PDF file containing answers to the following questions:
    • Describe your approach for removing the screen.
    • How well did your approach work?
    • What are some benefits/drawbacks, compared to the blurring strategy?
    • Explain any artifacts that remain.

ZIP file with images, code, and description:

5) Inverse Half-toning

Now try both strategies (blurring, and your strategy for removing the screen specifically) on cat_halftone.png and save the results.

You'll probably notice that your method does not work as well on the half-toned image. Can you make an argument (in terms of the DFT coefficients) as to why your strategy does not work as well on cat_halftone.png, versus cat_screen.png?

Explain any artifacts that remain (as it turns out, inverse half-toning is still a decently active area of research, so unfortunately we won't be able to solve it in the course of one p-set!).

Upload a second ZIP file, including the results of applying your method to the half-toned cat, as well as answers to the questions above (why does it not work as well as it did with the plain screen function? how can you explain the artifacts that remain?).

ZIP file with images and description: