CARNEGIE MELLON UNIVERSITY

15-826/10-603 - Multimedia databases and data mining

Spring 2002

Homework 2 - Due: March 14, in class.

Q1: Fractals - Koch curve [40 pts]

The Koch curve is generated recursively, as shown in the Figure below. Write a program that, at each iteration, will generate the points shown as black circles. (The white circles correspond to points of the previous iteration.
  1. [20pts] Write a program that generates points on the Koch curve, all up to 6 iterations (roughly 4,000 points). HAND IN your code (any major language is acceptable: C/C++, Perl, Python, Java, etc.).
  2. [10pts] Give their boxcounting plot, using the code at http://www.andrew.cmu.edu/~lw2j/FracDim-20020220.tar.gz or a simplified (and less capable) version of it at http://www-2.cs.cmu.edu/~christos/courses/826.S02/HW2/fd.tar.gz . If the code is hard to use, (a) try some other code from the web or (b) try to write your own or (c) try a smaller sample, and compute the O(N**2) correlation integral.
  3. [5pts] What is the slope (fractal dimension)?
  4. [5pts] Using the same code, estimate the 'Hausdorff' fractal dimension, and give the related plot
Koch curve - iteration 0 Iteration 1 Iteration 2

Q2: 'Fat' Fractals [40pts]

We want to generate 'fat fractals', as shown in Figures below: We start with a square of side 1; each side is divided into n equal pieces (n=7 in the Figure), and each side gives 'birth' to an 'island' of side 1/n, above the m-th piece (m=3 there), at distance d. Notice that, at the second iteration, every 1/n piece also gives birth to a little island.
  1. [20pts]Write a program that will read the parameters n, m, d and k (= number of iterations) and it will generate points on the periphery of the k-th iteration of such a fat fractal. Specifically, just print the end-points of each segment of side (1/n)**k. That is, for k=0, just print the four corners of the unit square; for k=1, also print the corners of each new little island, as well as the end-points of each 1/n segment on the original unit square; and so on. HAND IN your code (again, in any major language)
  2. [5pts] Run your program with n=7, m=3, d=1/n and k=4 iterations, and HAND IN the scatterplot of these points.
  3. [10pts] HAND IN the plot of the correlation integral of these points (using the box-counting method of FracDim, or equivalent)
  4. [5pts] GIVE the correlation fractal dimension of these points


Fat fractal – iter.#0 Iteration 1 Recursive rule Iteration 2

Q3: Text - string editing distance [20pts]

The string editing distance code at http://www-2.cs.cmu.edu/~christos/courses/826.S02/HW2/stredit.pl penalizes insertions, deletions and substitutions by 1 unit.
  1. [10pts] Modify it, so that the penalty for vowel-vowel substitution is 0.5, and the consonant-consonant substitution is 0.7. (The vowel-consonant substitution should still have a penalty of '1'). Let the vowels be 'a', 'e', 'i', 'o', 'u'. (If Perl is cumbersome for you, feel free to use your favorite language). HAND IN your code.
  2. Using your modified string editing distance function, plot the correlation integral for the a subset of the UNIX dictionary words at http://www-2.cs.cmu.edu/~christos/courses/826.S02/HW2/wsample
    1. [5pts] HAND IN the plot, and
    2. [5pts] report its slope (using, eg., the slope fitting code from fracDim above).

Christos Faloutsos, 2/3/2002