SECURITY and CRYPTOGRAPHY 15-827                  15 NOV 01
             Lecture #16                                 M.B.
                                                      4615 Wean


Ask Bartosz for the Directory of Musical Tunes. Show how it works. Can one
construct a similar directory for pictures?


Nick Hopper has a chosen-challenge attack for breaking #82.
It uses the fact that from a correct guess for the permutation g and
1-digit responses to each of <a1...an A> and <a1...an B>, one can get a
linear relation between A and B (like B-A = 4). Then from the k-digit
response to <b1...bn A>, one can construct the k-digit response to <b1...bn B>.

Will cracking be harder to do if the last symbol (or two) of the challenge
is chosen at random by the user?
CHALLENGE = JA could be turned by the user into one of
JAB, JAM, JAR, JAW, ...
I'm not saying this fixes #82. I'm asking:
  Q: Does this make the protocol more secure? 
  A: No! 
Suppose the challenge from the PNC Bank is always PNC, and you give them
responses to PNCA, PNCB, etc. The quick-witted teller can use your
responses to build up information on your character-to-digit mapping!

Some PhonOID problems:

How can one restrict the permissible challenges so that
1. a human can easily recognize whether a challenge is permissible or not, and
2. a chosen-challenge attack cannot replace one character in a challenge by
another (and still get a permissible challenge)?

In every language, digits must be pronounced differently to be
distinguishable. Are there classes of words that make for good challenges?
English digits look good. Here are the numbers zero through ten in English,
Spanish, German, and Cantonese:
English:   zero one two three four five six seven eight nine ten
Spanish:   cero uno dos tres cuatro cinco seis siete ocho nueve diez
German:    null eins zwei drei vier fünf sechs sieben acht neun zehn
Cantonese: lihng yat yih saam sei ngh luhk chat baat gau sahp
German is the worst, with its similar-sounding zwei and drei.

How about requiring every challenge to be a palindrome?
This reduces the number of 3-character challenges from 26^3 = 17576 to
26^2 = 676. If you require that the 3 characters are not all the same, this
further reduces the number to 26*25 = 650.
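
The counts above can be checked by brute force. This is only a sketch: it
assumes the challenge alphabet is the 26 lowercase letters, and the function
name palindromes is made up for illustration.

```python
from itertools import product
from string import ascii_lowercase

def palindromes(length, alphabet=ascii_lowercase):
    """Yield every palindrome of the given length over the alphabet."""
    half = (length + 1) // 2   # free characters: first half plus any middle one
    for prefix in product(alphabet, repeat=half):
        left = "".join(prefix)
        # Mirror the first length//2 characters to form the right side.
        yield left + left[:length // 2][::-1]

all_three = list(palindromes(3))
# Drop the palindromes whose 3 characters are all the same (aaa, bbb, ...).
not_constant = [p for p in all_three if len(set(p)) > 1]

print(26 ** 3)            # 17576 unrestricted 3-character challenges
print(len(all_three))     # 676
print(len(not_constant))  # 650
```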

One possibility is to require that challenges be names of the user's
family, friends, and co-workers. But then one must not allow both names of
a similar pair: Manuel and Manuela, Karp and Karpov, Rue and Rudich.
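
A list of names could be screened for such pairs mechanically. The notes do
not say exactly what makes two names too close, so the rule below is a guess
that happens to fit the three examples: flag two names when their common
prefix covers all but at most one character of the shorter name. The
function names and the slack parameter are made up for illustration.

```python
from itertools import combinations

def common_prefix_len(a, b):
    """Number of leading characters that a and b share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def too_similar(a, b, slack=1):
    """Assumed rule: shared prefix covers the shorter name, give or take slack."""
    a, b = a.lower(), b.lower()
    return common_prefix_len(a, b) >= min(len(a), len(b)) - slack

def conflicts(names, slack=1):
    """Return every pair of names that the assumed rule flags."""
    return [(a, b) for a, b in combinations(names, 2) if too_similar(a, b, slack)]

names = ["Manuel", "Manuela", "Karp", "Karpov", "Rue", "Rudich", "Bartosz"]
print(conflicts(names))
# [('Manuel', 'Manuela'), ('Karp', 'Karpov'), ('Rue', 'Rudich')]
```

With slack=0 only the pure prefix pairs (Manuel/Manuela, Karp/Karpov) are
flagged. Which setting is right depends on how responses are computed.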

I can imagine drawing up a list of permissible challenges that many people
would recognize. For example, names of common sports: ball, baseball,
basketball, beachball, football, handball, hardball, pinball, softball,
volleyball, ... badminton, tennis, soccer, swimming, ping pong, skating,
shotput, javelin throw, ...  
Unfortunately, handball and hardball differ in just one character. At the
least, a cracker can learn whether n and r map to the same digit.
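
A candidate list can be screened for this problem before it is adopted. The
sketch below reports every pair of equal-length words that differ in exactly
one position, together with the two characters involved. The function names
are made up; the word list is the "ball" part of the list above.

```python
from itertools import combinations

def one_char_apart(a, b):
    """Return (x, y) if a and b differ in exactly one position, else None."""
    if len(a) != len(b):
        return None
    diffs = [(x, y) for x, y in zip(a, b) if x != y]
    return diffs[0] if len(diffs) == 1 else None

def leaky_pairs(words):
    """List (word1, word2, (char1, char2)) for every pair one character apart."""
    out = []
    for a, b in combinations(words, 2):
        d = one_char_apart(a, b)
        if d is not None:
            out.append((a, b, d))
    return out

sports = ["ball", "baseball", "basketball", "beachball", "football",
          "handball", "hardball", "pinball", "softball", "volleyball"]
print(leaky_pairs(sports))
# [('handball', 'hardball', ('n', 'r'))]
```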

Restricting challenges to a class does not mean that the method to create
responses is applicable only to the challenges in that class. The method
could be applicable to challenges in general, but used only for a very
specific class of challenges.


On to CAPTCHAS.

The IMAGE SEARCH problem: Given a slightly distorted fragment of an image,
find the original or a close match in the image data base.
It is likely that image search is a hard AI problem. Solving it would
enable us to write a program for looking up pictures in a directory of
pictures.

Some CAPTCHAS based on the IMAGE SEARCH problem:

How can one transform a picture so that the result is virtually
indistinguishable from the original to a human but hard for a computer to
look up?

This is a fundamental problem for a great many CAPTCHAS, e.g.

Q: What is common to these 5 pictures?   
A: fork
Luis von Ahn suggests a CAPTCHA that picks a "picturable noun" from BASIC
ENGLISH and searches on the web for 5 images indexed by that noun.

Q: What's wrong with this picture?   
A: The car is in the tree.
Take a piece of one picture and blend it into another. Then distort the
whole so that it can't be looked up. Or make a picture transparent and
overlay a small piece of it on another picture before distorting.

Q: WHO is this?   
A: George Washington
There are lots of pictures of well-known people and places.

Q: What point in this picture corresponds to the given point in this other
picture?
A: Point to the point.
Given a picture and a slightly distorted variant of the picture, find
points in the distorted picture that correspond to given points in the original
picture. Or given two distinct views of a picture, find the point in one
that corresponds to a given point in the other. The pictures could be the
two images of a 3-D picture, or they could be two slightly different views
of a face.
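
For the distorted-variant version, the program that makes the test knows the
distortion, so it can grade the answer itself. A minimal sketch, assuming
(for illustration only) that the distortion is a rotation, scaling, and
shift; the notes do not specify the distortion, and a real test would
presumably need something less regular. All names and the 5-pixel tolerance
are made up.

```python
import math

def make_affine(angle_deg, scale, dx, dy):
    """Return a function mapping a point in the original to the distorted picture."""
    t = math.radians(angle_deg)
    c, s = scale * math.cos(t), scale * math.sin(t)

    def warp(point):
        x, y = point
        # Rotate and scale about the origin, then shift by (dx, dy).
        return (c * x - s * y + dx, s * x + c * y + dy)

    return warp

def accept_click(warp, given_point, click, tolerance=5.0):
    """Accept a click within `tolerance` pixels of where the given point lands."""
    tx, ty = warp(given_point)
    return math.hypot(click[0] - tx, click[1] - ty) <= tolerance

warp = make_affine(angle_deg=90, scale=1.0, dx=10, dy=0)
print(accept_click(warp, (1, 0), (10, 1)))   # True: (1, 0) lands at about (10, 1)
print(accept_click(warp, (1, 0), (30, 1)))   # False: 20 pixels off
```

The two-views version (two images of a 3-D picture, or two views of a face)
is not covered by this sketch: there the correct answer has to come from
knowledge of the scene rather than from a known 2-D transform.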

The fundamental AI problem on which this is based is the problem of finding
a picture in a data base.
It would be valuable to be able to find an image in a data base, given some
slightly distorted portion of the picture. You take a photograph of a
picture to a museum or library and ask: Who was the artist? How can I find
more such pictures? Right now, the museum can do no better than have you
show the picture to a curator -- this according to Bill Stein, vice
president of the Carnegie Art Museum. (And the curator may or may not have
the answer you want.) 

Is the image search problem hard? If not, how would you solve it?

HOMEWORK #6:

How easy or hard is it to break PhonOID #82
given that challenges are required to be palindromes
           <a1 a2 ...   a2 a1>
in which a1 a2 ... are pairwise distinct?
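
As a starting point (not a solution), here is a sketch of the permissibility
check and of the size of the restricted challenge space. It assumes a
26-letter alphabet and accepts both odd-length (a b c b a) and even-length
(a b b a) palindromes, since the notes do not say which is meant. The
function names are made up.

```python
import math

def is_permissible(challenge):
    """True if challenge is a palindrome whose first half has no repeated character."""
    if not challenge or challenge != challenge[::-1]:
        return False
    half = challenge[:(len(challenge) + 1) // 2]   # a1 a2 ... up to the middle
    return len(set(half)) == len(half)

def count_permissible(length, alphabet_size=26):
    """Number of permissible challenges of the given length."""
    m = (length + 1) // 2          # number of freely chosen, distinct characters
    return math.perm(alphabet_size, m)

print(is_permissible("abcba"))   # True
print(is_permissible("abaaba"))  # False: a repeats within the first half
print(count_permissible(3))      # 650, matching 26*25 above
print(count_permissible(5))      # 15600
```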