SECURITY and CRYPTOGRAPHY 15-827          29 NOV 01
Lecture #19          M.B.          4615 Wean

Matthew points out that the hard AI problem on which the following CAPTCHA is based is possibly (pretty much) solved by the folks who integrate multiple camera views:

Q: Find the correspondence between labelled points in this picture and labelled points in this other picture.
A: A-1. B-5. C-4. D-2. E-3.

Given a picture and a slightly distorted variant of the picture, find the points in the distorted picture that correspond to given points in the original picture. Or, given two distinct views of a picture, find the point in one that corresponds to a given point in the other. The pictures could be the two images of a 3-D picture, or they could be two slightly different views of a face.

CONJECTURE: An English-text-only CAPTCHA is (as of 2001) impossible.

The conjecture can and will be proved (below) under certain strong assumptions. What is needed is to weaken the assumptions sufficiently to convince ourselves that the conjecture is true, or else to find a way to construct an English-text CAPTCHA.

THE MODEL:
A CAPTCHA is a randomizing algorithm with access to GOOGLE that carries on a conversation with an opponent -- a human, or a bot that has access to GOOGLE. The conversation proceeds in stages, starting with stage 1. In each stage, the CAPTCHA presents a CHALLENGE and the opponent gives a RESPONSE. At that point, the CAPTCHA either ACCEPTS (as human), REJECTS (as bot), or continues on to the next stage by giving a new challenge. The CAPTCHA is required to decide in a small number of stages (e.g. at most 10).

MORE DETAILS ON THE MODEL:
The CAPTCHA initially sets stage k:=1. In STAGE k:
*The CAPTCHA generates a random # and then uses it [if k>1, it also uses its history(k-1) of conversation and its state(k-1) -- EXCLUSIVE of previously generated random #s -- if any] to generate public challenge(k) and private state(k).
*It awaits/gets public response(k).
*Then it uses its current
     history(k) = challenge(1) response(1)
                       :           :
                  challenge(k) response(k)
 and current state(k) to evaluate: ACCEPT, REJECT, or continue.

We prove impossibility of an English-text-only CAPTCHA based on the following:

ASSUMPTIONS:
1. Random numbers are used, if at all, only to create a public challenge (a continuation of the conversation) and some small amount of private state information. The CAPTCHA never uses its random numbers, if any, to decide between ACCEPT, REJECT, or continue (the conversation).
2. At the end of stage k, the decision to ACCEPT, REJECT, or continue is completely determined by history(k). The only purpose of state(k) is to help the CAPTCHA decide whether to ACCEPT, REJECT, or continue *efficiently*.
3. If, at the end of a stage, the CAPTCHA neither ACCEPTS nor REJECTS (it believes that the opponent could still be human), then the CAPTCHA is guaranteed to present a challenge (a continuation of the conversation, possibly dependent on random numbers) that has a non-rejectable (i.e. conceivably human) response.
4. If there is any response that the CAPTCHA does not reject, then a random response has a nontrivial probability (greater than 1%, say) of not being rejected.
5. Given history(k-1) and challenge(k), it is efficiently possible for a bot to find a random # and a state(k-1) that cause its own private virtual copy of the CAPTCHA to generate the same challenge(k), and therefore to evaluate response(k).

PROOF: ........

QUESTION: How does the OCR-based CAPTCHA circumvent this Theorem?
The CAPTCHA chooses one of a very large set of random numbers to select the challenge (a distorted image) and the state information (the actual word). The opponent cannot guess the random number (efficiently). It therefore cannot simulate the generation of (a copy of) the actual challenge.
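
A toy illustration of the role of Assumption 5, and of why a very large set of random numbers defeats it, might look like the sketch below. It is not part of the lecture and is not the elided proof. The names make_challenge and recover_state, the use of SHA-256 as a mixing function, and the 12-bit and 128-bit sizes are all assumptions made for the example. The bot holds its own copy of the generator and searches for a random # that reproduces the public challenge; if it finds one, it also learns the private state.

```python
import hashlib
from typing import Optional, Tuple


def make_challenge(random_number: int) -> Tuple[str, str]:
    """Toy stand-in for stage 1 of a CAPTCHA (hypothetical, not from the lecture).

    Derives a public challenge and a private state from the random number.
    SHA-256 is used only as a convenient deterministic mixing function.
    """
    digest = hashlib.sha256(str(random_number).encode("ascii")).hexdigest()
    challenge = digest[:16]   # public: plays the role of the distorted image
    state = digest[16:24]     # private: plays the role of the actual word
    return challenge, state


def recover_state(challenge: str, random_bits: int, budget: int) -> Optional[str]:
    """Bot with its own private copy of make_challenge.

    Tries random numbers 0, 1, 2, ... below 2**random_bits, giving up after
    `budget` attempts. Returns the private state if the challenge is reproduced.
    """
    for candidate in range(min(2 ** random_bits, budget)):
        simulated_challenge, simulated_state = make_challenge(candidate)
        if simulated_challenge == challenge:
            return simulated_state
    return None


if __name__ == "__main__":
    # Small random-number space (12 bits): the search reproduces the challenge.
    challenge, state = make_challenge(1234)
    print(recover_state(challenge, random_bits=12, budget=10_000) == state)

    # Large space (128 bits): the same budget runs out long before a match.
    challenge, state = make_challenge(2 ** 100 + 12345)
    print(recover_state(challenge, random_bits=128, budget=10_000))
```

With the small space the bot's virtual copy regenerates the challenge and so can evaluate responses, which is the situation Assumption 5 describes. With the large space the search budget is exhausted first, which is the situation described for the OCR-based CAPTCHA.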