Email Address Validation Code


This page exists because, in my experience, almost nobody puts any effort into writing or maintaining code to validate email addresses. The problem seems fairly simple: figure out whether a text string is an email address, or something else freakish that needs to be corrected.

With great sorrow, I must report that in all my dealings on the web, far too many web sites get email address validation terribly wrong. The result is that many people with perfectly valid email addresses, and power users who rely on special address variations to filter their mail, must suffer as second-class net citizens.

The purpose of this page is to be a repository of free, working code that people can take and use in their own environments for address parsing and validation, along with information relating to the "field".

Contact me at cache atsymbol cs.cmu.edu to add entries to this page.


Important Reading


Common Problems


Code Snippets

Perl

This code handles the non-obsolete address syntax, without comments and without random (though legal) whitespace. The pattern names (atext, dot-atom-text, qtext, dtext and so on) follow the grammar in RFC 2822. is_valid_local_email_part takes the part before the '@' and returns 1 if it is valid, 0 otherwise. is_valid_email does the same for an entire email address (thanks to Bill Entwistle for pointing out a bug).

    sub is_valid_local_email_part ($) { 
        my ($addr) = @_;
        my $atext = qr/[A-Za-z0-9\!\#\$\%\&\'\*\+\-\/\=\?\^\_\`\{\|\}\~]/;
        my $dot_atom_text = qr/$atext+(\.$atext+)*/;

        my $no_ws_ctl_char = qr/[\x01-\x08\x0b\x0c\x0e-\x1f\x7f]/;
        my $qtext_char = qr/([\x21\x23-\x5b\x5d-\x7e]|$no_ws_ctl_char)/;
        my $text = qr/[\x01-\x09\x0b\x0c\x0e-\x7f]/;
        my $qtext = qr/($qtext_char|\\$text)*/;
        my $quoted_string = qr/"$qtext"/;

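        # \z anchors at the very end of the string; a plain $ would also
        # accept an address followed by a trailing newline.
        if ( $addr =~ /^($dot_atom_text|$quoted_string)\z/ ) { 
                return 1;
        } else {
                return 0;
        }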
    }
    sub is_valid_email ($) { 
        my ($addr) = @_;
        my $atext = qr/[A-Za-z0-9\!\#\$\%\&\'\*\+\-\/\=\?\^\_\`\{\|\}\~]/;
        my $dot_atom_text = qr/$atext+(\.$atext+)*/;

        my $no_ws_ctl_char = qr/[\x01-\x08\x0b\x0c\x0e-\x1f\x7f]/;
        my $qtext_char = qr/([\x21\x23-\x5b\x5d-\x7e]|$no_ws_ctl_char)/;
        my $text = qr/[\x01-\x09\x0b\x0c\x0e-\x7f]/;
        my $qtext = qr/($qtext_char|\\$text)*/;
        my $quoted_string = qr/"$qtext"/;

        my $quotedpair = qr/\\$text/;
        my $dtext = qr/[\x21-\x5a\x5e-\x7e\x01-\x08\x0b\x0c\x0e-\x1f\x7f]/;
        my $dcontent = qr/($dtext|$quotedpair)/;        
        my $domain_literal = qr/\[(${dcontent})*\]/;

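        # \z anchors at the very end of the string; a plain $ would also
        # accept an address followed by a trailing newline.
        if ( $addr =~ /^($dot_atom_text|$quoted_string)\@($dot_atom_text|$domain_literal)\z/ ) { 
                return 1;
        } else {
                return 0;
        }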
    }
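
The two subs above repeat most of their patterns, so here is a rough sketch of how the same grammar might be compiled once and exercised against a few sample addresses. The helper name check_address and the sample addresses are made up for illustration and are not part of the original code. The sketch also assumes you want \A and \z anchors, so that a trailing newline is rejected.

```perl
use strict;
use warnings;

# Same character classes as the subs above (RFC 2822 names), compiled once
# at file scope instead of on every call.
my $atext         = qr/[A-Za-z0-9!#\$%&'*+\-\/=?^_`{|}~]/;
my $dot_atom_text = qr/$atext+(?:\.$atext+)*/;
my $no_ws_ctl     = qr/[\x01-\x08\x0b\x0c\x0e-\x1f\x7f]/;
my $text          = qr/[\x01-\x09\x0b\x0c\x0e-\x7f]/;
my $quoted_pair   = qr/\\$text/;
my $qtext         = qr/[\x21\x23-\x5b\x5d-\x7e]|$no_ws_ctl/;
my $quoted_string = qr/"(?:$qtext|$quoted_pair)*"/;
my $dtext         = qr/[\x21-\x5a\x5e-\x7e]|$no_ws_ctl/;
my $domain_lit    = qr/\[(?:$dtext|$quoted_pair)*\]/;

my $local_part = qr/(?:$dot_atom_text|$quoted_string)/;
my $domain     = qr/(?:$dot_atom_text|$domain_lit)/;

# Hypothetical helper (not from the original page): returns 1 or 0.
sub check_address {
    my ($addr) = @_;
    return 0 unless defined $addr;
    # \A and \z match only at the true start and end of the string.
    return $addr =~ /\A$local_part\@$domain\z/ ? 1 : 0;
}

# Illustrative inputs only.
my @samples = (
    'user@example.com',
    'user+tag@example.com',       # '+' is a legal atext character
    '"john..doe"@example.com',    # consecutive dots are fine inside quotes
    'user@[192.0.2.1]',           # domain literal
    'john..doe@example.com',      # consecutive dots in a bare dot-atom
    'user@@example.com',          # two '@' signs
);

for my $addr (@samples) {
    printf "%-26s %s\n", $addr, check_address($addr) ? 'valid' : 'invalid';
}
```

With these patterns the first four samples should be reported as valid and the last two as invalid. Note that, like the code above, the sketch checks syntax only; it says nothing about whether the domain exists or the mailbox accepts mail.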


Special Thanks