Saturday 11 July 2009

RFC 2822: mom+dad@some.fqdn.is.ok; likewise+mom.is.a.*@is.ok.too

Dear form creators,

Please stop trying to be smart asses by claiming that an address such as mom+dad@some.fqdn.ok is not OK. As a matter of fact, IT IS!


If you want to be shocked, find out that even *@some.other.fqdn.ok is ALSO OK!

And if you really want to be correct and validate addresses against some regexp, the only correct ones are really LOOOOOONG, which should make it clear that your itsy-bitsy regexp, which pretends to match valid email addresses, IS WRONG!


Correct regexps/code that validate email addresses look something like this or like this. So please stop the nonsense.

Reasonable email addresses can and do contain . and +, along with many other characters, in the local part (i.e. the part before the @).

PLEASE GET THIS THROUGH YOUR THICK SKULLS: THE ONLY RELIABLE WAY TO CHECK THE VALIDITY OF AN EMAIL ADDRESS IS TO TRY TO SEND MAIL TO IT.

7 comments:

Anonymous said...

There's actually RFC 5321 (supersedes RFC 2821), which specifies what's valid and what's not:

Local-part = Dot-string / Quoted-string
; MAY be case-sensitive


Dot-string = Atom *("." Atom)

Atom = 1*atext

Quoted-string = DQUOTE *QcontentSMTP DQUOTE

QcontentSMTP = qtextSMTP / quoted-pairSMTP

quoted-pairSMTP = %d92 %d32-126
; i.e., backslash followed by any ASCII
; graphic (including itself) or SPace

qtextSMTP = %d32-33 / %d35-91 / %d93-126
; i.e., within a quoted string, any
; ASCII graphic or space is permitted
; without blackslash-quoting except
; double-quote and the backslash itself.

String = Atom / Quoted-string
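
The grammar quoted above translates almost line by line into a regular expression. Here is a minimal Python sketch of that translation. It assumes atext is the set defined in RFC 5322 (the excerpt does not spell it out), and the name is_valid_local_part is made up for illustration. It checks syntax only and says nothing about whether mail can actually be delivered.

```python
import re

# atext as defined in RFC 5322 (assumption: not shown in the excerpt above)
ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]"

# Dot-string = Atom *("." Atom), with Atom = 1*atext
DOT_STRING = rf"{ATEXT}+(?:\.{ATEXT}+)*"

# qtextSMTP = %d32-33 / %d35-91 / %d93-126
QTEXT = r"[\x20-\x21\x23-\x5b\x5d-\x7e]"

# quoted-pairSMTP = %d92 %d32-126 (backslash, then any printable ASCII or space)
QUOTED_PAIR = r"\\[\x20-\x7e]"

# Quoted-string = DQUOTE *QcontentSMTP DQUOTE
QUOTED_STRING = rf'"(?:{QTEXT}|{QUOTED_PAIR})*"'

# Local-part = Dot-string / Quoted-string
LOCAL_PART = re.compile(rf"(?:{DOT_STRING}|{QUOTED_STRING})")


def is_valid_local_part(local_part: str) -> bool:
    """Hypothetical helper: True if the string matches the Local-part grammar."""
    return LOCAL_PART.fullmatch(local_part) is not None
```

With this sketch, mom+dad, * and "Martin Krafft" (with the quotes) are all accepted, while Martin Krafft without quotes and a..b are rejected.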

glandium said...

Well, it's certainly better than accepting the address with a '+' and pathetically failing to accept it later because of the '+'... because it ends up in the URL arguments or POST data and is treated not as a '+' but as a space. This is not fictional, it happened to me.
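
The failure described above is easy to reproduce. A small Python sketch, using only the standard urllib.parse module; the field name email is an assumption chosen for illustration. Pasting the raw address into a query string turns the '+' into a space on decoding, while percent-encoding it first survives the round trip.

```python
from urllib.parse import parse_qs, urlencode

address = "mom+dad@some.fqdn.ok"

# Naive: the raw address is concatenated into the query string.
naive_query = "email=" + address
naive_result = parse_qs(naive_query)["email"][0]

# Safer: let urlencode percent-encode the value ('+' becomes %2B).
encoded_query = urlencode({"email": address})
encoded_result = parse_qs(encoded_query)["email"][0]

print(naive_result)    # mom dad@some.fqdn.ok
print(encoded_query)   # email=mom%2Bdad%40some.fqdn.ok
print(encoded_result)  # mom+dad@some.fqdn.ok
```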

Anonymous said...

glandium, even a space is valid in the localpart, if it's quoted, e.g. "Martin Krafft"@madduck.net would be a valid address (although the RFC discourages the use of quotes).

Jon Dowland said...

@glandium: yes, partial-acceptance is a pain. BT.com accepted + when I signed up, but not later when I needed to log in for something.

I'm glad, though, that this inability to handle + properly extends to spam harvesting software.

On that note, does anyone else wish @debian.org supported + addressing?

Jon Dowland said...

last anonymous was me.

Anonymous said...

Amen. Every printable ASCII character is permitted in the local part. Yes, that *includes* the ‘@’ character. This is because there are quoting rules in the syntax.

So anything before the *last* ‘@’ symbol in the email address is the local part, and is *none of your business*, form validation weenies. The only way to know if the address is valid is to send a message to it.

Which, web form developer, you'll be doing anyway, right? To validate it actually belongs to me, right? If not, don't bother to ask for it in the first place and save everyone a whole lot of time.
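
Splitting on the *last* ‘@’, as described above, takes one call in Python. A minimal sketch; the function name split_address is hypothetical, and the code does no validation beyond requiring that an ‘@’ is present.

```python
def split_address(address: str) -> tuple[str, str]:
    """Hypothetical helper: split an address at its last '@'.

    Everything before the last '@' is the local part, everything
    after it is the domain. Raises ValueError if there is no '@'.
    """
    local_part, separator, domain = address.rpartition("@")
    if not separator:
        raise ValueError("address contains no '@'")
    return local_part, domain
```

For example, split_address('"a@b"@example.org') returns the local part "a@b" (quotes included) and the domain example.org, which a split on the first ‘@’ would get wrong.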

Anonymous said...

@bignose:

> So anything before the *last* ‘@’ symbol in the email address is the
> local part, and is *none of your business*, form validation weenies.

Sorry, input validation is necessary! At least check the string size, as the maximum length of the local part is 64 characters. (See http://en.wikipedia.org/wiki/E-mail_address#RFC_specification for easy-to-digest information.)
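
A sketch of the kind of minimal check suggested here, in Python. The function name passes_basic_checks is made up for illustration, and measuring the limit in UTF-8 bytes is an assumption. It only rejects input that cannot be an address at all; anything that passes still has to be confirmed by sending mail to it.

```python
MAX_LOCAL_PART_LENGTH = 64  # limit on the local part cited in the comment above


def passes_basic_checks(address: str) -> bool:
    """Hypothetical helper: cheap sanity checks, not full validation."""
    local_part, separator, domain = address.rpartition("@")
    if not separator:            # no '@' at all
        return False
    if not local_part or not domain:
        return False
    # Assumption: the limit is counted in bytes of the UTF-8 encoding.
    return len(local_part.encode("utf-8")) <= MAX_LOCAL_PART_LENGTH
```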