[nycphp-talk] Filtering input to be appended inside email
Mikko Rantalainen
mikko.rantalainen at peda.net
Tue Sep 13 10:19:18 EDT 2005
Daniel Convissor wrote:
> Hi Michael:
> On Mon, Sep 12, 2005 at 12:41:12PM -0400, Michael Southwell wrote:
>>
>>The point is simply to identify which scripts have sent emails to the
>>known-bad addresses; those are the vulnerable ones.
>
> I'm afraid that will lead people into both a false sense of security and
> using email address blacklists. Folks should audit their email scripts,
> period.
I agree. Broken code is broken code. If you aren't sure if your
email script works correctly, take it offline immediately.
>>There were other problems as well, which I noted in my polished
>>version. We need an officially sanctioned version of the function
>>before we can post anything.
>
> Agreed. Here's what I think is a good starting point for discussion...
>
> <?php
> // untested!!!!
> // MUST do isset() checks on all of these first!
> // left out for brevity.
>
> if (eregi('^[a-z0-9_.=+-]+@([a-z0-9-]+\.)+([a-z]{2,6})$', $_POST['address'])) {
> $address = $_POST['address'];
> } else {
> echo 'bad email';
> exit;
That looks pretty simple, but it rejects many valid email
addresses.
I'd rather create two functions like
/**
takes string $input_email and returns RFC 2822 section 3.4
compatible address or empty string if input cannot be
handled.
http://rfc.net/rfc2822.html#s3.4.
*/
function getSafeEmail($input_email) { ... return $safe_email; }
and
/**
takes string $input_header and encodes it as a single header
to be used for mailing.
http://rfc.net/rfc2822.html#s2.2.3.
*/
function getSafeHeader($input_header) { ... return $safe_header; }
and I'd pass all input, such as $_POST["FROM"], through these
functions.
Of these, the first one is much harder to implement correctly. A
simple implementation could accept only a limited addr-spec of the
form
dot-atom "@" dot-atom
where the dot-atom is defined at http://rfc.net/rfc2822.html#s3.2.4.
This is much simpler than the full address specification defined in
http://rfc.net/rfc2822.html#s3.4.
This "simple" format wouldn't accept every valid email address, but
at least it would allow stuff like
mikko.rantalainen+nyphp at peda.net
unlike many complex regexes that are meant to filter email addresses.
A simple, untested implementation would look like
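function getSafeEmail($input_email)
{
    # atext characters, see http://rfc.net/rfc2822.html#s3.2.4.
    # "^" must not be first and "-" must be last inside [...]
    $atext = "a-z0-9!#\$%&'*/=?^_`{|}~+-";
    # filter extra characters off, keeping "." and "@"
    # (PHP's preg functions have no "g" modifier)
    $safe_email = preg_replace("@[^.\@{$atext}]@i", "", $input_email);
    # accept only dot-atom "@" dot-atom, with a dot required in the domain
    if (preg_match("@^[{$atext}]+(\.[{$atext}]+)*\@[{$atext}]+(\.[{$atext}]+)+$@i", $safe_email))
        return $safe_email;
    else
        return ""; # error
}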
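To see how the dot-atom "@" dot-atom rule behaves on concrete input, here
is a minimal validate-only sketch. The helper name isSimpleAddrSpec() and
the example.com addresses are made up for illustration; unlike the function
above, it never rewrites the input and it does not require a dot in the
domain part.

```php
<?php
// Hypothetical helper, not part of the original post: it only validates
// and never modifies the input.
function isSimpleAddrSpec($email)
{
    // atext from RFC 2822 section 3.2.4; "^" is kept away from the first
    // position and "-" is last so that both are literal inside [...]
    $atext = "a-z0-9!#\$%&'*/=?^_`{|}~+-";
    $dot_atom = "[{$atext}]+(?:\\.[{$atext}]+)*";
    // The D modifier makes "$" match only at the very end of the string,
    // so a trailing newline cannot slip through.
    return preg_match("@^{$dot_atom}\\@{$dot_atom}$@iD", $email) === 1;
}

// A "+" in the local part is accepted.
var_dump(isSimpleAddrSpec("mikko.rantalainen+nyphp@example.com"));
// A header injection attempt contains CR, LF, space and ":" and is rejected.
var_dump(isSimpleAddrSpec("user@example.com\r\nBcc: victim@example.com"));
// A string without "@" is rejected.
var_dump(isSimpleAddrSpec("no-at-sign.example.com"));
```

Rejecting bad input outright, rather than filtering it, is a design choice
of this sketch only; it keeps the caller from silently mailing an address
the user never typed.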
For the second function there are two ways to make sure that
$input_header contains exactly one valid header: either remove all
line feeds from the input, or append a space after every line feed,
which turns the whole input into a single header folded across
multiple lines (http://rfc.net/rfc2822.html#s2.2.3.). I'll choose
the latter method for this implementation. Again, this is untested.
function getSafeHeader($input_header)
{
    # split into name and value as defined in
    # http://rfc.net/rfc2822.html#s2.2.
    $parts = explode(":", $input_header, 2);
    if (count($parts) != 2)
        return ""; # no colon, so this is not a header
    list($name, $value) = $parts;
    # verify header name: printable US-ASCII only, no whitespace;
    # D keeps "$" from matching before a trailing newline
    if (!preg_match("@^[".chr(33)."-".chr(126)."]+$@D", $name))
        return "";
    # the value must not contain a bare CRLF:
    # strip out CRs, then turn every LF (plus any whitespace after it)
    # into CRLF + space, which folds the value instead of ending it
    $value = preg_replace("@\r@", "", trim($value));
    $value = preg_replace("@\n\s*@", "\r\n ", $value);
    $safe_header = $name.": ".$value."\r\n";
    return $safe_header;
}
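For comparison, the first method mentioned above (removing all line feeds)
could be sketched roughly like this. The helper name
stripHeaderLineBreaks() and the sample Subject value are made up for
illustration, and it handles only a header value, not a whole
"Name: value" line.

```php
<?php
// Hypothetical helper illustrating the "remove all line feeds" method:
// every CR and LF is dropped, so the value cannot start a new header line.
function stripHeaderLineBreaks($input_value)
{
    return trim(str_replace(array("\r", "\n"), "", $input_value));
}

// An attacker tries to smuggle a Bcc header in through the subject field.
$subject = stripHeaderLineBreaks("Hello\r\nBcc: victim@example.com");
echo "Subject: " . $subject . "\r\n";
// The injected text stays inside the Subject value:
// Subject: HelloBcc: victim@example.com
```

The result is uglier than a folded header, but nothing the user typed can
end the header early.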
Body doesn't need to be handled unless you use HTML mail (shame on
you), in which case all XSS issues are there waiting.
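If you do send HTML mail anyway, one common precaution is to escape every
piece of user input before it goes into the markup. A minimal sketch,
assuming the body is UTF-8; the helper name escapeForHtmlMail() and the
sample input are made up for illustration:

```php
<?php
// Hypothetical helper: escape user-supplied text before placing it in an
// HTML mail body. Assumes the message is sent as UTF-8.
function escapeForHtmlMail($text)
{
    // ENT_QUOTES escapes both double and single quotes.
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

$comment = '<script>alert("x")</script>';
$body = "<p>" . escapeForHtmlMail($comment) . "</p>";
echo $body;
// <p>&lt;script&gt;alert(&quot;x&quot;)&lt;/script&gt;</p>
```

This covers only text placed between tags or inside quoted attribute
values; it does not make user-supplied URLs or markup safe.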
--
Mikko