
dotnet performance : Tradeoffs? requestEncoding & preventing "Canonicalization" attacks


Chris Mohan
3/23/2004 11:46:08 AM
Hi
While flipping through the MS book "Improving Web Application Security: Threats & CounterMeasures" I came across a recommendation for preventing or reducing the threat of canonicalization attacks. The book (offhandedly) suggested that one way to address this threat is to change your web app's default request and response encoding settings in the web.config file. The default setting, defined in the machine.config file, is utf-8, and the book recommended changing it to ISO-8859-1 in the web.config's globalization element, like so:
<globalization requestEncoding="ISO-8859-1" responseEncoding="ISO-8859-1"/>

My question: what are the trade-offs of doing this in an application that uses Western character sets for English plus a little Spanish content?

Specifically, what are the performance and text readability consequences?
How would this affect low-end browsers?

Page of recommendation: 274 from book
More info on canonicalization
http://www.schneier.com/crypto-gram-0007.html#9

Thanks
-Chris
shawnste NO[at]SPAM microsoft.com
3/26/2004 8:49:25 PM
I always recommend using utf-8, although each situation will vary.

ISO-8859-1 (or any other encoding) would have its own security issues. For
example, if the user enters characters that don't exist in that code page,
will they be best-fit to the closest approximate letter, or will they be
replaced with a "?"? In both cases spoofing issues arise: characters are
changed from their intended form to another, and that could cause
problems.
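
To make the two outcomes concrete, here is a minimal Python sketch (not .NET, and not how any particular framework does it). Python's codecs have no best-fit mode, so the second function only imitates one by applying NFKD compatibility decomposition first; that stand-in, the function names, and the sample input are all assumptions for illustration.

```python
import unicodedata

def to_latin1_replace(text: str) -> bytes:
    # Characters with no ISO-8859-1 equivalent become b"?".
    return text.encode("iso-8859-1", errors="replace")

def to_latin1_best_fit(text: str) -> bytes:
    # Assumption: NFKD decomposition is used here as a rough imitation of
    # best-fit mapping. Anything that still does not fit is silently dropped.
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("iso-8859-1", errors="ignore")

# Fullwidth "<" (U+FF1C) and ">" (U+FF1E) are not in ISO-8859-1.
user_input = "\uff1cscript\uff1e"

print(to_latin1_replace(user_input))   # b'?script?'
print(to_latin1_best_fit(user_input))  # b'<script>'
```

Either way the bytes no longer match what the user typed: replacement loses information, and the best-fit imitation turns harmless-looking characters into real angle brackets.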

A good way to prevent this kind of threat is to make sure that you process
your data in a consistent fashion. If you have to check for certain
character patterns, do so in a consistent fashion and use APIs that do what
you intend. If you normalize a string, realize it could change its form.
If you do comparisons, understand what your compare options are ignoring.
If you take substrings you could also remove code points and change the
meaning. (For example, if you have AË as A + E + (dieresis), and remove
the E, then you'll have A + (dieresis) which looks like Ä, which is very
different than the original string).
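
The substring example above can be reproduced in a few lines. This is only a sketch in Python, using U+0308 (combining diaeresis) for the "(dieresis)" and NFC normalization to show the composed result; the variable names are illustrative.

```python
import unicodedata

# A, E, combining diaeresis: displays as "A" followed by "E with diaeresis".
original = "A" + "E" + "\u0308"

# Remove only the E code point; the combining mark now attaches to the A.
trimmed = original[:1] + original[2:]

composed_original = unicodedata.normalize("NFC", original)
composed_trimmed = unicodedata.normalize("NFC", trimmed)

print([hex(ord(c)) for c in composed_original])  # ['0x41', '0xcb']
print([hex(ord(c)) for c in composed_trimmed])   # ['0xc4']
```

Removing one code point produced a different letter (U+00C4), not just a shorter string.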

Accepting data, validating it, then persisting it in a non-Unicode code
page could also cause the data to be different when it is read back in,
due to best-fit mappings or other behaviors of that encoding.
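
A small round-trip sketch of that point, in Python, assuming the store replaces unmappable characters with "?" (a real store might best-fit or reject them instead); `round_trip` is a hypothetical helper, not an existing API.

```python
def round_trip(text: str, encoding: str) -> str:
    # Simulate "persist, then read back" through a given encoding.
    stored = text.encode(encoding, errors="replace")
    return stored.decode(encoding)

validated = "\u0141\u00f3d\u017a"  # a string that already passed validation

utf8_copy = round_trip(validated, "utf-8")         # unchanged
latin1_copy = round_trip(validated, "iso-8859-1")  # two letters become "?"

print(utf8_copy == validated)    # True
print(latin1_copy == validated)  # False
```

The value read back from the ISO-8859-1 store is not the value that was validated, so any earlier check no longer describes it.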

So try normalizing your strings and doing your security checks after doing
any string manipulations (or redoing checks after changing strings).
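
A minimal sketch of "normalize first, check afterwards", in Python. The forbidden-character rule and the function names are hypothetical, and NFKC is just one possible normalization form; the point is only the ordering.

```python
import unicodedata
from typing import Tuple

FORBIDDEN = ("<", ">")  # hypothetical validation rule, for illustration only

def is_safe(text: str) -> bool:
    # True when none of the forbidden characters appear in the text.
    return not any(ch in text for ch in FORBIDDEN)

def normalize_then_check(raw: str) -> Tuple[str, bool]:
    # Normalize first, then run the check on the string that will be used.
    normalized = unicodedata.normalize("NFKC", raw)
    return normalized, is_safe(normalized)

raw = "\uff1cscript\uff1e"  # fullwidth angle brackets

checked_too_early = is_safe(raw)                       # True: check is fooled
normalized, checked_after = normalize_then_check(raw)  # "<script>", False

print(checked_too_early, normalized, checked_after)
```

Checking the raw input passes, yet the normalized string contains exactly the characters the rule was meant to block, which is why the check belongs after the last transformation.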

Most encodings are susceptible to spoofing issues. However, I don't think
that restricting your encoding is a good solution. I would prefer to accept
the complete Unicode range and then take the appropriate actions for the
strings I'm processing.

Hope that helps,

Shawn Steele
Windows International - GIFT
Chris Mohan
3/30/2004 1:46:09 PM
Hi Shawn,

Thank you for taking the time to provide such a thought-filled response. Consistency of validation-- great point. Looks like I need to add another thing to my to-do list: build an input validation helper class so I can pass all user input into one re-usable class.

Hmm... that has a nice ring to it: "Input Validation Application Block", but Google found nothing.

After reading your response I realized that changing an entire app's request and response encodings is a pretty drastic solution. Blocking port 80 would surely reduce any threat of canonicalization attacks too! ;-)

Thanks again
-Chris Mohan