News Item: Querystring obfuscation. My take.
(Category: Misc)
Posted by yiangos
Thursday 27 August 2009 - 02:24:34
At one point, I was required to obfuscate the data in the querystring. Hiding it (i.e. passing data to a page by some other method) was not an option, as the pages needed to be able to stand alone (e.g. give a fully functional link for an offer banner). So I built an obfuscation scheme.
My first attempt was to run the entire querystring through a gzip stream and base64-encode the resulting binary data. However, I found out through trial and error that this method has a lower bound on its output size: the resulting string was never less than 200 characters long. The querystrings I was trying to obfuscate were typically 40-50 characters long (including parameter names, equal signs and ampersands). Moreover, the base64 encoding introduced characters that cannot appear unescaped in a querystring value (equal signs). So I dumped that idea, as it seemed like overkill to obfuscate a 40-character string and come up with a 200-character string as a result. I had to find a different algorithm.
The algorithm I came up with is quite simple and straightforward, yet to my eyes beautiful in its simplicity.
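To illustrate why compression works against such short inputs, here is a minimal sketch in Python. It is not the code from the original attempt: the querystring is made up, and the exact sizes depend on the gzip implementation, so the 200-character floor mentioned above will not be reproduced exactly. The gzip format alone adds a 10-byte header and an 8-byte trailer, and base64 then expands every 3 bytes into 4 characters.

```python
import base64
import gzip

# Hypothetical querystring, roughly the length discussed above.
query = "offer=1234&lang=en&page=7&ref=banner&cat=12"

# Compress, then base64-encode the binary result.
compressed = gzip.compress(query.encode("utf-8"))
encoded = base64.b64encode(compressed).decode("ascii")

# Compare the three lengths: original, gzip bytes, base64 characters.
print(len(query), len(compressed), len(encoded))

# Round trip, to confirm nothing is lost along the way.
assert gzip.decompress(base64.b64decode(encoded)).decode("utf-8") == query
```

For an input this short, the encoded result comes out longer than the original, which is the opposite of what one wants from a compression step.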
First of all, I noted that most (if not all) data passed via GET are either strings or numbers (there's a caveat here, which I'll cover at the end).
Next, there was a decision to make: whether to include parameter names in the final string or not. I opted not to include them, and to use some application-wide structure (e.g. a cache) to store the parameter names instead. Since the obfuscation and parsing of the querystring would occur within the same class, it would be easy to maintain the correspondence between values in the querystring and names in the list structure kept at application level.
So, having made that decision, I started designing the algorithm.
To obfuscate, we distinguish between numbers and strings.
If a parameter value is a number, then convert the value to hex.
If the parameter value is a string, then convert it to a byte array, then convert each byte to its hex representation (2-digit).
Prefix the now-obfuscated value with a letter (other than a-f, which are used in hex) to tell whether the value that follows is a number or a string, e.g. use U for numbers and L for strings.
Finally, concatenate all values (with their distinguishing letter) using a third letter as a delimiter (e.g. M) and using the order imposed by the list that contains the parameter names.
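The obfuscation steps above can be sketched roughly as follows. This is an illustrative Python version, not the original implementation: the parameter names, the `PARAM_NAMES` list (standing in for the application-wide structure) and the function name are hypothetical, and the sketch assumes non-negative integers and UTF-8 strings.

```python
# Hypothetical stand-in for the application-wide list of parameter names.
PARAM_NAMES = ["offer", "lang", "page"]

# Marker and delimiter letters, as suggested in the text (none of them is a hex digit).
NUMBER_MARK, STRING_MARK, DELIMITER = "U", "L", "M"


def obfuscate(params: dict) -> str:
    """Encode the values of params in the order given by PARAM_NAMES."""
    parts = []
    for name in PARAM_NAMES:
        value = params[name]
        if isinstance(value, int):
            # Numbers: plain lowercase hex (assumes value >= 0).
            parts.append(NUMBER_MARK + format(value, "x"))
        else:
            # Strings: bytes (UTF-8 assumed), then two hex digits per byte.
            parts.append(STRING_MARK + str(value).encode("utf-8").hex())
    return DELIMITER.join(parts)


print(obfuscate({"offer": 1234, "lang": "en", "page": 7}))
# prints U4d2ML656eMU7
```

Because the hex digits are emitted in lowercase and the three letters are uppercase non-hex characters, the markers and the delimiter can never be confused with the payload.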
To parse an obfuscated string, we reverse the steps:
First we split on the delimiter letter.
For each value, we check the first letter. If it signifies a number, we convert the rest of the value directly from hex to decimal, and we're done. If it signifies a string, we convert each pair of characters to a number (essentially a byte), and thus populate a byte array. Finally, we decode the byte array back into a string, using the same encoding that created the byte array during obfuscation.
Each value is finally assigned to its corresponding parameter name.
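The parsing steps can be sketched in the same spirit. Again, this is an illustrative Python version under assumptions, not the original code: `PARAM_NAMES` is a hypothetical stand-in for the application-wide list, the letters U, L and M are the examples from the text, and UTF-8 is assumed as the encoding used during obfuscation.

```python
# Hypothetical stand-in for the application-wide list of parameter names.
PARAM_NAMES = ["offer", "lang", "page"]

# Marker and delimiter letters, as suggested in the text.
NUMBER_MARK, STRING_MARK, DELIMITER = "U", "L", "M"


def parse(obfuscated: str) -> dict:
    """Decode an obfuscated string back into a name -> value mapping."""
    values = []
    for part in obfuscated.split(DELIMITER):
        marker, payload = part[:1], part[1:]
        if marker == NUMBER_MARK:
            # Numbers: hex straight back to an integer.
            values.append(int(payload, 16))
        elif marker == STRING_MARK:
            # Strings: each pair of hex digits is one byte; decode with UTF-8 (assumed).
            values.append(bytes.fromhex(payload).decode("utf-8"))
        else:
            raise ValueError(f"unknown type marker: {marker!r}")
    if len(values) != len(PARAM_NAMES):
        raise ValueError("value count does not match the parameter list")
    # Assign each value to its name by position.
    return dict(zip(PARAM_NAMES, values))


print(parse("U4d2ML656eMU7"))
# prints {'offer': 1234, 'lang': 'en', 'page': 7}
```

Rejecting unknown markers and mismatched value counts is a design choice of this sketch; it keeps a tampered or truncated querystring from being silently assigned to the wrong parameters.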
code to come...
This news item is from Midnight blogging
( http://yiannis.vavouranakis.gr/myblog/news.php?extend.43 )