Scalable Polymorphic Hash – an "engineered" compression function
TurboCrypt, our testbed for truly ultimate ciphers, requires the user to enter passphrases. Passphrases should be long, but in practice they are often short, and especially short passwords require the best possible protection. A while ago we were already ahead of the herd when we came up with an ultimate solution for compressing passphrase input: a Polymorphic Hash function.
Scalable Polymorphic Hash cascades standard hash functions. The goal is a hardened design built from conventional primitives. "Overkill", or better, "overprotection", is not far from what is intended: if the threat potential is incalculable, it is sometimes better to rely on something that is heavily overdesigned in size as well as in strength. Tanks are popular in armies all over the world. Such vehicles look daunting, they are generally extensively armored, and it is somehow safer to cruise through hostile areas in one of them than to rely on a lightweight truck that is easily blown away.
Polymorphic Hash is definitely not the fastest compression function. As it is intended to compress passphrases for encrypted disks or file archives, there is absolutely no reason why it should be fast. On the contrary, it deliberately consumes a lot of memory and CPU time, and there are ongoing efforts to further increase code size, memory usage and especially CPU time. Why? It makes no practical difference to a user whether access to encrypted files is granted after 1 microsecond or after 0.5 seconds. For an attacker, however, a fast algorithm is advantageous: it takes only 1 second to try 1 million passwords. Against data protected by a much slower hash function, far fewer password combinations can be tried in the same time, unless the attacker possesses 500,000 computers.
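The trade-off above is easy to put into numbers. The following minimal C++ sketch compares how many guesses an attacker can try per day at the two timings mentioned in the text (1 microsecond vs. 0.5 seconds per hash); `guesses_per_day` is a hypothetical helper for illustration, not part of TurboCrypt:

```cpp
#include <cstdint>

// Back-of-the-envelope attacker throughput: number of password guesses
// possible in one day, given the cost of a single hash evaluation.
// The timings plugged in below (1 us vs. 0.5 s) are the figures from
// the paragraph above.
uint64_t guesses_per_day(double seconds_per_hash) {
    const double seconds_per_day = 86400.0;
    return static_cast<uint64_t>(seconds_per_day / seconds_per_hash);
}
```

At 0.5 s per evaluation an attacker manages only 172,800 guesses per day on one machine, while a microsecond-fast hash allows on the order of 86 billion, which is exactly why a deliberately slow compression function helps short passwords.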
Here's the link to the paper that describes Scalable Polymorphic Hash: roellgen07_scalable_pmc_hash.pdf
Complete C++ source code (Microsoft Visual Studio 6) is available here: http://downloads.turbocrypt.com/source_code/scalable_pmc_hash_vc6.zip
The algorithm uses AES Rijndael, Whirlpool, SHA-1, SHA-256, Tiger 128, Serpent, MD5, Cast256, Ripemd 256 and ARCFOUR as basic building blocks (primitives), although even less secure building blocks would likely perform well.
Why not rely on only one algorithm? Well, let's think back a few years to 40-bit SSL, DES and A5 (which still affects more than a billion GSM mobile phones: it apparently takes only seconds to crack the A5 algorithm). Hash functions tend to fare no better. For MD5 (Rivest, 1991), a collision of the compression function was announced by Dobbertin in 1996. The first attack against SHA-0 (NSA, 1993) was presented in 1998, and an attack against SHA-1 (the improved SHA-0, NSA, 1995) was presented in 2005. As it is not our philosophy to fully rely on functions with a potential lifetime of 10 or 20 years, it is certainly clever to create combined secrecy systems. By doing this we take advantage of known technology, complexity increases, and the result is comparably "ultimate" because the secrecy system itself is not a constant (as it is in all conventional designs)! We simply take advantage of our core competence: creating secrecy systems that are, at least to some extent, compiled at runtime and that can even tolerate moderate deficiencies in the primitives they use.
In Scalable PMC Hash, the well-known primitives Whirlpool, Haval 256 and Ripemd 256 act as worker compression functions. These three algorithms have been extensively tested, cryptanalysis is available, and so there is no need to worry about obvious weaknesses. Whirlpool, Haval 256 and Ripemd 256 are combined to form the hash function h(x) = r[p[h1(p(x)) ∥ h2(r(p(x))) ∥ h3(p(r(p(x))))]], where ∥ denotes concatenation and p and r are keyed transpositions. If h1, h2 and h3 are independent, then finding a collision for h requires finding a collision for all three simultaneously (i.e., on the same input), which one could hope requires the product of the efforts needed to attack them individually. It is important to note that Scalable PMC Hash does NOT use the generic Merkle-Damgård construction in its classical way. In contrast to a classical Merkle-Damgård hash function, message blocks are transposed with a keyed transposition function before being fed into hash functions h1, h2 and h3, and another keyed transposition is performed on the output of h1, h2 and h3. This provides a simple yet powerful way to increase strength using only available components. All keyed operations involve polymorphic cryptographic functions. Recent multicollision attacks clearly show the need for cryptographic functions that are variable to some extent. C.E. Shannon provides another piece of the puzzle in "Communication Theory of Secrecy Systems" (1949): "The combining operations give us ways of constructing many new types of secrecy systems from certain ones, such as the examples given. We may also use them to describe the situation facing a cryptanalyst when attempting to solve a cryptogram of unknown type. He is, in fact, solving a secrecy system of the type T = p1 * A + p2 * B + ... + pr * S, Σpi = 1, where the A, B, ..., S are known types of ciphers, with the pi their a priori probabilities in this situation, and pi * X corresponds to the possibility of a completely new unknown type of cipher."
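The composition just described can be illustrated with a small C++ sketch. Everything here is a structural stand-in: FNV-1a-style toy functions replace Whirlpool, Haval 256 and Ripemd 256, and a simple key-driven index shuffle replaces the actual polymorphic transpositions p and r. Only the shape h(x) = r[p[h1(p(x)) ∥ h2(r(p(x))) ∥ h3(p(r(p(x))))]] is taken from the paper; all constants and helper names are invented for illustration:

```cpp
#include <algorithm>
#include <cstdint>
#include <numeric>
#include <vector>

using Bytes = std::vector<uint8_t>;

// Keyed transposition: permute the byte order based on a key.
// A structural stand-in for the polymorphic transposition stages p and r
// described in the paper, NOT their actual definition.
Bytes transpose(const Bytes& in, uint64_t key) {
    std::vector<size_t> idx(in.size());
    std::iota(idx.begin(), idx.end(), 0);
    uint64_t s = key ? key : 1;  // xorshift-driven Fisher-Yates shuffle
    for (size_t i = in.size(); i > 1; --i) {
        s ^= s << 13; s ^= s >> 7; s ^= s << 17;
        std::swap(idx[i - 1], idx[s % i]);
    }
    Bytes out(in.size());
    for (size_t i = 0; i < in.size(); ++i) out[i] = in[idx[i]];
    return out;
}

// Toy stand-in for the worker hashes h1, h2, h3 (Whirlpool, Haval 256,
// Ripemd 256 in the real design): FNV-1a with different offset bases.
Bytes toy_hash(const Bytes& in, uint64_t offset) {
    uint64_t h = offset;
    for (uint8_t b : in) { h ^= b; h *= 1099511628211ULL; }
    Bytes out(8);
    for (int i = 0; i < 8; ++i) out[i] = static_cast<uint8_t>(h >> (8 * i));
    return out;
}

// h(x) = r[ p[ h1(p(x)) || h2(r(p(x))) || h3(p(r(p(x)))) ] ]
Bytes cascade(const Bytes& x, uint64_t pkey, uint64_t rkey) {
    Bytes px   = transpose(x, pkey);      // p(x)
    Bytes rpx  = transpose(px, rkey);     // r(p(x))
    Bytes prpx = transpose(rpx, pkey);    // p(r(p(x)))
    Bytes cat  = toy_hash(px,   14695981039346656037ULL);
    Bytes h2v  = toy_hash(rpx,  0x9e3779b97f4a7c15ULL);
    Bytes h3v  = toy_hash(prpx, 0xc2b2ae3d27d4eb4fULL);
    cat.insert(cat.end(), h2v.begin(), h2v.end());   // concatenate h1||h2||h3
    cat.insert(cat.end(), h3v.begin(), h3v.end());
    return transpose(transpose(cat, pkey), rkey);    // outer p, then r
}
```

The point of the sketch is the wiring, not the primitives: each worker hash sees a differently transposed copy of the message, and the concatenated digests pass through two further keyed transpositions before the final value is emitted.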
The polymorphic stage provides this "unknown type of cipher": to attack the Polymorphic Stage, cryptanalysts have to analyze a recurrence relation. The sequence of pseudorandom numbers is computed by two equations that add another six unknowns per cycle. As long as more than one additional unknown remains unknown, it is mathematically impossible to solve these equations. The unknowns are arithmetic operations (plus or minus). On any modern microprocessor, add and subtract operations draw the same electric current, consume the same CPU time (one clock cycle if the operands can be fetched from first-level cache) and have nearly identical operation codes; they are thus impossible to distinguish even with any kind of power attack (assuming that the operands are unknown).
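An operation-selecting recurrence of this kind might be sketched as follows in C++. The constants, the mixing multiplier and the update rule are illustrative assumptions, not the actual PMC recurrence; only the idea that hidden key bits choose between add and subtract is taken from the text:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// One step of a toy recurrence in which the arithmetic operation itself
// (add vs. subtract) is an unknown selected by a hidden key bit. As the
// text notes, both compile to single-cycle instructions that timing and
// power analysis cannot tell apart.
uint32_t step(uint32_t state, uint32_t input, bool op_is_add) {
    return op_is_add ? state + input : state - input;
}

// Run the recurrence over a block of inputs; 'opkey' is the hidden key
// whose bits pick the operation per cycle (an illustrative assumption).
std::vector<uint32_t> run(uint32_t seed, const std::vector<uint32_t>& inputs,
                          uint32_t opkey) {
    std::vector<uint32_t> out;
    uint32_t s = seed;
    for (size_t i = 0; i < inputs.size(); ++i) {
        bool add = (opkey >> (i % 32)) & 1;          // key bit picks +/-
        s = step(s, inputs[i] * 2654435761u, add);   // multiplicative mix
        out.push_back(s);
    }
    return out;
}
```

From the output sequence alone, an observer who does not know the operand stream cannot tell which cycles added and which subtracted, which is the property the paragraph above relies on.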
Scalable Polymorphic Hash has the potential to be a good starting point for new compression function designs. It is an approach that differs greatly from conventional design methods and design goals. Almost everyone who designs hash functions aims to create the smallest code and the fastest method in the world. Yet for modern PCs it makes no difference whether an algorithm consumes 1 kB or 1 MB of memory, and for some purposes it is actually desirable that a cryptographic function consumes a lot of time. It can be advantageous for users if a hash function churns back and forth over a comparably short password for 0.5 s instead of 1 microsecond.
