I have an application where I want to use a reliable message digest algorithm, such as SHA-1 or MD5. Both of these are implemented in Java 1.4, and I have some sample code and results from the IBM developerWorks site. However, when I compile and run the code on a Sun box, the message digest doesn't match the expected results. It appears to be a code-page issue.
Can these Java message digest algorithm implementations be used in such a manner as to generate the same results across platforms and control for code-page differences?
The problem isn't the hash algorithms; it's what we call "text canonicalization." A hash is computed over the actual bytes of the data, so the same text encoded under two different code pages produces two different digests. To get matching results, either translate the text into a known "canonical form" before hashing, or remember *not* to do any translation and hash the original bytes exactly as they are. Either approach is an acceptable way to solve the problem, as long as every platform hashes the same data.
OpenPGP (for which I'm a spec author) takes the canonical-form approach: it specifies that all text is encoded in UTF-8.
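As a rough illustration of the canonical-form approach, here is a minimal Java sketch that encodes text with an explicitly named charset before hashing, rather than relying on the platform default. The class and method names (CanonicalDigest, digestHex, sha1Hex) are made up for this example, and it assumes Java 7 or later for StandardCharsets; on Java 1.4 the equivalent would be text.getBytes("UTF-8").

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class, named for this example only.
class CanonicalDigest {

    // Encode the text with an explicitly named charset, then hash the bytes.
    // Calling text.getBytes() with no argument would use the platform
    // default code page, which is what makes digests differ across machines.
    static String digestHex(String algorithm, String text, Charset charset) {
        try {
            MessageDigest md = MessageDigest.getInstance(algorithm);
            byte[] hash = md.digest(text.getBytes(charset));
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 and MD5 are required on every Java platform,
            // so this should only happen for a misspelled algorithm name.
            throw new IllegalStateException(e);
        }
    }

    // Assumed canonical form for this sketch: UTF-8.
    static String sha1Hex(String text) {
        return digestHex("SHA-1", text, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Same text, two encodings: the digests differ because the bytes differ.
        String text = "caf\u00e9";
        System.out.println(digestHex("SHA-1", text, StandardCharsets.UTF_8));
        System.out.println(digestHex("SHA-1", text, StandardCharsets.ISO_8859_1));
    }
}
```

If the data arrives as raw bytes (a file or a network stream), the other option applies: pass those bytes straight to MessageDigest and never convert them to a String first.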
For more information on this topic, visit these other SearchSecurity.com resources:
Ask the Expert: Clarification of encryption keys
Ask the Expert: Using MD5 in Java
WhatIs Definition: canonicalization