The web is constantly evolving with new technologies being added all the time, creating a platform completely unrecognisable from when the web first began. MWR Labs recently carried out a research project to assess some of these new technologies and the possibilities they bring for helping to solve computationally intensive problems within security.
The main aim behind the project was to try to harness the power of two new technologies in particular, WebGL and WebCL, for retrieving passwords from hashes using a brute force technique. If this proved possible, the secondary aim was to assess how cost effective it would be to recover passwords in this way compared to using cloud computing. Let’s start with a brief introduction to these two new technologies.
The WebCL specification is still under development by Khronos; however, Nokia and Samsung have both created implementations to act as starting points. Nokia have created a Firefox extension and Samsung have developed a WebKit implementation. Currently WebCL is only available through the above implementations.
Retrieving passwords from hashes using brute force simply means taking every combination of characters that could make up a password, hashing it, and testing whether the result matches the original hash. Our starting point was to try to implement the MD5 hashing algorithm in WebGL and WebCL.
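To make the brute force idea concrete, here is a minimal CPU-only sketch in Python. It is not the WebGL or WebCL implementation from the project; the function name `brute_force_md5`, the `CHARSET` constant and the `max_length` parameter are assumptions chosen for illustration.

```python
import hashlib
import itertools
import string

# Assumed character set for illustration: A-Za-z0-9 (62 characters).
CHARSET = string.ascii_letters + string.digits


def brute_force_md5(target_hex, max_length, charset=CHARSET):
    """Return the first candidate whose MD5 digest equals target_hex, or None."""
    target_hex = target_hex.lower()
    for length in range(1, max_length + 1):
        # itertools.product enumerates every combination of the given length.
        for combo in itertools.product(charset, repeat=length):
            candidate = "".join(combo)
            digest = hashlib.md5(candidate.encode("ascii")).hexdigest()
            if digest == target_hex:
                return candidate
    return None


if __name__ == "__main__":
    target = hashlib.md5(b"ab1").hexdigest()
    print(brute_force_md5(target, max_length=3))
```

The search space grows as the charset size raised to the password length, which is why the hashing step is worth offloading to a GPU.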
The basic idea behind the distribution platform was to have a centralised server that distributes computation between worker nodes. The server holds a list of the hashes to be cracked and gives each worker node a range of character combinations along with the hash to be cracked. The worker node then hashes every character combination in its range and, if one matches the original hash, reports the recovered password to the centralised server. The server was implemented in Ruby and communicates with the nodes through JSON requests. The following two diagrams show an overview of the system and a step by step run through of a hash being cracked.
Layout of the distributed platform:
Step by step process of cracking the hash:
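The range-based distribution described above can be sketched as follows. This is an illustrative Python model rather than the project's Ruby server or browser worker: the functions `index_to_candidate`, `make_work_units` and `process_work_unit`, and the idea of numbering candidates as integers in base 62, are assumptions about one plausible way to hand out ranges.

```python
import hashlib
import string

# Assumed character set for illustration: a-z, A-Z, 0-9 (62 characters).
CHARSET = string.ascii_lowercase + string.ascii_uppercase + string.digits


def index_to_candidate(index, length, charset=CHARSET):
    """Map an integer in [0, len(charset)**length) to a fixed-length candidate."""
    base = len(charset)
    chars = []
    for _ in range(length):
        index, digit = divmod(index, base)
        chars.append(charset[digit])
    return "".join(reversed(chars))


def make_work_units(length, unit_size, charset=CHARSET):
    """Server side: split the keyspace for one length into (start, end) ranges."""
    total = len(charset) ** length
    for start in range(0, total, unit_size):
        yield start, min(start + unit_size, total)


def process_work_unit(target_hex, length, start, end, charset=CHARSET):
    """Worker side: hash every candidate in [start, end); return a match or None."""
    for index in range(start, end):
        candidate = index_to_candidate(index, length, charset)
        if hashlib.md5(candidate.encode("ascii")).hexdigest() == target_hex:
            return candidate
    return None


if __name__ == "__main__":
    target = hashlib.md5(b"Zz9").hexdigest()
    for start, end in make_work_units(length=3, unit_size=50_000):
        found = process_work_unit(target, 3, start, end)
        if found is not None:
            print(f"recovered {found!r} in range [{start}, {end})")
            break
```

Because a work unit is just a hash, a length and two integers, it is small enough to send to a node as a JSON object, and a node only needs to report back when it finds a match or finishes its range.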
The following features were implemented:
Firstly, let us compare the performance of our MD5 hashing algorithms. Speeds are shown in million hashes per second (MH/s). The benchmarks were run on the following CPU and GPU. CPU – Intel Core 2 Duo @ 2.53GHz, GPU – ATI HD5570.
| Statistic | Value |
|---|---|
| Highest speed achieved | 110.280 MH/s |
| Highest number of nodes connected | 7,615 |
| Total nodes connected | 146,740 |
| Longest connected node | 3 days, 13 hrs, 25 mins |
| Fastest node | Chrome 13, 0.563 MH/s |
| Cost per 1 billion MD5 | $0.028 |
| Cost to recover 8 char A-Za-z0-9 | ~$6,250 |
| Time needed to recover 8 char A-Za-z0-9 | 123 days! |
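As a rough cross-check of the cost and time figures above, the arithmetic can be reproduced in a few lines of Python. The sketch assumes the keyspace is exactly the 62^8 eight-character passwords over A-Za-z0-9 (the source does not say whether shorter lengths are included), and the function names are illustrative. Under that assumption the cost comes out at roughly $6,100, in line with the ~$6,250 quoted, and the 123 days implies an average speed of about 20.5 MH/s, well below the 110.280 MH/s peak.

```python
SECONDS_PER_DAY = 86_400


def keyspace(charset_size, length):
    """Number of candidates of exactly this length."""
    return charset_size ** length


def cost_usd(candidates, cost_per_billion):
    """Total cost at a fixed price per billion hashes."""
    return candidates / 1e9 * cost_per_billion


def days_at_speed(candidates, hashes_per_second):
    """Days needed to exhaust the keyspace at a constant speed."""
    return candidates / hashes_per_second / SECONDS_PER_DAY


def implied_speed(candidates, days):
    """Average hashes per second implied by a total run time in days."""
    return candidates / (days * SECONDS_PER_DAY)


if __name__ == "__main__":
    total = keyspace(62, 8)  # A-Za-z0-9, 8 characters (assumed exact length)
    print(f"keyspace:            {total:,}")
    print(f"cost at $0.028/bn:   ${cost_usd(total, 0.028):,.0f}")
    print(f"days at peak speed:  {days_at_speed(total, 110.280e6):.1f}")
    print(f"implied avg speed:   {implied_speed(total, 123) / 1e6:.1f} MH/s")
```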
Based on these statistics we can estimate the performance of the distributed hash cracker (DHC) if we assume that every node supports WebCL.
| Estimate | Value |
|---|---|
| WebCL estimated average node speed | 10 MH/s |
| Amazon EC2 GPU instance IGHashGPU speed | 2790 MH/s [1] |
| Amazon EC2 cost | $2.10 per hour |
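To put these estimates in perspective, the short Python sketch below works out how many concurrent WebCL nodes would be needed to match a given speed, and what the EC2 figures imply per billion hashes. It takes the table values at face value and assumes the comparison is against a single EC2 GPU instance, which the figures above do not state explicitly; the function names are illustrative.

```python
import math


def nodes_to_match(target_speed, node_speed):
    """Smallest whole number of nodes whose combined speed reaches target_speed."""
    return math.ceil(target_speed / node_speed)


def cost_per_billion(cost_per_hour, hashes_per_second):
    """USD per billion hashes for a machine billed by the hour."""
    billions_per_hour = hashes_per_second * 3600 / 1e9
    return cost_per_hour / billions_per_hour


if __name__ == "__main__":
    ec2_speed = 2790e6    # hashes per second, from the table
    webcl_node = 10e6     # hashes per second, estimated average from the table
    print("nodes to match one instance:", nodes_to_match(ec2_speed, webcl_node))
    print("nodes to be 50% faster:     ", nodes_to_match(ec2_speed * 1.5, webcl_node))
    print(f"EC2 cost per billion MD5:    ${cost_per_billion(2.10, ec2_speed):.6f}")
```

On these assumptions, a few hundred concurrent WebCL nodes would be enough to reach the speed of one EC2 GPU instance, so the number of nodes the advertising campaign can keep connected at once is the figure that matters most.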
From these estimates we can expect WebCL to be roughly 50% faster than using Amazon’s EC2 GPU instances. In addition, the speed of this distributed platform can be varied by changing the daily spend limit set for the advertising campaign. For example, if we doubled the daily spend limit we would expect to get double the number of impressions in the same amount of time, thus roughly doubling our speed. Of course, there will eventually be a limit on the speed that can be achieved, because the centralised server acts as a bottleneck.