Shaun Friedle created an impressive piece of JavaScript that can automatically defeat the CAPTCHAs used by the Megaupload file hosting service. While those CAPTCHAs are particularly weak, the script breaks into new territory for the language, namely JavaScript-based optical character recognition. John Resig posted a breakdown of how the software works. Here’s the quick summary:
- The HTML5 Canvas getImageData API is used to get at the pixel data from the CAPTCHA image. Canvas lets you draw an image onto a canvas, from which you can later read the pixel data back out.
- The script includes an implementation of a neural network, written in pure JavaScript.
- The pixel data, extracted from the image using Canvas, is fed into the neural network in an attempt to divine the exact characters being used – in a sort of crude form of Optical Character Recognition (OCR).
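The steps above can be sketched roughly as follows. This is a minimal illustration, not Shaun's actual code: the function names (toBinaryInputs, layer, classify), the brightness threshold, and the hand-picked toy weights are all assumptions made for the example, and a real script would use a trained network with one output per character. The pixel array mimics the RGBA layout that getImageData returns, so the sketch runs outside a browser too.

```javascript
// Convert RGBA pixel data (the layout of ImageData.data, as returned by
// CanvasRenderingContext2D.getImageData) into 0/1 network inputs.
// The threshold of 128 is an assumption chosen for illustration.
function toBinaryInputs(rgba, threshold = 128) {
  const inputs = [];
  for (let i = 0; i < rgba.length; i += 4) {
    // Average R, G and B for a rough brightness; alpha (i + 3) is ignored.
    const brightness = (rgba[i] + rgba[i + 1] + rgba[i + 2]) / 3;
    inputs.push(brightness < threshold ? 1 : 0); // dark pixel counts as "ink"
  }
  return inputs;
}

function sigmoid(x) {
  return 1 / (1 + Math.exp(-x));
}

// One fully connected layer: weights[j] holds the weights feeding output j.
function layer(inputs, weights, biases) {
  return weights.map((row, j) =>
    sigmoid(row.reduce((sum, w, i) => sum + w * inputs[i], biases[j]))
  );
}

// Forward pass through every layer; returns the index of the strongest output.
function classify(inputs, layers) {
  const outputs = layers.reduce(
    (activations, l) => layer(activations, l.weights, l.biases),
    inputs
  );
  return outputs.indexOf(Math.max(...outputs));
}

// In a browser the pixel data would come from something like:
//   const ctx = canvas.getContext("2d");
//   ctx.drawImage(img, 0, 0);
//   const rgba = ctx.getImageData(0, 0, canvas.width, canvas.height).data;
// Here a hand-made 2x2 image (dark on the main diagonal) stands in for it.
const rgba = new Uint8ClampedArray([
  0, 0, 0, 255,        255, 255, 255, 255,
  255, 255, 255, 255,  0, 0, 0, 255,
]);
const inputs = toBinaryInputs(rgba);

// Toy weights picked by hand, NOT trained: output 0 favors the main
// diagonal pattern, output 1 favors the opposite diagonal.
const toyLayers = [
  { weights: [[2, -2, -2, 2], [-2, 2, 2, -2]], biases: [0, 0] },
];
const label = classify(inputs, toyLayers);
console.log(inputs, label);
```

In a real solver the CAPTCHA would first be split into one sub-image per letter, and each sub-image would be run through the network separately, with the output index mapped back to a character.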
Shaun designed the software as a Greasemonkey script that breaks CAPTCHAs for Megaupload and automatically triggers a download. The code is designed specifically for this CAPTCHA style, but there’s no reason why the getImageData trick combined with an alternate OCR implementation couldn’t be used to solve other CAPTCHA systems. This is pretty fascinating stuff.
Is there a better (more convenient, harder to cheat) way to prove humanness? What else could you make in JavaScript using OCR, neural nets, or per-pixel image processing?
Megaupload Auto-fill CAPTCHA
MuCaptcha Online Demo
OCR and Neural Nets in JavaScript – John Resig



Cool coding; however, a neural network is simply a memorization table. Even changing the CAPTCHA letters to lowercase would break this code. And things get really messy when they introduce funny letters and random lines.
Neural networks have the ability to resist some noise, allowing them to see through any funny letters and random lines. You can also teach them lower case letters.
This probably requires two different checks. Something like a question, ‘Which of the following is a rabbit?’, followed by several images of various animals. The software would need to be able to do two things: first, parse the English sentence as a question and understand what it means; then decode the images and determine which one matches.
MegaUpload uses a remarkably uncomplicated three-letter CAPTCHA to validate genuine downloads. I think this is extremely nice of them, and while calling it weak is accurate, that is somewhat the point.
Who here has stared in incomprehension at the “pay to make this go away” cats and dogs CAPTCHA on Rapidshare? That’s the other end of the scale for “security”. Being able to punch in three letters and get your speedy, bulk download is wonderful in a world where the CAPTCHA has become a tool to generate revenue.
CAPTCHAs and similar ideas are there to stop spambots from entering areas like forums. Why the hell would you want to help bypass this? It’s just gonna help the spammers.
Stephen, Rapidshare has no CAPTCHA at all anymore; they have seen the light. Both Rapidshare and Megaupload restrict free users to one download at a time and limit the amount they can download in a day, and that is enough to prevent anyone from overusing their service anyway.
selfSilent, I don’t know how a script that breaks the CAPTCHA on a download site will help anyone get into a forum. Spammers already have much more sophisticated software capable of breaking CAPTCHAs on forums. The only thing that’s new about my script is that it uses JavaScript.
I need a CAPTCHA decoder to reserve for some 50% deals on a website. It starts by making you click reserve, then goes to a page with a CAPTCHA code, then you have to click reserve again. They are usually limited to 100 items, and sell out in less than 10 seconds. If I could figure out how to get a decoder to get past that for me it would be so much easier. I need to read a bit more about these decoders.
Can we get the source files?