The ability to analyze a firmware image and extract data from it is extremely useful. It can allow you to analyze an embedded device for bugs, vulnerabilities, or GPL violations without ever having access to the device.

In this tutorial, we’ll be examining the firmware update file for the Linksys WAG120N with the intent of finding and extracting the kernel and file system from the firmware image. The firmware image used is for the WAG120N hardware version 1.0, firmware version 1.00.16 (ETSI) Annex B, released on 08/16/2010 and is currently available for download from the Linksys Web site.

The first thing to do with a firmware image is to run the Linux file utility against it to make sure it isn’t a standard archive or compressed file. You don’t want to sit down and start analyzing a firmware image only to realize later that it’s just a ZIP file:
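As a rough illustration of what this first check does, here is a minimal Python sketch that compares the start of a file against a handful of well-known magic numbers. It is not a replacement for the file utility, the table is deliberately short, and the file name `firmware.bin` in the usage note is only a placeholder.

```python
# Leading magic bytes for a few common container formats (far from exhaustive).
KNOWN_MAGICS = {
    b"PK\x03\x04": "ZIP archive",
    b"\x1f\x8b": "gzip compressed data",
    b"BZh": "bzip2 compressed data",
    b"\xfd7zXZ\x00": "xz compressed data",
    b"7z\xbc\xaf\x27\x1c": "7-zip archive",
}


def identify_container(data: bytes):
    """Return a description if data starts with a known magic, else None."""
    for magic, description in KNOWN_MAGICS.items():
        if data.startswith(magic):
            return description
    return None
```

Calling `identify_container(open("firmware.bin", "rb").read(16))` and getting `None` back corresponds to the file utility reporting nothing more specific than raw data.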

OK, it’s nothing known to the file utility. Next, let’s do a hex dump and run strings on it:

Taking a look at the strings output, we see references to the U-Boot boot loader and the Linux kernel. This is encouraging, as it suggests that this device does in fact run Linux, and U-Boot is a very common and well-documented boot loader:
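For readers who want to script this step, the following is a small Python sketch of what the strings utility does: it pulls out runs of printable ASCII characters and records where they start. The minimum length of 4 mirrors the usual default of strings; the function name is made up for this example.

```python
import re


def extract_strings(data: bytes, min_len: int = 4):
    """Return (offset, text) pairs for runs of printable ASCII bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [(m.start(), m.group().decode("ascii")) for m in pattern.finditer(data)]
```

Having the offsets alongside the text makes it easier to relate a string such as a boot loader banner back to a location in the hex dump.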

However, taking a quick look at the hexdump doesn’t immediately reveal anything interesting:

So let’s run binwalk against the firmware image to see what it can identify for us. There are a lot of false positive matches (these will be addressed in the upcoming 0.3.0 release!), but there are a few results that stand out:

Binwalk has found two uImage headers (which is the header format used by U-Boot), each of which is immediately followed by an LZMA compressed file.

Binwalk breaks out most of the information contained in these uImage headers, including their descriptions: ‘u-boot image’ and ‘MIPS Linux-2.4.31’. It also shows the reported compression type of ‘lzma’. Since each uImage header is followed by LZMA compressed data, this information appears to be legitimate.
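The fields binwalk reports come from the 64-byte legacy uImage header, which is stored big endian and begins with the magic number 0x27051956. The sketch below, written in Python, decodes that header at a given offset. The names `UImageHeader` and `parse_uimage_header` are invented for this example, and only a few compression codes are mapped.

```python
import struct
from collections import namedtuple

UIMAGE_MAGIC = 0x27051956
# Seven 32-bit words, four single-byte fields, then a 32-byte image name.
UIMAGE_FORMAT = ">7I4B32s"
UIMAGE_SIZE = struct.calcsize(UIMAGE_FORMAT)  # 64 bytes
COMPRESSION = {0: "none", 1: "gzip", 2: "bzip2", 3: "lzma"}

UImageHeader = namedtuple(
    "UImageHeader",
    "magic header_crc timestamp data_size load_addr entry_point "
    "data_crc os arch image_type compression name",
)


def parse_uimage_header(data: bytes, offset: int = 0):
    """Decode a legacy uImage header at offset, or return None if absent."""
    if len(data) - offset < UIMAGE_SIZE:
        return None
    header = UImageHeader(*struct.unpack_from(UIMAGE_FORMAT, data, offset))
    if header.magic != UIMAGE_MAGIC:
        return None
    return header._replace(
        name=header.name.split(b"\x00", 1)[0].decode("ascii", "replace"),
        compression=COMPRESSION.get(header.compression, "unknown"),
    )
```

The image data starts immediately after the header, at `offset + UIMAGE_SIZE`, which is where the LZMA data is expected to begin.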

The LZMA files can be extracted with dd and then decompressed with the lzma utility. Don’t worry about specifying a size limit when running dd; any trailing garbage will be ignored by lzma during decompression:
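The same extraction can be sketched in Python, assuming the data is in the legacy .lzma container that Python's `lzma` module calls `FORMAT_ALONE`. The decompressor stops at the end of the compressed stream and leaves anything after it in `unused_data`, which is the behaviour the paragraph above relies on. The function name and the offset argument are placeholders; use the offsets binwalk reports for your image.

```python
import lzma


def decompress_lzma_at(image: bytes, offset: int) -> bytes:
    """Decompress a legacy .lzma stream that starts at offset in image.

    Bytes after the end of the stream are not treated as an error; they
    end up in the decompressor's unused_data attribute and are discarded.
    """
    decompressor = lzma.LZMADecompressor(format=lzma.FORMAT_ALONE)
    return decompressor.decompress(image[offset:])
```

Slicing from the offset to the end of the image plays the role of running dd without a size limit.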

We are now left with the decompressed files ‘uboot’ and ‘kernel’. Running strings against them confirms that they are in fact the U-Boot and Linux kernel images:

We’ve got the kernel and the boot loader images; now all that’s left is finding and extracting the file system. Since binwalk didn’t find any file systems that looked legitimate, we’re going to have to do some digging of our own.

Let’s run strings against the extracted Linux kernel and grep the output for any file system references; this might give us a hint as to what file system(s) we should be looking for:
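A scripted version of this grep might look like the Python sketch below. The list of file system names is an assumption chosen for illustration (common embedded file systems), not something taken from the kernel itself, and the search is case-insensitive.

```python
# Names of file systems often seen on embedded Linux devices (illustrative list).
FILESYSTEM_NAMES = (b"squashfs", b"cramfs", b"jffs2", b"yaffs", b"romfs", b"ext2")


def find_filesystem_hints(data: bytes):
    """Return the file system names that appear anywhere in data."""
    lowered = data.lower()
    return sorted(name.decode("ascii") for name in FILESYSTEM_NAMES if name in lowered)
```

Run against the decompressed kernel, a hit only says the kernel was built with support for that file system; it does not prove the firmware actually uses it.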

Ah! SquashFS is a very common embedded file system. Although binwalk has several SquashFS signatures, it is not uncommon to find variations of the ‘sqsh’ magic string (which indicates the beginning of a SquashFS image), so what we may be looking for here is a non-standard SquashFS signature inside the firmware file.

So how do we find an unknown signature inside a 4MB binary file?

Different sections inside of firmware images are often aligned to a certain size. This often means that there will have to be some padding between sections, as the size of each section will almost certainly not fall exactly on this alignment boundary.

An easy way to find these padded sections is to search for lines in our hexdump output that start with an asterisk (‘*’). When hexdump sees the same bytes repeated many times, it simply replaces those bytes with an asterisk to indicate that the last line was repeated many times. A good place to start looking for a file system inside a firmware image is immediately after these padded sections of data, as the start of the file system will likely need to fall on one of these aligned boundaries.
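The padding search can also be scripted. The Python sketch below looks for long runs of a single repeated byte and reports where each run starts and ends; the end of a run is a candidate start for the next section. This is a simplification of what hexdump's asterisk means (hexdump collapses any repeated 16-byte line, not only single-byte fill), and the default threshold of 32 bytes is an arbitrary choice.

```python
def find_padding_runs(data: bytes, min_len: int = 32):
    """Return (start, end) offsets of runs of one repeated byte.

    end is exclusive, so data[end:] is where the next section would begin.
    A plain loop is used for clarity; it is not optimized for large images.
    """
    runs = []
    i = 0
    n = len(data)
    while i < n:
        j = i + 1
        while j < n and data[j] == data[i]:
            j += 1
        if j - i >= min_len:
            runs.append((i, j))
        i = j
    return runs
```

Printing the first few bytes at each `end` offset gives a quick list of section starts to inspect by hand.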

There are a couple of interesting sections that contain the string ‘sErCoMm’. This could be something, but given the small size of some of these sections and the fact that they don’t appear to have anything to do with SquashFS, it is unlikely:

There are some other sections as well, but again, these are very small, much too small to be a file system:

Then we come across this section, which has the string ‘sqlz’:

The standard SquashFS image starts with ‘sqsh’, but we’ve already seen that the firmware developers have used LZMA compression elsewhere in this image. Also, most firmware that uses SquashFS tends to use LZMA compression instead of the standard zlib compression. So this signature could be a modified SquashFS signature that is a concatenation of ‘sq’ (SQuashfs) and ‘lz’ (LZma). Let’s extract it with dd and take a look:

Of course, ‘sqlz’ is not a standard signature, so the file utility still doesn’t recognize our extracted data. Let’s try editing the ‘sqlz’ string to read ‘sqsh’:
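If you prefer not to use a hex editor, the edit is easy to script. This Python sketch swaps the first four bytes of the extracted data and refuses to touch anything that does not start with the expected signature. The function name and default arguments are just for this example.

```python
def patch_magic(data: bytes, old: bytes = b"sqlz", new: bytes = b"sqsh") -> bytes:
    """Return a copy of data with its leading signature old replaced by new."""
    if len(old) != len(new):
        raise ValueError("signatures must have the same length")
    if not data.startswith(old):
        raise ValueError("data does not start with the expected signature")
    return new + data[len(old):]
```

Write the result to a new file rather than over the original, so the unmodified extraction is still around if the guess turns out to be wrong.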

Running file against our modified SquashFS image gives us much better results:

This definitely looks like a valid SquashFS image! But due to the LZMA compression and the older SquashFS version (2.1), you won’t be able to extract any files from it using the standard SquashFS tools. However, using the unsquashfs-2.1 utility included in Jeremy Collake’s firmware mod kit works perfectly:
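The version number can also be read straight from the superblock. The Python sketch below assumes the usual SquashFS layout, in which ‘sqsh’ marks a big endian image, ‘hsqs’ a little endian one, and the major and minor version are 16-bit values at byte offsets 28 and 30. Treat it as an illustration of the idea rather than a full superblock parser.

```python
import struct


def squashfs_version(data: bytes):
    """Return (endianness, major, minor) for a SquashFS superblock, else None."""
    magic = data[:4]
    if magic == b"sqsh":
        endian, label = ">", "big"
    elif magic == b"hsqs":
        endian, label = "<", "little"
    else:
        return None
    if len(data) < 32:
        return None
    major, minor = struct.unpack_from(endian + "HH", data, 28)
    return (label, major, minor)
```

Knowing the version up front tells you which unsquashfs build to reach for before trying to extract anything.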

Now that we know this works, we should go ahead and add this new signature to binwalk so that it will identify the ‘sqlz’ magic string in the future. Adding this new signature is as easy as opening binwalk’s magic file (/etc/binwalk/magic), copy/pasting the ‘sqsh’ signature and changing the ‘sqsh’ to ‘sqlz’:
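To show what the extra signature buys us, here is a standalone Python sketch that scans an image for a small table of magic strings, including the non-standard ‘sqlz’. It does not use binwalk's magic file format or any binwalk API; the table and function name are made up for this example.

```python
# Magic strings to search for; 'sqlz' is the non-standard one found above.
SIGNATURES = {
    b"sqsh": "SquashFS, big endian",
    b"hsqs": "SquashFS, little endian",
    b"sqlz": "SquashFS, non-standard 'sqlz' signature",
}


def scan_signatures(data: bytes, signatures=SIGNATURES):
    """Return a sorted list of (offset, description) for every match in data."""
    hits = []
    for magic, description in signatures.items():
        start = data.find(magic)
        while start != -1:
            hits.append((start, description))
            start = data.find(magic, start + 1)
    return sorted(hits)
```

A four-byte string will match by chance now and then in a multi-megabyte image, so each hit still needs the kind of sanity check described earlier, such as whether it sits right after a padded section.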

Re-running binwalk against the original firmware image, we see that it now correctly identifies the SquashFS entry:

And there you have it. We successfully identified and extracted the boot loader, kernel and file system from this firmware image, plus we have a new SquashFS signature to boot!

50 Responses to “Reverse Engineering Firmware: Linksys WAG120N”

  1. Max Says:

    Awesome! Great read.

  2. Denis Says:

    Very nice tutorial, thank you !

  3. Coder Says:

    Great!
    Will repost on my blog

  4. two55309 Says:

    very cool! I would be interested in seeing the reverse as well. Making a new firmware file

  5. Gi0 Says:

    Great reading!!
    I agree with two55309, i d also love to read more about editing the firmware

  6. Heremite Says:

    Great post! Congrats! Looking forward for more about editing the firmware.

  7. alex Says:

    Awesome!
    I also would be interested in seeing the reverse of this.(building firmware image)

  8. Click170 Says:

    Awesome tutorial! Thankyou!
    +1 for reverse tutorial?

  9. Slie Says:

    Awesome

  10. Michael Says:

    Fantastic post! I had always wondered about how this was done, but could never find a good resource.

    I wish to add my +1 for a further post in this series, possibly detailing the reverse of this as others have suggested.

    Many thanks!

  11. Reverse engineering a firmware | punnie.ath.cx Says:

    [...] If so, you should check it out. [...]

  12. Craig Says:

    Thanks for the feedback everyone! It sounds pretty unanimous – a tutorial on building a firmware image will have to be next. :)

  13. Reverse engineering embedded device firmware - Hack a Day Says:

    [...] not necessarily an easy thing to learn, the ability to reverse engineer embedded device firmware is an incredibly useful skill. Reverse engineering firmware allows you to analyze a device for bugs [...]

  14. Reverse engineering embedded device firmware | You've been blogged! Says:

    [...] not necessarily an easy thing to learn, the ability to reverse engineer embedded device firmware is an incredibly useful skill. Reverse engineering firmware allows you to analyze a device for bugs [...]

  15. wak Says:

    Excellent tutorial! You describe this in a way that makes it seem very easy. I look forward to reading what you write next.

  16. s3c Says:

    Really good work, hope to see more of these in the future, for other platforms perhaps? :)

  17. Chintan Parikh Says:

    Thank you for your detailed explanation, followed all the steps and it worked pretty nicely.

  18. Daniel Buchanan Says:

    I like this and I’m thinking about putting this on my blog as a linkback.

  19. Aaron Says:

    Wow, great! Thanks for sharing your wisdom with the world!

  20. Adam Baxter Says:

    What do I do when the firmware (e.g. http://bit.ly/jMx2Si) has some kind of checksum at the end? This makes re-packing it quite tricky.

  21. gamegod Says:

    This is a great tutorial! I had never heard of binwalk before, seems like a really handy tool for embedded work. Thanks for sharing!

  22. Ingeniería inversa en firmware, un ejemplo práctico | CyberHades Says:

    [...] good tutorial that you can find here (in English). Share this [...]

  23. Joseph Says:

    Hello,

    I’ve gotten up to the part where unsquashfs-lzma. I’ve downloaded the Firmware Modification Kit, and compiled with make. Then I’ve taken the unsquashfs-lzma and put it in the same directory as my firmware file. Upon running it:
    >./unsquashfs-lzma wag120.squashfs
    >Reading a different endian SQUASHFS filesystem on wag120.squashfs
    >zlib::uncompress failed, unknown error -3
    >Bus error

    Any help?

  24. Zhukov Says:

    By far one of the best tutorials I’ve seen.

    Very well done!

    Thank you for all the effort you had on putting this altogether!

  25. Jim Says:

    Great info, congrats by me also!!
    Tried to follow your tut on a Thomson 585v7 firmware. Binwalk unfortunately cant recognise anything in the .bin file. Is it because of no signature for it in the magic file or it has to do something with the bin file?
    Keep up the good work and please do post an editing the firmware tutorial

  26. john Says:

    Great detail tut
    thank you thank you

  27. links for 2011-05-31 « xtra’s blog Says:

    [...] /dev/ttyS0 » Blog Archive » Reverse Engineering Firmware: Linksys WAG120N (tags: tools linux hardware blog programming) [...]

  28. Nold Says:

    Hey, thanks a lot for this article! Love it!

  29. Ricardo Says:

    This was such an informative read.

    Thanks

  30. Ralph Corderoy Says:

    I skimmed the article and see what’s going on but I couldn’t be bothered to click on one thumbnail after another of a terminal only to see text, e.g. of a command and its output. :-(

    Don’t you think it would be a much more readable, and Googleable, article if the text of the screenshots was presented as text in the article and none of the images existed?

  31. 80) Says:

    >GPL violations
    GPL is The Cancer

  32. Axelle Says:

    Nice post ! Very interesting. Found anything interesting in the SquashFS image?

    BTW, I do however agree with Ralph you could lighten the screenshots or just put the commands inline with ;)

  33. Rafael Bracho Says:

    I truly appreciate the article, it will be bookmarked for future reference. Unlike Ralph Corderoy, I especially liked the screen shots, as they allow one to see exactly how the different commands are invoked and what their outputs are. Once again, thanks, kudos!

  34. Vasilis Mavroudis Says:

    Just great dude! I started messing with firmware modding and penetration testing on embedded devices and this tutorial gave me a good boost…
    Bookmarked!

  35. Rogan Dawes Says:

    sercomm was not a total dead end, by the way:

    http://www.nslu2-linux.org/wiki/Main/SercommFirmwareUpdater

  36. Craig Says:

    @Adam Baxter:

    If there’s a checksum in the firmware you’ll have to determine where the checksum is and what checksum algorithm is used. Haven’t looked at the ZIP file you linked to yet, but if there’s GPL code available for the device then that’s usually the easiest way to figure it out. Also look to see if open source projects like OpenWRT or DD-WRT (since you didn’t mention what this device is I’m assuming it’s a router!) have done any work on it. I plan on covering firmware mods in more detail in a later tutorial.

    @Joseph:

    The unsquashfs error you’re getting looks like you are using the standard version of unsquashfs instead of the lzma version. Are you sure you’re running the correct unsquashfs binary from the firmware mod kit source?

    @ Jim:

    I haven’t looked at the Thomson routers before. If you aren’t getting any results from binwalk then it didn’t find any known signatures in the file. What kind of output do you get from running strings? If there are few readable strings then I would suspect the firmware (or part of it) is compressed. Try running binwalk with the -a option; this will use all signatures and will result in a lot of false positive matches but may help you find some gzip or other compressed data in the firmware.

    @ Ralph/Axelle:

    I started writing the tutorial by just copy/pasting the text, but felt that the screenshots made the commands and their output much easier to follow. I agree though that having to click back and forth is a pain – I’ll be sure to have future pictures viewable inline with the text!

    @Rogan Dawes:

    I did Google Sercomm and found that link too, but since it wasn’t related to the filesystem I was looking for and wasn’t the object of the tutorial I skipped over it. Good point though that I should have mentioned in the tutorial – always Google odd strings in the firmware!

  37. Chintan Parikh Says:

    Binwalk compilation errors for Mac OSX – Could you please tell me how could I install libmagic for mac. Sorry for naive question but I couldn’t find source on the web.

    thanks.

  38. Craig Says:

    Chintan: I did look for libmagic for OSX, but couldn’t find it. Linux only I’m afraid. If you find a port for OSX though, let me know!

  39. Craig Says:

    @Axelle:

    Yes, a quick skim through the firmware found the hard coded SSL public/private key pair for the HTTPS server. Another one for LittleBlackBox!

  40. powertomato Says:

    Great post!

    @Chintan Parikh
    Try compiling the library:
    http://sourceforge.net/projects/libmagic/

  41. Craig Says:

    powertomato: Have you tried building against that libmagic library? It doesn’t look like it’s compatible with the libmagic library binwalk is built against.

  42. Gi0 Says:

    @Craig

    Also tried your instructions on a Thomson 585v7 7.4.4.7 UK firmware (Download link of bin file: http://goo.gl/fvTcr). Strings can’t get anything useful (link: pastebin.com/ibXN4KQy) and binwalk with the -a switch seems to get too much info (I’m guessing false positives, as you mentioned. Link: pastebin.com/nCbvDuq2)
    If you have any time to provide any ideas/steps to get pass this, i would gladly try to provide signatures for Thomson firmwares for binwalk.
    Once again, thank you for your tutorial and time!

  43. Craig Says:

    Gi0: I don’t see anything obvious in that firmware image, but given the size and the layout of the firmware update file, I’d be surprised if it’s Linux based.

    Given the lack of nearly any strings, I’d say that aside from the header (the first few bytes look like they are “magic” bytes for the firmware header) the firmware image is probably compressed, but nothing that I recognize.

    Thomson apparently has its own boot loader, at least for that model, so it’s possible that there could be a custom compression algorithm. See more here: https://forum.openwrt.org/viewtopic.php?pid=101182

    Do you have physical access to one of these routers? Getting to the jtag/serial ports will likely give you a lot more information.

  44. Reverse Engineering Firmware « Yet Another Technology Blog Says:

    [...] http://www.devttys0.com/2011/05/reverse-engineering-firmware-linksys-wag120n/ [...]

  45. Gi0 Says:

    @Craig:
    I have access to a couple of those routers, thank you for the tip and also for the link you provided!

  46. Rulzwrld Says:

    This is an amazing article. The next part is definitely inevitable.

  47. Hacker Harry Says:

    simple, yet effective.
    nice read.

  48. Sassan Panahinejad Says:

    An interesting article.
    Binwalk is definitely something I’ll be making use of in the future.
    Also the method for searching for section starts using the padding in the image was crafty :)

  49. Mendel Says:

    Great article! I’ll try to do the same with TP-Link firmwares. For now, binwalk showed me the image is “Linux Journalled Flash filesystem, little endian”

  50. powertomato Says:

    @Craig:
    No I didn’t try it… It’s not compatible – my bad! Sorry!
    AFAIK libmagic is an optional part of posix so there is a good chance that the .so library file is somewhere on the MacOS system.