The official llllloooooo blog

Friday, December 1, 2017

Convert binary data files between Big Endian and Little Endian with xxd

Here is a nifty command line trick you can use to convert binary data files between big endian and little endian format. For example, if you are using 4 byte / 32 bit words, it byte swaps 0x1a2b3c4d to 0x4d3c2b1a.

"input.bin" is your input binary file.
"temp.txt" is an intermediate file that you can delete later.
"output.bin" is the output binary with all the 4 byte / 32 bit words endian swapped.

xxd -e -g4 input.bin temp.txt
xxd -r temp.txt output.bin

Or, if you want to skip the temporary text file:

xxd -e -g4 input.bin | xxd -r > output.bin
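A quick sanity check for the one-liner above is to swap a file twice and compare the result with the original. The sketch below wraps the pipeline in a small shell function for that purpose. The names "swap32" and "roundtrip_ok" are made up for this example, and the check only makes sense when the file size is a whole number of 4 byte words.

```shell
# Swap the byte order of every 4-byte word in a file.
# $1 = input file, $2 = output file
swap32() {
    xxd -e -g4 "$1" | xxd -r > "$2"
}

# Swap a file twice and compare the result with the original.
# Returns 0 when the round trip reproduces the input exactly.
# $1 = file to check (size assumed to be a multiple of 4 bytes)
roundtrip_ok() {
    rt_once=$(mktemp)
    rt_twice=$(mktemp)
    swap32 "$1" "$rt_once" && swap32 "$rt_once" "$rt_twice" && cmp -s "$1" "$rt_twice"
    rt_status=$?
    rm -f "$rt_once" "$rt_twice"
    return $rt_status
}
```

With these defined, something like `roundtrip_ok input.bin && echo "round trip OK"` should confirm that xxd on your system handles both steps as expected.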



The xxd tool is used to make a hexdump or convert a hexdump back into binary.

The xxd tool normally comes as part of the "vim" application, so if xxd isn't available on your system you might need to install vim. Additionally, I don't think this will work with the busybox version of xxd found on embedded systems because, as far as I can see, it doesn't currently support all the required command line options. The standard Linux / BSD / Unix version of xxd should be fine.

Note that this procedure does not convert executables between big endian and little endian, only binary data.

For example, say I have a binary file "test.bin" as follows:

00000000  41 42 43 44 45 46 47 48  49 4a 4b 4c 4d 4e 4f 50
00000010  51 52 53 54 55 56 57 58  59 5a 30 31 32 33 34 35
00000020  36 37 38 39 61 62 63 64  65 66 67 68 69 6a 6b 0a

First use "xxd -e -g4" to convert the binary file to a hexdump text file. The "-e" option says to use Little Endian output format. Use this option regardless of whether you're converting from Big to Little or Little to Big Endian, since the byte swap is the same in both directions. The "-g4" option groups the output into words of 4 bytes (32 bits). You can choose a different word size with "-g", for example "-g2" for 16 bit words.
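To illustrate the word size option, here is a minimal sketch that takes the word size as a parameter. The function name "swap_words" is just something I made up for this example. As far as I know, xxd only accepts power of two group sizes together with "-e", so 2, 4 and 8 are the useful values.

```shell
# Swap the byte order of every word in a file, for a chosen word size.
# $1 = word size in bytes (2, 4 or 8), $2 = input file, $3 = output file
swap_words() {
    xxd -e -g"$1" "$2" | xxd -r > "$3"
}
```

For example, `swap_words 2 input.bin output.bin` swaps each pair of bytes (16 bit words), while `swap_words 8 input.bin output.bin` reverses each group of eight bytes (64 bit words).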

xxd -e -g4 test.bin temp-hexdump.txt

The contents of temp-hexdump.txt are as follows

#cat temp-hexdump.txt
00000000: 44434241 48474645 4c4b4a49 504f4e4d  ABCDEFGHIJKLMNOP
00000010: 54535251 58575655 31305a59 35343332  QRSTUVWXYZ012345
00000020: 39383736 64636261 68676665 0a6b6a69  6789abcdefghijk.


Note that the 4 byte hex values in the middle are presented in little endian format (i.e. reversed) so for the first word instead of "41 42 43 44" we have "44 43 42 41".

Finally we use "xxd -r" to convert this hexdump back to binary format. Note that xxd -r only looks at the numerical hex data, so if you want to edit your text file, just edit the hex numbers, not the strings at the right.

xxd -r temp-hexdump.txt test-new-endian.bin

The contents of the output file test-new-endian.bin are endian swapped from the original as follows

00000000  44 43 42 41 48 47 46 45  4c 4b 4a 49 50 4f 4e 4d
00000010  54 53 52 51 58 57 56 55  31 30 5a 59 35 34 33 32
00000020  39 38 37 36 64 63 62 61  68 67 66 65 0a 6b 6a 69



Finally, note that if your input binary file size isn't a multiple of the chosen word size, such as 32 bits / 4 bytes, then the trailing bytes that don't fill a whole word will be discarded. One thing you could do is manually edit the temporary hexdump text file in the middle of the process and add some zeros to the hex data at the end, so that your data has a whole number of words before executing "xxd -r".
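An alternative to editing the hexdump by hand is to zero pad a copy of the input file before running xxd. This is a different approach from the manual edit described above, and the sketch assumes GNU coreutils, whose "truncate" accepts a "%" prefix meaning "round the size up to a multiple of". The function name "pad_to_word" is made up for this example.

```shell
# Copy a file and zero-pad the copy up to a multiple of the word size.
# Assumes GNU truncate, where "-s %N" rounds the size up to a multiple of N.
# $1 = input file, $2 = padded copy, $3 = word size in bytes (default 4)
pad_to_word() {
    cp -- "$1" "$2" && truncate -s "%${3:-4}" "$2"
}

# Example usage:
#   pad_to_word input.bin padded.bin 4
#   xxd -e -g4 padded.bin | xxd -r > output.bin
```

Keep in mind that the padding bytes become part of the last word, so after the swap the zeros appear before the real trailing bytes rather than after them.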

I used this method as a basis to convert an "md" memory dump from the U-Boot boot loader from one endianness to the other on my Debian Linux system. Obviously there are ways you can do this programmatically with Python scripts and so forth, but I always prefer using existing command line tools if possible.