Monday, April 16, 2012

Convert Unix, Windows, and Mac line endings using an OS X command

Today I had to copy some MySQL data from a Debian server into a test environment on my MacBook. While importing the data from tab-delimited text files, I noticed warnings that data in the last column of several tables was being truncated. I looked at the tables and noticed MySQL doing some very strange formatting when printing them. It looked almost as if the last column was padded with a bunch of white space. I opened an import file in TextWrangler and it appeared fine, but when I looked in the document options, I saw that the line endings were not the Unix (LF) ones I expected.


The good ol' EOL (end-of-line) character...

Different operating systems use different characters to mark the end of a line:
  • Unix / Linux / OS X use LF (line feed, '\n', 0x0A)
  • Macs prior to OS X use CR (carriage return, '\r', 0x0D)
  • Windows / DOS use CR+LF (carriage return followed by line feed, '\r\n', 0x0D 0x0A)
I'm guessing the person who sent me those files first transferred them to his Windows machine in ASCII mode, so newline characters got automatically converted during transfer.
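
To make the three conventions listed above concrete, here is a minimal sketch that prints the bytes of a short sample line in each style. The helper name hex_bytes is made up for this illustration; it only wraps od and tr, which ship with OS X and Linux.

```shell
# hex_bytes (hypothetical helper): read stdin and print its bytes as one
# hex string, so the line terminator at the end is easy to spot.
hex_bytes() {
    od -An -tx1 | tr -d '[:space:]'
    echo
}

printf 'abc\n'   | hex_bytes   # 6162630a   -> ends in LF (Unix / Linux / OS X)
printf 'abc\r'   | hex_bytes   # 6162630d   -> ends in CR (Macs prior to OS X)
printf 'abc\r\n' | hex_bytes   # 6162630d0a -> ends in CR+LF (Windows / DOS)
```

Piping the first line of a suspicious file through the same helper (for example with head -n 1) should show which of the three endings it uses.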

Since some of the files were very big, instead of changing line endings in TextWrangler, I decided to use the command line (shocking, I know).

First I executed
cat -v file-name
to confirm the existence of the dreaded ^M (carriage return) at the end of every line, and then ran
tr -d '\r' < file-name > file-name-unix
to generate new files without CR characters.
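
With several import files to fix, the same tr call can be wrapped in a loop. This is only a sketch of one way to do it, not what I actually ran: the function name strip_cr is hypothetical, and it assumes you want the files rewritten in place rather than saved under a new name.

```shell
# strip_cr (hypothetical helper): remove every CR from each file given
# as an argument, rewriting the file in place.
strip_cr() {
    for f in "$@"; do
        # Count the CR bytes; $(( )) strips the padding some wc versions print.
        cr_count=$(( $(tr -cd '\r' < "$f" | wc -c) ))
        if [ "$cr_count" -eq 0 ]; then
            echo "$f: no CR found, skipped"
            continue
        fi
        tmp=$(mktemp) || return 1
        # Write the cleaned copy to a temp file first, then copy it back,
        # so the original keeps its permissions.
        tr -d '\r' < "$f" > "$tmp" && cat "$tmp" > "$f"
        rm -f "$tmp"
        echo "$f: removed $cr_count CR characters"
    done
}
```

Calling it as strip_cr *.txt would process every matching file in the current directory. Note that it deletes all CR characters, not only those at the end of a line, exactly like the plain tr -d '\r' command above.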

tr (short for translate) is a nice little utility that does just that: it substitutes one character with another or deletes it (as in my example). It's available on pretty much any *nix system, so there's no need to install additional software.
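
For completeness, here is a sketch of the other two directions. The function names are made up for this example. Going from old Mac endings to Unix is a one-for-one substitution, so tr handles it; going from Unix to Windows turns one character into two, which tr cannot do, so this sketch falls back on awk.

```shell
# mac_to_unix (hypothetical name): classic Mac CR -> Unix LF.
# One character is swapped for another, so tr is enough.
mac_to_unix() {
    tr '\r' '\n'
}

# unix_to_win (hypothetical name): Unix LF -> Windows CR+LF.
# awk drops any CR already at the end of a line, then prints the line
# followed by CR+LF, so already-converted input is not doubled up.
unix_to_win() {
    awk '{ sub(/\r$/, ""); printf "%s\r\n", $0 }'
}
```

Both read standard input and write standard output, so they are used the same way as the tr command above, for example: unix_to_win < file-name > file-name-win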