Wednesday, May 2, 2012

Text & binary modes on ftp – CR LF ASCII codes


Why are there binary mode and text mode transfers?

What is the difference between binary and text mode transfers?

When do FTP data transfer modes matter, and when don't they?

Sometimes I see an unknown character like "^M" at the end of a line. What is this character? Why is it there?

First I have to talk about how data is recorded when we write to a text file, because the questions above are directly related to how text files are saved on disk. As you know, text is encoded with a character table called the ASCII code table.


When we press a character key on the keyboard while using a text editor, the editor records it as the equivalent value in the ASCII table. For example, when we press the character "a", the editor saves it as hex 0x61. This is the same on all operating systems.
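
The mapping between a character and its ASCII code can be checked with a few lines of Python. This is only a small sketch; the choice of Python is mine, any language would do.

```python
# Look up the ASCII code of a character, and go back from code to character.
code = ord("a")              # numeric code of the character "a"
print(hex(code))             # 0x61
print(chr(0x61))             # a

# The byte that ends up on disk for the text "a" is that same value.
stored = "a".encode("ascii")
print(stored.hex())          # 61
```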

BUT,

We see text as lines of characters, but all of the information is really made up of 1s and 0s. The text is not laid out as lines in the computer's memory. Text editors mark the end of each line with special characters. The problem is that when we press the ENTER key on the keyboard, text editors use different markers on different operating systems.

Let's continue with our example.
We want to write

"Hello\n"

"\n" stands for pressing the ENTER key on the keyboard. It is the newline escape sequence in the C programming language. The content of that string is stored on Windows, Unix/Linux and Mac operating systems as follows:

48 65 6C 6C 6F 0D 0A - On a Windows system
48 65 6C 6C 6F 0A    - On a Unix/Linux system
48 65 6C 6C 6F 0D    - On a classic Mac OS system (before Mac OS X)

0D and 0A are the ASCII codes for CR and LF. CR means "Carriage Return" and LF means "Line Feed". As you can see above, Microsoft Windows needs both to mark a new line, whereas Unix/Linux operating systems don't: LF alone is enough to mark a new line. The classic Mac style is the opposite of Unix/Linux and uses CR alone.
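
To see the three conventions as raw bytes, here is a minimal Python sketch. The dictionary `LINE_ENDINGS` and the helper `encode_line` are names I made up for illustration.

```python
# Line-ending bytes used by each family of operating systems.
LINE_ENDINGS = {
    "Windows": b"\r\n",        # CR LF = 0D 0A
    "Unix/Linux": b"\n",       # LF    = 0A
    "Classic Mac OS": b"\r",   # CR    = 0D
}

def encode_line(text, system):
    """Return the bytes of one line of ASCII text as stored on `system`."""
    return text.encode("ascii") + LINE_ENDINGS[system]

# Print a hex dump of "Hello" plus the line ending for each system.
for system in LINE_ENDINGS:
    raw = encode_line("Hello", system)
    print(raw.hex(" ").upper(), "-", system)
```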

The standard ASCII table defines 128 characters (codes 0 to 127). Extended 8-bit tables add another 128 codes, for 256 in total.



For exactly this reason, the FTP protocol supports two different transfer modes.

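Binary mode transfers data between the FTP client and the FTP server byte for byte, without any modification. If the client and server run different types of operating systems and the file being sent is a text file, the target system may not recognise the line endings. This is where the "^M" comes from: a Windows text file copied unchanged to Unix keeps its CR characters, and many Unix editors display each one as "^M".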

Text mode (called ASCII mode in most FTP clients) takes the difference between the source and target systems into account and converts the line endings to the format the receiving side expects.
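
The conversion can be imitated in a few lines of Python. This is a rough sketch and not the code of any real FTP client or server. It assumes the usual FTP convention that text travels over the wire with CR LF line endings, and the function names `to_network` and `to_local` are hypothetical.

```python
def to_network(data: bytes) -> bytes:
    """Sender side: turn any line ending (CR LF, CR or LF) into CR LF."""
    unified = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return unified.replace(b"\n", b"\r\n")

def to_local(data: bytes, local_newline: bytes) -> bytes:
    """Receiver side: turn CR LF into the receiver's own line ending."""
    return data.replace(b"\r\n", local_newline)

# A Unix text file sent in text mode to a Windows machine.
unix_file = b"Hello\nWorld\n"
on_the_wire = to_network(unix_file)
windows_file = to_local(on_the_wire, b"\r\n")
print(windows_file)   # b'Hello\r\nWorld\r\n'
```

In binary mode neither step happens, so the receiver gets exactly the bytes of `unix_file`.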

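So, if your system and the target FTP server are of different types and your data is text, you have to select text transfer mode in your FTP client to ensure the files arrive in the correct form. If both systems use the same line endings, the mode makes no difference for text files. For non-text files (images, archives, executables), always use binary mode, because line-ending conversion would corrupt them.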
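
As an illustration, here is how the choice could look with Python's standard `ftplib` module, where `storlines` uploads in text (ASCII) mode and `storbinary` uploads in binary mode. This is a sketch under assumptions: the list of text extensions, the helper names, the host name and the credentials are all made up.

```python
from ftplib import FTP
from pathlib import Path

# Assumption: these extensions are treated as text, everything else as binary.
TEXT_SUFFIXES = {".txt", ".csv", ".html", ".sh"}

def is_text_file(path):
    """Guess from the file extension whether a file holds text."""
    return Path(path).suffix.lower() in TEXT_SUFFIXES

def upload(ftp, path):
    """Upload `path` in text mode for text files, binary mode otherwise."""
    name = Path(path).name
    with open(path, "rb") as fp:
        if is_text_file(path):
            ftp.storlines(f"STOR {name}", fp)    # text (ASCII) transfer
            return "text"
        ftp.storbinary(f"STOR {name}", fp)       # binary transfer
        return "binary"

# Example usage (hypothetical host and credentials):
# with FTP("ftp.example.com") as ftp:
#     ftp.login("user", "password")
#     upload(ftp, "notes.txt")
```

Guessing from the extension is only a convenience. When in doubt, binary mode is the safer choice because it never changes the file.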
