Friday, January 4, 2013

Files – User view, part 1


In this chapter and the two that follow, we study the user's view of files. This part of the study should help a platform developer who is designing a file system to elicit requirements from users.
Files are, by definition, an abstraction mechanism. They provide a way to store information on disk and read it back later. This must be done in a way that shields users from differences in the type of hardware and in how the data is stored and organised.
Let us take the different characteristics of a file and examine them one by one.
First is the naming of the file.
Naming is a requirement for the file system designer, who needs a way to unambiguously resolve a file from an identity, and at the user level the best identity is something readable and friendly. Thus the important design considerations are the attributes of the file name, such as:
1. Length of the file name: This attribute helps the designer reserve an appropriately sized placeholder in the platform's data structures.
2. Character set allowed in the file name: This attribute determines how the designer parses the file name while looking it up on disk. A limited set means less parsing effort but a poorer user experience, and vice versa.
3. Case sensitivity: This attribute determines how the designer codes the logic for resolving file names. MS-DOS ignores case, while UNIX-like systems are case sensitive.
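The three attributes above can be sketched as a small naming policy. This is only an illustration: the class `NamePolicy`, the two example policies, and the specific limits and character set are assumptions made for this sketch, not the rules of any real file system.

```python
from dataclasses import dataclass
import string


@dataclass(frozen=True)
class NamePolicy:
    """Hypothetical naming rules a file system designer might fix up front."""
    max_length: int            # attribute 1: length of the file name
    allowed_chars: frozenset   # attribute 2: character set allowed
    case_sensitive: bool       # attribute 3: case sensitivity

    def is_valid(self, name: str) -> bool:
        # A name must be non-empty, short enough, and use only allowed characters.
        return (0 < len(name) <= self.max_length
                and all(c in self.allowed_chars for c in name))

    def key(self, name: str) -> str:
        # The lookup key used to resolve a name; case is folded if it is ignored.
        return name if self.case_sensitive else name.upper()

    def same_file(self, a: str, b: str) -> bool:
        return self.key(a) == self.key(b)


# Illustrative values only (assumptions for this sketch).
ALLOWED = frozenset(string.ascii_letters + string.digits + "._-")
dos_like = NamePolicy(max_length=12, allowed_chars=ALLOWED, case_sensitive=False)
unix_like = NamePolicy(max_length=255, allowed_chars=ALLOWED, case_sensitive=True)
```

With these two policies, `dos_like.same_file("README.TXT", "readme.txt")` is true while the same call on `unix_like` is false, which is the difference in name resolution described in point 3.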
Moving on, we dissect the file name into two parts separated by '.' (period), as <filename>.<extension>. In practice the <filename> can be any string of characters supported by the file system, and the <extension> is again a string of characters, usually short (about five characters at most).
Examples are
Sherlock.mp3, Watson.mpg, MrsHudson.txt.
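As a rough sketch of this split, Python's standard `os.path.splitext` separates a name at its last period. The helper name `split_name` is made up for this example.

```python
import os.path


def split_name(name: str) -> tuple:
    """Split 'filename.extension' at the last period (hypothetical helper)."""
    stem, ext = os.path.splitext(name)
    # splitext keeps the leading '.', so strip it to get the bare extension.
    return stem, ext.lstrip(".")


for example in ["Sherlock.mp3", "Watson.mpg", "README"]:
    print(split_name(example))
```

A name without a period, such as `README`, simply yields an empty extension, which is a reminder that the extension is a convention and not something the name must have.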
The extensions are primarily for the user to interpret, to know what sort of files they are. It would be wrong to speak of “types of files” in this context, because from the platform's perspective the “type” of a file has a particular meaning, and the files mentioned above all fall under the type called “regular files”. Coming back to the user view, the extension means little to the file system, while it is important to user-level applications.
On a system with a GUI there is a mechanism by which the user can register a particular file extension with a particular program. This is how GUI launchers open files with the appropriate programs. In command-line mode we invoke the program and supply the file as a parameter, so there is no need for the extension to be present. In GUI mode we select a data file and the appropriate program is launched automatically. Extensions are there to help users and to ensure that the user knows what he is doing. Windows is file-extension aware, while UNIX-like systems generally are not.
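The registration mechanism can be pictured as a simple lookup table from extension to program. The sketch below is an assumption-laden toy: the table `REGISTRY`, the program names in it, and the function `program_for` are all invented for illustration and do not correspond to any real desktop environment's API.

```python
import os.path

# Hypothetical registrations a user might have made (names are invented).
REGISTRY = {
    ".mp3": "music-player",
    ".mpg": "video-player",
    ".txt": "text-editor",
}


def program_for(filename: str, registry: dict = REGISTRY):
    """Return the program registered for the file's extension, or None."""
    _, ext = os.path.splitext(filename)
    # Fold case so that 'Watson.MPG' and 'Watson.mpg' resolve the same way.
    return registry.get(ext.lower())
```

A file with no registered extension returns `None`, which is the situation where a launcher would have to ask the user which program to use.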
However, some application programs mandate extensions. This is especially so in software such as compilers, where the extension helps the compiler stages unambiguously pick up files for preprocessing, compiling and linking, and in build tools such as Make.
Next we see how the files are really structured.
Strictly speaking, the file system need not worry about the contents of a file, for the contents matter only to the user-level programs that operate on it. Hence, modern file systems store the data raw, as written, and return the same bytes to the user application when the file is read. This method provides maximum flexibility, and most operating systems work this way.
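A minimal way to see this from user space is to write arbitrary bytes to a file and read them back. The function name `round_trip` is made up for this sketch, and it uses a temporary file so nothing is left behind.

```python
import os
import tempfile


def round_trip(payload: bytes) -> bytes:
    """Write raw bytes to a temporary file and return what is read back."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:   # binary mode: no interpretation of contents
            f.write(payload)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


data = b"\x00\xff any bytes at all \n"
print(round_trip(data) == data)
```

The file system neither knows nor cares whether the payload is text, audio or anything else; giving the bytes meaning is left to the application.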
However, there are special cases where files are written and read in units called records. This is of historical significance: early mainframes used 80-column punch cards. For the sake of our design we consider unstructured byte streams.
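For contrast, a record-oriented layout can be imitated on top of a byte stream. The sketch below assumes fixed 80-byte records padded with spaces, echoing the punch-card width; the helper names `to_records` and `read_record` are invented, and this is not how any particular mainframe file system was implemented.

```python
RECORD_SIZE = 80  # one record per 80-column card (assumption for this sketch)


def to_records(lines: list) -> bytes:
    """Pack text lines into fixed-size, space-padded records."""
    records = []
    for line in lines:
        data = line.encode("ascii")
        if len(data) > RECORD_SIZE:
            raise ValueError("line does not fit in one record")
        records.append(data.ljust(RECORD_SIZE, b" "))
    return b"".join(records)


def read_record(blob: bytes, index: int) -> str:
    """Fetch record number `index` directly by computing its offset."""
    start = index * RECORD_SIZE
    record = blob[start:start + RECORD_SIZE]
    if len(record) != RECORD_SIZE:
        raise IndexError("no such record")
    return record.rstrip(b" ").decode("ascii")
```

Because every record has the same size, the n-th record is found by arithmetic alone, which is the main convenience such a structure offers over an unstructured stream.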
Moving on we have the types of files.
As mentioned earlier, the “type” of a file means something particular in the file system domain. I am presenting a superset of file types, and this set maps to the types of files in the Linux file system. The list, however, need not be the same in every operating system. The idea of this column is to familiarise ourselves with the taxonomy of files.
While it is reasonably safe to suppose that everything you encounter on a Linux system is a file, there are some exceptions.

1. Regular files: These are regular user files, such as music, text, video, et cetera.

2. Directories: files that are lists of other files.

3. Special files: the mechanism used for input and output. Most special files are in /dev, we will discuss them later.

4. Links: a system to make a file or directory visible in multiple parts of the system's file tree. We will talk about links in detail.

5. (Domain) sockets: a special file type, similar to TCP/IP sockets, providing inter-process networking protected by the file system's access control.

6. Named pipes: act more or less like sockets and form a way for processes to communicate with each other, without using network socket semantics.
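On a UNIX-like system these types can be told apart from the mode bits that `lstat` returns. The sketch below uses Python's standard `os` and `stat` modules; the function name `file_kind` and the labels it returns are choices made for this example. Note that only symbolic links are detectable this way, since a hard link is indistinguishable from the file it names.

```python
import os
import stat


def file_kind(path: str) -> str:
    """Classify a path into one of the file types listed above."""
    mode = os.lstat(path).st_mode   # lstat: do not follow symbolic links
    if stat.S_ISREG(mode):
        return "regular file"
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISCHR(mode) or stat.S_ISBLK(mode):
        return "special file"       # character or block device, e.g. under /dev
    if stat.S_ISLNK(mode):
        return "symbolic link"
    if stat.S_ISSOCK(mode):
        return "socket"
    if stat.S_ISFIFO(mode):
        return "named pipe"
    return "unknown"
```

For example, `file_kind("/dev/null")` would be expected to report a special file on a typical Linux system.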
