sort

You're familiar with the basic operation of sort:
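As a reminder of that basic operation, here is a minimal sketch. The names file and its contents are made up for illustration, and LC_ALL=C is set so that the result follows plain ASCII ordering regardless of your locale.

```shell
# Hypothetical sample data in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' Tony Emanuel Lucy Ralph Fred > names

# LC_ALL=C forces byte-wise (ASCII) comparison so the order is repeatable.
sorted=$(LC_ALL=C sort names)
printf '%s\n' "$sorted"
```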
By default, sort takes each line of the specified input file and sorts it into ascending order.

Special characters are sorted according to the internal encoding of the characters. For example, on a machine that encodes characters in ASCII, the space character is represented internally as the number 32, and the double quote as the number 34. This means that the former would be sorted before the latter. Note that the sorting order is implementation dependent, so although you are generally assured that sort will perform as expected on alphabetic input, the ordering of numbers, punctuation, and special characters is not always guaranteed. We will assume we're working with the ASCII character set in all our examples here.

sort has many options that provide more flexibility in performing your sort. We'll just describe a few of the options here.

The -u Option

The -u option tells sort to eliminate duplicate lines from the output.
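A short sketch of -u in action follows. The names file is hypothetical sample data in which Tony appears twice, and LC_ALL=C is an assumption added to pin the ordering to ASCII.

```shell
# Hypothetical sample data: "Tony" is listed twice.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' Tony Emanuel Lucy Ralph Fred Tony > names

# Plain sort keeps both copies of the duplicate line.
with_dups=$(LC_ALL=C sort names)

# sort -u writes only one copy of each distinct line.
unique=$(LC_ALL=C sort -u names)

printf '%s\n' "$unique"
```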
Here you see that the duplicate line that contained Tony was eliminated from the output.

The -r Option

Use the -r option to reverse the order of the sort:
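For instance, with a hypothetical names file (the contents are invented for this sketch, and LC_ALL=C is added to keep the ordering ASCII-based), -r produces descending order:

```shell
# Hypothetical sample data.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' Tony Emanuel Lucy Ralph Fred > names

# -r reverses the comparison, so the output runs from Z toward A.
reversed=$(LC_ALL=C sort -r names)
printf '%s\n' "$reversed"
```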
The -o Option

By default, sort writes the sorted data to standard output. To have it go into a file, you can use output redirection:
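A sketch of the redirection form, assuming a made-up names file and using sorted_names as the output file name:

```shell
# Hypothetical sample data.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' Tony Emanuel Lucy Ralph Fred > names

# The shell, not sort, creates sorted_names and connects it to standard output.
LC_ALL=C sort names > sorted_names
cat sorted_names
```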
Alternatively, you can use the -o option to specify the output file. Simply list the name of the output file right after the -o:
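In sketch form, again with invented contents for the names file:

```shell
# Hypothetical sample data.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' Tony Emanuel Lucy Ralph Fred > names

# -o is followed immediately by the output file; the input file comes last.
LC_ALL=C sort -o sorted_names names
cat sorted_names
```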
This sorts names and writes the results to sorted_names.

Frequently, you want to sort the lines in a file and have the sorted data replace the original. Typing

    sort names > names

won't work; it ends up wiping out the names file, because the shell empties the output file before sort gets a chance to read it. However, with the -o option, it is okay to specify the same name for the output file as the input file:
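The difference can be seen in a small experiment. The file contents are invented, and scratch is a throwaway copy (a name chosen for this sketch) so that the failing redirection does not destroy the data we still need.

```shell
# Hypothetical sample data, plus a throwaway copy for the failing case.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' Tony Emanuel Lucy Ralph Fred > names
cp names scratch

# The shell truncates scratch before sort starts, so sort reads an empty file.
LC_ALL=C sort scratch > scratch
wc -c < scratch        # byte count of the now-empty file

# With -o, the output file may safely be the same as the input file.
LC_ALL=C sort -o names names
cat names
```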
The -n Option

Suppose that you have a file containing pairs of (x, y) data points as shown:
Suppose that you want to feed this data into a plotting program called plotdata, but that the program requires that the incoming data pairs be sorted in increasing value of x (the first value on each line). The -n option to sort specifies that the first field on the line is to be considered a number, and the data is to be sorted arithmetically. Compare the output of sort used first without the -n option and then with it:
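Here is one way to run that comparison. The data file and its values are hypothetical stand-ins for the (x, y) pairs, and LC_ALL=C is added so that the non-numeric sort follows ASCII order.

```shell
# Hypothetical (x, y) pairs, one pair per line.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' '5 27' '2 12' '23 2' '-5 11' '11 6' > data

# Without -n the lines are compared character by character,
# so "11 6" lands before "2 12".
as_text=$(LC_ALL=C sort data)

# With -n the leading number on each line is compared by value.
as_numbers=$(LC_ALL=C sort -n data)

printf 'without -n:\n%s\n\nwith -n:\n%s\n' "$as_text" "$as_numbers"
```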
Skipping Fields

If you had to sort your data file by the y value (that is, the second number in each line), you could tell sort to skip past the first number on the line by using the option +1n instead of -n. The +1 says to skip the first field. Similarly, +5n would mean to skip the first five fields on each line and then sort the data numerically. Fields are delimited by space or tab characters by default. If a different delimiter is to be used, the -t option must be used.
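The +1n notation is the historical syntax, and many current versions of sort reject it in favor of the -k option, which names the field to start on rather than the number of fields to skip. The sketch below therefore uses -k 2n as the equivalent of +1n; the data file and its values are hypothetical.

```shell
# Hypothetical (x, y) pairs, one pair per line.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' '5 27' '2 12' '23 2' '-5 11' '11 6' > data

# -k 2n: start the sort key at field 2 and compare it numerically.
# This is the modern spelling of the older +1n (skip one field).
by_y=$(LC_ALL=C sort -k 2n data)
printf '%s\n' "$by_y"
```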
The -t Option

As mentioned, if you skip over fields, sort assumes that the fields being skipped are delimited by space or tab characters. The -t option says otherwise. In this case, the character that follows the -t is taken as the delimiter character. Look at our sample password file again:
If you wanted to sort this file by username (the first field on each line), you could just issue the sort command on the file with no options.
To sort the file instead by the third colon-delimited field (which contains what is known as your user id), you would want an arithmetic sort, skipping the first two fields (+2n), specifying the colon character as the field delimiter (-t:):
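A sketch of that sort follows. The passwd file here is a small invented sample in the usual colon-delimited layout, not the real system file, and -k 3n is used as the current equivalent of +2n, since many versions of sort no longer accept the + form.

```shell
# Hypothetical password-style entries: name:password:uid:gid:comment:home:shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
cat > passwd <<'EOF'
pat:x:300:300::/users/pat:/bin/sh
root:x:0:0:Operator:/:/bin/sh
steve:x:203:100::/users/steve:/bin/sh
george:x:75:75::/users/george:/bin/sh
EOF

# Sorting whole lines orders the file by username, the first field.
by_name=$(LC_ALL=C sort passwd)

# -t: makes the colon the field delimiter; -k 3n sorts numerically
# starting at the third field (the user id).
by_uid=$(LC_ALL=C sort -t: -k 3n passwd)

printf '%s\n' "$by_uid"
```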
Here we've emboldened the third field of each line so that you can easily verify that the file was sorted correctly by user id.

Other Options

Other options to sort enable you to skip characters within a field, specify the field to end the sort on, merge sorted input files, and sort in "dictionary order" (only letters, numbers, and spaces are used for the comparison). For more details on these options, look under sort in your Unix User's Manual.