Example: Using Based Pointers
Previous programs have shown how to sort files in various situations. The object, of course, is to illustrate different ways to manage memory, not to discuss sorting techniques. Program 5-1 uses a binary search tree that is destroyed after each sort, and Program 5-4 sorts an array of fixed-size records in mapped memory. Appendix C shows performance results for different implementations, including the next one in Program 5-5.
Suppose that it is necessary to maintain a permanent index file representing the sorted keys of the original file. The apparent solution is to map a file that contains the permanent index in a search tree or sorted key form to memory. Unfortunately, there is a major difficulty with this solution. All pointers in the tree, as stored in the file, are relative to the address returned by MapViewOfFile. The next time the program runs and maps the file, the pointers will be useless.
Program 5-5, together with Program 5-6, solves this problem, which is characteristic of any mapped data structure that uses pointers. The solution uses the _based keyword available with Microsoft C. An alternative is to map the file to an array and use indexing to access records in the mapped files.
The program is written as yet another version of the sort command, this time called sortMM. There are enough new features, however, to make it interesting.
The records are of varying lengths. The program uses the first field as a key but detects its length. There are two file mappings. One mapping is for the original file, and the other is for the file containing the sorted keys. The second file is the index file, and each of its records contains a key and a pointer (base address) in the original file. qsort sorts the key file, much as in Program 5-4. The index file is saved and can be used later, and there is an option (-I) that bypasses the sort and uses an existing index file. The index file can also be used to perform a fast key file search by performing a binary search (using, perhaps, the C library bsearch function) on the index file.
Figure 5-5 shows the relationship of the index file to the file to be sorted. Program 5-5, sortMM, is the main program that sets up the file mapping, sorts the index file, and displays the results. It calls a function, CreateIndexFile, which is shown in Program 5-6.
Program 5-5. sortMM: Based Pointers in an Index File
/* Chapter 5. sortMM command. Memory Mapped sorting -- one file only. Options: -r Sort in reverse order. -I Use existing index file to produce sorted file. */
#include "EvryThng.h" int KeyCompare (LPCTSTR , LPCTSTR); DWORD CreateIndexFile (DWORD, LPCTSTR, LPTSTR); DWORD KStart, KSize; /* Key start position & size (TCHAR). */ BOOL Revrs; int _tmain (int argc, LPTSTR argv []) { HANDLE hInFile, hInMap; /* Input file handles. */ HANDLE hXFile, hXMap; /* Index file handles. */ HANDLE hStdOut = GetStdHandle (STD_OUTPUT_HANDLE); BOOL IdxExists; DWORD FsIn, FsX, RSize, iKey, nWrite, *pSizes; LPTSTR pInFile = NULL; LPBYTE pXFile = NULL, pX; TCHAR _based (pInFile) *pIn; TCHAR IdxFlNam [MAX_PATH], ChNewLine = TNEWLINE; int FlIdx = Options (argc, argv, _T ("rI"), &Revrs, &IdxExists, NULL);
/* Step 1: Open and map the input file. */ hInFile = CreateFile (argv [FlIdx], GENERIC_READ | GENERIC_WRITE, 0, NULL, OPEN_EXISTING, 0, NULL); hInMap = CreateFileMapping (hInFile, NULL, PAGE_READWRITE, 0, 0, NULL); pInFile = MapViewOfFile (hInMap, FILE_MAP_ALL_ACCESS, 0, 0, 0); FsIn = GetFileSize (hInFile, NULL);
/* Steps 2 and 3: Create the index file name. */ _stprintf (IdxFlNam, _T ("%s%s"), argv [FlIdx], _T (".idx")); if (!IdxExists) RSize = CreateIndexFile (FsIn, IdxFlNam, pInFile);
/* Step 4: Map the index file. */ hXFile = CreateFile (IdxFlNam, GENERIC_READ | GENERIC_WRITE, 0, NULL, OPEN_EXISTING, 0, NULL); hXMap = CreateFileMapping (hXFile, NULL, PAGE_READWRITE, 0, 0, NULL); pXFile = MapViewOfFile (hXMap, FILE_MAP_ALL_ACCESS, 0, 0, 0); FsX = GetFileSize (hXFile, NULL); pSizes = (LPDWORD) pXFile; /* Size fields in .idx file. */ KSize = *pSizes; /* Key size */ KStart = *(pSizes + 1); /* Start position of key in record. */ FsX -= 2 * sizeof (DWORD);
/* Step 5: Sort the index file with qsort. */ if (!IdxExists) qsort (pXFile + 2 * sizeof (DWORD), FsX / RSize, RSize, KeyCompare);
/* Step 6: Output the input file in sorted order. */ pX = pXFile + 2 * sizeof (DWORD) + RSize - sizeof (LPTSTR); for (iKey = 0; iKey < FsX / RSize; iKey++) { WriteFile (hStdOut, &ChNewLine, TSIZE, &nWrite, NULL); /* The cast on pX is necessary! */ pIn = (TCHAR _based (pInFile)*) *(LPDWORD) pX; while ((*pIn != CR || *(pIn + 1) != LF) && (DWORD) pIn < FsIn) { WriteFile (hStdOut, pIn, TSIZE, &nWrite, NULL); pIn++; } pX += RSize; } UnmapViewOfFile (pInFile); CloseHandle (hInMap); CloseHandle (hInFile); UnmapViewOfFile (pXFile); CloseHandle (hXMap); CloseHandle (hXFile); return 0; }
[View full size image]
Program 5-6 is the CreateIndexFile function, which creates the index file. It initially scans the input file to determine the key length from the first record.
Subsequently, it must scan the input file to find the bound of each varying-length record to set up the structure shown in Figure 5-5.
Program 5-6. sortMM: Creating the Index File
DWORD CreateIndexFile (DWORD FsIn, LPCTSTR IdxFlNam, LPTSTR pInFile) { HANDLE hXFile; TCHAR _based (pInFile) *pInScan = 0; DWORD nWrite;
/* Step 2a: Create an index file. Do not map it yet. */ hXFile = CreateFile (IdxFlNam, GENERIC_READ | GENERIC_WRITE, FILE_SHARE_READ, NULL, CREATE_ALWAYS, 0, NULL);
/* Step 2b: Get first key & determine key size/start. Skip white space and get key length. */ KStart = (DWORD) pInScan; while (*pInScan != TSPACE && *pInScan != TAB) pInScan++; /* Find the first key field. */ KSize = ((DWORD) pInScan - KStart) / TSIZE;
/* Step 3: Scan the complete file, writing keys and record pointers to the key file. */ WriteFile (hXFile, &KSize, sizeof (DWORD), &nWrite, NULL); WriteFile (hXFile, &KStart, sizeof (DWORD), &nWrite, NULL); pInScan = 0; while ((DWORD) pInScan < FsIn) { WriteFile (hXFile, pInScan + KStart, KSize * TSIZE, &nWrite, NULL); WriteFile (hXFile, &pInScan, sizeof (LPTSTR), &nWrite, NULL); while ((DWORD) pInScan < FsIn && ((*pInScan != CR) || (*(pInScan + 1) != LF))) { pInScan++; /* Skip to end of line. */ } pInScan += 2; /* Skip past CR, LF. */ } CloseHandle (hXFile); /* Size of an individual record. */ return KSize * TSIZE + sizeof (LPTSTR); }
|
No comments:
Post a Comment