Wednesday, October 28, 2009

Process Intensity Versus Data Intensity





Closely tied to the concept of interactivity is a concept that I describe with the phrase process intensity versus data intensity. Process intensity is the degree to which a program emphasizes processes instead of data. All programs use a mix of process and data. Process is reflected in algorithms, equations, and branches. Data is reflected in data tables, images, sounds, and text. A process-intensive program spends most of its time crunching numbers; a data-intensive program spends most of its time moving bytes around.
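
To make the distinction concrete, here is a minimal sketch in Python (the function names and the plasma formula are my own illustrative inventions, not anything from the original text). The first routine is data-intensive: it spends its time moving stored bytes. The second is process-intensive: every output value is computed from an equation.

    import math

    # Data-intensive: the output was authored in advance; the program
    # merely copies the stored bytes to the screen.
    def show_title_screen(framebuffer, stored_image):
        framebuffer[:] = stored_image  # one big copy, almost no "crunch"

    # Process-intensive: the output is computed on demand; every pixel
    # is the result of arithmetic rather than a lookup.
    def show_plasma(framebuffer, width, height, t):
        for y in range(height):
            for x in range(width):
                framebuffer[y * width + x] = int(
                    127 * (1 + math.sin(x / 9.0 + t) * math.cos(y / 7.0 - t)))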


The difference between data and process constitutes a central construct around which the universe is built, and it shows up in every field of human intellectual inquiry. In language, it shows up as nouns and verbs. In economics, it's goods and services. In physics, it's particles and waves. In military science, it's assets and operations. And in computer science, it's bits and cycles. Process is abstract where data is tangible. Data is direct, where process is indirect. The difference between data and process is the difference between numbers and equations, between facts and principles, between events and forces, between knowledge and ideas.


Processing data is the very essence of what a computer does. There are many technologies that can store data: magnetic tape, punched cards, punched tape, paper and ink, microfilm, microfiche, and optical disk, to name just a few. But there is only one technology that can process data: the computer. This is its single source of superiority over the other technologies. Using the computer in a data-intensive mode wastes its greatest strength.


Because process intensity is so close to the essence of "computeriness," it provides us with a useful criterion for evaluating the value of any piece of software: the ratio of operations per datum, which I call the crunch per bit ratio. It is admittedly a rough quantification of the desirability of process intensity. By an operation I mean any process applied to a datum, such as an arithmetic operation, a logical operation, or a simple Boolean inclusion or exclusion. A datum in this scheme can be a bit, a byte, a character, or a floating-point number; it is any small piece of information.
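
As a back-of-envelope illustration of the ratio (the figures below are invented for the example, not taken from the text), compare a slideshow program with a physics simulation:

    # Crude crunch-per-bit estimate: operations applied per datum touched.
    def crunch_per_bit(operations, data_touched):
        return operations / data_touched

    # A slideshow copies a megabyte of image data with almost no computation;
    # a simulation performs millions of arithmetic operations on a small state.
    slideshow  = crunch_per_bit(operations=1_000,     data_touched=1_000_000)
    simulation = crunch_per_bit(operations=5_000_000, data_touched=10_000)

    print(slideshow)   # 0.001
    print(simulation)  # 500.0 -- far more process-intensive by this measure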


The "process intensity principle" is grand in implications and global in sweep. Like any such all-encompassing notion, it is subject to a variety of minor-league objections and compromising truths.


Objection 1: Substitutability


Experienced programmers know that data can often be substituted for process. Many algorithms can be replaced by tables of data. This is a common trick for expending RAM to speed up processing. Because of this, many programmers see process and data as interchangeable. This misconception arises from applying low-level considerations to the higher levels of software design. Sure, you can cook up a table of sine values with little trouble, but can you imagine a table specifying every possible behavioral result in a complex game such as Balance of Power?
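
The sine-table trick mentioned above might look like this in Python (a sketch; the table size and names are arbitrary). It works precisely because the function's entire behavior fits in a few hundred entries, which is exactly what no table of game behavior can do:

    import math

    # Trade RAM for cycles: precompute sine once, then answer by lookup.
    TABLE_SIZE = 256
    SINE_TABLE = [math.sin(2 * math.pi * i / TABLE_SIZE)
                  for i in range(TABLE_SIZE)]

    def fast_sin(angle):
        # Coarse nearest-entry approximation of sin(angle).
        index = int(angle / (2 * math.pi) * TABLE_SIZE) % TABLE_SIZE
        return SINE_TABLE[index]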


Objection 2: Greater Data Capacity


A more serious challenge comes from the evolution of personal computing technology. In twenty years, we have moved from an 8-bit 6502 running at 1 MHz to 64-bit CPUs running at 1 GHz. This represents about a 10,000-fold increase in processing power. At the same time, though, RAM sizes have increased from a typical 4 kilobytes to perhaps 256 megabytes, a 64,000-fold increase. Mass storage has increased from cassettes holding, say, 4 kilobytes, to hard disks holding 20 gigabytes, a 5,000,000-fold increase. Thus, data storage capacity is increasing faster than processing capacity. Under these circumstances, we would be foolish not to shift some of our emphasis to data intensity. But this consideration, while perfectly valid, is secondary in nature; it is a matter of adjustment rather than fundamental stance.
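
A quick check of the paragraph's arithmetic, using its own figures with binary-prefix conversions (treating word width as a straight multiplier is admittedly rough):

    # CPU: 8 bits at 1 MHz -> 64 bits at 1 GHz (roughly 8,000x; quoted ~10,000x)
    cpu_growth = (64 * 1_000_000_000) / (8 * 1_000_000)

    # RAM: 4 KB -> 256 MB (65,536x; quoted ~64,000x)
    ram_growth = (256 * 2**20) / (4 * 2**10)

    # Storage: 4 KB cassette -> 20 GB disk (about 5.2 million x; quoted ~5,000,000x)
    storage_growth = (20 * 2**30) / (4 * 2**10)

    # RAM outgrew processing power by roughly a factor of eight, and mass
    # storage by a factor of several hundred, which is the paragraph's point.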


Objection 3: "Balance"


Then there is the argument that process and data are both necessary to good computing. Proponents of this school note that an algorithm without data to crunch is useless; they therefore claim that a good program establishes a balance between process and data. While the argument is fundamentally sound, it does not suggest anything about the proper mix between process and data. It merely establishes that some small amount of data is necessary. It does not in any way suggest that data deserves emphasis equal to that accorded to process.


The importance of process intensity does not mean that data has no intrinsic value. Data endows a game with useful color and texture. An excessively process-intensive game will be so devoid of data that it will take on an almost mathematical feel. Consider, for example, this sequence of games: checkers, chess, Diplomacy, Balance of Power. As we move along this sequence, the amount of data about the world integrated into the game increases. Checkers is very pure, very clean; chess adds a little more data in the different capabilities of the pieces. Diplomacy brings in more data about the nature of military power and the geographical relationships in Europe. Balance of Power throws in a mountain of data about the world. Even though the absolute amount of data increases, the crunch per bit ratio remains high (perhaps it even increases) across this sequence. My point here is that data is not intrinsically evil; the amount of data can be increased if the amount of process is concomitantly raised.


A Hidden Reason: Difficulty of Abstraction


The most powerful resistance to process intensity, though, is unstated. It is a mental laziness that afflicts all of us. Process intensity is so very hard to implement. Data intensity is easy to put into a program. Just get that artwork into a file and read it onto the screen; store that sound effect on the disk and pump it out to the speaker. There's instant gratification in these data-intensive approaches. It looks and sounds great immediately. Process intensity requires all those hours mucking around with equations. Because it's so indirect, you're never certain how it will behave. The results always look so primitive next to the data-intensive stuff. So we follow the path of least resistance right down to data intensity.



LESSON 12


Eschew data-intensive designs; aspire to process intensity.



Process intensity is a powerful concept for designing all kinds of software, not just games. Because it is so abstract, it is difficult to understand and apply, and numerous exceptions and compromising considerations arise when putting it into practice. Nevertheless, it remains a useful theoretical tool in game design.





