The "Great Debate" is a very emotional exchange that has been running continuously since the late '70s. On some newsgroup somewhere, you will find a thread discussing this topic, although you will have the best luck looking in the comp.lang.asm.x86, comp.lang.c, alt.assembly, or comp.lang.c++ newsgroups. Of course, almost anything you read in these newsgroups is rubbish, no matter which side of the argument the author is championing. Because this debate has been raging for (what seems like) forever, it is clear there is no easy answer, especially not one that someone can make up off the top of their head (which describes about 99.9% of all postings to a Usenet newsgroup). This page contains a series of essays that discuss the advances of compilers and machine architectures in an attempt to answer the above question.
Although I intend to write a large number of these essays, I encourage others, even those with opposing viewpoints, to contribute to this exchange. If you would like to contribute a "well thought out, non-emotional essay" to this series, please send your contribution (HTML is best, ASCII text is second best) to
debate@webster.ucr.edu
Okay, it's time to back off from what is possible and start talking about things that are realistic. In particular, if one is willing to accept that compilers will never produce code that is better than (or even equal to) what a human can write, can they produce code that is good enough? In other words, is it cost effective to use assembly language these days?
Quite frankly, there are only a few reasons for using assembly language in a modern program:
Some might argue that we've long since passed the point where case (1) above applies. After all, memory is cheap and CPU cycles are cheap. Why worry about them? Well, this is an example of having those "UNIX blinders" firmly in place. There are lots of computer systems where memory (both RAM and ROM) is very tight. Look at any microcontroller chip, for example. Likewise, these microcontrollers run at 1MHz, 2MHz, and similar speeds (some use clock dividers, so there may be 12MHz or better going into the chip, but the fastest instruction may require six clock cycles). A typical microcontroller has 128 bytes of RAM (that's bytes, not KBytes or MBytes) and executes instructions at less than 1 MIPS. Yes, you can buy better parts; however, if you're planning on building a million versions of some Barbie doll, the difference between a $0.50 microcontroller and a $5.00 microcontroller can make the difference between the success and failure of your product (since the price of the components affects the retail price of the doll).
Well, nothing can be done one way or another about point (2) above.
Some people may find point three hard to believe. They firmly believe that assembly language is always harder to read and harder to use than a HLL. However, there is a (small) class of problems for which assembly language is much better suited than a HLL. For example, try rotating the bits in a character sometime. It is much easier to accomplish this in assembly language (assuming the presence of a common rotate instruction) than in a language like C.
Perhaps point four is a sign of mental illness. However, I have found certain projects in assembly language to be much more enjoyable than the same project in a HLL. This is certainly not to imply that assembly language is always more fun to use. Quite frankly, I find languages like Snobol4 and Delphi a lot more fun to use than assembly on many projects; however, assembly language is the more interesting language to use on several projects (ask me about my implementations of TIC-TAC-TOE sometime).
The important point to note here is that I am not claiming that assembly language is always the appropriate language to use. Such an argument is obviously flawed for a large variety of reasons. I simply want to point out that there are some very real reasons for deciding to use assembly language within some projects.
Now to the question at hand: "Does it make economic sense to use assembly language in a project?" Mostly, the answer is no. Unless your project falls into one of the first three categories above, assembly language is probably the wrong choice. Let's discuss some of the common reasons:
All of these issues lead to one inescapable conclusion: it costs more to develop code in assembly language than it does in a language like C++. Likewise, it costs more to maintain an assembly language program (probably in the same proportion to the development costs). Another conclusion one will probably reach is that it will take you longer to develop the code using assembly language (hence the greater cost). On the other hand, assuming you're using competent programmers, you will probably get a better product. It will be faster, use fewer machine resources, and will probably contain less "gold-plating" that often results in additional support problems.
A common argument against the above statement is that "your users will have to deal with more bugs because the code was written in assembly language." However, the extra testing and debugging necessary has already been factored into the statement "it takes longer and costs more to use assembly language." Hence the quality argument is a moot point.
Does this mean assembly is suitable for any project where the code might use too many system resources? Of course not. Once again, if portability is a primary concern, your development costs will increase by about 40% of the original cost for each platform to which you port your 100% assembly application. However, if you, like 80% of the world's software developers, write your code specifically for an Intel machine running Windows, the judicious use of assembly language at certain points is easily justified.
In the 50's and 60's computer resources were very expensive. Computers literally cost hundreds of dollars per hour to operate. At the time, a programmer typically made around $17,000/year (by the way, if that seems really low, keep in mind that it's probably equivalent to about a $35,000 annual salary in today's dollars). If a programmer got a program working in a month and then spent a second month working on it to double the speed, clearly such effort paid for itself in a small period of time. I was an undergraduate in the middle 70's, just at the end of this phase. I certainly remember the importance instructors and graders placed on writing optimal code.
In the middle to late 70's, however, all this began to change. The advent of the microcomputer all but totally eliminated this concept of charging for CPU time by the hour. For the cost of one hour's CPU time in 1976, I can now purchase a CPU that is about 5-10 times faster than that old IBM 360/50 that was charging me $150/CPU hour to use. All of a sudden, managers discovered that the programmer's time was far more valuable than the computer's time. Today, if a programmer spends an extra month doubling the speed of his/her code, the bean counters in the front office get very upset because the code cost twice as much to write. This led to a complete rethinking of software economics, a school of thought that persists today.
Of course, programmers have done nothing to promote this view; NOT! Software engineers have been "programmed" with the concept that they are invaluable resources in the organization. Their time is valuable. They need support to ensure they are as productive as possible. And so on... So the programmer who would have written a fairly good program and then spent twice as long making it run twice as fast gets criticized for the effort. Soon, programmers who write good, solid code, using decent algorithms, discover that their peers who write sloppy "quick and dirty" code, but get it done in half the time, are getting all the accolades and pats on the back. Before long, everybody is in a mode where they are seeking the fastest solution, which is generally the first one that comes to mind. The first solution that comes to mind is generally sub-optimal and rather low quality. In particular, quick and dirty solutions typically require an excess of machine resources. Look no further than today's bloated applications to see the end result of this.
The salvation of the quick and dirty school of programming has been the computer architects. By continuously providing us with chips that run twice as fast every 18 months (Moore's law) and halving the cost of memory and CPUs in about that same time frame, users haven't really noticed how bad software design has gotten. If CPUs double in speed every 18 months and it takes about two years to complete a major application (written so quickly it runs at half the speed of a preceding application), the software engineers are still ahead because CPUs are running better than twice as fast. The users don't really care because they are getting slightly better performance than the previous generation of software and it didn't cost them any more. Therefore, there is very little motivation for software engineers to change their design practices. Any thought of optimization (in any language) is a distant memory.
Economically, the cost of the machine is insignificant compared to the software engineer's time. By the same token, it is possible to show that, in many cases, the user's time is more valuable than the software engineer's. "How is this?" you might ask, "Software engineers make $50.00/hour while lowly users make $5.00/hour." For a commercial software product, however, there are probably 10,000 potential users for each software engineer. If a software engineer spends an extra three months ($25,000) optimizing code that winds up saving an average user only one minute per day, those extra three months of development will pay for themselves (world-wide, economically) in only one month. Every month after that, your set of users will save an additional $25,000.
Of course, this claim assumes they use that extra minute per day to be especially productive (rather than visiting the water cooler with the minute they saved), but overall, any big gain you make in the performance of a program translates into really big gains world-wide if you have enough users. Therefore, software engineers need to consider the user's time as the most valuable resource associated with a project. Programmer time should be considered secondary to this. Machine time is still irrelevant and will continue to be irrelevant.
What does this have to do with assembly language? Well, if you've optimized your program as best you can in a HLL, assembly is one option you may employ to squeeze even more performance out of your program. If you know assembly language well enough (and it really does take an expert to consistently beat a compiler) the extra time you spend coding your application (or part of it) in assembly pays off big time when you consider your user's time as well.
Of course, if you don't anticipate a large number of users (take note, UNIX users :-) the engineer's time remains the most valuable commodity.
A common argument for the worthlessness of assembly language is: "Software spends 90% of its time in 10% of the code. Therefore it doesn't make sense to write an application in assembly language since 90% of your effort would be wasted."
Okay, what about that other 10%? The problem, you see, is that software engineers often use this excuse as a way of avoiding optimization altogether. Yet this rule definitely claims that at least 10% of your code is in dire need of optimization. Let's look a little deeper at the 90/10 rule and check out what it actually means.
Hypothesis: 90% of the execution time occurs in 10% of your code.
The 90/10 rule, especially the way software engineers throw it around in a cavalier manner, suggests that you can easily locate this 10% of your program as if it were a tumor and surgically remove it, thereby speeding up the rest of your program. Gee, that didn't take too much effort, right?
The problem with this view is that the 10% of your code that takes most of the execution time is not generally found all in one spot. You'll find 2% here, 3% there, 1% over in the corner, two-thirds of a percent somewhere else, maybe another 1% in the database routines. If you think you can dramatically speed up your code by surgically replacing 10% of your code with some assembly language, boy do you have another think coming. Unfortunately, some 1% segment that is slow is often directly connected to another 1-2% that isn't slow. You'll wind up converting that connecting code as well. So to replace that 1%, you wind up replacing 3% of your code. If you're very careful, you'll probably find you wind up replacing about 25% of your program just to get at that 10% that was really slow.
Myth #1b: You only need to rework 10% of your code.
Okay, let's assume you manage to locate that 10% of your code and you optimize it so that it runs five times faster than before (a good assembly language programmer can often achieve this). Since that 10% of the code used to take 90% of the execution time, speeding it up by a factor of five cuts the total run time to 28% of the original: the hot spot's 90% shrinks to 18%, while the other 10% is untouched. That's an overall speedup of roughly 3.6x. Notice, however, that the once-hot code still accounts for nearly two-thirds of the new run time, and the other 90% of the code now accounts for over a third. This suggests that you could still double or even triple the speed of your program (after the first optimization pass) by continuing to attack the profile. Funny thing about mathematics -- numerically, the faster you make your program run, the easier it is to double or triple its speed again, because each doubling requires a smaller absolute saving. For example, having cut the run time to 28% of the original, if I can locate the code responsible for 80% of that remaining time and speed it up by a factor of five as well, I can better than double the speed of my program again.
Of course, the astute reader will point out that the 90/10 rule probably doesn't apply repeatedly, and as you optimize your code it gets more difficult to keep optimizing it, but the whole point I'm trying to make here is that the 90/10 (or 80/20) rule suggests more than it really delivers. In particular, it is a poor defense against using assembly language (indeed, on the surface it suggests that you should write about 10-20% of your code in assembly since that's the portion of your program that will be time critical).
Is the 90/10 rule a good argument against writing entire programs in assembly language? That is a point that could be debated forever. However, I've mentioned that you'll probably wind up converting 20-25% of your code in order to optimize that slow 10%. The amount of effort you put into finding that 10%, plus the effort of writing the entire program in a HLL to begin with, could come fairly close to "paying" for the cost of writing the code in assembly in the first place.
Myth #1c: The same 10% of the code is the hot spot for every user.
The 90/10 rule generally applies to a single execution of a program by a single user. Put two separate users to work on the same program, especially a complex program like Microsoft Excel (for example) and you can watch the 90/10 rule fall completely apart. Those two users could wind up using completely different feature sets in MS-Excel resulting in the execution of totally different sections of code in the program. While, for either user, the 90/10 rule may apply, it could also be the case that these two users spend most of their time executing different portions of the program. Hence, were you to locate the 10% for one user and optimize it, that optimization might not do much for the second.
A big problem with the 90/10 rule is that it is too general. So general, in fact, that many common programs don't live up to this rule. Often you will see people refer to the "80/20 Rule" in an attempt to generalize the argument even more. Since it is difficult to apply the 90/10 (or 80/20) rule directly to an arbitrary program, I have invented what I call the "Rule of Fifths", which is a more general statement concerning where a program spends its time.
The "Rule of Fifths" (I refer to it as the 20/20 rule) says that programs can be divided into five (not necessarily equal) pieces:
(1) Code that executes frequently,
(2) Code that seldom executes,
(3) Busy-wait loops that do not contribute to a computational solution (e.g., waiting for a keypress),
(4) Code that executes once (initialization code), and
(5) Code that never executes (e.g., error recovery code, sections of code inserted for defensive programming purposes, and dead code).
Obviously, when attempting to optimize a program, you want to concentrate on those sections of code that fall into category (1) above. The problem, as with the 90/10 rule, is to determine which code belongs in category (1).
One big difference between the "Rule of Fifths" and the 90/10 rule is that the "Rule of Fifths" explicitly acknowledges that the division of the code is a dynamic entity. That is, the code that executes frequently may change from one execution to another. This is especially apparent when two different users run a program. For example, user "A" might use a certain set of features that user "B" never uses and vice-versa. For user "A" of the program, the 10% of the code that requires 90% of the execution time may be different than for user "B". The "Rule of Fifths" acknowledges this possibility by noting that some code swaps places in categories (1) and (2) above depending upon the environment and the user.
The "Rule of Fifths" isn't particularly good about telling you which statements require optimization. However, it does point out that three components of a typical program (cases 3-5 above) should never require optimization. Code falling into category (2) above shouldn't require optimization either, but because code moves between categories (1) and (2), one can never be sure what is truly in category (1) vs. category (2). This is another reason why you will wind up having to optimize more than 10% of your code, despite what the 90/10 rule has to say.
As I've mentioned earlier, assembly language isn't particularly hard to write -- optimized assembly language is hard to write (indeed, optimized code in any language is hard to write). I was once involved in a discussion about this topic on the Internet and a hard-core C programmer used the following statement to claim people shouldn't try to write optimized code in assembly language:
You'll spend a week writing your optimized function in assembly language whereas it will take me a day to do the same thing in C. Then I can spend the rest of the week figuring out a better way to do it in C so that my new algorithm will run faster than your assembly implementation.
Hey folks, guess what? This programmer discovered why it takes a week to write an assembly language function, he just didn't realize it. You see, it would take me about a day to implement the same function in assembly as it took him to write that code in C. The reason it takes a week to get the optimized assembly version is because I'd probably rewrite the code about 10 times over that week constantly trying to find a better way of doing the task. Because assembly language is more expressive than C, I stand a much better chance of finding a faster solution than does the C programmer.
What I'm trying to point out here is that people perceive assembly language as being a hard language to develop in because they've never really considered writing anything but optimized code in assembly. If optimized code is not what you're after, assembly language is fairly easy to use. Consider the following MASM 6.1 compatible code sequence:
          var
              integer i, k[20], *m
              float   f
              boolean again
          endvar

DoInput:  try
              mov again, false
              geti i
          except $Conversion
              printf "Conversion error, please re-enter\n"
              mov again, true
          endtry
          cmp again, true
          je DoInput
          printf "You entered the value %d\n", i
          .
          .
          .
          cout "Please enter an integer and a real value: "
          cin i, f
          cout "You entered the integer value ", i, \
               " and the float value ", f
          .
          .
          .
A C++ programmer, even one who isn't familiar with 80x86 assembly language, should find this code sequence somewhat readable. Perhaps you're thinking "this is unlike any assembly language I've ever looked at; this isn't real assembly language." You'd be wrong. As I said, this code will assemble with the Microsoft Macro Assembler (v6.1 and later). You can run the resulting program under DOS or in a console window under Windows 3.1, 95, or NT. The secret, of course, is to make sure you're using version two (or later) of the "UCR Standard Library for 80x86 Assembly Language Programmers."
"Well, that's not fair," you'd probably say. "This is the UCR Standard Library, not true assembly language." My only reply is "Okay, then, write a comparable C/C++ program without using any library routines that you didn't write yourself. Then come back and tell me how easy it is to program in C or C++."
The point here is that assembly language can be very easy to write if you aren't trying to write highly optimized code and you have a good set of library routines (such as the UCR Standard Library) at your disposal.
I do want to make an important point here - you cannot write "A C program that uses MOV instructions" and expect it to be fast. In other words, if you use the C programming paradigm to write your assembly language code, you're going to find that any good C compiler will probably produce better code than you've written yourself. I never said writing good assembly code was easy, I simply said that it is possible to make assembly code easy to write.
Does this mean that you need to write "optimized assembly or no assembly at all?" Heavens no! The 90/10 rule applies to assembly language programs just as well as it does to HLL programs. Some distinct portion of your code does not have to be all that fast (90% of it, according to the 90/10 rule). So you can write a large part of your application in the "easy to write but slow" form and leave the optimization for the hard sections of the program. Why not just write those rarely used portions in a HLL and forget about using assembly? Sometimes that's a good solution. Other times, the interface between the assembly and HLL code gets to be complex enough that it introduces inefficiencies and bugs, so it's more trouble than it's worth.
For those not familiar with the term, amortization means dividing the cost of something across some number of units (time, objects produced, etc.). With respect to software development, it is important to amortize the cost of development against the number of units shipped.
For example, if you ship only one copy of a program (e.g., a custom application), you must amortize the cost of development across that single shipment (that is, you had better be collecting more money for the application than it cost you to develop it). On the other hand, if you intend to ship tens of thousands of units, the cost of each package need only cover a small part of the total development cost. This is a well-known form of amortization. If it were the only form, it would suggest that you should keep your development costs as low as possible to maximize your profits. As you can probably imagine, those denying the usefulness of assembly language often use this form of amortization to bolster their argument.
There are other effects amortization has on the profits one might receive on a software project. For example, suppose you produce a better product (faster, smaller, more features, whatever). Presumably, more people will buy it because it is better. You must amortize those extra sales across the extra expense of developing the software to determine if the effort is worth it. Writing code in assembly language can produce a faster product and/or a smaller product. In some cases you can easily justify the use of assembly language because the profit of the extra sales covers the extra cost of development and maintenance. Unfortunately for those who would like to use assembly language for everything, it is very difficult to predict whether you will, indeed, have better sales because you used assembly language to develop the program. Many people claim this to be the case, but I remain skeptical.
One area where the use of assembly language is justifiable is when you can trade off development costs (basically a fixed, one-time cost known as an NRE [non-recurring engineering] fee) for recurring costs. For example, if you develop a program for an embedded system and discover you can fit the code into 4K of ROM rather than 8K of ROM, the microcontroller you must purchase to implement the product will cost less. Likewise, if you write a program that is twice as fast as it otherwise would be, you can use a processor that runs at one-half the clock frequency; this also results in a lower-cost product. Note that these savings are multiplied by the number of units you ship. For high-volume applications, the savings can be very significant and can easily pay for the extra effort required to completely write a program in assembly language.
Of course, if you do not intend to ship a large quantity of product, the development cost will be the primary expense. On the other hand, if you intend to ship a large quantity of product, the per-unit cost will quickly dwarf the development costs (i.e., amortize them away).