QUOTE (Peter Nachtwey @ Sep 11 2009, 09:17 AM)

QUOTE (paulengr @ Sep 10 2009, 05:42 AM)

QUOTE (Peter Nachtwey @ Sep 4 2009, 10:11 PM)

QUOTE (Alaric @ Sep 4 2009, 03:38 PM)

X * X will execute considerably faster (as much as 20x faster) than X**2.
From watching the forums I have come to the conclusion that it is impossible to train everybody. What we did is check the exponent. If the exponent is a constant integer then we translate x**2 to x*x. x**3 gets translated to x*x*x. x**4 gets translated to (x*x)**2, which gets translated to xx*xx (where xx = x*x), which takes only 2 multiplies. A smart compiler should do this. It doesn't take much effort. Now the question is what did Rockwell do?
You should always avoid divides by a constant. It is much faster to multiply by a constant. On our processors it is about 4 to 5 times faster to multiply than divide. We got smart here too and translate dividing by a constant to multiplying by (1/constant).
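To make the two rewrites above concrete, here is a minimal Python sketch of the idea. It is not the code from any real controller or compiler; the function names `pow_const` and `reciprocal_of` are made up for illustration. It expands a constant integer exponent into multiplies by repeated squaring and counts them, and it precomputes 1/constant so a divide becomes a multiply.

```python
def pow_const(x, n):
    """Compute x**n for a constant positive integer n using only multiplies.

    Returns (value, multiply_count). Hypothetical helper for illustration.
    """
    if n < 1:
        raise ValueError("exponent must be a positive integer")
    if n == 1:
        return x, 0
    # Compute x**(n // 2) once and square it, as in xx = x*x; xx*xx for n = 4.
    half, count = pow_const(x, n // 2)
    result = half * half
    count += 1
    if n % 2:
        # Odd exponent: one extra multiply by x.
        result = result * x
        count += 1
    return result, count


def reciprocal_of(constant):
    """Precompute 1/constant once so that x / constant becomes x * (1/constant)."""
    if constant == 0:
        raise ZeroDivisionError("cannot fold a divide by zero")
    return 1.0 / constant


if __name__ == "__main__":
    for n in (2, 3, 4):
        value, multiplies = pow_const(3.0, n)
        print(n, value, multiplies)
    quarter = reciprocal_of(4.0)
    print(10.0 * quarter)
```

One caveat: in floating point, x * (1/constant) can differ from x / constant in the last bit when 1/constant is not exactly representable, so the rewrite trades a tiny amount of accuracy for speed.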
1. The PLC-5 is an interpreted system. There is no compiler. Everything is tokenized. So your statement that "a smart compiler should do this" doesn't apply with an interpreter.
Sure it does. There is little difference between the compiler generating real code, tokens, p-code or whatever.
Agreed. However, structurally, you'd see funny things go on with your CPT blocks (expressions get changed) if it worked this way, and probably someone out there would have a fit. "Compilers" for interpreters in the early 80's when the PLC-5 came out were little more than simple tokenizers with no actual optimizer to speak of. The only place you saw threaded interpreters was in Forth (which was considered highly innovative).
QUOTE
QUOTE
There is little optimization at all.
:(, but back then just having a compute block made the PLC5 far superior at doing math however slow it may have been.
QUOTE
2. As far as the MUL/DIV/ADD (as in doubling can be done with an ADD) normally I'd agree, partly because I know exactly how it has to be implemented in the ALU. However with a PLC-5, again, this is not the case. MUL/DIV are the same speed. It seems ludicrous in my mind that this is the case since a Booth multiplier takes up a lot of silicon real estate and is far slower than an ADD while a DIV is best implemented with an inversion which takes a long time (O(N log N) or thereabouts vs. O(1)!) but hey, it is what it is.
The PLC 5 uses a 68000 and I don't think it has a math coprocessor so there is no Booth algorithm. The overhead of the interpreter probably dwarfs the difference in execution time between a floating point MUL and DIV.
OK, I disagree partly. The general point of your arguments however is correct. So I guess I'm having crow for dinner tonight. As in all other systems, ADD/SUB are roughly equal, multiply is slower (but not dramatically), and you have to be desperate or forced into a corner to use a divide.
I'm very familiar with the 68000 (and the little 6800 brother) from a design point of view because those are the processors that I cut my teeth on. I'd slightly disagree that the PLC-5 uses a 68000 for everything because in reality, it was a very messy design including no less than 11 ASICs. When I opened one up, I counted at least 3 CPUs, one of which was a 68000, one of which was a Z-80 series, and there was at least one or two others in there, too. Even the 6800 had a Booth multiplier. No separate floating point coprocessor. That was an Intel thing that came later and Motorola was forever playing catch up on those. However it was always the ongoing joke that each new Intel processor would come automatically with some pretty severe arithmetic bugs so the controls guys avoided them like the plague (except the 8080 series for register specialization masochists).
The 68000 definitely has a Booth multiplier, at least as far as integers go. This design was very common in Motorola processors at the time since they first implemented it in the 6800 series. Hardware-wise it's not too difficult to implement as long as you can do bit shifts in your ALU and add a little extra logic to check the value of the two least significant bits because it does the multiply two bits at a time. There was no floating point unit at all and no division instruction...you had to do those the hard way. So at least with integers, yes, you had hardware accelerated multiplication (such as it was). As I recall from code at that time, the usual strategy was to do an inversion and then multiply to achieve a division operation. There were also lots of folks using all kinds of creative Taylor series and Chebyshev polynomials that mostly just made my head hurt looking at the code. I am probably going to get my exact numbers wrong but I believe the hardware multiply was roughly 5 times slower than an ADD. With floating point, usually the exponents could be added/subtracted, the mantissa (fractional part) fed through the integer multiply, and then some scaling with bit shifting and adding/subtracting on the exponents was necessary to clean up the result, and all this had to be done in software alone. However, with the ASICs on board the PLC-5, at least one of them might have been a math coprocessor, but at this point I'm purely guessing. The only reason that I would suspect this to be the case is that the instruction timings aren't too shabby with floating point operations compared to their integer counterparts (roughly half as fast), leading me to believe that there almost has to be some sort of hardware assistance.
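The software floating point multiply described above (add the exponents, run the mantissas through an integer multiply, then shift to clean up) can be sketched roughly as follows. This is only an illustration in Python, not anything taken from PLC-5 firmware: the 24-bit mantissa width, the truncating rounding, and the names `split` and `soft_multiply` are all assumptions made for the example.

```python
import math

MANT_BITS = 24  # assumed mantissa width (same as IEEE single precision)


def split(value):
    """Split a nonzero float into (sign, integer mantissa, exponent).

    abs(value) is approximately mantissa * 2**exponent.
    """
    fraction, exponent = math.frexp(abs(value))   # 0.5 <= fraction < 1
    mantissa = int(fraction * (1 << MANT_BITS))   # scale fraction to an integer
    sign = -1 if value < 0 else 1
    return sign, mantissa, exponent - MANT_BITS


def soft_multiply(a, b):
    """Multiply two floats using only integer multiply, add, and shift."""
    if a == 0.0 or b == 0.0:
        return 0.0
    sign_a, mant_a, exp_a = split(a)
    sign_b, mant_b, exp_b = split(b)
    mantissa = mant_a * mant_b        # integer multiply of the mantissas
    exponent = exp_a + exp_b          # exponents simply add
    # Clean up: shift the double-width product back down to MANT_BITS bits.
    excess = mantissa.bit_length() - MANT_BITS
    if excess > 0:
        mantissa >>= excess           # truncates the low bits (no rounding)
        exponent += excess
    return sign_a * sign_b * math.ldexp(mantissa, exponent)


if __name__ == "__main__":
    print(soft_multiply(1.5, 2.0))
    print(soft_multiply(-0.5, 4.0))
```

The sketch ignores infinities, NaNs, denormals, and overflow, all of which a real software floating point library would have to handle.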
Regardless, we don't have to argue about semantics when we have numbers. According to the instruction set reference manual in appendix A (instruction timing), using floating point for the case at hand an add takes 14.9 microseconds, a subtract takes 15.6 microseconds, multiply 18.2, and divide clocks in at 23.4. A compute instruction takes 2.48 microseconds to set up, plus 0.8 microseconds for each operation, plus the normal cost of the operations themselves. This is worst case assuming direct addressing within the first 2048 words of the data table. Last but not least, the high speed exponentiation operation takes 897 microseconds, enough for almost 50 multiplies. Int-to-float takes 8.9 microseconds so be sure to type "0.0" for 0 for instance (I think this is one area where the tokenizer does this for you). Immediate addressing (using lots of registers instead of constants as recommended in previous posts) adds on 1 microsecond per variable.
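Plugging the timings quoted above into a small Python sketch gives a feel for the X * X versus X**2 comparison. The cost model here is an assumption based on one reading of the post: each compute block costs the setup time, plus 0.8 microseconds per operation, plus the operation's own time. The dictionary keys are just labels chosen for this example, and addressing penalties and int-to-float conversions are left out.

```python
SETUP_US = 2.48   # compute block setup time, from the figures quoted above
PER_OP_US = 0.8   # extra cost per operation inside the compute block

# Floating point operation times in microseconds, as quoted in the post.
OP_US = {
    "ADD": 14.9,
    "SUB": 15.6,
    "MUL": 18.2,
    "DIV": 23.4,
    "POW": 897.0,  # exponentiation
}


def compute_time_us(operations):
    """Estimated time of one compute block containing the given operations."""
    return SETUP_US + sum(PER_OP_US + OP_US[op] for op in operations)


if __name__ == "__main__":
    square_by_mul = compute_time_us(["MUL"])   # X * X
    square_by_pow = compute_time_us(["POW"])   # X ** 2
    print(f"X * X : {square_by_mul:.2f} us")
    print(f"X ** 2: {square_by_pow:.2f} us")
    print(f"ratio : {square_by_pow / square_by_mul:.1f}x")
    print(f"multiplies per exponentiation: {OP_US['POW'] / OP_US['MUL']:.1f}")
```

Under these assumptions X * X comes to about 21.5 microseconds and X**2 to about 900 microseconds, so on this hardware the gap is roughly 40x.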