Method of Real number multiplication operation

10 ビュー (過去 30 日間)
MIN HOON
MIN HOON 2023 年 3 月 6 日
編集済み: Andy Bartlett 2023 年 3 月 7 日
Environment
In simulink,
┌ ─ ─ ─ ─ ─ ─ ─ ─ ┐
Value (uint16) │ㅡㅡㅡㅡ [ Gain : 9.8 (fixdt(1,16,0.1,0) ] ㅡㅡㅡㅡㅡ [out]
└ ─ ─ ─ ─ ─ ─ ─ ─ ┘
When I try to generate code in above model, code is generated like below briefly.
(uint16_T) ((((uint32_T)((uint16_T)( (((uint32_T49U)*((uint32_T)Value)) >> 3 )) * ((uint32_T)52429U))) >> 15)
In above equation, I think that we can represent like below.
Value * (49>>3) * (52429>>15)
Additionally, I recognized that the sequantial patterns of binary about 49 and 52429 are respectively same as 98 and 0.1.
So, I think that 49>>3 part represents 98, and 52429>>15 part represents 0.1.
From what i know, even though we represent real number in binary, computer only can understand it like inteager, which is why above equation appear.
However, I couldn't find out any meaning of bitwise operation (>>3 , >>15)
I mean... Why is " >> 3 " generated with 49 instead of 52429?
Please let me know more than what I know...

採用された回答

Andy Bartlett
Andy Bartlett 2023 年 3 月 6 日
Quick Answer:
Shifts are Multiply or Divide by a Power of Two
Shift left of M bits is equivalent to multiplication by 2^M when no overflow occurs.
Arithmetic shift right of N bit is equivalent to dividing by 2^N followed by rounding toward floor.
Integer shift left and shift right are fast machine instructions that take just one clock cycle on most modern microcontrollers.
Shifts are also much more efficient than division.
When you see shifts in generated C code for fixed-point math, the shifts usually mean multiply by 2^M or divide by 2^N.
So
( ( ( Value * 49 ) >>3 ) * 52429 ) >>15
is mathematically equivalent to
floor( ( floor( ( Value * 49 ) * 2^-3 ) * 52429 ) * 2^-15 )
which if we ignore the rounding is
Value * ( 49 * 2^-3 * 52429 * 2^-15)
or combining constants show the expected math (with input and output both uint16).
Value * 9.8
Long Answer:
Assumptions
Based on the generated code, I assume
  • you've configured the gain's output data type to be uint16.
  • all the fixed-point scaling Biases are zero
Ideal math
The ideal math for a gain in real-world values is
VY = VU * VG
where VG is the gain parameter value.
Replacing each of the variables with the zero-bias fixed-point scaling equation
V = Slope * Q
where Q is the stored integer value, gives
SlopeY * QY = SlopeU * QU * SlopeG * QG
solving for QY gives
QY = QU * QG * (SlopeU * SlopeG / SlopeY)
Using the known values
Based on the given information, we have
SlopeU = 1.0
SlopeY = 1.0
SlopeG = 0.1
QG = 98
Substituting gives
QY = QU * 98 * (1 * 0.1 / 1)
QY = QU * 98 * 0.1
or
QY = QU * 9.8
This is the "ideal math" expressed in terms of the stored integer values.
We expect the generated code to be equivalent to this except for rounding or overflow issues.
Why the math is done in two parts
The math for implementing the gain is effectively done in two parts
QTemp = ( Value * 49 ) >> 3
QY = ( QTemp * 52429 ) >> 15
One reason the math is broken into two parts is because the Gain block is designed to treat the gain parameter as if it could be tuned at some later point to a different stored integer value. So it starts with stored integer value 98, but it might be later tuned to another unsigned 16 bit stored integer values, such as 65001.
So the gain does NOT attempt to optimize the ideal math for
QY = QU * 98 * 0.1
Instead it tries to optimize the generated code for the ideal math
QY = QU * QG * 0.1
This could be implemented as
QY = QU * QG * 52429 * 2^-19
where 52429 * 2^-19 approximates 0.1.
But the pair of integer multiplications Q1 * QG * 52429 requires 48 bits to avoid all posible overflows over the range of tunable values for QG. For the Simulink model's specified production hardware target, the Gain block is trying to avoid going above 32 bit integers.
The operation is split into two parts with each part not exceeding 32 bits needed.
QY = ( QU * QG * 2^-4 ) * 52429 * 2^-15
or conceptually
QTemp = QU * QG * 2^-4
QY = QTemp * 52429 * 2^-15
Code Generation Optimization
Later when code is finally generated, the gain parameter is known to be constant.
Coders can exploit this information provided it doesn't change the defined math behaviors.
Coder see this math to be implemented
QTemp = QU * 98 * 2^-4
QY = QTemp * 52429 * 2^-15
Since 98 in binary has a trailing zero, this
QTemp = QU * 98 * 2^-4
can be replaced with the slightly simpler
QTemp = QU * 49 * 2^-3
Now replacing the multiplications and divisions by powers of two with shifts gives
QTemp = (QU * 49) >> 3
QY = ( QTemp * 52429 ) >> 15
This finishes the explaination of where the pieces of the generated code come from.
Recommendation
Using Slope and Bias scaling can improve accuracy. But to achieve the full accuracy and also get maximally efficient generated code, you want the output type to have scaling that matches the operation. So for gain block multiplication, you want
SlopeY = SlopeU * SlopeG
In your example, you've configured the scaling to not match
1 ~= 1 * 0.1
This causes less efficient code to be generated and removes the accuracy gains in representing 9.8 with Slope 0.1.
Either change the output of the gain to have SlopeY = 0.1.
Or use binary point scaling for the gain parameter. For example, if you use unsigned 16 bit best precision scaling
format long
g = fi( 9.8, 0, 16 )
g =
9.800048828125000 DataTypeMode: Fixed-point: binary point scaling Signedness: Unsigned WordLength: 16 FractionLength: 12
QG = g.storedInteger
QG = uint16
40141
Then for cases that don't overflow the output type (uint16),
the worst case rounding error is just 0.8.
And the generated code is just
QY = ( QU * 40141U ) >> 12;
  2 件のコメント
MIN HOON
MIN HOON 2023 年 3 月 7 日
Thank you so much for you kind and detailed answer.
However, I couldn't understand why seperating 2^-19 into 2^-4 and 2^15 is used as method for avoiding exceeding 32bits.
Can you explain it more specific?
Andy Bartlett
Andy Bartlett 2023 年 3 月 7 日
編集済み: Andy Bartlett 2023 年 3 月 7 日
N-bit by M-bit Multiplies in C
When multiplying an N bit integer by an M bit integer, the full precision product will need N+M bits.
But N bit by N bit multiplications provided by the C language do NOT give the full precision 2N bit product. They only give the least significant N bits of the product. So the trick in C to get the full precision product is to upcast that the inputs to a data type that has M+N bits, then do multiplication in that bigger type.
C code normally provides two sizes of integer multiplies int * int and and long long * long long. (Note this assumes long is the same size as int or long long.) Usually, the int multiply has a 2X or more speed and code compactness advantage over long long multiply. Multiplications bigger than long long are not native and require much bigger and slower code to implement.
Your example is 16 bit by 16 bit multiplies
In the example you provided, the multiplications are a 16 bit input times a 16 bit input initially producing a 32 bit product. These can both be produced by a lean and fast int32 (or uint32) multiplication.
The first multiplication is easily observed to be 16 bit by 16 bit based on the input and gain parameter types.
The second multiplication is also 16 bit by 16 bit because an EXPLICIT DOWNCAST to uint16 was inserted after the first multiply and the shift right of 3 bits. Hence the second multiply can also be done with a lean and fast int32 (or uint32) multiplication.
Overflow Management is Part of the Strategy
The math that you've configured uint16 * 9.8 being assigned back to uint16 will overflow for most of the possible inputs.
For example, for 6688 <= u <= 65535, the product u * 9.8 is greater than intmax('uint16') == 65535 so overflow is expected.
The implementation, by design, has structured the first stage to overflow a little less than the overall operation. Stage 1 will overflow 10700 <= u. The expected overflows for 6688 <= u < 10700 will happen in the second stage. More importantly, for the "good cases" 0 <= u <= 6687, neither stage will overflow.
Keeping the multiplications lean and keeping the good cases free of overflows is why the 2^-19 was split up as 2^-4 for stage one and 2^-15 for stage 2.

サインインしてコメントする。

その他の回答 (0 件)

カテゴリ

Help Center および File ExchangeFixed Point についてさらに検索

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by