In Subsection 3.1, we state Theorems 1 to 8, which derive the lower bounds of R-operations in the PSCM case; in Subsection 3.2, we state Theorems 9 and 10 for the PMCM case. Each theorem is given along with its proof. The pipelining operation, which was not addressed in the previous works [3] and [4], is explicitly included in the proposed lower bounds through the R-operations.
3.1 PSCM case
Whenever a constant c is mentioned in the theorems of this sub-section (Theorems 1 to 8), we consider that the MNSD of that constant is S and its number of prime factors is Ω.
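As a rough illustration of the two quantities used throughout this sub-section, the following sketch computes S and Ω for a given constant. It assumes that S can be obtained as the number of non-zero digits of the non-adjacent form (a signed-digit representation with the minimum number of non-zero digits) and that Ω counts prime factors with multiplicity. The function names `mnsd` and `big_omega` are hypothetical and are not taken from the original work.

```python
def mnsd(c):
    """Number of non-zero digits in the non-adjacent form of |c| (assumed to equal S)."""
    c = abs(c)
    count = 0
    while c:
        if c & 1:
            # Signed digit is +1 when c mod 4 == 1 and -1 when c mod 4 == 3.
            digit = 2 - (c & 3)
            c -= digit
            count += 1
        c >>= 1
    return count


def big_omega(c):
    """Number of prime factors of |c|, counted with multiplicity (assumption for Ω)."""
    c = abs(c)
    count = 0
    d = 2
    while d * d <= c:
        while c % d == 0:
            c //= d
            count += 1
        d += 1
    if c > 1:
        count += 1  # the remaining factor is prime
    return count


if __name__ == "__main__":
    for constant in (7, 45):
        print(constant, mnsd(constant), big_omega(constant))
```

For instance, under these assumptions the constant 45 = 3 × 3 × 5 has S = 4 and Ω = 3, whereas the prime constant 7 has S = 2 and Ω = 1.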
Theorem 1 provides the upper limit of non-zero digits that can be generated by any graph with a given number of depth levels, regardless of its number of R-operations. From this, we can determine the minimum number of depth levels that a graph must have to implement a constant with a given S.
Theorems 2 and 3 prove the properties of completely multiplicative graphs, namely, that they generate the upper limit of non-zero digits mentioned in Theorem 1 with the minimum possible number of R-operations. It follows that the completely multiplicative graph is a solution that attains the lower bound for the number of R-operations. However, this graph is known to have articulation points, and every articulation point represents the union between two cascaded subgraphs, i.e., the product of two smaller constants. Therefore, Theorem 4 uses Ω to identify which constants can be implemented with the completely multiplicative graph (for example, prime constants cannot be factorized into smaller constants; thus, they cannot be implemented by a completely multiplicative graph).
Theorem 5 identifies the minimum number of R-operations needed in any non-multiplicative graph with a given number of depth levels, and Theorem 6 proves that non-multiplicative graphs can generate the upper limit of non-zero digits mentioned in Theorem 1 with their minimum number of R-operations. Then, Theorem 7 establishes the lower bound for the number of R-operations needed to implement a prime constant (Ω = 1).
Finally, Theorem 8 complements Theorems 4 and 7 by giving the lower bound of R-operations needed to implement non-prime constants that have fewer prime factors than the number of subgraphs used in a completely multiplicative graph.
Theorem 1
A graph with p depth levels can provide at most \( n^{p} \) non-zero digits for a constant.
Proof
The proof is given by induction (see proof of Theorem 6.9 in [39] for the case of 2-input A-operations):
1) The base case corresponds to the first depth level, where an n-input A-operation can form a constant with at most n non-zero digits. This is true since the input of any graph has one non-zero digit [3, 4, 39].
2) As the inductive step, we assume that, in the p-th level, there are at most \( n^{p} \) non-zero digits. In the (p + 1)-th level, an A-operation can form a constant whose number of non-zero digits is the sum of the numbers of non-zero digits at every input of that A-operation. This is at most n times the maximum number of non-zero digits available in the previous level, i.e., \( n \times n^{p} = n^{p+1} \) non-zero digits.
Since assuming that the theorem is true for p implies that it is also true for p + 1, and since the base case is also true, the proof is complete. An adder, regardless of its number of inputs, cannot generate more non-zero digits than the sum of the numbers of non-zero digits in its inputs. Thus, the MNSD can be multiplied by at most n from one depth level to the next, which happens when all the inputs of the n-input adder come from the immediately previous depth level. ■
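The consequence of Theorem 1 used later in this section, namely the minimum number of depth levels for a given S, can be sketched as follows. This is a minimal illustration; the function name `min_depth_levels` is hypothetical, and integer arithmetic is used instead of a floating-point logarithm to avoid rounding issues at exact powers of n.

```python
def min_depth_levels(s, n):
    """Smallest p such that n**p >= s, i.e., the ceiling of log_n(s)."""
    if s < 1 or n < 2:
        raise ValueError("expected s >= 1 and n >= 2")
    p, reach = 0, 1  # reach = n**p, the most non-zero digits available at level p
    while reach < s:
        reach *= n
        p += 1
    return p


if __name__ == "__main__":
    # With 2-input A-operations, S = 5 non-zero digits need 3 levels (2**3 = 8 >= 5).
    print(min_depth_levels(5, 2))
```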
Theorem 2
A completely multiplicative graph with p A-operations can generate \( n^{p} \) non-zero digits.
Proof
This proof is a straightforward extension of the proof of Theorem 6.8 in [39], which corresponds to completely multiplicative graphs with 2-input A-operations. As stated earlier, the input of a graph has one non-zero digit. In the completely multiplicative graph, there are at most n non-zero digits after the A-operation placed at the 1st depth level. Cascading an A-operation to that output yields at most n × n non-zero digits, and so on. The number of non-zero digits at the p-th depth level is at most n times the number of non-zero digits of a fundamental at the (p − 1)-th depth level. Consequently, the maximum number of non-zero digits at the p-th depth level is \( n^{p} \). ■
Theorem 3
A completely multiplicative graph with p depth levels needs only p R-operations.
Proof
The completely multiplicative graph with p depth levels has p A-operations, and every A-operation forms a subgraph. Pipelining between two subgraphs needs only one register, according to [38], because the pipelining occurs on the articulation point. This results in every A-operation being followed by a register. Since an A-operation followed by a register is considered an R-operation, there are only p R-operations in total. This is illustrated in Fig. 7. ■
Theorem 4
A constant with \( n^{p-1} + 1 \le S \le n^{p} \) and Ω ≥ p needs at least p R-operations.
Proof
From Theorem 2, we have that a constant with \( n^{p-1} + 1 \le S \le n^{p} \) non-zero digits can be implemented with at least p depth levels, which implies at least p A-operations. From Theorem 3, we have that a completely multiplicative graph can generate those values for S with only p R-operations. The completely multiplicative graph with p R-operations consists of p cascaded subgraphs; thus, a constant implemented with that graph must have at least p prime factors. Since Ω ≥ p holds, the completely multiplicative graph can be employed to implement that constant using p R-operations. ■
Theorem 5
A non-multiplicative graph with p depth levels needs at least (2p − 1) R-operations.
Proof
According to Theorem 3, if a graph with p depth levels has only p R-operations in total, it must be a pipelined completely multiplicative graph. According to Theorem 2, that graph can generate the maximum possible number of non-zero digits, namely, \( n^{p} \). To make that optimal graph non-multiplicative, the (p − 1) articulation points must be eliminated. From [38], it is known that at least one additional R-operation must be added for every eliminated articulation point. Therefore, at least (2p − 1) R-operations are required, i.e., the original minimum of p R-operations in the form of addition-delay pairs plus the additional (p − 1) R-operations in the form of pure delays. Figure 8 shows an example with p = 3. ■
Theorem 6
A non-multiplicative graph with p depth levels and (2p − 1) R-operations can generate \( n^{p} \) non-zero digits.
Proof
Consider a graph with p depth levels formed by two completely multiplicative graphs of (p − 1) levels each, connected in parallel from the input of the graph, and one A-operation placed in the p-th level summing up the outputs of the aforementioned graphs. The output of one of these graphs is connected to n − 1 inputs of the last A-operation, and the output of the other graph is connected to the remaining input of the last A-operation. This is a non-multiplicative graph because it is not formed by cascading subgraphs, and it is composed of (2p − 1) A-operations. According to Theorem 2, we can obtain \( n^{p-1} \) non-zero digits from each completely multiplicative graph, and according to Theorem 3, these graphs can be pipelined without requiring extra registers. Since the last A-operation can add up the \( n^{p-1} \) non-zero digits present at each of its n inputs and can be pipelined without extra cost, the resulting graph generates \( n \times n^{p-1} = n^{p} \) non-zero digits using (2p − 1) R-operations. An example of this is shown in Fig. 9. ■
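The counting argument in the proof of Theorem 6 can be checked numerically with the short sketch below. It only tallies operations and digit bounds for the parallel construction described in the proof; the function name `theorem6_construction` is hypothetical.

```python
def theorem6_construction(n, p):
    """Return (a_operations, max_nonzero_digits) for the parallel construction."""
    # Each completely multiplicative branch has p - 1 cascaded A-operations
    # and yields at most n**(p - 1) non-zero digits (Theorem 2).
    branch_ops = p - 1
    branch_digits = n ** (p - 1)
    # The last A-operation takes one branch on n - 1 inputs and the other on 1 input.
    total_ops = 2 * branch_ops + 1
    total_digits = (n - 1) * branch_digits + branch_digits
    return total_ops, total_digits


if __name__ == "__main__":
    # The totals should match (2p - 1, n**p) for every tested pair.
    for n in range(2, 5):
        for p in range(1, 5):
            assert theorem6_construction(n, p) == (2 * p - 1, n ** p)
    print("counts agree with Theorem 6")
```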
Theorem 7
A constant with \( n^{p-1} + 1 \le S \le n^{p} \) and Ω = 1 needs at least 2p − 1 R-operations.
Proof
Since Ω = 1 holds, the non-multiplicative graph must be employed to implement that constant. From Theorem 6, we have that a constant with \( n^{p-1} + 1 \le S \le n^{p} \) non-zero digits can be implemented with at least p depth levels and at least 2p − 1 R-operations. This is a lower bound for the number of R-operations, since from Theorem 5, we have that a non-multiplicative graph with p depth levels needs at least 2p − 1 R-operations. ■
Theorem 8
A constant with \( n^{p-1} + 1 \le S \le n^{p} \) and 1 < Ω < p needs at least (2p − Ω) R-operations.
Proof
From Theorem 1, we have that p depth levels are necessary to achieve the values of S in the specified range. Since Ω < p holds, we can take advantage of a completely multiplicative graph with Ω − 1 R-operations at most, which, according to Theorem 2, generates \( n^{\Omega - 1} \) non-zero digits at most, and represents the product of Ω − 1 factors. The last factor can be formed with a non-multiplicative subgraph with [p − (Ω − 1)] depth levels. According to Theorem 5, this subgraph needs at least 2[p − (Ω − 1)] − 1 R-operations, and according to Theorem 6, it can generate \( n^{p - (\Omega - 1)} \) non-zero digits. The total graph, illustrated in Fig. 10, can generate at most \( n^{\Omega - 1} \times n^{p - (\Omega - 1)} = n^{p} \) non-zero digits and uses at least (Ω − 1) + 2[p − (Ω − 1)] − 1 = 2p − 2(Ω − 1) + (Ω − 1) − 1 = 2p − (Ω − 1) − 1 = (2p − Ω) R-operations. ■
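The arithmetic in the proof of Theorem 8 can be verified with a small sketch that adds the two contributions separately. The function name `theorem8_r_operations` is hypothetical, and the check covers only a finite range of p and Ω.

```python
def theorem8_r_operations(p, omega):
    """R-operations of the cascade used in the proof of Theorem 8."""
    # Completely multiplicative part: one R-operation per factor, omega - 1 factors.
    multiplicative_part = omega - 1
    # Non-multiplicative part: remaining depth levels, 2 * levels - 1 R-operations (Theorem 5).
    remaining_levels = p - (omega - 1)
    non_multiplicative_part = 2 * remaining_levels - 1
    return multiplicative_part + non_multiplicative_part


if __name__ == "__main__":
    # Check the closed form 2p - omega for all 1 < omega < p in a small range.
    for p in range(3, 10):
        for omega in range(2, p):
            assert theorem8_r_operations(p, omega) == 2 * p - omega
    print("closed form 2p - omega confirmed on the tested range")
```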
Finally, from Theorem 1, we have that the number of depth levels necessary to achieve S is \( p = \lceil \log_n(S) \rceil \). Substituting this value for p and using Theorems 4, 7, and 8, we obtain the lower bound for the number of R-operations needed to form a PSCM block as follows:
$$ {L}_{SCM}=\left\{\begin{array}{l}2\left\lceil { \log}_n(S)\right\rceil -\varOmega; \kern3.5em \varOmega <\left\lceil { \log}_n(S)\right\rceil, \\ {}\left\lceil { \log}_n(S)\right\rceil; \kern2.25em \varOmega \ge \left\lceil { \log}_n(S)\right\rceil .\kern3.25em \end{array}\right. $$
(6)
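Equation (6) can be evaluated directly once S, Ω, and n are known. The sketch below is one possible transcription; the names `ceil_log` and `l_scm` are hypothetical, and S and Ω are assumed to be supplied by the caller.

```python
def ceil_log(n, s):
    """Ceiling of log_n(s) for integers s >= 1, n >= 2, using integer arithmetic."""
    p, power = 0, 1
    while power < s:
        power *= n
        p += 1
    return p


def l_scm(s, omega, n):
    """Lower bound of R-operations for a PSCM block, following Eq. (6)."""
    p = ceil_log(n, s)
    if omega < p:
        return 2 * p - omega  # first case of Eq. (6)
    return p                  # second case of Eq. (6)


if __name__ == "__main__":
    # S = 5 with 2-input A-operations gives p = 3.
    for omega in (1, 2, 3, 4):
        print(omega, l_scm(5, omega, 2))
```

With S = 5 and n = 2, the bound decreases from 5 R-operations for a prime constant (Ω = 1) to 3 R-operations once Ω ≥ 3.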
3.2 PMCM case
The theorems in this section are stated for N constants \( c_1, c_2, \ldots, c_N \), whose respective MNSDs are \( S_1, S_2, \ldots, S_N \), and whose respective numbers of prime factors are \( \Omega_1, \Omega_2, \ldots, \Omega_N \), such that \( S_1 \le S_2 \le \ldots \le S_N \).
Theorem 9 indicates the lower bound for the number of n-input A-operations needed to form an MCM block. If pipelining is added, more R-operations than the aforementioned lower bound may be needed because the constants with fewer prime factors may use non-multiplicative graphs, which require extra R-operations (see Theorems 5 to 8). Besides, all the outputs of the PMCM block must have an equal number of depth levels to balance the input–output delay, which may also require extra R-operations. Based on these observations, Theorem 10 extends the lower bound provided in Theorem 9 by identifying the minimum number of extra R-operations needed. From these theorems, we obtain the lower bound for the number of R-operations needed to form a PMCM block.
Theorem 9
At least K n-input A-operations are needed to build an MCM block, where K is given by
$$ K=\left\lceil { \log}_n\left({S}_1\right)\right\rceil +{\displaystyle \sum_{i=1}^{N-1} E\left({S}_i,{S}_{i+1}\right)}, $$
(7)
with
\( E\left({S}_i,{S}_{i+1}\right)=\left\{\begin{array}{c}\hfill 1;\kern5em {S}_i={S}_{i+1},\hfill \\ {}\hfill \left\lceil { \log}_n\frac{S_{i+1}}{S_i}\right\rceil; \kern0.75em {S}_i<{S}_{i+1}.\hfill \end{array}\right. \)
Proof
Recall that every A-operation has only one possible configuration and therefore can generate only one fundamental. Only shifted (i.e., scaled by a power of two) versions of that fundamental can be obtained from that A-operation. Since the target constants are odd integers by definition, it is not possible to obtain two target constants from the same A-operation. Therefore, there must be at least N n-input A-operations for the N constants. Note that, since the terms \( S_i \) are sorted in ascending order, \( S_1 \) corresponds to the simplest constant, i.e., the one with the smallest number of non-zero digits. From Theorem 1, we have that with p depth levels we can obtain at most \( n^{p} \) non-zero digits. By using the relation \( n^{p} \ge S_1 \), we have that the minimum number of levels necessary to generate \( S_1 \) non-zero digits is \( \lceil \log_n(S_1) \rceil \), which implies the existence of at least \( \lceil \log_n(S_1) \rceil \) A-operations for that constant. Finally, if \( S_{i+1} > n \times S_i \) holds, a single A-operation cannot generate the constant \( c_{i+1} \) when only coefficients with at most \( S_i \) non-zero digits are available, because the number of non-zero digits at the output of an A-operation is at most the sum of the numbers of non-zero digits at its inputs. Therefore, at least \( \lceil \log_n(S_{i+1}/S_i) \rceil \) A-operations will be required. This proof is a straightforward extension of the proof given in [3] for the lower bound of 2-input A-operations that form an MCM block. ■
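The bound K of Eq. (7) can be sketched as follows, assuming the MNSDs of the N constants are already known. The names `ceil_log`, `e_term`, and `k_bound` are hypothetical; the ceiling of \( \log_n(S_{i+1}/S_i) \) is computed with integer arithmetic as the smallest exponent e such that \( S_i \times n^{e} \ge S_{i+1} \).

```python
def ceil_log(n, s):
    """Ceiling of log_n(s) for integers s >= 1, n >= 2."""
    p, power = 0, 1
    while power < s:
        power *= n
        p += 1
    return p


def e_term(n, s_i, s_next):
    """E(S_i, S_{i+1}) as defined under Eq. (7); requires s_i <= s_next."""
    if s_i == s_next:
        return 1
    e, scaled = 0, s_i
    while scaled < s_next:  # smallest e with s_i * n**e >= s_next
        scaled *= n
        e += 1
    return e


def k_bound(mnsds, n):
    """Lower bound K of n-input A-operations for an MCM block, Eq. (7)."""
    s = sorted(mnsds)  # Eq. (7) assumes S_1 <= S_2 <= ... <= S_N
    total = ceil_log(n, s[0])
    for s_i, s_next in zip(s, s[1:]):
        total += e_term(n, s_i, s_next)
    return total


if __name__ == "__main__":
    print(k_bound([2, 3, 5], 2))
    print(k_bound([2, 2, 9], 2))
```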
Theorem 10
At least L R-operations are needed to build a PMCM block, where L = K + F + G, with
$$ F=\left\{\begin{array}{c}\hfill {\displaystyle \underset{i}{ \max }}\left\{\left\lceil { \log}_n\left({S}_i\right)\right\rceil -{\varOmega}_i\right\};\kern0.5em \forall\ i\kern0.5em \mathrm{such}\ \mathrm{that}\kern0.75em {\varOmega}_i<\left\lceil { \log}_n\left({S}_i\right)\right\rceil, \hfill \\ {}\hfill 0;\kern8.25em \mathrm{otherwise}.\hfill \end{array}\right. $$
$$ G={\displaystyle \sum_{i=1}^{N-1}\left(\left\lceil { \log}_n\left({S}_N\right)\right\rceil -\left\lceil { \log}_n\left({S}_i\right)\right\rceil \right)} $$
and K given in (7).
Proof
Consider that there is a constant \( c_m \) that satisfies \( \Omega_m < \lceil \log_n(S_m) \rceil \) and, if more than one constant satisfies this condition, \( c_m \) is the one with the greatest difference \( \lceil \log_n(S_m) \rceil - \Omega_m \). From Theorem 8, we have that the constant can be formed by cascading a non-multiplicative graph with a completely multiplicative graph, where the non-multiplicative graph needs \( 2[\lceil \log_n(S_m) \rceil - (\Omega_m - 1)] - 1 \) R-operations. Since Theorem 9 does not take into consideration the number of prime factors, only \( \lceil \log_n(S_m) \rceil - (\Omega_m - 1) \) A-operations have been accounted for in that theorem, under the assumption that the constant \( c_m \) can be constructed with the optimal completely multiplicative graph. Therefore, at least \( [\lceil \log_n(S_m) \rceil - (\Omega_m - 1)] - 1 \) extra R-operations must be included when pipelining is applied, which explains the term F. The term G is explained by the fact that extra R-operations may be needed to achieve the same number of pipelined stages from input to output for every constant. Since the minimum depth level of a constant is given by \( \lceil \log_n(S) \rceil \), the differences between the minimum depth level of the constant \( c_N \) (which has the greatest depth level among all constants) and the minimum depth levels of the other constants are accumulated in the term G. ■
From Theorem 10, we can express the lower bound for the number of R-operations in the PMCM case as
$$ {L}_{PMCM}=\left\lceil { \log}_n\left({S}_1\right)\right\rceil +{\displaystyle \sum_{i=1}^{N-1}\left(\left\lceil { \log}_n\left({S}_N\right)\right\rceil -\left\lceil { \log}_n\left({S}_i\right)\right\rceil \right)}+{\displaystyle \sum_{i=1}^{N-1} E\left({S}_i,{S}_{i+1}\right)}+ F, $$
(8)
with \( E\left({S}_i,{S}_{i+1}\right)=\left\{\begin{array}{c}\hfill 1;\kern5em {S}_i={S}_{i+1},\hfill \\ {}\hfill \left\lceil { \log}_n\frac{S_{i+1}}{S_i}\right\rceil; \kern0.75em {S}_i<{S}_{i+1},\hfill \end{array}\right. \)
and \( F=\left\{\begin{array}{c}\hfill {\displaystyle \underset{i}{ \max }}\left\{\left\lceil { \log}_n\left({S}_i\right)\right\rceil -{\varOmega}_i\right\};\kern0.5em \forall\ i\kern0.5em \mathrm{such}\ \mathrm{that}\kern0.75em {\varOmega}_i<\left\lceil { \log}_n\left({S}_i\right)\right\rceil, \hfill \\ {}\hfill 0;\kern8.25em \mathrm{otherwise}.\hfill \end{array}\right. \)
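As a closing illustration, Eq. (8) can be transcribed into a short function that takes one (S, Ω) pair per constant. This is a sketch under the assumption that those pairs are already available; the names `ceil_log`, `e_term`, and `l_pmcm` are hypothetical.

```python
def ceil_log(n, s):
    """Ceiling of log_n(s) for integers s >= 1, n >= 2."""
    p, power = 0, 1
    while power < s:
        power *= n
        p += 1
    return p


def e_term(n, s_i, s_next):
    """E(S_i, S_{i+1}) from Eq. (8); requires s_i <= s_next."""
    if s_i == s_next:
        return 1
    e, scaled = 0, s_i
    while scaled < s_next:  # smallest e with s_i * n**e >= s_next
        scaled *= n
        e += 1
    return e


def l_pmcm(constants, n):
    """Lower bound of R-operations for a PMCM block, Eq. (8).

    constants: list of (S_i, Omega_i) pairs, in any order.
    """
    pairs = sorted(constants)  # sort by S so that S_1 <= ... <= S_N
    levels = [ceil_log(n, s) for s, _ in pairs]
    # K: A-operation bound of Eq. (7).
    k = levels[0] + sum(e_term(n, a[0], b[0]) for a, b in zip(pairs, pairs[1:]))
    # G: delay balancing against the deepest constant c_N.
    g = sum(levels[-1] - lvl for lvl in levels[:-1])
    # F: largest shortfall of prime factors, or 0 if no constant has one.
    shortfalls = [lvl - omega for lvl, (_, omega) in zip(levels, pairs) if omega < lvl]
    f = max(shortfalls) if shortfalls else 0
    return k + g + f


if __name__ == "__main__":
    print(l_pmcm([(2, 1), (3, 1), (5, 2)], 2))
```

For a single constant (N = 1), this function returns the same value as Eq. (6); for example, the pair (S, Ω) = (5, 1) with n = 2 gives 5 in both cases.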