Engineering

# **Research Paper**



# A new Parallel Counter Architecture with Reduced Transistor Count for Power and Area Optimization

# \*Dr. K.Babulu \*\* K.Venkateswara Rao

## \* Professor, Department of ECE, Jawaharlal Nehru Technological University Kakinada (JNTUK), Kakinada

\*\* Department of ECE, Jawaharlal Nehru Technological University Kakinada (JNTUK), Kakinada

## ABSTRACT

Parallel counters built on a pipeline partitioning methodology have earned a strong reputation. This paper presents such a parallel counter architecture that occupies less area and consumes less power by reducing the number of transistors required to build it. A novel technique is proposed, based on a comparison with the conventional conditional data mapping flip-flop (CDMFF), which replaces the conventional 24-transistor flip-flop that is the basic building block of the parallel counter architecture. Because the counter is parallel and uses state look-ahead logic, it counts two states per cycle, which is how parallel operation is achieved. The simulations are carried out with the Microwind and DSCH tools, and the results for all of the compared designs are reported.

## Keywords: Flip-flop, Low-power parallel counter, Sequential counter, DSCH, Microwind

## I. INTRODUCTION

Parallel counters are m-input, n-output combinational logic circuits that determine the number of logic ONEs in their input vectors and generate a binary-coded output vector. They are useful in implementing fast multipliers and digital neural networks. A wide variety of parallel counter architectures that exist in the literature [4-6] have been summarized in [3]. Counters are sequential circuits that keep track of the number of pulses applied to their inputs. They are among the most widely used components in digital systems, with applications in computer systems, communication equipment, scientific instruments, and industrial process control.

Flip-flops are the basic elements for storing information and the fundamental building blocks of all sequential circuits. The content of a flip-flop changes only at the rising or falling edge of the enable signal; after that edge, the content remains constant even if the input changes. In a conventional D flip-flop, the clock signal always flows into the flip-flop irrespective of whether the input changes or not. Part of the clock energy is therefore consumed unnecessarily by the internal clock buffer that controls the transmission gates. Hence, if the input of the flip-flop is identical to its output, the switching of the clock can be suppressed to conserve power [1,2].

A large part of the on-chip power is consumed by the clock drivers, so it is desirable to have less clocked load in the system. For example, the conditional capture flip-flop (CCFF) uses 14 clocked transistors and the conditional discharge flip-flop (CDFF) uses 15. In contrast, the conditional data mapping flip-flop (CDMFF) uses only seven clocked transistors, a reduction of about 50% in the number of clocked transistors; hence CDMFF uses less power than CCFF and CDFF. (Note that CDFF uses double-edge clocking [5,7]. For simplicity, we did not include the power savings from double-edge triggering on the clock distribution network.) This shows the effectiveness of reducing the number of clocked transistors to achieve low power [8]. In this paper, a novel technique is proposed, based on a comparison with the conventional conditional data mapping flip-flop, which replaces the conventional 24-transistor flip-flop that is the basic building block of the parallel counter architecture.
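The "about 50%" figure quoted above follows directly from the clocked-transistor counts given in the text. The short sketch below only redoes that arithmetic; the dictionary and function names are illustrative and do not come from the paper.

```python
# Clocked-transistor counts as quoted in the text (CCFF: 14, CDFF: 15, CDMFF: 7).
CLOCKED_TRANSISTORS = {"CCFF": 14, "CDFF": 15, "CDMFF": 7}


def reduction_percent(baseline: int, reduced: int) -> float:
    """Percentage reduction in clocked transistors relative to a baseline design."""
    return 100.0 * (baseline - reduced) / baseline


if __name__ == "__main__":
    cdmff = CLOCKED_TRANSISTORS["CDMFF"]
    for name in ("CCFF", "CDFF"):
        pct = reduction_percent(CLOCKED_TRANSISTORS[name], cdmff)
        print(f"CDMFF vs {name}: {pct:.1f}% fewer clocked transistors")
```

Relative to CCFF the reduction is exactly 50%, and relative to CDFF it is slightly higher, which is consistent with the "about 50%" wording.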

## II. CONVENTIONAL CONDITIONAL DATA MAPPING D FLIP-FLOP

Flip-flops and latches consume a large portion of system power because of redundant transitions of their internal nodes when the logic state of their outputs is unchanged after being triggered by a clock signal. To reduce this redundancy, several techniques and corresponding flip-flops have been proposed recently, such as the data look-ahead flip-flop (DLFF) [1]. However, the conditional circuitry is bulky, since at least two signals, the output and the input, must be sensed to decide whether the clock signal should pass, and this incurs penalties not only on clock-to-output delay but also on setup time. The conditional precharge flip-flop (CPFF) is characterized by its use of conditional circuitry to disable precharging of its internal nodes in the case of a redundant event. The alternative is shown by CCFF and CDFF, both of which have conditional circuitry to deactivate discharging of their internal nodes in the case of a redundant event.

The clock signal always flows into the D flip-flop irrespective of whether the input changes or not. Part of the clock energy is consumed unnecessarily by the internal clock buffer that controls the transmission gates. Hence, if the input of the flip-flop is identical to its output, the switching of the clock can be suppressed to conserve power. This shows the effectiveness of reducing the number of clocked transistors to achieve low power. Since CDMFF outperforms CCFF and CDFF in terms of power consumption, we do not discuss CCFF or CDFF further in this paper. CDFF and CCFF use many clocked transistors. CDMFF reduces the number of clocked transistors, but it has redundant clocking as well as a floating node. To ensure an efficient and robust implementation of low-power sequential elements, we propose a clocked-pair shared flip-flop that uses fewer clocked transistors than CDMFF (Fig. 1) and overcomes the floating-node problem in CDMFF.
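The clock-suppression idea described above (skip the internal clock activity whenever the input already equals the stored output) can be illustrated with a small behavioural sketch. This is a cycle-level model written for illustration only, not a transistor-level description of CDMFF or any other flip-flop in this paper; the function name and the example data stream are hypothetical.

```python
def count_clock_events(data, q_init=0, gated=True):
    """Behavioural model of a D flip-flop fed one D value per clock edge.

    Returns (final_q, clocked_cycles), where clocked_cycles counts the edges
    on which the internal clock path is exercised.
    gated=False: the internal clock is active on every edge (conventional case).
    gated=True:  the internal clock is active only when D differs from Q.
    """
    q = q_init
    clocked_cycles = 0
    for d in data:
        if not gated or d != q:
            clocked_cycles += 1
            q = d  # capture the new value
        # otherwise D == Q: the stored value is already correct, clock is suppressed
    return q, clocked_cycles


if __name__ == "__main__":
    stream = [0, 0, 1, 1, 1, 0, 0, 0]  # hypothetical low-activity input
    print(count_clock_events(stream, gated=False))
    print(count_clock_events(stream, gated=True))
```

For an input that changes rarely, the gated model exercises the internal clock path only on the edges where the data actually changes, while the stored value is identical in both cases.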



Figure 1: Conventional Conditional Data Mapping Flip-flop Design



Figure 2: Module 1 Modified Design based on 14T

Module-1 (Fig. 2) is a parallel synchronous binary counter that is responsible for low-order bit counting and for generating the future states of all module-3s in the counting path by pipelining the enable for these future states. Module-1 outputs Q1Q0 (the counter's two low-order bits), and QEN1 = Q1 AND Q0 bar connects to module-2's input.
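As a rough behavioural illustration of module-1, the sketch below steps a 2-bit synchronous up-counter and evaluates the enable expression as it is written in the text. Two assumptions are made here that the text does not spell out: that the counter counts up in natural binary order, and that "Q0 bar" denotes the complement of Q0. The function names are hypothetical.

```python
def module1_step(q1, q0):
    """Next state of a 2-bit synchronous binary up-counter (assumed count order)."""
    next_q0 = q0 ^ 1       # low bit toggles every clock
    next_q1 = q1 ^ q0      # high bit toggles when the low bit is 1
    return next_q1, next_q0


def qen1(q1, q0):
    """Enable as written in the text, reading 'Q0 bar' as NOT Q0 (assumption)."""
    return q1 & (q0 ^ 1)


if __name__ == "__main__":
    state = (0, 0)
    for _ in range(4):
        print(f"Q1Q0={state[0]}{state[1]}  QEN1={qen1(*state)}")
        state = module1_step(*state)
```

Under these assumptions the enable is asserted in state 10, that is, one state before Q1Q0 reaches 11, which matches the look-ahead role that the text assigns to the pipelined enable.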



Figure 3: Module-3 modified based on 14-T CDMFF

The module-3 (Fig. 3) output is QEN3 = Q1 AND Q0 AND QC (note that Q1 and Q0 here refer to the two counting bits stored internally at each module-3 and not to the low-order bits of the output count value). It signals that not only has that module-3 overflowed, but all modules preceding it have also overflowed, thus enabling the count in the subsequent module-3.
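The enable expression above can be sketched as a simple chain. For illustration it is assumed that the QC input of each module-3 is the enable produced by the stage before it, and the pipelining registers (module-2) between stages are ignored; both are simplifications, and the function names are hypothetical.

```python
def qen3(q1, q0, qc):
    """Module-3 enable as given in the text: QEN3 = Q1 AND Q0 AND QC."""
    return q1 & q0 & qc


def enable_chain(module3_bits, qc_in):
    """Propagate the enable through a list of module-3 stages.

    module3_bits: list of (q1, q0) pairs, one per module-3, lowest stage first.
    qc_in: enable arriving at the first module-3.
    Returns the QEN3 value produced by each stage.
    """
    enables = []
    qc = qc_in
    for q1, q0 in module3_bits:
        qc = qen3(q1, q0, qc)  # asserted only if this stage and all earlier ones are at 11
        enables.append(qc)
    return enables


if __name__ == "__main__":
    print(enable_chain([(1, 1), (1, 1), (0, 1)], qc_in=1))
    print(enable_chain([(1, 1), (1, 0), (1, 1)], qc_in=1))
```

Once any stage is not at its overflow state, every later enable in the chain stays low, which is the "all preceding modules have overflowed" condition stated in the text.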



Figure 4: Our Proposed Low power CDMFF based parallel counter architecture

Using the CDMFF (Fig. 4) logic-family idea together with pass-transistor logic, we design this circuit with only one clocked transistor, so the flip-flop consumes less power in the counter network than the other circuits. The design also has only 14 transistors, excluding the NOT gates. It therefore offers much lower power and area than the other two designs. At the same time, the reduced transistor count also reduces the delay. Overall switching delay, power, and area are thus all reduced, so this circuit serves as a good sequential element compared with other flip-flop designs. The state diagrams of the proposed system are shown in Fig. 5.



Figure 6: Waveform output of the proposed flip-flop.

The graph shows the input and output characteristics of the proposed system, from which its operation as a parallel counter architecture can be seen clearly. A delay of a few nanoseconds is present, although it is negligible. This delay can be reduced further by reducing the sizes of the transistors used in the circuit, or by moving to a smaller technology node.



Figure 5: (a) State Diagram of Module 1 (b) State Diagram of module 3s

## III. SIMULATION RESULT

We simulated the proposed parallel counter with the DSCH simulator; the results are shown below.



Figure 7: The layout of the proposed design

PARIPEX - INDIAN JOURNAL OF RESEARCH ♥ 63

The layout of the proposed flip-flop is shown in Fig. 7; its area is given at the bottom of the layout. The power consumption characteristics are shown in Fig. 8.


Figure 8: Power characteristic of the proposed design

Fig. 8 shows the power consumption of the CDMFF-based design, which is more efficient than the other designs.

Table-1: Transistor count calculation for the proposed design

| Type           | Components × transistors per component | Total transistor count |
|----------------|----------------------------------------|------------------------|
| Module 1       | 1 × 50                                 | 50                     |
| Module 2       | 6 × 14                                 | 84                     |
| Module 3       | 3 × 70                                 | 210                    |
| 3 i/p AND gate | 1 × 8                                  | 8                      |
| States decoder | 2 × 9                                  | 18                     |
| Total          |                                        | 370                    |

Table-1 shows the total number of components and the transistors required per component for our 8-bit counter, based on the component design modules depicted in Figs. 2 and 3 and using CMOS transistor design structures. The module-1, module-2, and module-3 components require 50, 84, and 210 transistors in total, respectively. The complete 8-bit counter requires only 370 transistors, which corresponds to approximately 48 125 µm² of silicon die area.
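The totals in Table-1 can be cross-checked with a few lines of arithmetic. The sketch below reads each "a x b" entry as the number of component instances times the transistors per instance, which is an interpretation of the table rather than something the text states explicitly; the variable and function names are illustrative.

```python
# (component, instances, transistors per instance), read from Table-1.
TABLE_1 = [
    ("module 1", 1, 50),
    ("module 2", 6, 14),
    ("module 3", 3, 70),
    ("3 i/p AND gate", 1, 8),
    ("states decoder", 2, 9),
]


def subtotals(rows):
    """Transistor count contributed by each component type."""
    return {name: count * per_instance for name, count, per_instance in rows}


def total_transistors(rows):
    """Total transistor count of the complete counter."""
    return sum(subtotals(rows).values())


if __name__ == "__main__":
    for name, value in subtotals(TABLE_1).items():
        print(f"{name:15s} {value:4d}")
    print(f"{'total':15s} {total_transistors(TABLE_1):4d}")
```

The per-component subtotals and the overall sum reproduce the figures quoted for the 8-bit counter.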

## IV. CONCLUSION

In this paper, a novel technique is proposed, based on a comparison with the conventional conditional data mapping flip-flop, which replaces the conventional 24-transistor flip-flop that is the basic building block of the parallel counter architecture. Because the counter is parallel and uses state look-ahead logic, it counts two states per cycle, which is how parallel operation is achieved. The simulations are carried out with the Microwind and DSCH tools. The results are better than those of the other techniques mentioned in the literature.

## V. FUTURE WORK

The power consumption can be reduced further by using a low-swing voltage approach. If the supply voltage is halved, the dynamic switching power, which scales with the square of the supply voltage, is reduced accordingly. Transistor scaling and layout optimization are further ways to reduce power consumption.
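The effect of a lower supply voltage can be estimated with the standard first-order model of CMOS dynamic power, P = α·C·V²·f. This model is textbook background rather than a result of this paper, and the numeric values below (activity factor, load capacitance, voltages, clock frequency) are hypothetical placeholders chosen only to show the scaling.

```python
def dynamic_power(alpha, c_load, vdd, freq):
    """First-order CMOS dynamic power: alpha * C * Vdd^2 * f (watts)."""
    return alpha * c_load * vdd ** 2 * freq


if __name__ == "__main__":
    # Hypothetical operating point, for illustration only.
    alpha, c_load, freq = 0.1, 1e-12, 100e6
    p_full = dynamic_power(alpha, c_load, 1.2, freq)
    p_half = dynamic_power(alpha, c_load, 0.6, freq)
    print(f"ratio at half supply voltage: {p_half / p_full:.2f}")
```

With everything else held constant, halving the supply voltage reduces the dynamic power in this model to one quarter; the model ignores leakage and the slower switching that a lower supply usually brings.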

## REFERENCES

[1]. Peiyi Zhao, Jason McNeely, Weidong Kuang, Nan Wang, and "Design of Sequential Elements for Low Power Clocking System" IEEE Transaction May 2011

[2]. H. Kawaguchi and T. Sakurai, "A reduced clock-swing flip-flop (RCSFF) for 63% power reduction," IEEE J. Solid-State Circuits, vol. 33, no. 5, pp. 807–811, May 1998.

[3]. A. Chandrakasan, W. Bowhill, and F. Fox, Design of High-performance Microprocessor Circuits, 1st ed. Piscataway, NJ: IEEE Press, 2001.

[4]. G. Gerosa, "A 2.2 W, 80 MHz superscalar RISC microprocessor," IEEE J. Solid-State Circuits, vol. 29, no. 12, pp. 1440–1454, Dec. 1994.

[5]. B. Nikolic, V. G. Oklobzija, V. Stojanovic, W. Jia, J. K. Chiu, and J. M. M. Leung, "Improved sense-amplifier-based flip-flop: Design and measurements," IEEE J. Solid-State Circuits, vol. 35, no. 6, pp. 876–883, Jun. 2000.

[6]. S. D. Naffziger, G. Colon-Bonet, T. Fischer, R. Riedlinger, T. J. Sullivan, and T. Grutkowski, "The implementation of the Itanium 2 microprocessor," IEEE J. Solid-State Circuits, vol. 37, no. 11, pp. 1448–1460, Nov. 2002.

[7]. J. Tschanz, S. Narendra, Z. P. Chen, S. Borkar, M. Sachdev, and V. De, "Comparative delay and energy of single edge-triggered & dual edge-triggered pulsed flip-flops for high-performance microprocessors," in Proc. ISLPED, Huntington Beach, CA, Aug. 2001, pp. 207–212.

[8]. P. Zhao, J. McNeely, P. Golconda, M. A. Bayoumi, W. D. Kuang, and B. Barcenas, "Low power clock branch sharing double-edge triggered flip-flop," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 15, no. 3, pp. 338–345, Mar. 2007.