Santimoy Mandal, Shyam Sundar Prasad / International Journal of Engineering Research and
Applications (IJERA) ISSN: 2248-9622 www.ijera.com
Vol. 2, Issue 5, September- October 2012, pp.1285-1289
Double Pass-Transistor Logic for High Performance,
Low Latency Wave Pipeline Circuit
Santimoy Mandal
Dept. of Electronics and Communication Engineering, RVS College of Engineering and Technology
Jamshedpur, India
Shyam Sundar Prasad
Dept. of Electronics and Communication, National Institute of Technology, Jamshedpur, India
Abstract: High-throughput, low-latency designs are required in modern high-performance systems, especially for signal processing applications. Existing logic families cannot provide both simultaneously. We propose Double Pass Transistor Logic (DPL), which can be used as a universal logic to provide the finest-grain pipelining without affecting overall latency or increasing the area. It does not require any special process steps and hence can be realized in a normal process technology, in contrast to the CPL proposed by Yano et al. [2], which uses threshold voltage adjustment of selected devices. The design procedure is described for (a) low latency, (b) high throughput and (c) low area requirements. In addition to these advantages, it is envisioned that DPL designs can also be used to build ultra-high-speed pipelined systems without pipelining latches, viz., wave pipelined digital systems, where the achievable throughput is beyond that permitted by the delay of a pipeline stage.
I. INTRODUCTION
High speed adders and multipliers are required to meet the demands of signal processing and multimedia applications. Wave pipelining, or “maximal rate pipelining” [1], is a design method that can increase the throughput of a combinational circuit. In conventional pipelining, the combinational circuit is broken into smaller blocks, or pipeline stages, and synchronizing elements such as D flip-flops are used as storage elements. The maximum speed is limited by the number of pipe stages, the size of the pipe stages and the complexity of the clock distribution network.
In the wave pipelining approach, flip-flops are not used as storage elements between pipeline stages. Instead, the internal capacitances of the gates are used for storing the intermediate values [1] [3] [4]. Eliminating the storage elements reduces area considerably and minimizes power. It also eliminates clock distribution and clock skew problems, as no clock signal is required within the combinational block. New inputs can be applied to the circuit before the outputs are available, effectively allowing multiple waves of data to propagate coherently through the circuit.
Wave pipelining requires all paths from the inputs to the outputs to be balanced. This is achieved by inserting active delay buffers in the paths that have fewer gates than the longest path from the input to the output. The rough tuning method [6] ensures that the gate count along all the paths is the same. However, a rough-tuned circuit is still not balanced, as different fan-outs are bound to produce different delays. The absence of synchronizing elements in the wave pipelined circuit could lead to collisions between adjacent waves of data. The clock period should be such that the waves do not collide with each other, giving the gates enough time to complete their task. The pipe stages in a wave pipelined circuit are composed of single gates, and the load capacitances of the gates are used for storage. The load capacitance may vary for different gates in the same stage depending on the fan-outs. Different load capacitances result in different rise and fall times for the driver gates. This delay variation is reduced by fine tuning [5] [6]. Fine tuning involves sizing the transistors in the output inverters of the driver gate to balance the delay. Once fine tuned, the circuit can be clocked at its maximum speed, limited only by the delay.
Section II discusses the timing constraints of wave pipelining and the features that basic gates designed for wave pipelining must have. Section III gives an overview of the existing logic styles for wave pipelining. The limitations of the logic styles and the tuning methods are also discussed. Section IV presents the performance of basic gates highly suitable for wave pipelining. The power analysis of an 8-bit multiplier is presented in Section V. Section VI presents the conclusion and further research directions.
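The rough tuning step described in this introduction (equalizing the gate count along every input-to-output path by inserting delay buffers) can be illustrated with a minimal sketch. The netlist representation, the function name `rough_tune` and the example circuit are assumptions made for illustration; they are not taken from the paper or from reference [6].

```python
def rough_tune(fanin, outputs):
    """Return the number of delay buffers to insert on each connection.

    fanin   -- dict mapping a gate name to the list of nodes driving it;
               primary inputs do not appear as keys (hypothetical format).
    outputs -- list of gate names that are primary outputs.
    """
    depth = {}

    def level(node):
        # Level = gate count on the longest path from any primary input.
        if node not in depth:
            drivers = fanin.get(node, [])
            depth[node] = 0 if not drivers else 1 + max(level(d) for d in drivers)
        return depth[node]

    buffers = {}
    # Pad every connection so that each driver appears exactly one level
    # below the gate it feeds.
    for gate, drivers in fanin.items():
        for driver in drivers:
            pad = level(gate) - level(driver) - 1
            if pad > 0:
                buffers[(driver, gate)] = pad

    # Pad shallow outputs so that all outputs sit at the same level.
    deepest = max(level(o) for o in outputs)
    for o in outputs:
        pad = deepest - level(o)
        if pad > 0:
            buffers[(o, "OUT")] = pad
    return buffers


# Hypothetical three-gate chain: g1 = f(a, b), g2 = f(g1, c), g3 = f(g2, a)
example = {"g1": ["a", "b"], "g2": ["g1", "c"], "g3": ["g2", "a"]}
print(rough_tune(example, outputs=["g3"]))
```

After this step every path has the same gate count, but the delays still differ with fan-out, which is why fine tuning is needed afterwards.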
II. TIMING CONSTRAINTS IN WAVE PIPELINING
Wave pipelined circuits can be clocked at a much higher frequency than conventionally pipelined circuits because their maximum rate is limited only by the path delay difference instead of the maximum path delay. The minimum clock period for a wave pipelined circuit [7] can be represented by
Tcp ≥ MAX[∆Tp + 2∆C + Tsh + Trf, ∆Tx + ∆C + Tms + Trf]
where Tcp is the clock period of the circuit, ∆Tp is the difference between the longest and shortest paths in the circuit, ∆C is the worst-case clock skew, Tsh is the setup plus hold time for the registers, Trf is the
worst-case rise/fall time at the last logic stage, ∆Tx is the propagation delay of the longest path from the input to signal X at any intermediate node, and Tms is the minimum time that X must be stable for the next stage of logic to operate correctly. The operating speed is limited by the difference in delay between the shortest and the longest paths, not by the total delay of the circuit as in conventional pipelining. The goal of the design process is to reduce ∆Tp and ∆Tx as much as possible, since known methods exist for reducing the other parameters.
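As a small worked illustration of the clock period bound given in this section, the sketch below evaluates both terms of the MAX expression. The function name and the numerical values are hypothetical and chosen only to show the arithmetic; they are not measurements from this work.

```python
def min_clock_period(d_tp, d_tx, d_c, t_sh, t_rf, t_ms):
    """Lower bound on Tcp; all arguments share one time unit (e.g. ps).

    d_tp -- delta Tp, longest minus shortest path delay
    d_tx -- delta Tx, the intermediate-node term of the bound
    d_c  -- delta C, worst-case clock skew
    t_sh -- Tsh, register setup plus hold time
    t_rf -- Trf, worst-case rise/fall time at the last logic stage
    t_ms -- Tms, minimum stable time needed at an intermediate node
    """
    register_bound = d_tp + 2 * d_c + t_sh + t_rf
    internal_bound = d_tx + d_c + t_ms + t_rf
    return max(register_bound, internal_bound)


# Hypothetical values in picoseconds
tcp = min_clock_period(d_tp=100, d_tx=80, d_c=20, t_sh=60, t_rf=45, t_ms=50)
print(tcp)  # 245: the register-side term (100 + 40 + 60 + 45) dominates
```

With these numbers, halving ∆Tp from 100 to 50 would lower the bound to 195, at which point both terms are equal; this is the sense in which balancing the paths raises the achievable clock rate.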
III. EXISTING LOGIC STYLES FOR WAVE PIPELINING
For a balanced wave pipelined circuit, the gates should not have input-dependent or fan-out-dependent delay. All the gates in a particular logic family should have the same delay. Conventional static CMOS is the most preferred logic among designers because of its high reliability. A 2-input NAND gate is shown in Fig. 1. The architecture of the basic gates results in input-dependent and functionality-dependent delays. Researchers have proposed several design styles that satisfy the timing constraints of wave pipelining.
Fig. 1 Different CMOS NAND logic style
A. Dual rail logic styles
Normal Process Complementary Logic (NPCPL) [9] and Wave pipeline Transmission Gate Logic (WTGL) [3] are dual rail logic styles used for wave pipelining. NPCPL and WTGL are based on pass transistors, and DRSCMOS is based on static CMOS. In NPCPL, a basic building block is used to develop all basic gates by properly choosing the input signals Ai, Aj and B (for an AND/NAND gate (XY/XY), Ai = X, Aj = Y and B = Y). The poor conduction of logic 1 by NMOS transistors in NPCPL results in voltage degradation and poor noise margin. WTGL gates use transmission gates to obtain full logic swing and better noise margin, but static power dissipation is present because of the use of NMOS. WTGL and NPCPL are fast because of the high logic functionality and low input capacitance of separate circuit paths for each possible input combination, thus eliminating pass transistors. Dual rail logic styles are multi-functional in nature, and all the basic gates have the same delay. Systems designed with dual rail styles can be rough tuned because of the similarity in the basic architecture and the availability of “DELAY” gates. All the gates have output inverters for fine tuning purposes. WTGL and NPCPL have unbalanced input capacitances, resulting in complex fine tuning.
B. Double Pass transistor logic (DPL)
Suzuki et al. [8] proposed the double pass transistor logic [9] that overcomes the problems of CPL, namely voltage degradation and poor noise margin. DPL gates give improved circuit performance at reduced supply voltage because they use both NMOS and PMOS transistors. DPL gates are symmetrical, whereby the load in any DPL gate is distributed equally among the inputs. The DPL XOR/XNOR gate is perfectly symmetrical. The symmetrical arrangement and the double transmission property suggest that DPL gates will perform very efficiently in wave pipelined circuits. The PMOS and NMOS transistors are used such that a dual current path is set up for each input combination, resulting in the smallest equivalent resistance for DPL gates compared to other logic styles. In WTGL, there are two paths, but the same input is passed along both paths. In DPL, the inputs are different in the two paths, thereby distributing the load among the inputs. DPL was claimed to be the most energy-efficient logic style among the discussed logic styles by Uming Ko et al. [10]. The symmetrical input loading, double transmission property and energy efficiency of DPL gates make the DPL logic family the best suited logic style for wave pipelining.
IV. PERFORMANCE OF BASIC GATES
The power*delay product is a good measure for comparing the logic styles that are to be used in low power, high speed digital systems. The basic gates of all the logic styles were designed using TANNER EDA V.13 with TSMC 0.18 µm CMOS technology at a 2 V rail-to-rail power supply. Table I gives a measure of the power*delay product of the various styles used in wave pipelining. Power measurement was done using the non-invasive power measurement technique suggested by Kang [12]. The power*delay products show that DPL has the lowest power*delay product among the dual rail logic styles. The single rail logic styles have
low power because of the lower number of transistors and less switching activity.
Table I
Logic        Rise time  Fall time  Tr-Tp   τphl    τplh    Power dissipation  PDP
             Tr (ps)    Tp (ps)    (ps)                    (mW)
DPL NAND     44.46      44.19      0.27    32.65   29.50   0.344              10.68
NPCPL NAND   34.78      56.34      21.56   42.42   61.32   0.118              6.12
WTGL NAND    29.56      27.53      2.03    29.22   28.31   0.34               9.741
DVL NAND     69.77      70.53      0.56    51.62   69.02   6.40               386.048
CMOS NAND    40.73      41.92      1.19    33.16   27.43   2.03               55.68
Table I gives a measure of the power*delay product of the logic styles used in wave pipelining. Power measurement was done using the non-invasive power measurement technique suggested by Kang. Although the power*delay products of WTGL and NPCPL are low, NPCPL needs a threshold level restorer and has a low noise margin, and WTGL has constant static power dissipation due to PMOS.
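The power*delay figures can be cross-checked with a short calculation. The sketch below assumes that the product is formed from the mean of τphl and τplh multiplied by the power dissipation; the paper does not state the formula, so this is an assumption. It comes close to the reported values for the three rows used here and is not guaranteed to match every row. The delay and power numbers are read from Table I; the function name is hypothetical.

```python
def power_delay_product(t_phl_ps, t_plh_ps, power_mw):
    """PDP in fJ (ps * mW), assuming the mean of the two propagation delays."""
    mean_delay_ps = (t_phl_ps + t_plh_ps) / 2.0
    return mean_delay_ps * power_mw


# (t_phl, t_plh, power in mW, reported PDP) as read from Table I
TABLE_I = {
    "DPL NAND": (32.65, 29.50, 0.344, 10.68),
    "NPCPL NAND": (42.42, 61.32, 0.118, 6.12),
    "DVL NAND": (51.62, 69.02, 6.40, 386.048),
}

for name, (t_phl, t_plh, power, reported) in TABLE_I.items():
    computed = power_delay_product(t_phl, t_plh, power)
    print(f"{name}: computed {computed:.3f}, reported {reported}")
```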
A. Modification to the DPL gates
The design goal for easier fine tuning is to have balanced input capacitance; that is, the inputs of the gate should be perfectly symmetrical. The DPL AND/NAND gates and the DPL OR/NOR gates are not perfectly symmetrical. All the inputs in these gates are connected to the gates of one NMOS and one PMOS transistor, but the source connections are to either PMOS or NMOS. The drain capacitances of the NMOS and PMOS transistors are not the same because of the difference in transistor sizes and process parameters. Hence the gates are modified so that the GND and supply connections are replaced by primary inputs. A delay gate is necessary to develop a complete library of basic gates. Unlike the other gates, the delay gate has just one input, so fewer transistors are enough to design it. To achieve a dual current path for a DELAY/DELAY gate, transmission gates should be used. Dual current paths require that the transistors are on all the time. Hence the transistors should be driven by the supplies and not controlled by the inputs. The MUX/DMUX gate is the only gate where perfect symmetry could not be achieved. This is because the multiplexer is a three-input gate. The select input drives only the gates of the transistors, and the other two inputs have the same capacitance.
B. Performance of DPL basic gates
The power*delay product is a good measure for comparing the logic styles that are to be used in low power, high speed digital systems. The basic DPL gates shown in Table II were designed using the TANNER layout editor in 0.18 micron technology, and the simulations were done in TSpice using a 2 V supply.
Table II
Logic        Rise time  Fall time  Tr-Tp   τphl     τplh     Power dissipation  PDP
             Tr (ps)    Tp (ps)    (ps)                      (mW)
DPL NAND     44.46      44.19      0.27    32.65    29.50    0.344              10.68
DPL AND      59.68      72.56      12.88   27.25    42.06    0.118              8.17
DPL OR       50.14      24.32      25.82   90.96    52.25    0.114              8.133
DPL XNOR     42.69      43.93      1.24    82.58    107.07   0.147              13.93
DPL XOR      47.01      64.35      17.34   26.90    23.60    0.113              2.85
DPL NOR      52.47      36.00      16.47   117.35   75.50    0.361              34.80
DPL MUX      44.49      40.95      3.54    89.54    86.85    0.112              10.75
DPL DEMUX    25.27      43.05      17.78   15.57    26.56    0.336              7.07
DPL DELAY    81.27      78.74      2.98    200      203.40   0.227              45.78
C. Wallace tree multiplier
Several popular and well-known schemes with the objective of improving the speed of the parallel multiplier have been developed in the past. In 1964, C. S. Wallace observed that it is possible to find a structure which performs the addition
[11] Liu, W. et al., “A 250 MHz Wave Pipelined Adder in 2 µm CMOS,” IEEE Journal of Solid State Circuits, September 1994.
[12] Sung Mo Kang, “Accurate Simulation of Power Dissipation in VLSI Circuits,” IEEE Journal of Solid State Circuits, Vol. SC-21, No. 5, Oct. 1986.
[13] V. G. Oklobdzija and D. Villeger, “Improving Multiplier Design by Using Improved Column Compression Tree and Optimized Final Adder in CMOS Technology,” IEEE Transactions on VLSI Systems, Vol. 3, No. 2, June 1995, 25 pages.
[14] Z. Shun, O. A. Pfander, H.-J. Pfleiderer, and A. Bermak, “A VLSI architecture for a run-time multi-precision reconfigurable Booth multiplier,” in Proc. 14th IEEE Int. Conf. Electron., Circuits, Syst., Dec. 2007, pp. 975-978.