
3rd Seminar on Science and Technology, 10-11 August 2004, Kota Kinabalu, Sabah

A NEW 16-BITS RISC PROCESSOR ARCHITECTURE:

CONTROLLER STATE MACHINES AND FUNCTIONAL VERIFICATION

USING VERILOG™ HDL

Ismail Saad, Pukhraj Vaya, Abu Bakar A.R, Wan Hoong Wai

School of Engineering and Information Technology University Malaysia Sabah, Locked Bag 2073, 88999 Kota Kinabalu, Sabah, Malaysia

Tel: +60-8-832-0000 x 3147/3066, Fax: +60-8-832-0348 (e-mail: [email protected], [email protected] , [email protected])

ABSTRACT

This paper presents the design and simulation of a new 16-bit RISC microprocessor architecture, with an emphasis on the Controller State Machine (CSM) model and on verification of the processor functionality using the Verilog Hardware Description Language (HDL). The processor system consists of ROM, RAM, I/O and a CPU. The CPU module is merely a shell that instantiates the real processor definition in the cpu_core.v, control.v, datapath.v and alu.v modules. The design and verification of the CSM, which is the core mechanism of the control unit, are described in detail. The processor offers 36 instructions to the programmer. Functional verification is carried out with the VCS™ (Verilog Code Simulator) simulator by executing all 36 instructions, four of which are discussed in this paper.

Key words: Verilog HDL, RISC, Datapath, Behavioural Model, VCS Simulator

1.0 Introduction

Microprocessors are not limited to personal computers; they are also used in specific fields such as robotics, communications and control systems [1-5]. However, the existing process of designing a very large scale IC, such as a new microprocessor for a specific application, is complicated, time consuming and prone to human error. We have therefore adopted a design methodology based on the Verilog HDL (Hardware Description Language) for our new 16-bit RISC microprocessor architecture. Verilog simplifies the design process by allowing the designer to describe the design at the highest levels of abstraction (behavioral and register transfer level) [8-10]. The design can be tested by simulation before it is sent for fabrication, which saves cost and time [6-7]. The success of such a processor design nevertheless depends mainly on the accurate design of the state controller in the system control unit [12-14]. In this paper, we present the design and simulation of a new 16-bit processor architecture based on an HDL design methodology using Verilog, with an emphasis on the design and verification of the controller state machine and of the processor functionality.

2.0 Processor Architecture

The new 16-bit RISC processor has a multiplexed 16-bit data and address path. Instructions have variable length: one word for instructions that operate on registers only, and two words for register/memory and register/immediate instructions. The 16-bit instruction word consists of a 2-bit mode field, 1 bit each for the set condition (set_bit) and test condition (test_bit), a 3-bit ALU function (ALU_func), and 3 bits each for the destination register (Rd), source 1 register (Rs1) and source 2 register (Rs2). The processor executes 36 instructions, grouped into two types: arithmetic/logical and load/store. There are six registers: three general-purpose registers (R1, R2, R3) and three dedicated registers, namely PC (Program Counter), IR (Instruction Register) and DR (Direct Register). In addition, a dummy register R0 (always zero) is included in the register file, following RISC convention [15].
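The instruction word format described above can be sketched as a small field decoder. This is an illustration only: the module name ir_decode, the port names and, in particular, the bit positions (mode bits at the top of the word, register fields at the bottom, set_bit above test_bit) are assumptions, since the paper lists the field widths but not the authors' source for the layout.

```verilog
// Hypothetical instruction field decoder. Field widths follow the text
// (2 + 1 + 1 + 3 + 3 + 3 + 3 = 16 bits); the bit positions are assumed.
module ir_decode (
    input  wire [15:0] ir,        // 16-bit instruction word
    output wire [1:0]  mode_bit,  // 2-bit mode field
    output wire        set_bit,   // set-condition bit
    output wire        test_bit,  // test-condition bit
    output wire [2:0]  alu_func,  // 3-bit ALU function
    output wire [2:0]  rd,        // destination register
    output wire [2:0]  rs1,       // source 1 register
    output wire [2:0]  rs2        // source 2 register
);
    assign mode_bit = ir[15:14];
    assign set_bit  = ir[13];
    assign test_bit = ir[12];
    assign alu_func = ir[11:9];
    assign rd       = ir[8:6];
    assign rs1      = ir[5:3];
    assign rs2      = ir[2:0];
endmodule

// Minimal test bench: decode one instruction word and print the fields.
module ir_decode_tb;
    reg  [15:0] ir;
    wire [1:0]  mode_bit;
    wire        set_bit, test_bit;
    wire [2:0]  alu_func, rd, rs1, rs2;

    ir_decode dut (.ir(ir), .mode_bit(mode_bit), .set_bit(set_bit),
                   .test_bit(test_bit), .alu_func(alu_func),
                   .rd(rd), .rs1(rs1), .rs2(rs2));

    initial begin
        ir = 16'h4040;   // instruction word quoted in Section 4.2
        #1;
        $display("mode=%b set=%b test=%b func=%b rd=%0d rs1=%0d rs2=%0d",
                 mode_bit, set_bit, test_bit, alu_func, rd, rs1, rs2);
        $finish;
    end
endmodule
```

Under this assumed layout, the word 4040 quoted in Section 4.2 decodes to mode 01 with Rd = 1 and Rs1 = 0, which agrees with its description there as a register and immediate operation, but that agreement does not confirm the positions of the remaining fields.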

3. Processor Verilog Module Systems

The top module of the processor system is defined in system.v. It consists of the CPU, 256 words of ROM (addresses 0-255), 256 words of RAM (addresses 256-511), an I/O module with a bank of 16 switches (mapped at address 512) and a bank of 16 LEDs (mapped at address 513), a transparent address latch that stores the address, and a decoder module that selects the ROM, RAM or I/O module. The next level is cpu.v, which is merely a shell that simulates the pad ring and instantiates the real processor architecture defined in cpu_core.v. The cpu_core.v module instantiates the control.v and datapath.v definitions. Finally, datapath.v instantiates the ALU defined in alu.v, as shown in Fig.1 below. A monitor.v module is also written to monitor the activity of the processor design. The control.v, datapath.v, monitor.v and alu.v modules include opcodes.v, which defines the operation codes and operands of the processor architecture.

cpu.v
    cpu_core.v
        control.v
        datapath.v
            alu.v

Fig.1: Verilog Module Structure for Processor
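The memory map given in this section (ROM at 0-255, RAM at 256-511, switches at 512, LEDs at 513) implies a simple address decoder. The following is a minimal sketch of such a decoder, not the authors' system.v: the module name addr_decode and all port names are hypothetical.

```verilog
// Hypothetical address decoder for the memory map described in the text.
module addr_decode (
    input  wire [15:0] addr,        // latched address
    output wire        sel_rom,     // addresses 0-255
    output wire        sel_ram,     // addresses 256-511
    output wire        sel_switch,  // address 512
    output wire        sel_led      // address 513
);
    assign sel_rom    = (addr <= 16'd255);
    assign sel_ram    = (addr >= 16'd256) && (addr <= 16'd511);
    assign sel_switch = (addr == 16'd512);
    assign sel_led    = (addr == 16'd513);
endmodule

// Minimal self-checking test bench for one address in each region.
module addr_decode_tb;
    reg  [15:0] addr;
    wire        sel_rom, sel_ram, sel_switch, sel_led;

    addr_decode dut (.addr(addr), .sel_rom(sel_rom), .sel_ram(sel_ram),
                     .sel_switch(sel_switch), .sel_led(sel_led));

    // Compare the four select lines against an expected one-hot pattern.
    task check;
        input [15:0] a;
        input [3:0]  expected;   // {rom, ram, switch, led}
        begin
            addr = a;
            #1;
            if ({sel_rom, sel_ram, sel_switch, sel_led} !== expected)
                $display("MISMATCH at address %0d", a);
            else
                $display("address %0d ok", a);
        end
    endtask

    initial begin
        check(16'd0,   4'b1000);
        check(16'd259, 4'b0100);   // address used by the store/load tests
        check(16'd512, 4'b0010);
        check(16'd513, 4'b0001);
        $finish;
    end
endmodule
```

Address 259, which the store and load tests in Section 5 use, falls in the RAM range under this map.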

4. Processor Control Unit Design

The control unit is the core of the microprocessor. It takes as input the signals needed to operate the controller and produces as output all the control signals needed to carry out each operation. Its two main functions are therefore to execute operations in the proper sequence by means of the CSM, and to interpret the instruction words and generate the control signals that cause each instruction to be executed. Our control unit consists of a 16-bit Instruction Register (IR), a 1-bit zero flag register, the Controller State Machine with its memory-cycle sub states, and the various generated control signals, as illustrated in Fig.2.

Fig.2: Processor Control Module Architecture (block diagram of the CONTROL module showing the 16-bit IR with its ModeBit, Opcode, Rd, Rs1 and Rs2 fields; the zero flag register zero_flag_reg; the state register with 0: Fetch1, 3: Fetch2, 1: Execute; the sub_state register with 0: address_setup, 1: address_hold, 3: data_setup, 2: data_hold; and signals labelled Function, Zero, ALUfunc, setbit, testbit, TrisPC, TrisALU, TrisRs2, TrisRd, nTrisRd, PC_inc, Rs2_sel, WriteR1 to WriteR3, WritePC, LoadPC, LoadDR, ReadPC_1, ReadR0_1 to ReadR3_1, ReadPC_2 and ReadR0_2 to ReadR3_2)

4.1 Controller State Machine

The CSM has three states, Fetch1 (00), Fetch2 (11) and Execute (01), which are encoded in Gray code. The controller state machine is a Mealy machine of the kind described in [1,14]. The state transitions are shown in the state diagram in Fig.3.

Fig.3: Controller State Machine State Diagram. Each arc is labelled with the data_hold condition and the mode bits:

    Fetch1 to Fetch1: TRUE, 00 or FALSE, 00/01
    Fetch1 to Execute: TRUE, 01
    Fetch1 to Fetch2: TRUE, 10/11
    Fetch2 to Fetch2: FALSE, 10/11
    Fetch2 to Execute: TRUE, 10/11
    Execute to Execute: FALSE, XX
    Execute to Fetch1: TRUE, XX

Page 5: A NEW 16-BITS RISC PROCESSOR ARCHITECTUREke26604.weebly.com/.../8/...a_new_16-bits_risc_processor_architectu… · The processor offers 36 types of instruction to be used by the programmer

3rd Seminar on Science and Technology, 10-11 August 2004, Kota Kinabalu, Sabah

In addition, each memory cycle has four sub states: address_setup (00), address_hold (01), data_setup (11) and data_hold (10). Transitions from one state to another are determined by the data_hold sub state of the memory cycle and by the 2-bit mode field of the instruction.

In Fig. 3, TRUE or FALSE indicates whether the sub state is data_hold, the 2-bit value (00, 01, 11, 10) is the value of the mode bits, and XX denotes a don't-care condition. The state transitions are explained in detail below.
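The four memory-cycle sub states can be sketched as a small Gray-coded sequencer. This sketch assumes that the sub states advance once per clock in the order listed (which matches the stated 4 clock cycles per memory cycle); the module name sub_state_seq, the synchronous reset and the is_data_hold output are hypothetical and are not taken from the authors' control.v.

```verilog
`define address_setup 2'b00
`define address_hold  2'b01
`define data_setup    2'b11
`define data_hold     2'b10

// Hypothetical sub-state sequencer: one sub state per clock, Gray order.
module sub_state_seq (
    input  wire       clk,
    input  wire       reset,         // assumed synchronous reset
    output reg  [1:0] sub_state,
    output wire       is_data_hold   // TRUE in the last sub state of a cycle
);
    assign is_data_hold = (sub_state == `data_hold);

    always @(posedge clk)
        if (reset)
            sub_state <= `address_setup;
        else
            case (sub_state)
                `address_setup: sub_state <= `address_hold;
                `address_hold:  sub_state <= `data_setup;
                `data_setup:    sub_state <= `data_hold;
                default:        sub_state <= `address_setup;  // after data_hold
            endcase
endmodule

// Test bench: print two memory cycles (eight clocks) of the sequence.
module sub_state_seq_tb;
    reg        clk = 0;
    reg        reset = 1;
    wire [1:0] sub_state;
    wire       is_data_hold;
    integer    i;

    sub_state_seq dut (.clk(clk), .reset(reset),
                       .sub_state(sub_state), .is_data_hold(is_data_hold));

    always #5 clk = ~clk;

    initial begin
        @(negedge clk);
        reset = 0;                       // release reset between clock edges
        for (i = 0; i < 8; i = i + 1) begin
            $display("sub_state=%b data_hold=%b", sub_state, is_data_hold);
            @(negedge clk);
        end
        $finish;
    end
endmodule
```

Each step of the sequence 00, 01, 11, 10 and back to 00 changes exactly one bit, which is the property the Gray-code assignment is meant to provide.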

Fetch1 state case:

Fetch1 remains in its current state when data_hold is TRUE and the mode bits are 00. It moves to Execute when data_hold is TRUE and the mode bits are 01, and to Fetch2 when data_hold is TRUE and the mode bits are 10 or 11. When data_hold is FALSE and the mode bits are 00 or 01, Fetch1 remains in its current state.

Execute state case:

Execute remains in its current state while data_hold is FALSE, whatever the mode bits (XX). When data_hold is TRUE, again for any mode bits, the next state is Fetch1.

Fetch2 state case:

Load and Store operations use both the Fetch2 and Execute states. If data_hold is TRUE and the mode bits are 10 or 11, the next state is Execute. Otherwise, if data_hold is FALSE and the mode bits are 10 or 11, the state remains Fetch2.

In general, the Fetch1 state alone serves register and register instructions, which take 1 memory cycle, or 4 clock cycles. The Execute state is added for register and immediate instructions, which take 2 memory cycles, or 8 clock cycles. Load and Store instructions, the longest type, use the Fetch2 and Execute states as well and take 3 memory cycles, or 12 clock cycles. Hence every instruction completes within 12 clock cycles. Gray code is used for the state assignment so that a state transition changes only one bit where possible, which reduces glitches while the bits are changing. With the encoding above, the transition from Fetch1 (00) to Fetch2 (11) is the one exception, since it changes both bits.

The controller state machine is coded in Verilog using a case statement, as follows:

    case (state)
        `Fetch1:
            if (sub_state == `data_hold && (ModeBit == 2'b00))
                state <= `Fetch1;     // register and register: instruction done
            else if (sub_state == `data_hold && ModeBit == 2'b01)
                state <= `Execute;    // register and immediate
            else if (sub_state == `data_hold && ((ModeBit == 2'b10) || (ModeBit == 2'b11)))
                state <= `Fetch2;     // load/store
            else if (ModeBit == 2'b01 || ModeBit == 2'b00)
                state <= `Fetch1;     // hold until data_hold
        `Fetch2:
            if (sub_state == `data_hold && ((ModeBit == 2'b10) || (ModeBit == 2'b11)))
                state <= `Execute;
            else if (ModeBit == 2'b10 || ModeBit == 2'b11)
                state <= `Fetch2;     // hold until data_hold
        `Execute:
            if (sub_state == `data_hold)
                state <= `Fetch1;
            else
                state <= `Execute;    // hold until data_hold
    endcase
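The case statement above can be wrapped in a clocked module and exercised with a small test bench. The following is a sketch under stated assumptions, not the authors' control.v: the module name csm, the synchronous reset, the port list, the default branch for the unused encoding 10 and the test bench csm_tb are all hypothetical. The explicit hold branches are folded into a single data_hold check, which yields the same transitions as the state cases listed in this section.

```verilog
`define Fetch1    2'd0
`define Execute   2'd1
`define Fetch2    2'd3
`define data_hold 2'd2

// Hypothetical standalone controller state machine.
module csm (
    input  wire       clk,
    input  wire       reset,      // assumed synchronous reset
    input  wire [1:0] sub_state,  // memory-cycle sub state
    input  wire [1:0] ModeBit,    // mode field of the instruction
    output reg  [1:0] state
);
    always @(posedge clk) begin
        if (reset)
            state <= `Fetch1;
        else if (sub_state == `data_hold)     // state changes only at data_hold
            case (state)
                `Fetch1:
                    if (ModeBit == 2'b01)   state <= `Execute;
                    else if (ModeBit[1])    state <= `Fetch2;   // 10 or 11
                    else                    state <= `Fetch1;   // 00
                `Fetch2:
                    if (ModeBit[1])         state <= `Execute;  // 10 or 11
                `Execute:                   state <= `Fetch1;
                default:                    state <= `Fetch1;   // unused code 10
            endcase
    end
endmodule

module csm_tb;
    reg        clk = 0;
    reg        reset = 1;
    reg  [1:0] sub_state = 2'd0;
    reg  [1:0] ModeBit = 2'b00;
    reg        done;
    wire [1:0] state;
    integer    clocks;

    csm dut (.clk(clk), .reset(reset), .sub_state(sub_state),
             .ModeBit(ModeBit), .state(state));

    always #5 clk = ~clk;

    // Sub states advance once per clock in Gray order: 0, 1, 3, 2, 0, ...
    always @(posedge clk)
        if (reset) sub_state <= 2'd0;
        else case (sub_state)
            2'd0:    sub_state <= 2'd1;
            2'd1:    sub_state <= 2'd3;
            2'd3:    sub_state <= 2'd2;
            default: sub_state <= 2'd0;
        endcase

    // Count clocks from the start of Fetch1 until the machine is back at
    // the start of Fetch1, that is, one complete instruction.
    task run_instruction;
        input [1:0]   mode;
        input integer expected;
        begin
            ModeBit = mode;
            clocks  = 0;
            done    = 0;
            while (!done) begin
                @(posedge clk); clocks = clocks + 1;
                @(negedge clk);           // sample after the edge has settled
                done = (state == `Fetch1) && (sub_state == 2'd0);
            end
            if (clocks == expected)
                $display("mode %b: %0d clocks (ok)", mode, clocks);
            else
                $display("mode %b: %0d clocks (MISMATCH)", mode, clocks);
        end
    endtask

    initial begin
        @(negedge clk); @(negedge clk);
        reset = 0;                        // release reset between clock edges
        run_instruction(2'b00, 4);        // register and register
        run_instruction(2'b01, 8);        // register and immediate
        run_instruction(2'b10, 12);       // load/store
        run_instruction(2'b11, 12);       // load/store
        $finish;
    end
endmodule
```

In this model the mode bits are held constant for the whole instruction, whereas in the real processor they come from the IR once the instruction word has been fetched. The test bench therefore checks only the 4, 8 and 12 clock counts stated in the text, not the IR load timing.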

4.2 Verification of the Controller State Machine

The controller state machine is verified by simulating the whole control unit together with the instructions stored in the ROM. State transitions take place when data_hold is TRUE. Leaving aside the cases in which the state is held, the three state transitions are discussed in turn below.

The states and sub states of the processor are defined in the Verilog control module as follows:

    `define Fetch1 0
    `define Execute 1
    `define Fetch2 3
    `define address_setup 0
    `define address_hold 1
    `define data_setup 3
    `define data_hold 2


Fig. 4.0: Fetch1 (0) to Execute (1) state transition

Fig. 4.0 shows that the transition from Fetch1 (0) to Execute (1) happens when data_hold (3) is TRUE (2) and the mode bits equal 01. It also shows that the Execute operation (register and immediate) takes 8 clock cycles, as denoted by the c1 to c2 range.

Fig. 4.1: Execute (1) to Fetch1 (0) state transition

Fig. 4.1 shows that the transition from Execute (1) to Fetch1 (0) happens when data_hold (3) is TRUE (2) and the mode bits equal 00. A state transition occurs at the first data_hold TRUE because the previous instruction (4040) is an Execute operation, which takes 8 clock cycles, as denoted by the red line in Fig. 4.1. The Fetch1 operation (register and register) is the shortest type of instruction, taking only 4 clock cycles, as denoted by the c1 to c2 range.

Fig. 4.2: Fetch1 (0) to Fetch2 (3) state transition


Fig. 4.2 shows that the transition from Fetch1 (0) to Fetch2 (3) happens when data_hold is TRUE (2) and the mode bits equal 11. The executed instruction (c00b) is of the longest type. It is used for store and load operations, requires the Fetch2 (3) and Execute (1) states, and needs a total of 12 clock cycles to execute, as denoted by the c1 to c2 range.

5. Processor Functionality Verification

Processor functionality is verified for the basic operations, which include arithmetic, logic and shift operations. The architecture offers 36 instructions. At the simulation level, the functionality of the processor is verified through the timing diagram of each module, as generated in the VCS simulator windows. Four instructions are shown here as examples: Register + Immediate, Register + Register, Store and Load.

5.1 Register + Immediate value operation test

Rd ← Rs1 Addi Imm // R1 ← R0 + 259(103hex);

This instruction verifies the add operation between a register and an immediate value. Here the immediate value (259) is added to R0, which is always zero, so the immediate value itself is stored into Register1 (R1). Details of the process are shown in Fig.5.0.

Fig.5.0: Timing Diagram of Register + Immediate Operation


5.2 Register + Register operation test

Rd ← Rs1 Addr Rs2 // R3 ← R1 + R2;

This instruction verifies the add operation between registers. In this example Register1 and Register2 are added and the result is stored into Register3. The value 259 (0103hex) in Register1 is added to the value 93 (005dhex) previously stored in Register2, and the result, 352 (0160hex), is stored into Register3. Details of the process are shown in Fig.5.1.
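The arithmetic in the two add tests can be checked with a few lines of behavioral Verilog. This is only a worked check of the numbers quoted in the text, using plain 16-bit variables named after the registers; it does not model the authors' datapath or ALU.

```verilog
// Worked check of the values used in the 5.1 and 5.2 tests.
module add_tests;
    reg [15:0] R0, R1, R2, R3;

    initial begin
        R0 = 16'd0;          // dummy register, always zero
        R1 = R0 + 16'd259;   // 5.1: R1 <- R0 + 259
        R2 = 16'h005d;       // 93, stored beforehand
        R3 = R1 + R2;        // 5.2: R3 <- R1 + R2
        $display("R1 = %0d (%h hex)", R1, R1);
        $display("R3 = %0d (%h hex)", R3, R3);
        if (R1 !== 16'h0103 || R3 !== 16'h0160)
            $display("MISMATCH");
        $finish;
    end
endmodule
```

The sum 259 + 93 = 352 fits easily in 16 bits, so no carry or overflow handling is exercised by this test pair.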

Fig.5.1: Timing Diagram of Register + Register Operation

5.3 Store operation test

mem[Rs1 + Imm] ← Rd // mem[R0 + 259] ← R1;

This instruction verifies the store operation. In this example the content of Register2 is stored into memory at location [259]. Register0 (R0) is a dummy register and is always 0. After the Write signal is enabled, the content of Register2 is stored at memory address [259]. Details of the process are shown in Fig.5.3.

The content of memory at location [259] after the store operation is now equal to 10, as shown in Fig.5.4.

Fig.5.3: Timing Diagram of Store Operation

Fig.5.4: Interactive Display of Memory Content

5.4 Load operation test

Rd ← mem[Rs1 + Imm] // R2 ← mem[R0 + 259];

This instruction verifies the load operation. In this example the content of memory at location [259] is loaded into the destination register (Register2). Details of the process are shown in Fig.5.5.


Fig.5.5: Timing Diagram of Load Operation
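The store and load tests of Sections 5.3 and 5.4 can be mimicked with a small behavioral RAM model. This is a sketch only: the module name ram256, its ports and the registered read are assumptions, the multiplexed bus and memory-cycle timing of the real system are not modelled, and the stored value 10 is simply the memory content reported in the text.

```verilog
// Hypothetical 256-word RAM occupying system addresses 256-511.
module ram256 (
    input  wire        clk,
    input  wire        write,   // assumed write enable
    input  wire [15:0] addr,    // system address, 256-511 selects this RAM
    input  wire [15:0] wdata,
    output reg  [15:0] rdata
);
    reg [15:0] mem [0:255];

    always @(posedge clk) begin
        if (write)
            mem[addr[7:0]] <= wdata;   // 256-511 maps to words 0-255
        rdata <= mem[addr[7:0]];       // registered read
    end
endmodule

// Test bench: store to mem[R0 + 259], then load it back into R2.
module ram256_tb;
    reg         clk = 0;
    reg         write = 0;
    reg  [15:0] addr = 16'd0, wdata = 16'd0;
    reg  [15:0] R0, R2;
    wire [15:0] rdata;

    ram256 dut (.clk(clk), .write(write), .addr(addr),
                .wdata(wdata), .rdata(rdata));

    always #5 clk = ~clk;

    initial begin
        R0 = 16'd0;                      // dummy register, always zero
        // Store: mem[R0 + 259] <- 10
        @(negedge clk);
        addr = R0 + 16'd259; wdata = 16'd10; write = 1;
        @(negedge clk);
        write = 0;
        // Load: R2 <- mem[R0 + 259]
        @(negedge clk);
        R2 = rdata;
        $display("R2 = %0d", R2);
        if (R2 !== 16'd10)
            $display("MISMATCH");
        $finish;
    end
endmodule
```

Driving the inputs on the falling clock edge keeps them stable around the rising edge on which the model writes and reads, which avoids races in the test bench.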

6. Conclusions

A new 16-bit processor architecture has been successfully designed using an HDL-based methodology and fully simulated with the VCS simulator in the Synopsys tools in order to verify the processor functionality. The success of the processor depends on the state controller of the system. As presented in this paper, the controller state machine is designed to control the state transitions, and its functionality is verified by executing the 36 instructions, four of which are explained in detail as test bench cases.


7. References

[1] D.D Gajski, Principles of Digital Design, Prentice Hall, 1997.
[2] M. Zwolinski, Digital System Design with VHDL, Prentice Hall, 2000.
[3] D. A. Patterson & J.L. Hennessy, Computer Organization and Design - The Hardware/Software Interface, Morgan Kaufmann, 1999.
[4] M. Morris Mano, Digital Logic and Computer Design, Prentice Hall, 1997.
[5] G.H Miller, Microcomputer Engineering, 2nd edition, Prentice Hall, 1998.
[6] Dally, W-J. Chang, A. The Role of Custom Design In ASIC Chips, Proceedings of the 37th conference on design automation, ACM Press, pg 643-647, 2000.
[7] Flynn, M-J. Winner, R-I. ASIC microprocessor, Proceedings of the 22nd annual International Workshop on Microprogramming and Microarchitecture, ACM Press, pg 237-243, 1989.
[8] Samir Palnitkar, Verilog HDL A Guide to Digital Design and Synthesis, Prentice Hall, 1995.
[9] Lioupis, D. Papagiannis, A. Psihogiou, D., A Systematic approach to software peripherals for embedded system, Proceedings of the ninth International symposium on hardware/software codesign, ACM Press, pg 14-145, 2001.
[10] J.C Diaz, P. Plaza, L.A. Merayo, P. Scarfone, M. Zamboni, Design and validation with HDL of a complex input/output processor for an ATM switch: the CMC, Verilog HDL conference, Proceedings, pg 67-71, 1995.
[11] A.E Mahdi, I.A Grout, PLL based ASIC system for DSP real-time analogue interface, www.ece.ul.ie/hompage/ian_grout/publications.html, 2002.
[12] M.G Arnold, T.A Bailey, J.R Cowles, J.J Cupal, A.W Wallace, A purely data structure for accurate high level timing simulation of synchronous designs, Verilog HDL Conference, pg 101-107, 1994.
[13] O. Hebert, I.C Kraljic, Y. Savaria, A Method to Derive Application-Specific Embedded Processing Cores, International Conference on Hardware Software Codesign, San Diego, California, United States, ACM Press, pg 88-92, 2000.
[14] S. Golson, State machine design technique for Verilog and VHDL, Synopsys Journal of High-Level Design, pg 1-20, September 1994.
[15] D.A Patterson, C.H Sequin, RISC 1: A Reduced Instruction Set VLSI Computer, International Symposium on Computer Architecture (selected paper), Spain, pg 216-230, 1998.