Basics of Assembly Programming
CS 590: Topics in Computer Science
Assignment 03: Basics of Assembly Programming
This assignment is about programming a microprocessor. For this we will use the SMS32V50
simulator. http://www.softwareforeducation.com/sms32v50/
You can run it natively on MS Windows. On a Mac or Linux machine you can use
Wine, a Windows compatibility layer. https://wiki.winehq.org/Download
Other alternatives are running Windows or a light clone of Windows like ReactOS that is
equivalent to Windows XP (Furthermore you can get a disk image for Windows XP SP3 here.
The product key is now freely available: M6TF9-8XQ2M-YQK9F-7TBB2-XGG88 or this other
key MRX3F-47B9T-2487J-KWKMF-RPWBY.) in a virtual machine like VirtualBox or VMWare, or
using one of the Windows machines at one of our labs and running the SMS32V50 directly on
any of them.
Exercises:
a. Write a program that will make the traffic lights work properly, like in a real situation
where one of the traffic lights is in green while the other is in red and vice versa, passing
through the yellow light in the process.
b. Write a program to make the stepper motor rotate two turns in one direction and half a
turn in the other direction, continuously.
c. Write a program that will make the seven segment displays show a countdown from 9 to
0 alternating between the left and the right display.
d. Fix the program 99Heater.asm so that the temperature will stay at 21 °C.
e. Also fix the program lift.asm to avoid crashes.
Name your programs using the following format:
a. Lastname_Firstname_tlights.asm
b. Lastname_Firstname_stepperMotor.asm
c. Lastname_Firstname_7segmDisplays.asm
d. Lastname_Firstname_99Heater.asm
e. Lastname_Firstname_lift.asm
Use the online manual as a reference as well as the manual included in the download.
http://www.softwareforeducation.com/sms32v50/sms32v50_manual/index.htm
Questions
1. Explain what an ISA (Instruction Set Architecture) is.
2. Why is assembly language not portable?
3. Why is assembly language so important?
SUBMISSION
Write a report that includes a screenshot of each of your programs running and explains
how each one works. Comment your code as well. Upload your source code files and the
report (as a DOCX or PDF document) to Western Online.
In the report, include the answers to the questions, writing the corresponding numbers and
questions (or at least the numbers) in bold and in the proper order before every answer. At the
end, include a conclusion (properly labeled as such) explaining the importance of all this.
Use the following format to name your document file:
YourLastName-YourFirstName_CS590_Assignment_03 (or docx extension)
After submission verify that the upload was successful and with the correct file(s).
Do not submit compressed files.
Write your report including screenshots of every one of your programs running and explaining
how they work. Include comments in your code too and submit your source code file (.asm).
Submit your report as a DOCX or PDF document through Western Online. Include in the report
the answers to the questions writing the corresponding numbers and questions (or at least the
numbers) in bold using a different text color and in the proper order before every answer
(answers in black and not in bold); these same rules apply for the five exercises. Points will be
deducted if you do not follow these guidelines.
At the end include a conclusion (properly labeled as that) explaining the importance of all this.
Use the following format to name your document file:
CS590_HW_03_YourLastName-YourFirstName_ (or docx extension)
After submission verify that the upload was successful and with the correct file(s).
Do not submit compressed files.
Microprocessor Simulator V5.0 Help
© C Neil Bauers 2003 – http://www.softwareforeducation.com/
General: Introduction, Architecture, Installation, Un-Installation, To Register, Registration Form, FAQ and Bugs, PC Support Handbook
Tutorials: Getting Started, All Learning Tasks, 01 First Program, Nasty Example, 02 Traffic Lights, 03 Data Moves, 04 Counting, 05 Keyboard Input, 06 Procedures, 07 Text I/O, 08 Data Tables, 09 Parameters, 10 SW Interrupts, 11 HW Interrupts
Reference: Shortcut Keys, ASCII Codes, Glossary, Hexadecimal and Binary, Instruction Set Summary, Instruction Set Detailed, The List File, Negative Numbers, Pop-up Help, Logic and Truth, The Editor, Peripheral Devices
Use Alt+Tab to switch between the help and simulator windows.
This simulator is for learners in the 16+ age range although many younger enthusiasts
have used it too. It introduces low level programming and microcomputer architecture.
Tutorial materials are included covering the subject in some depth.
The tutorials align closely with the British GCE A2 Computing specifications and also the
British BTEC National for IT Practitioners (Computer Systems).
The simulator has enough depth and flexibility to be used with university undergraduate
students studying low level programming for the first time.
Introduction
Who Should Use the Simulator
The simulator is intended for any student studying low level programming, control or machine
architecture for the first time.
The simulator can be used by students aged 14 to 16 to solve less complex problems such as
controlling the traffic lights and snake.
More advanced students typically 16 or older can solve quite complex low level programming
problems involving conditional jumps, procedures, software and hardware interrupts and Boolean
logic. Although programs will be small, there is good scope for modular design and separation of
code and data tables.
The simulator is suitable for courses such as
BTEC National Diploma for IT Practitioners (Computer Systems and Control Technology)
AS and A2 Computing (Low Level Programming)
Electronics Courses.
Courses involving microcontrollers.
Courses involving control systems.
Description of the Simulator
In the shareware version the following instructions are not included: CALL, RET, INT and IRET. The
hardware timer interrupt does not function because IRET cannot be used either. The registered
version includes these features. You can register the software through the website.
This simulator emulates an eight bit CPU that is similar to the low eight bits of the 80x86 family of
chips. 256 bytes of RAM are simulated. It is surprising how much can be done with only 256 bytes of
RAM.
Features
8 bit CPU
16 Input Output ports. Not all are used.
Simulated peripherals on ports 0 to 5.
An assembler.
On-line help.
Single step through programs.
Continuously run programs.
Interrupt 02 triggered by a hardware timer (simulated).
CPU Clock Speed can be altered.
Peripherals              Example Programs
Keyboard Input           99keyb.asm
Traffic Lights           99tlight.asm
Seven Segment Display    99sevseg.asm
Heater and Thermostat    99hon.asm and 99hoff.asm
Snake and Maze           99snake.asm
Stepper Motor            99step.asm
Memory Mapped VDU        99keyb.asm
Documentation
On-line hypertext help is stored in a Website. It is possible to copy from the help pages and paste
into a word processor or text editor programs. Registered users have permission to modify help files
for use by students and to print and or make multiple photocopies.
Disclaimer
This simulation software is not guaranteed in any way. It may differ from reality. It might not even
work at all. Try it out and if you like it, please register.
System Architecture
Simplified Simulator Architecture
central processing unit (CPU)
256 bytes of random access memory (RAM)
16 input output (IO) ports. Only six are used.
A hardware timer that triggers interrupt 02 at regular time intervals that you can pre-set
using the configuration tab.
A keyboard that triggers interrupt 03.
Peripherals connected to the Ports.
The simulator is programmable in that you can run many different programs. In real life, the RAM
would be replaced by read only memory (ROM) and the system would only ever run one program
hard wired into the ROM. There are hundreds of examples of systems like this controlling traffic
lights, CD players, simple games consoles, many children’s games, TV remote controls, microwave
oven timers, clock radios, car engine management systems, central heating controllers,
environmental control systems and the list goes on.
The Central Processing Unit
The central processing unit is the “brain” of the computer. All calculations, decisions and data moves
are made here. The CPU has storage locations called registers. It has an arithmetic and logic unit
(ALU) where the processing is done. Data is taken from the registers, processed and results go back
into the registers. Move (MOV) commands are used to transfer data between RAM locations and the
registers. There are many instructions, each with a specific purpose. This collection is called the
instruction set.
General Purpose Registers
The CPU has four general-purpose registers called AL, BL, CL and DL. These are eight bits or one
byte wide. Registers can hold unsigned numbers in the range 0 to +255 and signed numbers in the
range –128 to +127. These are used as temporary storage locations. Registers are used in
preference to RAM locations because it takes a relatively long time to transfer data between RAM
and the CPU. Faster computers generally have more CPU registers or memory on the CPU chip.
The registers are named AL, BL, CL and DL because the 16-bit version of this CPU has more
registers called AH, BH, CH and DH. The ‘L’ means Low and the ‘H’ means High. These are the low
and high ends of the 16-bit register.
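The unsigned and signed ranges are two readings of the same eight bits. Here is a minimal sketch in Python, used only as a calculator outside the simulator. The helper name as_signed is made up for this illustration, and two's complement is assumed for the signed reading.

```python
def as_signed(byte):
    """Interpret an 8-bit value (0..255) as a two's complement signed number."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("value does not fit in an 8-bit register")
    # Bit 7 is the sign bit: values 0x80..0xFF stand for -128..-1.
    return byte - 0x100 if byte & 0x80 else byte


for value in (0x00, 0x7F, 0x80, 0xFF):
    print(f"{value:02X}  unsigned={value:3d}  signed={as_signed(value):4d}")
```

The bit pattern FF is therefore 255 when read as unsigned and -1 when read as signed. The register itself does not know which reading the programmer intends.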
Special Purpose Registers
The special purpose registers in the CPU are called IP, SR and SP.
IP is the Instruction pointer
This register contains the address of the instruction being executed. When execution is complete, IP
is increased to point to the next instruction. Jump instructions alter the value of IP so the program
flow jumps to a new position. CALL and INT also change the value stored in IP. In the RAM displays,
the instruction pointer is highlighted red with yellow text.
SR is the Status Register
This register contains flags that report the CPU status.
The ‘Z’ zero flag is set to one if a calculation gave a zero result.
The ‘S’ sign flag is set to one if a calculation gave a negative result.
The ‘O’ overflow flag is set if a result was too big to fit in a register.
The ‘I’ interrupt is set if interrupts are enabled. See CLI and STI.
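To see how the Z, S and O flags relate to a result, the sketch below models an eight bit addition in Python. It is an illustration under stated assumptions, not the simulator's code: the function name add8 is hypothetical, and the O flag is modelled as signed overflow (the true sum falls outside -128 to +127), which may differ in detail from what the simulator does.

```python
def add8(a, b):
    """Add two 8-bit values; return the 8-bit result and a dict of flags."""

    def signed(v):
        # Two's complement reading of an 8-bit value.
        return v - 0x100 if v & 0x80 else v

    result = (a + b) & 0xFF              # keep only the low eight bits
    true_sum = signed(a) + signed(b)     # what the answer "should" be
    flags = {
        "Z": result == 0,                    # zero result
        "S": bool(result & 0x80),            # sign bit (bit 7) is set
        "O": not -128 <= true_sum <= 127,    # ASSUMPTION: signed overflow
    }
    return result, flags


for a, b in ((0x02, 0x02), (0xFF, 0x01), (0x7F, 0x01)):
    result, flags = add8(a, b)
    print(f"{a:02X} + {b:02X} = {result:02X}  {flags}")
```

Single stepping a short program in the simulator and watching the SR display is the reliable way to confirm the actual flag behaviour.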
SP is the Stack Pointer
The stack is an area of memory organised using the last in, first out (LIFO) rule. The stack pointer
points to the next free stack location. The simulator stack starts at address BF just below the RAM
used for the video display. The stack grows towards address zero. Data is pushed onto the stack to
save it for later use. Data is popped off the stack when needed. The stack pointer SP keeps track of
where to push or pop data items. In the RAM displays, the stack pointer is highlighted blue with
yellow text.
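The push and pop behaviour described above can be modelled in a few lines of Python. This is a sketch only. The class name StackModel is made up, and the order of operations (store then decrement on push, increment then load on pop) is inferred from the statement that SP points to the next free location.

```python
class StackModel:
    """Model of a stack that starts at address BF and grows towards zero."""

    STACK_TOP = 0xBF

    def __init__(self):
        self.ram = [0] * 256
        self.sp = self.STACK_TOP        # SP points to the next free location

    def push(self, value):
        self.ram[self.sp] = value & 0xFF
        self.sp -= 1                    # the stack grows towards address zero

    def pop(self):
        if self.sp == self.STACK_TOP:
            raise IndexError("pop from an empty stack")
        self.sp += 1
        return self.ram[self.sp]


stack = StackModel()
for value in (0x11, 0x22, 0x33):
    stack.push(value)
print(f"SP after three pushes: {stack.sp:02X}")
print([f"{stack.pop():02X}" for _ in range(3)])   # last in, first out
```

The values come back in the reverse of the order in which they were pushed, which is why the stack is convenient for reversing text or saving registers.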
Random Access Memory
The simulator has 256 bytes of RAM. The addresses are from 0 to 255 in decimal numbers or from
[00] to [FF] in hexadecimal. RAM addresses are usually given in square brackets such as [7C] where
7C is a hexadecimal number. Read [7C] as “the data stored at location 7C”.
Busses
Busses are collections of wires used to carry signals around the computer. They are commonly
printed as parallel tracks on circuit boards. Slots are sockets that enable cards to be connected to
the system bus. An 8-bit computer typically has registers 8 bits wide and 8 wires in a bus. A 16-bit
computer has 16 bit registers and 16 address and data wires and so on. The original IBM PC had 8
data wires and 20 address wires enabling one megabyte of RAM to be accessed. 32 bit registers and
busses are now usual (1997-2003).
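The link between the number of address wires and the amount of memory that can be reached is a power of two. A short Python check follows (the function name is made up for this illustration):

```python
def addressable_bytes(address_wires):
    """Each extra address wire doubles the number of distinct addresses."""
    return 2 ** address_wires


print(addressable_bytes(8))    # 8 wires: 256 addresses
print(addressable_bytes(20))   # 20 wires: 1048576 addresses, one megabyte
```

Eight address wires give the 256 locations of the simulator, and twenty give the one megabyte of the original IBM PC.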
Data Bus
The Data Bus is used to carry data between the CPU, RAM and IO
ports. The simulator has an 8-bit data bus.
Address Bus
The Address Bus is used to specify what RAM address or IO port
should be used. The simulator has an 8-bit address bus.
Control Bus
The Control Bus has a wire to determine whether to access RAM
or IO ports. It also has a wire to determine whether data is being
read or written. The CPU reads data when it flows into the CPU. It
writes data when it flows out of the CPU to RAM or the IO ports.
The System Clock wire carries regular pulses so that all the
electronic components can function at the correct times. Clock
speeds between 100 and 200 million cycles per second are typical
(1997). This is referred to as the clock speed in MHz or megahertz.
The simulator runs in slow motion at about one instruction per
second. This is adjustable over a small range.
Hardware Interrupts
Hardware Interrupts require at least one wire. These enable the CPU
to respond to events triggered by hardware such as printers running
out of paper. The CPU processes some machine code in response to
the interrupt. When finished, it continues with its original task. The
IBM PC has 16 interrupts controlled by 4 wires.
To Install
Important
FIRST make a back-up copy of the distribution disk or downloaded file.
Print and file Userinfo.Reg. This contains your registration key.
System Requirements
Sms32V50.Exe requires Windows 95/98/NT/2000/XP. A mouse or other pointing device is highly
recommended to access the hypertext help pages.
To Install
You don’t need to run setup. In fact there is no setup program.
Make a folder for all the simulator files and copy them from the distribution disk to this directory. If
you have downloaded a Zip file, unzip it into this directory.
Create a shortcut to sms32v50.exe and/or create a shortcut on the start menu.
Suggestion
In a school, college or university setting, the exercise files should be made available to students.
These files are numbered as in this example – 02tlight.asm and 99snake.asm. The other example
files demonstrate the capabilities of the simulator and are typical of the sorts of programs students
might write for assignments. DEMO.ASM could be made available.
The other example files should be kept by the teacher for reference.
Note: Userinfo.reg, if available, should be in the same directory as Sms32V50.Exe. Userinfo.reg
contains the registration key that you need to run the simulator as a registered program. When
Userinfo.reg is missing or faulty, the simulator runs in shareware mode. When you register, you will
get instructions relating to Userinfo.reg. This file has nothing to do with the Windows registry. It is a
text file containing the key. The name is a hangover from the days of Windows 3.11 when there
were no registries.
Note: The on-line help should be in the same directory as Sms32V50.Exe.
Network Installation
The simulator has been designed to run from a server. It does not need to be installed on
workstations. This makes deployment simple. One day all software will be like this and stressed out
network administrators will revert to being normal, kind, cheerful human beings.
Create a read only folder on the server and make sure all users can access it. (Create a share or use
a public drive). Copy all the files from the distribution into the folder on the server. If they are
zipped, unzip them. Provide a shortcut to sms32v50.exe for users to start up the program. The
shortcut should specify a working directory where users have write permission such as their home
folder.
Users need write access to their own home folder (a floppy disk would be OK too). Users need to
save their work to this folder (or floppy). If the simulator starts in this directory, it will save an INI
file in the user’s directory that holds information about window positions and preferences. If the INI
file is missing, the simulator will start with default settings and will attempt to create the INI file. If
SMS starts in a read-only directory, it will not be able to save its INI file. If possible, correct this
non-fatal problem.
To Run
If you have set up an icon, double click the icon or select the icon and press enter. You can double
click on Sms32V50.exe within Windows Explorer or My Computer. You can drag and drop an
assembly code file onto the program icon. Assembly code file names should end in .asm. You can
drop any type of file into the simulator but the result is not likely to be useful.
To Un-Install
Important
FIRST make a back-up copy of the distribution disk or downloaded file.
Print and file Userinfo.Reg. This contains your registration key.
To Un-install
The simulator is completely self contained and can be deleted without upsetting any other
applications. Using Windows Explorer or My Computer or Network Neighborhood (for a network
installation) delete all the files in the simulator directory. Remove any icons or other references to
the simulator from your system. There are NO registry entries or DLL files so the un-install is
completely clean.
To Register
Please visit the website for registration information. If you have no internet access, please use this
form.
When You Register
You will be sent a software key that unlocks all the features in the simulator. You should enter this
key into the place provided. The menu choice for entering your key is
Help – Register Alt+H R
Type or paste your key into the space provided and press the Register button. If there are no
spelling errors, the simulator will be registered.
Shareware
The Microcontroller Simulator is shareware. In the unregistered shareware version the following
features are not included: CALL, RET, INT, IRET and the simulated hardware timer interrupt 02.
Distribution
The complete unaltered shareware package may be distributed freely without restriction. Re-
packaging to suit different distribution media is permitted. Please give a copy to any interested
friend, colleague or student.
Student Registration
Individual students may use this software free. There is no need to register. Students may also use
registered copies free of charge. These copies should, whenever possible, be obtained from your
school, college or university.
Educational Institutions
Schools, Colleges, Universities and other Educational Institutions are asked to register. When you
register, you will receive a key that unlocks the full version of this software. The prompts to register
will go away and CALL, RET, INT, IRET and simulated hardware timer interrupts will be made
available.
Payment
You are asked to register this software by paying 25 Euros (15.00 Pounds Sterling in the UK). To
register quickly, use the on-line service. Alternatively please send a cheque in your local currency
for the equivalent of 25 Euros. Cash is OK as a last resort. I hope to cover my software and Internet
access costs. Please make cheques payable to C N BAUERS.
Registered Sites
All users are invited to visit the website regularly for upgrades and updates. Registered sites may
photocopy manuals and help pages as required. You may give registered copies of the software to
your students. These copies should be free. You may run unlimited instances of the software on a
single site. If you have multiple sites in different villages, towns or cities, you are asked to register
and pay for each site separately.
Local Education Authorities
You can register ALL the schools in your area. The cost is 10 Euros per school with learners over the
age of 16. This software is intended for the 16+ age group but you may deploy the program for
younger pupils too. Schools can download their own software. The registration key can be
distributed in a letter to each school or in a newsletter.
Here is an example of a registration key (you will get a valid key)
Bawsetshire Local Education AuthorityABCDE
Simulator Registration Form
The fastest way to register is on-line. If you can’t use this service, please use this form.
Please fill in this form and mail it to the address below. These details will not be passed to any junk
mail organisation.
Educational institutions should include a cheque for 15 GB Pounds OR 25 Euros OR 30 US Dollars
OR the equivalent in your own currency.
Cheques should be made payable to C N Bauers.
Please mail me at nbauers@samphire.demon.co.uk to agree a price in your local currency.
Please check this address is up-to-date on the website.
C N Bauers
87 Cliff Hill
Gorleston
Great Yarmouth
Norfolk
NR31 6DH
United Kingdom
Please provide your details below (please write clearly!) …
Contact Person ___________________________________________
Institution Name _________________________________________
Institution Address ______________________________________
__________________________________________________________
__________________________________________________________
__________________________________________________________
__________________________________________________________
__________________________________________________________
Your EMail address _______________________________________
FAQ and Bugs
Bugs and Features
Here are some bugs and features that I know about.
If you discover other problems, please send me an email.
I will cure the easy bugs and list the hard ones here.
Is there a Discussion Forum?
Yes – it is here – http://groups.yahoo.com/group/learn-asm/
Why is my display messed up?
The simulator was developed using the default windows fonts, colours and borders. Some
combinations of colours, fonts and window borders have caused problems. The symptoms include
invisible text, text that won’t fit inside the window, labels that don’t line up with the item they are
supposed to label and stretched bitmap images that look untidy. The cure is to use Windows default
settings or compatible settings.
Where are my windows?
Beginners and some experts might hit another problem. With the display in a high resolution mode,
the simulator windows can be moved towards the bottom right. When the display is restored to a
lower resolution and the simulator is re-started, all its windows will be off screen where they can’t
be seen or controlled. The cure is to close the simulator and delete Sms32V50.INI. The simulator
will restart with its windows in default visible positions.
Why won’t this file save? “Sms32v50.Ini”
This is common on a network installation. Make sure the working directory is one that you have
permission to write to.
Why can’t I save my work?
Make sure you are saving to a folder or directory where you have write permission. This problem
usually occurs when you are running the simulator from a network installation.
PC Support Handbook
The PC Support Handbook
This is an excellent book for anyone learning about personal computer hardware.
Details are at http://www.dumbreck.demon.co.uk/
This book covers PC architecture in some detail and makes excellent further reading in conjunction
with the use of this simulator.
The book’s ISBN reference is 09541711-1-X and it is available in the following ways:
From all major book distributors.
From the Maplin chain of stores or Maplin mail/web order.
Directly from the publishers, Dumbreck Publishing. This is often the quickest method.
Please get up-to-date contact details from their website.
PC Support Handbook Contents
Computer Basics
Software & Data
Operating Systems
Numbering Systems
Computer Architecture
Display Technology
Computer Memory
Discs & Drives
Computer Peripherals
System Selection
Hardware Installation
P.C. Configuration
Windows Configuration
P.C. Support
Faultfinding
Computer Security
Data Communications
Local Area Networks
The Internet
Creating Websites
Multimedia
Using the Simulator – Getting Started
On Line Help
Press the F1 key to get on line help.
Writing a Program
To write and run a program using the simulator, select the source code editor tab by pressing
Alt+U.
Type in your program. It is best to get small parts of the program working rather than typing it all in
at once.
Here is a simple example. Also look at the tutorial example programs. You can type this into the
simulator or copy and paste it. The assembly code has been annotated with comments that explain
the code. These comments are ignored by the assembler program. Comments begin with a
semicolon and continue to the end of the line.
; ===== COUNT =================================================
MOV AL,0 ; Move 0 into the AL register
REP: ; This label is used with jump commands
ADD AL,2 ; Add two to AL
JMP REP ; Jump back to the rep label
END ; Program ends here
; =============================================================
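This program never stops, but AL cannot grow without limit because it is only eight bits wide. The Python sketch below models the loop outside the simulator. The function name count_trace is made up, and the model simply keeps the low eight bits of each sum.

```python
def count_trace(steps):
    """Values held in AL after each ADD AL,2, starting from MOV AL,0."""
    al = 0
    trace = []
    for _ in range(steps):
        al = (al + 2) & 0xFF    # an eight bit register keeps only the low 8 bits
        trace.append(al)
    return trace


trace = count_trace(130)
print([f"{v:02X}" for v in trace[:4]])        # the first few values
print([f"{v:02X}" for v in trace[125:129]])   # the values around FE and 00
```

After reaching FE the next addition brings AL back to 00 and the count starts again.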
Running a Program
To run a program, you can step through it one line at a time by pressing Alt+P or by
clicking the corresponding button repeatedly.
You can run a program continuously by pressing F9 or Alt+R or by clicking the corresponding button.
To speed up or slow down a running program, use the corresponding buttons or type Alt+L or Alt+T.
To stop a running program, press Alt+O or Escape, or click the corresponding button.
To restart a paused program, continuing from where it left off, press Alt+N or click the
corresponding button.
To restart a program from the beginning, reset the CPU by pressing Alt+E or by clicking the
corresponding button.
To re-open the RAM display window, press Alt+M or click the corresponding button.
Assembly Code
The code you type is called assembly code. This human-readable
code is translated into machine code by the Assembler. The
machine code (binary) is understood by the CPU. To assemble a
program, press Alt+A or click the corresponding button.
You can see an animation of the assembler process by checking the
corresponding box.
When you run or step a program, the code is assembled first if
necessary.
Assembler Phases
There is a short delay while the assembler goes through all the stages of assembling the program.
The steps are
1. Save the source code.
2. Convert the source code into tokens (this simulator uses human readable tokens for
educational value rather than efficiency).
3. Parse the source code and (if necessary) generate error messages. If there are no errors,
generate the machine codes. This process could be coded more efficiently. If the tokens
representing machine op codes like MOV and JMP were numerical, the assembler could look
up the machine code equivalents in an array instead of ploughing through many if-then-else
statements. Once again, this has been done to demonstrate the process of assembling code
for educational reasons.
4. Calculate jumps, the distances of the jump/branch instructions.
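Step 2 above, converting source code into tokens, can be sketched in Python as follows. This is not the simulator's own tokeniser. The function name tokenise is made up, and the sketch assumes that a semicolon always starts a comment, so a semicolon inside a quoted string would not be handled.

```python
def tokenise(source):
    """Split assembly source into one list of tokens per non-empty line."""
    lines = []
    for raw in source.splitlines():
        code = raw.split(";", 1)[0]               # drop the comment, if any
        tokens = code.replace(",", " ").split()   # commas and spaces separate tokens
        if tokens:
            lines.append([token.upper() for token in tokens])
    return lines


program = """
    MOV AL,0    ; Move 0 into the AL register
REP:            ; This label is used with jump commands
    ADD AL,2    ; Add two to AL
    JMP REP     ; Jump back to the rep label
    END         ; Program ends here
"""

for tokens in tokenise(program):
    print(tokens)
```

Each line is reduced to its readable tokens, for example MOV, AL and 0. A parser can then check these tokens and generate the machine codes.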
Viewing Machine Code
The machine code stored in RAM can be viewed in three modes by selecting the appropriate radio
button.
Hexadecimal – This display corresponds exactly to the binary executed by the CPU.
ASCII – This display is convenient if your program is processing text. The text is readable but the
machine codes are not.
Source Code – This display shows how the assembly code commands are placed in memory.
Tutorial Examples
The tutorial examples provide a step by step introduction to the commands and techniques of low
level programming. Each program has one or more learning tasks associated with it. Some of the
tasks are simple. Some are real brain teasers.
Learning Tasks
The Tasks
Here are all the learning tasks grouped together with pointers to the example programs and
explanatory notes.
Simple Arithmetic
Example – 01first.asm – Arithmetic
1. Write a program that subtracts using SUB
2. Write a program that multiplies using MUL
3. Write a program that divides using DIV
4. Write a program that divides by zero. Make a note to avoid doing this in real life.
Using Hexadecimal
Example – 02tlight.asm – Traffic Lights
5. Use the help page on Hexadecimal and Binary numbers. Work out what hexadecimal
numbers will activate the correct traffic lights. Modify the program to step the lights through
a realistic sequence.
ASCII Codes
Example – 03move.asm
6. Look up the ASCII codes of H, E, L, L and O and copy these values to memory locations C0,
C1, C2, C3 and C4. This is a simple and somewhat crude way to display text on a memory
mapped display.
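The idea behind a memory mapped display can be modelled outside the simulator. In this Python sketch the RAM is a list of 256 numbers and the helper name write_text is made up. It only shows which value ends up at which address. The assembly version still has to be written with MOV commands.

```python
VDU_START = 0xC0    # first address of the memory mapped display


def write_text(ram, text, start=VDU_START):
    """Copy the ASCII code of each character into consecutive RAM locations."""
    for offset, character in enumerate(text):
        ram[start + offset] = ord(character)


ram = [0] * 256
write_text(ram, "HELLO")
print(" ".join(f"[{address:02X}]={ram[address]:02X}" for address in range(0xC0, 0xC5)))
```

Each character occupies one byte, so five letters fill locations C0 to C4.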
Counting and Jump Commands
Example – 04incjmp.asm
7. Rewrite the example program to count backwards using DEC BL.
8. Rewrite the example program to count in threes using ADD BL,3.
9. Rewrite the program to count 1 2 4 8 16 using MUL BL,2.
10. Here is a more difficult task. Count 0 1 1 2 3 5 8 13 21 34 55 89 overflow. Here each number
is the sum of the previous two. You will need to use two registers and two RAM locations for
temporary storage of numbers. If you have never programmed before, this is a real brain
teaser. Remember that the result will overflow when it goes above 127.
This number sequence was first described by Leonardo Fibonacci of Pisa (1170-1230).
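The values to expect in task 10 can be checked away from the simulator. This Python sketch (the function name is made up) lists every term of the sequence that fits within the signed limit of 127.

```python
def fibonacci_up_to(limit):
    """Return the Fibonacci numbers that do not exceed limit."""
    sequence = []
    a, b = 0, 1
    while a <= limit:
        sequence.append(a)
        a, b = b, a + b     # each number is the sum of the previous two
    return sequence


print(fibonacci_up_to(127))
```

The last term that fits is 89. The next term would be 55 + 89 = 144, which is above 127, so that is the addition where the overflow occurs.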
Character Input Output
Example – 05keybin.asm
11. Easy! Input characters and display each character at the top left position of the VDU by
copying them all to address [C0].
12. Harder! Use BL to point to address [C0] and increment BL after each key press in order to see
the text as you type it.
13. Harder! Store all the text you type in RAM when you type it in. When you press Enter, display
the stored text on the VDU display.
14. Difficult! Type in text and store it. When Enter is pressed, display it on the VDU screen in
reverse order. Using the stack makes this task easier.
Procedures
Example – 06proc.asm
15. Re-do the traffic lights program (02tlight.asm) and use this procedure to set up realistic time
delays.
16. Re-do the text input and display program with procedures. Use one procedure to input the
text and one to display it.
Text IO and Procedures
Example – 07textio.asm
17. Write a program using three procedures. The first should read text from the keyboard and
store it in RAM. The second should convert any upper case characters in the stored text to
lower case. The third should display the text on the VDU screen.
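The case conversion in task 17 depends on a property of the ASCII codes: each lower case letter is exactly 20 hexadecimal above its upper case partner. Here is a Python sketch of the rule, with a made-up helper name:

```python
def to_lower(code):
    """Convert one ASCII code to lower case; other codes are left unchanged."""
    if 0x41 <= code <= 0x5A:    # 'A' (41) to 'Z' (5A)
        return code | 0x20      # setting bit 5 adds 20 hex
    return code


text = "Hello, World 42"
converted = "".join(chr(to_lower(ord(character))) for character in text)
print(converted)
```

The range test matters. Without it, codes outside A to Z with bit 5 clear, such as the @ sign (40), would be changed as well.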
Data Tables
Example – 08table.asm
18. Improve the traffic lights data table so there is an overlap with both sets of lights on red.
19. Use a data table to navigate the snake through the maze. This is on port 04. Send FF to the
snake to reset it. Up, down, left and right are controlled by the left four bits. The right four
bits control the distance moved.
20. Write a program to spin the stepper motor. Activate bits 1, 2, 4 and 8 in sequence to
energise the electromagnets in turn. The motor can be half stepped by turning on pairs of
magnets followed by a single magnet followed by a pair and so on.
21. Use a data table to make the motor perform a complex sequence of forward and reverse
moves. This is the type of control needed in robotic systems, printers and plotters. For this
exercise, it does not matter exactly what the motor does.
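The full step and half step sequences from task 20 are natural candidates for a data table. The Python sketch below builds the half step table from the full step values. The names are made up, and which direction the motor turns for a given order is an assumption to check in the simulator.

```python
FULL_STEP = [1, 2, 4, 8]    # one electromagnet at a time


def half_step_sequence(full=FULL_STEP):
    """Alternate single magnets with pairs of neighbouring magnets."""
    sequence = []
    for index, magnet in enumerate(full):
        neighbour = full[(index + 1) % len(full)]
        sequence.append(magnet)               # single magnet
        sequence.append(magnet | neighbour)   # pair: this magnet and the next
    return sequence


forward = half_step_sequence()
reverse = list(reversed(forward))
print([f"{value:02X}" for value in forward])
print([f"{value:02X}" for value in reverse])
```

Sending the same table in reverse order turns the motor the other way.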
Parameters
Example – 09param.asm
22. Write a procedure that doubles a number. Pass the single parameter into the procedure using
a register. Use the same register to return the result.
23. Write a procedure to invert all the bits in a byte. All the zeros should become ones. All the
ones should become zeros. Pass the value to be processed into the procedure using a RAM
location. Return the result in the same RAM location.
24. Write a procedure that works out Factorial N. This example shows one method for working
out factorial N. Factorial 5 is 5 * 4 * 3 * 2 * 1 = 120. Your procedure should work properly
for factorial 1, 2, 3, 4 or 5. Factorial 6 would cause an overflow. Use the stack to pass
parameters and return the result. Calculate the result. Using a look up table is cheating!
25. Write a procedure that works out Factorial N. Use the stack for parameter passing. Write a
recursive procedure. Use this definition of Factorial.
Factorial ( 0 ) is defined as 1.
Factorial ( N ) is defined as N * Factorial (N – 1).
To work out Factorial (N), the procedure first tests to see if N is zero and if not then re-uses
itself to work out N * Factorial (N – 1). This problem is hard to understand in any
programming language. In assembly code it is harder still.
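The recursive definition and the overflow limit can be tried out in Python before attempting the assembly version. This is only a model of the arithmetic. The function names are made up, and Python's call stack takes care of the parameter passing that the assembly procedure has to do by hand.

```python
def factorial(n):
    """Factorial (0) is 1; Factorial (N) is N * Factorial (N - 1)."""
    if n == 0:
        return 1
    return n * factorial(n - 1)


def fits_signed_byte(value):
    """True if value fits in the signed range of an 8-bit register."""
    return -128 <= value <= 127


for n in range(7):
    value = factorial(n)
    print(n, value, "fits" if fits_signed_byte(value) else "overflows")
```

Factorial 5 is 120, which just fits. Factorial 6 is 720, which is too big for eight bits, signed or unsigned.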
Software Interrupts
Example – 10swint.asm
26. The simulated keyboard generates INT 03 every time a key is pressed. Write an interrupt 03
handler to process the key presses. Use IN 07 to fetch the pressed key into the AL register.
The original IBM PC allocated 16 bytes for key press storage. The 16 locations are used in a
circular buffer fashion. Try to implement this.
27. Build on task 26 by putting characters onto the next free screen location. See if you can get
correct behaviour in response to the Enter key being pressed (fairly easy) and the Back
Space key being pressed (harder).
Hardware Interrupts
Example – 11hwint.asm
28. Write a program that controls the heater and thermostat whilst at the same time counting
from 0 to 9 repeatedly, displaying the result on one of the seven segment displays. If you
want a harder challenge, count from 0 to 99 repeatedly using both displays. Use the
simulated hardware interrupt to control the heater and thermostat.
29. A fiendish problem. Solve the Tower of Hanoi problem whilst steering the snake through the
maze. Use the text characters A, B, C Etc. to represent the disks. Use three of the four rows
on the simulated screen to represent the pillars.
I am not aware of anyone having solved the tower of Hanoi (including me), let alone
controlling the snake at the same time.
30. Use the keyboard on Port 07. Write an interrupt handler (INT 03) to process the key presses.
You must also process INT 02 (the hardware timer) but it need not perform any task. For a
more advanced task, implement a 16 byte circular buffer. Write code to place the buffered
text on the VDU screen when you press Enter. For an even harder task, implement code to
process the Backspace key to delete text characters in the buffer.
Example – 01first.asm – Arithmetic
Most of these examples include a learning task. Study the example and if you can complete the
task/s, it is likely that your understanding is good.
Example – 01first.asm
; ===== WORK OUT 2 PLUS 2 ======================================
CLO ; Close unwanted windows.
MOV AL,2 ; Copy a 2 into the AL register.
MOV BL,2 ; Copy a 2 into the BL register.
ADD AL,BL ; Add BL to AL. Answer goes into AL.
END ; Program ends
; ===== Program Ends ===========================================
YOUR TASK
=========
Use SUB, DIV and MUL to subtract, divide and multiply.
What happens if you divide by zero?
Make use of CL and DL as well as AL and BL.
Type this code into the simulator editor OR copy and paste the code OR load the example from disk.
Step through the program by pressing Alt+P repeatedly.
While you are stepping, watch the CPU registers. You should see a ‘2’ appear in the AL register
followed by a ‘2’ in the BL register. BL should be added to AL and a ‘4’ should appear in AL. The
altered registers are highlighted yellow.
Watch the register labelled IP (Instruction Pointer). This register keeps track of where the processor
has got to in the program. If you look at the RAM display, one RAM location is labelled with a red
blob. This corresponds to the Instruction Pointer. Note how the red blob (IP) moves when you step
the program.
When doing the learning exercises, add to and modify your own copy of the example.
What you need to know
Comments
Any text after a semicolon is not part of the program and is ignored by the
simulator. These comments are used for explanations of what the program is
doing. Good programmers make extensive use of comments. Good comments
should not just repeat the code. Good comments should explain why things
are being done.
CLO
The CLO command is unique to this simulator. It closes any window that is not
needed while a program is running. This command makes it easier to write
nice demonstration programs. It also avoids having to close several windows
manually.
MOV
The MOV command is short for Move. In this example numbers are being
copied into registers where arithmetic can be done. MOV copies data from one
location to another. The data in the original location is left intact by the MOV
command. Move was shortened to Mov because, in olden times, computer
memory was fiendishly expensive. Every command was shortened as much as
possible, much like the mobile phone texting language used today.
ADD
Arithmetic
The add command comes in two versions. Here are two examples:
ADD AL,BL – Add BL to AL and store the result into AL
ADD AL,5 – Add 5 to AL and store the result into AL
Look at the on-line help to find out about SUB, MUL and DIV. Remember that
you can access on-line help by pressing the F1 key.
Registers
Registers are storage locations where 8 bit binary numbers are stored. The
central processing unit in this simulator has four general purpose registers
called AL, BL, CL and DL. These registers are interchangeable and can, with a
few exceptions, be used for any purpose.
Newer central processing unit (CPU) chips have 16, 32 or even 64 bit
registers. These work in the same way but more data can be moved in one
step so there is a speed advantage.
Wider registers can store larger integer (whole) numbers. This simplifies many
programming tasks. The other three registers SP, IP and SR are described
later.
Hexadecimal
Numbers
In the command MOV AL,2 the 2 is a hexadecimal number. The hexadecimal
number system is used in low level programming because there is a very
convenient conversion between binary and hex. Study the Hexadecimal and
Binary number systems.
END
The last command in all programs should be END. Any text after the END
keyword is ignored.
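Because every number in the source code is hexadecimal, small sums can look surprising at first. The following is a minimal sketch, not part of the original examples. It assumes that SUB accepts the same operand forms as ADD; check the on-line help to confirm.

```asm
; A sketch of hexadecimal arithmetic in the registers.
    MOV AL,0F   ; AL = 0F hex (15 in decimal).
    ADD AL,1    ; AL = 10 hex (16 in decimal), not "16".
    MOV BL,2    ; BL = 02.
    SUB AL,BL   ; AL = 0E hex (14 in decimal). Assumes SUB mirrors ADD.
    END
```

Step through it with the register display open and compare each value with the comment beside it.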
Your Tasks
Use all the registers AL, BL, CL and DL and experiment with ADD, SUB, MUL and DIV.
Find out what happens if you try to divide by zero.
Example – 99nasty.asm – Nasty
This example shows how you can create totally unreadable code.
Try not to do this.
This program actually works. Copy it and paste it into the simulator and try it!
Click the List-File tab to see the code laid out better and to see the addresses where the code is
stored.
To get back to the editor window click the Source-Code tab.
Example – 99nasty.asm
; —– Here is how NOT to write a program —–
_: Mov BL,C0 Mov AL,3C Q: Mov [BL],AL CMP AL,7B
JZ Z INC AL INC BL JMP Q Z: MOV CL,40 MOV AL,20
MOV BL,C0 Y: MOV [BL],AL INC BL DEC CL JNZ Y JMP
_ END ; Look at the list file. It comes out OK!
; Press Escape to stop the program running.
; ———————————————-
Here it is tidied up
; —– A Program to display ASCII characters —————–
; —– Here it is tidied up. This version is annotated. ——
; —– This makes it possible to understand. —————–
; —– The labels have been given more readable names too. —
Start:
Mov BL,C0 ; Make BL point to video RAM
Mov AL,3C ; 3C is the ASCII code of the ‘less than’ symbol
Here:
Mov [BL],AL ; Copy the ASCII code in AL to the RAM location that BL is
; pointing to.
CMP AL,7B ; Compare AL with ‘{‘
JZ Clear ; If AL contained ‘{‘ jump to Clear:
INC AL ; Put the next ASCII code into AL
INC BL ; Make BL point to the next video RAM location
JMP Here ; Jump back to Here
Clear:
MOV CL,40 ; We are going to repeat 40 (hex) times
MOV AL,20 ; The ASCII code of the space character
MOV BL,C0 ; The address of the start of video RAM
Loop:
MOV [BL],AL ; Copy the ASCII space in AL to the video RAM that BL is
; pointing to.
INC BL ; Make BL point to the next video RAM location
DEC CL ; CL is counting down towards zero
JNZ Loop ; If CL is not zero jump back to Loop
JMP Start ; CL was zero so jump back to the Start and do it all again.
END
; ————————————————————-
Your Task
Write all your future programs …
with good layout
with meaningful label names
with useful comments that EXPLAIN the code
avoiding comments that state the totally obvious and just repeat the code
Bad Comment – just repeats the code
INC BL ; Add one to BL
Useful Comment – explains why the step is needed
INC BL ; Make BL point to the next video RAM location
Example – 02tlight.asm – Traffic Lights
Example – 02tlight.asm
; ===== CONTROL THE TRAFFIC LIGHTS =============================
CLO ; Close unwanted windows.
Start:
; Turn off all the traffic lights.
MOV AL,0 ; Copy 00000000 into the AL register.
OUT 01 ; Send AL to Port One (The traffic lights).
; Turn on all the traffic lights.
MOV AL,FC ; Copy 11111100 into the AL register.
OUT 01 ; Send AL to Port One (The traffic lights).
JMP Start ; Jump back to the start.
END ; Program ends.
; ===== Program Ends ==========================================
YOUR TASK
=========
Use the help page on Hexadecimal and ASCII codes.
Work out what hexadecimal numbers will activate the
correct traffic lights. Modify the program to step
the lights through a realistic sequence.
To run the program press the Step button repeatedly or press the Run button.
To stop the program, press Stop. When the program is running, click the RAM-Source or RAM-Hex or
RAM-ASCII tabs. These give alternative views of the contents of random access memory (RAM).
Also click the List File tab to see the machine code generated by the simulator and the addresses
where the codes are stored.
Ports
The traffic lights are connected to port one. Imagine this as a socket on the back of the processor
box. Data sent to port one goes to the traffic lights and controls them.
There are six lamps to control. Red, Amber and Green for a pair of lights. This can be achieved with
a single byte of data where two bits are unused.
By setting the correct bits to One, the correct lamps come on.
Fill in the rest of this table to work out the Hexadecimal values
you need. Of course you need to know the sequence of lights
in your country.
Red  Amber  Green  Red  Amber  Green  Not used  Not used  Hex
1    0      0      0    0      1      0         0         84
What you need to know
Labels and
Jumps
Labels mark positions that are used by Jump commands. All the commands in
this program are repeated for ever or until Stop is pressed. Label names must
start with a letter or _ character. Label names must not start with a digit. The
line
JMP Start
causes the program to jump back and re-do the earlier commands.
Destination labels end in a colon. For example
Start:
Controlling
the Lights
If you look carefully at the traffic lights display, you can see which bit controls
each light bulb. Work out the pattern of noughts and ones needed to turn on a
sensible set of bulbs. Use the Hexadecimal and Binary numbers table to work
out the hexadecimal equivalent. Move this hexadecimal number into AL.
OUT 01
This command copies the contents of the AL register to Output Port One. The
traffic lights are connected to port one. A binary one causes a bulb to switch
on. A nought causes it to turn off.
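The one completed row of the table can be checked by hand. Split the byte 1000 0100 into two groups of four bits: 1000 is 8 and 0100 is 4, giving 84 hex. A minimal sketch that sends just this one pattern to the lights follows; it is an illustration only and is not a realistic sequence.

```asm
; Show a single lamp pattern taken from the table above.
    MOV AL,84   ; 84 hex = 1000 0100 binary: left red and right green.
    OUT 01      ; Send the pattern to the traffic lights on port one.
    END
```

Work out the remaining patterns in the same way, one group of four bits at a time.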
Example – 03move.asm – Data Moves
Example – 03move.asm
; —————————————————————
; A program to demonstrate MOV commands. Mov is short for move.
; —————————————————————
CLO ; Close unwanted windows.
; ===== IMMEDIATE MOVES =====
MOV AL,15 ; Copy 15 HEX into the AL register
MOV BL,40 ; Copy 40 HEX into the BL register
MOV CL,50 ; Copy 50 HEX into the CL register
MOV DL,60 ; Copy 60 HEX into the DL register
Foo:
INC AL ; Increment AL for no particular reason.
; ===== INDIRECT MOVES =====
MOV [A0],AL ; Copy value in AL to RAM location [A0]
MOV BL,[40] ; Copy value in RAM location [40] into BL
; ===== REGISTER INDIRECT MOVES =====
MOV [CL],AL ; Copy the value in AL to the RAM
; location that CL points to.
MOV BL,[CL] ; Copy the RAM location that CL points
; to into the BL register.
JMP Foo ; PRESS ESCAPE TO STOP THE PROGRAM
END
; —————————————————————
TASK
====
Look up the ASCII codes of the letters in H,E,L,L,O and move
these ASCII codes to RAM addresses [C0], [C1], [C2], [C3]
and [C4]. Run the program and watch how the text appears on
the simulated VDU display. This is very much the same as what
happens in the IBM PC running MS DOS. The program you write
should work but if you continue to study low level programming,
you will find much more efficient and flexible ways of solving
this problem.
Step through the program and watch the register values changing. In particular, look at the RAM-
Hex display and note the way that values in RAM change. Addresses [50] and [A0] are altered. You
can copy the example program from the help page and paste it into the source code editor.
Addressing Modes
There are several ADDRESSING MODES available with move commands.
Immediate Addressing
A hexadecimal number is copied into a register. Examples…
MOV AL,15 ; Copy 15 HEX into the AL register
MOV BL,40 ; Copy 40 HEX into the BL register
MOV CL,50 ; Copy 50 HEX into the CL register
MOV DL,60 ; Copy 60 HEX into the DL register
Indirect Addressing
A value is moved to or from RAM. The RAM address is given as a number like [22] in square
brackets. Examples…
MOV [A0],AL ; Copy value in AL to RAM location [A0]
MOV BL,[40] ; Copy value in RAM location [40] into BL
Register Indirect Addressing
Copy a value from RAM to a register or copy a value from a register to RAM. The RAM address is
contained in a second register enclosed in square brackets like this [CL]. Examples …
MOV [CL],AL ; Copy the value in AL to the RAM location that CL points to.
MOV BL,[CL] ; Copy the RAM location that CL points to into the BL register.
Register Moves
Not available in this simulation.
A register move looks like this
MOV AL,BL
To do this using simulator commands, use
PUSH BL
POP AL
Push and Pop are explained later.
Calculated Addresses
Not available in this simulator.
Copy a value from RAM to a register or copy a value from a register to RAM. The RAM address is
contained in square brackets and is calculated. This is done to simplify access to record structures.
For example a surname might be stored 12 bytes from the start of the record. This technique is
shown in the examples below.
MOV [CL + 5],AL ; Copy the value in AL to the RAM location that CL + 5 points to.
MOV BL,[CL + 12] ; Copy the RAM location that CL + 12 points to into the BL register.
Implied Addresses
Not available in this simulator.
In this case, memory locations are named. Address [50] might be called ‘puppy’. This means that
moves can be programmed like this.
MOV AL,puppy ; Copy the value in RAM at position puppy into the AL register.
MOV puppy,BL ; Copy BL into the RAM location that puppy refers to.
Example – 04IncJmp.asm – Counting
Example – 04IncJmp.asm
; ===== Counting ===================================
MOV BL,40 ; Initial value stored in BL
Rep: ; Jump back to this label
INC BL ; Add ONE to BL
JMP Rep ; Jump back to Rep
END ; Program Ends
; ===== Program Ends ===============================
TASK
=====
Rewrite the program to count backwards using DEC BL.
Rewrite the program to count in threes using ADD BL,3.
Rewrite the program to count 1 2 4 8 16 using MUL BL,2
Here is a more difficult task.
Count 0 1 1 2 3 5 8 13 21 34 55 89 overflow.
Here each number is the sum of the previous two.
You will need to use registers or RAM locations
for temporary storage of the numbers.
If you have never programmed before, this is a real brain teaser.
Remember that the result will overflow when it goes above 127.
This number sequence was first described by
Leonardo Fibonacci of Pisa (1170–1230)
The program counts up in steps of one until the total is too big to be stored in a single byte. At this
point the calculation overflows. Watch the values in the registers. In particular, watch IP and SR.
These are explained below.
Although this program is very simple, some new ideas are introduced.
MOV BL,40
This line initialises BL to 40.
Rep:
Rep: is a label. Labels are used with Jump commands. It is possible for programs to jump backwards
or forwards. Because of the way numbers are stored, the largest jumps are -128 backwards and
+127 forwards. Labels must begin with a letter or the _ character. Labels may contain letters, digits
and the _ character. Destination labels must end with a Colon:
INC BL
This command adds one to BL. Watch the BL register. It will count up from 40 in hexadecimal so
after 49 comes 4A, 4B, 4C, 4D, 4E, 4F, 50, 51 and so on.
Overflow
When BL reaches 7F hex or 127 in decimal numbers the next number ought to be 128 but because
of the way numbers are stored in binary, the next number is minus 128. This effect is called an
OVERFLOW.
Status Register (SR)
The status register labelled SR contains four flag bits that give information about the state of the
CPU. There are three flags that indicate whether a calculation overflowed, gave a negative result or
gave a zero result. Calculations set these flags:
S The sign flag indicates a negative result.
O The overflow flag indicates overflows.
Z The zero flag indicates a zero result.
I Interrupts enabled. STI turns this on. CLI turns this off.
These flags are described in more detail later.
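The overflow and the flags described above can be watched directly without waiting for the counter to reach 7F. The following is a minimal sketch written for this purpose and is not one of the supplied examples. The flag results in the comments are what the descriptions above lead you to expect; confirm them in the SR display.

```asm
; Force an overflow and a zero result, then look at SR.
    MOV BL,7F   ; 7F hex = 127, the largest positive value in one byte.
    INC BL      ; BL becomes 80 hex, read as -128. Expect S and O in SR.
    MOV AL,1
    DEC AL      ; AL becomes 00. Expect Z in SR.
    END
```

Single step the program and check SR after the INC and again after the DEC.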
JMP Rep
This command causes the central processing unit (CPU) to jump back and repeat earlier commands
or jump forward and skip some commands.
Instruction Pointer (IP)
The instruction pointer labelled IP contains the address of the instruction being executed. This is
indicated by a red highlighted RAM position in the simulator. Each CPU command causes the IP to be
increased by 1, 2 or 3 depending on the size of the command. In the RAM displays, the instruction
pointer is highlighted red with yellow text.
NOP ; Increase IP by 1
INC BL ; Increase IP by 2
ADD AL,BL ; Increase IP by 3
JMP Rep ; Add or subtract a value from IP to
; jump to a new part of the program.
Fetch Execute Cycle
Fetch the instruction. IP points to it. This is called the operator.
If necessary, fetch data. IP + 1 points to it. This is the first operand.
If necessary, fetch data. IP + 2 points to it. This is the second operand.
Execute the command. This may involve more fetching or putting of data.
Increase IP to point to the next command or calculate IP for Jump commands.
Repeat this cycle.
Every machine cycle has one operator or instruction. There could be zero, one or two operands
depending on the instruction. OP Codes are the machine codes that correspond to the operators and
operands.
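The way IP advances during the cycle can be followed with a short listing. This is a minimal sketch that assumes the program is assembled from address [00] with no ORG command, and it uses the instruction sizes given above.

```asm
; Follow IP as each instruction is fetched and executed.
Rep:
    NOP         ; Stored at [00], 1 byte.  IP moves from 00 to 01.
    INC BL      ; Stored at [01], 2 bytes. IP moves from 01 to 03.
    ADD AL,BL   ; Stored at [03], 3 bytes. IP moves from 03 to 06.
    JMP Rep     ; Stored at [06]. IP is set back to 00.
    END
```

Click the List File tab after assembling to compare these addresses with the ones the simulator reports.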
Example – 05keyb-in.asm – Keyboard Input
Example – 05keyb-in.asm
; ————————————————————–
; Input key presses from the keyboard until Enter is pressed.
; ————————————————————–
CLO ; Close unwanted windows.
Rep:
IN 00 ; Wait for key press – Store it in AL.
CMP AL,0D ; Was it the Enter key? (ASCII 0D)
JNZ Rep ; No – jump back. Yes – end.
END
; ————————————————————–
TASK
11) Easy! Display each character at the top left position of the
VDU by copying them all to address [C0].
12) Harder Use BL to point to address [C0] and increment BL after
each key press in order to see the text as you type it.
13) Harder! Store all the text you type in RAM when you type it in.
When you press Enter, display the stored text on the VDU display.
14) Difficult Type in text and store it. When Enter is pressed,
display it on the VDU screen in reverse order. Using the stack
makes this task easier.
You can copy this example program from the help page and paste it into the source code editor.
IN 00
Input from port zero. In this simulator, port zero is wired to the keyboard hardware. The simulator
waits for a key press and copies the ASCII code of the key press into the AL register. This is not
very realistic but is easy to program. There is a more realistic keyboard on port 07 and interrupt 03
but this is for more advanced programmers.
CMP AL,0D
Compare the AL register with the ASCII code of the Enter key. The ASCII code of the Enter key is
0Dhex.
CMP AL,BL works as follows. The processor calculates AL – BL. If the result is zero, the ‘Z’ flag in the
status register SR is set. If the result is negative, the ‘S’ flag is set. If the result is positive, no flags
are set. The ‘Z’ flag is set if AL and BL are equal. The ‘S’ flag is set if BL is greater than AL. No flag is
set if AL is greater than BL.
JNZ Rep
JNZ stands for Jump Not Zero. Jump if the ‘Z’ flag is not set. The program will jump forwards or
back to the address that Rep marks.
A related command is JZ. This stands for Jump Zero. Jump if the zero flag is set. In this program,
the CMP command sets the flags. Arithmetic commands also set the status flags.
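The three possible outcomes of a comparison can be told apart with two conditional jumps. The sketch below is an illustration and not one of the supplied examples. It uses JS (jump if the ‘S’ flag is set), which is not used elsewhere in this text; it is taken from the simulator's instruction set, so check the on-line help before relying on it. The label names and the values placed in CL are arbitrary.

```asm
; Use CMP with JZ and JS to separate equal, less and greater.
    MOV AL,5
    MOV BL,7
    CMP AL,BL   ; Works out 5 - 7. The result is negative so S is set.
    JZ Equal    ; Not taken in this run because Z is clear.
    JS Less     ; Taken in this run because S is set.
    MOV CL,1    ; Only reached when AL is greater than BL.
    JMP Done
Equal:
    MOV CL,2    ; Only reached when AL equals BL.
    JMP Done
Less:
    MOV CL,3    ; Reached in this run because AL is less than BL.
Done:
    HALT        ; Stop the CPU. CL shows which case applied.
    END
```

Change the values moved into AL and BL and run it again to see the other two paths.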
MOV [C0],AL
This will copy AL to address [C0]. The visual display unit works with addresses [C0] to [FF]. This
gives a display with 4 rows and 16 columns. Address [C0] is the top left corner of the screen.
MOV [BL],AL
This copies AL to the address that BL points to. BL can be made to point to the VDU screen at [C0]
by using MOV BL,C0. BL can be made to point to each screen position in turn by using INC BL. This
is needed for task 12.
Example – 06proc.asm – Procedures
Example – 06proc.asm
; —————————————————————
; A general purpose time delay procedure.
; The delay is controlled by the value in AL.
; When the procedure terminates, the CPU registers are
; restored to the same values that were present before
; the procedure was called. Push, Pop, Pushf and Popf
; are used to achieve this. In this example one procedure
; is re-used three times. This re-use is one of the main
; advantages of using procedures.
;—— The Main Program —————————————-
Start:
MOV AL,8 ; A short delay.
CALL 30 ; Call the procedure at address [30]
MOV AL,10 ; A middle sized delay.
CALL 30 ; Call the procedure at address [30]
MOV AL,20 ; A Longer delay.
CALL 30 ; Call the procedure at address [30]
JMP Start ; Jump back to the start.
; —– Time Delay Procedure Stored At Address [30] ————-
ORG 30 ; Generate machine code from address [30]
PUSH AL ; Save AL on the stack.
PUSHF ; Save the CPU flags on the stack.
Rep:
DEC AL ; Subtract one from AL.
JNZ REP ; Jump back to Rep if AL was not Zero.
POPF ; Restore the CPU flags from the stack.
POP AL ; Restore AL from the stack.
RET ; Return from the procedure.
; —————————————————————
END
; —————————————————————
TASK
15) Re-do the traffic lights program and use this procedure
to set up realistic time delays. 02tlight.asm
16) Re-do the text input and display program with procedures.
Use one procedure to input the text and one to display it.
; —————————————————————
You can copy this example program from the help page and paste it into the source code editor.
MOV AL,8
A value is placed into the AL register before calling the time delay procedure. This value determines
the length of the delay.
CALL 30
Call the procedure at address [30]. This alters the instruction pointer IP to [30] and the program
continues to run from that address. When the CPU reaches the RET command it returns to the
address that it came from. This return address is saved on the stack.
Stack
This is a region in memory where values are saved and restored. The stack uses the Last In First
Out rule. LIFO. The CALL command saves the return address on the stack. The RET command gets
the saved value from the stack and jumps to that address by setting IP.
ORG 30
Origin at address [30]. ORG specifies at what RAM address machine code should be generated. The
time delay procedure is stored at address [30].
PUSH AL
Save the value of AL onto the stack. The CPU stack pointer SP points to the next free stack location.
The push command saves a value at this position. SP is then moved back one place to the next free
position. In this simulator, the stack grows towards address Zero. A stack overflow occurs if the
stack tries to fill more than the available memory. A stack underflow occurs if you try to pop an
empty stack.
PUSHF
Save the CPU flags in the status register SR onto the stack. This ensures that the flags can be put
back as they were when the procedure completes. The stack pointer is moved back one place. See
the Push command. NOTE: Items must be popped in the reverse order they were pushed.
DEC AL
Subtract one from AL. This command sets the Z flag if the answer was Zero or the S flag if the
answer was negative.
JNZ REP
Jump Not Zero to the address that Rep marks. Jump if the Z flag is not set.
POPF
Restore the CPU flags from the stack. Increase the stack pointer by one.
POP AL
Restore the AL register from the stack. This is done by first moving the stack pointer SP forward one
place and copying the value at that stack position into the AL register. A stack underflow occurs
when an attempt is made to pop more items off the stack than were present. NOTE: Items must be
popped in the reverse order they were pushed.
RET
Return from the procedure to the address that was saved on the stack by the CALL command.
Procedures can re-use themselves. This is called recursion. It is a powerful technique and dangerous
if you don’t understand what is happening! Accidental or uncontrolled recursion causes the stack to
grow until it overwrites the program or overflows.
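The Last In First Out rule described above is easiest to see when two values are pushed and then popped into the opposite registers. This is a minimal sketch, not one of the supplied examples; the values 11 and 22 are arbitrary.

```asm
; Swap AL and BL using the stack to show Last In First Out.
    MOV AL,11
    MOV BL,22
    PUSH AL     ; The stack now holds 11.
    PUSH BL     ; The stack now holds 11 then 22.
    POP AL      ; AL = 22. The last value pushed comes off first.
    POP BL      ; BL = 11.
    END
```

Inside a procedure the pops are normally written in the reverse order of the pushes so that each register gets its own value back, as the example above does with AL and the flags.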
Example – 07textio.asm – Text I/O Procedures
Example – 07textio.asm
; ————————————————————–
; A program to read in a string of text and store it in RAM.
; The end of text will be labelled with ASCII code zero/null.
; ————————————————————–
; THE MAIN PROGRAM
MOV BL,70 ; [70] is the address where the text will
; be stored. The procedure uses this.
CALL 10 ; The procedure at [10] reads in text and
; places it starting from the address
; in BL.
; BL should still contain [70] here.
CALL 40 ; This procedure does nothing until you
; write it. It should display the text.
HALT ; DON’T USE END HERE BY MISTAKE.
; ————————————————————–
; A PROCEDURE TO READ IN THE TEXT
ORG 10 ; Code starts from address [10]
PUSH AL ; Save AL onto the stack
PUSH BL ; Save BL onto the stack
PUSHF ; Save the CPU flags onto the stack
Rep:
IN 00 ; Input from port 00 (keyboard)
CMP AL,0D ; Was key press the Enter key?
JZ Stop ; If yes then jump to Stop
MOV [BL],AL ; Copy keypress to RAM at position [BL]
INC BL ; BL points to the next location.
JMP Rep ; Jump back to get the next character
Stop:
MOV AL,0 ; This is the NULL end marker
MOV [BL],AL ; Copy NULL character to this position.
POPF ; Restore flags from the stack
POP BL ; Restore BL from the stack
POP AL ; Restore AL from the stack
RET ; Return from the procedure.
; ————————————————————–
; A PROCEDURE TO DISPLAY TEXT ON THE SIMULATED SCREEN
ORG 40 ; Code starts from address [40]
; **** YOU MUST FILL THIS GAP ****
RET ; At present this procedure does
; nothing other than return.
; ————————————————————–
END ; It is correct to use END at the end.
; ————————————————————–
TASK
17) Write a program using three procedures. The first should
read text from the keyboard and store it in RAM. The second
should convert any upper case characters in the stored text
to lower case. The third should display the text on the
VDU screen.
; ————————————————————–
You can copy this example program from the help page and paste it into the source code editor.
Passing Parameters
MOV BL,70
The BL register contains 70. This value is needed by the text input procedure. It is the address
where the text will be stored in RAM. This is an example of passing a parameter using a register. All
you are doing is getting a number from one part of a program to another.
INC BL
This command adds one to BL. The effect is to make BL point to the next memory location ready for
the next text character to be stored.
CALL 10
Call the procedure at address [10]. This is achieved in practice by setting the CPU instruction pointer
IP to [10].
RET
At the end of the procedure, the RET command resets the CPU instruction pointer IP back to the
instruction after the CALL instruction to the procedure. This address was stored on the stack by the
CALL instruction.
HALT
Don’t confuse HALT and END. The END command causes the assembler to stop scanning for more
instructions in the program. The HALT command generates machine code 00 which causes the CPU
to halt. There can be several HALT commands in a program but only one END command.
ORG 10
Origin [10]. The assembler program starts generating machine code from address [10].
PUSH AL and POP AL
PUSH AL saves the value of AL onto the stack and POP AL restores it. The stack is an area of RAM
starting at address BF. The stack grows
towards zero. The RAM displays show the stack pointer as a blue highlight with yellow text. Push
and Pop are used so that procedures and interrupts can tidy up after themselves. The procedure or
interrupt can alter CPU registers but it restores them to their old values before returning.
PUSHF and POPF
PUSHF saves the CPU flags onto the stack. POPF restores the CPU flags to their original value. This
enables procedures and interrupts to do useful work without unexpected side effects on the rest of
the program.
IN 00
Input from port zero. This port is connected to the keyboard. The key press is stored into the AL
register.
CMP AL,0D
Compare the AL register with the hexadecimal number 0D. 0D is the ASCII code of the Enter key.
This line is asking “Was the enter key pressed?” CMP works by subtracting 0D from AL. If they were
equal then the subtraction gives an answer of zero. This causes the CPU zero or ‘Z’ flag to be set.
JZ Stop
Jump to the Stop label if the CPU ‘Z’ flag was set. This is a conditional jump.
MOV [BL],AL
Move the key press stored in AL into the RAM location that [BL] points to. INC BL is then used to
make BL point to the next RAM location.
JMP Rep
Jump back to the Rep label. This is an unconditional jump. It always jumps and the CPU flags are
ignored.
RET
Return from the procedure to the address stored on the stack. This is done by setting the instruction
pointer IP in the CPU.
Example – 08table.asm – Data Tables
Example – 08table.asm
; —– EXAMPLE 8 ——- DATA TABLES ————————–
JMP Start ; Skip past the data table.
DB 84 ; Data table begins.
DB C8 ; These values control the traffic lights
DB 31 ; This sequence is simplified.
DB 58 ; Last entry is also used as end marker
Start:
MOV BL,02 ; 02 is start address of data table
Rep:
MOV AL,[BL] ; Copy data from table to AL
OUT 01 ; Output from AL register to port 01
CMP AL,58 ; Last item in data table ???
JZ Start ; If yes then jump to Start
INC BL ; If no then point BL to the next entry
JMP Rep ; Jump back to do next table entry
END
; ————————————————————–
TASK
18) Improve the traffic lights data table so there is an
overlap with both sets of lights on red.
19) Use a data table to navigate the snake through the maze.
This is on port 04. Send FF to the snake to reset it.
Up, down left and right are controlled by the left four bits.
The right four bits control the distance moved.
20) Write a program to spin the stepper motor. Activate bits
1, 2, 4 and 8 in sequence to energise the electromagnets
in turn. The motor can be half stepped by turning on pairs
of magnets followed by a single magnet followed by a pair
and so on.
21) Use a data table to make the motor perform a complex sequence
of forward and reverse moves. This is the type of control
needed in robotic systems, printers and plotters. For this
exercise, it does not matter exactly what the motor does.
; ————————————————————–
You can copy this example program from the help page and paste it into the source code editor.
DB 84
DB stands for Define Byte/s. In this case 84hex is stored into RAM at address [02]. Addresses [00]
and [01] are occupied by the JMP Start machine codes.
84 hex is 1000 0100 in binary. This is the pattern of noughts and ones needed to turn on the left
red light and the right green light.
MOV BL,02
Move 02 into the BL register. [02] is the RAM address of the start of the data table. BL is used as a
pointer to the data table.
MOV AL,[BL]
[BL] points to the data table. This line copies a value from the data table into the AL register.
OUT 01
Send the contents of the AL register to port 01. Port 01 is connected to the traffic lights.
CMP AL,58
58 is the last entry in the data table. If AL contains 58, it is necessary to reset BL to point back to
the start of the table ready to repeat the sequence. If AL is equal to 58, the ‘Z’ flag in the CPU will
be set.
JZ Start
Jump back to start if the ‘Z’ flag in the CPU is set.
INC BL
Add one to BL to make it point to the next entry in the data table.
Example – 09param.asm – Parameters
Example – 09param.asm
; —– EXAMPLE 9 ——- Passing Parameters ——————-
; —– Use Registers to pass parameters into a procedure ——
JMP Start ; Skip over bytes used for data storage
DB 0 ; Reserve a byte of RAM at address [02]
DB 0 ; Reserve a byte of RAM at address [03]
Start:
MOV AL,5
MOV BL,4
CALL 30 ; A procedure to add AL to BL
; Result returned in AL.
; —– Use RAM locations to pass parameters into a procedure —
MOV AL,3
MOV [02],AL ; Store 3 into address [02]
MOV BL,1
MOV [03],BL ; Store 1 into address [03]
CALL 40
; ----- Use the Stack to pass parameters into a procedure ------
MOV AL,7
PUSH AL
MOV BL,2
PUSH BL
CALL 60
POP BL
POP AL ; This one contains the answer
JMP Start ; Go back and do it again.
; ----- A procedure to add two numbers -------------------------
; Parameters passed into procedure using AL and BL
; Result returned in AL
; This method is simple but is no good if there are a
; lot of parameters to be passed.
ORG 30 ; Code starts at address [30]
ADD AL,BL ; Do the addition. Result goes into AL
RET ; Return from the procedure
; ----- A procedure to add two numbers -------------------------
; Parameters passed into procedure using RAM locations.
; Result returned in RAM location
; This method is more complex and there is no limit on
; the number of parameters passed unless RAM runs out.
ORG 40 ; Code starts at address [40]
PUSH CL ; Save registers and flags on the stack
PUSH DL
PUSHF
MOV CL,[02] ; Fetch a parameter from RAM
MOV DL,[03] ; Fetch a parameter from RAM
ADD CL,DL ; Do the addition
MOV [02],CL ; Store the result in RAM
POPF ; Restore original register
POP DL ; and flag values
POP CL
RET
; ----- A procedure to add two numbers -------------------------
; The numbers to be added are on the stack.
; POP parameters off the stack
; Do the addition
; Push answer back onto the stack
; The majority of procedure calls in real life make use
; of the stack for parameter passing. It is very common
; for the address of a complex data structure in RAM to
; be passed to a procedure using the stack.
ORG 60 ; Code starts at address [60]
POP DL ; Return address
POP BL ; A parameter
POP AL ; A parameter
ADD AL,BL
PUSH AL ; Answer ; The number of pushes must
PUSH AL ; Answer ; match the number of pops.
PUSH DL ; Put the stack back as it was before
RET
; --------------------------------------------------------------
END
Task
22) Write a procedure that doubles a number. Pass the single
parameter into the procedure using a register. Use the
same register to return the result.
23) Write a procedure to invert all the bits in a byte. All
the zeros should become ones. All the ones should become
zeros. Pass the value to be processed into the procedure
using a RAM location. Return the result in the same RAM
location.
24) Write a procedure that works out Factorial N. This example
shows one method for working out factorial N.
Factorial 5 is 5 * 4 * 3 * 2 * 1 = 120. Your procedure
should work properly for factorial 1, 2, 3, 4 or 5.
Factorial 6 would cause an overflow. Use the stack to pass
parameters and return the result. Calculate the result.
Using a look up table is cheating!
25) Write a procedure that works out Factorial N. Use the
stack for parameter passing. Write a recursive
procedure. Use this definition of Factorial.
Factorial ( 0 ) is defined as 1.
Factorial ( N ) is defined as N * Factorial (N – 1).
To work out Factorial (N), the procedure first tests to see
if N is zero and if not then re-uses itself to work out
N * Factorial (N – 1). This problem is hard to understand
in any programming language. In assembly code it is
harder still.
Passing Parameters
Parameters can be passed in three ways.
1. CPU registers can be used – Fast but little data can be passed. In some programming
languages the “Register” keyword is used to achieve this.
2. RAM locations can be used – Slower and recursion may not be possible. In some
programming languages the “Static” keyword is used to achieve this. This technique is useful
if very large amounts of data are held in RAM. Passing a pointer to the data is more efficient
than making a copy of the data on the stack.
3. The stack can be used – Harder to understand and code but a lot of data can be passed
and recursion is possible. Compilers generally use this method by default unless otherwise
directed.
The example program uses all three methods to add two numbers together. The example tasks
involve all three methods.
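Method 2 mentions passing a pointer to data instead of copying the data. A minimal sketch of that idea follows. The table contents, the zero terminator, the labels and the procedure address [30] are all assumptions made for illustration and are not part of 09param.asm.

```asm
; Sketch: pass a pointer to a data table in BL. The sum is returned in AL.
    JMP Start       ; Skip over the data table
    DB 03           ; Table starts at address [02]
    DB 04
    DB 05
    DB 00           ; Zero marks the end of the table (assumed convention)
Start:
    MOV BL,02       ; The parameter is a pointer to the table
    CALL 30         ; Add up the table. Result returned in AL
    JMP Start       ; Go back and do it again
    ORG 30          ; Code starts at address [30]
    PUSH CL         ; Save CL so the caller's value survives
    MOV AL,00       ; Running total
Next:
    MOV CL,[BL]     ; Fetch the entry that BL points to
    CMP CL,00       ; End of table?
    JZ Fin
    ADD AL,CL       ; Add the entry to the total
    INC BL          ; Point to the next entry
    JMP Next
Fin:
    POP CL          ; Restore CL
    RET
    END
```

Only one byte (the pointer) is passed however long the table is. With the values shown, AL should hold 0C (3 + 4 + 5) when the procedure returns.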
Example – 10swint.asm
Software Interrupts
; --------------------------------------------------------------
; An example of software interrupts.
; --------------------------------------------------------------
JMP Start ; Jump past table of interrupt vectors
DB 51 ; Vector at 02 pointing to address 51
DB 71 ; Vector at 03 pointing to address 71
Start:
INT 02 ; Do interrupt 02
INT 03 ; Do interrupt 03
JMP Start
; --------------------------------------------------------------
ORG 50
DB E0 ; Data byte – could be a whole table here
; Interrupt code starts here
MOV AL,[50] ; Copy bits from RAM into AL
NOT AL ; Invert the bits in AL
MOV [50],AL ; Copy inverted bits back to RAM
OUT 01 ; Send data to traffic lights
IRET
; --------------------------------------------------------------
ORG 70
DB 0 ; Data byte – could be a table here
; Interrupt code starts here
MOV AL,[70] ; Copy bits from RAM into AL
NOT AL ; Invert the bits in AL
AND AL,FE ; Force right most bit to zero
MOV [70],AL ; Copy inverted bits back to RAM
OUT 02 ; Send data to seven segment display
IRET
; --------------------------------------------------------------
END
; --------------------------------------------------------------
TASK
26) Write a new interrupt 02 that fetches a key press from the
keyboard and stores it into RAM. The IBM PC allocates 16
bytes for key press storage. The 16 locations are used in
a circular fashion.
27) Create a new interrupt that puts characters onto the next
free screen location. See if you can get correct behaviour
in response to the Enter key being pressed (fairly easy)
and if the Back Space key is pressed (harder).
Interrupts and Procedures
Interrupts are short code fragments that provide useful services that can be used by other
programs. Typical routines handle key presses, mouse movements and button presses, screen
writing, disk reading and writing and so on.
An interrupt is like a procedure but it is called in a different way. Procedures are called by jumping
to the start address of the procedure. This address is known only to the program that owns the
procedure. Interrupts are called by looking up the address of the interrupt code in a table of
interrupt vectors. The contents of this table are published and widely known. MS DOS makes heavy
use of interrupts for all its disk, screen, mouse, network, keyboard and other services.
By writing your own code and making the interrupt vector point to the code you wrote, the
behaviour of interrupts can be completely altered. Your interrupt code might add some useful
behaviour and then jump back to the original code to complete the work. This is called TRAPPING
the interrupt.
Software interrupts are triggered, on demand, by programs.
Hardware interrupts are triggered by electronic signals to the CPU from hardware devices.
Interrupt Vector Table
In the IBM compatible computer, addresses 0 to 1023 decimal are used for storing interrupt vectors.
The entries in this table of vectors point to all the code fragments that control MS DOS screen, disk,
mouse, keyboard and other services. The simulator vectors sit between addresses 0 and 15 decimal.
It is convenient to start a simulator program with a jump command that occupies two bytes. This
means that the first free address for an interrupt vector is [02]. This is used by the hardware timer
if the interrupt flag is set.
Have another look at the example program. 10swint.asm
Calling an Interrupt
This is quite complex. The command INT 02 causes the CPU to retrieve the contents of RAM location
02. After saving the return address onto the stack, the instruction pointer IP is set to the address
that came from RAM.
The interrupt code is then executed. When complete the IRET command causes the return from the
interrupt. The CPU instruction pointer IP is set to the address that was saved onto the stack earlier.
Trapping an Interrupt
If you want to trap interrupt 02, change the address stored at address 02 to point to code that you
have written. Your code will then handle the interrupt. When complete, your code can use IRET to
return from the interrupt or it can jump to the address that was originally in address 02. This causes
the original interrupt code to be executed as well. In this way, you can replace or modify the
behaviour of an interrupt.
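The trapping steps can be sketched as follows. The addresses [30] and [50], the label names and what each handler does are assumptions chosen for illustration. The two patterns sent to the traffic lights are taken from the data tables used in this manual.

```asm
; Sketch: trap INT 02 and then chain to the original interrupt code.
    JMP Start       ; Jump past the table of interrupt vectors
    DB 50           ; Vector at [02] points to the original code at [50]
Start:
    MOV AL,30       ; Address of the replacement code
    MOV [02],AL     ; Trap: the vector at [02] now points to [30]
Rep:
    INT 02          ; Runs the code at [30], then the original at [50]
    JMP Rep
    ORG 30          ; Replacement interrupt code
    MOV AL,30       ; Green  Red
    OUT 01          ; Added behaviour: send a pattern to the traffic lights
    JMP Original    ; Chain to the original code. Its IRET returns to the caller
    ORG 50          ; Original interrupt code
Original:
    MOV AL,84       ; Red  Green
    OUT 01
    IRET
    END
```

Replacing JMP Original with IRET would make the new code replace the original behaviour instead of adding to it.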
Example – 11hwint.asm
Hardware Interrupts
; --------------------------------------------------------------
; An example of using hardware interrupts.
; This program spins the stepper motor continuously and
; steps the traffic lights on each hardware interrupt.
; Uncheck the “Show only one peripheral at a time” box
; to enable both displays to appear simultaneously.
; --------------------------------------------------------------
JMP Start ; Jump past table of interrupt vectors
DB 50 ; Vector at 02 pointing to address 50
Start:
STI ; Set I flag. Enable hardware interrupts
MOV AL,11 ;
Rep:
OUT 05 ; Stepper motor
ROR AL ; Rotate bits in AL right
JMP Rep
JMP Start
; --------------------------------------------------------------
ORG 50
PUSH al ; Save AL onto the stack.
PUSH bl ; Save BL onto the stack.
PUSHF ; Save flags onto the stack.
JMP PastData
DB 84 ; Red Green
DB c8 ; Red+Amber Amber
DB 30 ; Green Red
DB 58 ; Amber Red+Amber
DB 57 ; Used to track progress through table
PastData:
MOV BL,[5B] ; BL now points to the data table
MOV AL,[BL] ; Data from table goes into AL
OUT 01 ; Send AL data to traffic lights
CMP AL,58 ; Last entry in the table
JZ Reset ; If last entry then reset pointer
INC BL ; BL points to next table entry
MOV [5B],BL ; Save pointer in RAM
JMP Stop
Reset:
MOV BL,57 ; Pointer to data table start address
MOV [5B],BL ; Save pointer into RAM location 5B
Stop:
POPF ; Restore flags to their previous value
POP bl ; Restore BL to its previous value
POP al ; Restore AL to its previous value
IRET
; --------------------------------------------------------------
END
; --------------------------------------------------------------
TASK
28) Write a program that controls the heater and thermostat
whilst at the same time counting from 0 to 9 repeatedly,
displaying the result on one of the seven segment displays.
If you want a harder challenge, count from 0 to 99 repeatedly
using both displays. Use the simulated hardware interrupt to
control the heater and thermostat.
29) A fiendish problem. Solve the Tower of Hanoi problem whilst
steering the snake through the maze. Use the text characters
A, B, C Etc. to represent the disks. Use three of the four
rows on the simulated screen to represent the pillars.
30) Use the keyboard on Port 07. Write an interrupt handler
(INT 03) to process the key presses. You must also process
INT 02 (the hardware timer) but it need not perform any task.
For a more advanced task, implement a 16 byte circular buffer.
Write code to place the buffered text on the VDU screen when
you press Enter. For an even harder task, implement code to
process the Backspace key to delete text characters in the buffer.
Hardware Interrupts
Hardware Interrupts are short code fragments that provide useful services that can be triggered by
items of hardware. When a printer runs out of paper, it sends a signal to the CPU. The CPU
interrupts normal processing and processes the interrupt. In this case code would run to display a
“Paper Out” message on the screen. When this processing is complete, normal processing resumes.
This simulator has a timer that triggers INT 02 at regular time intervals that you can pre-set in the
Configuration Tab. You must put an interrupt vector at address 02 that points to your interrupt code.
Look at the example.
STI and CLI
Hardware interrupts are ignored unless the ‘I’ flag in the status register is set. To set the ‘I’ flag, use
the set ‘I’ command, STI. To clear the ‘I’ flag, use the clear ‘I’ command CLI.
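One possible use of CLI and STI is to stop the timer interrupt from arriving in the middle of a read, modify, write sequence on a RAM location that the interrupt code also changes. The sketch below assumes interrupt code at [40] and a shared counter at [60]. Both addresses and the job done by each part are hypothetical.

```asm
; Sketch: protect a shared RAM location with CLI and STI.
    JMP Start       ; Jump past the table of interrupt vectors
    DB 40           ; Vector at [02] points to the timer code at [40]
Start:
    STI             ; Set I flag. Hardware interrupts are accepted
Rep:
    CLI             ; Clear I flag. Hardware interrupts are ignored
    MOV AL,[60]     ; Read the counter
    INC AL          ; Modify it
    MOV [60],AL     ; Write it back
    STI             ; Set I flag again
    NOP             ; Interrupts can be accepted again from here
    JMP Rep
    ORG 40          ; Timer interrupt code
    PUSH AL         ; Save AL and the flags
    PUSHF
    MOV AL,00
    MOV [60],AL     ; Reset the counter
    POPF            ; Restore the flags and AL
    POP AL
    IRET
    ORG 60
    DB 00           ; The shared counter
    END
```

Without the CLI, an interrupt arriving between the read and the write would have its reset overwritten by the stale value in AL. The cost is that timer interrupts arriving while the I flag is clear are ignored.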
Hardware interrupts can be trapped in the same way that software interrupts can.
Hardware interrupts are triggered, as needed, by disk drives, printers, key presses, mouse
movements and other hardware events.
This scheme makes processing more efficient. Without interrupts, the CPU would have to poll the
hardware devices at regular time intervals to see if any processing was needed. This would happen
whether or not processing was necessary. Interrupts can be assigned priorities such that a disk drive
might take priority over a printer. It is up to the programmer to optimise all this for efficient
processing. In the IBM compatible PC, low number interrupts have a higher priority than the higher
numbers.
Calling an Interrupt
This is quite complex. The command INT 02, whether triggered by hardware or software, causes the
CPU to retrieve the contents of RAM location 02. After saving the return address onto the stack, the
instruction pointer IP is set to the address that came from RAM.
The interrupt code is then executed. When complete the IRET command causes the return from the
interrupt. The CPU instruction pointer IP is set to the address that was saved onto the stack earlier.
Hardware interrupts differ slightly from software interrupts. A software interrupt is called with a
command like INT 02 and the return address is the next instruction after this. IP + 2 is pushed onto
the stack. Hardware interrupts are not triggered by an instruction in a program so the return
address does not have to be set past the calling instruction. IP is pushed onto the stack.
Trapping an Interrupt
This is the same as trapping software interrupts described on the previous page.
Shortcut Keys
Key  Alt Keys                Control Keys
A    Assemble Button         Edit Select All
B    Log Assembler Activity
C    Configuration Tab       Edit Copy
D
E    Edit Menu
F    File Menu               Edit Find
G    Log File Tab
H    Help Menu
I
J    List File Tab
K    Tokens Tab
L    Slower Button
M    Show Ram Button
N    Continue Button
O    Stop Button             File Open
P    Step Button
Q
R    Run Button              Edit Replace
S    Reset Button            File Save
T    Faster Button
U    Source Code Tab
V    View Menu               Edit Paste
W    Write Run Log
X    Examples Menu           Edit Cut
Y
Z
Function Keys
F1   Help
F9   Run
F2 to F8 and F10 to F12 are listed without a function.
ASCII Codes
American Standard Code for Information Interchange
The ASCII code has 128 standard characters and a further 128 characters that vary from machine to
machine and country to country.
The first 128 ASCII characters are shown here.
     Dec 00  01  02  03  04  05  06  07  08  09  10  11  12  13  14  15
Dec  Hex 00  01  02  03  04  05  06  07  08  09  0A  0B  0C  0D  0E  0F
 00   00 Nul                         Bel Bak Tab LF          CR
 16   10                                         EOF ESC
 32   20 Spa !   "   #   $   %   &   '   (   )   *   +   ,   -   .   /
 48   30 0   1   2   3   4   5   6   7   8   9   :   ;   <   =   >   ?
 64   40 @   A   B   C   D   E   F   G   H   I   J   K   L   M   N   O
 80   50 P   Q   R   S   T   U   V   W   X   Y   Z   [   \   ]   ^   _
 96   60 `   a   b   c   d   e   f   g   h   i   j   k   l   m   n   o
112   70 p   q   r   s   t   u   v   w   x   y   z   {   |   }   ~   Del
The codes from 128 to 255 are not shown here.
Codes with special meanings to DOS, Printers, Teletype Machines and ANSI screens.
Decimal
0 Nul NULL character. ( End of text string marker. )
7 Bel Bell or beep character.
8 Bak Backspace character.
9 Tab TAB character.
10 LF Line Feed (start a new line).
13 CR Carriage Return code.
26 EOF DOS End of text file code.
27 Esc Escape code. It has special effects on older printers and ANSI screens. ANSI =
American National Standards Institute.
32 Spa Space Character.
255 Nul NULL character.
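ASCII codes are ordinary byte values, so text can be stored with DB and processed one byte at a time. The sketch below copies a Nul terminated string to the simulator's memory mapped screen. It assumes the screen starts at address [C0], and the text and label names are made up for illustration.

```asm
; Sketch: copy a Nul terminated string to the memory mapped screen.
    JMP Start       ; Skip over the text
    DB 48           ; 'H'
    DB 69           ; 'i'
    DB 00           ; Nul marks the end of the string
Start:
    MOV BL,02       ; BL points to the first character
    MOV CL,C0       ; CL points to the first screen position (assumed)
Next:
    MOV AL,[BL]     ; Fetch a character code
    CMP AL,00       ; Is it Nul?
    JZ Start        ; End of string: start again
    MOV [CL],AL     ; Put the character on the screen
    INC BL          ; Next character
    INC CL          ; Next screen position
    JMP Next
    END
```

The codes 48 and 69 come from the table above: row 40 column 8 is H and row 60 column 9 is i.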
Unicode
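ASCII is being replaced by Unicode, which aims to represent every text character in every country
in the world including those used historically. Unicode began as a 16 bit code with 65536
characters. It has since been extended to over a million code points, and its first 128 codes are
the same as ASCII. Most new operating systems and software packages support Unicode.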
Glossary
386 CPU chips in IBM compatible computers are typically numbered 086, 186, 286,
386, 486, 586, Pentium Etc. 086 chips are now regarded as old fashioned and
slow. To run Windows, a 32 bit 386 chip was the minimum recommended.
8 Bit CPU The CPU has registers and connections to the outside world that are 8 bits wide.
16 bit and 32 bit CPUs are now more common, more powerful and more
expensive. 64 bit CPUs exist but are not common (2003)
80×86 The family of Intel chips numbered 8086, 80186, 80286, 80386, 80486 and
Pentium.
Accelerator Key Improves your productivity. For example Alt+F4 closes the current window and
is quicker to press than the equivalent mouse or menu actions.
Analogue Electronic systems that deal with continuously varying signals. Radio, TV and
HiFi systems are all analogue. CD Players are digital but the digital signals must
be converted to analogue before being sent to the HiFi system.
ANSI American National Standards Institute
Architecture CPU designs are more complex than typical building designs. Computer
architecture is equivalent to building architecture. To make best use of a
computer, it is useful to know something about the computer design or
architecture.
ASCII The American Standard Code for Information Interchange. This is an eight bit
code. There are 128 characters which are standard. There are a further 128
characters that vary depending on the country and the graphics symbols
required by printers. American ASCII is being replaced by International Unicode.
ASM The usual file extension for assembly code programs
Assembler
Assembly code
Human readable commands like MOV AL,33 correspond closely to CPU machine
codes. The assembler program translates the human readable codes into
machine codes readable by the CPU
Author C Neil Bauers can be E Mailed on the internet at
nbauers@samphire.demon.co.uk
Backup copy Copies of files kept in case of disaster. These should be kept in a secure place
away from the computer system they belong to. Important files should be
backed up in more than one place. Sod’s law applies to back up files. The file
you really need is the one you have failed to back up.
Base Address The start address of an object stored in memory. For example : The original IBM
PC VGA screen base address is B800:0000 followed by 4000 more bytes.
Binary Base two numbers used by digital systems. Count with two symbols [ 0 1 ]
Binary numbers are composed of noughts and ones. Electronically this is
achieved by circuits that are switched off or on.
Bit Masks Patterns of noughts and ones used with AND, OR and XOR to extract or insert
bits into bytes.
Bits Binary digits. Single digits that are nought or one.
Byte Eight Bits. The data in a byte can have many different meanings depending on
the context. A byte can represent a CPU command, an ASCII character, a
decimal number, a graphics pattern or anything you have programmed it to
represent.
Carriage Return ASCII code 13 used to move the printer carriage or head to the left of the page.
The screen cursor performs in the equivalent way. See also – Line Feed
Case Sensitive Upper and lower case are taken to be different. This simulator is not case
sensitive.
Chip Shorthand for microchip or integrated circuit. The CPU is often referred to as the
CPU Chip.
Click Usually the left mouse button being pressed when the mouse is pointing at a
screen object.
Clock The CPU clock steps the computer and CPU at regular time intervals keeping all
parts of the computer in step. Typical clock speeds range from 1 to 500
Megahertz. 200 MHz was typical for a PC in 1997.
Comments These begin with ‘;’ and are used to explain what the program is doing. Good
comments explain why things are being done. Bad comments simply repeat
what is obvious by looking at the code.
Conditional
Jumps
These jumps either take place or not depending on the flags in the status
register. See JS, JNS, JO, JNO, JZ and JNZ. JMP is the unconditional jump.
Control Key This is used to give keys special meanings. For example the combination of the
control key with the F4 function key will close a window in some software
packages.
Control Systems Industrial and domestic equipment is frequently controlled by a small
microcomputer called a microcontroller. The control system is programmed once
for life so a TV remote controller can not be re-programmed as a washing
machine controller.
CPU Central Processing Unit. The part of the computer that does the computations.
Usually this is a single microchip.
Cursor A flashing symbol that indicates position within text. Alternatively the mouse
cursor indicates the mouse position. Special purpose cursors are used in some
software.
Data tables These store numbers, text or pointers to other data objects. It is easier to look
after data in a table than data scattered throughout a program. It is good style
to use data tables.
Decimal Base 10 numbers. Count and do arithmetic with ten symbols. [ 0 1 2 3 4 5 6 7 8 9 ]
Digital Electronic Systems that use binary. Computers use binary numbers and are digital. HiFi
systems do not use binary and are not digital. (A HiFi remote control system is
digital) See analogue.
Directory or
Folder
File systems are organised into directories in much the same way that filing
cabinets are organised into drawers and folders. Your files should be stored in a
directory that you have created. This keeps your files from getting muddled up
with all the other files on the computer.
Divide by zero This will cause an error. Division by zero is undefined, so the result can
not be calculated.
End Of File ASCII code 26 is used to indicate the end of MS DOS text files.
Escape ASCII
code 27
This character is often interpreted in a special way by programs, VDUs and
printers.
Executable Code Non human readable program code executed by the CPU.
Explorer See File Manager
Extension The MS DOS file extension is zero or more characters after the dot in the file
name. Word processor files often have .DOC on the end. Assembly code files
end in .ASM
F1 Key Commonly this accesses the on line help.
File Data stored on disk or tape. When the data is loaded from the file into RAM, it
could consist of a program or data used by the program.
File Manager or
Explorer
A windows program that enables you to manage your files. Copying, renaming
and deleting files and directories are typical file management tasks.
Flags The Sign, Zero and Overflow flags in the status register indicate the
outcome of the previous calculation. The Interrupt flag controls whether
hardware interrupts are accepted. See I Flag, S Flag, O Flag and Z flag.
Floppy disk Used to store files. 3.5 inch disks have a hard rectangular plastic casing to
protect the thin floppy disk inside. Older disks are actually floppy. The case is
bendy cardboard.
Folder See Directory
Function keys F1, F2 … F10. These keys have special purposes depending on the software in
use. F1 usually activates help. F10 usually activates the menu.
General Purpose
Registers
AL, BL, CL and DL are used to store data and perform calculations.
Gigahertz 1000 Megahertz. CPU Clock speeds are now measured in gigahertz.
Graphics Images, pictures and geometrical shapes are examples of graphics. Windows
displays everything as graphics. This gives good looking displays but a lot of
processing is needed to achieve it.
Hard disk A disk that can not normally be removed from the computer. Most computer
files are stored on the hard disk. There should also be backup copies stored
elsewhere in case the hard disk fails.
Hexadecimal Base 16 numbers. Count and do arithmetic with 16 symbols. [ 0 1 2 3 4 5 6 7 8
9 A B C D E F ] Hexadecimal and Binary are easily converted which is why
hexadecimal is used.
Hot Keys Ctrl+S and Ctrl+O are examples of hot keys. These give quick access to menu
options. Ctrl+S gives the File Save command. Ctrl+O gives the File Open
command.
I Flag The I or interrupt flag in the status register indicates if the CPU will accept or
ignore hardware interrupts. The commands CLI and STI clear and set this flag.
Hardware interrupts are used to signal events like “Key pressed”, “Disk
Ready” or “Printer out of paper”. A hardware timer can generate an interrupt at
regular time intervals.
Immediate The instruction MOV AL,25 is an example of an immediate instruction. See also :
Register, Indirect, Register indirect and MOV.
Indirect
Indirection
This is where data in RAM is referred to with a pointer. For example MOV
AL,[20] moves the data from RAM location 20 into the AL register. [20] is a
pointer to the RAM location. The technique is called indirection. See MOV,
Immediate, Register, Register indirect
Instruction
Pointer
IP points to the instruction being executed. When the instruction is complete,
the IP is moved onto the next instruction. In the RAM displays, the instruction
pointer is highlighted red with yellow text.
Instruction Set The set of instructions that are recognised by a CPU. Typical instructions are
Move, Add and Subtract.
Interrupt code
Interrupt handler
Interrupt routine
A program fragment designed to be activated at any time that an interrupt
occurs. The fragment is stored at an address pointed to by an interrupt vector.
Interrupts can be triggered by hardware. For example a key press or the printer
running out of paper cause a hardware interrupt. The CPU switches to the code
that handles the interrupt. When finished, the CPU continues with its earlier
task.
Interrupt Vector A pointer stored in a table. The pointer points at the interrupt handler. See INT.
IO Short for Input Output. See IN and OUT
Least significant
bit
LSB. The right hand bit in a byte which is worth 0 or 1.
MSB. The left hand bit in a byte which is worth 0 or 128.
Least and Most significant bits.
LIFO See Stack.
Line Feed ASCII code 10 used to start a new line on the printed page or screen. See also –
Carriage Return.
List File This is generated by the simulator assembler. It contains the program written by
the programmer. It also contains the machine codes generated by the
assembler.
Low level Low level programming is done using the CPU machine code or mnemonics that
are close to the machine codes.
LSB LSB. The right hand bit in a byte which is worth 0 or 1.
MSB. The left hand bit in a byte which is worth 0 or 128.
Least and Most significant bits.
Machine codes Machine codes are executed by the CPU. See Assembly code.
Human readable commands look like this MOV AL,55
The hexadecimal equivalent looks like this D0 00 55
The binary machine code looks like this 110100000000000001010101
Megabyte 2^20 (1048576) bytes to be precise or a million bytes approximately.
Megahertz MHz. Million clock cycles per second. A 33 MHz clock means that the CPU
performs 33 million steps per second. These sorts of speeds are needed to fill
screens with high resolution graphics quickly.
Memory Mapped Memory mapped hardware is controlled by writing data into memory locations
occupied by that hardware device. This simulator has a memory mapped screen
so each screen position corresponds to a memory location.
Microchip Complex electronic circuits miniaturised onto a single wafer or chip of silicon
Microcontroller Usually a single chip microcircuit containing a CPU, RAM and ROM.
Microcontrollers are used in TV remote controllers, washing machines, digital
clocks, microwave ovens, industrial plant controllers, car engine management
systems and computer games.
Microprocessor A single chip CPU.
Mnemonic A memorable and human readable item like MOV that corresponds to a non
memorable item like 11010000 that means the same thing.
Most significant
bit MSB
LSB. The right hand bit in a byte which is worth 0 or 1.
MSB. The left hand bit in a byte which is worth 0 or 128.
Most Significant Bit. The left hand bit in a byte. It has a value of 128 decimal or
80Hex if the byte is unsigned (positive numbers only). It has a value of -128 if
the byte is signed (positive and negative numbers). The MSB has a value that
depends on how wide in bits the data storage location is.
Multiplexing Combining two or more data flows onto a single carrying medium. For example
a thousand telephone calls can be carried on a single cable. De-multiplexing
separates the channels and routes them to their correct destinations.
NULL ASCII code zero used to mark the end of text strings.
O Flag The O or overflow Flag in the status register indicates if the previous calculation
overflowed its register.
Off Line The network is disconnected. However, resources can be made available locally
(off-line) even when the network is not available. When the network is
re-connected, the data files are synchronised so everyone gets the most up-to-date
information.
On Line The network is connected. Computer resources are connected and available and
can be accessed with a negligible or short time delay. On line resources usually
involve interaction with the user.
OP Code A binary code that the CPU can interpret as a command. These correspond to
commands like MOV and ADD.
Operand Essential data that comes after the op code.
MOV      AL,      55
Op-Code  Operand  Operand
Overflow Flag This is set if the result of the previous calculation was too big to fit the register.
Parameters Data passed into procedures or functions. Parameters can be passed using
registers (very fast), RAM locations (good for big data items) or the Stack
(useful if recursion is needed).
Peripherals Hardware plugged into the computer. Anything from a keyboard or mouse to a
power station or chemical works.
Pointers In the command MOV AL,[25] the 25 is a pointer to the RAM location with
address 25. See indirection.
Ports Input Output Ports. Peripherals are connected to ports. The IN and OUT machine
instructions are used to communicate with the peripherals.
Procedures Small, modular, self contained, easily tested, code fragments that can be used
many times during the execution of a program. See CALL and RET in the
instruction set.
Process A program that is running or loaded ready to run. Processes can be running,
ready to run or waiting. Waiting processes are usually queueing up for disk or
printer access. A waiting process might be waiting for its share of CPU time.
Programs Instructions executed by a computer to perform tasks.
RAM Random access memory. Electronic memory that stores bytes. Normal RAM
forgets what it was storing when switched off.
Recursion A powerful technique where a procedure or function re-uses itself to achieve a
task.
Register A location in the CPU where data is stored. This simulator has four general
purpose registers called AL, BL, CL and DL. It has special purpose registers
called IP, SR and SP.
Register In the instruction CMP AL,BL registers are being compared. See also :
Immediate, Indirect, Register indirect.
Register indirect In the instruction MOV AL,[BL] the BL register contains a pointer to a RAM
location. The data in this RAM location is moved into AL. This is a register
indirect move. See also : Immediate, Indirect and Register.
Repetition This is achieved by using jump commands to make the program jump back and
repeat instructions.
Reset CPU Reset the CPU to its switch on state. Clear the general purpose registers to zero.
Set IP to zero. Set the flags to Zero. Set the stack pointer to BF. The stack
grows downwards from address BF.
Return address The address stored on the stack that the program returns to when a procedure
or interrupt is complete.
Run Run a program. Programs are collections of stored instructions that are usually
inactive. To run a program, it must be copied from disk into RAM and the CPU is
given the address of the first instruction in the program so it can run it. A
running program is often called a process.
S Flag The S or sign flag in the status register indicates if the previous calculation gave
a negative result.
Save a file Copy processed data from RAM onto disk.
Seven segment displays are used in digital clocks, watches, calculators and so on. Numbers are
built up by illuminating combinations of the seven segments.
Scheduler The scheduler is a process that manages all the other processes in a computer.
It aims to make best use of the hardware resources and to minimise delays to
processes and users.
Sign bit The leftmost bit in a binary number that is used to indicate if the number is
positive or negative.
Sign Flag This is set if a calculation gives a negative result.
Signed Numbers Numbers where the left most bit is the sign bit.
Simulator Computer software that models reality in some way. Virtual reality aims to make
the simulation so realistic that it seems real. Most simulations are designed to
be useful rather than realistic.
Source Code The human readable program code typed into the computer. See executable
code.
SP The stack pointer register. In the RAM displays, the stack pointer is highlighted
blue with yellow text.
SR The status register. This contains flags that are set as a result of the most
recent calculation. A zero result will set the Z (zero) flag. A negative result will
set the S (sign) flag. A result too big to fit in a register will set the O flag
(overflow). If the I flag is clear (not set) interrupts will be ignored.
Stack An area of memory used for temporary storage according to the LIFO rule. Last
in First out. The stack is used to save register contents for later restoration,
pass parameters into procedures and return results, reverse the order in which
data is stored, save addresses so procedures can return to the right place and
there are other uses including doing postfix arithmetic.
Stack Pointer Points to the next free location on the stack. In the RAM displays, the stack
pointer is highlighted blue with yellow text. The stack is memory organised as
LIFO last in first out. It is used to store return addresses, the CPU state,
parameters passed to procedures, results returned from procedures, arithmetic
data being processed and data whose order is to be reversed.
Status Flags
Status Register
The status Register contains status flags that indicate the outcome of the
previous calculation. The flags are Sign, Zero and Overflow. See SR.
Stepper motor A special motor that rotates in small controlled angular movements. It is
commonly used in printers, plotters, medical instruments and disk drives.
Task Switching Use Alt Tab to task switch manually. Operating systems also task switch
automatically. For example when word processing, the clock display continues to
work because from time to time the operating system switches tasks to keep
both going.
Thermostat A temperature controlled switch. On when too cold. Off when too hot.
Token List When programs are translated into machine code, one of the first steps is to
convert the source code of the program into tokens. These are not usually
human readable. The tokens are designed to occupy minimal memory. This
simulator converts source code to tokens but does not bother to code them to
save memory. This is because the programs are too small to use much memory.
Twos
complement
Binary numbers where the left most bit determines whether the number is
positive or negative.
Unicode A 16 bit character code with 65536 text characters for all the languages in the
world including most dead (disused) languages. This code is replacing ASCII.
Unsigned
numbers
Numbers without a sign bit. These are always positive.
USERINFO.REG Simulator registration information is contained in this file. It is a text file and has
nothing to do with the Windows registry. The same file format was used under
Windows 3 which did not have a registry.
VDU Visual display unit. Computer output is commonly displayed on the VDU. There
are several VDU display technologies.
Write A simple Windows word processor. Data is saved to disk in a format unique to
the Write program.
Z Flag The Z or zero flag is set if the previous calculation result was zero.
Binary and Hexadecimal
Converting Between Binary and Hex
The CPU works using binary. Electronically this is done with electronic switches that are either on or
off. This is represented on paper by noughts and ones. A single BIT or binary digit requires one wire
or switch within the CPU. Usually data is handled in BYTES or multiples of bytes. A Byte is a group of
eight bits. A byte looks like this
01001011
This is inconvenient to read, say and write down so programmers use hexadecimal to represent
bytes. Converting between binary and hexadecimal is not difficult. First split the byte into two
nybbles (half a byte) as follows
0100 1011
Then use the following table
BINARY HEXADECIMAL DECIMAL
0 0 0 0 0 0
0 0 0 1 1 1
0 0 1 0 2 2
0 0 1 1 3 3
0 1 0 0 4 4
0 1 0 1 5 5
0 1 1 0 6 6
0 1 1 1 7 7
1 0 0 0 8 8
1 0 0 1 9 9
1 0 1 0 A 10
1 0 1 1 B 11
1 1 0 0 C 12
1 1 0 1 D 13
1 1 1 0 E 14
1 1 1 1 F 15
EXAMPLE
Split the byte into two halves
01001011 becomes 0100 1011
Using the table above
0100 is 4
1011 is B
The answer …
0100 1011 is 4B in Hexadecimal.
To convert the other way take a hexadecimal such as E7.
Look up E in the table. It is 1110.
Look up 7 in the table. It is 0111.
E7 is 1110 0111.
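The nybble-by-nybble method above can be sketched in Python. This is only an illustration of the table lookup, not part of the simulator; the names NIBBLE_TO_HEX, HEX_TO_NIBBLE, byte_to_hex and hex_to_byte are made up for this example.

```python
# Lookup tables mirroring the BINARY and HEXADECIMAL columns of the table above.
# All names here are hypothetical helpers, not part of SMS32V50.
NIBBLE_TO_HEX = {format(n, "04b"): format(n, "X") for n in range(16)}
HEX_TO_NIBBLE = {h: b for b, h in NIBBLE_TO_HEX.items()}


def byte_to_hex(bits: str) -> str:
    """Convert an 8-character binary string such as '01001011' to two hex digits."""
    if len(bits) != 8 or set(bits) - {"0", "1"}:
        raise ValueError("expected exactly eight binary digits")
    high, low = bits[:4], bits[4:]  # split the byte into two nybbles
    return NIBBLE_TO_HEX[high] + NIBBLE_TO_HEX[low]


def hex_to_byte(hex_digits: str) -> str:
    """Convert two hex digits such as 'E7' to an 8-character binary string."""
    if len(hex_digits) != 2:
        raise ValueError("expected exactly two hexadecimal digits")
    return "".join(HEX_TO_NIBBLE[d] for d in hex_digits.upper())


print(byte_to_hex("01001011"))  # 4B
print(hex_to_byte("E7"))        # 11100111
```

Both calls reproduce the worked examples above: 0100 1011 is 4B and E7 is 1110 0111.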
Instruction Set Summary
AL, BL, CL and DL are eight bit, general purpose registers where data is stored.
Square brackets indicate RAM locations. For example [15] means RAM location 15.
Data can be moved from a register into RAM and also from RAM into a register.
Registers can be used as pointers to RAM. [BL] is the RAM location that BL points to.
All numbers are in base 16 (Hexadecimal).
Move Instructions. Flags NOT set.
Assembler Machine Code Explanation
MOV AL,15 D0 00 15 AL = 15 Copy 15 into AL
MOV BL,[15] D1 01 15 BL = [15] Copy RAM[15] into BL
MOV [15],CL D2 15 02 [15] = CL Copy CL into RAM[15]
MOV DL,[AL] D3 03 00 DL = [AL] Copy RAM[AL] into DL
MOV [CL],AL D4 02 00 [CL] = AL Copy AL into RAM[CL]
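As a rough illustration of how the five MOV forms in the table map to machine code, here is a small Python sketch. It is not the simulator's assembler. The helper names parse_operand and encode_mov are hypothetical, and the only facts used are the op codes D0 to D4 and the register codes 00 to 03 listed in this summary.

```python
# Register codes as listed in this manual: AL=00, BL=01, CL=02, DL=03.
REGISTER_CODES = {"AL": 0x00, "BL": 0x01, "CL": 0x02, "DL": 0x03}

# Op codes for the five MOV forms in the table above, keyed by operand kinds.
MOV_OPCODES = {
    ("reg", "imm"): 0xD0,  # MOV AL,15
    ("reg", "mem"): 0xD1,  # MOV BL,[15]
    ("mem", "reg"): 0xD2,  # MOV [15],CL
    ("reg", "ind"): 0xD3,  # MOV DL,[AL]
    ("ind", "reg"): 0xD4,  # MOV [CL],AL
}


def parse_operand(text):
    """Classify an operand as ('reg', n), ('imm', n), ('mem', n) or ('ind', n)."""
    text = text.strip().upper()
    if text.startswith("[") and text.endswith("]"):
        inner = text[1:-1].strip()
        if inner in REGISTER_CODES:
            return ("ind", REGISTER_CODES[inner])  # [BL]: register used as a pointer
        return ("mem", int(inner, 16))             # [15]: RAM address, written in hex
    if text in REGISTER_CODES:
        return ("reg", REGISTER_CODES[text])
    return ("imm", int(text, 16))                  # all numbers are hexadecimal


def encode_mov(dest, src):
    """Return the three machine code bytes for MOV dest,src as a hex string."""
    (dest_kind, dest_value) = parse_operand(dest)
    (src_kind, src_value) = parse_operand(src)
    opcode = MOV_OPCODES.get((dest_kind, src_kind))
    if opcode is None:
        raise ValueError(f"MOV {dest},{src} is not one of the listed forms")
    return " ".join(f"{b:02X}" for b in (opcode, dest_value, src_value))


print(encode_mov("AL", "15"))    # D0 00 15
print(encode_mov("[15]", "CL"))  # D2 15 02
```

Calling encode_mov("AL", "BL") raises ValueError in this sketch, because a register to register MOV is not among the listed forms.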
Direct Arithmetic and Logic. Flags are set.
Assembler Machine Code
ADD AL,BL A0 00 01 AL = AL + BL
SUB BL,CL A1 01 02 BL = BL – CL
MUL CL,DL A2 02 03 CL = CL * DL
DIV DL,AL A3 03 00 DL = DL / AL
INC DL A4 03 DL = DL + 1
DEC AL A5 00 AL = AL – 1
AND AL,BL AA 00 01 AL = AL AND BL
OR CL,BL AB 02 01 CL = CL OR BL
XOR AL,BL AC 00 01 AL = AL XOR BL
NOT BL AD 01 BL = NOT BL
ROL AL 9A 00 Rotate bits left. LSB = MSB
ROR BL 9B 01 Rotate bits right. MSB = LSB
SHL CL 9C 02 Shift bits left. Discard MSB.
SHR DL 9D 03 Shift bits right. Discard LSB.
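The difference between rotating and shifting is easy to miss in a table, so here is a minimal Python sketch of the four bit operations on an eight bit value. The function names rol, ror, shl and shr are chosen to match the mnemonics. The code models only the resulting register value, not the flags, and it is an illustration, not the simulator's implementation.

```python
def rol(value):
    """Rotate left: the bit leaving the left end re-enters at the right end."""
    value &= 0xFF
    return ((value << 1) | (value >> 7)) & 0xFF


def ror(value):
    """Rotate right: the bit leaving the right end re-enters at the left end."""
    value &= 0xFF
    return ((value >> 1) | (value << 7)) & 0xFF


def shl(value):
    """Shift left: the leftmost bit is discarded, a zero enters on the right."""
    return (value << 1) & 0xFF


def shr(value):
    """Shift right: the rightmost bit is discarded, a zero enters on the left."""
    return (value & 0xFF) >> 1


start = 0b10000001
for name, op in (("ROL", rol), ("ROR", ror), ("SHL", shl), ("SHR", shr)):
    print(name, format(start, "08b"), "->", format(op(start), "08b"))
# ROL 10000001 -> 00000011
# ROR 10000001 -> 11000000
# SHL 10000001 -> 00000010
# SHR 10000001 -> 01000000
```

Rotates keep all eight bits in circulation, while shifts lose one bit per step.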
Immediate Arithmetic and Logic. Flags are set.
Assembler Machine Code
ADD AL,12 B0 00 12 AL = AL + 12
SUB BL,15 B1 01 15 BL = BL – 15
MUL CL,03 B2 02 03 CL = CL * 3
DIV DL,02 B3 03 02 DL = DL / 2
AND AL,10 BA 00 10 AL = AL AND 10
OR CL,F0 BB 02 F0 CL = CL OR F0
XOR AL,AA BC 00 AA AL = AL XOR AA
Compare Instructions. Flags are set.
Assembler Machine Code Explanation
CMP AL,BL DA 00 01 Set 'Z' flag if AL = BL. Set 'S' flag if AL < BL.
CMP BL,13 DB 01 13 Set 'Z' flag if BL = 13. Set 'S' flag if BL < 13.
CMP CL,[20] DC 02 20 Set 'Z' flag if CL = [20]. Set 'S' flag if CL < [20].
Branch Instructions. Flags NOT set.
Depending on the type of jump, different machine codes can be generated. Jump instructions cause the
instruction pointer (IP) to be altered. The largest possible jumps are +127 bytes and -128 bytes.
The CPU flags control these jumps.
The 'Z' flag is set if the most recent calculation gave a Zero result.
The 'S' flag is set if the most recent calculation gave a negative result.
The 'O' flag is set if the most recent calculation gave a result too big to fit in the register.
Assembler Machine Code Explanation
JMP HERE C0 12 Increase IP by 12.
JMP HERE C0 FE Decrease IP by 2 (twos complement).
JZ THERE C1 09 Increase IP by 9 if the 'Z' flag is set.
JZ THERE C1 9C Decrease IP by 100 if the 'Z' flag is set.
JNZ A_Place C2 04 Increase IP by 4 if the 'Z' flag is NOT set.
JNZ A_Place C2 F0 Decrease IP by 16 if the 'Z' flag is NOT set.
JS STOP C3 09 Increase IP by 9 if the 'S' flag is set.
JS STOP C3 E1 Decrease IP by 31 if the 'S' flag is set.
JNS START C4 04 Increase IP by 4 if the 'S' flag is NOT set.
JNS START C4 E0 Decrease IP by 32 if the 'S' flag is NOT set.
JO REPEAT C5 09 Increase IP by 9 if the 'O' flag is set.
JO REPEAT C5 DF Decrease IP by 33 if the 'O' flag is set.
JNO AGAIN C6 04 Increase IP by 4 if the 'O' flag is NOT set.
JNO AGAIN C6 FB Decrease IP by 5 if the 'O' flag is NOT set.
Procedures and Interrupts. Flags NOT set.
CALL, RET, INT and IRET are available only in the registered version.
Assembler Machine Code Explanation
CALL 30 CA 30 Save IP on the stack and jump to the procedure at address 30.
RET CB Restore IP from the stack and jump to it.
INT 02 CC 02 Save IP on the stack and jump to the address (interrupt vector) retrieved from RAM[02].
IRET CD Restore IP from the stack and jump to it.
Stack Manipulation Instructions. Flags NOT set.
Assembler Machine Code Explanation
PUSH BL E0 01 BL is saved onto the stack.
POP CL E1 02 CL is restored from the stack.
PUSHF EA SR flags are saved onto the stack.
POPF EB SR flags are restored from the stack.
Input Output Instructions. Flags NOT set.
Assembler Machine Code Explanation
IN 07 F0 07 Data input from I/O port 07 to AL.
OUT 01 F1 01 Data output to I/O port 01 from AL.
Miscellaneous Instructions. CLI and STI control the I flag.
Assembler Machine Code Explanation
CLO FE Close visible peripheral windows.
HALT 00 Halt the processor.
NOP FF Do nothing for one clock cycle.
STI FC Set the interrupt flag in the Status Register.
CLI FD Clear the interrupt flag in the Status Register.
ORG 40 Code origin Assembler directive: Generate code starting from address 40.
DB "Hello" Define byte Assembler directive: Store the ASCII codes of 'Hello' into RAM.
DB 84 Define byte Assembler directive: Store 84 into RAM.
Detailed Instruction Set
The Full Instruction Set
Arithmetic Logic
Jump Instructions
Move Instructions
Compare Instructions
Stack Instructions
Procedures And Interrupts
Inputs and Outputs
Other Instructions
General Information
CPU Registers
There are four general purpose registers called AL, BL, CL and DL.
There are three special purpose registers. These are
IP is the instruction pointer.
SP is the stack pointer.
SR is the status register. This contains the I, S, O and Z flags.
Flags
Flags give information about the outcome of computations performed by the CPU. Single bits in the
status register are used as flags. This simulator has flags to indicate the following.
S The sign flag is set if a calculation gives a negative result.
O The overflow flag is set if a result is too big to fit in 8 bits.
Z The zero flag is set if a calculation gives a zero result.
I is the hardware interrupts enabled flag.
Most real life CPUs have more than four flags.
Registers and Machine Codes
The registers and their equivalent machine code numbers are shown below.
Register names AL BL CL DL
Machine codes 00 01 02 03
Example: To add one to the CL register use the instruction
Assembly Code INC CL
Machine Code Hex A4 02
Machine Code Binary 10100100 00000010
A4 is the machine instruction for the INC command. 02 refers to the CL register.
The assembler is not case sensitive. mov is the same as MOV and Mov.
Within the simulator, hexadecimal numbers may not have more than two hexadecimal digits. 15, 3C and
FF are examples of valid hexadecimal numbers.
When using the assembler, all numbers should be entered in hexadecimal. The CPU window displays the
registers in binary, hexadecimal and decimal. Look at the Hexadecimal and Binary page for more detail.
Negative numbers
FE is a negative number. Look at the Negative Numbers table for details of twos complement numbers.
In a byte, the left most bit is used as a sign bit. This has a value of minus 128 decimal. Bytes can
hold signed numbers in the range -128 to +127. Bytes can hold unsigned numbers in the range 0 to 255.
Indirection
When referring to data in RAM, square brackets are used. For example [15] refers to the data at
address 15 hex in RAM. The same applies to registers. [BL] refers to the data in RAM at the address
held in BL. This is important and frequently causes confusion. These are indirect references. Instead
of using the number or the value in the register directly, these values refer to RAM locations. These
are also called pointers.
Comparing with 80x86 Chips
At the mnemonic level, the simulator instructions look very like 80x86 assembly code mnemonics.
Sufficient instructions are implemented to permit realistic programming but the full instruction set
has not been implemented. All the simulated instructions apply to the low eight bits of the 80x86
CPU. The rest of the CPU has not been simulated. In the registered version, CALL, RET, INT, IRET and
simulated hardware interrupts are available so procedures and interrupts can be written.
Most of the instructions behave as an 80x86 programmer would expect. The MUL and DIV (multiplication
and division) commands are simpler than the 80x86 equivalents. The disadvantage of the simulator
approach is that overflow is much more probable. The simulator versions of ADD and SUB are realistic.
The 8086 DIV instruction calculates both DIV and MOD in the same instruction. The simulator has MOD
as a separate instruction.
The machine codes are quite unlike the 80x86 machine codes. They are less compact but designed to be
as simple as possible. With 80x86 machine code, a mnemonic like MOV AL,15 is encoded in two bytes.
MOV AL, is encoded into one byte and the 15 goes into another. This means that a lot of different
machine OP CODES are needed for all the different combinations of MOV commands and registers. This
simulator needs three bytes. MOV is encoded as a byte sized OP CODE. AL is encoded as a byte
containing 00. The 15 goes into a byte as before. This is not very efficient but is very simple.
Arithmetic and Logic
Arithmetic Instructions - Flags are set.
The Commands
Arithmetic
Add - Addition
Sub - Subtraction
Mul - Multiplication
Div - Division
Mod - Remainder after division
Inc - Increment (add one)
Dec - Decrement (subtract one)
Logic
AND - Logical AND - 1 AND 1 gives 1. Any other input gives 0.
OR - Logical OR - 0 OR 0 gives 0. Any other input gives 1.
XOR - Logical exclusive OR - Equal inputs give 0. Non equal inputs give 1.
NOT - Logical NOT - Invert the input. 0 gives 1. 1 gives 0.
Bitwise
ROL - Rotate bits left. Bit at left end moved to right end.
ROR - Rotate bits right. Bit at right end moved to left end.
SHL - Shift bits left and discard leftmost bit.
SHR - Shift bits right and discard rightmost bit.
COMMANDS DIRECT EXAMPLES
OP Assembler Machine Code Explanation
ADD ADD AL,BL A0 00 01 Add BL to AL
SUB SUB CL,DL A1 02 03 Subtract DL from CL
MUL MUL AL,CL A2 00 02 Multiply AL by CL
DIV DIV BL,DL A3 01 03 Divide BL by DL
MOD MOD DL,BL A6 03 01 Remainder after dividing DL by BL
INC INC AL A4 00 Add one to AL
DEC DEC BL A5 01 Deduct one from BL
AND AND CL,AL AA 02 00 CL becomes CL AND AL
OR OR CL,DL AB 02 03 CL becomes CL OR DL
XOR XOR BL,AL AC 01 00 BL becomes BL XOR AL
NOT NOT CL AD 02 Invert the bits in CL
ROL ROL DL 9A 03 Bits in DL rotated one place left
ROR ROR AL 9B 00 Bits in AL rotated one place right
SHL SHL BL 9C 01 Bits in BL shifted one place left
SHR SHR CL 9D 02 Bits in CL shifted one place right
COMMANDS IMMEDIATE EXAMPLES
OP Assembler Machine Code Explanation
ADD ADD AL,15 B0 00 15 Add 15 to AL
SUB SUB BL,05 B1 01 05 Subtract 5 from BL
MUL MUL AL,10 B2 00 10 Multiply AL by 10
DIV DIV BL,04 B3 01 04 Divide BL by 4
MOD MOD DL,20 B6 03 20 Remainder after dividing DL by 20
AND AND CL,55 BA 02 55 CL becomes CL AND 55 (01010101)
OR OR CL,AA BB 02 AA CL becomes CL OR AA (10101010)
XOR XOR BL,F0 BC 01 F0 BL becomes BL XOR F0
Examples
ADD CL,AL - Add CL to AL and put the answer into CL.
ADD AL,22 - Add 22 to AL and put the answer into AL.
The answer always goes into the first register in the command.
DEC BL - Subtract one from BL and put the answer into BL.
The other commands all work in the same way.
Flags
If a calculation gives a zero answer, set the Z zero flag.
If a calculation gives a negative answer, set the S sign flag.
If a calculation overflows, set the O overflow flag.
An overflow happens if the result of a calculation has more bits than will fit into the available
register. With 8 bit registers, the signed numbers that fit range from -128 to +127.
Jump Instructions
Jump Instructions - Flags are NOT set.
These instructions do NOT set the Z, S or O flags but conditional jumps use the flags to determine
whether or not to jump.
The CPU contains a status register - SR. This contains flags that are set or cleared depending on
the most recent calculation performed by the processor. The CMP compare instruction performs a
subtraction like the SUB command. It sets the flags but the result is not stored.
The Flags - ISOZ
1. ZERO - The Z flag is set if the most recent calculation gave a zero result.
2. SIGN - The S flag is set if the most recent calculation gave a negative result.
3. OVERFLOW - The O flag is set if the most recent calculation gave a result too big to fit a register.
4. INTERRUPT - The I flag is set in software using the STI command. If this flag is set, the CPU will
respond to hardware interrupts. The CLI command clears the I flag and hardware interrupts are
ignored. The I flag is off by default.
The programmer enters a command like JMP HERE. The assembler converts this into machine code by
calculating how far to jump. This tedious and error prone task (for humans) is automated. In an 8 bit
register, the signed numbers that can be stored range from -128 to +127. This limits the maximum
distance a jump can go. Negative numbers cause the processor to jump backwards towards zero. Positive
numbers cause the processor to jump forward towards 255. The jump distance is added to IP, the
instruction pointer. To understand jumps properly, you also need to understand negative numbers.
COMMANDS EXAMPLES
OP Assembler Machine Code Explanation
JMP JMP HERE C0 25 Unconditional jump. Flags are ignored. Jump forward 25h RAM locations.
JMP JMP BACK C0 FE Unconditional jump. Flags are ignored. Jump back -2d RAM locations.
JZ JZ STOP C1 42 Jump Zero. Jump if the zero flag (Z) is set. Jump forward +42h places if the (Z) flag is set.
JZ JZ START C1 F2 Jump Zero. Jump if the zero flag (Z) is set. Jump back -14d places if the (Z) flag is set.
JNZ JNZ FORWARD C2 22 Jump Not Zero. Jump if the zero flag (Z) is NOT set. Jump forward 22h places if the (Z) flag is NOT set.
JNZ JNZ REP C2 EE Jump Not Zero. Jump if the zero flag (Z) is NOT set. Jump back -18d places if the (Z) flag is NOT set.
JS JS Minus C3 14 Jump Sign. Jump if the sign flag (S) is set. Jump forward 14h places if the sign flag (S) is set.
JS JS Minus2 C3 FC Jump Sign. Jump if the sign flag (S) is set. Jump back -4d places if the sign flag (S) is set.
JNS JNS Plus C4 33 Jump Not Sign. Jump if the sign flag (S) is NOT set. Jump forward 33h places if the sign flag (S) is NOT set.
JNS JNS Plus2 C4 E2 Jump Not Sign. Jump if the sign flag (S) is NOT set. Jump back -30d places if the sign flag (S) is NOT set.
JO JO TooBig C5 12 Jump Overflow. Jump if the overflow flag (O) is set. Jump forward 12h places if the overflow flag (O) is set.
JO JO ReDo C5 DF Jump Overflow. Jump if the overflow flag (O) is set. Jump back -33d places if the overflow flag (O) is set.
JNO JNO OK C6 33 Jump Not Overflow. Jump if the overflow flag (O) is NOT set. Jump forward 33h places if the overflow flag (O) is NOT set.
JNO JNO Back C6 E0 Jump Not Overflow. Jump if the overflow flag (O) is NOT set. Jump back -32d places if the overflow flag (O) is NOT set.
The full 8086 instruction set has many other jumps. There are more flags in the 8086 as well!
Legal Destination Labels
here: A nice correct label.
here:: Not allowed. Only one colon is permitted.
1234: Not allowed. Labels must begin with a letter or '_'.
_: OK but not human friendly.
here Destination labels must end in a colon.
Some of these rules are not strictly enforced in the simulator.
Move Instructions
Move Instructions - Flags are NOT set.
Move instructions are used to copy data between registers and between RAM and registers.
Addressing Mode Assembler Example Supported Explanation
Immediate mov al,10 YES Copy 10 into AL
Direct (register) mov al,bl NO Copy BL into AL
Direct (memory) mov al,[50] YES Copy data from RAM at address 50 into AL.
Direct (memory) mov [40],cl YES Copy data from CL into RAM at address 40.
Indirect mov al,[bl] YES BL is a pointer to a RAM location. Copy data from that RAM location into AL.
Indirect mov [cl],dl YES CL is a pointer to a RAM location. Copy data from DL into that RAM location.
Indexed mov al,[20 + bl] NO A data table is held in RAM at address 20. BL indexes a data item within
the data table. Copy from the data table at address 20+BL into AL.
Indexed mov [20 + bl],al NO A data table is held in RAM at address 20. BL indexes a data item within
the data table. Copy from AL into the data table at address 20+BL.
Base Register mov al,[bl+si] NO BL points to a data table in memory. SI indexes to a record inside
the data table. BL is called the "base register". SI is called the "offset or index". Copy from RAM
at address BL+SI into AL.
Base Register mov [bl+si],al NO BL points to a data table in memory. SI indexes to a record inside
the data table. BL is called the "base register". SI is called the "offset". Copy from AL into RAM at
address BL+SI.
Right to Left Convention
ADDRESSING MODES
Immediate
MOV AL,10
Copy a number into a register. This is the simplest move command and easy to understand.
Direct (register)
MOV AL,BL
Copy one register into another. This is easy to understand. The simulator does not support this
command. If you have to copy from one register to another, use a RAM location or the stack to achieve
the move.
Direct (memory)
MOV AL,[50] ; Copy from RAM into AL. Copy the data from address 50.
MOV [50],AL ; Copy from AL into RAM. Copy the data to address 50.
The square brackets indicate data in RAM. The number in the square brackets indicates the RAM
address/location of the data.
Indirect
MOV AL,[BL] ; Copy from RAM into AL. Copy from the address that BL points to.
MOV [BL],AL ; Copy from AL into RAM. Copy to the address that BL points to.
Copy between a specified RAM location and a register. The square brackets indicate data in RAM. In
this example BL points to RAM.
Indexed
MOV AL,[20 + BL] ; Copy from RAM into AL. The RAM address is located at 20+BL.
MOV [20 + BL],AL ; Copy from AL into RAM. The RAM address is located at 20+BL.
Here the BL register is used to "index" data held in a table. The table data starts at address 20.
Base Register
MOV AL,[BL+SI] ; Copy from RAM into AL. The RAM address is located at BL+SI.
MOV [BL+SI],AL ; Copy from AL into RAM. The RAM address is located at BL+SI.
BL is the "base register". It holds the start address of a data table. SI is the "source index". It
is used to index a record in the data table.
Compare Instructions
The Compare CMP Command - Flags are Set.
When the simulator does a comparison using CMP, it does a subtraction of the two values it is
comparing. The status register flags are set depending on the result of the subtraction. The flags
are set but the answer is discarded.
(Z) If the values are equal, the subtraction gives a zero result and the (Z) zero flag is set.
(S) If the number being subtracted is greater than the other, then a negative answer results, so the
(S) sign flag is set.
If the number being subtracted is smaller than the other, no flags are set.
Use JZ and JS or JNZ and JNS to test the result of a CMP command.
Direct Memory Comparison
Assembler Machine Code Explanation
CMP CL,[20] DC 02 20 Here the CL register is compared with RAM location 20. Work out CL - RAM[20].
DC is the machine instruction for direct memory comparison.
02 refers to the CL register.
20 points to RAM address 20.
Direct Register Comparison
Assembler Machine Code Explanation
CMP AL,BL DA 00 01 Here two registers are compared. Work out AL - BL.
DA is the machine instruction for register comparison.
00 refers to the AL register.
01 refers to the BL register.
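The Z and S rules described above for CMP can be sketched in a few lines of Python. This is a simplified model under stated assumptions: S is taken to be bit 7 of the eight bit subtraction result, the O flag is not modelled, and the function name cmp_flags is hypothetical. The simulator itself may differ in edge cases.

```python
def cmp_flags(left, right):
    """Model CMP left,right: subtract, set flags, discard the answer.

    Assumption: S is bit 7 of the 8-bit result; the O flag is ignored here.
    """
    result = (left - right) & 0xFF  # the subtraction result, kept to eight bits
    return {"Z": result == 0, "S": bool(result & 0x80)}


print(cmp_flags(0x05, 0x05))  # {'Z': True, 'S': False}  -> JZ would jump
print(cmp_flags(0x03, 0x05))  # {'Z': False, 'S': True}  -> JS would jump
print(cmp_flags(0x05, 0x03))  # {'Z': False, 'S': False} -> JNZ and JNS would jump
```

The three calls correspond to the three cases in the text: equal values, a larger number being subtracted, and a smaller number being subtracted.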
Immediate Comparison
Assembler Machine Code Explanation
CMP AL,0D DB 00 0D Here the AL register is compared with 0D (the ASCII code of the Enter key).
Work out AL - 0D.
DB is the machine instruction for immediate comparison.
00 refers to the AL register.
0D is the ASCII code of the Enter key.
Stack Instructions
Stack Instructions - Flags are NOT set.
After pushing items onto the stack, always pop them off in reverse order. This is because the stack
works by the Last In First Out (LIFO) rule. The stack is an area of RAM used in this particular way.
Any part of RAM could be used. In the simulator, the stack is located just below the Video RAM at
address [BF]. The stack grows towards zero. It is easily possible to implement a stack that grows the
other way.
Stack Examples
Assembler Machine Code Explanation
PUSH BL E0 01 Push BL onto the stack and subtract one from the stack pointer.
E0 is the machine instruction for PUSH.
01 refers to the BL register.
POP BL E1 01 Add one to the stack pointer and pop BL from the stack.
E1 is the machine instruction for POP.
01 refers to the BL register.
PUSHF EA Save the CPU status register (SR) onto the stack. This saves the CPU flags.
POPF EB Restore the CPU status register (SR) from the stack. This restores the CPU flags.
The stack is used to ...
save register contents for later restoration.
pass parameters into procedures and return results.
reverse the order in which data is stored.
save addresses so procedures and interrupts can return to the right place.
perform postfix arithmetic.
make recursion possible.
Stack Pointer
A CPU register (SP) that keeps track of (is a pointer to) the data on the stack. It is colour coded
with a blue highlight in the simulator RAM display.
Push and Pop
Push - Add data to the stack at the stack pointer position and subtract one from the stack pointer.
Pop - Add one to the stack pointer and remove data from the stack at the stack pointer position.
LIFO
Last in First out.
The stack operates strictly to this rule. When data is pushed onto the stack, it must later be popped
in reverse order.
Stack Overflow
The stack is repeatedly pushed until it is full. The simulator does not detect this condition and the
stack can overwrite program code or data. Real life programs can fail in the same way.
Stack Underflow
The stack is repeatedly popped until it is empty. The next pop causes an underflow.
Procedures and Interrupts
Procedures and Interrupts - Flags are NOT set.
These are available in the registered version. Please register.
It is essential to save the registers and flags used by any procedure or interrupt and restore them
after the procedure or interrupt has finished its work. Use push and pushf to save. Use pop and popf
to restore values.
Assembler Machine Code Explanation
CALL 30 CA 30 Call the procedure at address 30. The return address is pushed onto the stack and the
Instruction Pointer (IP) is set to 30.
CA is the machine instruction for CALL.
30 is the address of the start of the procedure being called.
RET CB Return from the procedure. Set the Instruction Pointer (IP) to the return address popped off
the stack.
CB is the machine instruction for Return.
INT 03 CC 03 The Instruction Pointer (IP) is set to the address of the interrupt vector retrieved
from RAM address 03. The return address is pushed onto the stack.
CC is the machine instruction for INT.
03 is the address of the interrupt vector used by the INT command.
IRET CD Return from the interrupt. Set the Instruction Pointer (IP) to the return address popped off
the stack.
CD is the machine instruction for IRET.
Input Output Instructions
Input and Output Instructions - Flags are NOT set.
The simulator has 16 ports numbered from 00 to 0F. These are connected to simulated, outside-world
peripherals.
Assembler Machine Code Explanation
IN 07 F0 07 Input from Port 07.
F0 is the machine instruction for Input.
07 is the port number.
OUT 01 F1 01 Output to Port 01.
F1 is the machine instruction for Output.
01 is the port number.
Peripherals
Port Description
00 Input from port 00 for simulated keyboard input.
01 Output to port 01 to control the traffic lights.
02 Output to port 02 to control the seven segment displays.
03 Output to port 03 to control the heater. Input from port 03 to sense the thermostat state.
04 Output to port 04 to control the snake in the maze.
05 Output to port 05 to control the stepper motor.
06 Output to port 06 to control the lift.
07 Output to port 07 to make the keyboard visible. Input from port 07 to read the keyboard ASCII code.
08 Output to port 08 to make the numeric keypad visible. Input from port 08 to read from the numeric keypad.
09-0F Unused
Other Instructions
Miscellaneous Instructions - CLI and STI control the (I) Flag
Assembler Machine Code Explanation
HALT 00 Stop the program.
00 is the machine instruction for HALT.
The program will cease to run if it encounters a HALT instruction. Continuous running is cancelled by
this command. You can have several halt commands in one program. There should be only one END and
code after END is ignored.
NOP FF Do nothing for one clock cycle.
FF is the machine instruction for NOP.
The program will do nothing for one clock cycle. The program then continues as normal. NOP is used to
introduce time delays to allow slow electronics to keep up with the CPU. These are also called WAIT
STATES.
CLO FE Close all the peripheral windows.
FE is the machine code for CLO.
It applies to this simulator only, and is used to close peripheral windows. This makes it easier to
write demonstration programs without the screen getting too cluttered.
ORG 30 NONE Code Origin. Generate code starting from this address.
To generate code from a starting address other than zero use ORG. This is useful to place procedures,
interrupts or data tables at particular addresses in memory.
ORG is an assembler directive and no code is generated.
DB 84 84 Define a byte. Store the byte (84) in the next free RAM location.
Use DB to create data tables containing bytes of data. Use DB to define an Interrupt Vector.
DB "Hello" 48, 65, 6C, 6C, 6F Define a string. Store the ASCII codes of the text in quotes in the
next free RAM locations.
Use DB to store text strings. The stored ASCII codes do not include an end-of-string marker. Use
DB 00 for this.
CLI FD Clear the I flag.
If the I flag is cleared, hardware interrupts are ignored. This is the default state for the
simulator. Resetting the CPU will also clear the I flag. The timer that generates hardware interrupts
will do nothing.
STI FC Set the I flag.
If the I flag is set, the simulator will generate INT 02 at regular time intervals. It is necessary
to have an interrupt vector stored at address 02 that points to interrupt handler code stored
elsewhere. The interval between timer interrupts can be set using the slider in the Configuration
Tab. If interrupts occur faster than the processor can handle them, a simulated system crash will
follow. Adjust the CPU clock speed and the timer interval to prevent this, or cause it if you want to
see what happens.
It is possible to program the simulator using pure machine codes. Here is a simple example.
; ===== NORMAL CODE =====
MOV AL,0
INC AL
END
; ===== NORMAL CODE =====
Here is the same program in pure machine code apart from the required END keyword. This should run
exactly as the program above.
; ===== PURE MACHINE CODE =====
DB D0 ; MOV
DB 00 ; AL
DB 00 ; 0
DB A4 ; INC
DB 00 ; AL
END
; ===== PURE MACHINE CODE =====
This is an interesting exercise but rather defeats the whole point of using an assembler. If you have
a dog, why bark yourself? Manually calculating jump distances might be a useful learning exercise,
especially for negative jumps.
List File
The List File
In the list file, your original program is shown.
Numbers in square brackets such as [1C] are the addresses at which the machine codes were generated. The machine codes are shown. Here is a typical line.

        MOV CL,C0 ; [10] D0 02 C0 ; Video ram base address

The command is to move C0 into the CL register. The machine code was generated at address [10]. The machine codes are D0 02 C0. The programmer's comment is reproduced.

Negative Numbers

Dec  Hex   Dec  Hex   Dec  Hex   Dec  Hex   Dec  Hex   Dec  Hex   Dec  Hex   Dec  Hex
-128 80    -127 81    -126 82    -125 83    -124 84    -123 85    -122 86    -121 87
-120 88    -119 89    -118 8A    -117 8B    -116 8C    -115 8D    -114 8E    -113 8F
-112 90    -111 91    -110 92    -109 93    -108 94    -107 95    -106 96    -105 97
-104 98    -103 99    -102 9A    -101 9B    -100 9C    -099 9D    -098 9E    -097 9F
-096 A0    -095 A1    -094 A2    -093 A3    -092 A4    -091 A5    -090 A6    -089 A7
-088 A8    -087 A9    -086 AA    -085 AB    -084 AC    -083 AD    -082 AE    -081 AF
-080 B0    -079 B1    -078 B2    -077 B3    -076 B4    -075 B5    -074 B6    -073 B7
-072 B8    -071 B9    -070 BA    -069 BB    -068 BC    -067 BD    -066 BE    -065 BF
-064 C0    -063 C1    -062 C2    -061 C3    -060 C4    -059 C5    -058 C6    -057 C7
-056 C8    -055 C9    -054 CA    -053 CB    -052 CC    -051 CD    -050 CE    -049 CF
-048 D0    -047 D1    -046 D2    -045 D3    -044 D4    -043 D5    -042 D6    -041 D7
-040 D8    -039 D9    -038 DA    -037 DB    -036 DC    -035 DD    -034 DE    -033 DF
-032 E0    -031 E1    -030 E2    -029 E3    -028 E4    -027 E5    -026 E6    -025 E7
-024 E8    -023 E9    -022 EA    -021 EB    -020 EC    -019 ED    -018 EE    -017 EF
-016 F0    -015 F1    -014 F2    -013 F3    -012 F4    -011 F5    -010 F6    -009 F7
-008 F8    -007 F9    -006 FA    -005 FB    -004 FC    -003 FD    -002 FE    -001 FF

Positive Numbers

Dec  Hex   Dec  Hex   Dec  Hex   Dec  Hex   Dec  Hex   Dec  Hex   Dec  Hex   Dec  Hex
+000 00    +001 01    +002 02    +003 03    +004 04    +005 05    +006 06    +007 07
+008 08    +009 09    +010 0A    +011 0B    +012 0C    +013 0D    +014 0E    +015 0F
+016 10    +017 11    +018 12    +019 13    +020 14    +021 15    +022 16    +023 17
+024 18    +025 19    +026 1A    +027 1B    +028 1C    +029 1D    +030 1E    +031 1F
+032 20    +033 21    +034 22    +035 23    +036 24    +037 25    +038 26    +039 27
+040 28    +041 29    +042 2A    +043 2B    +044 2C    +045 2D    +046 2E    +047 2F
+048 30    +049 31    +050 32    +051 33    +052 34    +053 35    +054 36    +055 37
+056 38    +057 39    +058 3A    +059 3B    +060 3C    +061 3D    +062 3E    +063 3F
+064 40    +065 41    +066 42    +067 43    +068 44    +069 45    +070 46    +071 47
+072 48    +073 49    +074 4A    +075 4B    +076 4C    +077 4D    +078 4E    +079 4F
+080 50    +081 51    +082 52    +083 53    +084 54    +085 55    +086 56    +087 57
+088 58    +089 59    +090 5A    +091 5B    +092 5C    +093 5D    +094 5E    +095 5F
+096 60    +097 61    +098 62    +099 63    +100 64    +101 65    +102 66    +103 67
+104 68    +105 69    +106 6A    +107 6B    +108 6C    +109 6D    +110 6E    +111 6F
+112 70    +113 71    +114 72    +115 73    +116 74    +117 75    +118 76    +119 77
+120 78    +121 79    +122 7A    +123 7B    +124 7C    +125 7D    +126 7E    +127 7F

Two's complement

The numbers work as follows. The leftmost bit in an eight bit byte is the sign bit.

        1 0 1 0 1 0 1 0
        ^
        The sign bit has a value of -128 decimal or -80 hexadecimal.

The other seven bits are treated as a normal positive number between 0 and 127. This is true whether the overall number is positive or negative. For example, to store -1 the binary is

        1 1 1 1 1 1 1 1        -128d + 127d = -1d
        ^
        -128d

To store 127 decimal, the binary is

        0 1 1 1 1 1 1 1        0 + 127d = 127d
        ^
        The sign bit is zero.

16 and 32 bit machines also use the leftmost bit as the sign bit. The negative numbers work in exactly the same way but much bigger numbers can be stored. In a 16 bit machine, the sign bit is worth -32768. In a 32 bit machine, the sign bit is worth -2147483648 (approximately -2000 million).

Pop-up Help

ADD AND CALL CLI CLO CMP DB DEC DIV END HALT IN INC INT IRET JMP JNO JNS JNZ JO JS JZ MOD MOV MUL NOP NOT OR ORG OUT POP POPF PUSH PUSHF RET ROL ROR SHL SHR STI SUB XOR

CPU General Purpose Registers

The CPU is where all the arithmetic and logic (decision making) takes place. The CPU has storage locations called registers. The CPU has flags which indicate zero, negative or overflowed calculations. More information is included in the description of the system architecture.

The CPU registers are called AL, BL, CL and DL. The machine code names are 00, 01, 02 and 03.
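As a rough illustration of how the register names map onto machine codes, the following Python sketch hand-assembles the two-instruction program from the pure machine code example earlier in this manual. The opcodes D0 (MOV register, immediate) and A4 (INC) and the register codes come from the manual; the helper names encode_mov_immediate and encode_inc are hypothetical and are not part of the simulator.

```python
# Register machine codes as listed in the manual.
REGISTER_CODES = {"AL": 0x00, "BL": 0x01, "CL": 0x02, "DL": 0x03}


def encode_mov_immediate(register, value):
    """Encode MOV register,value with opcode D0 (hypothetical helper)."""
    return [0xD0, REGISTER_CODES[register], value & 0xFF]


def encode_inc(register):
    """Encode INC register with opcode A4 (hypothetical helper)."""
    return [0xA4, REGISTER_CODES[register]]


# MOV AL,0 followed by INC AL
program = encode_mov_immediate("AL", 0) + encode_inc("AL")
print(" ".join(f"{byte:02X}" for byte in program))  # D0 00 00 A4 00
```

The same helper reproduces the list file line for MOV CL,C0, where the register byte is 02 because CL is the third register.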
Registers are used for storing binary numbers. Once the numbers are in the registers, it is possible to perform arithmetic or logic. Sending the correct binary patterns to peripherals like the traffic lights makes it possible to control them.

; semicolon begins a program comment. Comments are used to document programs. They are helpful to new programmers joining a team and to existing people returning to a project having forgotten what it is about. Good comments explain WHY things are being done. Poor comments simply repeat the code or state the totally obvious.

Ram Addresses

Examples: [7F] [22] [AL] [CL]
[7F]    the contents of RAM at location 7F
[CL]    the contents of the RAM location that CL points to. CL contains a number that is used as the address.

The Instruction Set

ADD - Add two values together
CPU flags are set.

Assembler    Machine Code    Explanation
ADD BL,CL    A0 01 02        Add CL to BL. Answer goes into BL.
ADD AL,12    B0 00 12        Add 12 to AL. Answer goes into AL.

AND - Logical AND two values together
CPU flags are set.

Assembler    Machine Code    Explanation
AND BL,CL    AA 01 02        AND CL with BL. Answer goes into BL.
AND AL,12    BA 00 12        AND 12 with AL. Answer goes into AL.

The AND rule is that two ones give a one. All other inputs give nought. Look at this example...

               10101010
           AND 00001111
               --------
        ANSWER 00001010

The left four bits are masked to 0.

CALL and RET
CPU flags are NOT set.

Assembler    Machine Code    Explanation
CALL 50      CA 50           Call the procedure at address 50. The CPU pushes the instruction pointer value IP + 2 onto the stack. Later the CPU returns to this address. IP is then set to 50.
RET          CB              Return from the procedure. The CPU pops the value of IP off the stack and jumps to this address, where execution resumes.

After CALL 50, the CPU instruction pointer is set to 50. The CPU executes instructions from this address until it reaches the RET command.

CLI and STI
CPU (I) flag is set/cleared.

Assembler    Machine Code    Explanation
STI          FC              STI sets the Interrupt flag.
CLI          FD              CLI clears the Interrupt flag.

CLI clears the interrupt flag 'I' in the status register. STI sets the interrupt flag 'I' in the status register. The machine code for CLI is FD. The machine code for STI is FC. If (I) is set, the CPU will respond to interrupts. The simulator generates a hardware interrupt at regular time intervals that you can adjust. If 'I' is set, there should be an interrupt vector at address [02]. The CPU will jump to the code that this vector points to whenever there is an interrupt.

CLO
CPU flags are NOT set.

Assembler    Machine Code    Explanation
CLO          FE              Close unwanted peripheral windows. CLO is not an x86 command. It closes all unnecessary simulator windows which would otherwise have to be closed manually one by one.

CMP
CPU flags are set.

Assembler      Machine Code    Explanation
CMP AL,0D      DB 00 0D        Compare AL with 0D. If the values are equal, the 'Z' flag is set. If AL is less than 0D, the 'S' flag is set. If AL is greater than 0D, no flags are set.
CMP AL,BL      DA 00 01        Compare AL with BL. If the values are equal, the 'Z' flag is set. If AL is less than BL, the 'S' flag is set. If AL is greater than BL, no flags are set.
CMP CL,[20]    DC 02 20        Compare CL with the contents of RAM location 20. If the values are equal, the 'Z' flag is set. If CL is less than RAM[20], the 'S' flag is set. If CL is greater than RAM[20], no flags are set.

DB
CPU flags are NOT set.

Assembler                 Stored Bytes         Explanation
DB 22, DB 33, DB 44,      22 33 44 00          Define Byte. DB gives a method for loading values directly into RAM. DB does not have a machine code. The numbers or text after DB are loaded into RAM. Use DB to set up data tables.
DB 0 (one per line)
DB "Hello", DB 0          48 65 6C 6C 6F 00    ASCII codes are loaded into RAM. End of text is marked by NULL.
(one per line)

DEC and INC
CPU flags are set.

Assembler    Machine Code    Explanation
INC BL       A4 01           Add one to BL.
DEC AL       A5 00           Subtract one from AL.

DIV and MOD
CPU flags are set.

Assembler    Machine Code    Explanation
DIV AL,5     B3 00 05        Divide AL by 5. Answer goes into AL. DIV differs from the x86 DIV.
DIV AL,BL    A3 00 01        Divide AL by BL. Answer goes into AL. DIV differs from the x86 DIV.
MOD AL,5     B6 00 05        MOD AL by 5. Remainder after division goes into AL. MOD is not an x86 command.
MOD AL,BL    A6 00 01        MOD AL by BL. Remainder after division goes into AL. MOD is not an x86 command.

The x86 DIV calculates div and mod in one command. The answers are put into different registers. This is not possible with the 8 bit simulator so div and mod are separated and simplified.

8 DIV 3 is 2 (with remainder 2). 8 MOD 3 is 2.

END
CPU flags are NOT set.

Assembler    Machine Code    Explanation
END          00              END stops further program execution. The simulator achieves this by stopping the CPU clock. END is also an assembler directive. All code after END is ignored by the assembler. There should be only one END in a program.

HALT
CPU flags are NOT set.

Assembler    Machine Code    Explanation
HALT         00              HALT stops further program execution. The simulator achieves this by stopping the CPU clock. HALT is not an assembler directive. (See END) There can be any number of HALT commands in a program.

IN and OUT
CPU flags are NOT set.

Assembler    Machine Code    Explanation
IN 07        F0 07           Input from port 07. The data is stored in the AL register.
OUT 03       F1 03           Output to port 03. The data comes from the AL register.

INC and DEC
CPU flags are set.

Assembler    Machine Code    Explanation
INC BL       A4 01           Add one to BL.
DEC AL       A5 00           Subtract one from AL.

INT and IRET
CPU flags are NOT set.

Assembler    Machine Code    Explanation
INT 02       CC 02           The return address (IP + 2) is pushed onto the stack. The stack pointer (SP) is reduced by one. RAM location 02 contains the address of the Interrupt Handler. This address is "fetched" and IP is set to it.
IRET         CD              The return address is popped off the stack. The stack pointer (SP) is increased by one. IP is set to the return address popped off the stack.
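The INT and IRET steps above can be traced with a small model. This is only a sketch under stated assumptions: the class TinyCpu and its method names are hypothetical, RAM is modelled as 256 bytes, and the starting stack pointer of BF is an assumption made for the example rather than something this section states.

```python
class TinyCpu:
    """Minimal model of IP, SP and RAM; all names are hypothetical."""

    def __init__(self):
        self.ram = [0] * 256
        self.ip = 0x00
        self.sp = 0xBF  # assumed initial stack pointer

    def push(self, value):
        # Store the byte, then reduce SP by one.
        self.ram[self.sp] = value & 0xFF
        self.sp -= 1

    def pop(self):
        # Increase SP by one, then read the byte back.
        self.sp += 1
        return self.ram[self.sp]

    def software_interrupt(self, vector):
        # INT nn: push the return address (IP + 2), then fetch the
        # handler address from RAM location nn and load it into IP.
        self.push(self.ip + 2)
        self.ip = self.ram[vector]

    def interrupt_return(self):
        # IRET: pop the return address back into IP.
        self.ip = self.pop()


cpu = TinyCpu()
cpu.ram[0x02] = 0x10           # interrupt vector at address 02 points to 10
cpu.ip = 0x30                  # pretend INT 02 is stored at address 30
cpu.software_interrupt(0x02)
after_int = (cpu.ip, cpu.sp)   # handler address, SP reduced by one
cpu.interrupt_return()
after_iret = (cpu.ip, cpu.sp)  # return address, SP restored
print(after_int, after_iret)
```

CALL and RET follow the same push and pop pattern, except that the destination address comes from the instruction itself instead of from a vector in RAM.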
JMP
CPU flags are NOT set and the flags are ignored.

Assembler      Machine Code    Explanation
JMP Forward    C0 12           Set IP to a new value. Add 12 to IP. The assembler calculates the jump distance. The biggest possible forward jump is +127.
JMP Back       C0 FE           Set IP to a new value. Add -2 to IP. FE is -2 (see Negative Numbers). The assembler calculates the jump distance. The biggest possible backward jump is -128.

JNO
CPU flags are NOT set. JNO uses the (O) flag. The (O) flag is set if a calculation gives a result too big to fit in an 8 bit register.

Assembler      Machine Code    Explanation
JNO Forward    C6 12           Jump if the (O) flag is NOT set. If the (O) flag is NOT set, jump forward 12 places. If the (O) flag is NOT set, add 12 to (IP). If the (O) flag is set, add 2 to (IP). The assembler calculates the jump distance. The biggest possible forward jump is +127.
JNO Back       C6 FE           Jump if the (O) flag is NOT set. If the (O) flag is NOT set, jump back 2 places. If the (O) flag is NOT set, add -2 to (IP). If the (O) flag is set, add 2 to (IP). The assembler calculates the jump distance. The biggest possible backward jump is -128. FE is -2 (see Negative Numbers).

JNS
CPU flags are NOT set. JNS uses the (S) flag. The (S) flag is set if a calculation gives a negative result.

Assembler      Machine Code    Explanation
JNS Forward    C4 12           Jump if the (S) flag is NOT set. If the (S) flag is NOT set, jump forward 12 places. If the (S) flag is NOT set, add 12 to (IP). If the (S) flag is set, add 2 to (IP). The assembler calculates the jump distance. The biggest possible forward jump is +127.
JNS Back       C4 FE           Jump if the (S) flag is NOT set. If the (S) flag is NOT set, jump back 2 places. If the (S) flag is NOT set, add -2 to (IP). If the (S) flag is set, add 2 to (IP). The assembler calculates the jump distance. The biggest possible backward jump is -128. FE is -2 (see Negative Numbers).

JNZ
CPU flags are NOT set. JNZ uses the (Z) flag.
The (Z) flag is set if a calculation gives a zero result.

Assembler      Machine Code    Explanation
JNZ Forward    C2 12           Jump if the (Z) flag is NOT set. If the (Z) flag is NOT set, jump forward 12 places. If the (Z) flag is NOT set, add 12 to (IP). If the (Z) flag is set, add 2 to (IP). The assembler calculates the jump distance. The biggest possible forward jump is +127.
JNZ Back       C2 FE           Jump if the (Z) flag is NOT set. If the (Z) flag is NOT set, jump back 2 places. If the (Z) flag is NOT set, add -2 to (IP). If the (Z) flag is set, add 2 to (IP). The assembler calculates the jump distance. The biggest possible backward jump is -128. FE is -2 (see Negative Numbers).

JO
CPU flags are NOT set. JO uses the (O) flag. The (O) flag is set if a calculation gives a result too big to fit in an 8 bit register.

Assembler      Machine Code    Explanation
JO Forward     C5 12           Jump if the (O) flag is set. If the (O) flag is set, jump forward 12 places. If the (O) flag is set, add 12 to (IP). If the (O) flag is NOT set, add 2 to (IP). The assembler calculates the jump distance. The biggest possible forward jump is +127.
JO Back        C5 FE           Jump if the (O) flag is set. If the (O) flag is set, jump back 2 places. If the (O) flag is set, add -2 to (IP). If the (O) flag is NOT set, add 2 to (IP). The assembler calculates the jump distance. The biggest possible backward jump is -128. FE is -2 (see Negative Numbers).

JS
CPU flags are NOT set. JS uses the (S) flag. The (S) flag is set if a calculation gives a negative result.

Assembler      Machine Code    Explanation
JS Forward     C3 12           Jump if the (S) flag is set. If the (S) flag is set, jump forward 12 places. If the (S) flag is set, add 12 to (IP). If the (S) flag is NOT set, add 2 to (IP). The assembler calculates the jump distance. The biggest possible forward jump is +127.
JS Back        C3 FE           Jump if the (S) flag is set. If the (S) flag is set, jump back 2 places. If the (S) flag is set, add -2 to (IP). If the (S) flag is NOT set, add 2 to (IP).
The assembler calculates the jump distance. The biggest possible backward jump is -128. FE is -2 (see Negative Numbers).

JZ
CPU flags are NOT set. JZ uses the (Z) flag. The (Z) flag is set if a calculation gives a zero result.

Assembler      Machine Code    Explanation
JZ Forward     C1 12           Jump if the (Z) flag is set. If the (Z) flag is set, jump forward 12 places. If the (Z) flag is set, add 12 to (IP). If the (Z) flag is NOT set, add 2 to (IP). The assembler calculates the jump distance. The biggest possible forward jump is +127.
JZ Back        C1 FE           Jump if the (Z) flag is set. If the (Z) flag is set, jump back 2 places. If the (Z) flag is set, add -2 to (IP). If the (Z) flag is NOT set, add 2 to (IP). The assembler calculates the jump distance. The biggest possible backward jump is -128. FE is -2 (see Negative Numbers).

DIV and MOD
CPU flags are set.

Assembler    Machine Code    Explanation
DIV AL,5     B3 00 05        Divide AL by 5. Answer goes into AL. DIV differs from the x86 DIV.
DIV AL,BL    A3 00 01        Divide AL by BL. Answer goes into AL. DIV differs from the x86 DIV.
MOD AL,5     B6 00 05        MOD AL by 5. Remainder after division goes into AL. MOD is not an x86 command.
MOD AL,BL    A6 00 01        MOD AL by BL. Remainder after division goes into AL. MOD is not an x86 command.

The x86 DIV calculates div and mod in one command. The answers are put into different registers. This is not possible with the 8 bit simulator so div and mod are separated and simplified.

8 DIV 3 is 2 (with remainder 2). 8 MOD 3 is 2.

MOV
CPU flags are NOT set.

Addressing Mode      Assembler Example    Machine Code    Supported    Explanation
Immediate            mov al,10            D0 00 10        YES          Copy 10 into AL.
Direct (register)    mov al,bl                            NO           Copy BL into AL.
Direct (memory)      mov al,[50]          D1 00 50        YES          Copy data from RAM at address 50 into AL. [50] is a pointer to data held in a RAM location.
                     mov [40],cl          D2 40 02        YES          Copy data from CL into RAM at address 40. [40] is a pointer to data held in a RAM location.
Indirect             mov al,[bl]          D3 00 01        YES          BL is a pointer to a RAM location. Copy data from that RAM location into AL.
                     mov [cl],dl          D4 02 03        YES          CL is a pointer to a RAM location. Copy data from DL into that RAM location.
Indexed              mov al,[20 + bl]                     NO           A data table is held in RAM at address 20. BL indexes a data item within the data table. Copy from the data table at address 20+BL into AL.
                     mov [20 + bl],al                     NO           A data table is held in RAM at address 20. BL indexes a data item within the data table. Copy from AL into the data table at address 20+BL.
Base Register        mov al,[bl+si]                       NO           BL points to a data table in memory. SI indexes to a record inside the data table. BL is called the "base register". SI is called the "offset or index". Copy from RAM at address BL+SI into AL.
                     mov [bl+si],al                       NO           BL points to a data table in memory. SI indexes to a record inside the data table. BL is called the "base register". SI is called the "offset". Copy from AL into RAM at address BL+SI.

MUL
CPU flags are set.

Assembler    Machine Code    Explanation
MUL AL,BL    A2 00 01        Multiply AL by BL. The result goes into AL. MUL differs from the x86 MUL.
MUL CL,12    B2 02 12        Multiply CL by 12. The result goes into CL. MUL differs from the x86 MUL.

The x86 MUL places the result into more than one register. This is not possible with the 8 bit simulator so MUL has been simplified. A disadvantage is that an overflow is much more likely to occur.

NOP
CPU flags are NOT set.

Assembler    Machine Code    Explanation
NOP          FF              Do nothing. Do nothing for one CPU clock cycle. This is needed to keep the CPU synchronised with accurately timed electronic circuits. The CPU might need to delay before the electronics are ready.

NOT
CPU flags are set.

Assembler    Machine Code    Explanation
NOT DL       AD 03           Invert all the bits in DL. If DL contained 01010101, after using NOT it will contain 10101010.

OR
CPU flags are set.

Assembler    Machine Code    Explanation
OR AL,12     BB 00 12        Or 12 with AL. Answer goes into AL.
OR BL,CL     AB 01 02        Or CL with BL. Answer goes into BL.

The OR rule is that two noughts give a nought. All other inputs give one.

           10101010
        OR 00001111
           --------
         = 10101111

ORG
CPU flags are NOT set.

Assembler    Machine Code    Explanation
ORG 50       None            ORG is not a CPU instruction. It is an instruction to the assembler to tell it to generate code at a particular address. It is useful for writing procedures and interrupts. It can also be used to specify where in memory data tables go.

OUT and IN
CPU flags are NOT set.

Assembler    Machine Code    Explanation
IN 07        F0 07           Input from port 07. The data is stored in the AL register.
OUT 03       F1 03           Output to port 03. The data comes from the AL register.

PUSH, POP, PUSHF and POPF
CPU flags are NOT set.

Assembler    Machine Code    Explanation
PUSH AL      E0 00           Save AL onto the stack. Deduct one from the Stack Pointer (SP).
POP BL       E1 01           Add one to the stack pointer (SP). Restore BL from the stack.
PUSHF        EA              Push the CPU flags from the status register (SR) onto the stack. Deduct one from the Stack Pointer (SP).
POPF         EB              Add one to the stack pointer (SP). POP the CPU flags from the stack into the status register (SR).

PUSH saves a byte onto the stack. POP gets it back. The stack is an area of memory that obeys the LIFO rule - Last In First Out. When pushing items onto the stack, remember to pop them off again in exact reverse order. The stack can be used to

1. hold the return address of a procedure call
2. hold the return address of an interrupt call
3. pass parameters into procedures
4. get results back from procedures
5. save and restore registers and flags
6. reverse the order of data.

RET and CALL
CPU flags are NOT set.

Assembler    Machine Code    Explanation
CALL 50      CA 50           Call the procedure at address 50. The CPU pushes the instruction pointer value IP + 2 onto the stack. Later the CPU returns to this address. IP is then set to 50.
RET          CB              Return from the procedure. The CPU pops the value of IP off the stack and jumps to this address, where execution resumes.

After CALL 50, the CPU instruction pointer is set to 50. The CPU executes instructions from this address until it reaches the RET command.

ROL and ROR
CPU flags are set.

Assembler    Machine Code    Explanation
ROL AL       9A 00           Rotate the bits in AL left one place. The leftmost bit is moved to the right end of the byte. Before ROL 10000110 - After ROL 00001101
ROR DL       9B 03           Rotate the bits in DL right one place. The rightmost bit is moved to the left end of the byte. Before ROR 10000110 - After ROR 01000011

SHL and SHR
CPU flags are set.

Assembler    Machine Code    Explanation
SHL AL       9C 00           Shift bits left one place. The leftmost bit is discarded. Before SHL 10000110 - After SHL 00001100
SHR DL       9D 03           Shift bits right one place. The rightmost bit is discarded. Before SHR 10000110 - After SHR 01000011

STI and CLI
CPU (I) flag is set/cleared.

Assembler    Machine Code    Explanation
STI          FC              STI sets the Interrupt flag.
CLI          FD              CLI clears the Interrupt flag.

CLI clears the interrupt flag 'I' in the status register. STI sets the interrupt flag 'I' in the status register. The machine code for CLI is FD. The machine code for STI is FC. If (I) is set, the CPU will respond to interrupts. The simulator generates a hardware interrupt at regular time intervals that you can adjust. If 'I' is set, there should be an interrupt vector at address [02]. The CPU will jump to the code that this vector points to whenever there is an interrupt.

SUB
CPU flags are set.

Assembler    Machine Code    Explanation
SUB AL,12    B1 00 12        Subtract 12 from AL. The answer goes into AL.
SUB BL,CL    A1 01 02        Subtract CL from BL. The answer goes into BL.

XOR
CPU flags are set.

Assembler    Machine Code    Explanation
XOR AL,12    BC 00 12        12 XOR AL. The answer goes into AL.
XOR BL,CL    AC 01 02        CL XOR BL. The answer goes into BL.

XOR can be used to invert selected bits.

            00001111    This is a bit mask.
        XOR 01010101
            --------
            01011010

The left four bits are unaltered.
The right four bits are inverted.

Truth Tables and Logic

Boolean Operators - Flags are Set

A mathematician called George Boole invented a branch of maths for processing true and false values instead of numbers. This is called Boolean Algebra. Simple Boolean algebra is consistent with common sense but if you need to process decisions involving many values that might be true or false according to complex rules, you need this branch of mathematics.

The Rules

Rule              One Line Explanation
AND               1 AND 1 gives 1. Any other input gives 0.
NAND (NOT AND)    1 AND 1 gives 0. Any other input gives 1.
OR                0 OR 0 gives 0. Any other input gives 1.
NOR (NOT OR)      0 OR 0 gives 1. Any other input gives 0.
XOR               Equal inputs give 0. Non equal inputs give 1.
NOT               Invert input bits. 0 becomes 1. 1 becomes 0.

Computers work using LOGIC. Displaying graphics such as the mouse cursor involves the XOR (Exclusive OR) command. Addition makes use of AND and XOR. These and a few of the other uses of logic are described below.

Truth Tables

The one line descriptions of the rules above are clearer if shown in Truth Tables. These tables show the output for all possible input conditions.

Logic Gates

Logic gates are the building blocks of microcomputers. Modern processors contain millions of gates. Each gate is built from a few transistors. The gates are used to store data, perform arithmetic and manipulate bits using the rules above. The XOR rule can be used to test bits for equality.

AND

Both inputs must be true for the output to be true. AND is used for addition and decision making.

        -----------
        A B  Output
        -----------
        0 0    0
        0 1    0
        1 0    0
        1 1    1

OR

Both inputs must be false for the output to be false. OR is used in decision making. Both AND and OR are used for Bit Masking. Bit masking is used to pick individual bits out of a byte or to set particular bits in a byte. OR is used to set bits to one. AND is used to set bits to nought. AND is used to test if bits are one. OR is used to test if bits are nought.
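The masking rules above can be sketched in Python, whose |, & and ^ operators behave like the simulator's OR, AND and XOR on each bit. The function names are hypothetical, and every result is limited to eight bits to match the simulator's registers.

```python
def set_bits(value, mask):
    """OR sets every bit selected by the mask to one."""
    return (value | mask) & 0xFF


def clear_bits(value, mask):
    """AND with the inverted mask sets the selected bits to nought."""
    return value & ~mask & 0xFF


def keep_bits(value, mask):
    """AND keeps the selected bits and masks all the others to nought."""
    return value & mask & 0xFF


def flip_bits(value, mask):
    """XOR inverts the selected bits and leaves the others unaltered."""
    return (value ^ mask) & 0xFF


def bits_are_one(value, mask):
    """AND tests whether all the selected bits are one."""
    return (value & mask) == mask


print(format(set_bits(0b10101010, 0b00001111), "08b"))   # 10101111
print(format(keep_bits(0b10101010, 0b00001111), "08b"))  # 00001010
print(format(flip_bits(0b01010101, 0b00001111), "08b"))  # 01011010
```

The three printed values match the OR, AND and XOR worked examples given in the instruction set descriptions.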
        -----------
        A B  Output
        -----------
        0 0    0
        0 1    1
        1 0    1
        1 1    1

XOR

If the bits in a graphical image are XORed with other bits a new image appears. If the XORing is repeated the image disappears again. This is how the mouse and text cursors get moved around the screen. XOR is combined with AND for use in addition. XOR detects if the inputs are equal or not.

        -----------
        A B  Output
        -----------
        0 0    0
        0 1    1
        1 0    1
        1 1    0

NAND

NAND is really AND followed by NOT. Electronic circuits are commonly built from NAND gates (circuits). Computer programming languages and this simulator do not provide NAND. Use AND followed by NOT instead.

        -----------
        A B  Output
        -----------
        0 0    1
        0 1    1
        1 0    1
        1 1    0

NOR

NOR is really OR followed by NOT. Electronic circuits are commonly built from NOR gates (circuits). Computer programming languages and this simulator do not provide NOR. Use OR followed by NOT instead.

        -----------
        A B  Output
        -----------
        0 0    1
        0 1    0
        1 0    0
        1 1    0

NOT

NOT is used to invert bits or True/False values. All the rules above had two inputs and one output. NOT has a single input and output.

        -----------
        A    Output
        -----------
        0      1
        1      0

The Half Adder Truth Table

The half adder does binary addition on two bits. The AND gate computes the carry bit. The XOR gate computes the sum bit.

        0 + 0 = 0, carry 0
        0 + 1 = 1, carry 0
        1 + 0 = 1, carry 0
        1 + 1 = 0, carry 1

        ------------------
        A B   SUM   CARRY
        ------------------
        0 0    0      0
        0 1    1      0
        1 0    1      0
        1 1    0      1

Using the Editor

Editing the source code in the simulator is similar to most word processors and text editors such as the Windows Notepad.

Undo

You can undo an editing error. When you have an accident and delete or mess up something by mistake, you can press Ctrl+Z to UNDO the last thing you did. This can be very useful.

Cursor Movements

Move the text cursor. For small movements, use the Arrow Keys, Home, End, Page Up and Page Down. You can use the mouse too.
For larger movements, hold down the Ctrl key and use the Arrow Keys, Home, End, Page Up, and Page Down. You can use the mouse too.

Deleting

Delete the previous character with the Backspace Key.
Delete the next character with the Delete Key.

Highlighting

To highlight a block of text, hold down the Shift key and use the Arrow Keys, Home, End, Page Up and Page Down. You can drag the mouse with the left button pressed to do this too. To highlight whole words, lines or documents, hold down Shift and Ctrl and then use the Arrow Keys, Home, End, Page Up and Page Down. Alternatively drag the mouse with the left button pressed.

Key        Explanation
Ctrl+C     Copy a highlighted block
Ctrl+X     Cut a highlighted block
Ctrl+V     Paste text copied or cut earlier
Delete     Delete a highlighted block
Ctrl+S     Save a file
Alt+F a    Save a file with a new name
Ctrl+O     Open a file
Alt+F x    Quit

Virtual Peripherals

Using the Peripheral Devices

Keyboard
Port 07, INT 03

How to Use

This is one of the more complex devices. To make the keyboard visible, use OUT 07. Every time a key is pressed, a hardware interrupt, INT 03, is generated. By default, the CPU will ignore this interrupt. To process the interrupt, at the start of the program, use the STI command to set the interrupt flag (I) in the CPU status register (SR). Place an interrupt vector at RAM address 03. This should point to your interrupt handler code. The interrupt handler should use IN 07 to read the key press into the AL register.

Once STI has set the (I) flag in the status register (SR), interrupts from the hardware timer will also be generated. These must be processed too. The hardware timer generates INT 02. To process this interrupt, place an interrupt vector at RAM location 02. This should point to the timer interrupt handler code. The timer code can be as simple as IRET. This will cause an interrupt return without doing any other processing.
        jmp start
        db 10           ; Hardware Timer Interrupt Vector
        db 20           ; Keyboard Interrupt Vector

; ===== Hardware Timer =======
        org 10
        nop             ; Do something useful here
        nop
        nop
        nop
        nop
        iret
; ============================

; ===== Keyboard Handler =====
        org 20
        CLI             ; Prevent re-entrant use
        push al
        pushf
        in 07
        nop             ; Process the key press here
        nop
        nop
        nop
        nop
        popf
        pop al
        STI
        iret
; ============================

; ===== Idle Loop ============
start:
        STI             ; Set (I) flag
        out 07          ; Make keyboard visible
idle:
        nop             ; Do something useful here
        nop
        nop
        nop
        nop
        jmp idle
; ============================
        end
; ============================

Visual Display Unit
Memory Mapped

How to Use

The Visual Display Unit (VDU) is memory mapped. This means that RAM locations correspond to positions on the screen. RAM location C0 maps to the top left corner of the VDU. The screen has 16 columns and four rows mapped to RAM locations C0 to FF. When you write ASCII codes to these RAM locations, the corresponding text characters appear and the VDU is made visible. This device, when combined with a keyboard, is sometimes called a dumb terminal. It has no graphics capabilities. Here is a code snippet to write text to the screen.

; ===== Memory Mapped VDU =================================
        MOV AL,41       ; ASCII code of 'A'
        MOV [C0],AL     ; RAM location mapped to the
                        ; top left corner of the VDU
        MOV AL,42       ; ASCII code of 'B'
        MOV [C1],AL     ; RAM location mapped to the VDU
        MOV AL,43       ; ASCII code of 'C'
        MOV [C2],AL     ; RAM location mapped to the VDU
        END
; =========================================================

Traffic Lights
Port 01

How to Use

The traffic lights are connected to Port 01. If a byte of data is sent to this port, wherever there is a one, the corresponding traffic light comes on. In the image on the left, the binary data is 01010101. If you look closely you can see that the lights that are on correspond to the ones in the data byte. 01010101 is 55 hexadecimal. Hex numbers are explained elsewhere in this manual.
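To make the link between a data byte and the lit lamps concrete, here is a small Python sketch that lists which bit positions are one in a byte sent to Port 01. The function name lit_positions is hypothetical, and the sketch deliberately reports bit positions only, because this section does not say which bit drives which lamp colour.

```python
def lit_positions(data):
    """Return the bit positions (7 = leftmost, 0 = rightmost) that are one."""
    return [bit for bit in range(7, -1, -1) if data & (1 << bit)]


for value in (0x55, 0xAA):
    # Show the hexadecimal value, its binary pattern and the bits that are on.
    print(format(value, "02X"), format(value, "08b"), lit_positions(value))
```

For 55 hexadecimal the pattern is 01010101, so every second lamp is on; AA is the complementary pattern 10101010.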
Here is a code snippet to control the lights.

; ========================================================
; ===== 99Tlight.asm =====================================
; ===== Traffic Lights on Port 01 ========================
Start:
        MOV AL,55       ; 01010101
        OUT 01          ; Send the data in AL to Port 01
                        ; (the traffic lights)
        MOV AL,AA       ; 10101010
        OUT 01          ; Send the data in AL to Port 01
                        ; (the traffic lights)
        JMP Start
        END
; ========================================================

Seven Segment Displays
Port 02

How to Use

The seven segment displays are connected to Port 02. If a byte of data is sent to this port, wherever there is a one, the corresponding segment comes on. The rightmost bit controls which of the two groups of segments is active. This is a simple example of multiplexing. If the least significant bit (LSB) is zero, the left segments will be active. If the least significant bit (LSB) is one, the right segments will be active. Here is a code snippet.

; ======================================================
; ===== 99sevseg.asm ===================================
; ===== Seven Segment Displays Port 02 =================
Start:
        MOV AL,FA       ; 1111 1010
        OUT 02          ; Send the data in AL to Port 02
        MOV AL,0        ; 0000 0000
        OUT 02          ; Send the data in AL to Port 02
        MOV AL,FB       ; 1111 1011
        OUT 02          ; Send the data in AL to Port 02
        MOV AL,1        ; 0000 0001
        OUT 02          ; Send the data in AL to Port 02
        JMP Start
        END
; ======================================================

Heater and Thermostat
Port 03

How to Use

The heater and thermostat system is connected to Port 03. Send 00 to port 03 to turn the heater off. Send 80 to port 03 to turn the heater on. Input from port 03 to test the thermostat state. The code snippet below is an incomplete solution to control the heater to keep the temperature steady at about 21 C. You can click the thermometer to set the temperature. This can save time when you are testing the system.
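The decision the heater program has to make can be sketched in Python before reading the assembly. This is an illustration only, not the simulator's API: the function name heater_command is hypothetical, and the meaning of the thermostat bit (rightmost bit zero means too cold) is an assumption taken from the comments in the 99Heater.asm snippet.

```python
HEATER_ON = 0x80   # value sent to port 03 to turn the heater on
HEATER_OFF = 0x00  # value sent to port 03 to turn the heater off


def heater_command(port_value):
    """Choose the byte to send to port 03 from the byte read from port 03."""
    # Assumption: after masking with 01, a zero result means "too cold".
    too_cold = (port_value & 0x01) == 0
    return HEATER_ON if too_cold else HEATER_OFF


print(format(heater_command(0x00), "02X"))  # 80
print(format(heater_command(0x01), "02X"))  # 00
```

A complete controller would repeat this read, decide and output sequence in a loop instead of running it once.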
; ===== Heater and Thermostat on Port 03 =================
; ===== 99Heater.asm =====================================
; ===== Heater and Thermostat on Port 03 =================
        MOV AL,0        ; Code to turn the heater off
        OUT 03          ; Send code to the heater
        IN 03           ; Input from Port 03
        AND AL,1        ; Mask off left seven bits
        JZ Cold         ; If the result is zero, turn the
                        ; heater on
        HALT            ; Quit
Cold:
        MOV AL,80       ; Code to turn the heater on
        OUT 03          ; Send code to the heater
        END
; ========================================================

Snake and Maze
Port 04

How to Use

The left four bits control the direction of the snake.

        80 Up
        40 Down
        20 Left
        10 Right

The right four bits control the distance moved. For example, 4F means Down 15. 4 means down. F means 15.

The sample program below is rather wasteful of RAM. If you want to traverse the entire maze and go back to the start, you will run out of RAM. A good learning task is to use a data table. This reduces the size of the program greatly. Also, it is good style to separate code and data. Here is a code sample - not using a data table.

; ========================================================
; ===== 99snake.asm ======================================
; ===== Snake and Maze ===================================
Start:
        MOV AL,FF       ; Special code to reset the snake.
        OUT 04          ; Send AL to port 04 to control the
                        ; snake.
        MOV AL,4F       ; 4 means DOWN. F means 15.
        OUT 04          ; Send 4F to the snake
        OUT 04          ; Send 4F to the snake
        OUT 04          ; Send 4F to the snake
        OUT 04          ; Send 4F to the snake
        JMP Start
        END
; ========================================================

Stepper Motor
Port 05

How to Use

Here is a stepper motor. Normal motors run continuously and it is hard to control their movement. Stepper motors step through a precise angle when electromagnets are energised. Stepper motors are used for precise positional control in printers, plotters, robotic devices, disk drives and for any application where precise positional accuracy is required.
The motor is controlled by energising the four magnets in turn. It is possible to make the motor move in half steps by energising single magnets and pairs of magnets alternately. If the magnets are energised in the wrong sequence, the motor complains with a bleep from the computer speaker. Here is a code snippet to control the motor. Note that it would be better coding style to use a data table.

; ================================
; ===== 99Step.asm ===============
; ===== Stepper Motor ============
        mov al,1
        out 05
        mov al,2
        out 05
        mov al,4
        out 05
        mov al,8
        out 05
        mov al,9
        out 05
        mov al,1
        out 05
        mov al,3
        out 05
        mov al,2
        out 05
        mov al,6
        out 05
        mov al,4
        out 05
        mov al,c
        out 05
        mov al,8
        out 05
        mov al,9
        out 05
        mov al,1
        out 05
        end
; ================================

Lift/Elevator
Port 06

How to Use

Input Signals

Bits 8 and 7 are unused.
Bit 6 is wired to the top call button.
Bit 5 is wired to the bottom call button.
If these buttons are clicked with the mouse, the corresponding bits come on.
Bit 4 senses the lift and goes high when the lift cage reaches the bottom of the shaft.
Bit 3 senses the lift and goes high when the lift cage reaches the top of the shaft.

Outputs

Bit 2 turns on the lift motor and the cage goes down.
Bit 1 turns on the lift motor and the cage goes up.

Ways To Destroy the Lift

1. Turn on bits 1 and 2 at the same time. This causes the motor to go up and down simultaneously!
2. Crash the lift into the bottom of the shaft.
3. Crash the lift into the top of the shaft.
4. Run the simulation too slowly. Even if the code is logically correct, the lift crashes into the end of the shaft before the program has time to switch off the motor.

Hardware Timer
INT 02

How to Use

The hardware timer generates INT 02 at regular time intervals. The time interval can be changed using the Configuration tab as shown in the image. The CPU will ignore INT 02 unless the (I) flag in the status register (SR) is set. Use STI to set the (I) flag. Use CLI to clear the (I) flag.
The code sample below processes INT 02 but does nothing useful. If the CPU clock is too slow, a new INT 02 can occur before the previous one has been handled. This is not necessarily a problem as long as the CPU eventually catches up. To allow this to work, it is essential that the interrupt handler saves and restores any registers it uses. Use PUSH and PUSHF to save registers. Use POPF and POP to restore registers. Remember to pop items in the reverse of the order in which they were pushed. Code like this is "re-entrant".

If the CPU is too slow and does not catch up, the stack will gradually grow and eat up all the available RAM. Eventually the stack will overwrite the program, causing a crash. It is a useful learning exercise to slow the CPU clock and watch this happen.

        jmp     start
        db      10      ; Hardware Timer Interrupt Vector

; ===== Hardware Timer =======
        org     10
        nop             ; Do something useful here
        nop
        nop
        nop
        nop
        iret
; ============================

; ===== Idle Loop ============
start:
        STI             ; Set (I) flag
idle:
        nop             ; Do something useful here
        nop
        nop
        nop
        nop
        jmp     idle
; ============================
        end
; ============================

Numeric Keypad
Port 08
INT 04

How to Use

This is one of the more complex devices. To make the numeric keypad visible, use OUT 08. Every time a key is pressed, a hardware interrupt, INT 04, is generated. By default, the CPU will ignore this interrupt. To process the interrupt, at the start of the program, use the STI command to set the interrupt flag (I) in the CPU status register (SR). Place an interrupt vector at RAM address 04. This should point to your interrupt handler code. The interrupt handler should use IN 08 to read the key press into the AL register.

Once STI has set the (I) flag in the status register (SR), interrupts from the hardware timer will also be generated. These must be processed too. The hardware timer generates INT 02. To process this interrupt, place an interrupt vector at RAM location 02.
This should point to the timer interrupt handler code. The timer code can be as simple as IRET. This will cause an interrupt return without doing any other processing.

        jmp     start
        db      10      ; Hardware Timer Interrupt Vector
        db      00      ; Keyboard Interrupt Vector (unused)
        db      20      ; Numeric Keypad Interrupt Vector

; ===== Hardware Timer =======
        org     10
        nop             ; Do something useful here
        nop
        nop
        nop
        nop
        iret
; ============================

; ===== Keyboard Handler =====
        org     20
        CLI             ; Prevent re-entrant use
        push    al
        pushf
        in      08
        nop             ; Process the key press here
        nop
        nop
        nop
        nop
        popf
        pop     al
        STI
        iret
; ============================

; ===== Idle Loop ============
start:
        STI             ; Set (I) flag
        out     08      ; Make keypad visible
idle:
        nop             ; Do something useful here
        nop
        nop
        nop
        nop
        jmp     idle
; ============================
        end
; ============================

MIPS IV Instruction Set
Revision 3.2
September, 1995
Charles Price

MIPS IV Instruction Set. Rev 3.2
MIPS Technologies, Inc. All Rights Reserved

RESTRICTED RIGHTS LEGEND

Use, duplication, or disclosure of the technical data contained in this document by the Government is subject to restrictions as set forth in subdivision (c) (1) (ii) of the Rights in Technical Data and Computer Software clause at DFARS 52.227-7013 and / or in similar or successor clauses in the FAR, or in the DOD or NASA FAR Supplement. Unpublished rights reserved under the Copyright Laws of the United States. Contractor / manufacturer is MIPS Technologies, Inc., 2011 N. Shoreline Blvd., Mountain View, CA 94039-7311.

R2000, R3000, R6000, R4000, R4400, R4200, R8000, R4300 and R10000 are trademarks of MIPS Technologies, Inc. MIPS and R3000 are registered trademarks of MIPS Technologies, Inc.

The information in this document is preliminary and subject to change without notice. MIPS Technologies, Inc. (MTI) reserves the right to change any portion of the product described herein to improve function or design.
MTI does not assume liability arising out of the application or use of any product or circuit described herein.

Information on MIPS products is available electronically:

(a) Through the World Wide Web. Point your WWW client to: http://www.mips.com
(b) Through ftp from the internet site “sgigate.sgi.com”. Login as “ftp” or “anonymous” and then cd to the directory “pub/doc”.
(c) Through an automated FAX service:
    Inside the USA toll free: (800) 446-6477 (800-IGO-MIPS)
    Outside the USA: (415) 688-4321 (call from a FAX machine)

MIPS Technologies, Inc.
2011 N. Shoreline Blvd.
Mountain View, CA 94039-7311
Phone: USA toll free: (800) 998-6477
       Outside USA: (415) 933-6477

Contents

CPU Instruction Set
    Introduction
    Functional Instruction Groups
        Load and Store Instructions
            Delayed Loads
            CPU Loads and Stores
            Atomic Update Loads and Stores
            Coprocessor Loads and Stores
        Computational Instructions
            ALU
            Shifts
            Multiply and Divide
        Jump and Branch Instructions
        Miscellaneous Instructions
            Exception Instructions
            Serialization Instructions
            Conditional Move Instructions
            Prefetch
        Coprocessor Instructions
            Coprocessor Load and Store
            Coprocessor Operations
    Memory Access Types
        Uncached
        Cached Noncoherent
        Cached Coherent
        Cached
        Mixing References with Different Access Types
        Cache Coherence Algorithms and Access Types
        Implementation-Specific Access Types
    Description of an Instruction
        Instruction mnemonic and name
        Instruction encoding picture
        Format
        Purpose
        Description
        Restrictions
        Operation
        Exceptions
        Programming Notes, Implementation Notes
    Operation Section Notation and Functions
        Pseudocode Language
        Pseudocode Symbols
        Pseudocode Functions
            Coprocessor General Register Access Functions
            Load and Store Memory Functions
            Access Functions for Floating-Point Registers
            Miscellaneous Functions
    Individual CPU Instruction Descriptions
    CPU Instruction Formats
    CPU Instruction Encoding
        Instruction Decode
            SPECIAL Instruction Class
            REGIMM Instruction Class
        Instruction Subsets of MIPS III and MIPS IV Processors
        Non-CPU Instructions in the Tables
            Coprocessor 0 - COP0
            Coprocessor 1 - COP1, COP1X, MOVCI, and CP1 load/store
            Coprocessor 2 - COP2 and CP2 load/store
            Coprocessor 3 - COP3 and CP3 load/store

FPU Instruction Set
    Introduction
    FPU Data Types
        Floating-point formats
            Normalized and Denormalized Numbers
            Reserved Operand Values — Infinity and NaN
        Fixed-point formats
    Floating-Point Registers
        Organization
        Binary Data Transfers
        Formatted Operand Layout
        Implementation and Revision Register
        FPU Control and Status Register — FCSR
    Values in FP Registers
    FPU Exceptions
        Precise Exception Mode
        Imprecise Exception Mode
        Exception Condition Definitions
            Invalid Operation exception
            Division By Zero exception
            Overflow exception
            Underflow exception
            Inexact exception
            Unimplemented Operation exception
    Functional Instruction Groups
        Data Transfer Instructions
        Arithmetic Instructions
        Conversion Instructions
        Formatted Operand Value Move Instructions
        Conditional Branch Instructions
        Miscellaneous Instructions
            CPU Conditional Move
    Valid Operands for FP Instructions
    Description of an Instruction
    Operation Notation Conventions and Functions
    Individual FPU Instruction Descriptions
    FPU Instruction Formats
    FPU (CP1) Instruction Opcode Bit Encoding
        Instruction Decode
            COP1 Instruction Class
            COP1X Instruction Class
            SPECIAL Instruction Class
        Instruction Subsets of MIPS III and MIPS IV Processors

List of Figures

Figure A-1. Example Instruction Description
Figure A-2. Unaligned Doubleword Load using LDL and LDR
Figure A-3. Unaligned Doubleword Load using LDR and LDL
Figure A-4. Unaligned Word Load using LWL and LWR
Figure A-5. Unaligned Word Load using LWR and LWL
Figure A-6. Unaligned Doubleword Store with SDL and SDR
Figure A-7. Unaligned Doubleword Store with SDR and SDL
Figure A-8. Unaligned Word Store using SWL and SWR
Figure A-9. Unaligned Word Store using SWR and SWL
Figure A-10. CPU Instruction Formats
Figure B-1. Single-Precision Floating-Point Format (S)
Figure B-2. Double-Precision Floating-Point Format (D)
Figure B-3. Word Fixed-Point Format (W)
Figure B-4. Longword Fixed-Point Format (L)
Figure B-5. Coprocessor 1 General Registers (FGRs)
Figure B-6. Effect of FPU Word Load or Move-to Operations
Figure B-7. Effect of FPU Doubleword Load or Move-to Operations
Figure B-8. Floating-point Operand Register (FPR) Organization
Figure B-9. Single Floating Point (S) or Word Fixed (W) Operand in an FPR
Figure B-10. Double Floating Point (D) or Long Fixed (L) Operand In an FPR
Figure B-11. FPU Implementation and Revision Register
Figure B-12. MIPS I - FPU Control and Status Register (FCSR)
Figure B-13. MIPS III - FPU Control and Status Register (FCSR)
Figure B-14. MIPS IV - FPU Control and Status Register (FCSR)
Figure B-15. The Effect of FPU Operations on the Format of Values Held in FPRs
Figure B-16. FPU Instruction Formats

List of Tables

Table A-1. Load/Store Operations Using Register + Offset Addressing Mode
Table A-2. Load/Store Operations Using Register + Register Addressing Mode
Table A-3. Normal CPU Load/Store Instructions
Table A-4. Unaligned CPU Load/Store Instructions
Table A-5. Atomic Update CPU Load/Store Instructions
Table A-6. Coprocessor Load/Store Instructions
Table A-7. FPU Load/Store Instructions Using Register + Register Addressing
Table A-8. ALU Instructions With an Immediate Operand
Table A-9. 3-Operand ALU Instructions
Table A-10. Shift Instructions
Table A-11. Multiply/Divide Instructions
Table A-12. Jump Instructions Jumping Within a 256 Megabyte Region
Table A-13. Jump Instructions to Absolute Address
Table A-14. PC-Relative Conditional Branch Instructions Comparing 2 Registers
Table A-15. PC-Relative Conditional Branch Instructions Comparing Against Zero
Table A-16. System Call and Breakpoint Instructions
Table A-17. Trap-on-Condition Instructions Comparing Two Registers
Table A-18. Trap-on-Condition Instructions Comparing an Immediate
Table A-19. Serialization Instructions
Table A-20. CPU Conditional Move Instructions
Table A-21. Prefetch Using Register + Offset Address Mode
Table A-22. Prefetch Using Register + Register Address Mode
Table A-23. Coprocessor Definition and Use in the MIPS Architecture
Table A-24. Coprocessor Operation Instructions
Table A-25. Symbols in Instruction Operation Statements
Table A-26. Coprocessor General Register Access Functions
Table A-27. AccessLength Specifications for Loads/Stores
Table A-28. Bytes Loaded by LDL Instruction
Table A-29. Bytes Loaded by LDR Instruction
Table A-30. Bytes Loaded by LWL Instruction
Table A-31. Bytes Loaded by LWR Instruction
Table A-32. Values of Hint Field for Prefetch Instruction
Table A-33. Bytes Stored by SDL Instruction
Table A-34. Bytes Stored by SDR Instruction
Table A-35. Bytes Stored by SWL Instruction
Table A-36. Bytes Stored by SWR Instruction
Table A-37. CPU Instruction Encoding - MIPS I Architecture
Table A-38. CPU Instruction Encoding - MIPS II Architecture
Table A-39. CPU Instruction Encoding - MIPS III Architecture
Table A-40. CPU Instruction Encoding - MIPS IV Architecture
Table A-41. Architecture Level in Which CPU Instructions are Defined or Extended
Table A-42. CPU Instruction Encoding Changes - MIPS II Revision
Table A-43. CPU Instruction Encoding Changes - MIPS III Revision
Table A-44. CPU Instruction Encoding Changes - MIPS IV Revision
Table B-1. Parameters of Floating-Point Formats
Table B-2. Value of Single or Double Floating-Point Format Encoding
Table B-3. Value Supplied when a new Quiet NaN is Created
Table B-4. Default Result for IEEE Exceptions Not Trapped Precisely
Table B-5. FPU Loads and Stores Using Register + Offset Address Mode
Table B-6. FPU Loads and Stores Using Register + Register Address Mode
Table B-7. FPU Move To/From Instructions
Table B-8. FPU IEEE Arithmetic Operations
Table B-9. FPU Approximate Arithmetic Operations
Table B-10. FPU Multiply-Accumulate Arithmetic Operations
Table B-11. FPU Conversion Operations Using the FCSR Rounding Mode
Table B-12. FPU Conversion Operations Using a Directed Rounding Mode
Table B-13. FPU Formatted Operand Move Instructions
Table B-14. FPU Conditional Move on True/False Instructions
Table B-15. FPU Conditional Move on Zero/Nonzero Instructions
Table B-16. FPU Conditional Branch Instructions
Table B-17. CPU Conditional Move on FPU True/False Instructions
Table B-18. FPU Operand Format Field (fmt, fmt3) Decoding
Table B-19. Valid Formats for FPU Operations
Table B-20. FPU Comparisons Without Special Operand Exceptions
Table B-21. FPU Comparisons With Special Operand Exceptions for QNaNs
Table B-22. Values of Hint Field for Prefetch Instruction
Table B-23. FPU (CP1) Instruction Encoding - MIPS I Architecture
Table B-24. FPU (CP1) Instruction Encoding - MIPS II Architecture
Table B-25. FPU (CP1) Instruction Encoding - MIPS III Architecture
Table B-26. FPU (CP1) Instruction Encoding - MIPS IV Architecture
Table B-27. Architecture Level In Which FPU Instructions are Defined or Extended
Table B-28. FPU Instruction Encoding Changes - MIPS II Architecture Revision
Table B-29. FPU Instruction Encoding Changes - MIPS III Revision
Table B-30. FPU Instruction Encoding Changes - MIPS IV Revision

Revision History

2.0 (Jan 94): First General Release
This version contained incorrect definitions for MSUB and NMSUB. It did not contain the RECIP and RSQRT instructions. It contained incomplete or erroneous information for LL, LLD, SC, SCD, SYNC, PREF, and PREFX. All copies of this version of the document should be destroyed.

2.2 (Jul 94): Mandatory Replacement of Rev 2.0
This version should probably have been 3.0 since it is a major content change. This version is issued with no known errors. It includes the late changes to the MIPS IV definition including the reintroduction of RECIP and RSQRT and the definition of the multiply-accumulate instructions as unfused (rounded) operations.

3.0 (Oct 94):
Add itemized instruction lists in the discussion of instruction functional groups. Add a more complete description of FPU operation. Correct problems discovered with Revision 2.2.

3.1 (Jan 95):
Correct minor problems discovered with Revision 3.0.

3.2 (Sep 95):
Revise the opcode encoding tables significantly. Correct minor problems discovered with Revision 3.1.

Changes from previous revision:

Changes are generally marked by change bars in the outer margin of the page -- just like the bar to the side of this line. Minor corrections to punctuation and spelling are neither marked with change bars nor noted in this list. Some changes in figures are not marked by change bars due to limitations of the publishing tools.

CVT.D.fmt Instruction
Change the architecture level for the CVT.D.L version of the instruction
    from:
    to: MIPS III

CVT.S.fmt Instruction
Change the architecture level for the CVT.S.L version of the instruction
    from:
    to: MIPS III

LWL Instruction
In the example in Fig. A-4 the sign extension “After executing LWL $24,2($0)” should be changed
    from: no cng or sign ext
    to: sign bit (31) extend.
The information in the tables later in the instruction description is correct.

MOVF Instruction
Change the name of the constant value in the function field
    from: MOVC
    to: MOVCI
There is a corresponding change in the FPU opcode encoding table in section B.12 with opcode=SPECIAL and function=MOVC, changing the value to MOVCI.

MOVF.fmt Instruction
Change the name of the constant value in the function field
    from: MOVC
    to: MOVCF
There is a corresponding change in the FPU opcode encoding table in section B.12 with opcode=COP1, fmt = S or D, and function=MOVC, changing the value to MOVCI.

MOVF Instruction
Change the name of the constant value in the function field
    from: MOVC
    to: MOVCI
There is a corresponding change in the FPU opcode encoding table in section B.12 with opcode=SPECIAL and function=MOVC, changing the value to MOVCI.

MOVT.fmt Instruction
Change the name of the constant value in the function field
    from: MOVC
    to: MOVCF
There is a corresponding change in the FPU opcode encoding table in section B.12 with opcode=COP1, fmt = S or D, and function=MOVC, changing the value to MOVCI.

CPU Instruction Encoding tables
Revise the presentation of the opcode encoding in section A 8 for greater clarity when considering different architecture levels or operating a MIPS III or MIPS IV processor in the MIPS II or MIPS III instruction subset modes. There is a separate encoding table for each architecture level. There is a table of the MIPS IV encodings showing the architecture level at which each opcode was first defined and subsequently modified or extended. There is a separate table for each architecture revision I→II, II→III, and III→IV showing the changes made in that revision.
FPU Instruction Encoding tables
Revise the presentation of the opcode encoding in section B.12 for greater clarity when considering different architecture levels or operating a MIPS III or MIPS IV processor in the MIPS II or MIPS III instruction subset modes. There is a separate encoding table for each architecture level. There is a table of the MIPS IV encodings showing the architecture level at which each opcode was first defined and subsequently modified or extended. There is a separate table for each architecture revision I→II, II→III, and III→IV showing the changes made in that revision.

A  CPU Instruction Set

A 1 Introduction

This appendix describes the instruction set architecture (ISA) for the central processing unit (CPU) in the MIPS IV architecture. The CPU architecture defines the non-privileged instructions that execute in user mode. It does not define privileged instructions providing processor control executed by the implementation-specific System Control Processor. Instructions for the floating-point unit are described in Appendix B.

The original MIPS I CPU ISA has been extended in a backward-compatible fashion three times. The ISA extensions are inclusive as the diagram illustrates; each new architecture level (or version) includes the former levels. The description of an architectural feature includes the architecture level in which the feature is (first) defined or extended. The feature is also available in all later (higher) levels of the architecture.

[Figure: MIPS Architecture Extensions (levels MIPS I, MIPS II, MIPS III, MIPS IV)]

The practical result is that a processor implementing MIPS IV is also able to run MIPS I, MIPS II, or MIPS III binary programs without change.

The CPU instruction set is first summarized by functional group then each instruction is described separately in alphabetical order.
The appendix describes the organization of the individual instruction descriptions and the notation used in them (including FPU instructions). It concludes with the CPU instruction formats and opcode encoding tables. A 2 Functional Instruction Groups CPU instructions are divided into the following functional groups: • Load and Store • ALU • Jump and Branch • Miscellaneous • Coprocessor A 2.1 Load and Store Instructions Load and store instructions transfer data between the memory system and the general register sets in the CPU and the coprocessors. There are separate instructions for different purposes: transferring various sized fields, treating loaded data as signed or unsigned integers, accessing unaligned fields, selecting the addressing mode, and providing atomic memory update (read-modify-write). Regardless of byte ordering (big- or little-endian), the address of a halfword, word, or doubleword is the smallest byte address among the bytes forming the object. For big-endian ordering this is the most-significant byte; for a little-endian ordering this is the least-significant byte. Except for the few specialized instructions listed in Table A-4, loads and stores must access naturally aligned objects. An attempt to load or store an object at an address that is not an even multiple of the size of the object will cause an Address Error exception. Load and store operations have been added in each revision of the architecture: MIPS II • 64-bit coprocessor transfers • atomic update MIPS III • 64-bit CPU transfers • unsigned word load for CPU MIPS IV • register + register addressing mode for FPU CPU Instruction Set MIPS IV Instruction Set. Rev 3.2 A-3 Tables A-1 and A-2 tabulate the supported load and store operations and indicate the MIPS architecture level at which each operation was first supported. The instructions themselves are listed in the following sections. Table A-1 Load/Store Operations Using Register + Offset Addressing Mode. 
Data Size                           CPU                                  Coprocessor (except 0)
                                    Load Signed  Load Unsigned  Store    Load   Store
byte                                I            I              I
halfword                            I            I              I
word                                I            III            I        I      I
doubleword                          III                         III      II     II
unaligned word                      I                           I
unaligned doubleword                III                         III
linked word (atomic modify)         II                          II
linked doubleword (atomic modify)   III                         III

Table A-2 Load/Store Operations Using Register + Register Addressing Mode.

Floating-point coprocessor only
Data Size     Load   Store
word          IV     IV
doubleword    IV     IV

A 2.1.1 Delayed Loads

The MIPS I architecture defines delayed loads; an instruction scheduling restriction requires that an instruction immediately following a load into register Rn cannot use Rn as a source register. The time between the load instruction and the time the data is available is the “load delay slot”. If no useful instruction can be put into the load delay slot, then a null operation (assembler mnemonic NOP) must be inserted.

In MIPS II, this instruction scheduling restriction is removed. Programs will execute correctly when the loaded data is used by the instruction following the load, but this may require extra real cycles. Most processors cannot actually load data quickly enough for immediate use and the processor will be forced to wait until the data is available. Scheduling load delay slots is desirable for performance reasons even when it is not necessary for correctness.

A 2.1.2 CPU Loads and Stores

There are instructions to transfer different amounts of data: bytes, halfwords, words, and doublewords. Signed and unsigned integers of different sizes are supported by loads that either sign-extend or zero-extend the data loaded into the register.

Table A-3 Normal CPU Load/Store Instructions

Mnemonic  Description              Defined in
LB        Load Byte                MIPS I
LBU       Load Byte Unsigned       MIPS I
SB        Store Byte               MIPS I
LH        Load Halfword            MIPS I
LHU       Load Halfword Unsigned   MIPS I
SH        Store Halfword           MIPS I
LW        Load Word                MIPS I
LWU       Load Word Unsigned       MIPS III
SW        Store Word               MIPS I
LD        Load Doubleword          MIPS III
SD        Store Doubleword         MIPS III

Unaligned words and doublewords can be loaded or stored in only two instructions by using a pair of special instructions. The load instructions read the left-side or right-side bytes (left or right side of register) from an aligned word and merge them into the correct bytes of the destination register. MIPS I, though it prohibits other use of loaded data in the load delay slot, permits LWL and LWR instructions targeting the same destination register to be executed sequentially. Store instructions select the correct bytes from a source register and update only those bytes in an aligned memory word (or doubleword).

Table A-4 Unaligned CPU Load/Store Instructions

Mnemonic  Description              Defined in
LWL       Load Word Left           MIPS I
LWR       Load Word Right          MIPS I
SWL       Store Word Left          MIPS I
SWR       Store Word Right         MIPS I
LDL       Load Doubleword Left     MIPS III
LDR       Load Doubleword Right    MIPS III
SDL       Store Doubleword Left    MIPS III
SDR       Store Doubleword Right   MIPS III

A 2.1.3 Atomic Update Loads and Stores

There are paired instructions, Load Linked and Store Conditional, that can be used to perform atomic read-modify-write of word and doubleword cached memory locations. These instructions are used in carefully coded sequences to provide one of several synchronization primitives, including test-and-set, bit-level locks, semaphores, and sequencers/event counts. The individual instruction descriptions describe how to use them.

Table A-5 Atomic Update CPU Load/Store Instructions

Mnemonic  Description                    Defined in
LL        Load Linked Word               MIPS II
SC        Store Conditional Word         MIPS II
LLD       Load Linked Doubleword         MIPS III
SCD       Store Conditional Doubleword   MIPS III

A 2.1.4 Coprocessor Loads and Stores

These loads and stores are coprocessor instructions; however, it is more useful to summarize all load and store instructions in one place than to list them in the coprocessor instructions functional group.

If a particular coprocessor is not enabled, loads and stores to that coprocessor cannot execute and will cause a Coprocessor Unusable exception. Enabling a coprocessor is a privileged operation provided by the System Control Coprocessor.

Table A-6 Coprocessor Load/Store Instructions

Mnemonic  Description                           Defined in
LWCz      Load Word to Coprocessor-z            MIPS I
SWCz      Store Word from Coprocessor-z         MIPS I
LDCz      Load Doubleword to Coprocessor-z      MIPS II
SDCz      Store Doubleword from Coprocessor-z   MIPS II

Table A-7 FPU Load/Store Instructions Using Register + Register Addressing

Mnemonic  Description                                      Defined in
LWXC1     Load Word Indexed to Floating Point              MIPS IV
SWXC1     Store Word Indexed from Floating Point           MIPS IV
LDXC1     Load Doubleword Indexed to Floating Point        MIPS IV
SDXC1     Store Doubleword Indexed from Floating Point     MIPS IV

A 2.2 Computational Instructions

Two’s complement arithmetic is performed on integers represented in two’s complement notation. There are signed versions of add, subtract, multiply, and divide. There are add and subtract operations, called “unsigned”, that are actually modulo arithmetic without overflow detection. There are unsigned versions of multiply and divide. There is a full complement of shift and logical operations.

MIPS I provides 32-bit integers and 32-bit arithmetic. MIPS III adds 64-bit integers and provides separate arithmetic and shift instructions for 64-bit operands. Logical operations are not sensitive to the width of the register.

A 2.2.5 ALU

Some arithmetic and logical instructions operate on one operand from a register and the other from a 16-bit immediate value in the instruction word. The immediate operand is treated as signed for the arithmetic and compare instructions, and treated as logical (zero-extended to register length) for the logical instructions.
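The distinction A 2.2.5 draws between signed and logical immediates can be sketched in a few lines of Python. This is only an illustrative model: the helper names (sign_extend16, zero_extend16, addiu, ori) and the fixed 32-bit register width are assumptions made for the sketch, not definitions taken from the architecture.

```python
GPRLEN = 32                      # assumed register width for this sketch
MASK = (1 << GPRLEN) - 1


def sign_extend16(imm16: int) -> int:
    """Extend a 16-bit immediate by copying bit 15 into the upper bits."""
    imm16 &= 0xFFFF
    if imm16 & 0x8000:           # bit 15 set: the immediate is negative
        imm16 -= 0x10000
    return imm16 & MASK


def zero_extend16(imm16: int) -> int:
    """Extend a 16-bit immediate with zeroes in the upper bits."""
    return imm16 & 0xFFFF


def addiu(rs_value: int, imm16: int) -> int:
    """Model of ADDIU: signed immediate, modulo arithmetic, no overflow trap."""
    return (rs_value + sign_extend16(imm16)) & MASK


def ori(rs_value: int, imm16: int) -> int:
    """Model of ORI: logical (zero-extended) immediate."""
    return (rs_value | zero_extend16(imm16)) & MASK


if __name__ == "__main__":
    print(hex(addiu(5, 0xFFFF)))   # the immediate acts as -1
    print(hex(ori(0, 0xFFFF)))     # the immediate acts as 0x0000FFFF
```

The same bit pattern 0xFFFF is read as -1 by the arithmetic instruction and as 0x0000FFFF by the logical one, which is why the two groups must be kept apart when choosing an instruction for a constant.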
Table A-8 ALU Instructions With an Immediate Operand

Mnemonic  Description                           Defined in
ADDI      Add Immediate Word                    MIPS I
ADDIU     Add Immediate Unsigned Word           MIPS I
SLTI      Set on Less Than Immediate            MIPS I
SLTIU     Set on Less Than Immediate Unsigned   MIPS I
ANDI      And Immediate                         MIPS I
ORI       Or Immediate                          MIPS I
XORI      Exclusive Or Immediate                MIPS I
LUI       Load Upper Immediate                  MIPS I
DADDI     Doubleword Add Immediate              MIPS III
DADDIU    Doubleword Add Immediate Unsigned     MIPS III

Table A-9 3-Operand ALU Instructions

Mnemonic  Description                    Defined in
ADD       Add Word                       MIPS I
ADDU      Add Unsigned Word              MIPS I
SUB       Subtract Word                  MIPS I
SUBU      Subtract Unsigned Word         MIPS I
DADD      Doubleword Add                 MIPS III
DADDU     Doubleword Add Unsigned        MIPS III
DSUB      Doubleword Subtract            MIPS III
DSUBU     Doubleword Subtract Unsigned   MIPS III
SLT       Set on Less Than               MIPS I
SLTU      Set on Less Than Unsigned      MIPS I
AND       And                            MIPS I
OR        Or                             MIPS I
XOR       Exclusive Or                   MIPS I
NOR       Nor                            MIPS I

A 2.2.6 Shifts

There are shift instructions that take the shift amount from a 5-bit field in the instruction word and shift instructions that take a shift amount from the low-order bits of a general register. The instructions with a fixed shift amount are limited to a 5-bit shift count, so there are separate instructions for doubleword shifts of 0-31 bits and 32-63 bits.

Table A-10 Shift Instructions

Mnemonic  Description                                   Defined in
SLL       Shift Word Left Logical                       MIPS I
SRL       Shift Word Right Logical                      MIPS I
SRA       Shift Word Right Arithmetic                   MIPS I
SLLV      Shift Word Left Logical Variable              MIPS I
SRLV      Shift Word Right Logical Variable             MIPS I
SRAV      Shift Word Right Arithmetic Variable          MIPS I
DSLL      Doubleword Shift Left Logical                 MIPS III
DSRL      Doubleword Shift Right Logical                MIPS III
DSRA      Doubleword Shift Right Arithmetic             MIPS III
DSLL32    Doubleword Shift Left Logical + 32            MIPS III
DSRL32    Doubleword Shift Right Logical + 32           MIPS III
DSRA32    Doubleword Shift Right Arithmetic + 32        MIPS III
DSLLV     Doubleword Shift Left Logical Variable        MIPS III
DSRLV     Doubleword Shift Right Logical Variable       MIPS III
DSRAV     Doubleword Shift Right Arithmetic Variable    MIPS III
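The split between DSLL and DSLL32 described in A 2.2.6 follows from the 5-bit shift field. The Python sketch below models it; the function names are hypothetical, and the assumption that a variable doubleword shift uses the low-order six bits of the register is stated here rather than taken from this section, which only says "low-order bits".

```python
from typing import Tuple

MASK64 = (1 << 64) - 1


def dsll(value: int, sa: int) -> int:
    """Doubleword shift left by a fixed amount of 0-31 bits (5-bit sa field)."""
    if not 0 <= sa <= 31:
        raise ValueError("sa must fit in 5 bits")
    return (value << sa) & MASK64


def dsll32(value: int, sa: int) -> int:
    """Doubleword shift left by sa + 32 bits, covering amounts 32-63."""
    if not 0 <= sa <= 31:
        raise ValueError("sa must fit in 5 bits")
    return (value << (sa + 32)) & MASK64


def select_dsll(amount: int) -> Tuple[str, int]:
    """Pick the mnemonic and sa field for a fixed left shift of 0-63 bits."""
    if not 0 <= amount <= 63:
        raise ValueError("doubleword shift amount must be 0-63")
    if amount < 32:
        return ("DSLL", amount)
    return ("DSLL32", amount - 32)


def dsllv(value: int, rs_value: int) -> int:
    """Variable doubleword shift left.

    Assumption: only the low-order six bits of rs_value are used.
    """
    return (value << (rs_value & 0x3F)) & MASK64
```

For example, a fixed shift of 40 bits does not fit in the 5-bit field, so select_dsll(40) picks DSLL32 with sa = 8, and dsll32(1, 8) produces the same value as shifting 1 left by 40.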
A 2.2.7 Multiply and Divide

The multiply and divide instructions produce twice as many result bits as is typical with other processors and they deliver their results into the HI and LO special registers. Multiply produces a full-width product twice the width of the input operands; the low half is put in LO and the high half is put in HI. Divide produces both a quotient in LO and a remainder in HI. The results are accessed by instructions that transfer data between HI/LO and the general registers.

Table A-11 Multiply/Divide Instructions

Mnemonic  Description                    Defined in
MULT      Multiply Word                  MIPS I
MULTU     Multiply Unsigned Word         MIPS I
DIV       Divide Word                    MIPS I
DIVU      Divide Unsigned Word           MIPS I
DMULT     Doubleword Multiply            MIPS III
DMULTU    Doubleword Multiply Unsigned   MIPS III
DDIV      Doubleword Divide              MIPS III
DDIVU     Doubleword Divide Unsigned     MIPS III
MFHI      Move From HI                   MIPS I
MTHI      Move To HI                     MIPS I
MFLO      Move From LO                   MIPS I
MTLO      Move To LO                     MIPS I

A 2.3 Jump and Branch Instructions

The architecture defines PC-relative conditional branches, a PC-region unconditional jump, an absolute (register) unconditional jump, and a similar set of procedure calls that record a return link address in a general register. For convenience this discussion refers to them all as branches.

All branches have an architectural delay of one instruction. When a branch is taken, the instruction immediately following the branch instruction, in the branch delay slot, is executed before the branch to the target instruction takes place. Conditional branches come in two versions that treat the instruction in the delay slot differently when the branch is not taken and execution falls through. The “branch” instructions execute the instruction in the delay slot, but the “branch likely” instructions do not (they are said to nullify it).

By convention, if an exception or interrupt prevents the completion of an instruction occupying a branch delay slot, the instruction stream is continued by re-executing the branch instruction. To permit this, branches must be restartable; procedure calls may not use the register in which the return link is stored (usually register 31) to determine the branch target address.

Table A-12 Jump Instructions Jumping Within a 256 Megabyte Region

Mnemonic  Description     Defined in
J         Jump            MIPS I
JAL       Jump and Link   MIPS I

Table A-13 Jump Instructions to Absolute Address

Mnemonic  Description              Defined in
JR        Jump Register            MIPS I
JALR      Jump and Link Register   MIPS I

Table A-14 PC-Relative Conditional Branch Instructions Comparing 2 Registers

Mnemonic  Description                                    Defined in
BEQ       Branch on Equal                                MIPS I
BNE       Branch on Not Equal                            MIPS I
BLEZ      Branch on Less Than or Equal to Zero           MIPS I
BGTZ      Branch on Greater Than Zero                    MIPS I
BEQL      Branch on Equal Likely                         MIPS II
BNEL      Branch on Not Equal Likely                     MIPS II
BLEZL     Branch on Less Than or Equal to Zero Likely    MIPS II
BGTZL     Branch on Greater Than Zero Likely             MIPS II

Table A-15 PC-Relative Conditional Branch Instructions Comparing Against Zero

Mnemonic  Description                                                Defined in
BLTZ      Branch on Less Than Zero                                   MIPS I
BGEZ      Branch on Greater Than or Equal to Zero                    MIPS I
BLTZAL    Branch on Less Than Zero and Link                          MIPS I
BGEZAL    Branch on Greater Than or Equal to Zero and Link           MIPS I
BLTZL     Branch on Less Than Zero Likely                            MIPS II
BGEZL     Branch on Greater Than or Equal to Zero Likely             MIPS II
BLTZALL   Branch on Less Than Zero and Link Likely                   MIPS II
BGEZALL   Branch on Greater Than or Equal to Zero and Link Likely    MIPS II

A 2.4 Miscellaneous Instructions

A 2.4.1 Exception Instructions

Exception instructions have as their sole purpose causing an exception that will transfer control to a software exception handler in the kernel. System call and breakpoint instructions cause exceptions unconditionally. The trap instructions cause exceptions conditionally based upon the result of a comparison.

Table A-16 System Call and Breakpoint Instructions

Mnemonic  Description   Defined in
SYSCALL   System Call   MIPS I
BREAK     Breakpoint    MIPS I

Table A-17 Trap-on-Condition Instructions Comparing Two Registers

Mnemonic  Description                              Defined in
TGE       Trap if Greater Than or Equal            MIPS II
TGEU      Trap if Greater Than or Equal Unsigned   MIPS II
TLT       Trap if Less Than                        MIPS II
TLTU      Trap if Less Than Unsigned               MIPS II
TEQ       Trap if Equal                            MIPS II
TNE       Trap if Not Equal                        MIPS II

Table A-18 Trap-on-Condition Instructions Comparing an Immediate

Mnemonic  Description                                        Defined in
TGEI      Trap if Greater Than or Equal Immediate            MIPS II
TGEIU     Trap if Greater Than or Equal Unsigned Immediate   MIPS II
TLTI      Trap if Less Than Immediate                        MIPS II
TLTIU     Trap if Less Than Unsigned Immediate               MIPS II
TEQI      Trap if Equal Immediate                            MIPS II
TNEI      Trap if Not Equal Immediate                        MIPS II

A 2.4.2 Serialization Instructions

The order in which memory accesses from load and store instructions appear outside the processor executing them, in a multiprocessor system for example, is not specified by the architecture. The SYNC instruction creates a point in the executing instruction stream at which the relative order of some loads and stores is known. Loads and stores executed before the SYNC are completed before loads and stores after the SYNC can start.

Table A-19 Serialization Instructions

Mnemonic  Description                  Defined in
SYNC      Synchronize Shared Memory    MIPS II

A 2.4.3 Conditional Move Instructions

Instructions were added in MIPS IV to conditionally move one CPU general register to another based on the value in a third general register.

Table A-20 CPU Conditional Move Instructions

Mnemonic  Description                    Defined in
MOVN      Move Conditional on Not Zero   MIPS IV
MOVZ      Move Conditional on Zero       MIPS IV

A 2.4.4 Prefetch

There are two prefetch advisory instructions; one with register+offset addressing and the other with register+register addressing. These instructions advise that memory is likely to be used in a particular way in the near future and should be prefetched into the cache. The PREFX instruction using register+register addressing mode is coded in the FPU opcode space along with the other operations using register+register addressing.

Table A-21 Prefetch Using Register + Offset Address Mode

Mnemonic  Description   Defined in
PREF      Prefetch      MIPS IV

Table A-22 Prefetch Using Register + Register Address Mode

Mnemonic  Description        Defined in
PREFX     Prefetch Indexed   MIPS IV

A 2.5 Coprocessor Instructions

Coprocessors are alternate execution units, with register files separate from the CPU. The MIPS architecture provides an abstraction for up to 4 coprocessor units, numbered 0 to 3. Each architecture level defines some of these coprocessors as shown in Table A-23. Coprocessor 0 is always used for system control and coprocessor 1 is used for the floating-point unit. Other coprocessors are architecturally valid, but do not have a reserved use. Some coprocessors are not defined and their opcodes are either reserved or used for other purposes.

Table A-23 Coprocessor Definition and Use in the MIPS Architecture

Coprocessor   MIPS architecture level
              I             II            III           IV
0             Sys Control   Sys Control   Sys Control   Sys Control
1             FPU           FPU           FPU           FPU
2             unused        unused        unused        unused
3             unused        unused        not defined   FPU (COP1X)

The coprocessors may have two register sets, coprocessor general registers and coprocessor control registers, each set containing up to thirty-two registers. Coprocessor computational instructions may alter registers in either set.

System control for all MIPS processors is implemented as coprocessor 0 (CP0), the System Control Coprocessor. It provides the processor control, memory management, and exception handling functions. The CP0 instructions are specific to each CPU and are documented with the CPU-specific information.

If a system includes a floating-point unit, it is implemented as coprocessor 1 (CP1). In MIPS IV, the FPU also uses the computation opcode space for coprocessor unit 3, renamed COP1X. The FPU instructions are documented in Appendix B.

The coprocessor instructions are divided into two main groups:
• Load and store instructions that are reserved in the main opcode space.
• Coprocessor-specific operations that are defined entirely by the coprocessor.

A 2.5.1 Coprocessor Load and Store

Load and store instructions are not defined for CP0; the move to/from coprocessor instructions are the only way to write and read the CP0 registers. The loads and stores for coprocessors are summarized in Load and Store Instructions on page A-2.

A 2.5.2 Coprocessor Operations

There are up to four coprocessors and the instructions are shown generically for coprocessor-z. Within the operation main opcode, the coprocessor has further coprocessor-specific instructions encoded.
Table A-24 Coprocessor Operation Instructions

Mnemonic  Description               Defined in
COPz      Coprocessor-z Operation   MIPS I

A 3 Memory Access Types

MIPS systems provide a few memory access types that are characteristic ways to use physical memory and caches to perform a memory access. The memory access type is specified as a cache coherence algorithm (CCA) in the TLB entry for a mapped virtual page. The access type used for a location is associated with the virtual address, not the physical address or the instruction making the reference. Implementations without multiprocessor (MP) support provide uncached and cached accesses. Implementations with MP support provide uncached, cached noncoherent and cached coherent accesses. The memory access types use the memory hierarchy as follows:

Uncached
Physical memory is used to resolve the access. Each reference causes a read or write to physical memory. Caches are neither examined nor modified.

Cached Noncoherent
Physical memory and the caches of the processor performing the access are used to resolve the access. Other caches are neither examined nor modified.

Cached Coherent
Physical memory and all caches in the system containing a coherent copy of the physical location are used to resolve the access. A copy of a location is coherent (noncoherent) if the copy was placed in the cache by a cached coherent (cached noncoherent) access. Caches containing a coherent copy of the location are examined and/or modified to keep the contents of the location coherent. It is unpredictable whether caches holding a noncoherent copy of the location are examined and/or modified during a cached coherent access.

Cached
For early 32-bit processors without MP support, cached is equivalent to cached noncoherent. If an instruction description mentions the cached noncoherent access type, the comment applies equally to the cached access type in a processor that has the cached access type.
For processors with MP support, cached is a collective term, e.g. “cached memory” or “cached access”, that includes both cached noncoherent and cached coherent. Such a collective use does not imply that cached is an access type; it means that the statement applies equally to cached noncoherent and cached coherent access types.

A 3.1 Mixing References with Different Access Types

It is possible to have more than one virtual location simultaneously mapped to the same physical location. The memory access type used for the virtual mappings may be different, but it is not generally possible to use mappings with different access types at the same time.

A processor executing load and store instructions must observe the effect of the load and store instructions to a physical location in the order that they occur in the instruction stream (i.e. program order) for all accesses to virtual locations with the same memory access type.

If a processor executes a load or store using one access type to a physical location, the behavior of a subsequent load or store to the same location using a different memory access type is undefined unless a privileged instruction sequence is executed between the two accesses. Each implementation has a privileged implementation-specific mechanism that must be used to change the access type being used to access a location.

The memory access type of a location affects the behavior of I-fetch, load, store, and prefetch operations to the location. In addition, memory access types affect some instruction descriptions. Load linked (LL, LLD) and store conditional (SC, SCD) have defined operation only for locations with cached memory access type. SYNC affects only loads and stores made to locations with uncached or cached coherent memory access types.
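The rules in the last paragraph of A 3.1 can be restated as a small lookup. The Python sketch below is only a reading aid: the enum and function names are hypothetical, and it encodes nothing beyond the three access types and the two statements about LL/SC and SYNC made above.

```python
from enum import Enum


class AccessType(Enum):
    UNCACHED = "uncached"
    CACHED_NONCOHERENT = "cached noncoherent"
    CACHED_COHERENT = "cached coherent"


def is_cached(access: AccessType) -> bool:
    """'Cached' is the collective term for both cached access types."""
    return access in (AccessType.CACHED_NONCOHERENT, AccessType.CACHED_COHERENT)


def ll_sc_defined(access: AccessType) -> bool:
    """LL/LLD and SC/SCD have defined operation only for cached locations."""
    return is_cached(access)


def sync_applies(access: AccessType) -> bool:
    """SYNC affects only uncached and cached coherent loads and stores."""
    return access in (AccessType.UNCACHED, AccessType.CACHED_COHERENT)


if __name__ == "__main__":
    for access in AccessType:
        print(access.value, ll_sc_defined(access), sync_applies(access))
```

Read this way, cached coherent is the only access type for which both the atomic update instructions and SYNC are meaningful.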
A 3.2 Cache Coherence Algorithms and Access Types

The memory access types are specified by implementation-specific cache coherence algorithms (CCAs) in TLB entries. Slightly different cache coherence algorithms such as “cached coherent, update on write” and “cached coherent, exclusive on write” can map to the same memory access type; in this case they both map to cached coherent. In order to map to the same access type the fundamental mechanism of both CCAs must be the same.

When it affects the operation of the instruction, the instructions are described in terms of the memory access types. The load and store operations in a processor proceed according to the specific CCA of the reference, however, and the pseudocode for the load and store common functions in the section Load and Store Memory Functions on page A-21 uses the CCA value rather than the corresponding memory access type.

A 3.3 Implementation-Specific Access Types

An implementation may provide memory access types other than uncached, cached noncoherent, or cached coherent. Implementation-specific documentation will define the properties of the new access types and their effect on all memory-related operations.

A 4 Description of an Instruction

The CPU instructions are described in alphabetic order. Each description contains several sections that contain specific information about the instruction. The content of each section is described in detail below. An example description is shown in Figure A-1.

Figure A-1 Example Instruction Description

[Figure: an annotated example instruction page. Its callouts identify the following parts.]
• Instruction mnemonic and descriptive name
• Instruction encoding constant and variable field names and values
• Architecture level at which instruction was defined/redefined and assembler format(s) for each definition
• Short description
• Symbolic description
• Full description of instruction operation
• Restrictions on instruction and operands
• High-level language description of instruction operation
• Exceptions that instruction can cause
• Notes for programmers
• Notes for implementors

A 4.1 Instruction mnemonic and name

The instruction mnemonic and name are printed as page headings for each page in the instruction description.

A 4.2 Instruction encoding picture

The instruction word encoding is shown in pictorial form at the top of the instruction description. This picture shows the values of all constant fields and the opcode names for opcode fields in upper-case. It labels all variable fields with lower-case names that are used in the instruction description. Fields that contain zeroes but are not named are unused fields that are required to be zero. A summary of the instruction formats and a definition of the terms used to describe the contents can be found in CPU Instruction Formats on page A-174.

A 4.3 Format

The assembler formats for the instruction and the architecture level at which the instruction was originally defined are shown. If the instruction definition was later extended, the architecture levels at which it was extended and the assembler formats for the extended definition are shown in order of extension. The MIPS architecture levels are inclusive; higher architecture levels include all instructions in previous levels. Extensions to instructions are backwards compatible. The original assembler formats are valid for the extended architecture.

The assembler format is shown with literal parts of the assembler instruction in upper-case characters. The variable parts, the operands, are shown as the lower-case names of the appropriate fields in the instruction encoding picture.
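To make the idea of constant and variable fields in an encoding picture concrete, the Python sketch below splits a 32-bit instruction word into fields. It assumes the usual MIPS register-format layout (6-bit opcode, three 5-bit register fields, a 5-bit shift amount, and a 6-bit function code), which this section refers to but does not reproduce; the class and function names are hypothetical.

```python
from typing import NamedTuple


class RType(NamedTuple):
    opcode: int   # bits 31..26, constant field (0 = SPECIAL)
    rs: int       # bits 25..21, variable field
    rt: int       # bits 20..16, variable field
    rd: int       # bits 15..11, variable field
    sa: int       # bits 10..6, shift amount (zero when unused)
    funct: int    # bits 5..0, constant field selecting the operation


def bits(word: int, hi: int, lo: int) -> int:
    """Select bits hi..lo of word, with bit 0 as the rightmost bit."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def decode_rtype(word: int) -> RType:
    """Split a 32-bit instruction word into register-format fields."""
    return RType(bits(word, 31, 26), bits(word, 25, 21), bits(word, 20, 16),
                 bits(word, 15, 11), bits(word, 10, 6), bits(word, 5, 0))


def encode_rtype(f: RType) -> int:
    """Pack register-format fields back into a 32-bit instruction word."""
    return ((f.opcode << 26) | (f.rs << 21) | (f.rt << 16)
            | (f.rd << 11) | (f.sa << 6) | f.funct)


if __name__ == "__main__":
    # ADD rd=3, rs=1, rt=2: opcode SPECIAL (0), function code 0x20
    print(decode_rtype(0x00221820))
```

Decoding the word 0x00221820 gives rs = 1, rt = 2, rd = 3 with opcode 0 and function code 0x20, which corresponds to the assembler format ADD rd, rs, rt.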
The architecture level at which the instruction was first defined, e.g. “MIPS I”, is shown at the right side of the page.

There can be more than one assembler format per architecture level. This is sometimes an alternate form of the instruction. Floating-point operations on formatted data show an assembly format with the actual assembler mnemonic for each valid value of the “fmt” field. For example the ADD.fmt instruction shows ADD.S and ADD.D.

The assembler format lines sometimes have comments to the right in parentheses to help explain variations in the formats. The comments are not a part of the assembler format.

A 4.4 Purpose

This is a very short statement of the purpose of the instruction.

A 4.5 Description

If a one-line symbolic description of the instruction is feasible, it will appear immediately to the right of the Description heading. The main purpose is to show how fields in the instruction are used in the arithmetic or logical operation.

The body of the section is a description of the operation of the instruction in text, tables, and figures. This description complements the high-level language description in the Operation section.

This section uses acronyms for register descriptions. “GPR rt” is CPU General Purpose Register specified by the instruction field rt. “FPR fs” is the Floating Point Operand Register specified by the instruction field fs. “CP1 register fd” is the coprocessor 1 General Register specified by the instruction field fd. “FCSR” is the floating-point control and status register.

A 4.6 Restrictions

This section documents the restrictions on the instruction. Most restrictions fall into one of six categories:
• The valid values for instruction fields (see floating-point ADD.fmt).
• The alignment requirements for memory addresses (see LW).
• The valid values of operands (see DADD).
• The valid operand formats (see floating-point ADD.fmt).
• The order of instructions necessary to guarantee correct execution. These ordering constraints avoid pipeline hazards for which some processors do not have hardware interlocks (see MULT).
• The valid memory access types (see LL/SC).

A 4.7 Operation

This section describes the operation of the instruction as pseudocode in a high-level language notation resembling Pascal. The purpose of this section is to describe the operation of the instruction clearly in a form with less ambiguity than prose. This formal description complements the Description section; it is not complete in itself because many of the restrictions are either difficult to include in the pseudocode or omitted for readability.

There will be separate Operation sections for 32-bit and 64-bit processors if the operation is different. This is usually necessary because the path to memory is a different size on these processors.

See Operation Section Notation and Functions on page A-18 for more information on the formal notation.

A 4.8 Exceptions

This section lists the exceptions that can be caused by operation of the instruction. It omits exceptions that can be caused by instruction fetch, e.g. TLB Refill. It omits exceptions that can be caused by asynchronous external events, e.g. Interrupt. Although the Bus Error exception may be caused by the operation of a load or store instruction, this section does not list Bus Error for load and store instructions because the relationship between load and store instructions and external error indications, like Bus Error, is implementation dependent.

Reserved Instruction is listed for every instruction not in MIPS I because the instruction will cause this exception on a MIPS I processor. To execute a MIPS II, MIPS III, or MIPS IV instruction, the processor must both support the architecture level and have it enabled. The mechanism to do this is implementation specific.
The mechanism used to signal a floating-point unit (FPU) exception is implementation specific. Some implementations use the exception named “Floating Point”. Others use external interrupts (the Interrupt exception). This section lists Floating Point to represent all such mechanisms. The specific FPU traps possible are listed, indented, under the Floating Point entry.

The usual floating-point exception model for MIPS architecture processors is precise exceptions. However, the R8000 processor, the first implementation of the MIPS IV architecture, normally operates with imprecise floating-point exceptions. It also has a mode in which it operates with degraded floating-point performance but provides precise exceptions compatible with other MIPS processors. This is mentioned in the description of some floating-point instructions. A general description of this exception model is not included in this document. See the “MIPS R8000 Microprocessor Chip Set Users Manual” for more information.

An instruction may cause implementation-dependent exceptions that are not present in the Exceptions section.

A 4.9 Programming Notes, Implementation Notes

These sections contain material that is useful for programmers and implementors respectively but that is not necessary to describe the instruction and does not belong in the description sections.

A 5 Operation Section Notation and Functions

In an instruction description, the Operation section describes the operation performed by each instruction using a high-level language notation. The contents of the Operation section are described here. The special symbols and functions used are documented here.

A 5.1 Pseudocode Language

Each of the high-level language statements is executed in sequential order (as modified by conditional and loop constructs).

A 5.2 Pseudocode Symbols

Special symbols used in the notation are described in Table A-25.
Table A-25 Symbols in Instruction Operation Statements

Symbol          Meaning
←               Assignment.
=, ≠            Tests for equality and inequality.
||              Bit string concatenation.
x^y             A y-bit string formed by y copies of the single-bit value x (y is written as a superscript).
x_y..z          Selection of bits y through z of bit string x (y..z is written as a subscript). Little-endian bit notation (rightmost bit is 0) is used. If y is less than z, this expression is an empty (zero length) bit string.
+, -            2’s complement or floating-point arithmetic: addition, subtraction.
*, ×            2’s complement or floating-point multiplication (both used for either).
div             2’s complement integer division.
mod             2’s complement modulo.
/               Floating-point division.
<               2’s complement less than comparison.
nor             Bit-wise logical NOR.
xor             Bit-wise logical XOR.
and             Bit-wise logical AND.
or              Bit-wise logical OR.
GPRLEN          The length in bits (32 or 64), of the CPU General Purpose Registers.
GPR[x]          CPU General Purpose Register x. The content of GPR[0] is always zero.
FPR[x]          Floating-Point operand register x.
FCC[cc]         Floating-Point condition code cc. FCC[0] has the same value as COC[1].
FGR[x]          Floating-Point (Coprocessor unit 1), general register x.
CPR[z,x]        Coprocessor unit z, general register x.
CCR[z,x]        Coprocessor unit z, control register x.
COC[z]          Coprocessor unit z condition signal.
BigEndianMem    Endian mode as configured at chip reset (0 → Little, 1 → Big). Specifies the endianness of the memory interface (see LoadMemory and StoreMemory), and the endianness of Kernel and Supervisor mode execution.
ReverseEndian   Signal to reverse the endianness of load and store instructions. This feature is available in User mode only, and is effected by setting the RE bit of the Status register. Thus, ReverseEndian may be computed as (SR_RE and User mode).
BigEndianCPU    The endianness for load and store instructions (0 → Little, 1 → Big). In User mode, this endianness may be switched by setting the RE bit in the Status Register. Thus, BigEndianCPU may be computed as (BigEndianMem XOR ReverseEndian).
LLbit           Bit of virtual state used to specify operation for instructions that provide atomic read-modify-write. It is set when a linked load occurs. It is tested and cleared by the conditional store. It is cleared, during other CPU operation, when a store to the location would no longer be atomic. In particular, it is cleared by exception return instructions.
I:, I+n:, I-n:  This occurs as a prefix to operation description lines and functions as a label. It indicates the instruction time during which the effects of the pseudocode lines appear to occur (i.e. when the pseudocode is “executed”). Unless otherwise indicated, all effects of the current instruction appear to occur during the instruction time of the current instruction. No label is equivalent to a time label of “I:”. Sometimes effects of an instruction appear to occur either earlier or later, during the instruction time of another instruction. When that happens, the instruction operation is written in sections labelled with the instruction time, relative to the current instruction I, in which the effect of that pseudocode appears to occur. For example, an instruction may have a result that is not available until after the next instruction. Such an instruction will have the portion of the instruction operation description that writes the result register in a section labelled “I+1:”. The effect of pseudocode statements for the current instruction labelled “I+1:” appears to occur “at the same time” as the effect of pseudocode statements labelled “I:” for the following instruction. Within one pseudocode sequence the effects of the statements take place in order. However, between sequences of statements for different instructions that occur “at the same time”, there is no order defined. Programs must not depend on a particular order of evaluation between such sections.
PC              The Program Counter value. During the instruction time of an instruction this is the address of the instruction word. The address of the instruction that occurs during the next instruction time is determined by assigning a value to PC during an instruction time. If no value is assigned to PC during an instruction time by any pseudocode statement, it is automatically incremented by 4 before the next instruction time. A taken branch assigns the target address to PC during the instruction time of the instruction in the branch delay slot.
PSIZE           The SIZE, number of bits, of Physical address in an implementation.

A 5.3 Pseudocode Functions

There are several functions used in the pseudocode descriptions. These are used either to make the pseudocode more readable, to abstract implementation specific behavior, or both. The functions are defined in this section.

A 5.3.1 Coprocessor General Register Access Functions

Defined coprocessors, except for CP0, have instructions to exchange words and doublewords between coprocessor general registers and the rest of the system. What a coprocessor does with a word or doubleword supplied to it and how a coprocessor supplies a word or doubleword is defined by the coprocessor itself. This behavior is abstracted into functions:

Table A-26 Coprocessor General Register Access Functions

COP_LW (z, rt, memword)
    z:          The coprocessor unit number.
    rt:         Coprocessor general register specifier.
    memword:    A 32-bit word value supplied to the coprocessor.
    This is the action taken by coprocessor z when supplied with a word from memory during a load word operation. The action is coprocessor specific. The typical action would be to store the contents of memword in coprocessor general register rt.

COP_LD (z, rt, memdouble)
    z:          The coprocessor unit number.
    rt:         Coprocessor general register specifier.
    memdouble:  64-bit doubleword value supplied to the coprocessor.
    This is the action taken by coprocessor z when supplied with a doubleword from memory during a load doubleword operation. The action is coprocessor specific. The typical action would be to store the contents of memdouble in coprocessor general register rt.

dataword ← COP_SW (z, rt)
    z:          The coprocessor unit number.
    rt:         Coprocessor general register specifier.
    dataword:   32-bit word value.
    This defines the action taken by coprocessor z to supply a word of data during a store word operation. The action is coprocessor specific. The typical action would be to supply the contents of the low-order word in coprocessor general register rt.

datadouble ← COP_SD (z, rt)
    z:          The coprocessor unit number.
    rt:         Coprocessor general register specifier.
    datadouble: 64-bit doubleword value.
    This defines the action taken by coprocessor z to supply a doubleword of data during a store doubleword operation. The action is coprocessor specific. The typical action would be to supply the contents of the doubleword in coprocessor general register rt.

A 5.3.2 Load and Store Memory Functions

Regardless of byte ordering (big- or little-endian), the address of a halfword, word, or doubleword is the smallest byte address among the bytes forming the object. For big-endian ordering this is the most-significant byte; for a little-endian ordering this is the least-significant byte.

In the operation description pseudocode for load and store operations, the functions shown below are used to summarize the handling of virtual addresses and accessing physical memory. The size of the data item to be loaded or stored is passed in the AccessLength field. The valid constant names and values are shown in Table A-27. The bytes within the addressed unit of memory (word for 32-bit processors or doubleword for 64-bit processors) which are used can be determined directly from the AccessLength and the two or three low-order bits of the address.
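The byte-ordering rule stated at the start of A 5.3.2, and the BigEndianCPU relation from Table A-25, can be checked with a short Python sketch. The byte-array memory model and the function names are assumptions made for illustration; they are not the LoadMemory or StoreMemory functions of this document.

```python
def store_object(memory: bytearray, addr: int, value: int, size: int,
                 big_endian: bool) -> None:
    """Place a size-byte object at addr, the smallest address of its bytes."""
    order = "big" if big_endian else "little"
    memory[addr:addr + size] = value.to_bytes(size, order)


def byte_at_object_address(value: int, size: int, big_endian: bool) -> int:
    """Return the byte stored at the object's own address."""
    memory = bytearray(16)
    store_object(memory, 8, value, size, big_endian)
    return memory[8]


def big_endian_cpu(big_endian_mem: int, sr_re: int, user_mode: bool) -> int:
    """BigEndianCPU = BigEndianMem XOR ReverseEndian (Table A-25)."""
    reverse_endian = sr_re & int(user_mode)   # SR_RE and User mode
    return big_endian_mem ^ reverse_endian


if __name__ == "__main__":
    word = 0x11223344
    print(hex(byte_at_object_address(word, 4, big_endian=True)))
    print(hex(byte_at_object_address(word, 4, big_endian=False)))
```

For the word 0x11223344 the byte at the object's address is 0x11 under big-endian ordering (the most-significant byte) and 0x44 under little-endian ordering (the least-significant byte), matching the rule above.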
(pAddr, CCA) ← AddressTranslation (vAddr, IorD, LorS)
pAddr: Physical Address.
CCA: Cache Coherence Algorithm: the method used to access caches and memory and resolve the reference.
vAddr: Virtual Address.
IorD: Indicates whether access is for INSTRUCTION or DATA.
LorS: Indicates whether access is for LOAD or STORE.
Translate a virtual address to a physical address and a cache coherence algorithm describing the mechanism used to resolve the memory reference. Given the virtual address vAddr, and whether the reference is to Instructions or Data (IorD), find the corresponding physical address (pAddr) and the cache coherence algorithm (CCA) used to resolve the reference. If the virtual address is in one of the unmapped address spaces the physical address and CCA are determined directly by the virtual address. If the virtual address is in one of the mapped address spaces then the TLB is used to determine the physical address and access type; if the required translation is not present in the TLB or the desired access is not permitted the function fails and an exception is taken.
MemElem ← LoadMemory (CCA, AccessLength, pAddr, vAddr, IorD)
MemElem: Data is returned in a fixed width with a natural alignment. The width is the same size as the CPU general purpose register, 32 or 64 bits, aligned on a 32 or 64-bit boundary respectively.
CCA: Cache Coherence Algorithm: the method used to access caches and memory and resolve the reference.
AccessLength: Length, in bytes, of access.
pAddr: Physical Address.
vAddr: Virtual Address.
IorD: Indicates whether access is for Instructions or Data.
Load a value from memory. Uses the cache and main memory as specified in the Cache Coherence Algorithm (CCA) and the sort of access (IorD) to find the contents of AccessLength memory bytes starting at physical location pAddr. The data is returned in the fixed width naturally-aligned memory element (MemElem).
The low-order two (or three) bits of the address and the AccessLength indicate which of the bytes within MemElem need to be given to the processor. If the memory access type of the reference is uncached then only the referenced bytes are read from memory and valid within the memory element. If the access type is cached, and the data is not present in cache, an implementation specific size and alignment block of memory is read and loaded into the cache to satisfy a load reference. At a minimum, the block is the entire memory element.
StoreMemory (CCA, AccessLength, MemElem, pAddr, vAddr)
CCA: Cache Coherence Algorithm: the method used to access caches and memory and resolve the reference.
AccessLength: Length, in bytes, of access.
MemElem: Data in the width and alignment of a memory element. The width is the same size as the CPU general purpose register, 4 or 8 bytes, aligned on a 4 or 8-byte boundary. For a partial-memory-element store, only the bytes that will be stored must be valid.
pAddr: Physical Address.
vAddr: Virtual Address.
Store a value to memory. The specified data is stored into the physical location pAddr using the memory hierarchy (data caches and main memory) as specified by the Cache Coherence Algorithm (CCA). The MemElem contains the data for an aligned, fixed-width memory element (word for 32-bit processors, doubleword for 64-bit processors), though only the bytes that will actually be stored to memory need to be valid. The low-order two (or three) bits of pAddr and the AccessLength field indicate which of the bytes within the MemElem data should actually be stored; only these bytes in memory will be changed.
Prefetch (CCA, pAddr, vAddr, DATA, hint)
CCA: Cache Coherence Algorithm: the method used to access caches and memory and resolve the reference.
pAddr: Physical Address.
vAddr: Virtual Address.
DATA: Indicates that access is for DATA.
hint: hint that indicates the possible use of the data.
Prefetch data from memory. Prefetch is an advisory instruction for which an implementation specific action is taken. The action taken may increase performance but must not change the meaning of the program or alter architecturally-visible state.
Table A-27 AccessLength Specifications for Loads/Stores
AccessLength Name   Value   Meaning
DOUBLEWORD          7       8 bytes (64 bits)
SEPTIBYTE           6       7 bytes (56 bits)
SEXTIBYTE           5       6 bytes (48 bits)
QUINTIBYTE          4       5 bytes (40 bits)
WORD                3       4 bytes (32 bits)
TRIPLEBYTE          2       3 bytes (24 bits)
HALFWORD            1       2 bytes (16 bits)
BYTE                0       1 byte (8 bits)
A 5.3.3 Access Functions for Floating-Point Registers
The details of the relationship between CP1 general registers and floating-point operand registers are encapsulated in the functions included in this section. See Valid Operands for FP Instructions on page B-24 for more information.
This function returns the current logical width, in bits, of the CP1 general registers. All 32-bit processors will return “32”. 64-bit processors will return “32” when in 32-bit-CP1-register emulation mode and “64” when in native 64-bit mode.
The following pseudocode referring to the StatusFR bit is valid for all existing MIPS 64-bit processors at the time of this writing; however, this is a privileged processor-specific mechanism and it may be different in some future processor.
SizeFGR() -- current size, in bits, of the CP1 general registers
size ← SizeFGR()
    if 32_bit_processor then
        size ← 32
    else /* 64-bit processor */
        if StatusFR = 1 then
            size ← 64
        else
            size ← 32
        endif
    endif
This pseudocode specifies how the unformatted contents loaded or moved-to CP1 registers are interpreted to form a formatted value.
If an FPR contains a value in some format, rather than unformatted contents from a load (uninterpreted), it is valid to interpret the value in that format, but not to interpret it in a different format.
ValueFPR() -- Get a formatted value from an FPR.
value ← ValueFPR (fpr, fmt) /* get a formatted value from an FPR */
    if SizeFGR() = 64 then
        case fmt of
            S, W: value ← FGR[fpr]_{31..0}
            D, L: value ← FGR[fpr]
        endcase
    elseif fpr_0 = 0 then /* fpr is valid (even), 32-bit wide FGRs */
        case fmt of
            S, W: value ← FGR[fpr]
            D, L: value ← FGR[fpr+1] || FGR[fpr]
        endcase
    else /* undefined for odd 32-bit FGRs */
        UndefinedResult
    endif
This pseudocode specifies the way that a binary encoding representing a formatted value is stored into CP1 registers by a computational or move operation. This binary representation is visible to store or move-from instructions. Once an FPR contains a value via StoreFPR(), it is not valid to interpret the value with ValueFPR() in a different format.
StoreFPR() -- store a formatted value into an FPR.
StoreFPR(fpr, fmt, value): /* place a formatted value into an FPR */
    if SizeFGR() = 64 then /* 64-bit wide FGRs */
        case fmt of
            S, W: FGR[fpr] ← undefined^32 || value
            D, L: FGR[fpr] ← value
        endcase
    elseif fpr_0 = 0 then /* fpr is valid (even), 32-bit wide FGRs */
        case fmt of
            S, W: FGR[fpr+1] ← undefined^32
                  FGR[fpr] ← value
            D, L: FGR[fpr+1] ← value_{63..32}
                  FGR[fpr] ← value_{31..0}
        endcase
    else /* undefined for odd 32-bit FGRs */
        UndefinedResult
    endif
A 5.3.4 Miscellaneous Functions
SyncOperation(stype)
stype: Type of load/store ordering to perform.
Order loads and stores to synchronize shared memory. Perform the action necessary to make the effects of groups of synchronizable loads and stores indicated by stype occur in the same order for all processors.
SignalException(Exception)
Exception: The exception condition that exists.
Signal an exception condition.
This will result in an exception that aborts the instruction. The instruction operation pseudocode will never see a return from this function call.
UndefinedResult()
This function indicates that the result of the operation is undefined.
NullifyCurrentInstruction()
Nullify the current instruction. This occurs during the instruction time for some instruction and that instruction is not executed further. This appears for branch-likely instructions during the execution of the instruction in the delay slot and it kills the instruction in the delay slot.
CoprocessorOperation (z, cop_fun)
z: Coprocessor unit number
cop_fun: Coprocessor function from function field of instruction
Perform the specified Coprocessor operation.
A 6 Individual CPU Instruction Descriptions
The user-mode CPU instructions are described in alphabetic order. See Description of an Instruction on page A-15 for a description of the information in each instruction description.
ADD Add Word
Format: ADD rd, rs, rt MIPS I
Purpose: To add 32-bit integers. If overflow occurs, then trap.
Description: rd ← rs + rt
The 32-bit word value in GPR rt is added to the 32-bit value in GPR rs to produce a 32-bit result. If the addition results in 32-bit 2’s complement arithmetic overflow then the destination register is not modified and an Integer Overflow exception occurs. If it does not overflow, the 32-bit result is placed into GPR rd.
Restrictions:
On 64-bit processors, if either GPR rt or GPR rs do not contain sign-extended 32-bit values (bits 63..31 equal), then the result of the operation is undefined.
Operation:
if (NotWordValue(GPR[rs]) or NotWordValue(GPR[rt])) then UndefinedResult() endif
temp ← GPR[rs] + GPR[rt]
if (32_bit_arithmetic_overflow) then
    SignalException(IntegerOverflow)
else
    GPR[rd] ← sign_extend(temp_{31..0})
endif
Exceptions:
Integer Overflow
Programming Notes:
ADDU performs the same arithmetic operation but does not trap on overflow.
Encoding:
Bits     Field     Width   Value
31..26   SPECIAL   6       000000
25..21   rs        5
20..16   rt        5
15..11   rd        5
10..6    0         5       00000
5..0     ADD       6       100000
ADDI Add Immediate Word
Format: ADDI rt, rs, immediate MIPS I
Purpose: To add a constant to a 32-bit integer. If overflow occurs, then trap.
Description: rt ← rs + immediate
The 16-bit signed immediate is added to the 32-bit value in GPR rs to produce a 32-bit result. If the addition results in 32-bit 2’s complement arithmetic overflow then the destination register is not modified and an Integer Overflow exception occurs. If it does not overflow, the 32-bit result is placed into GPR rt.
Restrictions:
On 64-bit processors, if GPR rs does not contain a sign-extended 32-bit value (bits 63..31 equal), then the result of the operation is undefined.
Operation:
if (NotWordValue(GPR[rs])) then UndefinedResult() endif
temp ← GPR[rs] + sign_extend(immediate)
if (32_bit_arithmetic_overflow) then
    SignalException(IntegerOverflow)
else
    GPR[rt] ← sign_extend(temp_{31..0})
endif
Exceptions:
Integer Overflow
Programming Notes:
ADDIU performs the same arithmetic operation but does not trap on overflow.
Encoding:
Bits     Field       Width   Value
31..26   ADDI        6       001000
25..21   rs          5
20..16   rt          5
15..0    immediate   16
ADDIU Add Immediate Unsigned Word
Format: ADDIU rt, rs, immediate MIPS I
Purpose: To add a constant to a 32-bit integer.
Description: rt ← rs + immediate
The 16-bit signed immediate is added to the 32-bit value in GPR rs and the 32-bit arithmetic result is placed into GPR rt. No Integer Overflow exception occurs under any circumstances.
Restrictions:
On 64-bit processors, if GPR rs does not contain a sign-extended 32-bit value (bits 63..31 equal), then the result of the operation is undefined.
Operation:
if (NotWordValue(GPR[rs])) then UndefinedResult() endif
temp ← GPR[rs] + sign_extend(immediate)
GPR[rt] ← sign_extend(temp_{31..0})
Exceptions:
None
Programming Notes:
The term “unsigned” in the instruction name is a misnomer; this operation is 32-bit modulo arithmetic that does not trap on overflow. It is appropriate for arithmetic which is not signed, such as address arithmetic, or integer arithmetic environments that ignore overflow, such as “C” language arithmetic.
Encoding:
Bits     Field       Width   Value
31..26   ADDIU       6       001001
25..21   rs          5
20..16   rt          5
15..0    immediate   16
ADDU Add Unsigned Word
Format: ADDU rd, rs, rt MIPS I
Purpose: To add 32-bit integers.
Description: rd ← rs + rt
The 32-bit word value in GPR rt is added to the 32-bit value in GPR rs and the 32-bit arithmetic result is placed into GPR rd. No Integer Overflow exception occurs under any circumstances.
Restrictions:
On 64-bit processors, if either GPR rt or GPR rs do not contain sign-extended 32-bit values (bits 63..31 equal), then the result of the operation is undefined.
Operation:
if (NotWordValue(GPR[rs]) or NotWordValue(GPR[rt])) then UndefinedResult() endif
temp ← GPR[rs] + GPR[rt]
GPR[rd] ← sign_extend(temp_{31..0})
Exceptions:
None
Programming Notes:
The term “unsigned” in the instruction name is a misnomer; this operation is 32-bit modulo arithmetic that does not trap on overflow. It is appropriate for arithmetic which is not signed, such as address arithmetic, or integer arithmetic environments that ignore overflow, such as “C” language arithmetic.
Encoding:
Bits     Field     Width   Value
31..26   SPECIAL   6       000000
25..21   rs        5
20..16   rt        5
15..11   rd        5
10..6    0         5       00000
5..0     ADDU      6       100001
AND And
Format: AND rd, rs, rt MIPS I
Purpose: To do a bitwise logical AND.
Description: rd ← rs AND rt
The contents of GPR rs are combined with the contents of GPR rt in a bitwise logical AND operation. The result is placed into GPR rd.
Restrictions:
None
Operation:
GPR[rd] ← GPR[rs] and GPR[rt]
Exceptions:
None
Encoding:
Bits     Field     Width   Value
31..26   SPECIAL   6       000000
25..21   rs        5
20..16   rt        5
15..11   rd        5
10..6    0         5       00000
5..0     AND       6       100100
ANDI And Immediate
Format: ANDI rt, rs, immediate MIPS I
Purpose: To do a bitwise logical AND with a constant.
Description: rt ← rs AND immediate
The 16-bit immediate is zero-extended to the left and combined with the contents of GPR rs in a bitwise logical AND operation. The result is placed into GPR rt.
Restrictions:
None
Operation:
GPR[rt] ← zero_extend(immediate) and GPR[rs]
Exceptions:
None
Encoding:
Bits     Field       Width   Value
31..26   ANDI        6       001100
25..21   rs          5
20..16   rt          5
15..0    immediate   16
BEQ Branch on Equal
Format: BEQ rs, rt, offset MIPS I
Purpose: To compare GPRs then do a PC-relative conditional branch.
Description: if (rs = rt) then branch
An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself), in the branch delay slot, to form a PC-relative effective target address.
If the contents of GPR rs and GPR rt are equal, branch to the effective target address after the instruction in the delay slot is executed.
Restrictions:
None
Operation:
I :     tgt_offset ← sign_extend(offset || 0^2)
        condition ← (GPR[rs] = GPR[rt])
I + 1 : if condition then
            PC ← PC + tgt_offset
        endif
Exceptions:
None
Programming Notes:
With the 18-bit signed instruction offset, the conditional branch range is ± 128 KBytes. Use jump (J) or jump register (JR) instructions to branch to more distant addresses.
Encoding:
Bits     Field    Width   Value
31..26   BEQ      6       000100
25..21   rs       5
20..16   rt       5
15..0    offset   16
BEQL Branch on Equal Likely
Format: BEQL rs, rt, offset MIPS II
Purpose: To compare GPRs then do a PC-relative conditional branch; execute the delay slot only if the branch is taken.
Description: if (rs = rt) then branch_likely
An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself), in the branch delay slot, to form a PC-relative effective target address.
If the contents of GPR rs and GPR rt are equal, branch to the target address after the instruction in the delay slot is executed. If the branch is not taken, the instruction in the delay slot is not executed.
Restrictions:
None
Operation:
I :     tgt_offset ← sign_extend(offset || 0^2)
        condition ← (GPR[rs] = GPR[rt])
I + 1 : if condition then
            PC ← PC + tgt_offset
        else
            NullifyCurrentInstruction()
        endif
Exceptions:
Reserved Instruction
Programming Notes:
With the 18-bit signed instruction offset, the conditional branch range is ± 128 KBytes. Use jump (J) or jump register (JR) instructions to branch to more distant addresses.
Encoding:
Bits     Field    Width   Value
31..26   BEQL     6       010100
25..21   rs       5
20..16   rt       5
15..0    offset   16
BGEZ Branch on Greater Than or Equal to Zero
Format: BGEZ rs, offset MIPS I
Purpose: To test a GPR then do a PC-relative conditional branch.
Description: if (rs ≥ 0) then branch
An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself), in the branch delay slot, to form a PC-relative effective target address.
If the contents of GPR rs are greater than or equal to zero (sign bit is 0), branch to the effective target address after the instruction in the delay slot is executed.
Restrictions:
None
Operation:
I :     tgt_offset ← sign_extend(offset || 0^2)
        condition ← GPR[rs] ≥ 0^GPRLEN
I + 1 : if condition then
            PC ← PC + tgt_offset
        endif
Exceptions:
None
Programming Notes:
With the 18-bit signed instruction offset, the conditional branch range is ± 128 KBytes. Use jump (J) or jump register (JR) instructions to branch to more distant addresses.
Encoding:
Bits     Field    Width   Value
31..26   REGIMM   6       000001
25..21   rs       5
20..16   BGEZ     5       00001
15..0    offset   16
BGEZAL Branch on Greater Than or Equal to Zero and Link
Format: BGEZAL rs, offset MIPS I
Purpose: To test a GPR then do a PC-relative conditional procedure call.
Description: if (rs ≥ 0) then procedure_call
Place the return address link in GPR 31. The return link is the address of the second instruction following the branch, where execution would continue after a procedure call.
An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself), in the branch delay slot, to form a PC-relative effective target address.
If the contents of GPR rs are greater than or equal to zero (sign bit is 0), branch to the effective target address after the instruction in the delay slot is executed.
Restrictions:
GPR 31 must not be used for the source register rs, because such an instruction does not have the same effect when re-executed. The result of executing such an instruction is undefined. This restriction permits an exception handler to resume execution by re-executing the branch when an exception occurs in the branch delay slot.
Operation:
I :     tgt_offset ← sign_extend(offset || 0^2)
        condition ← GPR[rs] ≥ 0^GPRLEN
        GPR[31] ← PC + 8
I + 1 : if condition then
            PC ← PC + tgt_offset
        endif
Exceptions:
None
Programming Notes:
With the 18-bit signed instruction offset, the conditional branch range is ± 128 KBytes.
Use jump and link (JAL) or jump and link register (JALR) instructions for procedure calls to more distant addresses.
Encoding:
Bits     Field    Width   Value
31..26   REGIMM   6       000001
25..21   rs       5
20..16   BGEZAL   5       10001
15..0    offset   16
BGEZALL Branch on Greater Than or Equal to Zero and Link Likely
Format: BGEZALL rs, offset MIPS II
Purpose: To test a GPR then do a PC-relative conditional procedure call; execute the delay slot only if the branch is taken.
Description: if (rs ≥ 0) then procedure_call_likely
Place the return address link in GPR 31. The return link is the address of the second instruction following the branch, where execution would continue after a procedure call.
An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself), in the branch delay slot, to form a PC-relative effective target address.
If the contents of GPR rs are greater than or equal to zero (sign bit is 0), branch to the effective target address after the instruction in the delay slot is executed. If the branch is not taken, the instruction in the delay slot is not executed.
Restrictions:
GPR 31 must not be used for the source register rs, because such an instruction does not have the same effect when re-executed. The result of executing such an instruction is undefined. This restriction permits an exception handler to resume execution by re-executing the branch when an exception occurs in the branch delay slot.
Operation:
I :     tgt_offset ← sign_extend(offset || 0^2)
        condition ← GPR[rs] ≥ 0^GPRLEN
        GPR[31] ← PC + 8
I + 1 : if condition then
            PC ← PC + tgt_offset
        else
            NullifyCurrentInstruction()
        endif
Exceptions:
Reserved Instruction
Programming Notes:
With the 18-bit signed instruction offset, the conditional branch range is ± 128 KBytes. Use jump and link (JAL) or jump and link register (JALR) instructions for procedure calls to more distant addresses.
Encoding:
Bits     Field     Width   Value
31..26   REGIMM    6       000001
25..21   rs        5
20..16   BGEZALL   5       10011
15..0    offset    16
BGEZL Branch on Greater Than or Equal to Zero Likely
Format: BGEZL rs, offset MIPS II
Purpose: To test a GPR then do a PC-relative conditional branch; execute the delay slot only if the branch is taken.
Description: if (rs ≥ 0) then branch_likely
An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself), in the branch delay slot, to form a PC-relative effective target address.
If the contents of GPR rs are greater than or equal to zero (sign bit is 0), branch to the effective target address after the instruction in the delay slot is executed. If the branch is not taken, the instruction in the delay slot is not executed.
Restrictions:
None
Operation:
I :     tgt_offset ← sign_extend(offset || 0^2)
        condition ← GPR[rs] ≥ 0^GPRLEN
I + 1 : if condition then
            PC ← PC + tgt_offset
        else
            NullifyCurrentInstruction()
        endif
Exceptions:
Reserved Instruction
Programming Notes:
With the 18-bit signed instruction offset, the conditional branch range is ± 128 KBytes. Use jump (J) or jump register (JR) instructions to branch to more distant addresses.
Encoding:
Bits     Field    Width   Value
31..26   REGIMM   6       000001
25..21   rs       5
20..16   BGEZL    5       00011
15..0    offset   16
BGTZ Branch on Greater Than Zero
Format: BGTZ rs, offset MIPS I
Purpose: To test a GPR then do a PC-relative conditional branch.
Description: if (rs > 0) then branch
An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address
of the instruction following the branch (not the branch itself), in the branch delay slot,
to form a PC-relative effective target address.
If the contents of GPR rs are greater than zero (sign bit is 0 but value not zero), branch
to the effective target address after the instruction in the delay slot is executed.
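The address arithmetic described above can be sketched in Python. This is only an illustration: the helper names `sign_extend` and `branch_target` are hypothetical, and a 32-bit PC is assumed.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value


def branch_target(branch_pc: int, offset_field: int) -> int:
    """Effective target of a PC-relative branch located at branch_pc.

    The 16-bit offset field is shifted left 2 bits to give an 18-bit
    signed byte offset. It is added to the address of the delay-slot
    instruction (branch_pc + 4), not to the address of the branch.
    """
    tgt_offset = sign_extend((offset_field & 0xFFFF) << 2, 18)
    delay_slot_pc = branch_pc + 4
    return (delay_slot_pc + tgt_offset) & 0xFFFFFFFF  # assumed 32-bit PC
```

For a branch at 0x1000, an offset field of 3 gives a target of 0x1010, and an offset field of 0xFFFF (that is, -1 word) gives 0x1000, the branch itself.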
Restrictions:
None
Operation:
I :     tgt_offset ← sign_extend(offset || 0^2)
        condition ← GPR[rs] > 0^GPRLEN
I + 1 : if condition then
            PC ← PC + tgt_offset
        endif
Exceptions:
None
Programming Notes:
With the 18-bit signed instruction offset, the conditional branch range is ± 128 KBytes.
Use jump (J) or jump register (JR) instructions to branch to more distant addresses.
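To see where the ± 128 KBytes figure comes from, the following Python sketch works backwards from a desired target to the 16-bit offset field and rejects targets that are out of reach. The function name `offset_field_for` is an assumption made for this example; an assembler would perform an equivalent check.

```python
def offset_field_for(branch_pc: int, target_pc: int) -> int:
    """Return the 16-bit offset field that reaches target_pc, or raise."""
    delta = target_pc - (branch_pc + 4)  # measured from the delay slot
    if delta % 4 != 0:
        raise ValueError("target is not word aligned")
    words = delta // 4
    # A signed 16-bit word count covers -32768..32767 words,
    # i.e. -131072..+131068 bytes (about +/- 128 KBytes).
    if not -(1 << 15) <= words < (1 << 15):
        raise ValueError("target out of range; use J or JR instead")
    return words & 0xFFFF
```

The farthest backward target is 131072 bytes before the delay slot and the farthest forward target is 131068 bytes after it.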
Encoding:
Bits     Field    Width   Value
31..26   BGTZ     6       000111
25..21   rs       5
20..16   0        5       00000
15..0    offset   16
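As a minimal sketch of how the bit fields listed above combine into a 32-bit instruction word, the Python function below packs a BGTZ instruction. The name `encode_bgtz` and the argument checks are assumptions made for this example.

```python
BGTZ_OPCODE = 0b000111  # bits 31..26, taken from the encoding above


def encode_bgtz(rs: int, offset_field: int) -> int:
    """Pack a BGTZ instruction word from a register number and offset field."""
    if not 0 <= rs < 32:
        raise ValueError("rs must be a register number 0..31")
    if not 0 <= offset_field <= 0xFFFF:
        raise ValueError("offset field must fit in 16 bits")
    # Bits 20..16 are required to be zero for BGTZ.
    return (BGTZ_OPCODE << 26) | (rs << 21) | (0 << 16) | offset_field
```

For example, `encode_bgtz(4, 3)` yields the word 0x1C800003.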
BGTZL Branch on Greater Than Zero Likely
Format: BGTZL rs, offset MIPS II
Purpose: To test a GPR then do a PC-relative conditional branch; execute the delay
slot only if the branch is taken.
Description: if (rs > 0) then branch_likely
An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address
of the instruction following the branch (not the branch itself), in the branch delay slot,
to form a PC-relative effective target address.
If the contents of GPR rs are greater than zero (sign bit is 0 but value not zero), branch
to the effective target address after the instruction in the delay slot is executed. If the
branch is not taken, the instruction in the delay slot is not executed.
Restrictions:
None
Operation:
I :     tgt_offset ← sign_extend(offset || 0^2)
        condition ← GPR[rs] > 0^GPRLEN
I + 1 : if condition then
            PC ← PC + tgt_offset
        else
            NullifyCurrentInstruction()
        endif
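The difference between a branch-likely and an ordinary branch lies in the not-taken case. A rough Python model of the operation above is shown here; the function name `step_branch_likely` and its return convention are assumptions made for this example, and a 32-bit PC is assumed.

```python
def step_branch_likely(branch_pc: int, condition: bool, tgt_offset: int):
    """Model a branch-likely instruction located at branch_pc.

    Returns (run_delay_slot, next_pc), where next_pc is the address
    executed after the delay-slot instruction time.
    """
    delay_slot_pc = branch_pc + 4
    if condition:
        # Taken: the delay slot executes, then control goes to the target.
        return True, (delay_slot_pc + tgt_offset) & 0xFFFFFFFF
    # Not taken: the delay-slot instruction is nullified and execution
    # continues with the instruction after it.
    return False, (branch_pc + 8) & 0xFFFFFFFF
```

An ordinary BGTZ would return `True` for `run_delay_slot` in both cases; only the likely variant skips the delay slot when the branch falls through.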
Exceptions:
Reserved Instruction
Programming Notes:
With the 18-bit signed instruction offset, the conditional branch range is ± 128 KBytes.
Use jump (J) or jump register (JR) instructions to branch to more distant addresses.
Encoding:
Bits     Field    Width   Value
31..26   BGTZL    6       010111
25..21   rs       5
20..16   0        5       00000
15..0    offset   16
BLEZ Branch on Less Than or Equal to Zero
Format: BLEZ rs, offset MIPS I
Purpose: To test a GPR then do a PC-relative conditional branch.
Description: if (rs ≤ 0) then branch
An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address
of the instruction following the branch (not the branch itself), in the branch delay slot,
to form a PC-relative effective target address.
If the contents of GPR rs are less than or equal to zero (sign bit is 1 or value is zero),
branch to the effective target address after the instruction in the delay slot is executed.
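The parenthetical test "sign bit is 1 or value is zero" can be checked directly on the raw register bits. The Python sketch below assumes a 32-bit register by default; the name `blez_condition` and the `gprlen` parameter are hypothetical.

```python
GPRLEN = 32  # assumed register width; 64 on 64-bit processors


def blez_condition(gpr_value: int, gprlen: int = GPRLEN) -> bool:
    """True when the register, read as two's complement, is <= 0."""
    raw = gpr_value & ((1 << gprlen) - 1)  # keep only the register bits
    sign_bit = (raw >> (gprlen - 1)) & 1
    return sign_bit == 1 or raw == 0
```

With 32-bit registers, 0x00000000 and 0x80000000 satisfy the condition, while 0x00000001 and 0x7FFFFFFF do not.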
Restrictions:
None
Operation:
I :     tgt_offset ← sign_extend(offset || 0^2)
        condition ← GPR[rs] ≤ 0^GPRLEN
I + 1 : if condition then
            PC ← PC + tgt_offset
        endif
Exceptions:
None
Programming Notes:
With the 18-bit signed instruction offset, the conditional branch range is ± 128 KBytes.
Use jump (J) or jump register (JR) instructions to branch to more distant addresses.
Encoding:
Bits     Field    Width   Value
31..26   BLEZ     6       000110
25..21   rs       5
20..16   0        5       00000
15..0    offset   16
BLEZL Branch on Less Than or Equal to Zero Likely
Format: BLEZL rs, offset MIPS II
Purpose: To test a GPR then do a PC-relative conditional branch; execute the delay
slot only if the branch is taken.
Description: if (rs ≤ 0) then branch_likely
An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address
of the instruction following the branch (not the branch itself), in the branch delay slot,
to form a PC-relative effective target address.
If the contents of GPR rs are less than or equal to zero (sign bit is 1 or value is zero),
branch to the effective target address after the instruction in the delay slot is executed.
If the branch is not taken, the instruction in the delay slot is not executed.
Restrictions:
None
Operation:
I :     tgt_offset ← sign_extend(offset || 0^2)
        condition ← GPR[rs] ≤ 0^GPRLEN
I + 1 : if condition then
            PC ← PC + tgt_offset
        else
            NullifyCurrentInstruction()
        endif
Exceptions:
Reserved Instruction
Programming Notes:
With the 18-bit signed instruction offset, the conditional branch range is ± 128 KBytes.
Use jump (J) or jump register (JR) instructions to branch to more distant addresses.
Encoding:
Bits     Field    Width   Value
31..26   BLEZL    6       010110
25..21   rs       5
20..16   0        5       00000
15..0    offset   16
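Going the other way, a 32-bit word can be split back into its fields. The Python sketch below recognises only BLEZ and BLEZL, using the opcode values from their encodings; the function name `decode_blez_family` is hypothetical, and returning `None` when bits 20..16 are non-zero is an assumption of this example rather than behaviour stated in the text.

```python
OPCODES = {0b000110: "BLEZ", 0b010110: "BLEZL"}  # bits 31..26


def decode_blez_family(word: int):
    """Return (mnemonic, rs, offset_field), or None if not BLEZ/BLEZL."""
    opcode = (word >> 26) & 0x3F
    rs = (word >> 21) & 0x1F
    rt = (word >> 16) & 0x1F
    offset_field = word & 0xFFFF
    name = OPCODES.get(opcode)
    if name is None or rt != 0:  # bits 20..16 must be zero
        return None
    return name, rs, offset_field
```

For instance, the word 0x58A00010 decodes to `("BLEZL", 5, 16)`.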
BLTZ Branch on Less Than Zero
Format: BLTZ rs, offset MIPS I
Purpose: To test a GPR then do a PC-relative conditional branch.
Description: if (rs < 0) then branch
An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself), in the branch delay slot, to form a PC-relative effective target address.
If the contents of GPR rs are less than zero (sign bit is 1), branch to the effective target address after the instruction in the delay slot is executed.
Restrictions:
None
Operation:
I :     tgt_offset ← sign_extend(offset || 0^2)
        condition ← GPR[rs] < 0^GPRLEN
I + 1 : if condition then
            PC ← PC + tgt_offset
        endif
Exceptions:
None
Programming Notes:
With the 18-bit signed instruction offset, the conditional branch range is ± 128 KBytes. Use jump (J) or jump register (JR) instructions to branch to more distant addresses.
Encoding:
Bits     Field    Width   Value
31..26   REGIMM   6       000001
25..21   rs       5
20..16   BLTZ     5       00000
15..0    offset   16
BLTZAL Branch on Less Than Zero And Link
Format: BLTZAL rs, offset MIPS I
Purpose: To test a GPR then do a PC-relative conditional procedure call.
Description: if (rs < 0) then procedure_call
Place the return address link in GPR 31. The return link is the address of the second instruction following the branch (not the branch itself), where execution would continue after a procedure call.
An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch, in the branch delay slot, to form a PC-relative effective target address.
If the contents of GPR rs are less than zero (sign bit is 1), branch to the effective target address after the instruction in the delay slot is executed.
Restrictions:
GPR 31 must not be used for the source register rs, because such an instruction does not have the same effect when re-executed. The result of executing such an instruction is undefined.
This restriction permits an exception handler to resume execution by re-executing the branch when an exception occurs in the branch delay slot.
Operation:
I :     tgt_offset ← sign_extend(offset || 0^2)
        condition ← GPR[rs] < 0^GPRLEN
        GPR[31] ← PC + 8
I + 1 : if condition then
            PC ← PC + tgt_offset
        endif
Exceptions:
None
Programming Notes:
With the 18-bit signed instruction offset, the conditional branch range is ± 128 KBytes. Use jump and link (JAL) or jump and link register (JALR) instructions for procedure calls to more distant addresses.
Encoding:
Bits     Field    Width   Value
31..26   REGIMM   6       000001
25..21   rs       5
20..16   BLTZAL   5       10000
15..0    offset   16
BLTZALL Branch on Less Than Zero And Link Likely
Format: BLTZALL rs, offset MIPS II
Purpose: To test a GPR then do a PC-relative conditional procedure call; execute the delay slot only if the branch is taken.
Description: if (rs < 0) then procedure_call_likely
Place the return address link in GPR 31. The return link is the address of the second instruction following the branch (not the branch itself), where execution would continue after a procedure call.
An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch, in the branch delay slot, to form a PC-relative effective target address.
If the contents of GPR rs are less than zero (sign bit is 1), branch to the effective target address after the instruction in the delay slot is executed. If the branch is not taken, the instruction in the delay slot is not executed.
Restrictions:
GPR 31 must not be used for the source register rs, because such an instruction does not have the same effect when re-executed. The result of executing such an instruction is undefined. This restriction permits an exception handler to resume execution by re-executing the branch when an exception occurs in the branch delay slot.
Operation:
I :     tgt_offset ← sign_extend(offset || 0^2)
        condition ← GPR[rs] < 0^GPRLEN
        GPR[31] ← PC + 8
I + 1 : if condition then
            PC ← PC + tgt_offset
        else
            NullifyCurrentInstruction()
        endif
Exceptions:
Reserved Instruction
Programming Notes:
With the 18-bit signed instruction offset, the conditional branch range is ± 128 KBytes. Use jump and link (JAL) or jump and link register (JALR) instructions for procedure calls to more distant addresses.
Encoding:
Bits     Field     Width   Value
31..26   REGIMM    6       000001
25..21   rs        5
20..16   BLTZALL   5       10010
15..0    offset    16
BLTZL Branch on Less Than Zero Likely
Format: BLTZL rs, offset MIPS II
Purpose: To test a GPR then do a PC-relative conditional branch; execute the delay slot only if the branch is taken.
Description: if (rs < 0) then branch_likely
An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself), in the branch delay slot, to form a PC-relative effective target address.
If the contents of GPR rs are less than zero (sign bit is 1), branch to the effective target address after the instruction in the delay slot is executed. If the branch is not taken, the instruction in the delay slot is not executed.
Restrictions:
None
Operation:
I :     tgt_offset ← sign_extend(offset || 0^2)
        condition ← GPR[rs] < 0^GPRLEN
I + 1 : if condition then
            PC ← PC + tgt_offset
        else
            NullifyCurrentInstruction()
        endif
Exceptions:
Reserved Instruction
Programming Notes:
With the 18-bit signed instruction offset, the conditional branch range is ± 128 KBytes. Use jump (J) or jump register (JR) instructions to branch to more distant addresses.
Encoding:
Bits     Field    Width   Value
31..26   REGIMM   6       000001
25..21   rs       5
20..16   BLTZL    5       00010
15..0    offset   16
BNE Branch on Not Equal
Format: BNE rs, rt, offset MIPS I
Purpose: To compare GPRs then do a PC-relative conditional branch.
Description: if (rs ≠ rt) then branch
An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address
of the instruction following the branch (not the branch itself), in the branch delay slot,
to form a PC-relative effective target address.
If the contents of GPR rs and GPR rt are not equal, branch to the effective target address
after the instruction in the delay slot is executed.
Restrictions:
None
Operation:
I : tgt_offset ← sign_extend(offset || 0^2)
condition ← (GPR[rs] ≠ GPR[rt])
I + 1 : if condition then
PC ← PC + tgt_offset
endif
Exceptions:
None
Programming Notes:
With the 18-bit signed instruction offset, the conditional branch range is ± 128 KBytes.
Use jump (J) or jump register (JR) instructions to branch to more distant addresses.
Encoding: BNE = 000101 (bits 31..26) | rs (25..21) | rt (20..16) | offset (15..0)
BNEL Branch on Not Equal Likely
Format: BNEL rs, rt, offset MIPS II
Purpose: To compare GPRs then do a PC-relative conditional branch; execute the delay
slot only if the branch is taken.
Description: if (rs ≠ rt) then branch_likely
An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address
of the instruction following the branch (not the branch itself), in the branch delay slot,
to form a PC-relative effective target address.
If the contents of GPR rs and GPR rt are not equal, branch to the effective target address
after the instruction in the delay slot is executed. If the branch is not taken, the
instruction in the delay slot is not executed.
Restrictions:
None
Operation:
I : tgt_offset ← sign_extend(offset || 0^2)
condition ← (GPR[rs] ≠ GPR[rt])
I + 1 : if condition then
PC ← PC + tgt_offset
else
NullifyCurrentInstruction()
endif
Exceptions:
Reserved Instruction
Programming Notes:
With the 18-bit signed instruction offset, the conditional branch range is ± 128 KBytes.
Use jump (J) or jump register (JR) instructions to branch to more distant addresses.
Encoding: BNEL = 010101 (bits 31..26) | rs (25..21) | rt (20..16) | offset (15..0)
BREAK Breakpoint
Format: BREAK MIPS I
Purpose: To cause a Breakpoint exception.
Description:
A breakpoint exception occurs, immediately and unconditionally transferring control
to the exception handler. The code field is available for use as software parameters, but
is retrieved by the exception handler only by loading the contents of the memory word
containing the instruction.
Restrictions:
None
Operation:
SignalException(Breakpoint)
Exceptions:
Breakpoint
Encoding: SPECIAL = 000000 (bits 31..26) | code (25..6) | BREAK = 001101 (5..0)
COPz Coprocessor Operation
Format: COP0 cop_fun MIPS I
COP1 cop_fun
COP2 cop_fun
COP3 cop_fun
Purpose: To execute a coprocessor instruction.
Description:
The coprocessor operation specified by cop_fun is performed by coprocessor unit z.
Details of coprocessor operations must be found in the specification for each
coprocessor.
Each MIPS architecture level defines up to 4 coprocessor units, numbered 0 to 3 (see
Coprocessor Instructions on page A-11). The opcodes corresponding to coprocessors
that are not defined by an architecture level may be used for other instructions.
Restrictions:
Access to the coprocessors is controlled by system software. Each coprocessor has a
“coprocessor usable” bit in the System Control coprocessor. The usable bit must be set
for a user program to execute a coprocessor instruction. If the usable bit is not set, an
attempt to execute the instruction will result in a Coprocessor Unusable exception. An
unimplemented coprocessor must never be enabled. The result of executing this
instruction for an unimplemented coprocessor when the usable bit is set is undefined.
See specification for the specific coprocessor being programmed.
Operation:
CoprocessorOperation (z, cop_fun)
Exceptions:
Reserved Instruction
Coprocessor Unusable
Coprocessor interrupt or Floating-Point Exception (CP1 only for some processors)
Encoding: COPz = 0100zz (bits 31..26) | cop_fun (25..0)
DADD Doubleword Add
Format: DADD rd, rs, rt MIPS III
Purpose: To add 64-bit integers. If overflow occurs, then trap.
Description: rd ← rs + rt
The 64-bit doubleword value in GPR rt is added to the 64-bit value in GPR rs to produce
a 64-bit result. If the addition results in 64-bit 2’s complement arithmetic overflow then
the destination register is not modified and an Integer Overflow exception occurs. If it
does not overflow, the 64-bit result is placed into GPR rd.
Restrictions:
None
Operation: 64-bit processors
temp ← GPR[rs] + GPR[rt]
if (64_bit_arithmetic_overflow) then
SignalException(IntegerOverflow)
else
GPR[rd] ← temp
endif
Exceptions:
Integer Overflow
Reserved Instruction
Programming Notes:
DADDU performs the same arithmetic operation but does not trap on overflow.
Encoding: SPECIAL = 000000 (bits 31..26) | rs (25..21) | rt (20..16) | rd (15..11) | 0 = 00000 (10..6) | DADD = 101100 (5..0)
DADDI Doubleword Add Immediate
Format: DADDI rt, rs, immediate MIPS III
Purpose: To add a constant to a 64-bit integer. If overflow occurs, then trap.
Description: rt ← rs + immediate
The 16-bit signed immediate is added to the 64-bit value in GPR rs to produce a 64-bit
result. If the addition results in 64-bit 2’s complement arithmetic overflow then the
destination register is not modified and an Integer Overflow exception occurs. If it
does not overflow, the 64-bit result is placed into GPR rt.
Restrictions:
None
Operation: 64-bit processors
temp ← GPR[rs] + sign_extend(immediate)
if (64_bit_arithmetic_overflow) then
SignalException(IntegerOverflow)
else
GPR[rt] ← temp
endif
Exceptions:
Integer Overflow
Reserved Instruction
Programming Notes:
DADDIU performs the same arithmetic operation but does not trap on overflow.
Encoding: DADDI = 011000 (bits 31..26) | rs (25..21) | rt (20..16) | immediate (15..0)
DADDIU Doubleword Add Immediate Unsigned
Format: DADDIU rt, rs, immediate MIPS III
Purpose: To add a constant to a 64-bit integer.
Description: rt ← rs + immediate
The 16-bit signed immediate is added to the 64-bit value in GPR rs and the 64-bit
arithmetic result is placed into GPR rt.
No Integer Overflow exception occurs under any circumstances.
Restrictions:
None
Operation: 64-bit processors
GPR[rt] ← GPR[rs] + sign_extend(immediate)
Exceptions:
Reserved Instruction
Programming Notes:
The term “unsigned” in the instruction name is a misnomer; this operation is 64-bit
modulo arithmetic that does not trap on overflow. It is appropriate for arithmetic
which is not signed, such as address arithmetic, or integer arithmetic environments
that ignore overflow, such as “C” language arithmetic.
Encoding: DADDIU = 011001 (bits 31..26) | rs (25..21) | rt (20..16) | immediate (15..0)
DADDU Doubleword Add Unsigned
Format: DADDU rd, rs, rt MIPS III
Purpose: To add 64-bit integers.
Description: rd ← rs + rt
The 64-bit doubleword value in GPR rt is added to the 64-bit value in GPR rs and the
64-bit arithmetic result is placed into GPR rd.
No Integer Overflow exception occurs under any circumstances.
Restrictions:
None
Operation: 64-bit processors
GPR[rd] ← GPR[rs] + GPR[rt]
Exceptions:
Reserved Instruction
Programming Notes:
The term “unsigned” in the instruction name is a misnomer; this operation is 64-bit
modulo arithmetic that does not trap on overflow.
It is appropriate for arithmetic which is not signed, such as address arithmetic, or
integer arithmetic environments that ignore overflow, such as “C” language arithmetic.
Encoding: SPECIAL = 000000 (bits 31..26) | rs (25..21) | rt (20..16) | rd (15..11) | 0 = 00000 (10..6) | DADDU = 101101 (5..0)
DDIV Doubleword Divide
Format: DDIV rs, rt MIPS III
Purpose: To divide 64-bit signed integers.
Description: (LO, HI) ← rs / rt
The 64-bit doubleword in GPR rs is divided by the 64-bit doubleword in GPR rt,
treating both operands as signed values. The 64-bit quotient is placed into special
register LO and the 64-bit remainder is placed into special register HI.
No arithmetic exception occurs under any circumstances.
Restrictions:
If either of the two preceding instructions is MFHI or MFLO, the result of the MFHI or
MFLO is undefined. Reads of the HI or LO special registers must be separated from
subsequent instructions that write to them by two or more other instructions.
If the divisor in GPR rt is zero, the arithmetic result value is undefined.
Operation: 64-bit processors
I - 2 :, I - 1 : LO, HI ← undefined
I : LO ← GPR[rs] div GPR[rt]
HI ← GPR[rs] mod GPR[rt]
Exceptions:
Reserved Instruction
Programming Notes:
See the Programming Notes for the DIV instruction.
Encoding: SPECIAL = 000000 (bits 31..26) | rs (25..21) | rt (20..16) | 0 = 0000000000 (15..6) | DDIV = 011110 (5..0)
DDIVU Doubleword Divide Unsigned
Format: DDIVU rs, rt MIPS III
Purpose: To divide 64-bit unsigned integers.
Description: (LO, HI) ← rs / rt
The 64-bit doubleword in GPR rs is divided by the 64-bit doubleword in GPR rt,
treating both operands as unsigned values. The 64-bit quotient is placed into special
register LO and the 64-bit remainder is placed into special register HI.
No arithmetic exception occurs under any circumstances.
Restrictions:
If either of the two preceding instructions is MFHI or MFLO, the result of the MFHI or
MFLO is undefined. Reads of the HI or LO special registers must be separated from
subsequent instructions that write to them by two or more other instructions.
If the divisor in GPR rt is zero, the arithmetic result value is undefined.
Operation: 64-bit processors
I - 2 :, I - 1 : LO, HI ← undefined
I : LO ← (0 || GPR[rs]) div (0 || GPR[rt])
HI ← (0 || GPR[rs]) mod (0 || GPR[rt])
Exceptions:
Reserved Instruction
Programming Notes:
See the Programming Notes for the DIV instruction.
Encoding: SPECIAL = 000000 (bits 31..26) | rs (25..21) | rt (20..16) | 0 = 0000000000 (15..6) | DDIVU = 011111 (5..0)
DIV Divide Word
Format: DIV rs, rt MIPS I
Purpose: To divide 32-bit signed integers.
Description: (LO, HI) ← rs / rt
The 32-bit word value in GPR rs is divided by the 32-bit value in GPR rt, treating both
operands as signed values. The 32-bit quotient is placed into special register LO and
the 32-bit remainder is placed into special register HI.
No arithmetic exception occurs under any circumstances.
Restrictions:
On 64-bit processors, if either GPR rt or GPR rs does not contain a sign-extended 32-bit
value (bits 63..31 equal), then the result of the operation is undefined.
If either of the two preceding instructions is MFHI or MFLO, the result of the MFHI or
MFLO is undefined. Reads of the HI or LO special registers must be separated from
subsequent instructions that write to them by two or more other instructions.
If the divisor in GPR rt is zero, the arithmetic result value is undefined.
Operation:
if (NotWordValue(GPR[rs]) or NotWordValue(GPR[rt])) then UndefinedResult() endif
I - 2 :, I - 1 : LO, HI ← undefined
I : q ← GPR[rs]_{31..0} div GPR[rt]_{31..0}
LO ← sign_extend(q_{31..0})
r ← GPR[rs]_{31..0} mod GPR[rt]_{31..0}
HI ← sign_extend(r_{31..0})
Exceptions:
None
Programming Notes:
In some processors the integer divide operation may proceed asynchronously and
allow other CPU instructions to execute before it is complete. An attempt to read LO
or HI before the results are written will wait (interlock) until the results are ready.
Asynchronous execution does not affect the program result, but offers an opportunity
for performance improvement by scheduling the divide so that other instructions can
execute in parallel.
No arithmetic exception occurs under any circumstances. If divide-by-zero or overflow
conditions should be detected and some action taken, then the divide instruction is
typically followed by additional instructions to check for a zero divisor and/or for
overflow. If the divide is asynchronous then the zero-divisor check can execute in
parallel with the divide. The action taken on either divide-by-zero or overflow is either
a convention within the program itself or, more typically, the system software; one
possibility is to take a BREAK exception with a code field value to signal the problem
to the system software.
As an example, the C programming language in a UNIX environment expects division
by zero to either terminate the program or execute a program-specified signal handler.
C does not expect overflow to cause any exceptional condition. If the C compiler uses
a divide instruction, it also emits code to test for a zero divisor and execute a BREAK
instruction to inform the operating system if one is detected.
Encoding: SPECIAL = 000000 (bits 31..26) | rs (25..21) | rt (20..16) | 0 = 0000000000 (15..6) | DIV = 011010 (5..0)
DIVU Divide Unsigned Word
Format: DIVU rs, rt MIPS I
Purpose: To divide 32-bit unsigned integers.
Description: (LO, HI) ← rs / rt
The 32-bit word value in GPR rs is divided by the 32-bit value in GPR rt, treating both
operands as unsigned values. The 32-bit quotient is placed into special register LO and
the 32-bit remainder is placed into special register HI.
No arithmetic exception occurs under any circumstances.
Restrictions:
On 64-bit processors, if either GPR rt or GPR rs does not contain a sign-extended 32-bit
value (bits 63..31 equal), then the result of the operation is undefined.
If either of the two preceding instructions is MFHI or MFLO, the result of the MFHI or
MFLO is undefined. Reads of the HI or LO special registers must be separated from
subsequent instructions that write to them, like this one, by two or more other
instructions.
If the divisor in GPR rt is zero, the arithmetic result is undefined.
Operation:
if (NotWordValue(GPR[rs]) or NotWordValue(GPR[rt])) then UndefinedResult() endif
I - 2 :, I - 1 : LO, HI ← undefined
I : q ← (0 || GPR[rs]_{31..0}) div (0 || GPR[rt]_{31..0})
LO ← sign_extend(q_{31..0})
r ← (0 || GPR[rs]_{31..0}) mod (0 || GPR[rt]_{31..0})
HI ← sign_extend(r_{31..0})
Exceptions:
None
Programming Notes:
See the Programming Notes for the DIV instruction.
Encoding: SPECIAL = 000000 (bits 31..26) | rs (25..21) | rt (20..16) | 0 = 0000000000 (15..6) | DIVU = 011011 (5..0)
DMULT Doubleword Multiply
Format: DMULT rs, rt MIPS III
Purpose: To multiply 64-bit signed integers.
Description: (LO, HI) ← rs × rt
The 64-bit doubleword value in GPR rt is multiplied by the 64-bit value in GPR rs,
treating both operands as signed values, to produce a 128-bit result. The low-order
64-bit doubleword of the result is placed into special register LO, and the high-order
64-bit doubleword is placed into special register HI.
No arithmetic exception occurs under any circumstances.
Restrictions:
If either of the two preceding instructions is MFHI or MFLO, the result of the MFHI or
MFLO is undefined. Reads of the HI or LO special registers must be separated from
subsequent instructions that write to them by two or more other instructions.
Operation: 64-bit processors
I - 2 :, I - 1 : LO, HI ← undefined
I : prod ← GPR[rs] * GPR[rt]
LO ← prod_{63..0}
HI ← prod_{127..64}
Exceptions:
Reserved Instruction
Programming Notes:
In some processors the integer multiply operation may proceed asynchronously and
allow other CPU instructions to execute before it is complete. An attempt to read LO
or HI before the results are written will wait (interlock) until the results are ready.
Asynchronous execution does not affect the program result, but offers an opportunity
for performance improvement by scheduling the multiply so that other instructions
can execute in parallel.
Programs that require overflow detection must check for it explicitly.
Encoding: SPECIAL = 000000 (bits 31..26) | rs (25..21) | rt (20..16) | 0 = 0000000000 (15..6) | DMULT = 011100 (5..0)
DMULTU Doubleword Multiply Unsigned
Format: DMULTU rs, rt MIPS III
Purpose: To multiply 64-bit unsigned integers.
Description: (LO, HI) ← rs × rt
The 64-bit doubleword value in GPR rt is multiplied by the 64-bit value in GPR rs,
treating both operands as unsigned values, to produce a 128-bit result. The low-order
64-bit doubleword of the result is placed into special register LO, and the high-order
64-bit doubleword is placed into special register HI.
No arithmetic exception occurs under any circumstances.
Restrictions:
If either of the two preceding instructions is MFHI or MFLO, the result of the MFHI or
MFLO is undefined. Reads of the HI or LO special registers must be separated from
subsequent instructions that write to them by two or more other instructions.
Operation: 64-bit processors
I - 2 :, I - 1 : LO, HI ← undefined
I : prod ← (0 || GPR[rs]) * (0 || GPR[rt])
LO ← prod_{63..0}
HI ← prod_{127..64}
Exceptions:
Reserved Instruction
Encoding: SPECIAL = 000000 (bits 31..26) | rs (25..21) | rt (20..16) | 0 = 0000000000 (15..6) | DMULTU = 011101 (5..0)
DSLL Doubleword Shift Left Logical
Format: DSLL rd, rt, sa MIPS III
Purpose: To left shift a doubleword by a fixed amount 0 to 31 bits.
Description: rd ← rt << sa
The 64-bit doubleword contents of GPR rt are shifted left, inserting zeros into the
emptied bits; the result is placed in GPR rd. The bit shift count in the range 0 to 31 is
specified by sa.
Restrictions:
None
Operation: 64-bit processors
s ← 0 || sa
GPR[rd] ← GPR[rt]_{(63–s)..0} || 0^s
Exceptions:
Reserved Instruction
Encoding: SPECIAL = 000000 (bits 31..26) | 0 = 00000 (25..21) | rt (20..16) | rd (15..11) | sa (10..6) | DSLL = 111000 (5..0)
DSLL32 Doubleword Shift Left Logical Plus 32
Format: DSLL32 rd, rt, sa MIPS III
Purpose: To left shift a doubleword by a fixed amount 32 to 63 bits.
Description: rd ← rt << (sa+32)
The 64-bit doubleword contents of GPR rt are shifted left, inserting zeros into the
emptied bits; the result is placed in GPR rd. The bit shift count in the range 32 to 63 is
specified by sa+32.
Restrictions:
None
Operation: 64-bit processors
s ← 1 || sa /* 32+sa */
GPR[rd] ← GPR[rt]_{(63–s)..0} || 0^s
Exceptions:
Reserved Instruction
Encoding: SPECIAL = 000000 (bits 31..26) | 0 = 00000 (25..21) | rt (20..16) | rd (15..11) | sa (10..6) | DSLL32 = 111100 (5..0)
DSLLV Doubleword Shift Left Logical Variable
Format: DSLLV rd, rt, rs MIPS III
Purpose: To left shift a doubleword by a variable number of bits.
Description: rd ← rt << rs
The 64-bit doubleword contents of GPR rt are shifted left, inserting zeros into the
emptied bits; the result is placed in GPR rd.
The bit shift count in the range 0 to 63 is specified by the low-order six bits in GPR rs.
Restrictions:
None
Operation: 64-bit processors
s ← 0 || GPR[rs]_{5..0}
GPR[rd] ← GPR[rt]_{(63–s)..0} || 0^s
Exceptions:
Reserved Instruction
Encoding: SPECIAL = 000000 (bits 31..26) | rs (25..21) | rt (20..16) | rd (15..11) | 0 = 00000 (10..6) | DSLLV = 010100 (5..0)
DSRA Doubleword Shift Right Arithmetic
Format: DSRA rd, rt, sa MIPS III
Purpose: To arithmetic right shift a doubleword by a fixed amount 0 to 31 bits.
Description: rd ← rt >> sa (arithmetic)
The 64-bit doubleword contents of GPR rt are shifted right, duplicating the sign bit (63)
into the emptied bits; the result is placed in GPR rd. The bit shift count in the range 0
to 31 is specified by sa.
Restrictions:
None
Operation: 64-bit processors
s ← 0 || sa
GPR[rd] ← (GPR[rt]_63)^s || GPR[rt]_{63..s}
Exceptions:
Reserved Instruction
Encoding: SPECIAL = 000000 (bits 31..26) | 0 = 00000 (25..21) | rt (20..16) | rd (15..11) | sa (10..6) | DSRA = 111011 (5..0)
DSRA32 Doubleword Shift Right Arithmetic Plus 32
Format: DSRA32 rd, rt, sa MIPS III
Purpose: To arithmetic right shift a doubleword by a fixed amount 32 to 63 bits.
Description: rd ← rt >> (sa+32) (arithmetic)
The doubleword contents of GPR rt are shifted right, duplicating the sign bit (63) into
the emptied bits; the result is placed in GPR rd. The bit shift count in the range 32 to 63
is specified by sa+32.
Restrictions:
None
Operation: 64-bit processors
s ← 1 || sa /* 32+sa */
GPR[rd] ← (GPR[rt]_63)^s || GPR[rt]_{63..s}
Exceptions:
Reserved Instruction
Encoding: SPECIAL = 000000 (bits 31..26) | 0 = 00000 (25..21) | rt (20..16) | rd (15..11) | sa (10..6) | DSRA32 = 111111 (5..0)
DSRAV Doubleword Shift Right Arithmetic Variable
Format: DSRAV rd, rt, rs MIPS III
Purpose: To arithmetic right shift a doubleword by a variable number of bits.
Description: rd ← rt >> rs (arithmetic)
The doubleword contents of GPR rt are shifted right, duplicating the sign bit (63) into
the emptied bits; the result is placed in GPR rd. The bit shift count in the range 0 to 63
is specified by the low-order six bits in GPR rs.
Restrictions:
None
Operation: 64-bit processors
s ← GPR[rs]_{5..0}
GPR[rd] ← (GPR[rt]_63)^s || GPR[rt]_{63..s}
Exceptions:
Reserved Instruction
Encoding: SPECIAL = 000000 (bits 31..26) | rs (25..21) | rt (20..16) | rd (15..11) | 0 = 00000 (10..6) | DSRAV = 010111 (5..0)
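The three arithmetic right shifts (DSRA, DSRA32, DSRAV) share one Operation formula: s copies of bit 63 followed by bits 63..s of GPR rt. The Python sketch below models that formula on a 64-bit register image held as a non-negative integer. The names MASK64, sra64, dsra, dsra32 and dsrav are assumptions made for this illustration; they are not part of the architecture definition.

```python
MASK64 = (1 << 64) - 1  # a 64-bit GPR is modelled as an integer in 0..2**64-1


def sra64(value, s):
    """Return (GPR[rt]_63)^s || GPR[rt]_{63..s} for a shift count s in 0..63."""
    value &= MASK64
    sign = value >> 63                       # GPR[rt]_63
    # s copies of the sign bit, placed in the s most significant positions
    fill = (((1 << s) - 1) << (64 - s)) if sign else 0
    return fill | (value >> s)


def dsra(rt_value, sa):
    """DSRA: s = 0 || sa, so the count is the 5-bit sa field (0..31)."""
    return sra64(rt_value, sa & 0x1F)


def dsra32(rt_value, sa):
    """DSRA32: s = 1 || sa, so the count is sa + 32 (32..63)."""
    return sra64(rt_value, 32 + (sa & 0x1F))


def dsrav(rt_value, rs_value):
    """DSRAV: the count is the low-order six bits of GPR rs (0..63)."""
    return sra64(rt_value, rs_value & 0x3F)
```

For example, under this model dsra(0x8000000000000000, 4) yields 0xF800000000000000, because the four emptied bits are filled with copies of the sign bit.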
DSRL Doubleword Shift Right Logical
Format: DSRL rd, rt, sa MIPS III
Purpose: To logical right shift a doubleword by a fixed amount 0 to 31 bits.
Description: rd ← rt >> sa (logical)
The doubleword contents of GPR rt are shifted right, inserting zeros into the emptied
bits; the result is placed in GPR rd. The bit shift count in the range 0 to 31 is specified
by sa.
Restrictions:
None
Operation: 64-bit processors
s ← 0 || sa
GPR[rd] ← 0^s || GPR[rt]_{63..s}
Exceptions:
Reserved Instruction
Encoding: SPECIAL = 000000 (bits 31..26) | 0 = 00000 (25..21) | rt (20..16) | rd (15..11) | sa (10..6) | DSRL = 111010 (5..0)
DSRL32 Doubleword Shift Right Logical Plus 32
Format: DSRL32 rd, rt, sa MIPS III
Purpose: To logical right shift a doubleword by a fixed amount 32 to 63 bits.
Description: rd ← rt >> (sa+32) (logical)
The 64-bit doubleword contents of GPR rt are shifted right, inserting zeros into the
emptied bits; the result is placed in GPR rd. The bit shift count in the range 32 to 63 is
specified by sa+32.
Restrictions:
None
Operation: 64-bit processors
s ← 1 || sa /* 32+sa */
GPR[rd] ← 0^s || GPR[rt]_{63..s}
Exceptions:
Reserved Instruction
Encoding: SPECIAL = 000000 (bits 31..26) | 0 = 00000 (25..21) | rt (20..16) | rd (15..11) | sa (10..6) | DSRL32 = 111110 (5..0)
DSRLV Doubleword Shift Right Logical Variable
Format: DSRLV rd, rt, rs MIPS III
Purpose: To logical right shift a doubleword by a variable number of bits.
Description: rd ← rt >> rs (logical)
The 64-bit doubleword contents of GPR rt are shifted right, inserting zeros into the
emptied bits; the result is placed in GPR rd. The bit shift count in the range 0 to 63 is
specified by the low-order six bits in GPR rs.
Restrictions:
None
Operation: 64-bit processors
s ← GPR[rs]_{5..0}
GPR[rd] ← 0^s || GPR[rt]_{63..s}
Exceptions:
Reserved Instruction
Encoding: SPECIAL = 000000 (bits 31..26) | rs (25..21) | rt (20..16) | rd (15..11) | 0 = 00000 (10..6) | DSRLV = 010110 (5..0)
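The logical right shifts (DSRL, DSRL32, DSRLV) differ from the arithmetic ones only in filling the emptied bits with zeros. The Python sketch below models them and also shows how a fixed count of 0 to 63 splits between the two fixed-count instructions, since the sa field holds only five bits. All function names here, including the pick_fixed_shift helper, are hypothetical and exist only for this example.

```python
MASK64 = (1 << 64) - 1  # a 64-bit GPR is modelled as an integer in 0..2**64-1


def srl64(value, s):
    """Return 0^s || GPR[rt]_{63..s} for a shift count s in 0..63."""
    return (value & MASK64) >> s


def dsrl(rt_value, sa):
    """DSRL: count is the 5-bit sa field (0..31)."""
    return srl64(rt_value, sa & 0x1F)


def dsrl32(rt_value, sa):
    """DSRL32: count is sa + 32 (32..63)."""
    return srl64(rt_value, 32 + (sa & 0x1F))


def dsrlv(rt_value, rs_value):
    """DSRLV: count is the low-order six bits of GPR rs (0..63)."""
    return srl64(rt_value, rs_value & 0x3F)


def pick_fixed_shift(count):
    """Hypothetical helper: choose the mnemonic and sa field for a fixed count."""
    if not 0 <= count <= 63:
        raise ValueError("shift count must be in 0..63")
    return ("DSRL", count) if count < 32 else ("DSRL32", count - 32)
```

Under this model a shift by 40 bits is expressed as DSRL32 with sa = 8, which is the reason the instruction set provides a separate "Plus 32" form.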
DSUB Doubleword Subtract
Format: DSUB rd, rs, rt MIPS III
Purpose: To subtract 64-bit integers; trap if overflow.
Description: rd ← rs – rt
The 64-bit doubleword value in GPR rt is subtracted from the 64-bit value in GPR rs to
produce a 64-bit result. If the subtraction results in 64-bit 2’s complement arithmetic
overflow then the destination register is not modified and an Integer Overflow
exception occurs. If it does not overflow, the 64-bit result is placed into GPR rd.
Restrictions:
None
Operation: 64-bit processors
temp ← GPR[rs] – GPR[rt]
if (64_bit_arithmetic_overflow) then
SignalException(IntegerOverflow)
else
GPR[rd] ← temp
endif
Exceptions:
Integer Overflow
Reserved Instruction
Programming Notes:
DSUBU performs the same arithmetic operation but does not trap on overflow.
Encoding: SPECIAL = 000000 (bits 31..26) | rs (25..21) | rt (20..16) | rd (15..11) | 0 = 00000 (10..6) | DSUB = 101110 (5..0)
DSUBU Doubleword Subtract Unsigned
Format: DSUBU rd, rs, rt MIPS III
Purpose: To subtract 64-bit integers.
Description: rd ← rs – rt
The 64-bit doubleword value in GPR rt is subtracted from the 64-bit value in GPR rs
and the 64-bit arithmetic result is placed into GPR rd.
No Integer Overflow exception occurs under any circumstances.
Restrictions:
None
Operation: 64-bit processors
GPR[rd] ← GPR[rs] – GPR[rt]
Exceptions:
Reserved Instruction
Programming Notes:
The term “unsigned” in the instruction name is a misnomer; this operation is 64-bit
modulo arithmetic that does not trap on overflow. It is appropriate for arithmetic
which is not signed, such as address arithmetic, or integer arithmetic environments
that ignore overflow, such as “C” language arithmetic.
Encoding: SPECIAL = 000000 (bits 31..26) | rs (25..21) | rt (20..16) | rd (15..11) | 0 = 00000 (10..6) | DSUBU = 101111 (5..0)
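The difference between DSUB and DSUBU is only the overflow behaviour: both compute the same 64-bit difference, but DSUB traps when the signed result does not fit in 64 bits. The following Python sketch illustrates this under the assumption that registers are modelled as integers in the range 0 to 2^64 - 1. The IntegerOverflow exception class and the helper names are inventions of this example that stand in for the Integer Overflow exception.

```python
MASK64 = (1 << 64) - 1


class IntegerOverflow(Exception):
    """Stand-in for the Integer Overflow exception (hypothetical name)."""


def to_signed64(x):
    """Interpret a 64-bit register image as a 2's complement value."""
    x &= MASK64
    return x - (1 << 64) if x >> 63 else x


def dsubu(rs_value, rt_value):
    """DSUBU: 64-bit modulo subtraction, never traps."""
    return (rs_value - rt_value) & MASK64


def dsub(rs_value, rt_value):
    """DSUB: same difference, but trap on 2's complement overflow."""
    exact = to_signed64(rs_value) - to_signed64(rt_value)
    if not -(1 << 63) <= exact < (1 << 63):
        # The destination register is left unmodified in this case.
        raise IntegerOverflow("64-bit arithmetic overflow")
    return exact & MASK64
```

In this model, subtracting 1 from the most negative value 0x8000000000000000 raises IntegerOverflow in dsub, while dsubu wraps around to 0x7FFFFFFFFFFFFFFF.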
J Jump
Format: J target MIPS I
Purpose: To branch within the current 256 MB aligned region.
Description:
This is a PC-region branch (not PC-relative); the effective target address is in the
“current” 256 MB aligned region. The low 28 bits of the target address are the instr_index
field shifted left 2 bits. The remaining upper bits are the corresponding bits of the
address of the instruction in the delay slot (not the branch itself).
Jump to the effective target address. Execute the instruction following the jump, in the
branch delay slot, before jumping.
Restrictions:
None
Operation:
I :
I + 1 : PC ← PC_{GPRLEN..28} || instr_index || 0^2
Exceptions:
None
Programming Notes:
Forming the branch target address by catenating PC and index bits rather than adding
a signed offset to the PC is an advantage if all program code addresses fit into a 256 MB
region aligned on a 256 MB boundary. It allows a branch to anywhere in the region
from anywhere in the region which a signed relative offset would not allow.
This definition creates the boundary case where the branch instruction is in the last
word of a 256 MB region and can therefore only branch to the following 256 MB region
containing the branch delay slot.
Encoding: J = 000010 (bits 31..26) | instr_index (25..0)
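The target formation described for J can be sketched in Python as follows. The sketch assumes the program counter is modelled as a plain integer; the function name jump_target and the gprlen parameter are assumptions made for this example. It takes the upper bits from the address of the delay slot (branch address + 4) and the low 28 bits from instr_index shifted left by 2.

```python
def jump_target(branch_pc, instr_index, gprlen=64):
    """Form the J/JAL target: upper bits of the delay-slot address,
    then the 26-bit instr_index field, then two zero bits."""
    delay_slot_pc = (branch_pc + 4) & ((1 << gprlen) - 1)
    region = delay_slot_pc & ~((1 << 28) - 1)     # 256 MB aligned region base
    return region | ((instr_index & 0x3FFFFFF) << 2)
```

The boundary case from the Programming Notes falls out of this directly: a jump located in the last word of a region (for example at 0x0FFFFFFC) has its delay slot in the next region, so jump_target(0x0FFFFFFC, 0) gives 0x10000000 rather than an address in the region that holds the jump.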
JAL Jump And Link
Format: JAL target MIPS I
Purpose: To execute a procedure call within the current 256 MB aligned region.
Description:
Place the return address link in GPR 31. The return link is the address of the second
instruction following the branch, where execution would continue after a procedure
call.
This is a PC-region branch (not PC-relative); the effective target address is in the
“current” 256 MB aligned region. The low 28 bits of the target address are the instr_index
field shifted left 2 bits. The remaining upper bits are the corresponding bits of the
address of the instruction in the delay slot (not the branch itself).
Jump to the effective target address. Execute the instruction following the jump, in the
branch delay slot, before jumping.
Restrictions:
None
Operation:
I : GPR[31] ← PC + 8
I + 1 : PC ← PC_{GPRLEN..28} || instr_index || 0^2
Exceptions:
None
Programming Notes:
Forming the branch target address by catenating PC and index bits rather than adding
a signed offset to the PC is an advantage if all program code addresses fit into a 256 MB
region aligned on a 256 MB boundary. It allows a branch to anywhere in the region
from anywhere in the region which a signed relative offset would not allow.
This definition creates the boundary case where the branch instruction is in the last
word of a 256 MB region and can therefore only branch to the following 256 MB region
containing the branch delay slot.
Encoding: JAL = 000011 (bits 31..26) | instr_index (25..0)
JALR Jump And Link Register
Format: JALR rs (rd = 31 implied) MIPS I
JALR rd, rs
Purpose: To execute a procedure call to an instruction address in a register.
Description: rd ← return_addr, PC ← rs
Place the return address link in GPR rd. The return link is the address of the second
instruction following the branch, where execution would continue after a procedure
call.
Jump to the effective target address in GPR rs. Execute the instruction following the
jump, in the branch delay slot, before jumping.
Restrictions:
Register specifiers rs and rd must not be equal, because such an instruction does not
have the same effect when re-executed. The result of executing such an instruction is
undefined. This restriction permits an exception handler to resume execution by
re-executing the branch when an exception occurs in the branch delay slot.
The effective target address in GPR rs must be naturally aligned. If either of the two
least-significant bits is nonzero, then an Address Error exception occurs, not for the
jump instruction, but when the branch target is subsequently fetched as an instruction.
Operation:
I : temp ← GPR[rs]
GPR[rd] ← PC + 8
I + 1 :PC ← temp
Exceptions:
None
Programming Notes:
This is the only branch-and-link instruction that can select a register for the return link;
all other link instructions use GPR 31. The default register for GPR rd, if omitted in the
assembly language instruction, is GPR 31.
Encoding: SPECIAL = 000000 (bits 31..26) | rs (25..21) | 0 = 00000 (20..16) | rd (15..11) | 0 = 00000 (10..6) | JALR = 001001 (5..0)
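The JALR Operation section reads the target from GPR rs into a temporary before writing the return link, and the Restrictions forbid rs = rd. The Python sketch below models just that ordering and the alignment rule, with the register file as a list of 32 integers. The names jalr and target_is_aligned are hypothetical, and raising ValueError is only this example's way of flagging a case the architecture leaves undefined.

```python
def jalr(pc, gpr, rs, rd=31):
    """Model JALR: return the jump target and write the link into gpr[rd].

    pc is the address of the JALR itself; the link is the address of the
    second instruction after it (pc + 8), past the delay slot.
    """
    if rs == rd:
        # The result is undefined per the Restrictions; flagged here as an error.
        raise ValueError("rs and rd must not be equal")
    temp = gpr[rs]          # I     : temp <- GPR[rs]
    gpr[rd] = pc + 8        #         GPR[rd] <- PC + 8
    return temp             # I + 1 : PC <- temp


def target_is_aligned(target):
    """True when the two least-significant bits are zero; otherwise an
    Address Error would occur when the target is fetched."""
    return target & 0x3 == 0
```

A short usage example: with gpr[4] = 0x1000, calling jalr(0x400, gpr, 4) returns 0x1000 and leaves 0x408 in gpr[31], the default link register.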
JR Jump Register
Format: JR rs MIPS I
Purpose: To branch to an instruction address in a register.
Description: PC ← rs
Jump to the effective target address in GPR rs. Execute the instruction following the
jump, in the branch delay slot, before jumping.
Restrictions:
The effective target address in GPR rs must be naturally aligned. If either of the two
least-significant bits is nonzero, then an Address Error exception occurs, not for the
jump instruction, but when the branch target is subsequently fetched as an instruction.
Operation:
I : temp ← GPR[rs]
I + 1 :PC ← temp
Exceptions:
None
Encoding: SPECIAL = 000000 (bits 31..26) | rs (25..21) | 0 = 000000000000000 (20..6) | JR = 001000 (5..0)
LB Load Byte
Format: LB rt, offset(base) MIPS I
Purpose: To load a byte from memory as a signed value.
Description: rt ← memory[base+offset]
The contents of the 8-bit byte at the memory location specified by the effective address
are fetched, sign-extended, and placed in GPR rt. The 16-bit signed offset is added to
the contents of GPR base to form the effective address.
Restrictions:
None
Operation: 32-bit processors
vAddr ← sign_extend(offset) + GPR[base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, LOAD)
pAddr ← pAddr_{(PSIZE–1)..2} || (pAddr_{1..0} xor ReverseEndian^2)
memword ← LoadMemory (uncached, BYTE, pAddr, vAddr, DATA)
byte ← vAddr_{1..0} xor BigEndianCPU^2
GPR[rt] ← sign_extend(memword_{7+8*byte..8*byte})
Operation: 64-bit processors
vAddr ← sign_extend(offset) + GPR[base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, LOAD)
pAddr ← pAddr_{(PSIZE–1)..3} || (pAddr_{2..0} xor ReverseEndian^3)
memdouble ← LoadMemory (uncached, BYTE, pAddr, vAddr, DATA)
byte ← vAddr_{2..0} xor BigEndianCPU^3
GPR[rt] ← sign_extend(memdouble_{7+8*byte..8*byte})
Exceptions:
TLB Refill, TLB Invalid
Address Error
Encoding: LB = 100000 (bits 31..26) | base (25..21) | rt (20..16) | offset (15..0)
LBU Load Byte Unsigned
Format: LBU rt, offset(base) MIPS I
Purpose: To load a byte from memory as an unsigned value.
Description: rt ← memory[base+offset]
The contents of the 8-bit byte at the memory location specified by the effective address
are fetched, zero-extended, and placed in GPR rt. The 16-bit signed offset is added to
the contents of GPR base to form the effective address.
Restrictions:
None
Operation: 32-bit processors
vAddr ← sign_extend(offset) + GPR[base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, LOAD)
pAddr ← pAddr_{PSIZE-1..2} || (pAddr_{1..0} xor ReverseEndian^2)
memword ← LoadMemory (uncached, BYTE, pAddr, vAddr, DATA)
byte ← vAddr_{1..0} xor BigEndianCPU^2
GPR[rt] ← zero_extend(memword_{7+8*byte..8*byte})
Operation: 64-bit processors
vAddr ← sign_extend(offset) + GPR[base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, LOAD)
pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor ReverseEndian^3)
memdouble ← LoadMemory (uncached, BYTE, pAddr, vAddr, DATA)
byte ← vAddr_{2..0} xor BigEndianCPU^3
GPR[rt] ← zero_extend(memdouble_{7+8*byte..8*byte})
Exceptions:
TLB Refill, TLB Invalid
Address Error
LBU instruction encoding:
Bits     31..26   25..21   20..16   15..0
Field    LBU      base     rt       offset
Value    100100
Width    6        5        5        16
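LB and LBU differ only in how the selected byte is widened. The sketch below models the 32-bit Operation steps for both: it picks the byte lane with vAddr bits 1..0 xor BigEndianCPU replicated twice, then sign-extends or zero-extends. Address translation is left out, memory is modeled as a Python `bytes` object, and the helper names are assumptions made for illustration.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def load_byte(memory: bytes, vaddr: int, big_endian: bool, signed: bool) -> int:
    """Model of LB (signed=True) and LBU (signed=False) on a 32-bit processor."""
    aligned = vaddr & ~0x3
    # Assumption: memword is the aligned word read in the CPU's byte order.
    order = "big" if big_endian else "little"
    memword = int.from_bytes(memory[aligned:aligned + 4], order)
    # byte <- vAddr[1..0] xor BigEndianCPU^2
    byte = (vaddr & 0x3) ^ (0b11 if big_endian else 0b00)
    lane = (memword >> (8 * byte)) & 0xFF
    return sign_extend(lane, 8) if signed else lane


memory = bytes([0x12, 0x80, 0x7F, 0xFF])
print(load_byte(memory, 1, big_endian=True, signed=True))   # LB  gives -128
print(load_byte(memory, 1, big_endian=True, signed=False))  # LBU gives 128
```

Because the lane index flips together with the byte order, the byte at a given address comes out the same under either endianness.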
LD Load Doubleword
Format: LD rt, offset(base) MIPS III
Purpose: To load a doubleword from memory.
Description: rt ← memory[base+offset]
The contents of the 64-bit doubleword at the memory location specified by the aligned
effective address are fetched and placed in GPR rt. The 16-bit signed offset is added to
the contents of GPR base to form the effective address.
Restrictions:
The effective address must be naturally aligned. If any of the three least-significant bits
of the address are non-zero, an Address Error exception occurs.
MIPS IV: The low-order 3 bits of the offset field must be zero. If they are not, the result
of the instruction is undefined.
Operation: 64-bit processors
vAddr ← sign_extend(offset) + GPR[base]
if (vAddr_{2..0}) ≠ 0^3 then SignalException(AddressError) endif
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, LOAD)
memdouble ← LoadMemory (uncached, DOUBLEWORD, pAddr, vAddr, DATA)
GPR[rt] ← memdouble
Exceptions:
TLB Refill, TLB Invalid
Bus Error
Address Error
Reserved Instruction
LD instruction encoding:
Bits     31..26   25..21   20..16   15..0
Field    LD       base     rt       offset
Value    110111
Width    6        5        5        16
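The natural-alignment restriction on LD, and on the halfword and word loads later in this section, comes down to testing the low address bits against the access size. A small sketch of that test follows. `AddressError` here is a Python stand-in for the Address Error exception, and `check_aligned` is a hypothetical helper name.

```python
class AddressError(Exception):
    """Stand-in for the Address Error exception."""


def check_aligned(vaddr: int, size: int) -> None:
    """Raise AddressError unless vaddr is a multiple of size (a power of two)."""
    if size <= 0 or size & (size - 1):
        raise ValueError("size must be a power of two")
    # For size 8 this tests vAddr bits 2..0, for size 4 bits 1..0, for size 2 bit 0.
    if vaddr & (size - 1):
        raise AddressError(f"address {vaddr:#x} is not {size}-byte aligned")


check_aligned(0x1000, 8)      # doubleword aligned, no exception
try:
    check_aligned(0x1004, 8)  # low three bits are 100, so LD would fault
except AddressError as exc:
    print(exc)
```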
Load Doubleword to Coprocessor LDCz
Format: LDC1 rt, offset(base) MIPS II
LDC2 rt, offset(base)
Purpose: To load a doubleword from memory to a coprocessor general register.
Description: rt ← memory[base+offset]
The contents of the 64-bit doubleword at the memory location specified by the aligned
effective address are fetched and made available to coprocessor unit z. The 16-bit
signed offset is added to the contents of GPR base to form the effective address.
The manner in which each coprocessor uses the data is defined by the individual
coprocessor specifications. The usual operation would place the data into coprocessor
general register rt.
Each MIPS architecture level defines up to 4 coprocessor units, numbered 0 to 3 (see
Coprocessor Instructions on page A-11). The opcodes corresponding to coprocessors
that are not defined by an architecture level may be used for other instructions.
Restrictions:
Access to the coprocessors is controlled by system software. Each coprocessor has a
“coprocessor usable” bit in the System Control coprocessor. The usable bit must be set
for a user program to execute a coprocessor instruction. If the usable bit is not set, an
attempt to execute the instruction will result in a Coprocessor Unusable exception. An
unimplemented coprocessor must never be enabled. The result of executing this
instruction for an unimplemented coprocessor when the usable bit is set, is undefined.
This instruction is not available for coprocessor 0, the System Control coprocessor, and
the opcode may be used for other instructions.
The effective address must be naturally aligned. If any of the three least-significant bits
of the effective address are non-zero, an Address Error exception occurs.
MIPS IV: The low-order 3 bits of the offset field must be zero. If they are not, the result
of the instruction is undefined.
Operation: 32-bit processors
vAddr ← sign_extend(offset) + GPR[base]
if (vAddr_{2..0}) ≠ 0^3 then SignalException(AddressError) endif
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, LOAD)
memdouble ← LoadMemory (uncached, DOUBLEWORD, pAddr, vAddr, DATA)
COP_LD (z, rt, memdouble)
LDCz instruction encoding:
Bits     31..26   25..21   20..16   15..0
Field    LDCz     base     rt       offset
Value    1101zz
Width    6        5        5        16
Operation: 64-bit processors
vAddr ← sign_extend(offset) + GPR[base]
if (vAddr_{2..0}) ≠ 0^3 then SignalException(AddressError) endif
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, LOAD)
memdouble ← LoadMemory (uncached, DOUBLEWORD, pAddr, vAddr, DATA)
COP_LD (z, rt, memdouble)
Exceptions:
TLB Refill, TLB Invalid
Bus Error
Address Error
Reserved Instruction
Coprocessor Unusable
Load Doubleword Left LDL
Format: LDL rt, offset(base) MIPS III
Purpose: To load the most-significant part of a doubleword from an unaligned
memory address.
Description: rt ← rt MERGE memory[base+offset]
The 16-bit signed offset is added to the contents of GPR base to form an effective address
(EffAddr). EffAddr is the address of the most-significant of eight consecutive bytes
forming a doubleword in memory (DW) starting at an arbitrary byte boundary. A part
of DW, the most-significant one to eight bytes, is in the aligned doubleword containing
EffAddr. This part of DW is loaded appropriately into the most-significant (left) part of
GPR rt leaving the remainder of GPR rt unchanged.
The figure below illustrates this operation for big-endian byte ordering. The eight
consecutive bytes in 2..9 form an unaligned doubleword starting at location 2. A part
of DW, six bytes, is contained in the aligned doubleword containing the most-
significant byte at 2. First, LDL loads these six bytes into the left part of the destination
register and leaves the remainder of the destination unchanged. Next, the
complementary LDR loads the remainder of the unaligned doubleword.
LDL instruction encoding:
Bits     31..26   25..21   20..16   15..0
Field    LDL      base     rt       offset
Value    011010
Width    6        5        5        16
Figure A-2 Unaligned Doubleword Load using LDL and LDR.
Doubleword at byte 2 in memory, big-endian byte order; each memory byte contains its address.
Register bytes are listed from most significant to least significant.
Memory (byte addresses):          0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
GPR 24, initial contents:         a b c d e f g h
After executing LDL $24,2($0):    2 3 4 5 6 7 g h
Then after LDR $24,9($0):         2 3 4 5 6 7 8 9
The bytes loaded from memory to the destination register depend on both the offset of
the effective address within an aligned doubleword, i.e. the low three bits of the
address (vAddr_{2..0}), and the current byte ordering mode of the processor (big- or
little-endian). The table below shows the bytes loaded for every combination of offset
and byte ordering.
Restrictions:
None
Operation: 64-bit processors
vAddr ← sign_extend(offset) + GPR[base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, LOAD)
pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor ReverseEndian^3)
if BigEndianMem = 0 then
    pAddr ← pAddr_{PSIZE-1..3} || 0^3
endif
byte ← vAddr_{2..0} xor BigEndianCPU^3
memdouble ← LoadMemory (uncached, byte, pAddr, vAddr, DATA)
GPR[rt] ← memdouble_{7+8*byte..0} || GPR[rt]_{55-8*byte..0}
Exceptions:
TLB Refill, TLB Invalid
Bus Error
Address Error
Reserved Instruction
Table A-28 Bytes Loaded by LDL Instruction
Memory contents and byte offsets (vAddr_{2..0}), most to least significant:
    big-endian offset       0 1 2 3 4 5 6 7
    memory contents         I J K L M N O P
    little-endian offset    7 6 5 4 3 2 1 0
Initial contents of destination register, most to least significant: a b c d e f g h
Destination register contents after instruction (lowercase bytes are unchanged):
vAddr_{2..0}    Big-endian byte ordering    Little-endian byte ordering
0               I J K L M N O P             P b c d e f g h
1               J K L M N O P h             O P c d e f g h
2               K L M N O P g h             N O P d e f g h
3               L M N O P f g h             M N O P e f g h
4               M N O P e f g h             L M N O P f g h
5               N O P d e f g h             K L M N O P g h
6               O P c d e f g h             J K L M N O P h
7               P b c d e f g h             I J K L M N O P
Load Doubleword Right LDR
Format: LDR rt, offset(base) MIPS III
Purpose: To load the least-significant part of a doubleword from an unaligned
memory address.
Description: rt ← rt MERGE memory[base+offset]
The 16-bit signed offset is added to the contents of GPR base to form an effective address
(EffAddr). EffAddr is the address of the least-significant of eight consecutive bytes
forming a doubleword in memory (DW) starting at an arbitrary byte boundary. A part
of DW, the least-significant one to eight bytes, is in the aligned doubleword containing
EffAddr. This part of DW is loaded appropriately into the least-significant (right) part
of GPR rt leaving the remainder of GPR rt unchanged.
The figure below illustrates this operation for big-endian byte ordering. The eight
consecutive bytes in 2..9 form an unaligned doubleword starting at location 2. A part
of DW, two bytes, is contained in the aligned doubleword containing the least-
significant byte at 9. First, LDR loads these two bytes into the right part of the
destination register and leaves the remainder of the destination unchanged. Next, the
complementary LDL loads the remainder of the unaligned doubleword.
LDR instruction encoding:
Bits     31..26   25..21   20..16   15..0
Field    LDR      base     rt       offset
Value    011011
Width    6        5        5        16
Figure A-3 Unaligned Doubleword Load using LDR and LDL.
Doubleword at byte 2 in memory, big-endian byte order; each memory byte contains its address.
Register bytes are listed from most significant to least significant.
Memory (byte addresses):          0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
GPR 24, initial contents:         a b c d e f g h
After executing LDR $24,9($0):    a b c d e f 8 9
Then after LDL $24,2($0):         2 3 4 5 6 7 8 9
The bytes loaded from memory to the destination register depend on both the offset of
the effective address within an aligned doubleword, i.e. the low three bits of the
address (vAddr_{2..0}), and the current byte ordering mode of the processor (big- or
little-endian). The table below shows the bytes loaded for every combination of offset
and byte ordering.
Restrictions:
None
Operation: 64-bit processors
vAddr ← sign_extend(offset) + GPR[base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, LOAD)
pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor ReverseEndian^3)
if BigEndianMem = 1 then
    pAddr ← pAddr_{PSIZE-1..3} || 0^3
endif
byte ← vAddr_{2..0} xor BigEndianCPU^3
memdouble ← LoadMemory (uncached, byte, pAddr, vAddr, DATA)
GPR[rt] ← GPR[rt]_{63..64-8*byte} || memdouble_{63..8*byte}
Exceptions:
TLB Refill, TLB Invalid
Bus Error
Address Error
Reserved Instruction
Table A-29 Bytes Loaded by LDR Instruction
Memory contents and byte offsets (vAddr_{2..0}), most to least significant:
    big-endian offset       0 1 2 3 4 5 6 7
    memory contents         I J K L M N O P
    little-endian offset    7 6 5 4 3 2 1 0
Initial contents of destination register, most to least significant: a b c d e f g h
Destination register contents after instruction (lowercase bytes are unchanged):
vAddr_{2..0}    Big-endian byte ordering    Little-endian byte ordering
0               a b c d e f g I             I J K L M N O P
1               a b c d e f I J             a I J K L M N O
2               a b c d e I J K             a b I J K L M N
3               a b c d I J K L             a b c I J K L M
4               a b c I J K L M             a b c d I J K L
5               a b I J K L M N             a b c d e I J K
6               a I J K L M N O             a b c d e f I J
7               I J K L M N O P             a b c d e f g I
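The LDL and LDR merge rules can be checked against the figures and tables with a small model. The sketch below follows the final two steps of each Operation section (the byte computation and the register merge). It skips address translation and models memory as a Python `bytes` object holding at least the two aligned doublewords involved. The function names are assumptions for illustration only.

```python
MASK64 = (1 << 64) - 1


def _aligned_doubleword(memory: bytes, vaddr: int, big_endian: bool) -> int:
    """Read the aligned doubleword containing vaddr in the CPU's byte order."""
    aligned = vaddr & ~0x7
    return int.from_bytes(memory[aligned:aligned + 8], "big" if big_endian else "little")


def ldl(reg: int, memory: bytes, vaddr: int, big_endian: bool = True) -> int:
    """GPR[rt] <- memdouble[7+8*byte..0] || GPR[rt][55-8*byte..0]"""
    memdouble = _aligned_doubleword(memory, vaddr, big_endian)
    byte = (vaddr & 0x7) ^ (0b111 if big_endian else 0b000)
    loaded_bits = 8 * (byte + 1)
    keep_bits = 64 - loaded_bits
    high = (memdouble & ((1 << loaded_bits) - 1)) << keep_bits
    low = reg & ((1 << keep_bits) - 1)
    return high | low


def ldr(reg: int, memory: bytes, vaddr: int, big_endian: bool = True) -> int:
    """GPR[rt] <- GPR[rt][63..64-8*byte] || memdouble[63..8*byte]"""
    memdouble = _aligned_doubleword(memory, vaddr, big_endian)
    byte = (vaddr & 0x7) ^ (0b111 if big_endian else 0b000)
    keep_bits = 8 * byte
    high = reg & ~((1 << (64 - keep_bits)) - 1) & MASK64
    low = memdouble >> keep_bits
    return high | low


# Each memory byte contains its own address, as in Figures A-2 and A-3.
memory = bytes(range(16))
reg = int.from_bytes(b"abcdefgh", "big")
reg = ldl(reg, memory, 2)   # LDL $24,2($0)
reg = ldr(reg, memory, 9)   # LDR $24,9($0)
print(f"{reg:016x}")        # prints 0203040506070809
```

In little-endian mode the two addresses swap roles: LDL takes the address of the most-significant byte (here 9) and LDR the address of the least-significant byte (here 2).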
Load Halfword LH
Format: LH rt, offset(base) MIPS I
Purpose: To load a halfword from memory as a signed value.
Description: rt ← memory[base+offset]
The contents of the 16-bit halfword at the memory location specified by the aligned
effective address are fetched, sign-extended, and placed in GPR rt. The 16-bit signed
offset is added to the contents of GPR base to form the effective address.
Restrictions:
The effective address must be naturally aligned. If the least-significant bit of the
address is non-zero, an Address Error exception occurs.
MIPS IV: The low-order bit of the offset field must be zero. If it is not, the result of the
instruction is undefined.
Operation: 32-bit processors
vAddr ← sign_extend(offset) + GPR[base]
if (vAddr_0) ≠ 0 then SignalException(AddressError) endif
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, LOAD)
pAddr ← pAddr_{PSIZE-1..2} || (pAddr_{1..0} xor (ReverseEndian || 0))
memword ← LoadMemory (uncached, HALFWORD, pAddr, vAddr, DATA)
byte ← vAddr_{1..0} xor (BigEndianCPU || 0)
GPR[rt] ← sign_extend(memword_{15+8*byte..8*byte})
Operation: 64-bit processors
vAddr ← sign_extend(offset) + GPR[base]
if (vAddr_0) ≠ 0 then SignalException(AddressError) endif
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, LOAD)
pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor (ReverseEndian^2 || 0))
memdouble ← LoadMemory (uncached, HALFWORD, pAddr, vAddr, DATA)
byte ← vAddr_{2..0} xor (BigEndianCPU^2 || 0)
GPR[rt] ← sign_extend(memdouble_{15+8*byte..8*byte})
Exceptions:
TLB Refill, TLB Invalid
Bus Error
Address Error
LH instruction encoding:
Bits     31..26   25..21   20..16   15..0
Field    LH       base     rt       offset
Value    100001
Width    6        5        5        16
LHU Load Halfword Unsigned
Format: LHU rt, offset(base) MIPS I
Purpose: To load a halfword from memory as an unsigned value.
Description: rt ← memory[base+offset]
The contents of the 16-bit halfword at the memory location specified by the aligned
effective address are fetched, zero-extended, and placed in GPR rt. The 16-bit signed
offset is added to the contents of GPR base to form the effective address.
Restrictions:
The effective address must be naturally aligned. If the least-significant bit of the
address is non-zero, an Address Error exception occurs.
MIPS IV: The low-order bit of the offset field must be zero. If it is not, the result of the
instruction is undefined.
Operation: 32-bit processors
vAddr ← sign_extend(offset) + GPR[base]
if (vAddr_0) ≠ 0 then SignalException(AddressError) endif
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, LOAD)
pAddr ← pAddr_{PSIZE-1..2} || (pAddr_{1..0} xor (ReverseEndian || 0))
memword ← LoadMemory (uncached, HALFWORD, pAddr, vAddr, DATA)
byte ← vAddr_{1..0} xor (BigEndianCPU || 0)
GPR[rt] ← zero_extend(memword_{15+8*byte..8*byte})
Operation: 64-bit processors
vAddr ← sign_extend(offset) + GPR[base]
if (vAddr_0) ≠ 0 then SignalException(AddressError) endif
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, LOAD)
pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor (ReverseEndian^2 || 0))
memdouble ← LoadMemory (uncached, HALFWORD, pAddr, vAddr, DATA)
byte ← vAddr_{2..0} xor (BigEndianCPU^2 || 0)
GPR[rt] ← zero_extend(memdouble_{15+8*byte..8*byte})
Exceptions:
TLB Refill, TLB Invalid
Address Error
LHU instruction encoding:
Bits     31..26   25..21   20..16   15..0
Field    LHU      base     rt       offset
Value    100101
Width    6        5        5        16
Load Linked Word LL
Format: LL rt, offset(base) MIPS II
Purpose: To load a word from memory for an atomic read-modify-write.
Description: rt ← memory[base+offset]
The LL and SC instructions provide primitives to implement atomic Read-Modify-
Write (RMW) operations for cached memory locations.
The 16-bit signed offset is added to the contents of GPR base to form an effective
address.
The contents of the 32-bit word at the memory location specified by the aligned
effective address are fetched, sign-extended to the GPR register length if necessary, and
written into GPR rt. This begins a RMW sequence on the current processor.
There is one active RMW sequence per processor. When an LL is executed it starts the
active RMW sequence replacing any other sequence that was active.
The RMW sequence is completed by a subsequent SC instruction that either completes
the RMW sequence atomically and succeeds, or does not and fails. See the description
of SC for a list of events and conditions that cause the SC to fail and an example
instruction sequence using LL and SC.
Executing LL on one processor does not cause an action that, by itself, would cause an
SC for the same block to fail on another processor.
An execution of LL does not have to be followed by execution of SC; a program is free
to abandon the RMW sequence without attempting a write.
Restrictions:
The addressed location must be cached; if it is not, the result is undefined (see Memory
Access Types on page A-12).
The effective address must be naturally aligned. If either of the two least-significant
bits of the effective address is non-zero, an Address Error exception occurs.
MIPS IV: The low-order 2 bits of the offset field must be zero. If they are not, the result
of the instruction is undefined.
LL instruction encoding:
Bits     31..26   25..21   20..16   15..0
Field    LL       base     rt       offset
Value    110000
Width    6        5        5        16
Operation: 32-bit processors
vAddr ← sign_extend(offset) + GPR[base]
if (vAddr_{1..0}) ≠ 0^2 then SignalException(AddressError) endif
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, LOAD)
memword ← LoadMemory (uncached, WORD, pAddr, vAddr, DATA)
GPR[rt] ← memword
LLbit ← 1
Operation: 64-bit processors
vAddr ← sign_extend(offset) + GPR[base]
if (vAddr_{1..0}) ≠ 0^2 then SignalException(AddressError) endif
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, LOAD)
pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor (ReverseEndian || 0^2))
memdouble ← LoadMemory (uncached, WORD, pAddr, vAddr, DATA)
byte ← vAddr_{2..0} xor (BigEndianCPU || 0^2)
GPR[rt] ← sign_extend(memdouble_{31+8*byte..8*byte})
LLbit ← 1
Exceptions:
TLB Refill, TLB Invalid
Address Error
Reserved Instruction
Programming Notes:
There is no Load Linked Word Unsigned operation corresponding to Load Word
Unsigned.
Implementation Notes:
An LL on one processor must not take action that, by itself, would cause an SC for the
same block on another processor to fail. If an implementation depends on retaining the
data in cache during the RMW sequence, cache misses caused by LL must not fetch
data in the exclusive state, thus removing it from the cache, if it is present in another
cache.
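The interplay between LL, the LLbit, and a later SC can be pictured with a toy model. The sketch below is heavily simplified and rests on assumptions: SC is described on its own page, so the failure rule used here (SC fails when the LLbit has been cleared, for instance by another processor's store to the same location) is only one illustrative condition, and `ToyProcessor`, `observe_store`, and `atomic_increment` are hypothetical names.

```python
class ToyProcessor:
    """One processor with a single active RMW sequence tracked by llbit."""

    def __init__(self, memory: dict):
        self.memory = memory      # shared word-addressed memory (assumption)
        self.llbit = False
        self.ll_addr = None

    def ll(self, addr: int) -> int:
        # Starting a sequence replaces any sequence that was already active.
        self.llbit = True
        self.ll_addr = addr
        return self.memory[addr]

    def sc(self, addr: int, value: int) -> bool:
        # Assumed rule: succeed only if the sequence is still intact.
        ok = self.llbit and self.ll_addr == addr
        if ok:
            self.memory[addr] = value
        self.llbit = False
        return ok

    def observe_store(self, addr: int) -> None:
        # Called when another processor writes addr; breaks the sequence.
        if self.ll_addr == addr:
            self.llbit = False


def atomic_increment(cpu: ToyProcessor, addr: int) -> int:
    """Retry the LL/SC pair until the store succeeds."""
    while True:
        value = cpu.ll(addr)
        if cpu.sc(addr, value + 1):
            return value + 1


memory = {0: 5}
cpu = ToyProcessor(memory)
print(atomic_increment(cpu, 0))  # prints 6
```

A program may also call `ll` and never call `sc`, which matches the statement that an RMW sequence can be abandoned without attempting a write.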
Load Linked Doubleword LLD
Format: LLD rt, offset(base) MIPS III
Purpose: To load a doubleword from memory for an atomic read-modify-write.
Description: rt ← memory[base+offset]
The LLD and SCD instructions provide primitives to implement atomic Read-Modify-
Write (RMW) operations for cached memory locations.
The 16-bit signed offset is added to the contents of GPR base to form an effective
address.
The contents of the 64-bit doubleword at the memory location specified by the aligned
effective address are fetched and written into GPR rt. This begins a RMW sequence on
the current processor.
There is one active RMW sequence per processor. When an LLD is executed it starts
the active RMW sequence replacing any other sequence that was active.
The RMW sequence is completed by a subsequent SCD instruction that either
completes the RMW sequence atomically and succeeds, or does not and fails. See the
description of SCD for a list of events and conditions that cause the SCD to fail and an
example instruction sequence using LLD and SCD.
Executing LLD on one processor does not cause an action that, by itself, would cause
an SCD for the same block to fail on another processor.
An execution of LLD does not have to be followed by execution of SCD; a program is
free to abandon the RMW sequence without attempting a write.
Restrictions:
The addressed location must be cached; if it is not, the result is undefined (see Memory
Access Types on page A-12).
The effective address must be naturally aligned. If any of the three least-significant
bits of the effective address are non-zero, an Address Error exception occurs.
MIPS IV: The low-order 3 bits of the offset field must be zero. If they are not, the result
of the instruction is undefined.
LLD instruction encoding:
Bits     31..26   25..21   20..16   15..0
Field    LLD      base     rt       offset
Value    110100
Width    6        5        5        16
Operation: 64-bit processors
vAddr ← sign_extend(offset) + GPR[base]
if (vAddr_{2..0}) ≠ 0^3 then SignalException(AddressError) endif
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, LOAD)
memdouble ← LoadMemory (uncached, DOUBLEWORD, pAddr, vAddr, DATA)
GPR[rt] ← memdouble
LLbit ← 1
Exceptions:
TLB Refill, TLB Invalid
Address Error
Reserved Instruction
Programming Notes:
Implementation Notes:
An LLD on one processor must not take action that, by itself, would cause an SCD for
the same block on another processor to fail. If an implementation depends on retaining
the data in cache during the RMW sequence, cache misses caused by LLD must not
fetch data in the exclusive state, thus removing it from the cache, if it is present in
another cache.
Load Upper Immediate LUI
Format: LUI rt, immediate MIPS I
Purpose: To load a constant into the upper half of a word.
Description: rt ← immediate || 0^16
The 16-bit immediate is shifted left 16 bits and concatenated with 16 bits of low-order
zeros. The 32-bit result is sign-extended and placed into GPR rt.
Restrictions:
None
Operation:
GPR[rt] ← sign_extend(immediate || 0^16)
Exceptions:
None
LUI instruction encoding:
Bits     31..26   25..21   20..16   15..0
Field    LUI      0        rt       immediate
Value    001111   00000
Width    6        5        5        16
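The LUI operation, including the sign extension that matters on 64-bit registers, can be sketched in a few lines. The function name `lui` and its `reg_bits` parameter are illustrative. The closing example pairs LUI with a bitwise OR of a zero-extended 16-bit value, which is how ORI (documented on its own page, not in this section) is commonly used to finish building a 32-bit constant.

```python
def lui(immediate: int, reg_bits: int = 64) -> int:
    """Model GPR[rt] <- sign_extend(immediate || 0^16) as an unsigned register image."""
    word = (immediate & 0xFFFF) << 16
    if reg_bits > 32 and word & 0x80000000:
        # Copy bit 31 into bits reg_bits-1..32.
        word |= ((1 << reg_bits) - 1) & ~0xFFFFFFFF
    return word


print(f"{lui(0x1234):016x}")               # prints 0000000012340000
print(f"{lui(0x8000):016x}")               # prints ffffffff80000000
print(f"{lui(0x8000, reg_bits=32):08x}")   # prints 80000000

# Building 0x12345678: upper half from LUI, lower half OR-ed in afterwards.
print(f"{lui(0x1234) | 0x5678:08x}")       # prints 12345678
```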
LW Load Word
Format: LW rt, offset(base) MIPS I
Purpose: To load a word from memory as a signed value.
Description: rt ← memory[base+offset]
The contents of the 32-bit word at the memory location specified by the aligned
effective address are fetched, sign-extended to the GPR register length if necessary, and
placed in GPR rt. The 16-bit signed offset is added to the contents of GPR base to form
the effective address.
Restrictions:
The effective address must be naturally aligned. If either of the two least-significant
bits of the address are non-zero, an Address Error exception occurs.
MIPS IV: The low-order 2 bits of the offset field must be zero. If they are not, the result
of the instruction is undefined.
Operation: 32-bit processors
vAddr ← sign_extend(offset) + GPR[base]
if (vAddr_{1..0}) ≠ 0^2 then SignalException(AddressError) endif
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, LOAD)
memword ← LoadMemory (uncached, WORD, pAddr, vAddr, DATA)
GPR[rt] ← memword
Operation: 64-bit processors
vAddr ← sign_extend(offset) + GPR[base]
if (vAddr_{1..0}) ≠ 0^2 then SignalException(AddressError) endif
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, LOAD)
pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor (ReverseEndian || 0^2))
memdouble ← LoadMemory (uncached, WORD, pAddr, vAddr, DATA)
byte ← vAddr_{2..0} xor (BigEndianCPU || 0^2)
GPR[rt] ← sign_extend(memdouble_{31+8*byte..8*byte})
Exceptions:
TLB Refill, TLB Invalid
Bus Error
Address Error
LW instruction encoding:
Bits     31..26   25..21   20..16   15..0
Field    LW       base     rt       offset
Value    100011
Width    6        5        5        16
Load Word To Coprocessor LWCz
Format: LWC1 rt, offset(base) MIPS I
LWC2 rt, offset(base)
LWC3 rt, offset(base)
Purpose: To load a word from memory to a coprocessor general register.
Description: rt ← memory[base+offset]
The contents of the 32-bit word at the memory location specified by the aligned
effective address are fetched and made available to coprocessor unit z. The 16-bit
signed offset is added to the contents of GPR base to form the effective address.
The manner in which each coprocessor uses the data is defined by the individual
coprocessor specification. The usual operation would place the data into coprocessor
general register rt.
Each MIPS architecture level defines up to 4 coprocessor units, numbered 0 to 3 (see
Coprocessor Instructions on page A-11). The opcodes corresponding to coprocessors
that are not defined by an architecture level may be used for other instructions.
Restrictions:
Access to the coprocessors is controlled by system software. Each coprocessor has a
“coprocessor usable” bit in the System Control coprocessor. The usable bit must be set
for a user program to execute a coprocessor instruction. If the usable bit is not set, an
attempt to execute the instruction will result in a Coprocessor Unusable exception. An
unimplemented coprocessor must never be enabled. The result of executing this
instruction for an unimplemented coprocessor when the usable bit is set, is undefined.
This instruction is not available for coprocessor 0, the System Control coprocessor, and
the opcode may be used for other instructions.
The effective address must be naturally aligned. If either of the two least-significant
bits of the address are non-zero, an Address Error exception occurs.
MIPS IV: The low-order 2 bits of the offset field must be zero. If they are not, the result
of the instruction is undefined.
Operation: 32-bit processors
I:     vAddr ← sign_extend(offset) + GPR[base]
       if (vAddr_{1..0}) ≠ 0^2 then SignalException(AddressError) endif
       (pAddr, uncached) ← AddressTranslation (vAddr, DATA, LOAD)
       memword ← LoadMemory (uncached, WORD, pAddr, vAddr, DATA)
I + 1: COP_LW (z, rt, memword)
LWCz instruction encoding:
Bits     31..26   25..21   20..16   15..0
Field    LWCz     base     rt       offset
Value    1100zz
Width    6        5        5        16
Operation: 64-bit processors
vAddr ← sign_extend(offset) + GPR[base]
if (vAddr_{1..0}) ≠ 0^2 then SignalException(AddressError) endif
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, LOAD)
pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor (ReverseEndian || 0^2))
memdouble ← LoadMemory (uncached, DOUBLEWORD, pAddr, vAddr, DATA)
byte ← vAddr_{2..0} xor (BigEndianCPU || 0^2)
memword ← memdouble_{31+8*byte..8*byte}
COP_LW (z, rt, memword)
Exceptions:
TLB Refill, TLB Invalid
Bus Error
Address Error
Coprocessor Unusable
Load Word Left LWL
Format: LWL rt, offset(base) MIPS I
Purpose: To load the most-significant part of a word as a signed value from an
unaligned memory address.
Description: rt ← rt MERGE memory[base+offset]
The 16-bit signed offset is added to the contents of GPR base to form an effective address
(EffAddr). EffAddr is the address of the most-significant of four consecutive bytes
forming a word in memory (W) starting at an arbitrary byte boundary. A part of W, the
most-significant one to four bytes, is in the aligned word containing EffAddr. This part
of W is loaded into the most-significant (left) part of the word in GPR rt. The remaining
least-significant part of the word in GPR rt is unchanged.
If GPR rt is a 64-bit register, the destination word is the low-order word of the register.
The loaded value is treated as a signed value; the word sign bit (bit 31) is always loaded
from memory and the new sign bit value is copied into bits 63..32.
LWL instruction encoding:
Bits     31..26   25..21   20..16   15..0
Field    LWL      base     rt       offset
Value    100010
Width    6        5        5        16
Figure A-4 Unaligned Word Load using LWL and LWR.
Word at byte 2 in memory, big-endian byte order; each memory byte contains its address.
Register bytes are listed from most significant to least significant.
Memory initial contents (byte addresses):    0 1 2 3 4 5 6 7 8 9
                                  64-bit GPR 24                       32-bit GPR 24
Initial contents:                 a b c d e f g h                     e f g h
After executing LWL $24,2($0):    sign bit (31) extended, 2 3 g h     2 3 g h
Then after LWR $24,5($0):         sign bit (31) extended, 2 3 4 5     2 3 4 5
The figure above illustrates this operation for big-endian byte ordering for 32-bit and
64-bit registers. The four consecutive bytes in 2..5 form an unaligned word starting at
location 2. A part of W, two bytes, is in the aligned word containing the most-
significant byte at 2. First, LWL loads these two bytes into the left part of the
destination register word and leaves the right part of the destination word unchanged.
Next, the complementary LWR loads the remainder of the unaligned word.
The bytes loaded from memory to the destination register depend on both the offset of
the effective address within an aligned word, i.e. the low two bits of the address
(vAddr_{1..0}), and the current byte ordering mode of the processor (big- or little-endian).
The table below shows the bytes loaded for every combination of offset and byte
ordering.
The unaligned loads, LWL and LWR, are exceptions to the load-delay scheduling
restriction in the MIPS I architecture. An unaligned load instruction to GPR rt that
immediately follows another load to GPR rt can “read” the loaded data. It will
correctly merge the 1 to 4 loaded bytes with the data loaded by the previous
instruction.
Table A-30 Bytes Loaded by LWL Instruction
Memory contents and byte offsets (vAddr_{1..0}), most to least significant:
    big-endian offset       0 1 2 3
    memory contents         I J K L
    little-endian offset    3 2 1 0
Initial contents of destination register, most to least significant:
    64-bit register    a b c d e f g h
    32-bit register    e f g h
Destination 64-bit register contents after instruction (lowercase bytes are unchanged):
vAddr_{1..0}    Big-endian byte ordering           Little-endian byte ordering
0               sign bit (31) extended, I J K L    sign bit (31) extended, L f g h
1               sign bit (31) extended, J K L h    sign bit (31) extended, K L g h
2               sign bit (31) extended, K L g h    sign bit (31) extended, J K L h
3               sign bit (31) extended, L f g h    sign bit (31) extended, I J K L
The word sign bit (31) is always loaded and its value is copied into bits 63..32.
Destination 32-bit register contents after instruction (lowercase bytes are unchanged):
vAddr_{1..0}    Big-endian    Little-endian
0               I J K L       L f g h
1               J K L h       K L g h
2               K L g h       J K L h
3               L f g h       I J K L
Restrictions:
MIPS I scheduling restriction: The loaded data is not available for use by the following
instruction. The instruction immediately following this one, unless it is an unaligned
load (LWL, LWR), may not use GPR rt as a source register. If this restriction is violated,
the result of the operation is undefined.
Operation: 32-bit processors
vAddr ← sign_extend(offset) + GPR[base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, LOAD)
pAddr ← pAddr_{PSIZE-1..2} || (pAddr_{1..0} xor ReverseEndian^2)
if BigEndianMem = 0 then
    pAddr ← pAddr_{PSIZE-1..2} || 0^2
endif
byte ← vAddr_{1..0} xor BigEndianCPU^2
memword ← LoadMemory (uncached, byte, pAddr, vAddr, DATA)
GPR[rt] ← memword_{7+8*byte..0} || GPR[rt]_{23-8*byte..0}
Operation: 64-bit processors
vAddr ← sign_extend(offset) + GPR[base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, LOAD)
pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor ReverseEndian^3)
if BigEndianMem = 0 then
    pAddr ← pAddr_{PSIZE-1..3} || 0^3
endif
byte ← 0 || (vAddr_{1..0} xor BigEndianCPU^2)
word ← vAddr_2 xor BigEndianCPU
memdouble ← LoadMemory (uncached, byte, pAddr, vAddr, DATA)
temp ← memdouble_{31+32*word-8*byte..32*word} || GPR[rt]_{23-8*byte..0}
GPR[rt] ← (temp_31)^32 || temp
Exceptions:
TLB Refill, TLB Invalid
Bus Error
Address Error
Programming Notes:
The architecture provides no direct support for treating unaligned words as unsigned
values, i.e. zeroing bits 63..32 of the destination register when bit 31 is loaded. See SLL
or SLLV for a single-instruction method of propagating the word sign bit in a register
into the upper half of a 64-bit register.
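The unaligned word load described for LWL, together with its complement LWR, can be modeled the same way as the doubleword pair. In the sketch below `lwl` follows the 32-bit Operation section above. `lwr` is written by analogy with the LDR merge rule narrowed to a word, which is an assumption because the LWR Operation section is not reproduced here. `sign_extend_word` models copying bit 31 into bits 63..32 on a 64-bit register. All names are illustrative.

```python
def _aligned_word(memory: bytes, vaddr: int, big_endian: bool) -> int:
    """Read the aligned word containing vaddr in the CPU's byte order."""
    aligned = vaddr & ~0x3
    return int.from_bytes(memory[aligned:aligned + 4], "big" if big_endian else "little")


def lwl(reg: int, memory: bytes, vaddr: int, big_endian: bool = True) -> int:
    """GPR[rt] <- memword[7+8*byte..0] || GPR[rt][23-8*byte..0] (32-bit result)."""
    memword = _aligned_word(memory, vaddr, big_endian)
    byte = (vaddr & 0x3) ^ (0b11 if big_endian else 0b00)
    loaded_bits = 8 * (byte + 1)
    keep_bits = 32 - loaded_bits
    return ((memword & ((1 << loaded_bits) - 1)) << keep_bits) | (reg & ((1 << keep_bits) - 1))


def lwr(reg: int, memory: bytes, vaddr: int, big_endian: bool = True) -> int:
    """Assumed rule: GPR[rt][31..32-8*byte] || memword[31..8*byte] (32-bit result)."""
    memword = _aligned_word(memory, vaddr, big_endian)
    byte = (vaddr & 0x3) ^ (0b11 if big_endian else 0b00)
    keep_bits = 8 * byte
    high = reg & 0xFFFFFFFF & ~((1 << (32 - keep_bits)) - 1)
    return high | (memword >> keep_bits)


def sign_extend_word(word: int) -> int:
    """Copy bit 31 into bits 63..32, as a 64-bit register would hold the word."""
    return word | 0xFFFFFFFF00000000 if word & 0x80000000 else word


# Each memory byte contains its own address, as in Figure A-4.
memory = bytes(range(8))
reg = int.from_bytes(b"efgh", "big")
reg = lwl(reg, memory, 2)   # LWL $24,2($0)
reg = lwr(reg, memory, 5)   # LWR $24,5($0)
print(f"{reg:08x}")         # prints 02030405
```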
LWR Load Word Right
Format: LWR rt, offset(base) MIPS I
Purpose: To load the least-significant part of a word from an unaligned memory
address as a signed value.
Description: rt ← rt MERGE memory[base+offset]
The 16-bit signed offset is added to the contents of GPR base to form an effective address
(EffAddr). EffAddr is the address of the least-significant of four consecutive bytes
forming a word in memory (W) starting at an arbitrary byte boundary. A part of W, the
least-significant one to four bytes, is in the aligned word containing EffAddr. This part
of W is loaded into the least-significant (right) part of the word in GPR rt. The
remaining most-significant part of the word in GPR rt is unchanged.
If GPR rt is a 64-bit register, the destination word is the low-order word of the register.
The loaded value is treated as a signed value; if the word sign bit (bit 31) is loaded (i.e.
when all four bytes are loaded) then the new sign bit value is copied into bits 63..32. If
bit 31 is not loaded then the value of bits 63..32 is implementation dependent; the value
is either unchanged or a copy of the current value of bit 31. Executing both LWR and
LWL, in either order, delivers a sign-extended word value in the destination register.
Encoding:
   Bits:    31..26   25..21   20..16   15..0
   Field:   LWR      base     rt       offset
   Value:   100110
   Width:   6        5        5        16

Figure A-5 Unaligned Word Load using LWR and LWL.
Word at byte 2 in memory, big-endian byte order; each memory byte contains its address.
Bytes are listed from most to least significance.
   Memory initial contents:         0 1 2 3 4 5 6 7 8 9
   GPR 24 initial contents:         32-bit: e f g h    64-bit: a b c d e f g h
   After executing LWR $24,5($0):   32-bit: e f 4 5    64-bit: (no change or sign-extend) e f 4 5
   Then after LWL $24,2($0):        32-bit: 2 3 4 5    64-bit: (sign bit 31 extended) 2 3 4 5
The figure above illustrates this operation for big-endian byte ordering for 32-bit and
64-bit registers. The four consecutive bytes in 2..5 form an unaligned word starting at
location 2. A part of W, two bytes, is in the aligned word containing the least-
significant byte at 5. First, LWR loads these two bytes into the right part of the
destination register. Next, the complementary LWL loads the remainder of the
unaligned word.
The bytes loaded from memory to the destination register depend on both the offset of
the effective address within an aligned word, i.e. the low two bits of the address
(vAddr_{1..0}), and the current byte ordering mode of the processor (big- or little-endian).
The table below shows the bytes loaded for every combination of offset and byte
ordering.
The unaligned loads, LWL and LWR, are exceptions to the load-delay scheduling
restriction in the MIPS I architecture. An unaligned load to GPR rt that immediately
follows another load to GPR rt can “read” the loaded data. It will correctly merge the
1 to 4 loaded bytes with the data loaded by the previous instruction.
Table A-31 Bytes Loaded by LWR Instruction

Memory contents and byte offsets (vAddr_{1..0}), most to least significance:
   Memory bytes:           I J K L
   Big-endian offset:      0 1 2 3
   Little-endian offset:   3 2 1 0

Initial contents of destination register, most to least significance:
   64-bit register:        a b c d e f g h
   32-bit register:        e f g h

Destination 64-bit register contents after instruction (lowercase bytes are unchanged):
   vAddr_{1..0}   Big-endian byte ordering               Little-endian byte ordering
                  bits 63..32              bits 31..0    bits 63..32              bits 31..0
   0              no change or sign-extend e f g I       sign bit (31) extended   I J K L
   1              no change or sign-extend e f I J       no change or sign-extend e I J K
   2              no change or sign-extend e I J K       no change or sign-extend e f I J
   3              sign bit (31) extended   I J K L       no change or sign-extend e f g I

When the word sign bit (31) is loaded, its value is copied into bits 63..32. When it
is not loaded, the behavior is implementation specific: bits 63..32 are either
unchanged or the value of the unloaded bit 31 is copied into them.

Destination 32-bit register contents after instruction:
   vAddr_{1..0}   Big-endian   Little-endian
   0              e f g I      I J K L
   1              e f I J      e I J K
   2              e I J K      e f I J
   3              I J K L      e f g I
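The merge behavior tabulated above can be sketched as a small model. This is an illustrative sketch only, assuming a 32-bit register and following Table A-31 and Figure A-5 rather than the Operation pseudocode. The function names (lwr32, lwl32, _aligned_word) are hypothetical, and the LWL behavior is assumed from its role as the complement of LWR.

```python
def _aligned_word(mem, addr, big_endian):
    """Return the aligned 32-bit word containing addr, as an integer."""
    base = addr & ~3
    return int.from_bytes(mem[base:base + 4], "big" if big_endian else "little")


def lwr32(reg, mem, addr, big_endian=True):
    """Merge the least-significant part of an unaligned word into reg."""
    k = addr & 3                                # low two bits of the address
    nbytes = k + 1 if big_endian else 4 - k     # bytes loaded, per Table A-31
    word = _aligned_word(mem, addr, big_endian)
    loaded = word >> (8 * (4 - nbytes))         # most-significant nbytes of the word
    mask = (1 << (8 * nbytes)) - 1              # register bytes that get replaced
    return (reg & ~mask & 0xFFFFFFFF) | loaded


def lwl32(reg, mem, addr, big_endian=True):
    """Merge the most-significant part of an unaligned word into reg (assumed)."""
    k = addr & 3
    nbytes = 4 - k if big_endian else k + 1
    word = _aligned_word(mem, addr, big_endian)
    keep = (1 << (8 * (4 - nbytes))) - 1        # low register bytes left unchanged
    loaded = (word << (8 * (4 - nbytes))) & 0xFFFFFFFF
    return loaded | (reg & keep)


# Figure A-5: each memory byte contains its own address, big-endian ordering.
mem = bytes(range(10))
reg = lwr32(0xAABBCCDD, mem, 5)   # right part: bytes 4 and 5
reg = lwl32(reg, mem, 2)          # left part: bytes 2 and 3
```

In this model the LWR/LWL pair can be issued in either order and yields the same unaligned word, which matches the statement in the Description.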
Restrictions:
MIPS I scheduling restriction: The loaded data is not available for use by the following
instruction. The instruction immediately following this one, unless it is an unaligned
load (LWL, LWR), may not use GPR rt as a source register. If this restriction is violated,
the result of the operation is undefined.
Operation: 32-bit processors
vAddr ← sign_extend(offset) + GPR[base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, LOAD)
pAddr ← pAddr_{(PSIZE-1)..2} || (pAddr_{1..0} xor ReverseEndian^2)
if BigEndianMem = 0 then
    pAddr ← pAddr_{(PSIZE-1)..2} || 0^2
endif
byte ← vAddr_{1..0} xor BigEndianCPU^2
memword ← LoadMemory (uncached, byte, pAddr, vAddr, DATA)
GPR[rt] ← memword_{31..32-8*byte} || GPR[rt]_{31-8*byte..0}
Operation: 64-bit processors
vAddr ← sign_extend(offset) + GPR[base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, LOAD)
pAddr ← pAddr_{(PSIZE-1)..3} || (pAddr_{2..0} xor ReverseEndian^3)
if BigEndianMem = 1 then
    pAddr ← pAddr_{(PSIZE-1)..3} || 0^3
endif
byte ← vAddr_{1..0} xor BigEndianCPU^2
word ← vAddr_2 xor BigEndianCPU
memdouble ← LoadMemory (uncached, 0 || byte, pAddr, vAddr, DATA)
temp ← GPR[rt]_{31..32-8*byte} || memdouble_{31+32*word..32*word+8*byte}
if byte = 4 then
    utemp ← (temp_31)^32 /* loaded bit 31, must sign extend */
else
    one of the following two behaviors:
    utemp ← GPR[rt]_{63..32} /* leave what was there alone */
    utemp ← (GPR[rt]_31)^32 /* sign-extend bit 31 */
endif
GPR[rt] ← utemp || temp
Exceptions:
TLB Refill, TLB Invalid
Bus Error
Address Error
Programming Notes:
The architecture provides no direct support for treating unaligned words as unsigned
values, i.e. zeroing bits 63..32 of the destination register when bit 31 is loaded. See SLL
or SLLV for a single-instruction method of propagating the word sign bit in a register
into the upper half of a 64-bit register.
LWU Load Word Unsigned
Format: LWU rt, offset(base) MIPS III
Purpose: To load a word from memory as an unsigned value.
Description: rt ← memory[base+offset]
The contents of the 32-bit word at the memory location specified by the aligned
effective address are fetched, zero-extended, and placed in GPR rt. The 16-bit signed
offset is added to the contents of GPR base to form the effective address.
Restrictions:
The effective address must be naturally aligned. If either of the two least-significant
bits of the address is non-zero, an Address Error exception occurs.
MIPS IV: The low-order 2 bits of the offset field must be zero. If they are not, the result
of the instruction is undefined.
Operation: 64-bit processors
vAddr ← sign_extend(offset) + GPR[base]
if (vAddr_{1..0}) ≠ 0^2 then SignalException(AddressError) endif
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, LOAD)
pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor (ReverseEndian || 0^2))
memdouble ← LoadMemory (uncached, WORD, pAddr, vAddr, DATA)
byte ← vAddr_{2..0} xor (BigEndianCPU || 0^2)
GPR[rt] ← 0^32 || memdouble_{31+8*byte..8*byte}
Exceptions:
TLB Refill, TLB Invalid
Bus Error
Address Error
Reserved Instruction
Encoding:
   Bits:    31..26   25..21   20..16   15..0
   Field:   LWU      base     rt       offset
   Value:   100111
   Width:   6        5        5        16
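A brief sketch may help contrast the zero-extension performed by LWU with the sign-extension of an ordinary word load. It models memory as a Python bytes object. The function names are hypothetical, and the sign-extending lw shown for comparison is an assumption based on the LW instruction documented elsewhere in this manual.

```python
def lwu(mem, vaddr, big_endian=True):
    """Model of LWU: load an aligned 32-bit word and zero-extend it to 64 bits."""
    if vaddr & 0b11:
        raise ValueError("Address Error: effective address is not word-aligned")
    return int.from_bytes(mem[vaddr:vaddr + 4], "big" if big_endian else "little")


def lw(mem, vaddr, big_endian=True):
    """Comparison model (assumed): the same load, sign-extended to 64 bits."""
    word = lwu(mem, vaddr, big_endian)
    if word & 0x80000000:
        word |= 0xFFFFFFFF00000000   # copy bit 31 into bits 63..32
    return word


mem = bytes([0x80, 0x00, 0x00, 0x01])
unsigned_value = lwu(mem, 0)   # bits 63..32 are zero
signed_value = lw(mem, 0)      # bits 63..32 are copies of bit 31
```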
Move From HI Register MFHI
Format: MFHI rd MIPS I
Purpose: To copy the special purpose HI register to a GPR.
Description: rd ← HI
The contents of special register HI are loaded into GPR rd.
Restrictions:
The two instructions that follow an MFHI instruction must not be instructions that
modify the HI register: DDIV, DDIVU, DIV, DIVU, DMULT, DMULTU, MTHI, MULT,
MULTU. If this restriction is violated, the result of the MFHI is undefined.
Operation:
GPR[rd] ← HI
Exceptions:
None
Encoding:
   Bits:    31..26    25..16       15..11   10..6   5..0
   Field:   SPECIAL   0            rd       0       MFHI
   Value:   000000    0000000000            00000   010000
   Width:   6         10           5        5       6
MFLO Move From LO Register
Format: MFLO rd MIPS I
Purpose: To copy the special purpose LO register to a GPR.
Description: rd ← LO
The contents of special register LO are loaded into GPR rd.
Restrictions:
The two instructions that follow an MFLO instruction must not be instructions that
modify the LO register: DDIV, DDIVU, DIV, DIVU, DMULT, DMULTU, MTLO, MULT,
MULTU. If this restriction is violated, the result of the MFLO is undefined.
Operation:
GPR[rd] ← LO
Exceptions:
None
Encoding:
   Bits:    31..26    25..16       15..11   10..6   5..0
   Field:   SPECIAL   0            rd       0       MFLO
   Value:   000000    0000000000            00000   010010
   Width:   6         10           5        5       6
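The scheduling restrictions on MFHI and MFLO lend themselves to a mechanical check. The following is a minimal sketch, assuming a program is given as a flat list of mnemonics with no branches. The function name find_hilo_hazards is hypothetical, and the writer sets are taken from the Restrictions text of the two instructions.

```python
# Instructions that modify HI or LO, as listed in the MFHI and MFLO restrictions.
HI_WRITERS = {"DDIV", "DDIVU", "DIV", "DIVU", "DMULT", "DMULTU", "MTHI", "MULT", "MULTU"}
LO_WRITERS = (HI_WRITERS - {"MTHI"}) | {"MTLO"}


def find_hilo_hazards(mnemonics):
    """Return (read_index, write_index) pairs that violate the two-instruction rule."""
    hazards = []
    for i, op in enumerate(mnemonics):
        writers = {"MFHI": HI_WRITERS, "MFLO": LO_WRITERS}.get(op.upper())
        if writers is None:
            continue
        # The two instructions following the read must not write the same register.
        for j in (i + 1, i + 2):
            if j < len(mnemonics) and mnemonics[j].upper() in writers:
                hazards.append((i, j))
    return hazards


bad = find_hilo_hazards(["MULT", "MFLO", "ADDU", "MULT"])
good = find_hilo_hazards(["MULT", "MFHI", "NOP", "NOP", "MULT"])
```

A real checker would also need to follow branches and delay slots; this sketch covers straight-line code only.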
Move Conditional on Not Zero MOVN
Format: MOVN rd, rs, rt MIPS IV
Purpose: To conditionally move a GPR after testing a GPR value.
Description: if (rt ≠ 0) then rd ← rs
If the value in GPR rt is not equal to zero, then the contents of GPR rs are placed into
GPR rd.
Restrictions:
None
Operation:
if GPR[rt] ≠ 0 then
GPR[rd] ← GPR[rs]
endif
Exceptions:
Reserved Instruction
Programming Notes:
The nonzero value tested here is the “condition true” result from the SLT, SLTI, SLTU,
and SLTIU comparison instructions.
Encoding:
   Bits:    31..26    25..21   20..16   15..11   10..6   5..0
   Field:   SPECIAL   rs       rt       rd       0       MOVN
   Value:   000000                               00000   001011
   Width:   6         5        5        5        5       6
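The pairing of a comparison with MOVN mentioned in the Programming Notes can be illustrated with a branch-free minimum. This is a sketch only: the helper names are hypothetical, Python integers stand in for signed register values, and each function returns the new value of rd rather than mutating a register file.

```python
def slt(rs, rt):
    """Model of SLT: 1 ("condition true") if rs < rt as signed values, else 0."""
    return 1 if rs < rt else 0


def movn(rd, rs, rt):
    """Model of MOVN: the new value of rd is rs when rt is non-zero, else rd."""
    return rs if rt != 0 else rd


def minimum(a, b):
    """Branch-free minimum using the SLT + MOVN pairing."""
    rd = b              # start with b as the candidate result
    t = slt(a, b)       # t = 1 when a < b
    return movn(rd, a, t)


smallest = minimum(7, -3)
```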
MOVZ Move Conditional on Zero
Format: MOVZ rd, rs, rt MIPS IV
Purpose: To conditionally move a GPR after testing a GPR value.
Description: if (rt = 0) then rd ← rs
If the value in GPR rt is equal to zero, then the contents of GPR rs are placed into
GPR rd.
Restrictions:
None
Operation:
if GPR[rt] = 0 then
GPR[rd] ← GPR[rs]
endif
Exceptions:
Reserved Instruction
Programming Notes:
The zero value tested here is the “condition false” result from the SLT, SLTI, SLTU, and
SLTIU comparison instructions.
Encoding:
   Bits:    31..26    25..21   20..16   15..11   10..6   5..0
   Field:   SPECIAL   rs       rt       rd       0       MOVZ
   Value:   000000                               00000   001010
   Width:   6         5        5        5        5       6
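MOVZ is convenient when a zero value should trigger a substitution. The sketch below shows two such uses under simple assumptions: Python integers model register values, each function returns the new value of rd, and the helper names are hypothetical.

```python
def movz(rd, rs, rt):
    """Model of MOVZ: the new value of rd is rs when rt is zero, else rd."""
    return rs if rt == 0 else rd


def value_or_default(x, default):
    """Replace a zero value with a default, without branching."""
    rd = x
    return movz(rd, default, x)     # rd <- default only if x == 0


def maximum(a, b):
    """Branch-free maximum: the comparison result 0 means "a < b" is false."""
    t = 1 if a < b else 0           # stands in for SLT t, a, b
    rd = b
    return movz(rd, a, t)           # a >= b, so take a


timeout = value_or_default(0, 30)
```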
Move To HI Register MTHI
Format: MTHI rs MIPS I
Purpose: To copy a GPR to the special purpose HI register.
Description: HI ← rs
The contents of GPR rs are loaded into special register HI.
Restrictions:
If either of the two preceding instructions is MFHI, the result of that MFHI is
undefined. Reads of the HI or LO special registers must be separated from subsequent
instructions that write to them by two or more other instructions.
A computed result written to the HI/LO pair by DDIV, DDIVU, DIV, DIVU, DMULT,
DMULTU, MULT, or MULTU must be read by MFHI or MFLO before another result is
written into either HI or LO. If an MTHI instruction is executed following one of these
arithmetic instructions, but before a MFLO or MFHI instruction, the contents of LO are
undefined. The following example shows this illegal situation:
MULT r2,r4 # start operation that will eventually write to HI,LO
… # code not containing mfhi or mflo
MTHI r6
… # code not containing mflo
MFLO r3 # this mflo would get an undefined value
Operation:
I-2:, I-1: HI ← undefined
I:         HI ← GPR[rs]
Exceptions:
None
Encoding:
   Bits:    31..26    25..21   20..6             5..0
   Field:   SPECIAL   rs       0                 MTHI
   Value:   000000             000000000000000   010001
   Width:   6         5        15                6
MTLO Move To LO Register
Format: MTLO rs MIPS I
Purpose: To copy a GPR to the special purpose LO register.
Description: LO ← rs
The contents of GPR rs are loaded into special register LO.
Restrictions:
If either of the two preceding instructions is MFLO, the result of that MFLO is
undefined. Reads of the HI or LO special registers must be separated from subsequent
instructions that write to them by two or more other instructions.
A computed result written to the HI/LO pair by DDIV, DDIVU, DIV, DIVU, DMULT,
DMULTU, MULT, or MULTU must be read by MFHI or MFLO before another result is
written into either HI or LO. If an MTLO instruction is executed following one of these
arithmetic instructions, but before a MFLO or MFHI instruction, the contents of HI are
undefined. The following example shows this illegal situation:
MULT r2,r4 # start operation that will eventually write to HI,LO
… # code not containing mfhi or mflo
MTLO r6
… # code not containing mfhi
MFHI r3 # this mfhi would get an undefined value
Operation:
I-2:, I-1: LO ← undefined
I:         LO ← GPR[rs]
Exceptions:
None
Encoding:
   Bits:    31..26    25..21   20..6             5..0
   Field:   SPECIAL   rs       0                 MTLO
   Value:   000000             000000000000000   010011
   Width:   6         5        15                6
Multiply Word MULT
Format: MULT rs, rt MIPS I
Purpose: To multiply 32-bit signed integers.
Description: (LO, HI) ← rs × rt
The 32-bit word value in GPR rt is multiplied by the 32-bit value in GPR rs, treating
both operands as signed values, to produce a 64-bit result. The low-order 32-bit word
of the result is placed into special register LO, and the high-order 32-bit word is placed
into special register HI.
No arithmetic exception occurs under any circumstances.
Restrictions:
On 64-bit processors, if either GPR rt or GPR rs does not contain a sign-extended 32-bit
value (bits 63..31 equal), then the result of the operation is undefined.
If either of the two preceding instructions is MFHI or MFLO, the result of the MFHI or
MFLO is undefined. Reads of the HI or LO special registers must be separated from
subsequent instructions that write to them by two or more other instructions.
Operation:
if (NotWordValue(GPR[rs]) or NotWordValue(GPR[rt])) then UndefinedResult() endif
I-2:, I-1: LO, HI ← undefined
I:         prod ← GPR[rs]_{31..0} * GPR[rt]_{31..0}
           LO ← sign_extend(prod_{31..0})
           HI ← sign_extend(prod_{63..32})
Exceptions:
None
Programming Notes:
In some processors the integer multiply operation may proceed asynchronously and
allow other CPU instructions to execute before it is complete. An attempt to read LO
or HI before the results are written will wait (interlock) until the results are ready.
Asynchronous execution does not affect the program result, but offers an opportunity
for performance improvement by scheduling the multiply so that other instructions
can execute in parallel.
Programs that require overflow detection must check for it explicitly.
Encoding:
   Bits:    31..26    25..21   20..16   15..6        5..0
   Field:   SPECIAL   rs       rt       0            MULT
   Value:   000000                      0000000000   011000
   Width:   6         5        5        10           6
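Since MULT never raises an arithmetic exception, a program that needs overflow detection has to inspect HI and LO itself. The following sketch models the signed multiply on 32-bit register values and shows one possible explicit check. The function names are hypothetical, and HI and LO are returned as 32-bit patterns (on a 64-bit processor they would additionally be sign-extended).

```python
def to_signed32(x):
    """Interpret the low 32 bits of x as a two's-complement signed value."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def mult(rs, rt):
    """Model of MULT: return (LO, HI) for the signed 64-bit product."""
    prod = (to_signed32(rs) * to_signed32(rt)) & 0xFFFFFFFFFFFFFFFF
    return prod & 0xFFFFFFFF, prod >> 32


def overflows_32(lo, hi):
    """True if the product does not fit in a signed 32-bit word.

    The result fits exactly when HI is the sign extension of bit 31 of LO.
    """
    expected_hi = 0xFFFFFFFF if lo & 0x80000000 else 0
    return hi != expected_hi


lo, hi = mult(3, 0xFFFFFFFE)    # 3 * -2
fits = not overflows_32(lo, hi)
```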
MULTU Multiply Unsigned Word
Format: MULTU rs, rt MIPS I
Purpose: To multiply 32-bit unsigned integers.
Description: (LO, HI) ← rs × rt
The 32-bit word value in GPR rt is multiplied by the 32-bit value in GPR rs, treating
both operands as unsigned values, to produce a 64-bit result. The low-order 32-bit
word of the result is placed into special register LO, and the high-order 32-bit word is
placed into special register HI.
No arithmetic exception occurs under any circumstances.
Restrictions:
On 64-bit processors, if either GPR rt or GPR rs does not contain a sign-extended 32-bit
value (bits 63..31 equal), then the result of the operation is undefined.
If either of the two preceding instructions is MFHI or MFLO, the result of the MFHI or
MFLO is undefined. Reads of the HI or LO special registers must be separated from
subsequent instructions that write to them by two or more other instructions.
Operation:
if (NotWordValue(GPR[rs]) or NotWordValue(GPR[rt])) then UndefinedResult() endif
I-2:, I-1: LO, HI ← undefined
I:         prod ← (0 || GPR[rs]_{31..0}) * (0 || GPR[rt]_{31..0})
           LO ← sign_extend(prod_{31..0})
           HI ← sign_extend(prod_{63..32})
Exceptions:
None
Programming Notes:
In some processors the integer multiply operation may proceed asynchronously and
allow other CPU instructions to execute before it is complete. An attempt to read LO
or HI before the results are written will wait (interlock) until the results are ready.
Asynchronous execution does not affect the program result, but offers an opportunity
for performance improvement by scheduling the multiply so that other instructions
can execute in parallel.
Programs that require overflow detection must check for it explicitly.
Encoding:
   Bits:    31..26    25..21   20..16   15..6        5..0
   Field:   SPECIAL   rs       rt       0            MULTU
   Value:   000000                      0000000000   011001
   Width:   6         5        5        10           6
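The unsigned interpretation changes only the high-order word of the product for a given pair of bit patterns. A small sketch, with the hypothetical function name multu and 32-bit patterns returned for LO and HI, shows the unsigned product and a simple explicit overflow test (HI non-zero).

```python
def multu(rs, rt):
    """Model of MULTU: return (LO, HI) for the unsigned 64-bit product."""
    prod = (rs & 0xFFFFFFFF) * (rt & 0xFFFFFFFF)
    return prod & 0xFFFFFFFF, prod >> 32


def overflows_u32(hi):
    """True if the unsigned product does not fit in 32 bits."""
    return hi != 0


# 0xFFFFFFFF is 4294967295 when unsigned; a signed multiply would treat it as -1.
lo, hi = multu(0xFFFFFFFF, 0xFFFFFFFF)
```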
Not Or NOR
Format: NOR rd, rs, rt MIPS I
Purpose: To do a bitwise logical NOT OR.
Description: rd ← rs NOR rt
The contents of GPR rs are combined with the contents of GPR rt in a bitwise logical
NOR operation. The result is placed into GPR rd.
Restrictions:
None
Operation:
GPR[rd] ← GPR[rs] nor GPR[rt]
Exceptions:
None
Encoding:
   Bits:    31..26    25..21   20..16   15..11   10..6   5..0
   Field:   SPECIAL   rs       rt       rd       0       NOR
   Value:   000000                               00000   100111
   Width:   6         5        5        5        5       6
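One common use of NOR is to form a bitwise complement by combining a register with zero. The sketch below models this on 32-bit values; the function names and the use of a zero operand (register $0 in assembly) are illustrative assumptions.

```python
MASK32 = 0xFFFFFFFF


def nor(rs, rt):
    """Model of NOR on 32-bit values: complement of the bitwise OR."""
    return ~(rs | rt) & MASK32


def bitwise_not(rs):
    """Complement a value by NOR-ing it with zero."""
    return nor(rs, 0)


inverted = bitwise_not(0x0000FFFF)
```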
OR Or
Format: OR rd, rs, rt MIPS I
Purpose: To do a bitwise logical OR.
Description: rd ← rs OR rt
The contents of GPR rs are combined with the contents of GPR rt in a bitwise logical
OR operation. The result is placed into GPR rd.
Restrictions:
None
Operation:
GPR[rd] ← GPR[rs] or GPR[rt]
Exceptions:
None
Encoding:
   Bits:    31..26    25..21   20..16   15..11   10..6   5..0
   Field:   SPECIAL   rs       rt       rd       0       OR
   Value:   000000                               00000   100101
   Width:   6         5        5        5        5       6
Or Immediate ORI
Format: ORI rt, rs, immediate MIPS I
Purpose: To do a bitwise logical OR with a constant.
Description: rt ← rs OR immediate
The 16-bit immediate is zero-extended to the left and combined with the contents of
GPR rs in a bitwise logical OR operation. The result is placed into GPR rt.
Restrictions:
None
Operation:
GPR[rt] ← zero_extend(immediate) or GPR[rs]
Exceptions:
None
Encoding:
   Bits:    31..26   25..21   20..16   15..0
   Field:   ORI      rs       rt       immediate
   Value:   001101
   Width:   6        5        5        16
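Because the immediate is zero-extended, ORI can set the low 16 bits of a register without disturbing the upper bits. The sketch below models this on 32-bit values. The helper load_constant32 is hypothetical and assumes the upper halfword has already been placed by an instruction such as LUI, which is documented elsewhere in this manual.

```python
def ori(rs, immediate):
    """Model of ORI on 32-bit values: OR with the zero-extended 16-bit immediate."""
    return (rs | (immediate & 0xFFFF)) & 0xFFFFFFFF


def load_constant32(value):
    """Build a 32-bit constant from an upper halfword plus ORI (assumed idiom)."""
    reg = ((value >> 16) & 0xFFFF) << 16    # effect assumed for LUI
    return ori(reg, value & 0xFFFF)


# Bit 15 of the immediate is not propagated into the upper bits.
low_only = ori(0, 0x8000)
constant = load_constant32(0xDEADBEEF)
```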
PREF Prefetch
Format: PREF hint, offset(base) MIPS IV
Purpose: To prefetch data from memory.
Description: prefetch_memory(base+offset)
PREF adds the 16-bit signed offset to the contents of GPR base to form an effective byte
address. It advises that data at the effective address may be used in the near future.
The hint field supplies information about the way that the data is expected to be used.
PREF is an advisory instruction. It may change the performance of the program. For
all hint values and all effective addresses, it neither changes architecturally-visible state
nor alters the meaning of the program. An implementation may do nothing when
executing a PREF instruction.
If MIPS IV instructions are supported and enabled, PREF does not cause addressing-
related exceptions. If it raises an exception condition, the exception condition is
ignored. If an addressing-related exception condition is raised and ignored, no data
will be prefetched. Even if no data is prefetched in such a case, some action that is not
architecturally-visible, such as writeback of a dirty cache line, might take place.
PREF will never generate a memory operation for a location with an uncached memory
access type (see Memory Access Types on page A-12).
If PREF results in a memory operation, the memory access type used for the operation
is determined by the memory access type of the effective address, just as it would be if
the memory operation had been caused by a load or store to the effective address.
PREF enables the processor to take some action, typically prefetching the data into
cache, to improve program performance. The action taken for a specific PREF
instruction is both system and context dependent. Any action, including doing
nothing, is permitted that does not change architecturally-visible state or alter the
meaning of a program. It is expected that implementations will either do nothing or
take an action that will increase the performance of the program.
For a cached location, the expected, and useful, action is for the processor to prefetch a
block of data that includes the effective address. The size of the block, and the level of
the memory hierarchy it is fetched into are implementation specific.
Encoding:
   Bits:    31..26   25..21   20..16   15..0
   Field:   PREF     base     hint     offset
   Value:   110011
   Width:   6        5        5        16
The hint field supplies information about the way the data is expected to be used. No
hint value causes an action that modifies architecturally-visible state. A processor may
use a hint value to improve the effectiveness of the prefetch action. The defined hint
values and the recommended prefetch action are shown in the table below. The hint
table may be extended in future implementations.
Restrictions:
None
Operation:
vAddr ← GPR[base] + sign_extend(offset)
(pAddr, uncached) ← AddressTranslation(vAddr, DATA, LOAD)
Prefetch(uncached, pAddr, vAddr, DATA, hint)
Exceptions:
Reserved Instruction
Table A-32 Values of Hint Field for Prefetch Instruction

Value   Name             Data use and desired prefetch action
0       load             Data is expected to be loaded (not modified).
                         Fetch data as if for a load.
1       store            Data is expected to be stored or modified.
                         Fetch data as if for a store.
2-3                      Not yet defined.
4       load_streamed    Data is expected to be loaded (not modified) but not
                         reused extensively; it will “stream” through cache.
                         Fetch data as if for a load and place it in the cache so
                         that it will not displace data prefetched as “retained”.
5       store_streamed   Data is expected to be stored or modified but not
                         reused extensively; it will “stream” through cache.
                         Fetch data as if for a store and place it in the cache so
                         that it will not displace data prefetched as “retained”.
6       load_retained    Data is expected to be loaded (not modified) and
                         reused extensively; it should be “retained” in the cache.
                         Fetch data as if for a load and place it in the cache so
                         that it will not be displaced by data prefetched as
                         “streamed”.
7       store_retained   Data is expected to be stored or modified and reused
                         extensively; it should be “retained” in the cache.
                         Fetch data as if for a store and place it in the cache so
                         that it will not be displaced by data prefetched as
                         “streamed”.
8-31                     Not yet defined.
Programming Notes:
Prefetch cannot prefetch data from a mapped location unless the translation for that
location is present in the TLB. Locations in memory pages that have not been accessed
recently may not have translations in the TLB, so prefetch may not be effective for such
locations.
Prefetch does not cause addressing exceptions. Prefetching through an address pointer
before the validity of the pointer has been determined will therefore not cause an exception.
Implementation Notes:
It is recommended that a reserved hint field value either cause a default prefetch action
that is expected to be useful for most cases of data use, such as the “load” hint, or cause
the instruction to be treated as a NOP.
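The instruction word layout and hint values for PREF can be sketched as a small encoder. This is illustrative only: the function names are hypothetical, the opcode and field positions follow the encoding given for PREF, the hint names follow Table A-32, and reporting undefined hint values as "reserved" is a simplification.

```python
PREF_OPCODE = 0b110011

# Defined hint values from Table A-32; all other values are not yet defined.
HINT_NAMES = {
    0: "load",
    1: "store",
    4: "load_streamed",
    5: "store_streamed",
    6: "load_retained",
    7: "store_retained",
}


def encode_pref(hint, offset, base):
    """Return the 32-bit instruction word for PREF hint, offset(base)."""
    if not 0 <= hint <= 31:
        raise ValueError("hint must fit in 5 bits")
    if not 0 <= base <= 31:
        raise ValueError("base must be a GPR number 0..31")
    if not -32768 <= offset <= 32767:
        raise ValueError("offset must fit in a signed 16-bit field")
    return (PREF_OPCODE << 26) | (base << 21) | (hint << 16) | (offset & 0xFFFF)


def hint_name(hint):
    """Return the Table A-32 name for a hint value, or "reserved"."""
    return HINT_NAMES.get(hint, "reserved")


word = encode_pref(4, 16, 5)    # PREF load_streamed, 16($5)
```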
Store Byte SB
Format: SB rt, offset(base) MIPS I
Purpose: To store a byte to memory.
Description: memory[base+offset] ← rt
The least-significant 8-bit byte of GPR rt is stored in memory at the location specified
by the effective address. The 16-bit signed offset is added to the contents of GPR base to
form the effective address.
Restrictions:
None
Operation: 32-bit processors
vAddr ← sign_extend(offset) + GPR[base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, STORE)
pAddr ← pAddr_{PSIZE-1..2} || (pAddr_{1..0} xor ReverseEndian^2)
byte ← vAddr_{1..0} xor BigEndianCPU^2
dataword ← GPR[rt]_{31-8*byte..0} || 0^{8*byte}
StoreMemory (uncached, BYTE, dataword, pAddr, vAddr, DATA)
Operation: 64-bit processors
vAddr ← sign_extend(offset) + GPR[base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, STORE)
pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor ReverseEndian^3)
byte ← vAddr_{2..0} xor BigEndianCPU^3
datadouble ← GPR[rt]_{63-8*byte..0} || 0^{8*byte}
StoreMemory (uncached, BYTE, datadouble, pAddr, vAddr, DATA)
Exceptions:
TLB Refill, TLB Invalid
TLB Modified
Bus Error
Address Error
Encoding:
   Bits:    31..26   25..21   20..16   15..0
   Field:   SB       base     rt       offset
   Value:   101000
   Width:   6        5        5        16
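The address computation and the byte selection for SB can be sketched against a flat byte array. This model ignores address translation and byte-lane steering, so it shows only the architecturally visible effect. The function names are hypothetical.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed offset."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def sb(mem, rt, offset, base):
    """Model of SB: store the least-significant byte of rt at base + offset."""
    vaddr = base + sign_extend16(offset)
    mem[vaddr] = rt & 0xFF
    return vaddr


mem = bytearray(8)
addr = sb(mem, 0x12345678, 0xFFFF, 4)   # offset field 0xFFFF means -1
```

Only one byte of memory changes, at the same byte address in either byte ordering. The position of that byte within its aligned word when the word is later read back does depend on the byte ordering.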
SC Store Conditional Word
Format: SC rt, offset(base) MIPS II
Purpose: To store a word to memory to complete an atomic read-modify-write.
Description: if (atomic_update) then memory[base+offset] ← rt, rt ← 1 else rt ← 0
The LL and SC instructions provide primitives to implement atomic Read-Modify-
Write (RMW) operations for cached memory locations.
The 16-bit signed offset is added to the contents of GPR base to form an effective
address.
The SC completes the RMW sequence begun by the preceding LL instruction executed
on the processor. If it would complete the RMW sequence atomically, then the least-
significant 32-bit word of GPR rt is stored into memory at the location specified by the
aligned effective address and a one, indicating success, is written into GPR rt.
Otherwise, memory is not modified and a zero, indicating failure, is written into
GPR rt.
If any of the following events occurs between the execution of LL and SC, the SC will
fail:
• A coherent store is completed by another processor or coherent I/O module
into the block of physical memory containing the word. The size and
alignment of the block is implementation dependent. It is at least one word
and is at most the minimum page size.
• An exception occurs on the processor executing the LL/SC.
An implementation may detect “an exception” in one of three ways:
1) Detect exceptions and fail when an exception occurs.
2) Fail after the return-from-interrupt instruction (RFE or ERET) is executed.
3) Do both 1 and 2.
If any of the following events occurs between the execution of LL and SC, the SC may
succeed or it may fail; the success or failure is unpredictable. Portable programs
should not cause one of these events.
• A load, store, or prefetch is executed on the processor executing the LL/SC.
• The instructions executed starting with the LL and ending with the SC do not
lie in a 2048-byte contiguous region of virtual memory. The region does not
have to be aligned, other than the alignment required for instruction words.
The following conditions must be true or the result of the SC will be undefined:
• Execution of SC must have been preceded by execution of an LL instruction.
• A RMW sequence executed without intervening exceptions must use the same
address in the LL and SC. The address is the same if the virtual address,
physical address, and cache-coherence algorithm are identical.
Encoding:
   Bits:    31..26   25..21   20..16   15..0
   Field:   SC       base     rt       offset
   Value:   111000
   Width:   6        5        5        16
Atomic RMW is provided only for cached memory locations. The extent to which the
detection of atomicity operates correctly depends on the system implementation and
the memory access type used for the location. See Memory Access Types on page
A-12.
MP atomicity: To provide atomic RMW among multiple processors, all accesses to
the location must be made with a memory access type of cached coherent.
Uniprocessor atomicity: To provide atomic RMW on a single processor, all accesses
to the location must be made with memory access type of either cached noncoherent
or cached coherent. All accesses must be to one or the other access type; they may not
be mixed.
I/O System: To provide atomic RMW with a coherent I/O system, all accesses to the
location must be made with a memory access type of cached coherent. If the I/O
system does not use coherent memory operations, then atomic RMW cannot be
provided with respect to the I/O reads and writes.
The definition above applies to user-mode operation on all MIPS processors that
support the MIPS II architecture. There may be other implementation-specific events,
such as privileged CP0 instructions, that will cause an SC instruction to fail in some
cases. System programmers using LL/SC should consult implementation-specific
documentation.
Restrictions:
The addressed location must have a memory access type of cached noncoherent or
cached coherent; if it does not, the result is undefined (see Memory Access Types on
page A-12).
The effective address must be naturally aligned. If either of the two least-significant
bits of the address is non-zero, an Address Error exception occurs.
MIPS IV: The low-order 2 bits of the offset field must be zero. If they are not, the result
of the instruction is undefined.
Operation: 32-bit processors
vAddr ← sign_extend(offset) + GPR[base]
if (vAddr_{1..0}) ≠ 0^2 then SignalException(AddressError) endif
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, STORE)
dataword ← GPR[rt]
if LLbit then
    StoreMemory (uncached, WORD, dataword, pAddr, vAddr, DATA)
endif
GPR[rt] ← 0^31 || LLbit
Operation: 64-bit processors
vAddr ← sign_extend(offset) + GPR[base]
if (vAddr_{1..0}) ≠ 0^2 then SignalException(AddressError) endif
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, STORE)
pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor (ReverseEndian || 0^2))
byte ← vAddr_{2..0} xor (BigEndianCPU || 0^2)
datadouble ← GPR[rt]_{63-8*byte..0} || 0^{8*byte}
if LLbit then
    StoreMemory (uncached, WORD, datadouble, pAddr, vAddr, DATA)
endif
GPR[rt] ← 0^63 || LLbit
Exceptions:
TLB Refill, TLB Invalid
TLB Modified
Address Error
Reserved Instruction
Programming Notes:
LL and SC are used to atomically update memory locations as shown in the example
atomic increment operation below.
    L1:
        LL   T1, (T0)      # load counter
        ADDI T2, T1, 1     # increment
        SC   T2, (T0)      # try to store, checking for atomicity
        BEQ  T2, 0, L1     # if not atomic (0), try again
        NOP                # branch-delay slot
Exceptions between the LL and SC cause SC to fail, so persistent exceptions must be
avoided. Some examples of these are arithmetic operations that trap, system calls,
floating-point operations that trap or require software emulation assistance.
LL and SC function on a single processor for cached noncoherent memory so that
parallel programs can be run on uniprocessor systems that do not support cached
coherent memory access types.
Implementation Notes:
The block of memory that is “locked” for LL/SC is typically the largest cache line in
use.
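The retry loop in the atomic increment example can be modeled in a few lines. This is a simplified sketch under stated assumptions: memory is a dictionary, a single LLbit tracks the pending sequence, and a store by another processor is simulated through a hypothetical observe_remote_store method. The class and function names are illustrative and do not describe any particular implementation.

```python
class Cpu:
    """Toy model of one processor's LL/SC state."""

    def __init__(self, mem):
        self.mem = mem
        self.llbit = False
        self.ll_addr = None

    def ll(self, addr):
        """Load the word and begin an RMW sequence."""
        self.llbit = True
        self.ll_addr = addr
        return self.mem[addr]

    def sc(self, addr, value):
        """Store only if the sequence is still atomic; return 1 or 0 like rt."""
        ok = self.llbit
        if ok:
            self.mem[addr] = value
        self.llbit = False
        return 1 if ok else 0

    def observe_remote_store(self, addr):
        """A coherent store by another processor breaks a pending sequence."""
        if addr == self.ll_addr:
            self.llbit = False


def atomic_increment(cpu, addr, interfere=None):
    """Mirror of the L1 loop: retry until SC reports success."""
    attempts = 0
    while True:
        attempts += 1
        t1 = cpu.ll(addr)           # LL   T1, (T0)
        t2 = t1 + 1                 # ADDI T2, T1, 1
        if interfere is not None:
            interfere(attempts)     # simulated activity between LL and SC
        if cpu.sc(addr, t2):        # SC   T2, (T0)
            return attempts         # BEQ not taken: done


mem = {0x100: 41}
cpu = Cpu(mem)


def remote_writer(attempt):
    # Another processor overwrites the counter during the first attempt only.
    if attempt == 1:
        mem[0x100] = 100
        cpu.observe_remote_store(0x100)


tries = atomic_increment(cpu, 0x100, remote_writer)
```

The first attempt fails because of the simulated remote store, so the increment is applied to the value written by the other processor on the second attempt.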
Store Conditional Doubleword SCD
Format: SCD rt, offset(base) MIPS III
Purpose: To store a doubleword to memory to complete an atomic read-modify-
write.
Description: if (atomic_update) then memory[base+offset] ← rt, rt ← 1 else rt ← 0
The 16-bit signed offset is added to the contents of GPR base to form an effective
address.
The SCD completes the RMW sequence begun by the preceding LLD instruction
executed on the processor. If it would complete the RMW sequence atomically, then
the 64-bit doubleword of GPR rt is stored into memory at the location specified by the
aligned effective address and a one, indicating success, is written into GPR rt.
Otherwise, memory is not modified and a zero, indicating failure, is written into
GPR rt.
If any of the following events occurs between the execution of LLD and SCD, the SCD
will fail:
• A coherent store is completed by another processor or coherent I/O module
into the block of physical memory containing the doubleword. The size and
alignment of the block is implementation dependent. It is at least one
doubleword and is at most the minimum page size.
• An exception occurs on the processor executing the LLD/SCD.
An implementation may detect “an exception” in one of three ways:
1) Detect exceptions and fail when an exception occurs.
2) Fail after the return-from-interrupt instruction (RFE or ERET) is executed.
3) Do both 1 and 2.
If any of the following events occurs between the execution of LLD and SCD, the SCD
may succeed or it may fail; the success or failure is unpredictable. Portable programs
should not cause one of these events.
• A memory access instruction (load, store, or prefetch) is executed on the
processor executing the LLD/SCD.
• The instructions executed starting with the LLD and ending with the SCD do
not lie in a 2048-byte contiguous region of virtual memory. The region does
not have to be aligned, other than the alignment required for instruction
words.
The following conditions must be true or the result of the SCD will be undefined:
• Execution of SCD must have been preceded by execution of an LLD
instruction.
• A RMW sequence executed without intervening exceptions must use the same
address in the LLD and SCD. The address is the same if the virtual address,
physical address, and cache-coherence algorithm are identical.
Encoding:
   Bits:    31..26   25..21   20..16   15..0
   Field:   SCD      base     rt       offset
   Value:   111100
   Width:   6        5        5        16
Atomic RMW is provided only for memory locations with cached noncoherent or
cached coherent memory access types. The extent to which the detection of atomicity
operates correctly depends on the system implementation and the memory access type
used for the location. See Memory Access Types on page A-12.
MP atomicity: To provide atomic RMW among multiple processors, all accesses to
the location must be made with a memory access type of cached coherent.
Uniprocessor atomicity: To provide atomic RMW on a single processor, all accesses
to the location must be made with memory access type of either cached noncoherent
or cached coherent. All accesses must be to one or the other access type; they may not
be mixed.
I/O System: To provide atomic RMW with a coherent I/O system, all accesses to the
location must be made with a memory access type of cached coherent. If the I/O
system does not use coherent memory operations, then atomic RMW cannot be
provided with respect to the I/O reads and writes.
The definition above applies to user-mode operation on all MIPS processors that
support the MIPS III architecture. There may be other implementation-specific events,
such as privileged CP0 instructions, that will cause an SCD instruction to fail in some
cases. System programmers using LLD/SCD should consult implementation-specific
documentation.
Restrictions:
The addressed location must have a memory access type of cached noncoherent or
cached coherent; if it does not, the result is undefined (see Memory Access Types on
page A-12).
The 64-bit doubleword of register rt is conditionally stored in memory at the
location specified by the aligned effective address. The 16-bit signed offset is added to
the contents of GPR base to form the effective address.
The effective address must be naturally aligned. If any of the three least-significant bits
of the address are non-zero, an Address Error exception occurs.
MIPS IV: The low-order 3 bits of the offset field must be zero. If they are not, the result
of the instruction is undefined.
Operation: 64-bit processors
vAddr ← sign_extend(offset) + GPR[base]
if (vAddr_{2..0}) ≠ 0^3 then SignalException(AddressError) endif
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, STORE)
datadouble ← GPR[rt]
if LLbit then
    StoreMemory (uncached, DOUBLEWORD, datadouble, pAddr, vAddr, DATA)
endif
GPR[rt] ← 0^63 || LLbit
Exceptions:
TLB Refill, TLB Invalid
TLB Modified
Address Error
Reserved Instruction
Programming Notes:
LLD and SCD are used to atomically update memory locations as shown in the
example atomic increment operation below.
L1:
    LLD  T1, (T0)      # load counter
    ADDI T2, T1, 1     # increment
    SCD  T2, (T0)      # try to store, checking for atomicity
    BEQ  T2, 0, L1     # if not atomic (0), try again
    NOP                # branch-delay slot
Exceptions between the LLD and SCD cause SCD to fail, so persistent exceptions must
be avoided. Some examples of these are arithmetic operations that trap, system calls,
floating-point operations that trap or require software emulation assistance.
LLD and SCD function on a single processor for cached noncoherent memory so that
parallel programs can be run on uniprocessor systems that do not support cached
coherent memory access types.
Implementation Notes:
The block of memory that is “locked” for LLD/SCD is typically the largest cache line
in use.
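The LLD/SCD retry loop from the programming notes can be modeled in a few lines of Python. This is only a sketch under stated assumptions: the class ToyCpu, its interfere method, and the between hook are hypothetical names invented for illustration, memory is a plain dictionary, and the model tracks a single linked address rather than a cache line. On real hardware an SCD to a different address than the LLD is undefined; the model simply makes it fail.

```python
MASK64 = (1 << 64) - 1


class ToyCpu:
    """Hypothetical one-processor model of the LLbit; not taken from the manual."""

    def __init__(self, memory):
        self.memory = memory      # dict mapping doubleword address -> value
        self.llbit = False        # set by LLD, cleared by SCD or interference
        self.link_addr = None

    def lld(self, addr):
        """Load Linked Doubleword: read the location and set LLbit."""
        self.llbit = True
        self.link_addr = addr
        return self.memory[addr]

    def scd(self, addr, value):
        """Store Conditional Doubleword: store only if the link is intact.

        Returns 1 on success and 0 on failure, like the value SCD writes to rt.
        Assumption: a mismatched address fails (the manual calls it undefined).
        """
        success = self.llbit and self.link_addr == addr
        if success:
            self.memory[addr] = value & MASK64
        self.llbit = False
        return 1 if success else 0

    def interfere(self, addr, value):
        """Model another writer to the location, which breaks the link."""
        self.memory[addr] = value & MASK64
        if self.link_addr == addr:
            self.llbit = False


def atomic_increment(cpu, addr, between=None):
    """Retry loop equivalent to the L1 loop: LLD, add 1, SCD, retry on 0."""
    while True:
        old = cpu.lld(addr)
        if between is not None:
            between()             # hook used to inject interference
        if cpu.scd(addr, old + 1):
            return (old + 1) & MASK64
```

If a writer interferes between the load and the store, the first SCD returns 0 and the loop reloads the fresh value, which is the behavior the BEQ back to L1 provides in the assembly version.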
SD Store Doubleword
Format: SD rt, offset(base) MIPS III
Purpose: To store a doubleword to memory.
Description: memory[base+offset] ← rt
The 64-bit doubleword in GPR rt is stored in memory at the location specified by the
aligned effective address. The 16-bit signed offset is added to the contents of GPR base
to form the effective address.
Restrictions:
The effective address must be naturally aligned. If any of the three least-significant bits
of the effective address are non-zero, an Address Error exception occurs.
MIPS IV: The low-order 3 bits of the offset field must be zero. If they are not, the result
of the instruction is undefined.
Operation: 64-bit processors
vAddr ← sign_extend(offset) + GPR[base]
if (vAddr_{2..0}) ≠ 0^3 then SignalException(AddressError) endif
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, STORE)
datadouble ← GPR[rt]
StoreMemory (uncached, DOUBLEWORD, datadouble, pAddr, vAddr, DATA)
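The SD operation above (sign-extend the offset, form the address, check natural alignment, store eight bytes) can be sketched in Python. The sketch makes simplifying assumptions that are not part of the manual: memory is a flat bytearray, address translation is the identity, and the names store_doubleword, sign_extend16, and AddressError are hypothetical.

```python
MASK64 = (1 << 64) - 1


class AddressError(Exception):
    """Stand-in for the Address Error exception."""


def sign_extend16(offset):
    """Interpret the low 16 bits of offset as a signed two's-complement value."""
    offset &= 0xFFFF
    return offset - 0x10000 if offset & 0x8000 else offset


def store_doubleword(memory, base, offset, value, big_endian=True):
    """Model of SD: memory[base + offset] <- 64-bit value.

    Assumes virtual address == physical address (no TLB) and a flat bytearray.
    """
    vaddr = (base + sign_extend16(offset)) & MASK64
    if vaddr & 0x7:                       # any of the three low bits set
        raise AddressError(hex(vaddr))
    if vaddr + 8 > len(memory):           # outside the modeled memory
        raise IndexError(hex(vaddr))
    order = "big" if big_endian else "little"
    memory[vaddr:vaddr + 8] = (value & MASK64).to_bytes(8, order)
```

For example, a base of 16 with offset field 0xFFF8 (that is, -8) addresses byte 8, which is aligned; the same base with offset 4 addresses byte 20 and raises the modeled Address Error.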
Exceptions:
TLB Refill, TLB Invalid
TLB Modified
Address Error
Reserved Instruction
Encoding:
  Bits     31..26   25..21   20..16   15..0
  Field    SD       base     rt       offset
  Value    111111
  Width    6        5        5        16
Store Doubleword From Coprocessor SDCz
Format: SDC1 rt, offset(base) MIPS II
SDC2 rt, offset(base)
Purpose: To store a doubleword from a coprocessor general register to memory.
Description: memory[base+offset] ← rt
Coprocessor unit zz supplies a 64-bit doubleword which is stored at the memory
location specified by the aligned effective address. The 16-bit signed offset is added to
the contents of GPR base to form the effective address.
The data supplied by each coprocessor is defined by the individual coprocessor
specifications. The usual operation would read the data from coprocessor general
register rt.
Each MIPS architecture level defines up to 4 coprocessor units, numbered 0 to 3 (see
Coprocessor Instructions on page A-11). The opcodes corresponding to coprocessors
that are not defined by an architecture level may be used for other instructions.
Restrictions:
Access to the coprocessors is controlled by system software. Each coprocessor has a
“coprocessor usable” bit in the System Control coprocessor. The usable bit must be set
for a user program to execute a coprocessor instruction. If the usable bit is not set, an
attempt to execute the instruction will result in a Coprocessor Unusable exception. An
unimplemented coprocessor must never be enabled. The result of executing this
instruction for an unimplemented coprocessor when the usable bit is set, is undefined.
This instruction is not defined for coprocessor 0, the System Control coprocessor, and
the opcode may be used for other instructions.
The effective address must be naturally aligned. If any of the three least-significant bits
of the effective address are non-zero, an Address Error exception occurs.
MIPS IV: The low-order 3 bits of the offset field must be zero. If they are not, the result
of the instruction is undefined.
Operation: 32-bit processors
vAddr ← sign_extend(offset) + GPR[base]
if (vAddr_{2..0}) ≠ 0^3 then SignalException(AddressError) endif
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, STORE)
datadouble ← COP_SD(z, rt)
StoreMemory (uncached, DOUBLEWORD, datadouble, pAddr, vAddr, DATA)
Encoding:
  Bits     31..26   25..21   20..16   15..0
  Field    SDCz     base     rt       offset
  Value    1111zz
  Width    6        5        5        16
Operation: 64-bit processors
vAddr ← sign_extend(offset) + GPR[base]
if (vAddr_{2..0}) ≠ 0^3 then SignalException(AddressError) endif
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, STORE)
datadouble ← COP_SD(z, rt)
StoreMemory (uncached, DOUBLEWORD, datadouble, pAddr, vAddr, DATA)
Exceptions:
TLB Refill, TLB Invalid
TLB Modified
Address Error
Reserved Instruction
Coprocessor Unusable
Store Doubleword Left SDL
Format: SDL rt, offset(base) MIPS III
Purpose: To store the most-significant part of a doubleword to an unaligned
memory address.
Description: memory[base+offset] ← Some_Bytes_From rt
The 16-bit signed offset is added to the contents of GPR base to form an effective address
(EffAddr). EffAddr is the address of the most-significant of eight consecutive bytes
forming a doubleword in memory (DW) starting at an arbitrary byte boundary. A part
of DW, the most-significant one to eight bytes, is in the aligned doubleword containing
EffAddr. The same number of most-significant (left) bytes of GPR rt are stored into
these bytes of DW.
The figure below illustrates this operation for big-endian byte ordering. The eight
consecutive bytes in 2..9 form an unaligned doubleword starting at location 2. A part
of DW, six bytes, is contained in the aligned doubleword containing the most-
significant byte at 2. First, SDL stores the six most-significant bytes of the source
register into these bytes in memory. Next, the complementary SDR instruction stores
the remainder of DW.
Figure A-6 Unaligned Doubleword Store with SDL and SDR
Doubleword at byte 2 in memory (big-endian); each memory byte initially contains its
address. Bytes are listed from most to least significant.
  GPR 24 (source):            A B C D E F G H
  Memory, initial contents:   0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
  After SDL $24,2($0):        0 1 A B C D E F 8 9 10 …
  Then after SDR $24,9($0):   0 1 A B C D E F G H 10 …
Encoding:
  Bits     31..26   25..21   20..16   15..0
  Field    SDL      base     rt       offset
  Value    101100
  Width    6        5        5        16
The bytes stored from the source register to memory depend on both the offset of the
effective address within an aligned doubleword, i.e. the low three bits of the address
(vAddr2..0), and the current byte ordering mode of the processor (big- or little-endian).
The table below shows the bytes stored for every combination of offset and byte
ordering.
Restrictions:
None
Operation: 64-bit processors
vAddr ← sign_extend(offset) + GPR[base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, STORE)
pAddr ← pAddr_{(PSIZE-1)..3} || (pAddr_{2..0} xor ReverseEndian^3)
If BigEndianMem = 0 then
    pAddr ← pAddr_{(PSIZE-1)..3} || 0^3
endif
byte ← vAddr_{2..0} xor BigEndianCPU^3
datadouble ← 0^{56–8*byte} || GPR[rt]_{63..56–8*byte}
StoreMemory (uncached, byte, datadouble, pAddr, vAddr, DATA)
Exceptions:
TLB Refill, TLB Invalid
TLB Modified
Bus Error
Address Error
Reserved Instruction
Table A-33 Bytes Stored by SDL Instruction
Initial memory contents and byte offsets (most to least significant):
  Big-endian offset:       0 1 2 3 4 5 6 7
  Memory contents:         i j k l m n o p
  Little-endian offset:    7 6 5 4 3 2 1 0
Contents of source register (most to least significant): A B C D E F G H
Memory contents after instruction (bytes still shown as i..p are unchanged):
  vAddr_{2..0}   Big-endian byte ordering   Little-endian byte ordering
  0              A B C D E F G H            i j k l m n o A
  1              i A B C D E F G            i j k l m n A B
  2              i j A B C D E F            i j k l m A B C
  3              i j k A B C D E            i j k l A B C D
  4              i j k l A B C D            i j k A B C D E
  5              i j k l m A B C            i j A B C D E F
  6              i j k l m n A B            i A B C D E F G
  7              i j k l m n o A            A B C D E F G H
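The rows of Table A-33 can be reproduced with a small Python model of SDL. This is a sketch under assumptions, not the manual's definition: memory is a flat bytearray addressed directly by the virtual address, and the function names sdl and sdl_table_row are hypothetical. In big-endian mode the most-significant register bytes fill memory from the effective address up to the end of the aligned doubleword; in little-endian mode they fill from the effective address down to the start.

```python
MASK64 = (1 << 64) - 1


def sdl(memory, vaddr, reg, big_endian=True):
    """Model of SDL: store the most-significant bytes of reg at an unaligned address."""
    data = (reg & MASK64).to_bytes(8, "big")   # data[0] is the most-significant byte
    low = vaddr & 7                            # vAddr bits 2..0
    if big_endian:
        # Bytes go to vaddr .. end of the aligned doubleword (ascending addresses).
        for i in range(8 - low):
            memory[vaddr + i] = data[i]
    else:
        # Bytes go to vaddr .. start of the aligned doubleword (descending addresses).
        for i in range(low + 1):
            memory[vaddr - i] = data[i]


def sdl_table_row(low, big_endian):
    """Return one table row as text, most-significant memory byte first."""
    # Offset 0 holds 'i' in big-endian mode and 'p' in little-endian mode.
    memory = bytearray(b"ijklmnop" if big_endian else b"ponmlkji")
    sdl(memory, low, int.from_bytes(b"ABCDEFGH", "big"), big_endian)
    shown = memory if big_endian else memory[::-1]
    return shown.decode("ascii")
```

Calling sdl_table_row for each offset from 0 to 7 yields the sixteen table entries, which makes it easy to see that the two byte orderings are mirror images of each other.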
Store Doubleword Right SDR
Format: SDR rt, offset(base) MIPS III
Purpose: To store the least-significant part of a doubleword to an unaligned
memory address.
Description: memory[base+offset] ← Some_Bytes_From rt
The 16-bit signed offset is added to the contents of GPR base to form an effective address
(EffAddr). EffAddr is the address of the least-significant of eight consecutive bytes
forming a doubleword in memory (DW) starting at an arbitrary byte boundary. A part
of DW, the least-significant one to eight bytes, is in the aligned doubleword containing
EffAddr. The same number of least-significant (right) bytes of GPR rt are stored into
these bytes of DW.
The figure below illustrates this operation for big-endian byte ordering. The eight
consecutive bytes in 2..9 form an unaligned doubleword starting at location 2. A part
of DW, two bytes, is contained in the aligned doubleword containing the least-
significant byte at 9. First, SDR stores the two least-significant bytes of the source
register into these bytes in memory. Next, the complementary SDL stores the
remainder of DW.
Figure A-7 Unaligned Doubleword Store with SDR and SDL
Doubleword at byte 2 in memory, big-endian byte order; each memory byte initially
contains its address. Bytes are listed from most to least significant.
  GPR 24 (source):            A B C D E F G H
  Memory, initial contents:   0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
  After SDR $24,9($0):        0 1 2 3 4 5 6 7 G H 10 …
  Then after SDL $24,2($0):   0 1 A B C D E F G H 10 …
Encoding:
  Bits     31..26   25..21   20..16   15..0
  Field    SDR      base     rt       offset
  Value    101101
  Width    6        5        5        16
The bytes stored from the source register to memory depend on both the offset of the
effective address within an aligned doubleword, i.e. the low three bits of the address
(vAddr2..0), and the current byte ordering mode of the processor (big- or little-endian).
The table below shows the bytes stored for every combination of offset and byte
ordering.
Restrictions:
None
Operation: 64-bit processors
vAddr ← sign_extend(offset) + GPR[base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, STORE)
pAddr ← pAddr_{(PSIZE-1)..3} || (pAddr_{2..0} xor ReverseEndian^3)
If BigEndianMem = 0 then
    pAddr ← pAddr_{(PSIZE-1)..3} || 0^3
endif
byte ← vAddr_{2..0} xor BigEndianCPU^3
datadouble ← GPR[rt]_{63–8*byte..0} || 0^{8*byte}
StoreMemory (uncached, DOUBLEWORD-byte, datadouble, pAddr, vAddr, DATA)
Exceptions:
TLB Refill, TLB Invalid
TLB Modified
Bus Error
Address Error
Reserved Instruction
Table A-34 Bytes Stored by SDR Instruction
Initial memory contents and byte offsets (most to least significant):
  Big-endian offset:       0 1 2 3 4 5 6 7
  Memory contents:         i j k l m n o p
  Little-endian offset:    7 6 5 4 3 2 1 0
Contents of source register (most to least significant): A B C D E F G H
Memory contents after instruction (bytes still shown as i..p are unchanged):
  vAddr_{2..0}   Big-endian byte ordering   Little-endian byte ordering
  0              H j k l m n o p            A B C D E F G H
  1              G H k l m n o p            B C D E F G H p
  2              F G H l m n o p            C D E F G H o p
  3              E F G H m n o p            D E F G H n o p
  4              D E F G H n o p            E F G H m n o p
  5              C D E F G H o p            F G H l m n o p
  6              B C D E F G H p            G H k l m n o p
  7              A B C D E F G H            H j k l m n o p
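SDL and SDR are meant to be used as a pair, so a Python sketch of both together may help show how an unaligned doubleword store is assembled. As assumptions for illustration only: memory is a flat bytearray with no address translation, and the names sdl, sdr, and store_unaligned_doubleword are hypothetical. In big-endian mode SDL is applied at the first byte and SDR at the last byte of the doubleword; in little-endian mode the roles of the two addresses are swapped.

```python
MASK64 = (1 << 64) - 1


def _reg_bytes(reg):
    """Register contents as 8 bytes, most-significant byte first."""
    return (reg & MASK64).to_bytes(8, "big")


def sdl(memory, vaddr, reg, big_endian=True):
    """Model of SDL: store the most-significant bytes of reg."""
    data, low = _reg_bytes(reg), vaddr & 7
    if big_endian:
        for i in range(8 - low):
            memory[vaddr + i] = data[i]
    else:
        for i in range(low + 1):
            memory[vaddr - i] = data[i]


def sdr(memory, vaddr, reg, big_endian=True):
    """Model of SDR: store the least-significant bytes of reg."""
    data, low = _reg_bytes(reg), vaddr & 7
    if big_endian:
        base = vaddr - low                     # start of the aligned doubleword
        for i in range(low + 1):
            memory[base + i] = data[7 - low + i]
    else:
        for i in range(8 - low):
            memory[vaddr + i] = data[7 - i]


def store_unaligned_doubleword(memory, addr, reg, big_endian=True):
    """Store all 8 bytes of reg at addr..addr+7 using one SDL and one SDR."""
    if big_endian:
        sdl(memory, addr, reg, True)           # most-significant byte is at addr
        sdr(memory, addr + 7, reg, True)       # least-significant byte is at addr+7
    else:
        sdr(memory, addr, reg, False)          # least-significant byte is at addr
        sdl(memory, addr + 7, reg, False)      # most-significant byte is at addr+7
```

With a 16-byte memory in which each byte holds its own address, storing the register at address 2 in big-endian mode gives the final state shown in Figures A-6 and A-7. The order of the two instructions does not matter because they write disjoint bytes.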
Store Halfword SH
Format: SH rt, offset(base) MIPS I
Purpose: To store a halfword to memory.
Description: memory[base+offset] ← rt
The least-significant 16-bit halfword of register rt is stored in memory at the location
specified by the aligned effective address. The 16-bit signed offset is added to the
contents of GPR base to form the effective address.
Restrictions:
The effective address must be naturally aligned. If the least-significant bit of the
address is non-zero, an Address Error exception occurs.
MIPS IV: The low-order bit of the offset field must be zero. If it is not, the result of the
instruction is undefined.
Operation: 32-bit processors
vAddr ← sign_extend(offset) + GPR[base]
if (vAddr_0) ≠ 0 then SignalException(AddressError) endif
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, STORE)
pAddr ← pAddr_{PSIZE-1..2} || (pAddr_{1..0} xor (ReverseEndian || 0))
byte ← vAddr_{1..0} xor (BigEndianCPU || 0)
dataword ← GPR[rt]_{31–8*byte..0} || 0^{8*byte}
StoreMemory (uncached, HALFWORD, dataword, pAddr, vAddr, DATA)
Operation: 64-bit processors
vAddr ← sign_extend(offset) + GPR[base]
if (vAddr_0) ≠ 0 then SignalException(AddressError) endif
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, STORE)
pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor (ReverseEndian^2 || 0))
byte ← vAddr_{2..0} xor (BigEndianCPU^2 || 0)
datadouble ← GPR[rt]_{63–8*byte..0} || 0^{8*byte}
StoreMemory (uncached, HALFWORD, datadouble, pAddr, vAddr, DATA)
Exceptions:
TLB Refill, TLB Invalid
TLB Modified
Address Error
Encoding:
  Bits     31..26   25..21   20..16   15..0
  Field    SH       base     rt       offset
  Value    101001
  Width    6        5        5        16
SLL Shift Word Left Logical
Format: SLL rd, rt, sa MIPS I
Purpose: To left shift a word by a fixed number of bits.
Description: rd ← rt << sa
The contents of the low-order 32-bit word of GPR rt are shifted left, inserting zeroes
into the emptied bits; the word result is placed in GPR rd. The bit shift count is
specified by sa. If rd is a 64-bit register, the result word is sign-extended.
Restrictions:
None
Operation:
s ← sa
temp ← GPR[rt]_{(31-s)..0} || 0^s
GPR[rd] ← sign_extend(temp)
Exceptions:
None
Programming Notes:
Unlike nearly all other word operations the input operand does not have to be a
properly sign-extended word value to produce a valid sign-extended 32-bit result.
The result word is always sign extended into a 64-bit destination register; this
instruction with a zero shift amount truncates a 64-bit value to 32 bits and sign
extends it.
Some assemblers, particularly 32-bit assemblers, treat this instruction with a shift
amount of zero as a NOP and either delete it or replace it with an actual NOP.
Encoding:
  Bits     31..26    25..21   20..16   15..11   10..6    5..0
  Field    SPECIAL   0        rt       rd       sa       SLL
  Value    000000    00000                               000000
  Width    6         5        5        5        5        6
Shift Word Left Logical Variable SLLV
Format: SLLV rd, rt, rs MIPS I
Purpose: To left shift a word by a variable number of bits.
Description: rd ← rt << rs
The contents of the low-order 32-bit word of GPR rt are shifted left, inserting zeroes
into the emptied bits; the result word is placed in GPR rd. The bit shift count is
specified by the low-order five bits of GPR rs. If rd is a 64-bit register, the result word
is sign-extended.
Restrictions:
None
Operation:
s ← GPR[rs]_{4..0}
temp ← GPR[rt]_{(31-s)..0} || 0^s
GPR[rd] ← sign_extend(temp)
Exceptions:
None
Programming Notes:
Unlike nearly all other word operations the input operand does not have to be a
properly sign-extended word value to produce a valid sign-extended 32-bit result.
The result word is always sign extended into a 64-bit destination register; this
instruction with a zero shift amount truncates a 64-bit value to 32 bits and sign
extends it.
Some assemblers, particularly 32-bit assemblers, treat this instruction with a shift
amount of zero as a NOP and either delete it or replace it with an actual NOP.
Encoding:
  Bits     31..26    25..21   20..16   15..11   10..6    5..0
  Field    SPECIAL   rs       rt       rd       0        SLLV
  Value    000000                               00000    000100
  Width    6         5        5        5        5        6
SLT Set On Less Than
Format: SLT rd, rs, rt MIPS I
Purpose: To record the result of a less-than comparison.
Description: rd ← (rs < rt)
Compare the contents of GPR rs and GPR rt as signed integers and record the Boolean
result of the comparison in GPR rd. If GPR rs is less than GPR rt the result is 1 (true),
otherwise 0 (false).
The arithmetic comparison does not cause an Integer Overflow exception.
Restrictions:
None
Operation:
if GPR[rs] < GPR[rt] then
    GPR[rd] ← 0^{GPRLEN-1} || 1
else
    GPR[rd] ← 0^{GPRLEN}
endif
Exceptions:
None
Encoding:
  Bits     31..26    25..21   20..16   15..11   10..6    5..0
  Field    SPECIAL   rs       rt       rd       0        SLT
  Value    000000                               00000    101010
  Width    6         5        5        5        5        6
Set on Less Than Immediate SLTI
Format: SLTI rt, rs, immediate MIPS I
Purpose: To record the result of a less-than comparison with a constant.
Description: rt ← (rs < immediate)
Compare the contents of GPR rs and the 16-bit signed immediate as signed integers and
record the Boolean result of the comparison in GPR rt. If GPR rs is less than immediate
the result is 1 (true), otherwise 0 (false).
The arithmetic comparison does not cause an Integer Overflow exception.
Restrictions:
None
Operation:
if GPR[rs] < sign_extend(immediate) then
    GPR[rt] ← 0^{GPRLEN-1} || 1
else
    GPR[rt] ← 0^{GPRLEN}
endif
Exceptions:
None
Encoding:
  Bits     31..26   25..21   20..16   15..0
  Field    SLTI     rs       rt       immediate
  Value    001010
  Width    6        5        5        16
SLTIU Set on Less Than Immediate Unsigned
Format: SLTIU rt, rs, immediate MIPS I
Purpose: To record the result of an unsigned less-than comparison with a constant.
Description: rt ← (rs < immediate)
Compare the contents of GPR rs and the sign-extended 16-bit immediate as unsigned
integers and record the Boolean result of the comparison in GPR rt. If GPR rs is less
than immediate the result is 1 (true), otherwise 0 (false).
Because the 16-bit immediate is sign-extended before comparison, the instruction is
able to represent the smallest or largest unsigned numbers. The representable values
are at the minimum [0, 32767] or maximum [max_unsigned-32767, max_unsigned]
end of the unsigned range.
The arithmetic comparison does not cause an Integer Overflow exception.
Restrictions:
None
Operation:
if (0 || GPR[rs]) < (0 || sign_extend(immediate)) then
    GPR[rt] ← 0^{GPRLEN-1} || 1
else
    GPR[rt] ← 0^{GPRLEN}
endif
Exceptions:
None
Encoding:
  Bits     31..26   25..21   20..16   15..0
  Field    SLTIU    rs       rt       immediate
  Value    001011
  Width    6        5        5        16
Set on Less Than Unsigned SLTU
Format: SLTU rd, rs, rt MIPS I
Purpose: To record the result of an unsigned less-than comparison.
Description: rd ← (rs < rt)
Compare the contents of GPR rs and GPR rt as unsigned integers and record the Boolean
result of the comparison in GPR rd. If GPR rs is less than GPR rt the result is 1 (true),
otherwise 0 (false).
The arithmetic comparison does not cause an Integer Overflow exception.
Restrictions:
None
Operation:
if (0 || GPR[rs]) < (0 || GPR[rt]) then
    GPR[rd] ← 0^{GPRLEN-1} || 1
else
    GPR[rd] ← 0^{GPRLEN}
endif
Exceptions:
None
Encoding:
  Bits     31..26    25..21   20..16   15..11   10..6    5..0
  Field    SPECIAL   rs       rt       rd       0        SLTU
  Value    000000                               00000    101011
  Width    6         5        5        5        5        6
SRA Shift Word Right Arithmetic
Format: SRA rd, rt, sa MIPS I
Purpose: To arithmetic right shift a word by a fixed number of bits.
Description: rd ← rt >> sa (arithmetic)
The contents of the low-order 32-bit word of GPR rt are shifted right, duplicating the
sign-bit (bit 31) in the emptied bits; the word result is placed in GPR rd. The bit shift
count is specified by sa. If rd is a 64-bit register, the result word is sign-extended.
Restrictions:
On 64-bit processors, if GPR rt does not contain a sign-extended 32-bit value
(bits 63..31 equal) then the result of the operation is undefined.
Operation:
if (NotWordValue(GPR[rt])) then UndefinedResult() endif
s ← sa
temp ← (GPR[rt]_{31})^s || GPR[rt]_{31..s}
GPR[rd] ← sign_extend(temp)
Exceptions:
None
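The SRA operation can be sketched in Python for a 64-bit register model. The assumptions here are not from the manual: register values are passed as unsigned Python integers, the function names sra and not_word_value are hypothetical, and the "undefined result" case is modeled by raising ValueError.

```python
MASK64 = (1 << 64) - 1


def not_word_value(value):
    """True if bits 63..31 of a 64-bit value are not all equal."""
    top = (value & MASK64) >> 31          # the 33 bits 63..31
    return top not in (0, (1 << 33) - 1)


def sra(rt, sa):
    """Model of SRA on a 64-bit processor: arithmetic right shift of the low word."""
    if not_word_value(rt):
        # The manual says the result is undefined; the sketch refuses instead.
        raise ValueError("rt is not a sign-extended 32-bit value")
    s = sa & 0x1F
    word = rt & 0xFFFFFFFF
    signed = word - (1 << 32) if word & 0x80000000 else word
    # Python's >> on a negative int replicates the sign bit, like (GPR[rt]_31)^s.
    return (signed >> s) & MASK64         # sign-extended into 64 bits
```

For a negative word such as 0x80000000 (held sign-extended in the register), shifting right by 4 yields 0xF8000000 in the low word with the upper 32 bits all set.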
Encoding:
  Bits     31..26    25..21   20..16   15..11   10..6    5..0
  Field    SPECIAL   0        rt       rd       sa       SRA
  Value    000000    00000                               000011
  Width    6         5        5        5        5        6
Shift Word Right Arithmetic Variable SRAV
Format: SRAV rd, rt, rs MIPS I
Purpose: To arithmetic right shift a word by a variable number of bits.
Description: rd ← rt >> rs (arithmetic)
The contents of the low-order 32-bit word of GPR rt are shifted right, duplicating the
sign-bit (bit 31) in the emptied bits; the word result is placed in GPR rd. The bit shift
count is specified by the low-order five bits of GPR rs. If rd is a 64-bit register, the result
word is sign-extended.
Restrictions:
On 64-bit processors, if GPR rt does not contain a sign-extended 32-bit value
(bits 63..31 equal) then the result of the operation is undefined.
Operation:
if (NotWordValue(GPR[rt])) then UndefinedResult() endif
s ← GPR[rs]_{4..0}
temp ← (GPR[rt]_{31})^s || GPR[rt]_{31..s}
GPR[rd] ← sign_extend(temp)
Exceptions:
None
Encoding:
  Bits     31..26    25..21   20..16   15..11   10..6    5..0
  Field    SPECIAL   rs       rt       rd       0        SRAV
  Value    000000                               00000    000111
  Width    6         5        5        5        5        6
SRL Shift Word Right Logical
Format: SRL rd, rt, sa MIPS I
Purpose: To logical right shift a word by a fixed number of bits.
Description: rd ← rt >> sa (logical)
The contents of the low-order 32-bit word of GPR rt are shifted right, inserting zeros
into the emptied bits; the word result is placed in GPR rd. The bit shift count is
specified by sa. If rd is a 64-bit register, the result word is sign-extended.
Restrictions:
On 64-bit processors, if GPR rt does not contain a sign-extended 32-bit value
(bits 63..31 equal) then the result of the operation is undefined.
Operation:
if (NotWordValue(GPR[rt])) then UndefinedResult() endif
s ← sa
temp ← 0^s || GPR[rt]_{31..s}
GPR[rd] ← sign_extend(temp)
Exceptions:
None
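A short Python sketch of SRL on a 64-bit register model shows one consequence of the operation above that is easy to miss: the 32-bit result is still sign-extended into the destination, so a shift count of zero leaves a negative word negative. The function names srl and sign_extend32 are hypothetical, and the restriction on non-sign-extended inputs is not modeled.

```python
MASK64 = (1 << 64) - 1


def sign_extend32(word):
    """Sign-extend a 32-bit word into a 64-bit register value."""
    word &= 0xFFFFFFFF
    return word | 0xFFFFFFFF00000000 if word & 0x80000000 else word


def srl(rt, sa):
    """Model of SRL: logical right shift of the low word, then sign-extend."""
    s = sa & 0x1F
    temp = (rt & 0xFFFFFFFF) >> s         # zeros enter from the left of the word
    return sign_extend32(temp)
```

With any non-zero shift count bit 31 of the result is zero, so the upper half of the destination is zero as well; only the zero-shift case can produce set upper bits.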
Encoding:
  Bits     31..26    25..21   20..16   15..11   10..6    5..0
  Field    SPECIAL   0        rt       rd       sa       SRL
  Value    000000    00000                               000010
  Width    6         5        5        5        5        6
Shift Word Right Logical Variable SRLV
Format: SRLV rd, rt, rs MIPS I
Purpose: To logical right shift a word by a variable number of bits.
Description: rd ← rt >> rs (logical)
The contents of the low-order 32-bit word of GPR rt are shifted right, inserting zeros
into the emptied bits; the word result is placed in GPR rd. The bit shift count is
specified by the low-order five bits of GPR rs. If rd is a 64-bit register, the result word
is sign-extended.
Restrictions:
On 64-bit processors, if GPR rt does not contain a sign-extended 32-bit value
(bits 63..31 equal) then the result of the operation is undefined.
Operation:
if (NotWordValue(GPR[rt])) then UndefinedResult() endif
s ← GPR[rs]_{4..0}
temp ← 0^s || GPR[rt]_{31..s}
GPR[rd] ← sign_extend(temp)
Exceptions:
None
Encoding:
  Bits     31..26    25..21   20..16   15..11   10..6    5..0
  Field    SPECIAL   rs       rt       rd       0        SRLV
  Value    000000                               00000    000110
  Width    6         5        5        5        5        6
SUB Subtract Word
Format: SUB rd, rs, rt MIPS I
Purpose: To subtract 32-bit integers. If overflow occurs, then trap.
Description: rd ← rs – rt
The 32-bit word value in GPR rt is subtracted from the 32-bit value in GPR rs to
produce a 32-bit result. If the subtraction results in 32-bit 2’s complement arithmetic
overflow then the destination register is not modified and an Integer Overflow
exception occurs. If it does not overflow, the 32-bit result is placed into GPR rd.
Restrictions:
On 64-bit processors, if either GPR rt or GPR rs does not contain a sign-extended 32-bit
value (bits 63..31 equal), then the result of the operation is undefined.
Operation:
if (NotWordValue(GPR[rs]) or NotWordValue(GPR[rt])) then UndefinedResult() endif
temp ← GPR[rs] – GPR[rt]
if (32_bit_arithmetic_overflow) then
    SignalException(IntegerOverflow)
else
    GPR[rd] ← temp
endif
Exceptions:
Integer Overflow
Programming Notes:
SUBU performs the same arithmetic operation but does not trap on overflow.
Encoding:
  Bits     31..26    25..21   20..16   15..11   10..6    5..0
  Field    SPECIAL   rs       rt       rd       0        SUB
  Value    000000                               00000    100010
  Width    6         5        5        5        5        6
Subtract Unsigned Word SUBU
Format: SUBU rd, rs, rt MIPS I
Purpose: To subtract 32-bit integers.
Description: rd ← rs – rt
The 32-bit word value in GPR rt is subtracted from the 32-bit value in GPR rs and the
32-bit arithmetic result is placed into GPR rd.
No integer overflow exception occurs under any circumstances.
Restrictions:
On 64-bit processors, if either GPR rt or GPR rs does not contain a sign-extended 32-bit
value (bits 63..31 equal), then the result of the operation is undefined.
Operation:
if (NotWordValue(GPR[rs]) or NotWordValue(GPR[rt])) then UndefinedResult() endif
temp ← GPR[rs] – GPR[rt]
GPR[rd] ← temp
Exceptions:
None
Programming Notes:
The term “unsigned” in the instruction name is a misnomer; this operation is 32-bit
modulo arithmetic that does not trap on overflow. It is appropriate for arithmetic
which is not signed, such as address arithmetic, or integer arithmetic environments
that ignore overflow, such as “C” language arithmetic.
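The difference between SUB and SUBU described above can be illustrated with a Python sketch of a 32-bit register model. The names sub, subu, to_signed32, and IntegerOverflow are hypothetical, register values are unsigned 32-bit integers, and the 64-bit sign-extension of the result is left out for brevity.

```python
MASK32 = 0xFFFFFFFF


class IntegerOverflow(Exception):
    """Stand-in for the Integer Overflow exception."""


def to_signed32(word):
    """Interpret a 32-bit word as a two's-complement signed integer."""
    word &= MASK32
    return word - (1 << 32) if word & 0x80000000 else word


def subu(rs, rt):
    """Model of SUBU: 32-bit modulo subtraction, never traps."""
    return (rs - rt) & MASK32


def sub(rs, rt):
    """Model of SUB: same arithmetic, but traps on signed 32-bit overflow."""
    diff = to_signed32(rs) - to_signed32(rt)
    if not -(1 << 31) <= diff < (1 << 31):
        raise IntegerOverflow()           # destination register is left unmodified
    return diff & MASK32
```

Subtracting 1 from 0x80000000 (the most negative word) wraps to 0x7FFFFFFF with subu, while sub raises the modeled exception; for operands that do not overflow, the two functions return identical bit patterns.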
Encoding:
  Bits     31..26    25..21   20..16   15..11   10..6    5..0
  Field    SPECIAL   rs       rt       rd       0        SUBU
  Value    000000                               00000    100011
  Width    6         5        5        5        5        6
SW Store Word
Format: SW rt, offset(base) MIPS I
Purpose: To store a word to memory.
Description: memory[base+offset] ← rt
The least-significant 32-bit word of register rt is stored in memory at the location
specified by the aligned effective address. The 16-bit signed offset is added to the
contents of GPR base to form the effective address.
Restrictions:
The effective address must be naturally aligned. If either of the two least-significant
bits of the address is non-zero, an Address Error exception occurs.
MIPS IV: The low-order 2 bits of the offset field must be zero. If they are not, the result
of the instruction is undefined.
Operation: 32-bit Processors
vAddr ← sign_extend(offset) + GPR[base]
if (vAddr_{1..0}) ≠ 0^2 then SignalException(AddressError) endif
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, STORE)
dataword ← GPR[rt]
StoreMemory (uncached, WORD, dataword, pAddr, vAddr, DATA)
Operation: 64-bit Processors
vAddr ← sign_extend(offset) + GPR[base]
if (vAddr_{1..0}) ≠ 0^2 then SignalException(AddressError) endif
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, STORE)
pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor (ReverseEndian || 0^2))
byte ← vAddr_{2..0} xor (BigEndianCPU || 0^2)
datadouble ← GPR[rt]_{63-8*byte..0} || 0^{8*byte}
StoreMemory (uncached, WORD, datadouble, pAddr, vAddr, DATA)
Exceptions:
TLB Refill, TLB Invalid
TLB Modified
Address Error
Encoding:
  Bits     31..26   25..21   20..16   15..0
  Field    SW       base     rt       offset
  Value    101011
  Width    6        5        5        16
Store Word From Coprocessor SWCz
Format: SWC1 rt, offset(base) MIPS I
SWC2 rt, offset(base)
SWC3 rt, offset(base)
Purpose: To store a word from a coprocessor general register to memory.
Description: memory[base+offset] ← rt
Coprocessor unit zz supplies a 32-bit word which is stored at the memory location
specified by the aligned effective address. The 16-bit signed offset is added to the
contents of GPR base to form the effective address.
The data supplied by each coprocessor is defined by the individual coprocessor
specifications. The usual operation would read the data from coprocessor general
register rt.
Each MIPS architecture level defines up to 4 coprocessor units, numbered 0 to 3 (see
Coprocessor Instructions on page A-11). The opcodes corresponding to coprocessors
that are not defined by an architecture level may be used for other instructions.
Restrictions:
Access to the coprocessors is controlled by system software. Each coprocessor has a
“coprocessor usable” bit in the System Control coprocessor. The usable bit must be set
for a user program to execute a coprocessor instruction. If the usable bit is not set, an
attempt to execute the instruction will result in a Coprocessor Unusable exception. An
unimplemented coprocessor must never be enabled. The result of executing this
instruction for an unimplemented coprocessor when the usable bit is set, is undefined.
This instruction is not available for coprocessor 0, the System Control coprocessor, and
the opcode may be used for other instructions.
The effective address must be naturally aligned. If either of the two least-significant
bits of the address is non-zero, an Address Error exception occurs.
MIPS IV: The low-order 2 bits of the offset field must be zero. If they are not, the result
of the instruction is undefined.
Operation: 32-bit processors
vAddr ← sign_extend(offset) + GPR[base]
if (vAddr_{1..0}) ≠ 0^2 then SignalException(AddressError) endif
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, STORE)
dataword ← COP_SW (z, rt)
StoreMemory (uncached, WORD, dataword, pAddr, vAddr, DATA)
Encoding:
  Bits     31..26   25..21   20..16   15..0
  Field    SWCz     base     rt       offset
  Value    1110zz
  Width    6        5        5        16
Operation: 64-bit processors
vAddr ← sign_extend(offset) + GPR[base]
if (vAddr_{1..0}) ≠ 0^2 then SignalException(AddressError) endif
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, STORE)
pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor (ReverseEndian || 0^2))
byte ← vAddr_{2..0} xor (BigEndianCPU || 0^2)
dataword ← COP_SW (z, rt)
datadouble ← 0^{32-8*byte} || dataword || 0^{8*byte}
StoreMemory (uncached, WORD, datadouble, pAddr, vAddr, DATA)
Exceptions:
TLB Refill, TLB Invalid
TLB Modified
Address Error
Reserved Instruction
Coprocessor Unusable
Store Word Left SWL
Format: SWL rt, offset(base) MIPS I
Purpose: To store the most-significant part of a word to an unaligned memory
address.
Description: memory[base+offset] ← rt
The 16-bit signed offset is added to the contents of GPR base to form an effective address
(EffAddr). EffAddr is the address of the most-significant of four consecutive bytes
forming a word in memory (W) starting at an arbitrary byte boundary. A part of W, the
most-significant one to four bytes, is in the aligned word containing EffAddr. The same
number of the most-significant (left) bytes from the word in GPR rt are stored into
these bytes of W.
If GPR rt is a 64-bit register, the source word is the low word of the register.
Figure A-8 illustrates this operation for big-endian byte ordering for 32-bit and 64-bit
registers. The four consecutive bytes in 2..5 form an unaligned word starting at
location 2. A part of W, two bytes, is contained in the aligned word containing the
most-significant byte at 2. First, SWL stores the most-significant two bytes of the low-
word from the source register into these two bytes in memory. Next, the
complementary SWR stores the remainder of the unaligned word.
Figure A-8 Unaligned Word Store using SWL and SWR.
Word at byte 2 in memory, big-endian byte order; each memory byte initially contains
its address. Bytes are listed from most to least significant.
  64-bit GPR 24:              A B C D E F G H
  32-bit GPR 24:                      E F G H
  Memory, initial contents:   0 1 2 3 4 5 6 7 8 …
  After SWL $24,2($0):        0 1 E F 4 5 6 …
  Then after SWR $24,5($0):   0 1 E F G H 6 …
Encoding:
  Bits     31..26   25..21   20..16   15..0
  Field    SWL      base     rt       offset
  Value    101010
  Width    6        5        5        16
The bytes stored from the source register to memory depend on both the offset of the
effective address within an aligned word, i.e. the low two bits of the address
(vAddr_{1..0}), and the current byte ordering mode of the processor (big- or little-endian).
The table below shows the bytes stored for every combination of offset and byte
ordering.
Operation: 32-bit Processors
vAddr ← sign_extend(offset) + GPR[base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, STORE)
pAddr ← pAddr_{(PSIZE-1)..2} || (pAddr_{1..0} xor ReverseEndian^{2})
if BigEndianMem = 0 then
    pAddr ← pAddr_{(PSIZE-1)..2} || 0^{2}
endif
byte ← vAddr_{1..0} xor BigEndianCPU^{2}
dataword ← 0^{24-8*byte} || GPR[rt]_{31..24-8*byte}
StoreMemory (uncached, byte, dataword, pAddr, vAddr, DATA)
Table A-35 Bytes Stored by SWL Instruction
Memory contents and byte offsets (most significant on the left, least significant on the right):
  big-endian offset (vAddr_{1..0}):       0 1 2 3
  memory contents:                        i j k l
  little-endian offset (vAddr_{1..0}):    3 2 1 0
Initial contents of source register (most significant on the left):
  64-bit register:    A B C D E F G H
  32-bit register:            E F G H
Memory contents after instruction (lowercase bytes are unchanged):
  vAddr_{1..0}    Big-endian byte ordering    Little-endian byte ordering
  0               E F G H                     i j k E
  1               i E F G                     i j E F
  2               i j E F                     i E F G
  3               i j k E                     E F G H
Operation: 64-bit Processors
vAddr ← sign_extend(offset) + GPR[base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, STORE)
pAddr ← pAddr_{(PSIZE-1)..3} || (pAddr_{2..0} xor ReverseEndian^{3})
if BigEndianMem = 0 then
    pAddr ← pAddr_{(PSIZE-1)..2} || 0^{2}
endif
byte ← vAddr_{1..0} xor BigEndianCPU^{2}
if (vAddr_{2} xor BigEndianCPU) = 0 then
    datadouble ← 0^{32} || 0^{24-8*byte} || GPR[rt]_{31..24-8*byte}
else
    datadouble ← 0^{24-8*byte} || GPR[rt]_{31..24-8*byte} || 0^{32}
endif
StoreMemory(uncached, byte, datadouble, pAddr, vAddr, DATA)
Exceptions:
TLB Refill, TLB Invalid
TLB Modified
Bus Error
Address Error
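The byte selection performed by SWL can be checked with a small software model. The following Python sketch is an illustration only, not part of the architecture definition: the names swl, table_row and REG are hypothetical, memory is modelled as a bytearray, and address translation, exceptions and caching are ignored. Under those assumptions it reproduces the rows of Table A-35.

```python
# Hypothetical model of SWL; all names are illustrative, not from the manual.
REG = int.from_bytes(b"ABCDEFGH", "big")   # 64-bit GPR; the low word is E F G H


def swl(memory, addr, reg, big_endian=True):
    """Store the most-significant bytes of the low word of reg, SWL-style."""
    src = (reg & 0xFFFFFFFF).to_bytes(4, "big")   # src[0] is most significant
    offset = addr & 3                             # vAddr bits 1..0
    if big_endian:
        # Fill from addr up to the end of the aligned word.
        for i in range(4 - offset):
            memory[addr + i] = src[i]
    else:
        # Fill from addr down to the start of the aligned word.
        for i in range(offset + 1):
            memory[addr - i] = src[i]
    return memory


def table_row(offset):
    """Return (big-endian, little-endian) results in significance order."""
    be = swl(bytearray(b"ijkl"), offset, REG)
    # Little-endian memory in address order: 'l' at offset 0, 'i' at offset 3.
    le = swl(bytearray(b"lkji"), offset, REG, big_endian=False)
    return be.decode(), le[::-1].decode()


for offset in range(4):
    print(offset, *table_row(offset))
```

The little-endian result is reversed before printing because the table lists bytes by significance rather than by address.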
SWR Store Word Right
Format: SWR rt, offset(base) MIPS I
Purpose: To store the least-significant part of a word to an unaligned memory
address.
Description: memory[base+offset] ← rt
The 16-bit signed offset is added to the contents of GPR base to form an effective address
(EffAddr). EffAddr is the address of the least-significant of four consecutive bytes
forming a word in memory (W) starting at an arbitrary byte boundary. A part of W, the
least-significant one to four bytes, is in the aligned word containing EffAddr. The same
number of the least-significant (right) bytes from the word in GPR rt are stored into
these bytes of W.
If GPR rt is a 64-bit register, the source word is the low word of the register.
Figure A-9 illustrates this operation for big-endian byte ordering for 32-bit and 64-bit
registers. The four consecutive bytes in 2..5 form an unaligned word starting at
location 2. A part of W, two bytes, is contained in the aligned word containing the least-
significant byte at 5. First, SWR stores the least-significant two bytes of the low-word
from the source register into these two bytes in memory. Next, the complementary
SWL stores the remainder of the unaligned word.
Instruction encoding:
  Bits:    31..26    25..21    20..16    15..0
  Field:   SWR       base      rt        offset
  Value:   101110
  Width:   6         5         5         16
Figure A-9 Unaligned Word Store using SWR and SWL.
Word at byte 2 in memory, big-endian byte order; each memory byte contains its address.
Register contents are shown most significant on the left, least significant on the right.
  Memory, initial contents:         0 1 2 3 4 5 6 7 8 …
  64-bit GPR 24:                    A B C D E F G H
  32-bit GPR 24:                            E F G H
  After executing SWR $24,5($0):    0 1 2 3 G H 6 …
  Then after SWL $24,2($0):         0 1 E F G H 6 …
The bytes stored from the source register to memory depend on both the offset of the
effective address within an aligned word, i.e. the low two bits of the address
(vAddr_{1..0}), and the current byte ordering mode of the processor (big- or little-endian).
The table below shows the bytes stored for every combination of offset and byte
ordering.
Restrictions:
None
Operation: 32-bit Processors
vAddr ← sign_extend(offset) + GPR[base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, STORE)
pAddr ← pAddr_{(PSIZE-1)..2} || (pAddr_{1..0} xor ReverseEndian^{2})
if BigEndianMem = 0 then
    pAddr ← pAddr_{(PSIZE-1)..2} || 0^{2}
endif
byte ← vAddr_{1..0} xor BigEndianCPU^{2}
dataword ← GPR[rt]_{31-8*byte..0} || 0^{8*byte}
StoreMemory (uncached, WORD-byte, dataword, pAddr, vAddr, DATA)
Table A-36 Bytes Stored by SWR Instruction
Memory contents and byte offsets (most significant on the left, least significant on the right):
  big-endian offset (vAddr_{1..0}):       0 1 2 3
  memory contents:                        i j k l
  little-endian offset (vAddr_{1..0}):    3 2 1 0
Initial contents of source register (most significant on the left):
  64-bit register:    A B C D E F G H
  32-bit register:            E F G H
Memory contents after instruction (lowercase bytes are unchanged):
  vAddr_{1..0}    Big-endian byte ordering    Little-endian byte ordering
  0               H j k l                     E F G H
  1               G H k l                     F G H l
  2               F G H l                     G H k l
  3               E F G H                     H j k l
Operation: 64-bit Processors
vAddr ← sign_extend(offset) + GPR[base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, STORE)
pAddr ← pAddr_{(PSIZE-1)..3} || (pAddr_{2..0} xor ReverseEndian^{3})
if BigEndianMem = 0 then
    pAddr ← pAddr_{(PSIZE-1)..2} || 0^{2}
endif
byte ← vAddr_{1..0} xor BigEndianCPU^{2}
if (vAddr_{2} xor BigEndianCPU) = 0 then
    datadouble ← 0^{32} || GPR[rt]_{31-8*byte..0} || 0^{8*byte}
else
    datadouble ← GPR[rt]_{31-8*byte..0} || 0^{8*byte} || 0^{32}
endif
StoreMemory(uncached, WORD-byte, datadouble, pAddr, vAddr, DATA)
Exceptions:
TLB Refill, TLB Invalid
TLB Modified
Bus Error
Address Error
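As a cross-check of Table A-36 and of the SWL/SWR pairing shown in the figure, the following Python sketch models SWR and a big-endian unaligned word store. It is a simplified illustration under stated assumptions: the function names are hypothetical, memory is a bytearray, and address translation and exceptions are not modelled.

```python
# Hypothetical model of SWR and of the SWL/SWR pair; names are illustrative.
REG = int.from_bytes(b"ABCDEFGH", "big")   # 64-bit GPR; the low word is E F G H


def low_word_bytes(reg):
    """Return the low 32 bits of reg as 4 bytes, most significant first."""
    return (reg & 0xFFFFFFFF).to_bytes(4, "big")


def swr(memory, addr, reg, big_endian=True):
    """Store the least-significant bytes of the low word of reg, SWR-style."""
    src = low_word_bytes(reg)               # src[3] is least significant
    offset = addr & 3                       # vAddr bits 1..0
    if big_endian:
        # Fill from addr down to the start of the aligned word.
        for i in range(offset + 1):
            memory[addr - i] = src[3 - i]
    else:
        # Fill from addr up to the end of the aligned word.
        for i in range(4 - offset):
            memory[addr + i] = src[3 - i]
    return memory


def swl_big_endian(memory, addr, reg):
    """Big-endian SWL: fill from addr up to the end of the aligned word."""
    src = low_word_bytes(reg)
    for i in range(4 - (addr & 3)):
        memory[addr + i] = src[i]
    return memory


def store_unaligned_big_endian(memory, addr, reg):
    """SWL at the first byte of the word, then SWR at its last byte."""
    swl_big_endian(memory, addr, reg)
    swr(memory, addr + 3, reg)
    return memory


# Each memory byte initially contains its own address, as in the figure.
print(store_unaligned_big_endian(bytearray(b"01234567"), 2, REG).decode())
```

The two stores can be issued in either order; each one writes only the bytes that fall inside its own aligned word.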
Synchronize Shared Memory SYNC
Format: SYNC (stype = 0 implied) MIPS II
Purpose: To order loads and stores to shared memory in a multiprocessor system.
Description:
To serve a broad audience, two descriptions are given. A simple description of SYNC
that appeals to intuition is followed by a precise and detailed description.
A Simple Description:
SYNC affects only uncached and cached coherent loads and stores. The loads and
stores that occur prior to the SYNC must be completed before the loads and stores after
the SYNC are allowed to start.
Loads are completed when the destination register is written. Stores are completed
when the stored value is visible to every other processor in the system.
A Precise Description:
If the stype field has a value of zero, every synchronizable load and store that occurs in
the instruction stream prior to the SYNC instruction must be globally performed before
any synchronizable load or store that occurs after the SYNC may be performed with
respect to any other processor or coherent I/O module.
SYNC does not guarantee the order in which instruction fetches are performed.
The stype values 1-31 are reserved; they produce the same result as the value zero.
Synchronizable: A load or store instruction is synchronizable if the load or store occurs
to a physical location in shared memory using a virtual location with a memory access
type of either uncached or cached coherent. Shared memory is memory that can be
accessed by more than one processor or by a coherent I/O system module.
Memory Access Types on page A-12 contains information on memory access types.
Performed load: A load instruction is performed when the value returned by the load
has been determined. The result of a load on processor A has been determined with
respect to processor or coherent I/O module B when a subsequent store to the location
by B cannot affect the value returned by the load. The store by B must use the same
memory access type as the load.
Performed store: A store instruction is performed when the store is observable. A store
on processor A is observable with respect to processor or coherent I/O module B when
a subsequent load of the location by B returns the value written by the store. The load
by B must use the same memory access type as the store.
Instruction encoding:
  Bits:    31..26     25..11             10..6    5..0
  Field:   SPECIAL    0                  stype    SYNC
  Value:   000000     000000000000000             001111
  Width:   6          15                 5        6
Globally performed load: A load instruction is globally performed when it is performed
with respect to all processors and coherent I/O modules capable of storing to the
location.
Globally performed store: A store instruction is globally performed when it is globally
observable. It is globally observable when it is observable by all processors and I/O
modules capable of loading from the location.
Coherent I/O module: A coherent I/O module is an Input/Output system component
that performs coherent Direct Memory Access (DMA). It reads and writes memory
independently as though it were a processor doing loads and stores to locations with
a memory access type of cached coherent.
Restrictions:
The effect of SYNC on the global order of the effects of loads and stores for memory
access types other than uncached and cached coherent is not defined.
Operation:
SyncOperation(stype)
Exceptions:
Reserved Instruction
Programming Notes:
A processor executing load and store instructions observes the effects of the loads and
stores that use the same memory access type in the order that they occur in the
instruction stream; this is known as program order. A parallel program has multiple
instruction streams that can execute at the same time on different processors. In
multiprocessor (MP) systems, the order in which the effects of loads and stores are
observed by other processors, the global order of the loads and stores, determines the
actions necessary to reliably share data in parallel programs.
When all processors observe the effects of loads and stores in program order, the
system is strongly ordered. On such systems, parallel programs can reliably share data
without explicit actions in the programs. For such a system, SYNC has the same effect
as a NOP. Executing SYNC on such a system is not necessary, but is also not an error.
If a multiprocessor system is not strongly ordered, the effects of load and store
instructions executed by one processor may be observed out of program order by other
processors. On such systems, parallel programs must take explicit actions in order to
reliably share data. At critical points in the program, the effects of loads and stores
from an instruction stream must occur in the same order for all processors. SYNC
separates the loads and stores executed on the processor into two groups and the
effects of these groups are seen in program order by all processors. The effect of all
loads and stores in one group is seen by all processors before the effect of any load or
store in the other group. In effect, SYNC causes the system to be strongly ordered for
the executing processor at the instant that the SYNC is executed.
Many MIPS-based multiprocessor systems are strongly ordered or have a mode in
which they operate as strongly ordered for at least one memory access type. The MIPS
architecture also permits MP systems that are not strongly ordered. SYNC enables the
reliable use of shared memory on such systems. A parallel program that does not use
SYNC will generally not operate on a system that is not strongly ordered; however, a
program that does use SYNC will work on both types of systems. System-specific
documentation will describe the actions necessary to reliably share data in parallel
programs for that system.
The behavior of a load or store using one memory access type is undefined if a load or
store was previously made to the same physical location using a different memory
access type. The presence of a SYNC between the references does not alter this
behavior. See page A-13 for a more complete discussion.
SYNC affects the order in which the effects of load and store instructions appear to all
processors; it does not generally affect the physical memory-system ordering or
synchronization issues that arise in system programming. The effect of SYNC on
implementation specific aspects of the cached memory system, such as writeback
buffers, is not defined. The effect of SYNC on reads or writes to memory caused by
privileged implementation-specific instructions, such as CACHE, is not defined.
Prefetch operations have no effects detectable by user-mode programs so ordering the
effects of prefetch operations is not meaningful.
EXAMPLE: These code fragments show how SYNC can be used to coordinate the
use of shared data between separate writer and reader instruction streams in a
multiprocessor environment. The FLAG location is used by the instruction streams to
determine whether the shared data item DATA is valid. The SYNC executed by
processor A forces the store of DATA to be performed globally before the store to FLAG
is performed. The SYNC executed by processor B ensures that DATA is not read until
after the FLAG value indicates that the shared data is valid.
Processor A (writer)
    # Conditions at entry:
    # The value 0 has been stored in FLAG and that value is observable by B.
    SW   R1, DATA      # change shared DATA value
    LI   R2, 1
    SYNC               # perform DATA store before performing FLAG store
    SW   R2, FLAG      # say that the shared DATA value is valid
Processor B (reader)
    LI   R2, 1
1:  LW   R1, FLAG      # get FLAG
    BNE  R2, R1, 1B    # if it says that DATA is not valid, poll again
    NOP
    SYNC               # FLAG value checked before doing DATA reads
    LW   R1, DATA      # read (valid) shared DATA values
Implementation Notes:
There may be side effects of uncached loads and stores that affect cached coherent load
and store operations. To permit the reliable use of such side effects, buffered uncached
stores that occur before the SYNC must be written to memory before cached coherent
loads and stores after the SYNC may be performed.
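The writer side of the DATA/FLAG example can be pictured with a deliberately simplified toy model. The Python sketch below is an assumption-laden illustration, not a description of any MIPS implementation: the class WeaklyOrderedWriter, its store buffer, and the adversarial "newest store becomes visible first" policy are all hypothetical. It only shows why a reader could observe FLAG = 1 together with a stale DATA value when the writer omits SYNC on a system that is not strongly ordered.

```python
# Toy model only: a writer whose stores sit in a buffer and may become
# globally visible out of program order unless SYNC is executed.
class WeaklyOrderedWriter:
    def __init__(self, shared):
        self.shared = shared      # memory as seen by every other processor
        self.buffer = []          # stores not yet globally performed

    def sw(self, name, value):
        """Issue a store; it is buffered, not yet visible to others."""
        self.buffer.append((name, value))

    def sync(self):
        """Globally perform all earlier stores before any later one."""
        for name, value in self.buffer:
            self.shared[name] = value
        self.buffer.clear()

    def drain_one_out_of_order(self):
        """Adversarial choice: the newest buffered store becomes visible first."""
        name, value = self.buffer.pop()
        self.shared[name] = value


def run(use_sync):
    """Return the DATA value a reader sees once FLAG reads as 1."""
    shared = {"DATA": 0, "FLAG": 0}
    writer = WeaklyOrderedWriter(shared)
    writer.sw("DATA", 42)         # change shared DATA value
    if use_sync:
        writer.sync()             # perform DATA store before FLAG store
    writer.sw("FLAG", 1)          # say that DATA is valid
    writer.drain_one_out_of_order()
    return shared["DATA"] if shared["FLAG"] == 1 else None


print("without SYNC:", run(use_sync=False))
print("with SYNC:   ", run(use_sync=True))
```

The model covers only store visibility from the writer; the reader-side SYNC, which keeps the DATA load from being performed before the FLAG load, is not modelled.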
System Call SYSCALL
Format: SYSCALL MIPS I
Purpose: To cause a System Call exception.
Description:
A system call exception occurs, immediately and unconditionally transferring control
to the exception handler.
The code field is available for use as software parameters, but is retrieved by the
exception handler only by loading the contents of the memory word containing the
instruction.
Restrictions:
None
Operation:
SignalException(SystemCall)
Exceptions:
System Call
Instruction encoding:
  Bits:    31..26     25..6    5..0
  Field:   SPECIAL    Code     SYSCALL
  Value:   000000              001100
  Width:   6          20       6
TEQ Trap if Equal
Format: TEQ rs, rt MIPS II
Purpose: To compare GPRs and do a conditional Trap.
Description: if (rs = rt) then Trap
Compare the contents of GPR rs and GPR rt as signed integers; if GPR rs is equal to
GPR rt then take a Trap exception.
The contents of the code field are ignored by hardware and may be used to encode
information for system software. To retrieve the information, system software must
load the instruction word from memory.
Restrictions:
None
Operation:
if GPR[rs] = GPR[rt] then
SignalException(Trap)
endif
Exceptions:
Reserved Instruction
Trap
Instruction encoding:
  Bits:    31..26     25..21    20..16    15..6    5..0
  Field:   SPECIAL    rs        rt        code     TEQ
  Value:   000000                                  110100
  Width:   6          5         5         10       6
Trap if Equal Immediate TEQI
Format: TEQI rs, immediate MIPS II
Purpose: To compare a GPR to a constant and do a conditional Trap.
Description: if (rs = immediate) then Trap
Compare the contents of GPR rs and the 16-bit signed immediate as signed integers; if
GPR rs is equal to immediate then take a Trap exception.
Restrictions:
None
Operation:
if GPR[rs] = sign_extend(immediate) then
SignalException(Trap)
endif
Exceptions:
Reserved Instruction
Trap
Instruction encoding:
  Bits:    31..26    25..21    20..16    15..0
  Field:   REGIMM    rs        TEQI      immediate
  Value:   000001              01100
  Width:   6         5         5         16
TGE Trap if Greater or Equal
Format: TGE rs, rt MIPS II
Purpose: To compare GPRs and do a conditional Trap.
Description: if (rs ≥ rt) then Trap
Compare the contents of GPR rs and GPR rt as signed integers; if GPR rs is greater than
or equal to GPR rt then take a Trap exception.
The contents of the code field are ignored by hardware and may be used to encode
information for system software. To retrieve the information, system software must
load the instruction word from memory.
Restrictions:
None
Operation:
if GPR[rs] ≥ GPR[rt] then
SignalException(Trap)
endif
Exceptions:
Reserved Instruction
Trap
Instruction encoding:
  Bits:    31..26     25..21    20..16    15..6    5..0
  Field:   SPECIAL    rs        rt        code     TGE
  Value:   000000                                  110000
  Width:   6          5         5         10       6
Trap if Greater or Equal Immediate TGEI
Format: TGEI rs, immediate MIPS II
Purpose: To compare a GPR to a constant and do a conditional Trap.
Description: if (rs ≥ immediate) then Trap
Compare the contents of GPR rs and the 16-bit signed immediate as signed integers; if
GPR rs is greater than or equal to immediate then take a Trap exception.
Restrictions:
None
Operation:
if GPR[rs] ≥ sign_extend(immediate) then
SignalException(Trap)
endif
Exceptions:
Reserved Instruction
Trap
Instruction encoding:
  Bits:    31..26    25..21    20..16    15..0
  Field:   REGIMM    rs        TGEI      immediate
  Value:   000001              01000
  Width:   6         5         5         16
TGEIU Trap If Greater Or Equal Immediate Unsigned
Format: TGEIU rs, immediate MIPS II
Purpose: To compare a GPR to a constant and do a conditional Trap.
Description: if (rs ≥ immediate) then Trap
Compare the contents of GPR rs and the 16-bit sign-extended immediate as unsigned
integers; if GPR rs is greater than or equal to immediate then take a Trap exception.
Because the 16-bit immediate is sign-extended before comparison, the instruction is able
to represent the smallest or largest unsigned numbers. The representable values are at
the minimum [0, 32767] or maximum [max_unsigned-32767, max_unsigned] end of
the unsigned range.
Restrictions:
None
Operation:
if (0 || GPR[rs]) ≥ (0 || sign_extend(immediate)) then
SignalException(Trap)
endif
Exceptions:
Reserved Instruction
Trap
Instruction encoding:
  Bits:    31..26    25..21    20..16    15..0
  Field:   REGIMM    rs        TGEIU     immediate
  Value:   000001              01001
  Width:   6         5         5         16
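The effect of sign-extending the immediate before an unsigned comparison can be seen in a short model. The Python sketch below is illustrative only: the function names are hypothetical, a 32-bit register width is assumed by default, and the signed TGEI comparison is included purely for contrast.

```python
# Illustrative model of the TGEIU and TGEI trap conditions; names are
# hypothetical and a 32-bit GPR is assumed unless width is given.
def sign_extend16(immediate, width=32):
    """Sign-extend a 16-bit immediate to an unsigned width-bit pattern."""
    immediate &= 0xFFFF
    if immediate & 0x8000:
        return immediate | (((1 << width) - 1) & ~0xFFFF)
    return immediate


def to_signed(value, width=32):
    """Interpret a width-bit pattern as a two's-complement integer."""
    value &= (1 << width) - 1
    return value - (1 << width) if value >> (width - 1) else value


def tgeiu_traps(rs, immediate, width=32):
    """Unsigned compare of rs against the sign-extended immediate."""
    return (rs & ((1 << width) - 1)) >= sign_extend16(immediate, width)


def tgei_traps(rs, immediate, width=32):
    """Signed compare of rs against the sign-extended immediate."""
    return to_signed(rs, width) >= to_signed(sign_extend16(immediate, width), width)


# An immediate of 0x8000 becomes max_unsigned - 32767 after sign extension.
print(hex(sign_extend16(0x8000)))
# Same operands, different outcome: 0xFFFF is -1 signed but the maximum unsigned.
print(tgei_traps(0x7FFFFFFF, 0xFFFF), tgeiu_traps(0x7FFFFFFF, 0xFFFF))
```

Immediates with bit 15 clear select the low end of the unsigned range, and immediates with bit 15 set select the high end; values in between cannot be expressed directly.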
Trap If Greater or Equal Unsigned TGEU
Format: TGEU rs, rt MIPS II
Purpose: To compare GPRs and do a conditional Trap.
Description: if (rs ≥ rt) then Trap
Compare the contents of GPR rs and GPR rt as unsigned integers; if GPR rs is greater
than or equal to GPR rt then take a Trap exception.
The contents of the code field are ignored by hardware and may be used to encode
information for system software. To retrieve the information, system software must
load the instruction word from memory.
Restrictions:
None
Operation:
if (0 || GPR[rs]) ≥ (0 || GPR[rt]) then
SignalException(Trap)
endif
Exceptions:
Reserved Instruction
Trap
Instruction encoding:
  Bits:    31..26     25..21    20..16    15..6    5..0
  Field:   SPECIAL    rs        rt        code     TGEU
  Value:   000000                                  110001
  Width:   6          5         5         10       6
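The Operation pseudocode writes the unsigned comparison as (0 || GPR[rs]) ≥ (0 || GPR[rt]): prepending a zero bit makes each operand non-negative, so the register bit patterns are compared as magnitudes. The Python sketch below contrasts this with the signed TGE comparison. It is a minimal illustration with hypothetical function names and an assumed 32-bit register width.

```python
# Illustrative model of the TGE and TGEU trap conditions; names are
# hypothetical and a 32-bit GPR is assumed unless width is given.
def to_signed(value, width=32):
    """Interpret a width-bit pattern as a two's-complement integer."""
    value &= (1 << width) - 1
    return value - (1 << width) if value >> (width - 1) else value


def tge_traps(rs, rt, width=32):
    """TGE: compare the register contents as signed integers."""
    return to_signed(rs, width) >= to_signed(rt, width)


def tgeu_traps(rs, rt, width=32):
    """TGEU: (0 || GPR[rs]) >= (0 || GPR[rt]), an unsigned comparison."""
    mask = (1 << width) - 1
    # With a zero bit prepended, neither operand can be negative.
    return (rs & mask) >= (rt & mask)


# 0xFFFFFFFF is -1 when signed and the largest value when unsigned.
print(tge_traps(0xFFFFFFFF, 1), tgeu_traps(0xFFFFFFFF, 1))
```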
TLT Trap if Less Than
Format: TLT rs, rt MIPS II
Purpose: To compare GPRs and do a conditional Trap.
Description: if (rs < rt) then Trap
Compare the contents of GPR rs and GPR rt as signed integers; if GPR rs is less than GPR rt
then take a Trap exception.
The contents of the code field are ignored by hardware and may be used to encode
information for system software. To retrieve the information, system software must
load the instruction word from memory.
Restrictions:
None
Operation:
if GPR[rs] < GPR[rt] then
SignalException(Trap)
endif
Exceptions:
Reserved Instruction
Trap
Instruction encoding:
  Bits:    31..26     25..21    20..16    15..6    5..0
  Field:   SPECIAL    rs        rt        code     TLT
  Value:   000000                                  110010
  Width:   6          5         5         10       6
Trap if Less Than Immediate TLTI
Format: TLTI rs, immediate MIPS II
Purpose: To compare a GPR to a constant and do a conditional Trap.
Description: if (rs < immediate) then Trap
Compare the contents of GPR rs and the 16-bit signed immediate as signed integers; if
GPR rs is less than immediate then take a Trap exception.
Restrictions:
None
Operation:
if GPR[rs] < sign_extend(immediate) then
SignalException(Trap)
endif
Exceptions:
Reserved Instruction
Trap
Instruction encoding:
  Bits:    31..26    25..21    20..16    15..0
  Field:   REGIMM    rs        TLTI      immediate
  Value:   000001              01010
  Width:   6         5         5         16
TLTIU Trap if Less Than Immediate Unsigned
Format: TLTIU rs, immediate MIPS II
Purpose: To compare a GPR to a constant and do a conditional Trap.
Description: if (rs < immediate) then Trap
Compare the contents of GPR rs and the 16-bit sign-extended immediate as unsigned
integers; if GPR rs is less than immediate then take a Trap exception.
Because the 16-bit immediate is sign-extended before comparison, the instruction is able
to represent the smallest or largest unsigned numbers. The representable values are at
the minimum [0, 32767] or maximum [max_unsigned-32767, max_unsigned] end of
the unsigned range.
Restrictions:
None
Operation:
if (0 || GPR[rs]) < (0 || sign_extend(immediate)) then
SignalException(Trap)
endif
Exceptions:
Reserved Instruction
Trap
Instruction encoding:
  Bits:    31..26    25..21    20..16    15..0
  Field:   REGIMM    rs        TLTIU     immediate
  Value:   000001              01011
  Width:   6         5         5         16
Trap if Less Than Unsigned TLTU
Format: TLTU rs, rt MIPS II
Purpose: To compare GPRs and do a conditional Trap.
Description: if (rs < rt) then Trap
Compare the contents of GPR rs and GPR rt as unsigned integers; if GPR rs is less than
GPR rt then take a Trap exception.
The contents of the code field are ignored by hardware and may be used to encode
information for system software. To retrieve the information, system software must
load the instruction word from memory.
Restrictions:
None
Operation:
if (0 || GPR[rs]) < (0 || GPR[rt]) then
SignalException(Trap)
endif
Exceptions:
Reserved Instruction
Trap
Instruction encoding:
  Bits:    31..26     25..21    20..16    15..6    5..0
  Field:   SPECIAL    rs        rt        code     TLTU
  Value:   000000                                  110011
  Width:   6          5         5         10       6
TNE Trap if Not Equal
Format: TNE rs, rt MIPS II
Purpose: To compare GPRs and do a conditional Trap.
Description: if (rs ≠ rt) then Trap
Compare the contents of GPR rs and GPR rt as signed integers; if GPR rs is not equal to
GPR rt then take a Trap exception.
The contents of the code field are ignored by hardware and may be used to encode
information for system software. To retrieve the information, system software must
load the instruction word from memory.
Restrictions:
None
Operation:
if GPR[rs] ≠ GPR[rt] then
SignalException(Trap)
endif
Exceptions:
Reserved Instruction
Trap
Instruction encoding:
  Bits:    31..26     25..21    20..16    15..6    5..0
  Field:   SPECIAL    rs        rt        code     TNE
  Value:   000000                                  110110
  Width:   6          5         5         10       6
Trap if Not Equal Immediate TNEI
Format: TNEI rs, immediate MIPS II
Purpose: To compare a GPR to a constant and do a conditional Trap.
Description: if (rs ≠ immediate) then Trap
Compare the contents of GPR rs and the 16-bit signed immediate as signed integers; if
GPR rs is not equal to immediate then take a Trap exception.
Restrictions:
None
Operation:
if GPR[rs] ≠ sign_extend(immediate) then
SignalException(Trap)
endif
Exceptions:
Reserved Instruction
Trap
Instruction encoding:
  Bits:    31..26    25..21    20..16    15..0
  Field:   REGIMM    rs        TNEI      immediate
  Value:   000001              01110
  Width:   6         5         5         16
XOR Exclusive OR
Format: XOR rd, rs, rt MIPS I
Purpose: To do a bitwise logical EXCLUSIVE OR.
Description: rd ← rs XOR rt
Combine the contents of GPR rs and GPR rt in a bitwise logical exclusive OR operation
and place the result into GPR rd.
Restrictions:
None
Operation:
GPR[rd] ← GPR[rs] xor GPR[rt]
Exceptions:
None
Instruction encoding:
  Bits:    31..26     25..21    20..16    15..11    10..6    5..0
  Field:   SPECIAL    rs        rt        rd        0        XOR
  Value:   000000                                   00000    100110
  Width:   6          5         5         5         5        6
Exclusive OR Immediate XORI
Format: XORI rt, rs, immediate MIPS I
Purpose: To do a bitwise logical EXCLUSIVE OR with a constant.
Description: rt ← rs XOR immediate
Combine the contents of GPR rs and the 16-bit zero-extended immediate in a bitwise
logical exclusive OR operation and place the result into GPR rt.
Restrictions:
None
Operation:
GPR[rt] ← GPR[rs] xor zero_extend(immediate)
Exceptions:
None
Instruction encoding:
  Bits:    31..26    25..21    20..16    15..0
  Field:   XORI      rs        rt        immediate
  Value:   001110
  Width:   6         5         5         16
A 7 CPU Instruction Formats
A CPU instruction is a single 32-bit aligned word. The major instruction formats are
shown in Figure A-10.
Figure A-10 CPU Instruction Formats
I-Type (Immediate).
  Bits:    31..26    25..21    20..16    15..0
  Field:   opcode    rs        rt        offset
  Width:   6         5         5         16
J-Type (Jump).
  Bits:    31..26    25..0
  Field:   opcode    instr_index
  Width:   6         26
R-Type (Register).
  Bits:    31..26    25..21    20..16    15..11    10..6    5..0
  Field:   opcode    rs        rt        rd        sa       function
  Width:   6         5         5         5         5        6
  opcode       6-bit primary operation code
  rd           5-bit destination register specifier
  rs           5-bit source register specifier
  rt           5-bit target (source/destination) register specifier or used to specify
               functions within the primary opcode value REGIMM
  immediate    16-bit signed immediate used for: logical operands, arithmetic signed
               operands, load/store address byte offsets, PC-relative branch signed
               instruction displacement
  instr_index  26-bit index shifted left two bits to supply the low-order 28 bits of the
               jump target address.
  sa           5-bit shift amount
  function     6-bit function field used to specify functions within the primary
               operation code value SPECIAL.
A 8 CPU Instruction Encoding
This section describes the encoding of user-level, i.e. non-privileged, CPU instructions
for the four levels of the MIPS architecture, MIPS I through MIPS IV. Each architecture
level includes the instructions in the previous level;† MIPS IV includes all instructions
in MIPS I, MIPS II, and MIPS III. This section presents eight different views of the
instruction encoding.
• Separate encoding tables for each architecture level.
• A MIPS IV encoding table showing the architecture level at which each opcode was
  originally defined and subsequently modified (if modified).
• Separate encoding tables for each architecture revision showing the changes made
  during that revision.
A 8.1 Instruction Decode
Instruction field names are printed in bold in this section.
The primary opcode field is decoded first. Most opcode values completely specify an
instruction that has an immediate value or offset. Opcode values that do not specify an
instruction specify an instruction class.
Instructions within a class are further specified by values in other fields. The opcode
values SPECIAL and REGIMM specify instruction classes. The COP0, COP1, COP2, COP3,
and COP1X instruction classes are not CPU instructions; they are discussed in section
A 8.3.
A 8.1.1 SPECIAL Instruction Class
The opcode = SPECIAL instruction class encodes 3-register computational instructions,
jump register, and some special purpose instructions. The class is further decoded by
examining the function field. The function values fully specify the CPU instructions;
the MOVCI instruction class is not a CPU instruction class.
A 8.1.2 REGIMM Instruction Class
The opcode = REGIMM instruction class encodes conditional branch and trap immediate
instructions. The class is further decoded, and the instructions fully specified, by
examining the rt field.
A 8.2 Instruction Subsets of MIPS III and MIPS IV Processors.
MIPS III processors, such as the R4000, R4200, R4300, R4400, and R4600, have a
processor mode in which only the MIPS II instructions are valid. The MIPS II encoding
table describes the MIPS II-only mode except that the Coprocessor 3 instructions (COP3,
LWC3, SWC3, LDC3, SDC3) are not available and cause a Reserved Instruction exception.
MIPS IV processors, such as the R8000 and R10000, have processor modes in which only
the MIPS II or MIPS III instructions are valid. The MIPS II encoding table describes the
MIPS II-only mode except that the Coprocessor 3 instructions (COP3, LWC3, SWC3, LDC3,
SDC3) are not available and cause a Reserved Instruction exception. The MIPS III
encoding table describes the MIPS III-only mode.
† An exception to this rule is that the reserved, but never implemented, Coprocessor 3
instructions were removed or changed to another use starting in MIPS III.
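The decode order described in A 8.1 (primary opcode first, then the function field for SPECIAL or the rt field for REGIMM) can be sketched in software. The following Python example is an illustration, not a complete decoder: the function and dictionary names are hypothetical, and the lookup tables hold only a handful of the encodings given in this appendix.

```python
# Illustrative partial decoder; names are hypothetical and the tables list
# only a few of the encodings documented in this appendix.
SPECIAL = 0b000000
REGIMM = 0b000001

OPCODES = {0b001110: "XORI", 0b101010: "SWL", 0b101110: "SWR"}
SPECIAL_FUNCTIONS = {0b001100: "SYSCALL", 0b001111: "SYNC", 0b100110: "XOR",
                     0b110000: "TGE", 0b110100: "TEQ"}
REGIMM_RT = {0b01000: "TGEI", 0b01001: "TGEIU", 0b01100: "TEQI"}


def fields(word):
    """Split a 32-bit instruction word into the fields of Figure A-10."""
    return {
        "opcode": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "rd": (word >> 11) & 0x1F,
        "sa": (word >> 6) & 0x1F,
        "function": word & 0x3F,
        "immediate": word & 0xFFFF,
        "instr_index": word & 0x3FFFFFF,
    }


def decode(word):
    """Decode the primary opcode first, then the class-specific field."""
    f = fields(word)
    if f["opcode"] == SPECIAL:
        return SPECIAL_FUNCTIONS.get(f["function"], "unknown SPECIAL")
    if f["opcode"] == REGIMM:
        return REGIMM_RT.get(f["rt"], "unknown REGIMM")
    return OPCODES.get(f["opcode"], "unknown opcode")


print(decode(0x0000000F))                      # SPECIAL class, function 001111
print(decode((REGIMM << 26) | (0b01001 << 16)))  # REGIMM class, rt 01001
```

A full decoder would populate the three dictionaries from the encoding tables for the chosen architecture level and would treat reserved entries as Reserved Instruction exceptions.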
A 8.3 Non-CPU Instructions in the Tables
The encoding tables show all values for the field they describe and by doing this they
include some entries that are not user-level CPU instructions. The primary opcode table
includes coprocessor instruction classes (COP0, COP1, COP2, COP3/COP1X) and coprocessor
load/store instructions (LWCx, SWCx, LDCx, SDCx for x=1, 2, or 3). The
opcode = SPECIAL + function = MOVCI instruction class is an FPU instruction.
A 8.3.1 Coprocessor 0 - COP0
COP0 encodes privileged instructions for Coprocessor 0, the System Control Coprocessor.
The definition of the System Control Coprocessor is processor-specific and further
information on these instructions is not included in this document.
A 8.3.2 Coprocessor 1 - COP1, COP1X, MOVCI, and CP1 load/store.
Coprocessor 1 is the floating-point unit in the MIPS architecture. COP1, COP1X, and the
(opcode = SPECIAL + function = MOVCI) instruction classes encode floating-point
instructions. LWC1, SWC1, LDC1, and SDC1 are floating-point loads and stores. The FPU
instruction encoding is documented in section B.12.
A 8.3.3 Coprocessor 2 - COP2 and CP2 load/store.
Coprocessor 2 is optional and implementation-specific. No standard processor from MIPS
has implemented coprocessor 2, but MIPS’ semiconductor licensees may have implemented
it in a product based on one of the standard MIPS processors. At this time the standard
processors are: R2000, R3000, R4000, R4200, R4300, R4400, R4600, R6000, R8000, and
R10000.
A 8.3.4 Coprocessor 3 - COP3 and CP3 load/store.
Coprocessor 3 is optional and implementation-specific in the MIPS I and MIPS II
architecture levels. It was removed from MIPS III and later architecture levels. Note
that in MIPS IV the COP3 primary opcode was reused for the COP1X instruction class. No
standard processor from MIPS has implemented coprocessor 3, but MIPS’ semiconductor
licensees may have implemented it in a product based on one of the standard MIPS
processors.
At this time the standard processors are: R2000, R3000, R4000, R4200, R4300, R4400,
R4600, R6000, R8000, and R10000.
Table A-37 CPU Instruction Encoding - MIPS I Architecture
Instructions encoded by opcode field (bits 31..26).
Rows: opcode bits 31..29. Columns: opcode bits 28..26.
         0 000      1 001      2 010      3 011      4 100      5 101      6 110      7 111
0 000    SPECIAL δ  REGIMM δ   J          JAL        BEQ        BNE        BLEZ       BGTZ
1 001    ADDI       ADDIU      SLTI       SLTIU      ANDI       ORI        XORI       LUI
2 010    COP0 δ,π   COP1 δ,π   COP2 δ,π   COP3 δ,π,κ *          *          *          *
3 011    *          *          ∗          *          *          *          *          *
4 100    LB         LH         LWL        LW         LBU        LHU        LWR        *
5 101    SB         SH         SWL        SW         *          *          SWR        *
6 110    *          LWC1 π     LWC2 π     LWC3 π,κ   *          *          *          *
7 111    *          SWC1 π     SWC2 π     SWC3 π,κ   *          *          *          *
Instructions encoded by function field (bits 5..0) when opcode field = SPECIAL.
Rows: function bits 5..3. Columns: function bits 2..0.
         0 000     1 001     2 010     3 011     4 100     5 101     6 110     7 111
0 000    SLL       *         SRL       SRA       SLLV      *         SRLV      SRAV
1 001    JR        JALR      *         *         SYSCALL   BREAK     *         *
2 010    MFHI      MTHI      MFLO      MTLO      *         *         *         *
3 011    MULT      MULTU     DIV       DIVU      *         *         *         *
4 100    ADD       ADDU      SUB       SUBU      AND       OR        XOR       NOR
5 101    *         *         SLT       SLTU      *         *         *         *
6 110    *         *         *         *         *         *         *         *
7 111    *         *         *         *         *         *         *         *
Instructions encoded by the rt field (bits 20..16) when opcode field = REGIMM.
Rows: rt bits 20..19. Columns: rt bits 18..16.
         0 000     1 001     2 010     3 011     4 100     5 101     6 110     7 111
0 00     BLTZ      BGEZ      =         =         =         =         =         =
1 01     =         =         =         =         =         =         =         =
2 10     BLTZAL    BGEZAL    =         =         =         =         =         =
3 11     =         =         =         =         =         =         =         =
Table A-38 CPU Instruction Encoding - MIPS II Architecture
Instructions encoded by opcode field (bits 31..26).
Rows: opcode bits 31..29. Columns: opcode bits 28..26.
         0 000      1 001      2 010      3 011      4 100      5 101      6 110      7 111
0 000    SPECIAL δ  REGIMM δ   J          JAL        BEQ        BNE        BLEZ       BGTZ
1 001    ADDI       ADDIU      SLTI       SLTIU      ANDI       ORI        XORI       LUI
2 010    COP0 δ,π   COP1 δ,π   COP2 δ,π   COP3 δ,π,κ BEQL       BNEL       BLEZL      BGTZL
3 011    *          *          ∗          *          *          *          *          *
4 100    LB         LH         LWL        LW         LBU        LHU        LWR        *
5 101    SB         SH         SWL        SW         *          *          SWR        ρ
6 110    LL         LWC1 π     LWC2 π     LWC3 π,κ   *          LDC1 π     LDC2 π     LDC3 π,κ
7 111    SC         SWC1 π     SWC2 π     SWC3 π,κ   *          SDC1 π     SDC2 π     SDC3 π,κ
Instructions encoded by function field (bits 5..0) when opcode field = SPECIAL.
Rows: function bits 5..3. Columns: function bits 2..0.
         0 000     1 001     2 010     3 011     4 100     5 101     6 110     7 111
0 000    SLL       *         SRL       SRA       SLLV      *         SRLV      SRAV
1 001    JR        JALR      *         *         SYSCALL   BREAK     *         SYNC
2 010    MFHI      MTHI      MFLO      MTLO      *         *         *         *
3 011    MULT      MULTU     DIV       DIVU      *         *         *         *
4 100    ADD       ADDU      SUB       SUBU      AND       OR        XOR       NOR
5 101    *         *         SLT       SLTU      *         *         *         *
6 110    TGE       TGEU      TLT       TLTU      TEQ       *         TNE       *
7 111    *         *         *         *         *         *         *         *
Instructions encoded by the rt field (bits 20..16) when opcode field = REGIMM.
Rows: rt bits 20..19. Columns: rt bits 18..16.
         0 000     1 001     2 010     3 011     4 100     5 101     6 110     7 111
0 00     BLTZ      BGEZ      BLTZL     BGEZL     *         *         *         *
1 01     TGEI      TGEIU     TLTI      TLTIU     TEQI      *         TNEI      *
2 10     BLTZAL    BGEZAL    BLTZALL   BGEZALL   *         *         *         *
3 11     *         *         *         *         *         *         *         *
Table A-39 CPU Instruction Encoding - MIPS III Architecture
Instructions encoded by opcode field (bits 31..26).
Rows: opcode bits 31..29. Columns: opcode bits 28..26.
         0 000      1 001      2 010      3 011      4 100      5 101      6 110      7 111
0 000    SPECIAL δ  REGIMM δ   J          JAL        BEQ        BNE        BLEZ       BGTZ
1 001    ADDI       ADDIU      SLTI       SLTIU      ANDI       ORI        XORI       LUI
2 010    COP0 δ,π   COP1 δ,π   COP2 δ,π   ∗          BEQL       BNEL       BLEZL      BGTZL
3 011    DADDI      DADDIU     LDL        LDR        *          *          *          *
4 100    LB         LH         LWL        LW         LBU        LHU        LWR        LWU
5 101    SB         SH         SWL        SW         SDL        SDR        SWR        ρ
6 110    LL         LWC1 π     LWC2 π     ∗          LLD        LDC1 π     LDC2 π     LD
7 111    SC         SWC1 π     SWC2 π     ∗          SCD        SDC1 π     SDC2 π     SD
Instructions encoded by function field (bits 5..0) when opcode field = SPECIAL.
Rows: function bits 5..3. Columns: function bits 2..0.
         0 000     1 001     2 010     3 011     4 100     5 101     6 110     7 111
0 000    SLL       *         SRL       SRA       SLLV      *         SRLV      SRAV
1 001    JR        JALR      *         *         SYSCALL   BREAK     *         SYNC
2 010    MFHI      MTHI      MFLO      MTLO      DSLLV     *         DSRLV     DSRAV
3 011    MULT      MULTU     DIV       DIVU      DMULT     DMULTU    DDIV      DDIVU
4 100    ADD       ADDU      SUB       SUBU      AND       OR        XOR       NOR
5 101    *         *         SLT       SLTU      DADD      DADDU     DSUB      DSUBU
6 110    TGE       TGEU      TLT       TLTU      TEQ       *         TNE       *
7 111    DSLL      *         DSRL      DSRA      DSLL32    *         DSRL32    DSRA32
Instructions encoded by the rt field (bits 20..16) when opcode field = REGIMM.
Rows: rt bits 20..19. Columns: rt bits 18..16.
         0 000     1 001     2 010     3 011     4 100     5 101     6 110     7 111
0 00     BLTZ      BGEZ      BLTZL     BGEZL     *         *         *         *
1 01     TGEI      TGEIU     TLTI      TLTIU     TEQI      *         TNEI      *
2 10     BLTZAL    BGEZAL    BLTZALL   BGEZALL   *         *         *         *
3 11     *         *         *         *         *         *         *         *
Table A-40 CPU Instruction Encoding - MIPS IV Architecture
Instructions encoded by opcode field (bits 31..26).
Rows: opcode bits 31..29. Columns: opcode bits 28..26.
         0 000      1 001      2 010      3 011      4 100      5 101      6 110      7 111
0 000    SPECIAL δ  REGIMM δ   J          JAL        BEQ        BNE        BLEZ       BGTZ
1 001    ADDI       ADDIU      SLTI       SLTIU      ANDI       ORI        XORI       LUI
2 010    COP0 δ,π   COP1 δ,π   COP2 δ,π   COP1X δ,π  BEQL       BNEL       BLEZL      BGTZL
3 011    DADDI      DADDIU     LDL        LDR        *          *          *          *
4 100    LB         LH         LWL        LW         LBU        LHU        LWR        LWU
5 101    SB         SH         SWL        SW         SDL        SDR        SWR        ρ
6 110    LL         LWC1 π     LWC2 π     PREF       LLD        LDC1 π     LDC2 π     LD
7 111    SC         SWC1 π     SWC2 π     ∗          SCD        SDC1 π     SDC2 π     SD
Instructions encoded by function field (bits 5..0) when opcode field = SPECIAL.
Rows: function bits 5..3. Columns: function bits 2..0.
         0 000     1 001     2 010     3 011     4 100     5 101     6 110     7 111
0 000    SLL       MOVCI δ,µ SRL       SRA       SLLV      *         SRLV      SRAV
1 001    JR        JALR      MOVZ      MOVN      SYSCALL   BREAK     *         SYNC
2 010    MFHI      MTHI      MFLO      MTLO      DSLLV     *         DSRLV     DSRAV
3 011    MULT      MULTU     DIV       DIVU      DMULT     DMULTU    DDIV      DDIVU
4 100    ADD       ADDU      SUB       SUBU      AND       OR        XOR       NOR
5 101    *         *         SLT       SLTU      DADD      DADDU     DSUB      DSUBU
6 110    TGE       TGEU      TLT       TLTU      TEQ       *         TNE       *
7 111    DSLL      *         DSRL      DSRA      DSLL32    *         DSRL32    DSRA32
Instructions encoded by the rt field (bits 20..16) when opcode field = REGIMM.
Rows: rt bits 20..19. Columns: rt bits 18..16.
         0 000     1 001     2 010     3 011     4 100     5 101     6 110     7 111
0 00     BLTZ      BGEZ      BLTZL     BGEZL     *         *         *         *
1 01     TGEI      TGEIU     TLTI      TLTIU     TEQI      *         TNEI      *
2 10     BLTZAL    BGEZAL    BLTZALL   BGEZALL   *         *         *         *
3 11     *         *         *         *         *         *         *         *
Table A-41 Architecture Level in Which CPU Instructions are Defined or Extended.
The architecture level in which each MIPS IV encoding was defined is indicated by a
subscript 1, 2, 3, or 4 (for architecture level I, II, III, or IV), shown here as a number
after the mnemonic. If an instruction or instruction class was later extended, the
extending level is indicated after the defining level.
Instructions encoded by opcode field (bits 31..26).
Rows: opcode bits 31..29. Columns: opcode bits 28..26.
         0 000        1 001        2 010        3 011        4 100        5 101        6 110        7 111
0 000    SPECIAL 1-4  REGIMM 1,2   J 1          JAL 1        BEQ 1        BNE 1        BLEZ 1       BGTZ 1
1 001    ADDI 1       ADDIU 1      SLTI 1       SLTIU 1      ANDI 1       ORI 1        XORI 1       LUI 1
2 010    COP0 1       COP1 1,2,3,4 COP2 1       COP1X 4      BEQL 2       BNEL 2       BLEZL 2      BGTZL 2
3 011    DADDI 3      DADDIU 3     LDL 3        LDR 3        * 1          * 1          * 1          * 1
4 100    LB 1         LH 1         LWL 1        LW 1         LBU 1        LHU 1        LWR 1        LWU 3
5 101    SB 1         SH 1         SWL 1        SW 1         SDL 3        SDR 3        SWR 1        ρ 2
6 110    LL 2         LWC1 1       LWC2 1       PREF 4       LLD 3        LDC1 2       LDC2 2       LD 3
7 111    SC 2         SWC1 1       SWC2 1       ∗ 3          SCD 3        SDC1 2       SDC2 2       SD 3
function bits 2..0 Instructions encoded by function field when opcode field = SPECIAL.
bits 5..3 0 1 2 3 4 5 6 7 000 001 010 011 100 101 110 111 0 000 SLL 1 MOVCI 4 SRL 1 SRA 1 SLLV 1 * 1 SRLV 1 SRAV 1 1 001 JR 1 JALR 1 MOVZ 4 MOVN 4 SYSCALL 1 BREAK 1 * 1 SYNC 2 2 010 MFHI 1 MTHI 1 MFLO 1 MTLO 1 DSLLV 3 * 1 DSRLV 3 DSRAV 3 3 011 MULT 1 MULTU 1 DIV 1 DIVU 1 DMULT 3 DMULTU 3 DDIV 3 DDIVU 3 4 100 ADD 1 ADDU 1 SUB 1 SUBU 1 AND 1 OR 1 XOR 1 NOR 1 5 101 * 1 * 1 SLT 1 SLTU 1 DADD 3 DADDU 3 DSUB 3 DSUBU 3 6 110 TGE 2 TGEU 2 TLT 2 TLTU 2 TEQ 2 * 1 TNE 2 * 1 7 111 DSLL 3 * 1 DSRL 3 DSRA 3 DSLL32 3 * 1 DSRL32 3 DSRA32 3 rt bits 18..16 Instructions encoded by the rt field when opcode field = REGIMM. bits 20..19 0 1 2 3 4 5 6 7 000 001 010 011 100 101 110 111 0 00 BLTZ 1 BGEZ 1 BLTZL 2 BGEZL 2 * 1 * 1 * 1 * 1 1 01 TGEI 2 TGEIU 2 TLTI 2 TLTIU 2 TEQI 2 * 1 TNEI 2 * 1 2 10 BLTZAL 1 BGEZAL 1 BLTZALL 2 BGEZALL 2 * 1 * 1 * 1 * 1 3 11 * 1 * 1 * 1 * 1 * 1 * 1 * 1 * 1 31 26 opcode 0 31 26 opcode function 5 0 = SPECIAL 31 26 20 16 0 opcode rt= REGIMM CPU Instruction Set MIPS IV Instruction Set. Rev 3.2 A-183 A-184 MIPS IV Instruction Set. Rev 3.2 CPU Instruction Set Table A-42 CPU Instruction Encoding Changes - MIPS II Revision. An instruction encoding is shown if the instruction is added in this revision. opcod e bits 28..26 Instructions encoded by opcode field. bits 31..29 0 1 2 3 4 5 6 7 000 001 010 011 100 101 110 111 0 000 1 001 2 010 BEQL BNEL BLEZL BGTZL 3 011 4 100 5 101 ρ 6 110 LL LDC1 π LDC2 π LDC3 π 7 111 SC SDC1 π SDC2 π SDC3 π functi on bits 2..0 Instructions encoded by function field when opcode field = SPECIAL. bits 5..3 0 1 2 3 4 5 6 7 000 001 010 011 100 101 110 111 0 000 1 001 SYNC 2 010 3 011 4 100 5 101 6 110 TGE TGEU TLT TLTU TEQ TNE 7 111 rt bits 18..16 Instructions encoded by the rt field when opcode field = REGIMM. 
bits 20..19 0 1 2 3 4 5 6 7 000 001 010 011 100 101 110 111 0 00 BLTZL BGEZL 1 01 TGEI TGEIU TLTI TLTIU TEQI TNEI 2 10 BLTZALL BGEZALL 3 11 31 26 opcode 0 31 26 opcode function 5 0 = SPECIAL 31 26 20 16 0 opcode rt= REGIMM CPU Instruction Set MIPS IV Instruction Set. Rev 3.2 A-185 A-186 MIPS IV Instruction Set. Rev 3.2 CPU Instruction Set Table A-43 CPU Instruction Encoding Changes - MIPS III Revision. CPU Instruction Set MIPS IV Instruction Set. Rev 3.2 A-187 An instruction encoding is shown if the instruction is added or modified in this revision. opcod e bits 28..26 Instructions encoded by opcode field. bits 31..29 0 1 2 3 4 5 6 7 000 001 010 011 100 101 110 111 0 000 1 001 2 010 * (was COP3) 3 011 DADDI DADDIU LDL LDR 4 100 LWU 5 101 SDL SDR 6 110 * (was LWC3 ) LLD LD (was LDC3) 7 111 * (was SWC3) SCD SD (was SDC3) functi on bits 2..0 Instructions encoded by function field when opcode field = SPECIAL. bits 5..3 0 1 2 3 4 5 6 7 000 001 010 011 100 101 110 111 0 000 1 001 2 010 DSLLV DSRLV DSRAV 3 011 DMULT DMULTU DDIV DDIVU 4 100 5 101 DADD DADDU DSUB DSUBU 6 110 7 111 DSLL DSRL DSRA DSLL32 DSRL32 DSRA32 31 26 opcode 0 31 26 opcode function 5 0 = SPECIAL A-188 MIPS IV Instruction Set. Rev 3.2 CPU Instruction Set rt bits 18..16 Instructions encoded by the rt field when opcode field = REGIMM. bits 20..19 0 1 2 3 4 5 6 7 000 001 010 011 100 101 110 111 0 00 1 01 2 10 3 11 31 26 20 16 0 opcode rt= REGIMM CPU Instruction Set MIPS IV Instruction Set. Rev 3.2 A-189 Table A-44 CPU Instruction Encoding Changes - MIPS IV Revision. An instruction encoding is shown if the instruction is added or modified in this revision. opcod e bits 28..26 Instructions encoded by opcode field. bits 31..29 0 1 2 3 4 5 6 7 000 001 010 011 100 101 110 111 0 000 1 001 2 010 COP1X δ,π 3 011 4 100 5 101 6 110 PREF 7 111 functi on bits 2..0 Instructions encoded by function field when opcode field = SPECIAL. 
bits 5..3 0 1 2 3 4 5 6 7 000 001 010 011 100 101 110 111 0 000 MOVCI δ,µ 1 001 MOVZ MOVN 2 010 3 011 4 100 5 101 6 110 7 111 rt bits 18..16 Instructions encoded by the rt field when opcode field = REGIMM. bits 20..19 0 1 2 3 4 5 6 7 000 001 010 011 100 101 110 111 0 00 1 01 2 10 3 11 31 26 opcode 0 31 26 opcode function 5 0 = SPECIAL 31 26 20 16 0 opcode rt= REGIMM A-190 MIPS IV Instruction Set. Rev 3.2 CPU Instruction Set Key to notes in CPU instruction encoding tables: * This opcode is reserved for future use. An attempt to execute it causes a Reserved Instruction exception. = This opcode is reserved for future use. An attempt to execute it produces an undefined result. The result may be a Reserved Instruction exception but this is not guaranteed. δ (also italic opcode name) This opcode indicates an instruction class. The instruction word must be further decoded by examing additional tables that show values for another instruction field. π This opcode is a coprocessor operation, not a CPU operation. If the processor state does not allow access to the specified coprocessor, the instruction causes a Coprocessor Unusable exception. It is included in the table because it uses a primary opcode in the instruction encoding map. κ This opcode is removed in a later revision of the architecture. If a MIPS III or MIPS IV processor is operated in MIPS II-only mode this opcode will cause a Reserved Instruction exception. µ This opcode indicates a class of coprocessor 1 instructions. If the processor state does not allow access to coprocessor 1, the opcode causes a Coprocessor Unusable exception. It is included in the table because the encoding uses a location in what is otherwise a CPU instruction encoding map. Further encoding information for this instruction class is in the FPU Instruction Encoding tables. ρ This opcode is reserved for Coprocessor 0 (System Control Coprocessor) instructions that require base+offset addressing. 
If the instruction is used for COP0 in an implementation, an attempt to execute it without Coprocessor 0 access privilege will cause a Coprocessor Unusable exception. If the instruction is not used in an implementation, it will cause a Reserved Instruction exception.

B FPU Instruction Set

B 1 Introduction

This appendix describes the instruction set architecture (ISA) for the floating-point unit (FPU) in the MIPS IV architecture. In the MIPS architecture, the FPU is coprocessor 1, an optional processor implementing IEEE Standard 754† floating-point operations. The FPU also provides a few additional operations not defined by the IEEE standard.

† IEEE Standard 754-1985, “IEEE Standard for Binary Floating-Point Arithmetic”

The original MIPS I FPU ISA has been extended in a backward-compatible fashion three times. The ISA extensions are inclusive as the diagram illustrates; each new architecture level (or version) includes the former levels. The description of an architectural feature includes the architecture level in which the feature is (first) defined or extended. The feature is also available in all later (higher) levels of the architecture.

Figure: MIPS Architecture Extensions (nested levels MIPS I, MIPS II, MIPS III, MIPS IV)

In addition to an ISA, the architecture definition includes processing resources, such as the coprocessor general register set. The 32-bit registers in MIPS I were changed to 64-bit registers in MIPS III in a way that is not backwards compatible. For changes such as this, processors implementing higher levels of the architecture have a way to provide the processing resource model for earlier levels. For the FPU there is a mode to select the 32-bit or 64-bit register model. The practical result is that a processor implementing MIPS IV is also able to run MIPS I, MIPS II, or MIPS III binary programs without change.
If coprocessor 1 is not enabled, an attempt to execute a floating-point instruction will cause a Coprocessor Unusable exception. Enabling coprocessor 1 is a privileged operation provided by the System Control Coprocessor. Every system environment will either enable the FPU automatically or provide a means for an application to request that it be enabled.

Before the instruction set is described, there is an overview of the FPU data types, registers, and computational model. The FPU instruction set is summarized by functional group, then each operation is described separately in alphabetical order. The description concludes with the FPU instruction formats and opcode encoding tables.

See the CPU instruction set section titled “Description of an Instruction” for a description of the organization of the individual instruction descriptions and the notation used in them.

The architecture of the floating-point coprocessor consists of:
• Data types
• Operations
• A computational model
• Processing resources (registers)
• An instruction set

The IEEE standard defines the floating-point number data types, the basic arithmetic, comparison, and conversion operations, and a computational model. The IEEE standard defines neither specific processing resources nor an instruction set. The MIPS architecture defines fixed-point (integer) data types, FPU register sets, control and exception mechanisms, and an instruction set. The architecture includes non-IEEE FPU control operations, and arithmetic operations (multiply-add, reciprocal, and reciprocal square root) that may not supply results that match the IEEE precision rules.

B 2 FPU Data Types

The FPU provides both floating-point and fixed-point data types. The single and double precision floating-point data types are those specified by the IEEE standard. The fixed-point types are the signed integers provided by the CPU architecture.
B 2.1 Floating-point formats

There are two floating-point data types provided by the FPU.
• 32-bit Single precision floating-point (type S)
• 64-bit Double precision floating-point (type D)

The floating-point formats represent numeric values as well as other special entities:
1. Numbers of the form $(-1)^s \, 2^E \, b_0 . b_1 b_2 \ldots b_{p-1}$, where (see Table B-1):
   s = 0 or 1
   E = any integer between E_min and E_max, inclusive
   b_i = 0 or 1 (the high bit, b_0, is to the left of the binary point)
   p is the precision
2. Two infinities, +∞ and -∞
3. Signaling non-numbers (SNaNs)
4. Quiet non-numbers (QNaNs)

Table B-1 Parameters of Floating-Point Formats

parameter                          Single    Double
bits of mantissa precision, p      24        53
maximum exponent, E_max            +127      +1023
minimum exponent, E_min            -126      -1022
exponent bias                      +127      +1023
bits in exponent field, e          8         11
representation of b0 integer bit   hidden    hidden
bits in fraction field, f          23        52
total format width in bits         32        64

The single and double floating-point formats are composed of three fields whose sizes are listed in Table B-1. The layouts are pictured in the figures below.
• A 1-bit sign, s.
• A biased exponent, e = E + bias
• A binary fraction, $f = . b_1 b_2 \ldots b_{p-1}$ (the b_0 bit is not recorded)

Figure B-1 Single-Precision Floating-Point Format (S)

bits    31     30..23     22..0
field   sign   exponent   fraction
width   1      8          23

Figure B-2 Double-Precision Floating-Point Format (D)

bits    63     62..52     51..0
field   sign   exponent   fraction
width   1      11         52

Values are encoded in the formats using the unbiased exponent, fraction, and sign values shown in Table B-2. The high-order bit of the fraction field, identified as b1, is also important for NaNs.

Table B-2 Value of Single or Double Floating-Point Format Encoding

unbiased E        f      s    b1   value v              type of value
E_max + 1         ≠ 0         1    SNaN                 Signaling NaN
E_max + 1         ≠ 0         0    QNaN                 Quiet NaN
E_max + 1         0      1         -∞                   minus infinity
E_max + 1         0      0         +∞                   plus infinity
E_max to E_min           1         -(2^E)(1.f)          negative normalized number
E_max to E_min           0         +(2^E)(1.f)          positive normalized number
E_min - 1         ≠ 0    1         -(2^(E_min))(0.f)    negative denormalized number
E_min - 1         ≠ 0    0         +(2^(E_min))(0.f)    positive denormalized number
E_min - 1         0      1         -0                   negative zero
E_min - 1         0      0         +0                   positive zero

B 2.1.1 Normalized and Denormalized Numbers

For single and double formats, each representable nonzero numerical value has just one encoding; numbers are kept in normalized form. The high-order bit of the p-bit mantissa, which lies to the left of the binary point, is “hidden”, and not recorded in the fraction field. The encoding rules permit the value of this bit to be determined by looking at the value of the exponent. When the unbiased exponent is in the range E_min to E_max, inclusive, the number is normalized and the hidden bit must be 1. If the numeric value cannot be normalized because the exponent would be less than E_min, then the representation is denormalized and the encoded number has an exponent of E_min-1 and the hidden bit has the value 0. Plus and minus zero are special cases that are not regarded as denormalized values.

B 2.1.2 Reserved Operand Values - Infinity and NaN

A floating-point operation can signal IEEE exception conditions, such as those caused by uninitialized variables, violations of mathematical rules, or results that cannot be represented. If a program does not choose to trap IEEE exception conditions, a computation that encounters these conditions proceeds without trapping but generates a result indicating that an exceptional condition arose during the computation. To permit this, each floating-point format defines representations, shown in Table B-2, for +infinity (+∞), -infinity (-∞), quiet NaN (QNaN), and signaling NaN (SNaN).

Infinity represents a number with magnitude too large to be represented in the format; in essence it exists to represent a magnitude overflow during a computation.
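The encoding rules of Table B-2 can be illustrated for the single format with a short sketch. The function names below are hypothetical and the code is only an illustration in Python, not part of the architecture definition. Note that it follows the table as printed, where a NaN with b1 = 1 is signaling and b1 = 0 is quiet; a host machine's own floating-point hardware may use a different NaN convention.

```python
E_MIN = -126      # minimum exponent for the single format (Table B-1)
BIAS = 127        # exponent bias for the single format (Table B-1)


def classify_single(bits: int) -> str:
    """Classify a 32-bit single-format encoding following Table B-2."""
    sign = (bits >> 31) & 0x1          # s
    exp_field = (bits >> 23) & 0xFF    # biased exponent e = E + bias
    fraction = bits & 0x7FFFFF         # f, 23 bits
    b1 = (fraction >> 22) & 0x1        # high-order bit of the fraction field

    if exp_field == 0xFF:              # unbiased E = E_max + 1
        if fraction != 0:
            return "SNaN" if b1 else "QNaN"
        return "-inf" if sign else "+inf"
    if exp_field == 0:                 # unbiased E = E_min - 1
        if fraction != 0:
            return "-denormalized" if sign else "+denormalized"
        return "-0" if sign else "+0"
    return "-normalized" if sign else "+normalized"


def value_single(bits: int) -> float:
    """Numeric value of a finite single-format encoding."""
    sign = -1.0 if (bits >> 31) & 0x1 else 1.0
    exp_field = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exp_field == 0xFF:
        raise ValueError("infinities and NaNs have no finite value")
    if exp_field == 0:
        # Denormalized (or zero): exponent E_min, hidden bit 0.
        return sign * 2.0 ** E_MIN * (fraction / 2.0 ** 23)
    # Normalized: exponent E = e - bias, hidden bit 1.
    return sign * 2.0 ** (exp_field - BIAS) * (1.0 + fraction / 2.0 ** 23)
```

For example, `classify_single(0x3F800000)` reports a positive normalized number and `value_single(0x3F800000)` evaluates to 1.0.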
A correctly signed ∞ is generated as the default result in division by zero and some cases of overflow; details are in the IEEE exception condition descriptions and Table B-4 "Default Result for IEEE Exceptions Not Trapped Precisely". Once created as a default result, ∞ can become an operand in a subsequent operation. The infinities are interpreted such that -∞ < (every finite number) < +∞. Arithmetic with ∞ is the limiting case of real arithmetic with operands of arbitrarily large magnitude, when such limits exist. In these cases, arithmetic on ∞ is regarded as exact and exception conditions do not arise. The out-of-range indication represented by the ∞ is propagated through subsequent computations. For some cases there is no meaningful limiting case in real arithmetic for operands of ∞ and these cases raise the Invalid Operation exception condition. See the description of the Invalid Operation exception for a list of these cases.

SNaN operands cause the Invalid Operation exception for arithmetic operations. SNaNs are useful values to put in uninitialized variables. SNaN is never produced as a result value.

NOTE: The IEEE 754 Standard states that “Whether copying a signaling NaN without a change of format signals the invalid operation exception is the implementor’s option”. The MIPS architecture has chosen to make the formatted operand move instructions (MOV.fmt, MOVT.fmt, MOVF.fmt, MOVN.fmt, MOVZ.fmt) non-arithmetic and they do not signal IEEE exceptions.

QNaNs are intended to afford retrospective diagnostic information inherited from invalid or unavailable data and results. Propagation of the diagnostic information requires that information contained in the QNaNs be preserved through arithmetic operations and floating-point format conversions. QNaN operands do not cause arithmetic operations to signal an exception. When a floating-point result is to be delivered, a QNaN operand causes an arithmetic operation to supply a QNaN result.
The result QNaN is one of the operand QNaN values when possible. QNaNs do have effects similar to SNaNs on operations that do not deliver a floating-point result, specifically comparisons. See the detailed description of the floating-point compare instruction (C.cond.fmt) for information.

When certain invalid operations not involving QNaN operands are performed but do not cause a trap (because the trap is not enabled), a new QNaN value is created. Table B-3 shows the QNaN value generated when no input operand QNaN value can be copied. The values listed for the fixed-point formats are the values supplied to satisfy the IEEE standard when a QNaN or infinite floating-point value is converted to fixed point. There is no other feature of the architecture that detects or makes use of these “integer QNaN” values.

Table B-3 Value Supplied when a new Quiet NaN is Created

Format                  New QNaN value
Single floating point   7fbf ffff
Double floating point   7ff7 ffff ffff ffff
Word fixed point        7fff ffff
Longword fixed point    7fff ffff ffff ffff

B 2.2 Fixed-point formats

There are two fixed-point data types provided by the FPU.
• 32-bit Word fixed-point (type W)
• 64-bit Longword fixed-point (type L) (defined in MIPS III)

The fixed-point values are held in the two’s complement format used for signed integers in the CPU. Unsigned fixed-point data types are not provided in the architecture; application software may synthesize computations for unsigned integers from the existing instructions and data types.

Figure B-3 Word Fixed-Point Format (W)

bits    31     30..0
field   sign   int
width   1      31

Figure B-4 Longword Fixed-Point Format (L)

bits    63     62..0
field   sign   int
width   1      63

B 3 Floating-Point Registers

This section describes the organization and use of the two separate coprocessor 1 (CP1) register sets. The coprocessor general registers, also called Floating General Registers (FGRs), are used to transfer binary data between the FPU and the rest of the system. The general register set is also used to hold formatted FPU operand values. There are only two control registers and they are used to identify and control the FPU.

There are separate 32-bit and 64-bit wide register models. MIPS I defines the 32-bit wide register model. MIPS III defines the 64-bit model. To support programs for earlier architecture definitions, processors providing the 64-bit MIPS III register model also provide the 32-bit wide register model as a mode selection. Selecting the 32-bit or 64-bit register model is an implementation-specific privileged operation.

B 3.1 Organization

The CP1 register organization for 32-bit and 64-bit register models is shown in Figure B-5. The coprocessor general registers are the same width as the CPU registers. The two defined control registers are 32 bits wide.

Figure B-5 Coprocessor 1 General Registers (FGRs)

                      MIPS I 32-bit reg model    MIPS III 64-bit register model
register bits         31..0                      63..0
general registers     0, 1, 2, 3, ..., 30, 31    0, 1, 2, 3, ..., 30, 31

FPU - Control Registers (FCRs), bits 31..0 in both models:
register 0    Implementation and Revision
register 31   FP Control and Status

B 3.2 Binary Data Transfers

The data transfer instructions move words and doublewords between the CP1 general registers and the remainder of the system. The operation of the load and move-to instructions is shown in Figure B-6 and Figure B-7. The store and move-from instructions operate in reverse, reading data from the location that the corresponding load or move-to instruction wrote it.

Figure B-6 Effect of FPU Word Load or Move-to Operations

Doubleword transfers to/from 32-bit registers access an aligned pair of CP1 general registers with the least-significant word of the doubleword in the lowest-numbered register.
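The register-pair rule for doubleword transfers in the 32-bit register model can be sketched as follows. This is an illustrative Python model only; the function names and the use of a plain list for the general registers are assumptions made for the example.

```python
def load_doubleword_32bit_model(fgr, reg, value):
    """Write a 64-bit value into an aligned pair of 32-bit general registers.

    fgr   -- list of 32 integers modelling the CP1 general registers
    reg   -- number of the first register of the pair (must be even)
    value -- the 64-bit doubleword to transfer
    """
    if reg % 2 != 0:
        raise ValueError("doubleword transfers use an aligned (even/odd) pair")
    fgr[reg] = value & 0xFFFFFFFF                # least-significant word
    fgr[reg + 1] = (value >> 32) & 0xFFFFFFFF    # most-significant word


def store_doubleword_32bit_model(fgr, reg):
    """Read the doubleword back from the same aligned register pair."""
    if reg % 2 != 0:
        raise ValueError("doubleword transfers use an aligned (even/odd) pair")
    return (fgr[reg + 1] << 32) | fgr[reg]
```

Loading 0x0123456789ABCDEF into the pair starting at register 2 leaves 0x89ABCDEF in register 2 and 0x01234567 in register 3; the store reads the same locations in reverse.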
Register contents for the word load or move-to operations of Figure B-6:

operation                     reg   MIPS I 32-bit reg model (bits 31..0)   MIPS III 64-bit register model (bits 63..0)
(initial state)               0     empty                                  empty
                              1     empty                                  empty
LWC1 f0,0(r0) / MTC1 f0,r0    0     data word 0                            undefined/unused | data word 0
                              1     empty                                  empty
LWC1 f1,4(r0) / MTC1 f1,r4    0     data word 0                            undefined/unused | data word 0
                              1     data word 4                            undefined/unused | data word 4

Figure B-7 Effect of FPU Doubleword Load or Move-to Operations

operation                     reg   MIPS II 32-bit reg model Loads/Stores (see note below)   MIPS III 64-bit register model (bits 63..0)
(initial state)               0     empty                                                    empty
                              1     empty                                                    empty
LDC1 f0,0(r0) / DMTC1 f0,r0   0     lower word (0)                                           data doubleword 0
                              1     upper word (4)                                           empty
LDC1 f1,8(r0) / DMTC1 f1,r8   0     invalid to load double to odd register                   data doubleword 0
                              1                                                              data doubleword 8

NOTE: No 64-bit transfers are defined for the MIPS I 32-bit register model. MIPS II defines the 64-bit loads/stores but not 64-bit moves.

B 3.3 Formatted Operand Layout

FPU instructions that operate on formatted operand values specify the floating-point register (FPR) that holds a value. An FPR is not necessarily the same as a CP1 general register because an FPR is 64 bits wide; if this is wider than the CP1 general registers, an aligned set of adjacent CP1 general registers is used as the FPR. The 32-bit register model provides 16 FPRs specified by the even CP1 general register numbers. The 64-bit register model provides 32 FPRs, one per CP1 general register. Operands that are only 32 bits wide (W and S formats) use only half the space in an FPR. The FPR organization and the way that operand data is stored in them is shown in the following figures. A summary of the data transfer instructions can be found in section B 6.1 on page B-19.

Figure B-8 Floating-point Operand Register (FPR) Organization

MIPS I 32-bit reg model          FPRs 0, 2, ..., 30            16 x 64-bit operand registers (FPRs)
MIPS III 64-bit register model   FPRs 0, 1, 2, 3, ..., 30, 31  32 x 64-bit operand registers (FPRs)

Figure B-9 Single Floating Point (S) or Word Fixed (W) Operand in an FPR

reg   MIPS I 32-bit reg model (bits 31..0)   MIPS III 64-bit register model (bits 63..0)
0     data word                              undefined/unused | data word
1     undefined/unused                       empty - available to hold an operand

Figure B-10 Double Floating Point (D) or Long Fixed (L) Operand In an FPR

reg   MIPS I 32-bit reg model (see note below) (bits 31..0)   MIPS III 64-bit register model (bits 63..0)
0     lower word                                              data doubleword
1     upper word                                              empty - available to hold an operand

NOTE: MIPS I supports the Double floating-point (D) type; the fixed-point longword (L) operand is available starting in MIPS III.

B 3.4 Implementation and Revision Register

Coprocessor control register 0 contains values that identify the implementation and revision of the FPU. Only the low-order two bytes of this register are defined as shown in Figure B-11.

Figure B-11 FPU Implementation and Revision Register

bits    31..16   15..8            7..0
field   0        Implementation   Revision
width   16       8                8

The implementation field identifies a particular FPU part, but the revision number may not be relied on to reliably characterize the FPU functional version.

B 3.5 FPU Control and Status Register - FCSR

Coprocessor control register 31 is the FPU Control and Status Register (FCSR). Access to the register is not privileged; it can be read or written by any program that can execute floating-point instructions. It controls some operations of the coprocessor and shows status information:
• Selects the default rounding mode for FPU arithmetic operations.
• Selectively enables traps of FPU exception conditions.
• Controls some denormalized number handling options.
• Reports IEEE exceptions that arose in the most recently executed instruction.
• Reports IEEE exceptions that arose, cumulatively, in completed instructions.
• Indicates the condition code result of FP compare instructions.
The contents of this register are unpredictable and undefined after a processor reset or a power-up event. Software should initialize this register.

Figure B-12 MIPS I - FPU Control and Status Register (FCSR)

bits    31..24   23   22..18   17..12   11..7     6..2    1..0
field   0        c    0        cause    enables   flags   RM
width   8        1    5        6        5         5       2

Figure B-13 MIPS III - FPU Control and Status Register (FCSR)

bits    31..25   24   23   22..18   17..12   11..7     6..2    1..0
field   0        FS   c    0        cause    enables   flags   RM
width   7        1    1    5        6        5         5       2

Figure B-14 MIPS IV - FPU Control and Status Register (FCSR)

bits    31..25   24   23    22..18   17..12   11..7     6..2    1..0
field   FCC      FS   FCC   0        cause    enables   flags   RM
width   7        1    1     5        6        5         5       2

In Figure B-14, bits 31..25 hold FCC bits 7..1 and bit 23 holds FCC bit 0. In all three layouts, the cause bits E V Z O U I occupy bits 17..12, the enables bits V Z O U I occupy bits 11..7, and the flags bits V Z O U I occupy bits 6..2.

All fields in the FCSR are readable and writable.

FCC  Floating-Point Condition Codes. These bits record the result of FP compares and are tested for FP conditional branches; the FCC bit to use is specified in the compare or branch instruction. The 0th FCC bit is the same as the c bit in MIPS I.

FS  Flush to Zero. When FS is set, denormalized results are flushed to zero instead of causing an unimplemented operation exception. When a denormalized operand value is encountered, zero may be used instead of the denorm; this is implementation specific.

c  Condition Bit. This bit records the result of FP compares and is tested by FP conditional branches. In MIPS IV this becomes the 0th FCC bit.

cause  Cause bits. These bits indicate the exception conditions that arise during the execution of an FPU arithmetic instruction in precise exception mode. A bit is set to 1 if the corresponding exception condition arises during the execution of an instruction and 0 otherwise.
By reading the registers, the exception conditions caused by the preceding FPU arithmetic instruction can be determined. The meaning of the individual bits is:
E  Unimplemented Operation
V  Invalid Operation
Z  Divide by Zero
O  Overflow
U  Underflow
I  Inexact Result

enables  Enable bits (see cause field for bit names). These bits control, for each of the five conditions individually, whether a trap is taken when the IEEE exception condition occurs. The trap occurs when both an enable bit and the corresponding cause bit are set during an FPU arithmetic operation or by moving a value to the FCSR. The meaning of the individual bits is the same as the cause bits. Note that the “E” cause bit has no corresponding enable bit; the non-IEEE Unimplemented Operation exception defined by MIPS is always enabled.

flags  Flag bits (see cause field for bit names). This field shows the exception conditions that have occurred for completed instructions since it was last reset. For a completed FPU arithmetic operation that raises an exception condition the corresponding bits in the flag field are set and the others are unchanged. This field is never reset by hardware and must be explicitly reset by user software.

RM  Rounding Mode. The rounding mode used for most floating-point operations (some FP instructions use a specific rounding mode). The rounding modes are:
0  RN -- Round to Nearest
   Round result to the nearest representable value. When two representable values are equally near, round to the value that has a least significant bit of zero (i.e. is even).
1  RZ -- Round toward Zero
   Round result to the value closest to and not greater in magnitude than the result.
2  RP -- Round toward Plus infinity
   Round result to the value closest to and not less than the result.
3  RM -- Round toward Minus infinity
   Round result to the value closest to and not greater than the result.
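As an illustration of the field positions, the following Python sketch splits a 32-bit FCSR value into its fields using the MIPS IV layout of Figure B-14. The function and constant names are hypothetical and chosen only for this example.

```python
ROUNDING_MODES = {0: "RN", 1: "RZ", 2: "RP", 3: "RM"}
CAUSE_NAMES = "EVZOUI"   # cause field, bits 17..12 (E is bit 17)
IEEE_NAMES = "VZOUI"     # enables (bits 11..7) and flags (bits 6..2)


def _bit_names(field, names):
    """List the names whose bit is set; the first name is the highest bit."""
    width = len(names)
    return [n for i, n in enumerate(names) if field & (1 << (width - 1 - i))]


def decode_fcsr(fcsr):
    """Split a 32-bit FCSR value into fields (MIPS IV layout, Figure B-14)."""
    fcc_7_to_1 = (fcsr >> 25) & 0x7F   # FCC bits 7..1 live in bits 31..25
    fcc_0 = (fcsr >> 23) & 0x1         # FCC bit 0 lives in bit 23
    return {
        "FCC": (fcc_7_to_1 << 1) | fcc_0,
        "FS": (fcsr >> 24) & 0x1,
        "cause": _bit_names((fcsr >> 12) & 0x3F, CAUSE_NAMES),
        "enables": _bit_names((fcsr >> 7) & 0x1F, IEEE_NAMES),
        "flags": _bit_names((fcsr >> 2) & 0x1F, IEEE_NAMES),
        "RM": ROUNDING_MODES[fcsr & 0x3],
    }
```

A value with only bits 15 and 10 set decodes to a Divide by Zero cause with the Divide by Zero trap enabled, which is the combination that the enables description above says produces a trap.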
B 4 Values in FP Registers

Unlike the CPU, the FPU does not interpret the binary encoding of source operands or produce a binary encoding of results for every operation. The value held in a floating-point operand register (FPR) has a format, or type, and it may only be used by instructions that operate on that format. The format of a value is either uninterpreted, unknown, or one of the valid numeric formats: single and double floating-point and word and long fixed-point.

The way that the formatted value in an FPR is set and changed is summarized in the state diagram in Figure B-15 and is discussed below.

The value in an FPR is always set when a value is written to the register. When a data transfer instruction writes binary data into an FPR (a load), the FPR gets a binary value that is uninterpreted. A computational or FP register move instruction that produces a result of type fmt puts a value of type fmt into the result register.

When an FPR with an uninterpreted value is used as a source operand by an instruction that requires a value of format fmt, the binary contents are interpreted as an encoded value in format fmt and the value in the FPR changes to a value of format fmt. The binary contents cannot be reinterpreted in a different format.

If an FPR contains a value of format fmt, a computational instruction must not use the FPR as a source operand of a different format. If this occurs, the value in the register becomes unknown and the result of the instruction is also a value that is unknown. Using an FPR containing an unknown value as a source operand produces a result that has an unknown value.

The format of the value in the FPR is unchanged when it is read by a data transfer instruction (a store). A data transfer instruction produces a binary encoding of the value contained in the FPR. If the value in the FPR is unknown, the encoded binary value produced by the operation is not defined.
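The format rules above amount to a small state machine per register. The Python sketch below models only the format of the value, not its contents; the class and function names are hypothetical, and starting a register in the unknown state is an assumption of the example rather than something the text specifies.

```python
UNINTERPRETED = "uninterpreted"
UNKNOWN = "unknown"


class FPR:
    """Tracks only the format of the value held in one FPR."""

    def __init__(self):
        self.fmt = UNKNOWN          # assumption: nothing written yet

    def load(self):
        """A data transfer into the register leaves an uninterpreted value."""
        self.fmt = UNINTERPRETED

    def use_as_source(self, fmt):
        """Use the register as a source operand of format fmt."""
        if self.fmt == UNINTERPRETED:
            self.fmt = fmt          # binary contents are interpreted as fmt
        elif self.fmt != fmt:
            self.fmt = UNKNOWN      # wrong format, or already unknown
        return self.fmt

    def store_is_defined(self):
        """A store leaves the format unchanged; its encoding is undefined
        only when the value is unknown."""
        return self.fmt != UNKNOWN


def compute(dest, result_fmt, sources, source_fmt):
    """Model a computational instruction reading sources in source_fmt."""
    seen = [src.use_as_source(source_fmt) for src in sources]
    dest.fmt = UNKNOWN if UNKNOWN in seen else result_fmt
```

For example, a register that was loaded and then used as a single (S) source becomes format S; using it afterwards as a double (D) source makes both the register and the instruction's result unknown.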
Figure B-15 The Effect of FPU Operations on the Format of Values Held in FPRs.
[State diagram with four states: value uninterpreted (binary encoding), value in format A, value in format B, and value unknown. Transitions are labeled Load, Store, Src A, Src B, Rslt A, Rslt B, and Rslt unknown.]

    A, B:      Example formats
    Load:      Destination of LWC1, LDC1, MTC1, or DMTC1 instructions.
    Store:     Source operand of SWC1, SDC1, MFC1, or DMFC1 instructions.
    Src fmt:   Source operand of computational instruction expecting format “fmt”.
    Rslt fmt:  Result of computational instruction producing value of format “fmt”.

B 5 FPU Exceptions

The IEEE 754 standard specifies that:

    There are five types of exceptions that shall be signaled when detected. The signal entails setting a status flag, taking a trap, or possibly doing both. With each exception should be associated a trap under user control, ...

This function is implemented in the MIPS FPU architecture with the cause, enable, and flag fields of the control and status register. The flag bits implement IEEE exception status flags, and the cause and enable bits control exception trapping. Each field has a bit for each of the five IEEE exception conditions and the cause field has an additional exception bit, Unimplemented Operation, used to trap for software emulation assistance.

There may be two exception modes for the FPU, precise and imprecise, and the operation of the FPU when exception conditions arise depends on the exception mode that is currently selected. Every processor is able to operate the FPU in the precise exception mode. Some processors also have an imprecise exception mode in which floating-point performance is greater. Selecting the exception mode, when there is a choice, is a privileged implementation-specific operation.
B 5.1 Precise Exception Mode

In precise exception mode, an exception (trap) caused by a floating-point operation is precise. A precise trap occurs before the instruction that causes the trap, or any following instruction, completes and writes results. If desired, the software trap handler can resume execution of the interrupted instruction stream after handling the exception.

The cause bit field reports per-instruction exception conditions. The cause bits are written during each floating-point arithmetic operation to show the exception conditions that arose during the operation. The bits are set to 1 if the corresponding exception condition arises and 0 otherwise.

A floating-point trap is generated any time both a cause bit and the corresponding enable bit are set. This occurs either during the execution of a floating-point operation or by moving a value into the FCSR. There is no enable for Unimplemented Operation; this exception condition always generates a trap.

In a trap handler, the exception conditions that arose during the floating-point operation that trapped are reported in the cause field. Before returning from a floating-point interrupt or exception, or setting cause bits with a move to the FCSR, software must first clear the enabled cause bits by a move to the FCSR to prevent the trap from being retaken. User-mode programs can never observe enabled cause bits set. If this information is required in a user-mode handler, then it must be passed somewhere other than the status register.

For a floating-point operation that sets only non-enabled cause bits, no trap occurs and the default result defined by the IEEE standard is stored (see Table B-4). When a floating-point operation does not trap, the program can see the exception conditions that arose during the operation by reading the cause field.
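The trap rule for precise mode reduces to a bitwise test. The sketch below is a simplified model, not the hardware definition: the mask values are local to the example (they follow the E V Z O U I ordering of the cause field but are not the FCSR bit positions), and precise_step is a hypothetical name. It also models the flag-field rule for precise mode, namely that only instructions that complete (do not trap) accumulate into the flags.

```python
# Local masks for this sketch only; these are NOT the FCSR bit positions.
E, V, Z, O, U, I = 0x20, 0x10, 0x08, 0x04, 0x02, 0x01
IEEE_MASK = V | Z | O | U | I


def precise_step(cause, enables, flags):
    """Return (trap, new_flags) for one FP arithmetic operation in precise mode.

    cause   -- conditions raised by this operation (may include E)
    enables -- enabled IEEE conditions (E has no enable bit; it always traps)
    flags   -- accumulated flag bits before the operation
    """
    enables &= IEEE_MASK
    trap = bool(cause & (enables | E))
    if trap:
        # A trapping instruction does not complete, so flags are untouched.
        return True, flags
    # A completing instruction ORs its IEEE conditions into the flags.
    return False, flags | (cause & IEEE_MASK)


if __name__ == "__main__":
    print(precise_step(O | I, O, 0))   # overflow enabled: trap, flags unchanged
    print(precise_step(O | I, 0, 0))   # nothing enabled: no trap, flags = O|I
    print(precise_step(E, 0, V))       # Unimplemented Operation always traps
```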
The flag bit field is a cumulative report of IEEE exception conditions that arise during instructions that complete; instructions that trap do not update the flag bits. The flag bits are set to 1 if the corresponding IEEE exception is raised and unchanged otherwise. There is no flag bit for the MIPS Unimplemented Operation exception condition. The flag bits are never cleared as a side effect of floating-point operations, but may be set or cleared by moving a new value into the FCSR.

B 5.2 Imprecise Exception Mode

In imprecise exception mode, an exception (trap) caused by an IEEE floating-point operation is imprecise (Unimplemented Operation exceptions must still be signaled precisely). An imprecise trap occurs at some point after the exception condition arises. In particular, it does not necessarily occur before the instruction that causes the exception, or following instructions, have completed and written results. The software trap handler can generally neither determine which instruction caused the trap nor continue execution of the interrupted instruction stream; it can record the trap that occurred and abort the program.

The meaning of the cause bit field when reading the FCSR is not defined. When a cause bit is written in the FCSR by moving data to it, the corresponding flag bit is also set.

All floating-point operations, whether they cause a trap or not, complete in the sense that they write a result and record exception condition bits in the flag field. When an IEEE exception condition arises during an operation, the default result defined by the IEEE standard is stored (see Table B-4).

A floating-point trap is generated when an exception condition arises during a floating-point operation and the corresponding enable bit is set. A trap will also be generated when a value with corresponding cause and enable bits set is moved into the FCSR. There is no enable for Unimplemented Operation; this exception condition always generates a trap.

The flag bit field is a cumulative report of IEEE exception conditions that arise during instructions that complete. Because all instructions complete in this mode, unlike precise exception mode, the flag bits include exception conditions that cause traps. The flag bits are set to 1 if the corresponding IEEE exception is raised and unchanged otherwise. There is no flag bit for the MIPS Unimplemented Operation exception condition. The flag bits are never cleared as a side effect of floating-point operations, but may be set or cleared by moving a new value into the FCSR.

B 5.3 Exception Condition Definitions

The five exception conditions defined by the IEEE standard are described in this section. It also describes the MIPS-defined exception condition, Unimplemented Operation, that is used to signal a need for software emulation assistance for an instruction.

Normally an IEEE arithmetic operation can cause only one exception condition; the only cases in which two exceptions can occur at the same time are inexact with overflow and inexact with underflow.

At the program’s direction, an IEEE exception condition can either cause a trap or not. The IEEE standard specifies the result to be delivered in case the exception is not enabled and no trap is taken. The MIPS architecture supplies these results whenever the exception condition does not result in a precise trap (i.e. no trap or an imprecise trap). The default action taken depends on the type of exception condition and, in the case of Overflow, the current rounding mode. The default result is mentioned in each description and summarized in Table B-4.
Table B-4 Default Result for IEEE Exceptions Not Trapped Precisely

    Bit  Description        Default Action
    V    Invalid Operation  Supply a quiet NaN.
    Z    Divide by zero     Supply a properly signed infinity.
    U    Underflow          Supply a rounded result.
    I    Inexact            Supply a rounded result. If caused by an overflow without the overflow trap enabled, supply the overflowed result.
    O    Overflow           Depends on the rounding mode as shown below:
         0 (RN)             Supply an infinity with the sign of the intermediate result.
         1 (RZ)             Supply the format’s largest finite number with the sign of the intermediate result.
         2 (RP)             For positive overflow values, supply positive infinity. For negative overflow values, supply the format’s most negative finite number.
         3 (RM)             For positive overflow values, supply the format’s largest finite number. For negative overflow values, supply minus infinity.

B 5.3.1 Invalid Operation exception

The invalid operation exception is signaled if one or both of the operands are invalid for the operation to be performed. The result, when the exception condition occurs without a precise trap, is a quiet NaN. The invalid operations are:

• One or both operands is a signaling NaN (except for the non-arithmetic MOV.fmt, MOVT.fmt, MOVF.fmt, MOVN.fmt, and MOVZ.fmt instructions)
• Addition or subtraction: magnitude subtraction of infinities, such as: (+∞) + (-∞) or (-∞) - (-∞)
• Multiplication: 0 × ∞, with any signs
• Division: 0 / 0 or ∞ / ∞, with any signs
• Square root: An operand less than 0 (-0 is a valid operand value).
• Conversion of a floating-point number to a fixed-point format when an overflow, or operand value of infinity or NaN, precludes a faithful representation in that format.
• Some comparison operations in which one or both of the operands is a QNaN value. The definition of the compare operation (C.cond.fmt) has tables showing the comparisons that do and do not signal the exception.
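The overflow rows of Table B-4 can be expressed as a short function. This is a sketch for illustration only: overflow_default is a hypothetical name, and the largest finite number defaults to the host's double-precision maximum, which is an assumption about the destination format.

```python
import math
import sys


def overflow_default(rm, negative, largest=sys.float_info.max):
    """Default overflow result per Table B-4.

    rm       -- rounding mode code: 0 RN, 1 RZ, 2 RP, 3 RM
    negative -- True if the intermediate result is negative
    largest  -- largest finite magnitude of the destination format
                (double precision assumed by default)
    """
    if rm == 0:                       # RN: infinity with the result's sign
        magnitude = math.inf
    elif rm == 1:                     # RZ: largest finite with the result's sign
        magnitude = largest
    elif rm == 2:                     # RP: +inf, or most negative finite
        magnitude = largest if negative else math.inf
    elif rm == 3:                     # RM: largest finite, or -inf
        magnitude = math.inf if negative else largest
    else:
        raise ValueError("rounding mode must be 0..3")
    return -magnitude if negative else magnitude


if __name__ == "__main__":
    for rm in range(4):
        print(rm, overflow_default(rm, False), overflow_default(rm, True))
```

The pattern is that a directed rounding mode never rounds past the overflow in the direction it is forbidden to move, so it substitutes the largest finite value on that side instead of an infinity.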
B 5.3.2 Division By Zero exception

The division by zero exception is signaled on an implemented divide operation if the divisor is zero and the dividend is a finite nonzero number. The result, when no precise trap occurs, is a correctly signed infinity. The divisions (0/0) and (∞/0) do not cause the division by zero exception. The result of (0/0) is an Invalid Operation exception condition. The result of (∞/0) is a correctly signed infinity.

B 5.3.3 Overflow exception

The overflow exception is signaled when what would have been the magnitude of the rounded floating-point result, were the exponent range unbounded, is larger than the destination format’s largest finite number. The result, when no precise trap occurs, is determined by the rounding mode and the sign of the intermediate result as shown in Table B-4.

B 5.3.4 Underflow exception

Two related events contribute to underflow. One is the creation of a tiny non-zero result between ±2^E_min which, because it is tiny, may cause some other exception later such as overflow on division. The other is extraordinary loss of accuracy during the approximation of such tiny numbers by denormalized numbers. The IEEE standard permits a choice in how these events are detected, but requires that they must be detected the same way for all operations.

The IEEE standard specifies that “tininess” may be detected either: “after rounding” (when a nonzero result computed as though the exponent range were unbounded would lie strictly between ±2^E_min), or “before rounding” (when a nonzero result computed as though both the exponent range and the precision were unbounded would lie strictly between ±2^E_min). The MIPS architecture specifies that tininess is detected after rounding.

The IEEE standard specifies that loss of accuracy may be detected as either “denormalization loss” (when the delivered result differs from what would have been computed if the exponent range were unbounded), or “inexact result” (when the delivered result differs from what would have been computed if both the exponent range and precision were unbounded). The MIPS architecture specifies that loss of accuracy is detected as inexact result.

When an underflow trap is not enabled, underflow is signaled only when both tininess and loss of accuracy have been detected. The delivered result might be zero, denormalized, or ±2^E_min. When an underflow trap is enabled (via the FCSR enable field bit), underflow is signaled when tininess is detected regardless of loss of accuracy.

B 5.3.5 Inexact exception

If the rounded result of an operation is not exact or if it overflows without an overflow trap, then the inexact exception is signaled.

B 5.3.6 Unimplemented Operation exception

This MIPS-defined (non-IEEE) exception is to provide software emulation support. The architecture is designed to permit a combination of hardware and software to fully implement the architecture. Operations that are not fully supported in hardware cause an Unimplemented Operation exception so that software may perform the operation. There is no enable bit for this condition; it always causes a trap. After the appropriate emulation or other operation is done in a software exception handler, the original instruction stream can be continued.

B 6 Functional Instruction Groups

The FPU instructions are divided into the following functional groups:

• Data Transfer
• Arithmetic
• Conversion
• Formatted Operand Value Move
• Conditional Branch
• Miscellaneous

B 6.1 Data Transfer Instructions

The FPU has two separate register sets: coprocessor general registers and coprocessor control registers.
The FPU has a load/store architecture; all computations are done on data held in coprocessor general registers. The control registers are used to control FPU operation. Data is transferred between registers and the rest of the system with dedicated load, store, and move instructions. The transferred data is treated as unformatted binary data; no format conversions are performed and, therefore, no IEEE floating-point exceptions can occur.

The supported transfer operations are:

• FPU general reg ↔ memory (word/doubleword load/store)
• FPU general reg ↔ CPU general reg (word/doubleword move)
• FPU control reg ↔ CPU general reg (word move)

All coprocessor loads and stores operate on naturally-aligned data items. An attempt to load or store to an address that is not naturally aligned for the data item will cause an Address Error exception. Regardless of byte-numbering order (endianness), the address of a word or doubleword is the smallest byte address among the bytes in the object. For a big-endian machine this is the most-significant byte; for a little-endian machine this is the least-significant byte.

The FPU has loads and stores using the usual register+offset addressing. For the FPU only, there are load and store instructions using register+register addressing.

MIPS I specifies that loads are delayed by one instruction and that proper execution must be ensured by observing an instruction scheduling restriction. The instruction immediately following a load into an FPU register Fn must not use Fn as a source register. The time between the load instruction and the time the data is available is the “load delay slot”. If no useful instruction can be put into the load delay slot, then a null operation (NOP) must be inserted.

In MIPS II, this instruction scheduling restriction is removed. Programs will execute correctly when the loaded data is used by the instruction following the load, but this may require extra real cycles. Most processors cannot actually load data quickly enough for immediate use and the processor will be forced to wait until the data is available. Scheduling load delay slots is desirable for performance reasons even when it is not necessary for correctness.

Table B-5 FPU Loads and Stores Using Register + Offset Address Mode

    Mnemonic  Description                             Defined in
    LWC1      Load Word to Floating-Point             MIPS I
    SWC1      Store Word from Floating-Point          MIPS I
    LDC1      Load Doubleword to Floating-Point       MIPS III
    SDC1      Store Doubleword from Floating-Point    MIPS III

Table B-6 FPU Loads and Stores Using Register + Register Address Mode

    Mnemonic  Description                                     Defined in
    LWXC1     Load Word Indexed to Floating-Point             MIPS IV
    SWXC1     Store Word Indexed from Floating-Point          MIPS IV
    LDXC1     Load Doubleword Indexed to Floating-Point       MIPS IV
    SDXC1     Store Doubleword Indexed from Floating-Point    MIPS IV

Table B-7 FPU Move To/From Instructions

    Mnemonic  Description                              Defined in
    MTC1      Move Word To Floating-Point              MIPS I
    MFC1      Move Word From Floating-Point            MIPS I
    DMTC1     Doubleword Move To Floating-Point        MIPS III
    DMFC1     Doubleword Move From Floating-Point      MIPS III
    CTC1      Move Control Word To Floating-Point      MIPS I
    CFC1      Move Control Word From Floating-Point    MIPS I

B 6.2 Arithmetic Instructions

The arithmetic instructions operate on formatted data values. The result of most floating-point arithmetic operations meets the IEEE standard specification for accuracy: a result which is identical to an infinite-precision result rounded to the specified format, using the current rounding mode. The rounded result differs from the exact result by less than one unit in the least-significant place (ulp).

Table B-8 FPU IEEE Arithmetic Operations

    Mnemonic    Description                      Defined in
    ADD.fmt     Floating-Point Add               MIPS I
    SUB.fmt     Floating-Point Subtract          MIPS I
    MUL.fmt     Floating-Point Multiply          MIPS I
    DIV.fmt     Floating-Point Divide            MIPS I
    ABS.fmt     Floating-Point Absolute Value    MIPS I
    NEG.fmt     Floating-Point Negate            MIPS I
    SQRT.fmt    Floating-Point Square Root       MIPS II
    C.cond.fmt  Floating-Point Compare           MIPS I

Two operations, Reciprocal Approximation (RECIP) and Reciprocal Square Root Approximation (RSQRT), may be less accurate than the IEEE specification. The result of RECIP differs from the exact reciprocal by no more than one ulp. The result of RSQRT differs by no more than two ulp. Within these error limits, the result of these instructions is implementation specific.

Table B-9 FPU Approximate Arithmetic Operations

    Mnemonic   Description                                          Defined in
    RECIP.fmt  Floating-Point Reciprocal Approximation              MIPS IV
    RSQRT.fmt  Floating-Point Reciprocal Square Root Approximation  MIPS IV

There are four compound-operation instructions that perform variations of multiply-accumulate: multiply two operands and accumulate to a third operand to produce a result. The accuracy of the result depends on which of two alternative arithmetic models is used for the computation. The unrounded model is more accurate than a pair of IEEE operations and the rounded model meets the IEEE specification.

Table B-10 FPU Multiply-Accumulate Arithmetic Operations

    Mnemonic   Description                                Defined in
    MADD.fmt   Floating-Point Multiply Add                MIPS IV
    MSUB.fmt   Floating-Point Multiply Subtract           MIPS IV
    NMADD.fmt  Floating-Point Negative Multiply Add       MIPS IV
    NMSUB.fmt  Floating-Point Negative Multiply Subtract  MIPS IV

The initial implementation of the MIPS IV architecture, the R8000 (and future revisions of it), uses the unrounded arithmetic model which does not match the IEEE accuracy specification. All other implementations will use the rounded model which does meet the specification.

• Rounded or non-fused: The product is rounded according to the current rounding mode prior to the accumulation. This model meets the IEEE accuracy specification; the result is numerically identical to the equivalent computation using multiply, add, subtract, and negate instructions.
• Unrounded or fused (R8000 implementation): The product is not rounded and all bits take part in the accumulation. This model does not match the IEEE accuracy requirements; the result is more accurate than the equivalent computation using IEEE multiply, add, subtract, and negate instructions.

B 6.3 Conversion Instructions

There are instructions to perform conversions among the floating-point and fixed-point data types. Each instruction converts values from a number of operand formats to a particular result format. Some convert instructions use the rounding mode specified in the Floating Control and Status Register (FCSR); others specify the rounding mode directly.

Table B-11 FPU Conversion Operations Using the FCSR Rounding Mode

    Mnemonic   Description                                      Defined in
    CVT.S.fmt  Floating-Point Convert to Single Floating-Point  MIPS I
    CVT.D.fmt  Floating-Point Convert to Double Floating-Point  MIPS I
    CVT.W.fmt  Floating-Point Convert to Word Fixed-Point       MIPS I
    CVT.L.fmt  Floating-Point Convert to Long Fixed-Point       MIPS I

Table B-12 FPU Conversion Operations Using a Directed Rounding Mode

    Mnemonic     Description                                  Defined in
    ROUND.W.fmt  Floating-Point Round to Word Fixed-Point     MIPS II
    ROUND.L.fmt  Floating-Point Round to Long Fixed-Point     MIPS III
    TRUNC.W.fmt  Floating-Point Truncate to Word Fixed-Point  MIPS II
    TRUNC.L.fmt  Floating-Point Truncate to Long Fixed-Point  MIPS III
    CEIL.W.fmt   Floating-Point Ceiling to Word Fixed-Point   MIPS II
    CEIL.L.fmt   Floating-Point Ceiling to Long Fixed-Point   MIPS III
    FLOOR.W.fmt  Floating-Point Floor to Word Fixed-Point     MIPS II
    FLOOR.L.fmt  Floating-Point Floor to Long Fixed-Point     MIPS III

B 6.4 Formatted Operand Value Move Instructions

These instructions all move formatted operand values among FPU general registers.
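Returning briefly to the two multiply-accumulate models of B 6.2 (rounded or non-fused versus unrounded or fused), the numerical difference can be demonstrated on a host machine. This is a sketch under stated assumptions: it uses host double-precision arithmetic in round-to-nearest mode, emulates the fused model with exact rational arithmetic, and the function names madd_rounded and madd_unrounded are hypothetical. It does not model any particular MIPS implementation.

```python
from fractions import Fraction


def madd_rounded(a, b, c):
    """Non-fused model: the product is rounded to double before the add."""
    product = a * b        # first rounding (host round-to-nearest assumed)
    return product + c     # second rounding


def madd_unrounded(a, b, c):
    """Fused model: exact product and sum, then a single rounding."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)    # one correctly rounded conversion to double


if __name__ == "__main__":
    a = 1.0 + 2.0 ** -30
    b = 1.0 - 2.0 ** -30
    c = -1.0
    # Exact product is 1 - 2**-60, which rounds to 1.0 in double precision,
    # so the non-fused result loses the small term entirely.
    print(madd_rounded(a, b, c))
    print(madd_unrounded(a, b, c))
```

With these operands the non-fused model returns zero while the fused model preserves the small residual, which is the sense in which the unrounded model is described as more accurate.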
A particular operand type must be moved by the instruction that handles that type. There are three kinds of move instructions:

• Unconditional move
• Conditional move that tests an FPU condition code
• Conditional move that tests a CPU general register value against zero

The conditional move instructions operate in a way that may be unexpected. They always force the value in the destination register to become a value of the format specified in the instruction. If the destination register does not contain an operand of the specified format before the conditional move is executed, the contents become undefined. There is more information in Values in FP Registers on page B-13 and in the individual descriptions of the conditional move instructions themselves.

Table B-13 FPU Formatted Operand Move Instructions

    Mnemonic  Description          Defined in
    MOV.fmt   Floating-Point Move  MIPS I

Table B-14 FPU Conditional Move on True/False Instructions

    Mnemonic  Description                                  Defined in
    MOVT.fmt  Floating-Point Move Conditional on FP True   MIPS IV
    MOVF.fmt  Floating-Point Move Conditional on FP False  MIPS IV

Table B-15 FPU Conditional Move on Zero/Nonzero Instructions

    Mnemonic  Description                                 Defined in
    MOVZ.fmt  Floating-Point Move Conditional on Zero     MIPS IV
    MOVN.fmt  Floating-Point Move Conditional on Nonzero  MIPS IV

B 6.5 Conditional Branch Instructions

The FPU has PC-relative conditional branch instructions that test condition codes set by FPU compare instructions (C.cond.fmt).

All branches have an architectural delay of one instruction. When a branch is taken, the instruction immediately following the branch instruction, in the branch delay slot, is executed before the branch to the target instruction takes place. Conditional branches come in two versions that treat the instruction in the delay slot differently when the branch is not taken and execution falls through. The “branch” instructions execute the instruction in the delay slot, but the “branch likely” instructions do not (they are said to nullify it).

MIPS I defines a single condition code which is implicit in the compare and branch instructions. MIPS IV defines seven additional condition codes and includes the condition code number in the compare and branch instructions. The MIPS IV extension keeps the original condition bit as condition code zero and the extended encoding is compatible with the MIPS I encoding.

Table B-16 FPU Conditional Branch Instructions

    Mnemonic  Description                 Defined in
    BC1T      Branch on FP True           MIPS I
    BC1F      Branch on FP False          MIPS I
    BC1TL     Branch on FP True Likely    MIPS II
    BC1FL     Branch on FP False Likely   MIPS II

B 6.6 Miscellaneous Instructions

B 6.6.1 CPU Conditional Move

There are instructions to conditionally move one CPU general register to another based on an FPU condition code.

Table B-17 CPU Conditional Move on FPU True/False Instructions

    Mnemonic  Description                   Defined in
    MOVT      Move Conditional on FP True   MIPS IV
    MOVF      Move Conditional on FP False  MIPS IV

B 7 Valid Operands for FP Instructions

The floating-point unit arithmetic, conversion, and operand move instructions operate on formatted values with different precision and range limits and produce formatted values for results. Each representable value in each format has a binary encoding that is read from or stored to memory. The fmt or fmt3 field of the instruction encodes the operand format required for the instruction. A conversion instruction specifies the result type in the function field; the result of other operations is the same format as the operands. The encoding of the fmt and fmt3 fields is shown in Table B-18.

Table B-18 FPU Operand Format Field (fmt, fmt3) Decoding

    fmt    fmt3  Mnemonic  Size name  Size (bits)  Data type
    0-15   -     Reserved
    16     0     S         single     32           floating-point
    17     1     D         double     64           floating-point
    18-19  2-3   Reserved
    20     4     W         word       32           fixed-point
    21     5     L         long       64           fixed-point
    22-31  6-7   Reserved

Each type of arithmetic or conversion instruction is valid for operands of selected formats. A summary of the computational and operand move instructions and the formats valid for each of them is listed in Table B-19. Implementations must support combinations that are valid either directly in hardware or through emulation in an exception handler.

The result of an instruction using operand formats marked “U” is not currently specified by this architecture and will cause an exception. They are available for future extensions to the architecture. The exact exception mechanism used is processor specific. Most implementations report this as an Unimplemented Operation for a Floating Point exception. Other implementations report these combinations as Reserved Instruction exceptions.

The result of an instruction using operand formats marked “i” is invalid and an attempt to execute such an instruction has an undefined result.

Table B-19 Valid Formats for FPU Operations

                                                                               operand fmt
                                                                               float  fixed   COP1 function  COP1X op4
    Mnemonic            Operation                                              S  D   W  L    value          value
    ABS                 Absolute value                                         •  •   U  U    5
    ADD                 Add                                                    •  •   U  U    0
    C.cond              Floating-point compare                                 •  •   U  U    48-63
    CEIL.L, (CEIL.W)    Convert to longword fixed-point, round toward +∞       •  •   i  i    10 (14)
    CVT.D               Convert to double floating-point                       •  i   •  •    33
    CVT.L               Convert to longword fixed-point                        •  •   i  i    37
    CVT.S               Convert to single floating-point                       i  •   •  •    32
    CVT.W               Convert to 32-bit fixed-point                          •  •   i  i    36
    DIV                 Divide                                                 •  •   U  U    3
    FLOOR.L, (FLOOR.W)  Convert to longword fixed-point, round toward -∞       •  •   i  i    11 (15)
    MADD                Multiply-Add                                           •  •   U  U                   4
    MOV                 Move Register                                          •  •   i  i    6

B 8 Description of an Instruction

For the FPU instruction detail documentation, all variable subfields in an instruction format (such as fs, ft, immediate, and so on) are shown in lower-case. The instruction name (such as ADD, SUB, and so on) is shown in upper-case.

For the sake of clarity, we sometimes use an alias for a variable subfield in the formats of specific instructions. For example, we use rs = base in the format for load and store instructions. Such an alias is always lower case, since it refers to a variable subfield.

In some instructions, the instruction subfields op and function can have constant 6-bit values. When reference is made to these instructions, upper-case mnemonics are used. For instance, in the floating-point ADD instruction we use op = COP1 and function = ADD. In other cases, a single field has both fixed and variable subfields, so the name contains both upper and lower case characters. Bit encodings for mnemonics are shown at the end of this section, and are also included with each individual instruction.
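To make the field conventions concrete, the sketch below packs and unpacks a COP1 arithmetic instruction word. It assumes the 6/5/5/5/5/6-bit layout (op, fmt, ft, fs, fd, function) and the COP1 opcode value 010001 given in the individual instruction descriptions, and takes the fmt codes and a few function values from Tables B-18 and B-19. The function names and the partial lookup tables are hypothetical and cover only a handful of instructions.

```python
COP1 = 0b010001                                       # op field for FPU instructions
FMT_NAMES = {16: "S", 17: "D", 20: "W", 21: "L"}      # from Table B-18
FUNCTIONS = {0: "ADD", 3: "DIV", 5: "ABS", 6: "MOV"}  # subset of Table B-19


def encode_cop1(fmt, ft, fs, fd, function):
    """Pack the six fields into a 32-bit instruction word."""
    return (COP1 << 26) | (fmt << 21) | (ft << 16) | (fs << 11) | (fd << 6) | function


def decode_cop1(word):
    """Unpack a COP1 instruction word into a mnemonic and register numbers."""
    if (word >> 26) & 0x3F != COP1:
        raise ValueError("not a COP1 instruction")
    fmt = (word >> 21) & 0x1F
    function = word & 0x3F
    name = FUNCTIONS.get(function, "?")
    return {
        "mnemonic": f"{name}.{FMT_NAMES.get(fmt, 'reserved')}",
        "ft": (word >> 16) & 0x1F,
        "fs": (word >> 11) & 0x1F,
        "fd": (word >> 6) & 0x1F,
    }


if __name__ == "__main__":
    word = encode_cop1(fmt=16, ft=0, fs=2, fd=4, function=5)   # ABS.S f4, f2
    print(hex(word), decode_cop1(word))
```

Instructions such as the conditional branches reuse the fmt position for a different subfield, so a full decoder would need to dispatch on that field before interpreting the rest of the word.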
B 9 Operation Notation Conventions and Functions

The instruction description includes an Operation section that describes the operation of the instruction in a pseudocode. The pseudocode and terms used in the description are described in Operation Section Notation and Functions on page A-18.

Table B-19 Valid Formats for FPU Operations (continued)

                                                                                    operand fmt
                                                                                    float  fixed   COP1 function  COP1X op4
    Mnemonic            Operation                                                   S  D   W  L    value          value
    MOVC                FP Move Conditional on condition                            •  •   i  i    17
    MOVN                FP Move Conditional on GPR ≠ zero                           •  •   i  i    19
    MOVZ                FP Move Conditional on GPR = zero                           •  •   i  i    18
    MSUB                Multiply-Subtract                                           •  •   U  U                   5
    MUL                 Multiply                                                    •  •   U  U    2
    NEG                 Negate                                                      •  •   U  U    7
    NMADD               Negative multiply-Add                                       •  •   U  U                   6
    NMSUB               Negative multiply-Subtract                                  •  •   U  U                   7
    RECIP               Reciprocal approximation                                    •  •   U  U    21
    ROUND.L, (ROUND.W)  Convert to longword fixed-point, round to nearest/even      •  •   i  i    8 (12)
    RSQRT               Reciprocal square root approximation                        •  •   U  U    22
    SQRT                Square root                                                 •  •   U  U    4
    SUB                 Subtract                                                    •  •   U  U    1
    TRUNC.L, (TRUNC.W)  Convert to longword fixed-point, round toward zero          •  •   i  i    9 (13)

    Key: • = Valid. U = Unimplemented or Reserved. i = Invalid.

B 10 Individual FPU Instruction Descriptions

The FP instructions are described in alphabetic order. See Description of an Instruction on page A-15 for a description of the information in each instruction description.

ABS.fmt    Floating-Point Absolute Value

Format:
    ABS.S fd, fs        MIPS I
    ABS.D fd, fs

Purpose: To compute the absolute value of an FP value.

Description: fd ← absolute(fs)
The absolute value of the value in FPR fs is placed in FPR fd. The operand and result are values in format fmt.
This operation is arithmetic; a NaN operand signals invalid operation.

Restrictions:
The fields fs and fd must specify FPRs valid for operands of type fmt; see Floating-Point Registers on page B-6. If they are not valid, the result is undefined.
The operand must be a value in format fmt; see section B 7 on page B-24.
If it is not, the result is undefined and the value of the operand FPR becomes undefined.

Operation:
    StoreFPR(fd, fmt, AbsoluteValue(ValueFPR(fs, fmt)))

Exceptions:
    Coprocessor Unusable
    Reserved Instruction
    Floating-Point
        Unimplemented Operation
        Invalid Operation

Encoding (ABS.fmt):
    Bits    31..26   25..21   20..16   15..11   10..6   5..0
    Field   COP1     fmt      0        fs       fd      ABS
    Value   010001            00000                     000101
    Width   6        5        5        5        5       6

ADD.fmt    Floating-Point Add

Format:
    ADD.S fd, fs, ft    MIPS I
    ADD.D fd, fs, ft

Purpose: To add FP values.

Description: fd ← fs + ft
The value in FPR ft is added to the value in FPR fs. The result is calculated to infinite precision, rounded according to the current rounding mode in FCSR, and placed into FPR fd. The operands and result are values in format fmt.

Restrictions:
The fields fs, ft, and fd must specify FPRs valid for operands of type fmt; see Floating-Point Registers on page B-6. If they are not valid, the result is undefined.
The operands must be values in format fmt; see section B 7 on page B-24. If they are not, the result is undefined and the value of the operand FPRs becomes undefined.

Operation:
    StoreFPR(fd, fmt, ValueFPR(fs, fmt) + ValueFPR(ft, fmt))

Exceptions:
    Coprocessor Unusable
    Reserved Instruction
    Floating-Point
        Unimplemented Operation
        Invalid Operation
        Inexact
        Overflow
        Underflow

Encoding (ADD.fmt):
    Bits    31..26   25..21   20..16   15..11   10..6   5..0
    Field   COP1     fmt      ft       fs       fd      ADD
    Value   010001                                      000000
    Width   6        5        5        5        5       6

BC1F    Branch on FP False

Format:
    BC1F offset (cc = 0 implied)    MIPS I
    BC1F cc, offset                 MIPS IV

Purpose: To test an FP condition code and do a PC-relative conditional branch.

Description: if (cc = 0) then branch
An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself), in the branch delay slot, to form a PC-relative effective target address. If the FP condition code bit cc is false (0), branch to the effective target address after the instruction in the delay slot is executed.

An FP condition code is set by the FP compare instruction, C.cond.fmt.

The MIPS I architecture defines a single floating-point condition code, implemented as the coprocessor 1 condition signal (Cp1Cond) and the C bit in the FP Control and Status register. MIPS I, II, and III architectures must have the cc field set to 0, which is implied by the first format in the Format section.

The MIPS IV architecture adds seven more condition code bits to the original condition code 0. FP compare and conditional branch instructions specify the condition code bit to set or test. Both assembler formats are valid for MIPS IV.

Restrictions:
MIPS I, II, III: There must be at least one instruction between the compare instruction that sets a condition code and the branch instruction that tests it. Hardware does not detect a violation of this restriction.
MIPS IV: None.

Encoding (BC1F):
    Bits    31..26   25..21   20..18   17   16   15..0
    Field   COP1     BC       cc       nd   tf   offset
    Value   010001   01000             0    0
    Width   6        5        3        1    1    16

Operation:
MIPS I, II, and III define a single condition code; MIPS IV adds 7 more condition codes. This operation specification is for the general “Branch On Condition” operation with the tf (true/false) and nd (nullify delay slot) fields as variables. The individual instructions BC1F, BC1FL, BC1T, and BC1TL have specific values for tf and nd.
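The formal pseudocode follows; as an informal companion, the target computation and the role of the tf and nd fields can be sketched as below. The sketch assumes 4-byte instructions and a 32-bit address space (GPRLEN = 32) for the final masking, and the function names branch_target and fp_branch are hypothetical.

```python
def branch_target(branch_addr, offset_field, addr_bits=32):
    """Effective target of an FP conditional branch.

    The 16-bit offset field is sign-extended, shifted left 2 bits, and added
    to the address of the instruction in the delay slot (branch_addr + 4).
    addr_bits = 32 is an assumption of this sketch.
    """
    offset_field &= 0xFFFF
    signed = offset_field - 0x10000 if offset_field & 0x8000 else offset_field
    return (branch_addr + 4 + (signed << 2)) & ((1 << addr_bits) - 1)


def fp_branch(condition_bit, tf, nd):
    """Return (taken, delay_slot_executes) for BC1F/BC1FL/BC1T/BC1TL.

    tf -- 0 for the "False" branches, 1 for the "True" branches
    nd -- 0 for plain branches, 1 for "likely" branches, which nullify
          the delay slot when the branch is not taken
    """
    taken = condition_bit == tf
    delay_slot_executes = taken or not nd
    return taken, delay_slot_executes


if __name__ == "__main__":
    print(hex(branch_target(0x1000, 0x0003)))    # forward by 3 instructions
    print(hex(branch_target(0x1000, 0xFFFF)))    # offset -1: back to the branch
    print(fp_branch(condition_bit=1, tf=0, nd=1))  # BC1FL not taken
```

The largest positive offset field, 0x7FFF, moves the target forward by 131068 bytes from the delay slot, which matches the stated range of roughly ±128 KBytes.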
MIPS I:
    I - 1: condition ← COC[1] = tf
    I:     target ← (offset_15)^(GPRLEN-(16+2)) || offset || 0^2
    I + 1: if condition then
               PC ← PC + target
           endif

MIPS II and MIPS III:
    I - 1: condition ← COC[1] = tf
    I:     target ← (offset_15)^(GPRLEN-(16+2)) || offset || 0^2
    I + 1: if condition then
               PC ← PC + target
           else if nd then
               NullifyCurrentInstruction()
           endif

MIPS IV:
    I:     condition ← FCC[cc] = tf
           target ← (offset_15)^(GPRLEN-(16+2)) || offset || 0^2
    I + 1: if condition then
               PC ← PC + target
           else if nd then
               NullifyCurrentInstruction()
           endif

Exceptions:
    Coprocessor Unusable
    Reserved Instruction
    Floating-Point: Unimplemented Operation

Programming Notes:
With the 18-bit signed instruction offset, the conditional branch range is ± 128 KBytes. Use jump (J) or jump register (JR) instructions to branch to more distant addresses.

BC1FL  Branch on FP False Likely

Format:
    BC1FL offset (cc = 0 implied)    MIPS II
    BC1FL cc, offset                 MIPS IV

Purpose: To test an FP condition code and do a PC-relative conditional branch; execute the delay slot only if the branch is taken.

Description: if (cc = 0) then branch_likely
An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself), in the branch delay slot, to form a PC-relative effective target address. If the FP condition code bit cc is false (0), branch to the effective target address after the instruction in the delay slot is executed. If the branch is not taken, the instruction in the delay slot is not executed.

An FP condition code is set by the FP compare instruction, C.cond.fmt.

The MIPS I architecture defines a single floating-point condition code, implemented as the coprocessor 1 condition signal (Cp1Cond) and the C bit in the FP Control and Status register. MIPS I, II, and III architectures must have the cc field set to 0, which is implied by the first format in the Format section.
The MIPS IV architecture adds seven more condition code bits to the original condition code 0. FP compare and conditional branch instructions specify the condition code bit to set or test. Both assembler formats are valid for MIPS IV.

Restrictions:
MIPS II, III: There must be at least one instruction between the compare instruction that sets a condition code and the branch instruction that tests it. Hardware does not detect a violation of this restriction.
MIPS IV: None.

Encoding (bits 31..0):
    COP1 = 010001 (31..26) | BC = 01000 (25..21) | cc (20..18) | nd = 1 (17) | tf = 0 (16) | offset (15..0)

Operation:
MIPS II and III define a single condition code; MIPS IV adds 7 more condition codes. This operation specification is for the general "Branch On Condition" operation with the tf (true/false) and nd (nullify delay slot) fields as variables. The individual instructions BC1F, BC1FL, BC1T, and BC1TL have specific values for tf and nd.

MIPS II and MIPS III:
    I - 1: condition ← COC[1] = tf
    I:     target ← (offset_15)^(GPRLEN-(16+2)) || offset || 0^2
    I + 1: if condition then
               PC ← PC + target
           else if nd then
               NullifyCurrentInstruction()
           endif

MIPS IV:
    I:     condition ← FCC[cc] = tf
           target ← (offset_15)^(GPRLEN-(16+2)) || offset || 0^2
    I + 1: if condition then
               PC ← PC + target
           else if nd then
               NullifyCurrentInstruction()
           endif

Exceptions:
    Coprocessor Unusable
    Reserved Instruction
    Floating-Point: Unimplemented Operation

Programming Notes:
With the 18-bit signed instruction offset, the conditional branch range is ± 128 KBytes. Use jump (J) or jump register (JR) instructions to branch to more distant addresses.

BC1T  Branch on FP True

Format:
    BC1T offset (cc = 0 implied)    MIPS I
    BC1T cc, offset                 MIPS IV

Purpose: To test an FP condition code and do a PC-relative conditional branch.
Description: if (cc = 1) then branch
An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself), in the branch delay slot, to form a PC-relative effective target address. If the FP condition code bit cc is true (1), branch to the effective target address after the instruction in the delay slot is executed.

An FP condition code is set by the FP compare instruction, C.cond.fmt.

The MIPS I architecture defines a single floating-point condition code, implemented as the coprocessor 1 condition signal (Cp1Cond) and the C bit in the FP Control and Status register. MIPS I, II, and III architectures must have the cc field set to 0, which is implied by the first format in the Format section.

The MIPS IV architecture adds seven more condition code bits to the original condition code 0. FP compare and conditional branch instructions specify the condition code bit to set or test. Both assembler formats are valid for MIPS IV.

Restrictions:
MIPS I, II, III: There must be at least one instruction between the compare instruction that sets a condition code and the branch instruction that tests it. Hardware does not detect a violation of this restriction.
MIPS IV: None.

Encoding (bits 31..0):
    COP1 = 010001 (31..26) | BC = 01000 (25..21) | cc (20..18) | nd = 0 (17) | tf = 1 (16) | offset (15..0)

Operation:
MIPS I, II, and III define a single condition code; MIPS IV adds 7 more condition codes. This operation specification is for the general "Branch On Condition" operation with the tf (true/false) and nd (nullify delay slot) fields as variables. The individual instructions BC1F, BC1FL, BC1T, and BC1TL have specific values for tf and nd.
MIPS I:
    I - 1: condition ← COC[1] = tf
    I:     target ← (offset_15)^(GPRLEN-(16+2)) || offset || 0^2
    I + 1: if condition then
               PC ← PC + target
           endif

MIPS II and MIPS III:
    I - 1: condition ← COC[1] = tf
    I:     target ← (offset_15)^(GPRLEN-(16+2)) || offset || 0^2
    I + 1: if condition then
               PC ← PC + target
           else if nd then
               NullifyCurrentInstruction()
           endif

MIPS IV:
    I:     condition ← FCC[cc] = tf
           target ← (offset_15)^(GPRLEN-(16+2)) || offset || 0^2
    I + 1: if condition then
               PC ← PC + target
           else if nd then
               NullifyCurrentInstruction()
           endif

Exceptions:
    Coprocessor Unusable
    Reserved Instruction
    Floating-Point: Unimplemented Operation

Programming Notes:
With the 18-bit signed instruction offset, the conditional branch range is ± 128 KBytes. Use jump (J) or jump register (JR) instructions to branch to more distant addresses.

BC1TL  Branch on FP True Likely

Format:
    BC1TL offset (cc = 0 implied)    MIPS II
    BC1TL cc, offset                 MIPS IV

Purpose: To test an FP condition code and do a PC-relative conditional branch; execute the delay slot only if the branch is taken.

Description: if (cc = 1) then branch_likely
An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself), in the branch delay slot, to form a PC-relative effective target address. If the FP condition code bit cc is true (1), branch to the effective target address after the instruction in the delay slot is executed. If the branch is not taken, the instruction in the delay slot is not executed.

An FP condition code is set by the FP compare instruction, C.cond.fmt.

The MIPS I architecture defines a single floating-point condition code, implemented as the coprocessor 1 condition signal (Cp1Cond) and the C bit in the FP Control and Status register. MIPS I, II, and III architectures must have the cc field set to 0, which is implied by the first format in the Format section.
The MIPS IV architecture adds seven more condition code bits to the original condition code 0. FP compare and conditional branch instructions specify the condition code bit to set or test. Both assembler formats are valid for MIPS IV.

Restrictions:
MIPS II, III: There must be at least one instruction between the compare instruction that sets a condition code and the branch instruction that tests it. Hardware does not detect a violation of this restriction.
MIPS IV: None.

Encoding (bits 31..0):
    COP1 = 010001 (31..26) | BC = 01000 (25..21) | cc (20..18) | nd = 1 (17) | tf = 1 (16) | offset (15..0)

Operation:
MIPS II and III define a single condition code; MIPS IV adds 7 more condition codes. This operation specification is for the general "Branch On Condition" operation with the tf (true/false) and nd (nullify delay slot) fields as variables. The individual instructions BC1F, BC1FL, BC1T, and BC1TL have specific values for tf and nd.

MIPS II and MIPS III:
    I - 1: condition ← COC[1] = tf
    I:     target ← (offset_15)^(GPRLEN-(16+2)) || offset || 0^2
    I + 1: if condition then
               PC ← PC + target
           else if nd then
               NullifyCurrentInstruction()
           endif

MIPS IV:
    I:     condition ← FCC[cc] = tf
           target ← (offset_15)^(GPRLEN-(16+2)) || offset || 0^2
    I + 1: if condition then
               PC ← PC + target
           else if nd then
               NullifyCurrentInstruction()
           endif

Exceptions:
    Coprocessor Unusable
    Reserved Instruction
    Floating-Point: Unimplemented Operation

Programming Notes:
With the 18-bit signed instruction offset, the conditional branch range is ± 128 KBytes. Use jump (J) or jump register (JR) instructions to branch to more distant addresses.

C.cond.fmt  Floating-Point Compare

Format:
    C.cond.S fs, ft (cc = 0 implied)    MIPS I
    C.cond.D fs, ft (cc = 0 implied)
    C.cond.S cc, fs, ft                 MIPS IV
    C.cond.D cc, fs, ft

Purpose: To compare FP values and record the Boolean result in a condition code.
Description: cc ← fs compare_cond ft
The value in FPR fs is compared to the value in FPR ft; the values are in format fmt. The comparison is exact and neither overflows nor underflows. If the comparison specified by cond_2..1 is true for the operand values, then the result is true, otherwise it is false. If no exception is taken, the result is written into condition code cc; true is 1 and false is 0.

If cond_3 is set and at least one of the values is a NaN, an Invalid Operation condition is raised; the result depends on the FP exception model currently active.
• Precise exception model: The Invalid Operation flag is set in the FCSR. If the Invalid Operation enable bit is set in the FCSR, no result is written and an Invalid Operation exception is taken immediately. Otherwise, the Boolean result is written into condition code cc.
• Imprecise exception model (R8000 normal mode): The Boolean result is written into condition code cc. No FCSR flag is set. If the Invalid Operation enable bit is set in the FCSR, an Invalid Operation exception is taken, imprecisely, at some future time.

There are four mutually exclusive ordering relations for comparing floating-point values; one relation is always true and the others are false. The familiar relations are greater than, less than, and equal. In addition, the IEEE floating-point standard defines the relation unordered, which is true when at least one operand value is NaN; NaN compares unordered with everything, including itself. Comparisons ignore the sign of zero, so +0 equals -0.

The comparison condition is a logical predicate, or equation, of the ordering relations such as "less than or equal", "equal", "not less than", or "unordered or equal". Compare distinguishes sixteen comparison predicates. The Boolean result of the instruction is obtained by substituting the Boolean value of each ordering relation for the two FP values into the equation.
If the equal relation is true, for example, then all four example predicates above would yield a true result. If the unordered relation is true then only the final predicate, "unordered or equal", would yield a true result.

Encoding (bits 31..0):
    COP1 = 010001 (31..26) | fmt (25..21) | ft (20..16) | fs (15..11) | cc (10..8) | 0 = 00 (7..6) | FC = 11 (5..4) | cond (3..0)

Logical negation of a compare result allows eight distinct comparisons to test for sixteen predicates as shown in Table B-20. Each mnemonic tests for both a predicate and its logical negation. For each mnemonic, compare tests the truth of the first predicate. When the first predicate is true, the result is true as shown in the "if predicate is true" column (note that the False predicate is never true and False/True do not follow the normal pattern). When the first predicate is true, the second predicate must be false, and vice versa. The truth of the second predicate is the logical negation of the instruction result. After a compare instruction, test for the truth of the first predicate with the Branch on FP True (BC1T) instruction and the truth of the second with Branch on FP False (BC1F).

Table B-20 FPU Comparisons Without Special Operand Exceptions
(For every row: Invalid Operation exception if QNaN = No; cond field bit 3 = 0. For each mnemonic, the first line is the predicate and the second line is its logical negation.)

Instr     Predicate / logically negated predicate       Relation values   CC result if        cond
mnemonic  (abbreviation)                                >   <   =   ?     predicate is true   2..0
F         False [always False, never a True result]     F   F   F   F     F                   0
          True (T)                                      T   T   T   T
UN        Unordered                                     F   F   F   T     T                   1
          Ordered (OR)                                  T   T   T   F     F
EQ        Equal                                         F   F   T   F     T                   2
          Not Equal (NEQ)                               T   T   F   T     F
UEQ       Unordered or Equal                            F   F   T   T     T                   3
          Ordered or Greater than or Less than (OGL)    T   T   F   F     F
OLT       Ordered or Less Than                          F   T   F   F     T                   4
          Unordered or Greater than or Equal (UGE)      T   F   T   T     F
ULT       Unordered or Less Than                        F   T   F   T     T                   5
          Ordered or Greater than or Equal (OGE)        T   F   T   F     F
OLE       Ordered or Less than or Equal                 F   T   T   F     T                   6
          Unordered or Greater Than (UGT)               T   F   F   T     F
ULE       Unordered or Less than or Equal               F   T   T   T     T                   7
          Ordered or Greater Than (OGT)                 T   F   F   F     F

key: "?" = unordered, ">" = greater than, "<" = less than, "=" is equal, "T" = True, "F" = False

There is another set of eight compare operations, distinguished by a cond_3 value of 1, testing the same sixteen conditions. For these additional comparisons, if at least one of the operands is a NaN, including Quiet NaN, then an Invalid Operation condition is raised. If the Invalid Operation condition is enabled in the FCSR, then an Invalid Operation exception occurs.

Table B-21 FPU Comparisons With Special Operand Exceptions for QNaNs
(For every row: Invalid Operation exception if QNaN = Yes; cond field bit 3 = 1. For each mnemonic, the first line is the predicate and the second line is its logical negation.)

Instr     Predicate / logically negated predicate       Relation values   CC result if        cond
mnemonic  (abbreviation)                                >   <   =   ?     predicate is true   2..0
SF        Signaling False [always False]                F   F   F   F     F                   0
          Signaling True (ST)                           T   T   T   T
NGLE      Not Greater than or Less than or Equal        F   F   F   T     T                   1
          Greater than or Less than or Equal (GLE)      T   T   T   F     F
SEQ       Signaling Equal                               F   F   T   F     T                   2
          Signaling Not Equal (SNE)                     T   T   F   T     F
NGL       Not Greater than or Less than                 F   F   T   T     T                   3
          Greater than or Less than (GL)                T   T   F   F     F
LT        Less than                                     F   T   F   F     T                   4
          Not Less Than (NLT)                           T   F   T   T     F
NGE       Not Greater than or Equal                     F   T   F   T     T                   5
          Greater than or Equal (GE)                    T   F   T   F     F
LE        Less than or Equal                            F   T   T   F     T                   6
          Not Less than or Equal (NLE)                  T   F   F   T     F
NGT       Not Greater than                              F   T   T   T     T                   7
          Greater than (GT)                             T   F   F   F     F

key: "?" = unordered, ">" = greater than, "<" = less than, "=" is equal, "T" = True, "F" = False

The instruction encoding is an extension made in the MIPS IV architecture. In previous architecture levels the cc field for this instruction must be 0.

The MIPS I architecture defines a single floating-point condition code, implemented as the coprocessor 1 condition signal (Cp1Cond) and the C bit in the FP Control and Status register. MIPS I, II, and III architectures must have the cc field set to 0, which is implied by the first format in the Format section.

The MIPS IV architecture adds seven more condition code bits to the original condition code 0. FP compare and conditional branch instructions specify the condition code bit to set or test. Both assembler formats are valid for MIPS IV.

Restrictions:
The fields fs and ft must specify FPRs valid for operands of type fmt; see Floating-Point Registers on page B-6. If they are not valid, the result is undefined.
The operands must be values in format fmt; see section B.7 on page B-24. If they are not, the result is undefined and the value of the operand FPRs becomes undefined.
MIPS I, II, III: There must be at least one instruction between the compare instruction that sets a condition code and the branch instruction that tests it. Hardware does not detect a violation of this restriction.
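Reading Tables B-20 and B-21 row by row, the three low cond bits act as a mask over the less, equal, and unordered relations, and bit 3 selects whether a NaN operand raises Invalid Operation. The Python sketch below illustrates that reading. It is an informal model with hypothetical names (`COND`, `fp_compare`), it treats every Python NaN as a quiet NaN, and it ignores the exception-model details.

```python
import math

# cond_2..0 values for the eight Table B-20 mnemonics (cond_3 = 0).
# OR-ing in 8 gives the matching Table B-21 mnemonic (for example EQ -> SEQ).
COND = {"F": 0, "UN": 1, "EQ": 2, "UEQ": 3, "OLT": 4, "ULT": 5, "OLE": 6, "ULE": 7}


def fp_compare(cond, fs, ft):
    """Return (condition, invalid) for a 4-bit cond field and two Python floats."""
    unordered = math.isnan(fs) or math.isnan(ft)
    less = (not unordered) and fs < ft
    equal = (not unordered) and fs == ft      # +0.0 == -0.0 is True in Python too
    condition = bool(
        ((cond & 4) and less) or ((cond & 2) and equal) or ((cond & 1) and unordered)
    )
    invalid = bool((cond & 8) and unordered)  # Table B-21 predicates signal on NaN
    return condition, invalid
```

With a NaN operand, `fp_compare(COND["ULE"], float("nan"), 1.0)` gives a true condition while `COND["OLE"]` gives false, matching the "?" column of the two rows.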
Operation:
    if NaN(ValueFPR(fs, fmt)) or NaN(ValueFPR(ft, fmt)) then
        less ← false
        equal ← false
        unordered ← true
        if cond_3 then
            SignalException(InvalidOperation)
        endif
    else
        less ← ValueFPR(fs, fmt) < ValueFPR(ft, fmt)
        equal ← ValueFPR(fs, fmt) = ValueFPR(ft, fmt)
        unordered ← false
    endif
    condition ← (cond_2 and less) or (cond_1 and equal) or (cond_0 and unordered)
    FCC[cc] ← condition
    if cc = 0 then
        COC[1] ← condition
    endif

Exceptions:
    Coprocessor Unusable
    Reserved Instruction
    Floating-Point: Unimplemented Operation, Invalid Operation

Programming Notes:
FP computational instructions, including compare, that receive an operand value of Signaling NaN will raise the Invalid Operation condition. The comparisons that raise the Invalid Operation condition for Quiet NaNs in addition to SNaNs permit a simpler programming model if NaNs are errors. Using these compares, programs do not need explicit code to check for QNaNs causing the unordered relation. Instead, they take an exception and allow the exception handling system to deal with the error when it occurs. For example, consider a comparison in which we want to know if two numbers are equal, but for which unordered would be an error.

    # comparisons using explicit tests for QNaN
        c.eq.d  $f2,$f4     # check for equal
        nop
        bc1t    L2          # it is equal
        c.un.d  $f2,$f4     # it is not equal, but might be unordered
        nop                 # MIPS I-III: one instruction between compare and branch
        bc1t    ERROR       # unordered goes off to an error handler
    # not-equal-case code here
        ...
    # equal-case code here
    L2:
    # --------------------------------------------------------------
    # comparison using comparisons that signal QNaN
        c.seq.d $f2,$f4     # check for equal
        nop
        bc1t    L2          # it is equal
        nop
    # it is not unordered here...
    # not-equal-case code here
        ...
    # equal-case code here
    L2:

CEIL.L.fmt  Floating-Point Ceiling Convert to Long Fixed-Point
Format:
    CEIL.L.S fd, fs        MIPS III
    CEIL.L.D fd, fs

Purpose: To convert an FP value to 64-bit fixed-point, rounding up.

Description: fd ← convert_and_round(fs)
The value in FPR fs, in format fmt, is converted to a value in 64-bit long fixed-point format rounding toward +∞ (rounding mode 2). The result is placed in FPR fd.

When the source value is Infinity, NaN, or rounds to an integer outside the range -2^63 to 2^63-1, the result cannot be represented correctly and an IEEE Invalid Operation condition exists. The result depends on the FP exception model currently active.
• Precise exception model: The Invalid Operation flag is set in the FCSR. If the Invalid Operation enable bit is set in the FCSR, no result is written to fd and an Invalid Operation exception is taken immediately. Otherwise, the default result, 2^63-1, is written to fd.
• Imprecise exception model (R8000 normal mode): The default result, 2^63-1, is written to fd. No FCSR flag is set. If the Invalid Operation enable bit is set in the FCSR, an Invalid Operation exception is taken, imprecisely, at some future time.

Restrictions:
The fields fs and fd must specify valid FPRs; fs for type fmt and fd for long fixed-point; see Floating-Point Registers on page B-6. If they are not valid, the result is undefined.
The operand must be a value in format fmt; see section B.7 on page B-24. If it is not, the result is undefined and the value of the operand FPR becomes undefined.

Operation:
    StoreFPR(fd, L, ConvertFmt(ValueFPR(fs, fmt), fmt, L))

Exceptions:
    Coprocessor Unusable
    Reserved Instruction
    Floating-Point: Invalid Operation, Unimplemented Operation, Inexact, Overflow

Encoding (bits 31..0):
    COP1 = 010001 (31..26) | fmt (25..21) | 0 = 00000 (20..16) | fs (15..11) | fd (10..6) | CEIL.L = 001010 (5..0)

CEIL.W.fmt  Floating-Point Ceiling Convert to Word Fixed-Point

Format:
    CEIL.W.S fd, fs        MIPS II
    CEIL.W.D fd, fs

Purpose: To convert an FP value to 32-bit fixed-point, rounding up.
Description: fd ← convert_and_round(fs)
The value in FPR fs, in format fmt, is converted to a value in 32-bit word fixed-point format rounding toward +∞ (rounding mode 2). The result is placed in FPR fd.

When the source value is Infinity, NaN, or rounds to an integer outside the range -2^31 to 2^31-1, the result cannot be represented correctly and an IEEE Invalid Operation condition exists. The result depends on the FP exception model currently active.
• Precise exception model: The Invalid Operation flag is set in the FCSR. If the Invalid Operation enable bit is set in the FCSR, no result is written to fd and an Invalid Operation exception is taken immediately. Otherwise, the default result, 2^31-1, is written to fd.
• Imprecise exception model (R8000 normal mode): The default result, 2^31-1, is written to fd. No FCSR flag is set. If the Invalid Operation enable bit is set in the FCSR, an Invalid Operation exception is taken, imprecisely, at some future time.

Restrictions:
The fields fs and fd must specify valid FPRs; fs for type fmt and fd for word fixed-point; see Floating-Point Registers on page B-6. If they are not valid, the result is undefined.
The operand must be a value in format fmt; see section B.7 on page B-24. If it is not, the result is undefined and the value of the operand FPR becomes undefined.

Operation:
    StoreFPR(fd, W, ConvertFmt(ValueFPR(fs, fmt), fmt, W))

Exceptions:
    Coprocessor Unusable
    Reserved Instruction
    Floating-Point: Invalid Operation, Unimplemented Operation, Inexact, Overflow

Encoding (bits 31..0):
    COP1 = 010001 (31..26) | fmt (25..21) | 0 = 00000 (20..16) | fs (15..11) | fd (10..6) | CEIL.W = 001110 (5..0)

CFC1  Move Control Word from Floating-Point

Format:
    CFC1 rt, fs        MIPS I

Purpose: To copy a word from an FPU control register to a GPR.

Description: rt ← FP_Control[fs]
Copy the 32-bit word from FP (coprocessor 1) control register fs into GPR rt, sign-extending it if the GPR is 64 bits.
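The sign-extension step in the CFC1 description can be pictured with a small Python sketch. This is an assumed model for illustration only; the helper name `sign_extend_word` and its `gpr_bits` parameter are hypothetical, and register contents are represented as unsigned bit patterns.

```python
def sign_extend_word(word, gpr_bits=64):
    """Return the GPR bit pattern obtained by sign-extending a 32-bit word.

    gpr_bits is the assumed width of the destination GPR (32 or 64).
    """
    word &= 0xFFFFFFFF
    if gpr_bits == 32:
        return word                      # nothing to extend on a 32-bit GPR
    if word & 0x80000000:                # bit 31 set: replicate it into the upper bits
        word |= ((1 << gpr_bits) - 1) & ~0xFFFFFFFF
    return word
```

A control word with bit 31 set, such as 0x80000003, becomes 0xFFFFFFFF80000003 in a 64-bit GPR, while a word with bit 31 clear is copied unchanged.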
Restrictions:
There are only a couple of control registers defined for the floating-point unit. The result is not defined if fs specifies a register that does not exist.
For MIPS I, MIPS II, and MIPS III, the contents of GPR rt are undefined for the instruction immediately following CFC1.

Operation: MIPS I - III
    I:     temp ← FCR[fs]
    I + 1: GPR[rt] ← sign_extend(temp)

Operation: MIPS IV
    temp ← FCR[fs]
    GPR[rt] ← sign_extend(temp)

Exceptions:
    Coprocessor Unusable

Encoding (bits 31..0):
    COP1 = 010001 (31..26) | CF = 00010 (25..21) | rt (20..16) | fs (15..11) | 0 = 00000000000 (10..0)

CTC1  Move Control Word to Floating-Point

Format:
    CTC1 rt, fs        MIPS I

Purpose: To copy a word from a GPR to an FPU control register.

Description: FP_Control[fs] ← rt
Copy the low word from GPR rt into FP (coprocessor 1) control register fs.

Writing to control register 31, the Floating-Point Control and Status Register or FCSR, causes the appropriate exception if any cause bit and its corresponding enable bit are both set. The register will be written before the exception occurs.

Restrictions:
There are only a couple of control registers defined for the floating-point unit. The result is not defined if fs specifies a register that does not exist.
For MIPS I, MIPS II, and MIPS III, the contents of floating-point control register fs are undefined for the instruction immediately following CTC1.

Operation: MIPS I - III
    I:     temp ← GPR[rt]_31..0
    I + 1: FCR[fs] ← temp
           COC[1] ← FCR[31]_23

Operation: MIPS IV
    temp ← GPR[rt]_31..0
    FCR[fs] ← temp
    COC[1] ← FCR[31]_23

Exceptions:
    Coprocessor Unusable
    Reserved Instruction
    Floating-Point: Unimplemented Operation, Invalid Operation, Division-by-zero, Inexact, Overflow, Underflow

Encoding (bits 31..0):
    COP1 = 010001 (31..26) | CT = 00110 (25..21) | rt (20..16) | fs (15..11) | 0 = 00000000000 (10..0)

CVT.D.fmt  Floating-Point Convert to Double Floating-Point
Format:
    CVT.D.S fd, fs        MIPS I
    CVT.D.W fd, fs
    CVT.D.L fd, fs        MIPS III

Purpose: To convert an FP or fixed-point value to double FP.

Description: fd ← convert_and_round(fs)
The value in FPR fs in format fmt is converted to a value in double floating-point format rounded according to the current rounding mode in FCSR. The result is placed in FPR fd.

If fmt is S or W, then the operation is always exact.

Restrictions:
The fields fs and fd must specify valid FPRs; fs for type fmt and fd for double floating-point; see Floating-Point Registers on page B-6. If they are not valid, the result is undefined.
The operand must be a value in format fmt; see section B.7 on page B-24. If it is not, the result is undefined and the value of the operand FPR becomes undefined.

Operation:
    StoreFPR(fd, D, ConvertFmt(ValueFPR(fs, fmt), fmt, D))

Exceptions:
    Coprocessor Unusable
    Reserved Instruction
    Floating-Point: Invalid Operation, Unimplemented Operation, Inexact, Overflow, Underflow

Encoding (bits 31..0):
    COP1 = 010001 (31..26) | fmt (25..21) | 0 = 00000 (20..16) | fs (15..11) | fd (10..6) | CVT.D = 100001 (5..0)

CVT.L.fmt  Floating-Point Convert to Long Fixed-Point

Format:
    CVT.L.S fd, fs        MIPS III
    CVT.L.D fd, fs

Purpose: To convert an FP value to a 64-bit fixed-point.

Description: fd ← convert_and_round(fs)
Convert the value in format fmt in FPR fs to long fixed-point format, round according to the current rounding mode in FCSR, and place the result in FPR fd.

When the source value is Infinity, NaN, or rounds to an integer outside the range -2^63 to 2^63-1, the result cannot be represented correctly and an IEEE Invalid Operation condition exists. The result depends on the FP exception model currently active:
• Precise exception model: The Invalid Operation flag is set in the FCSR. If the Invalid Operation enable bit is set in the FCSR, no result is written to fd and an Invalid Operation exception is taken immediately.
  Otherwise, the default result, 2^63-1, is written to fd.
• Imprecise exception model (R8000 normal mode): The default result, 2^63-1, is written to fd. No FCSR flag is set. If the Invalid Operation enable bit is set in the FCSR, an Invalid Operation exception is taken, imprecisely, at some future time.

Restrictions:
The fields fs and fd must specify valid FPRs; fs for type fmt and fd for long fixed-point; see Floating-Point Registers on page B-6. If they are not valid, the result is undefined.
The operand must be a value in format fmt; see section B.7 on page B-24. If it is not, the result is undefined and the value of the operand FPR becomes undefined.

Operation:
    StoreFPR(fd, L, ConvertFmt(ValueFPR(fs, fmt), fmt, L))

Exceptions:
    Coprocessor Unusable
    Reserved Instruction
    Floating-Point: Invalid Operation, Unimplemented Operation, Inexact, Overflow

Encoding (bits 31..0):
    COP1 = 010001 (31..26) | fmt (25..21) | 0 = 00000 (20..16) | fs (15..11) | fd (10..6) | CVT.L = 100101 (5..0)

CVT.S.fmt  Floating-Point Convert to Single Floating-Point

Format:
    CVT.S.D fd, fs        MIPS I
    CVT.S.W fd, fs
    CVT.S.L fd, fs        MIPS III

Purpose: To convert an FP or fixed-point value to single FP.

Description: fd ← convert_and_round(fs)
The value in FPR fs in format fmt is converted to a value in single floating-point format rounded according to the current rounding mode in FCSR. The result is placed in FPR fd.

Restrictions:
The fields fs and fd must specify valid FPRs; fs for type fmt and fd for single floating-point; see Floating-Point Registers on page B-6. If they are not valid, the result is undefined.
The operand must be a value in format fmt; see section B.7 on page B-24. If it is not, the result is undefined and the value of the operand FPR becomes undefined.
Operation:
    StoreFPR(fd, S, ConvertFmt(ValueFPR(fs, fmt), fmt, S))

Exceptions:
    Coprocessor Unusable
    Reserved Instruction
    Floating-Point: Invalid Operation, Unimplemented Operation, Inexact, Overflow, Underflow

Encoding (bits 31..0):
    COP1 = 010001 (31..26) | fmt (25..21) | 0 = 00000 (20..16) | fs (15..11) | fd (10..6) | CVT.S = 100000 (5..0)

CVT.W.fmt  Floating-Point Convert to Word Fixed-Point

Format:
    CVT.W.S fd, fs        MIPS I
    CVT.W.D fd, fs

Purpose: To convert an FP value to 32-bit fixed-point.

Description: fd ← convert_and_round(fs)
The value in FPR fs in format fmt is converted to a value in 32-bit word fixed-point format rounded according to the current rounding mode in FCSR. The result is placed in FPR fd.

When the source value is Infinity, NaN, or rounds to an integer outside the range -2^31 to 2^31-1, the result cannot be represented correctly and an IEEE Invalid Operation condition exists. The result depends on the FP exception model currently active.
• Precise exception model: The Invalid Operation flag is set in the FCSR. If the Invalid Operation enable bit is set in the FCSR, no result is written to fd and an Invalid Operation exception is taken immediately. Otherwise, the default result, 2^31-1, is written to fd.
• Imprecise exception model (R8000 normal mode): The default result, 2^31-1, is written to fd. No FCSR flag is set. If the Invalid Operation enable bit is set in the FCSR, an Invalid Operation exception is taken, imprecisely, at some future time.

Restrictions:
The fields fs and fd must specify valid FPRs; fs for type fmt and fd for word fixed-point; see Floating-Point Registers on page B-6. If they are not valid, the result is undefined.
The operand must be a value in format fmt; see section B.7 on page B-24. If it is not, the result is undefined and the value of the operand FPR becomes undefined.
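To make the CVT.W.fmt rounding and range check concrete, here is a minimal Python sketch of the precise exception model with the Invalid Operation enable bit clear. It is an illustration under stated assumptions, not the architecture's definition: the names `cvt_w` and `ROUNDERS` are hypothetical, rounding modes 2 and 3 follow the CEIL and FLOOR descriptions, and modes 0 (to nearest, ties to even) and 1 (toward zero) are assumed from the usual FCSR numbering.

```python
import math

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

# Assumed FCSR rounding-mode numbers; only 2 and 3 are named in the text.
ROUNDERS = {
    0: round,        # to nearest, ties to even (Python's round on floats)
    1: math.trunc,   # toward zero
    2: math.ceil,    # toward +infinity
    3: math.floor,   # toward -infinity
}


def cvt_w(value, rounding_mode=0):
    """Return (result, invalid_flag) for a float converted to a 32-bit word."""
    if math.isnan(value) or math.isinf(value):
        return INT32_MAX, True                 # default result, flag set
    rounded = ROUNDERS[rounding_mode](value)
    if not INT32_MIN <= rounded <= INT32_MAX:
        return INT32_MAX, True                 # out of range: same default result
    return rounded, False
```

Note that the default result is 2^31-1 even when the source is a large negative value, because the text defines a single default for every Invalid Operation case.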
Operation:
    StoreFPR(fd, W, ConvertFmt(ValueFPR(fs, fmt), fmt, W))

Exceptions:
    Coprocessor Unusable
    Reserved Instruction
    Floating-Point: Invalid Operation, Unimplemented Operation, Inexact, Overflow

Encoding (bits 31..0):
    COP1 = 010001 (31..26) | fmt (25..21) | 0 = 00000 (20..16) | fs (15..11) | fd (10..6) | CVT.W = 100100 (5..0)

DIV.fmt  Floating-Point Divide

Format:
    DIV.S fd, fs, ft        MIPS I
    DIV.D fd, fs, ft

Purpose: To divide FP values.

Description: fd ← fs / ft
The value in FPR fs is divided by the value in FPR ft. The result is calculated to infinite precision, rounded according to the current rounding mode in FCSR, and placed into FPR fd. The operands and result are values in format fmt.

Restrictions:
The fields fs, ft, and fd must specify FPRs valid for operands of type fmt; see Floating-Point Registers on page B-6. If they are not valid, the result is undefined.
The operands must be values in format fmt; see section B.7 on page B-24. If they are not, the result is undefined and the value of the operand FPRs becomes undefined.

Operation:
    StoreFPR(fd, fmt, ValueFPR(fs, fmt) / ValueFPR(ft, fmt))

Exceptions:
    Coprocessor Unusable
    Reserved Instruction
    Floating-Point: Inexact, Unimplemented Operation, Division-by-zero, Invalid Operation, Overflow, Underflow

Encoding (bits 31..0):
    COP1 = 010001 (31..26) | fmt (25..21) | ft (20..16) | fs (15..11) | fd (10..6) | DIV = 000011 (5..0)

DMFC1  Doubleword Move From Floating-Point

Format:
    DMFC1 rt, fs        MIPS III

Purpose: To copy a doubleword from an FPR to a GPR.

Description: rt ← fs
The doubleword contents of FPR fs are placed into GPR rt.

If the coprocessor 1 general registers are 32-bits wide (a native 32-bit processor or 32-bit register emulation mode in a 64-bit processor), FPR fs is held in an even/odd register pair. The low word is taken from the even register fs and the high word is from fs+1.
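The even/odd pairing described for DMFC1 can be sketched as follows. This is a hypothetical Python model: `dmfc1`, the `fgr` list of raw register values, and the `fgr_bits` parameter are illustrative names, and raising `ValueError` simply stands in for the architecture's undefined result when an odd register is named with 32-bit registers.

```python
def dmfc1(fgr, fs, fgr_bits):
    """Return the 64-bit doubleword that would be copied from FPR fs.

    fgr is a list of raw register bit patterns; fgr_bits is 32 or 64.
    """
    if fgr_bits == 64:
        return fgr[fs] & 0xFFFFFFFFFFFFFFFF
    if fs % 2 == 0:
        low = fgr[fs] & 0xFFFFFFFF        # even register holds the low word
        high = fgr[fs + 1] & 0xFFFFFFFF   # odd register fs+1 holds the high word
        return (high << 32) | low
    raise ValueError("odd fs with 32-bit registers: result is undefined")
```

With 32-bit registers holding 0xDEADBEEF in register 2 and 0x01234567 in register 3, `dmfc1(fgr, 2, 32)` returns 0x01234567DEADBEEF.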
Restrictions:
If fs does not specify an FPR that can contain a doubleword, the result is undefined; see Floating-Point Registers on page B-6.
For MIPS III, the contents of GPR rt are undefined for the instruction immediately following DMFC1.

Operation: MIPS I - III
    I:     if SizeFGR() = 64 then      /* 64-bit wide FGRs */
               data ← FGR[fs]
           elseif fs_0 = 0 then        /* valid specifier, 32-bit wide FGRs */
               data ← FGR[fs+1] || FGR[fs]
           else                        /* undefined for odd 32-bit FGRs */
               UndefinedResult()
           endif
    I + 1: GPR[rt] ← data

Operation: MIPS IV
    if SizeFGR() = 64 then             /* 64-bit wide FGRs */
        data ← FGR[fs]
    elseif fs_0 = 0 then               /* valid specifier, 32-bit wide FGRs */
        data ← FGR[fs+1] || FGR[fs]
    else                               /* undefined for odd 32-bit FGRs */
        UndefinedResult()
    endif
    GPR[rt] ← data

Exceptions:
    Reserved Instruction
    Coprocessor Unusable

Encoding (bits 31..0):
    COP1 = 010001 (31..26) | DMF = 00001 (25..21) | rt (20..16) | fs (15..11) | 0 = 00000000000 (10..0)

DMTC1  Doubleword Move To Floating-Point

Format:
    DMTC1 rt, fs        MIPS III

Purpose: To copy a doubleword from a GPR to an FPR.

Description: fs ← rt
The doubleword contents of GPR rt are placed into FPR fs.

If coprocessor 1 general registers are 32-bits wide (a native 32-bit processor or 32-bit register emulation mode in a 64-bit processor), FPR fs is held in an even/odd register pair. The low word is placed in the even register fs and the high word is placed in fs+1.

Restrictions:
If fs does not specify an FPR that can contain a doubleword, the result is undefined; see Floating-Point Registers on page B-6.
For MIPS III, the contents of FPR fs are undefined for the instruction immediately following DMTC1.
Operation: MIPS I - III
    I:     data ← GPR[rt]
    I + 1: if SizeFGR() = 64 then    /* 64-bit wide FGRs */
               FGR[fs] ← data
           elseif fs_0 = 0 then      /* valid specifier, 32-bit wide FGRs */
               FGR[fs+1] ← data_{63..32}
               FGR[fs] ← data_{31..0}
           else                      /* undefined result for odd 32-bit FGRs */
               UndefinedResult()
           endif
Operation: MIPS IV
    data ← GPR[rt]
    if SizeFGR() = 64 then    /* 64-bit wide FGRs */
        FGR[fs] ← data
    elseif fs_0 = 0 then      /* valid specifier, 32-bit wide FGRs */
        FGR[fs+1] ← data_{63..32}
        FGR[fs] ← data_{31..0}
    else                      /* undefined result for odd 32-bit FGRs */
        UndefinedResult()
    endif
Encoding (bits 31..0): COP1 = 010001 [31..26] | DMT = 00101 [25..21] | rt [20..16] | fs [15..11] | 0 = 00000000000 [10..0]
Exceptions: Reserved Instruction; Coprocessor Unusable

FLOOR.L.fmt  Floating-Point Floor Convert to Long Fixed-Point
Format: FLOOR.L.S fd, fs    (MIPS III)
        FLOOR.L.D fd, fs
Purpose: To convert an FP value to 64-bit fixed-point, rounding down.
Description: fd ← convert_and_round(fs)
The value in FPR fs, in format fmt, is converted to a value in 64-bit long fixed-point format, rounding toward -∞ (rounding mode 3). The result is placed in FPR fd.
When the source value is Infinity, NaN, or rounds to an integer outside the range -2^63 to 2^63-1, the result cannot be represented correctly and an IEEE Invalid Operation condition exists. The result depends on the FP exception model currently active.
• Precise exception model: The Invalid Operation flag is set in the FCSR. If the Invalid Operation enable bit is set in the FCSR, no result is written to fd and an Invalid Operation exception is taken immediately. Otherwise, the default result, 2^63-1, is written to fd.
• Imprecise exception model (R8000 normal mode): The default result, 2^63-1, is written to fd. No FCSR flag is set.
If the Invalid Operation enable bit is set in the FCSR, an Invalid Operation exception is taken, imprecisely, at some future time.
Restrictions: The fields fs and fd must specify valid FPRs; fs for type fmt and fd for long fixed-point; see Floating-Point Registers on page B-6. If they are not valid, the result is undefined. The operand must be a value in format fmt; see section B 7 on page B-24. If it is not, the result is undefined and the value of the operand FPR becomes undefined.
Operation:
    StoreFPR(fd, L, ConvertFmt(ValueFPR(fs, fmt), fmt, L))
Exceptions: Coprocessor Unusable; Reserved Instruction; Floating-Point (Invalid Operation, Unimplemented Operation, Inexact, Overflow)
Encoding (bits 31..0): COP1 = 010001 [31..26] | fmt [25..21] | 0 = 00000 [20..16] | fs [15..11] | fd [10..6] | FLOOR.L = 001011 [5..0]

FLOOR.W.fmt  Floating-Point Floor Convert to Word Fixed-Point
Format: FLOOR.W.S fd, fs    (MIPS II)
        FLOOR.W.D fd, fs
Purpose: To convert an FP value to 32-bit fixed-point, rounding down.
Description: fd ← convert_and_round(fs)
The value in FPR fs, in format fmt, is converted to a value in 32-bit word fixed-point format, rounding toward -∞ (rounding mode 3). The result is placed in FPR fd.
When the source value is Infinity, NaN, or rounds to an integer outside the range -2^31 to 2^31-1, the result cannot be represented correctly and an IEEE Invalid Operation condition exists. The result depends on the FP exception model currently active.
• Precise exception model: The Invalid Operation flag is set in the FCSR. If the Invalid Operation enable bit is set in the FCSR, no result is written to fd and an Invalid Operation exception is taken immediately. Otherwise, the default result, 2^31-1, is written to fd.
• Imprecise exception model (R8000 normal mode): The default result, 2^31-1, is written to fd. No FCSR flag is set. If the Invalid Operation enable bit is set in the FCSR, an Invalid Operation exception is taken, imprecisely, at some future time.
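The FLOOR.W behavior described above under the precise exception model can be sketched in Python as follows. This is a simplified model under stated assumptions: the function name floor_w and the invalid_enabled parameter are hypothetical, the FCSR flag is represented by a returned boolean, and the exception is modeled as a Python ArithmeticError.

```python
import math

INT32_MIN = -2**31
INT32_MAX = 2**31 - 1

def floor_w(value, invalid_enabled=False):
    """Return (result, invalid_flag) for a round-toward-minus-infinity
    conversion to a 32-bit word (precise exception model only)."""
    if math.isnan(value) or math.isinf(value):
        representable = False
    else:
        result = math.floor(value)             # rounding mode 3: toward -infinity
        representable = INT32_MIN <= result <= INT32_MAX
    if representable:
        return result, False
    if invalid_enabled:
        # Enable bit set: no result is written and the exception is taken.
        raise ArithmeticError("Invalid Operation")
    # Enable bit clear: flag is set and the default result 2^31-1 is written.
    return INT32_MAX, True

print(floor_w(-1.5))          # rounds down, away from zero for negatives
print(floor_w(3.0e9))         # out of range, default result
```

Note that the default result is 2^31-1 for every unrepresentable input in this model, including negative infinity, because the text above names a single default value.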
Restrictions: The fields fs and fd must specify valid FPRs; fs for type fmt and fd for word fixed-point; see Floating-Point Registers on page B-6. If they are not valid, the result is undefined. The operand must be a value in format fmt; see section B 7 on page B-24. If it is not, the result is undefined and the value of the operand FPR becomes undefined.
Operation:
    StoreFPR(fd, W, ConvertFmt(ValueFPR(fs, fmt), fmt, W))
Exceptions: Coprocessor Unusable; Reserved Instruction; Floating-Point (Invalid Operation, Unimplemented Operation, Inexact, Overflow)
Encoding (bits 31..0): COP1 = 010001 [31..26] | fmt [25..21] | 0 = 00000 [20..16] | fs [15..11] | fd [10..6] | FLOOR.W = 001111 [5..0]

LDC1  Load Doubleword to Floating-Point
Format: LDC1 ft, offset(base)    (MIPS II)
Purpose: To load a doubleword from memory to an FPR.
Description: ft ← memory[base+offset]
The contents of the 64-bit doubleword at the memory location specified by the aligned effective address are fetched and placed in FPR ft. The 16-bit signed offset is added to the contents of GPR base to form the effective address.
If coprocessor 1 general registers are 32 bits wide (a native 32-bit processor or 32-bit register emulation mode in a 64-bit processor), FPR ft is held in an even/odd register pair. The low word is placed in the even register ft and the high word is placed in ft+1.
Restrictions: If ft does not specify an FPR that can contain a doubleword, the result is undefined; see Floating-Point Registers on page B-6. An Address Error exception occurs if EffectiveAddress_{2..0} ≠ 0 (not doubleword-aligned). MIPS IV: The low-order 3 bits of the offset field must be zero. If they are not, the result of the instruction is undefined.
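The effective-address calculation and alignment check described above for LDC1 can be sketched in Python. The helper names sign_extend16 and ldc1_address, the addr_bits parameter, and the use of ValueError to stand in for the Address Error exception are all assumptions made for illustration.

```python
def sign_extend16(offset):
    """Interpret the low 16 bits of offset as a signed two's-complement value."""
    offset &= 0xFFFF
    return offset - 0x10000 if offset & 0x8000 else offset

def ldc1_address(base, offset, addr_bits=64):
    """Form the LDC1 effective address and check doubleword alignment."""
    vaddr = (base + sign_extend16(offset)) & ((1 << addr_bits) - 1)
    if vaddr & 0x7:                      # low three address bits must be zero
        raise ValueError("Address Error: not doubleword-aligned")
    return vaddr

print(hex(ldc1_address(0x1000, 8)))       # positive offset
print(hex(ldc1_address(0x1000, 0xFFF8)))  # 0xFFF8 encodes an offset of -8
```

The second call shows why the offset must be sign-extended rather than zero-extended: the 16-bit field 0xFFF8 moves the address down by 8 bytes.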
Operation:
    vAddr ← sign_extend(offset) + GPR[base]
    if vAddr_{2..0} ≠ 0^3 then SignalException(AddressError) endif
    (pAddr, uncached) ← AddressTranslation (vAddr, DATA, LOAD)
    data ← LoadMemory(uncached, DOUBLEWORD, pAddr, vAddr, DATA)
    if SizeFGR() = 64 then    /* 64-bit wide FGRs */
        FGR[ft] ← data
    elseif ft_0 = 0 then      /* valid specifier, 32-bit wide FGRs */
        FGR[ft+1] ← data_{63..32}
        FGR[ft] ← data_{31..0}
    else                      /* undefined result for odd 32-bit FGRs */
        UndefinedResult()
    endif
Exceptions: Coprocessor Unusable; Reserved Instruction; TLB Refill; TLB Invalid; Address Error
Encoding (bits 31..0): LDC1 = 110101 [31..26] | base [25..21] | ft [20..16] | offset [15..0]

LDXC1  Load Doubleword Indexed to Floating-Point
Format: LDXC1 fd, index(base)    (MIPS IV)
Purpose: To load a doubleword from memory to an FPR (GPR+GPR addressing).
Description: fd ← memory[base+index]
The contents of the 64-bit doubleword at the memory location specified by the aligned effective address are fetched and placed in FPR fd. The contents of GPR index and GPR base are added to form the effective address.
If coprocessor 1 general registers are 32 bits wide (a native 32-bit processor or 32-bit register emulation mode in a 64-bit processor), FPR fd is held in an even/odd register pair. The low word is placed in the even register fd and the high word is placed in fd+1.
Restrictions: If fd does not specify an FPR that can contain a doubleword, the result is undefined; see Floating-Point Registers on page B-6. The Region bits of the effective address must be supplied by the contents of base. If EffectiveAddress_{63..62} ≠ base_{63..62}, the result is undefined. An Address Error exception occurs if EffectiveAddress_{2..0} ≠ 0 (not doubleword-aligned). MIPS IV: The low-order 3 bits of the offset field must be zero. If they are not, the result of the instruction is undefined.
Operation:
    vAddr ← GPR[base] + GPR[index]
    if vAddr_{2..0} ≠ 0^3 then SignalException(AddressError) endif
    (pAddr, uncached) ← AddressTranslation (vAddr, DATA, LOAD)
    data ← LoadMemory(uncached, DOUBLEWORD, pAddr, vAddr, DATA)
    if SizeFGR() = 64 then    /* 64-bit wide FGRs */
        FGR[fd] ← data
    elseif fd_0 = 0 then      /* valid specifier, 32-bit wide FGRs */
        FGR[fd+1] ← data_{63..32}
        FGR[fd] ← data_{31..0}
    else                      /* undefined result for odd 32-bit FGRs */
        UndefinedResult()
    endif
Encoding (bits 31..0): COP1X = 010011 [31..26] | base [25..21] | index [20..16] | 0 = 00000 [15..11] | fd [10..6] | LDXC1 = 000001 [5..0]
Exceptions: TLB Refill; TLB Invalid; Address Error; Reserved Instruction; Coprocessor Unusable

LWC1  Load Word to Floating-Point
Format: LWC1 ft, offset(base)    (MIPS I)
Purpose: To load a word from memory to an FPR.
Description: ft ← memory[base+offset]
The contents of the 32-bit word at the memory location specified by the aligned effective address are fetched and placed into the low word of coprocessor 1 general register ft. The 16-bit signed offset is added to the contents of GPR base to form the effective address.
If coprocessor 1 general registers are 64 bits wide, bits 63..32 of register ft become undefined. See Floating-Point Registers on page B-6.
Restrictions: An Address Error exception occurs if EffectiveAddress_{1..0} ≠ 0 (not word-aligned). MIPS IV: The low-order 2 bits of the offset field must be zero. If they are not, the result of the instruction is undefined.
Operation: 32-bit Processors
    I:     /* “mem” is aligned 64-bits from memory. Pick out correct bytes. */
           vAddr ← sign_extend(offset) + GPR[base]
           if vAddr_{1..0} ≠ 0^2 then SignalException(AddressError) endif
           (pAddr, uncached) ← AddressTranslation (vAddr, DATA, LOAD)
           mem ← LoadMemory(uncached, WORD, pAddr, vAddr, DATA)
    I + 1: FGR[ft] ← mem
Operation: 64-bit Processors
    /* “mem” is aligned 64-bits from memory.
       Pick out correct bytes. */
    vAddr ← sign_extend(offset) + GPR[base]
    if vAddr_{1..0} ≠ 0^2 then SignalException(AddressError) endif
    (pAddr, uncached) ← AddressTranslation (vAddr, DATA, LOAD)
    pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor (ReverseEndian || 0^2))
    mem ← LoadMemory(uncached, WORD, pAddr, vAddr, DATA)
    bytesel ← vAddr_{2..0} xor (BigEndianCPU || 0^2)
    if SizeFGR() = 64 then    /* 64-bit wide FGRs */
        FGR[ft] ← undefined^32 || mem_{31+8*bytesel..8*bytesel}
    else                      /* 32-bit wide FGRs */
        FGR[ft] ← mem_{31+8*bytesel..8*bytesel}
    endif
Encoding (bits 31..0): LWC1 = 110001 [31..26] | base [25..21] | ft [20..16] | offset [15..0]
Exceptions: Coprocessor Unusable; Reserved Instruction; TLB Refill; TLB Invalid; Address Error

LWXC1  Load Word Indexed to Floating-Point
Format: LWXC1 fd, index(base)    (MIPS IV)
Purpose: To load a word from memory to an FPR (GPR+GPR addressing).
Description: fd ← memory[base+index]
The contents of the 32-bit word at the memory location specified by the aligned effective address are fetched and placed into the low word of coprocessor 1 general register fd. The contents of GPR index and GPR base are added to form the effective address.
If coprocessor 1 general registers are 64 bits wide, bits 63..32 of register fd become undefined. See Floating-Point Registers on page B-6.
Restrictions: The Region bits of the effective address must be supplied by the contents of base. If EffectiveAddress_{63..62} ≠ base_{63..62}, the result is undefined. An Address Error exception occurs if EffectiveAddress_{1..0} ≠ 0 (not word-aligned). MIPS IV: The low-order 2 bits of the offset field must be zero. If they are not, the result of the instruction is undefined.
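The word-selection step in the 64-bit LWC1 operation above (bytesel, then a 32-bit slice of the aligned doubleword) can be sketched in Python. The function name select_word and its parameters are hypothetical, and the sketch assumes vaddr is already word-aligned, as the Address Error check guarantees.

```python
def select_word(mem64, vaddr, big_endian_cpu):
    """Pick the addressed 32-bit word out of an aligned 64-bit doubleword.

    Mirrors: bytesel = vAddr[2..0] xor (BigEndianCPU || 00)
             word    = mem[31+8*bytesel .. 8*bytesel]
    """
    bytesel = (vaddr & 0x7) ^ ((1 if big_endian_cpu else 0) << 2)
    return (mem64 >> (8 * bytesel)) & 0xFFFFFFFF

mem = 0x1122334455667788
print(hex(select_word(mem, 0, big_endian_cpu=False)))
print(hex(select_word(mem, 0, big_endian_cpu=True)))
```

On a little-endian CPU the word at offset 0 is the low half of the doubleword; on a big-endian CPU the XOR with 4 flips the selection so offset 0 yields the high half.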
Operation:
    vAddr ← GPR[base] + GPR[index]
    if vAddr_{1..0} ≠ 0^2 then SignalException(AddressError) endif
    (pAddr, uncached) ← AddressTranslation (vAddr, DATA, LOAD)
    pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor (ReverseEndian || 0^2))
    /* “mem” is aligned 64-bits from memory. Pick out correct bytes. */
    mem ← LoadMemory(uncached, WORD, pAddr, vAddr, DATA)
    bytesel ← vAddr_{2..0} xor (BigEndianCPU || 0^2)
    if SizeFGR() = 64 then    /* 64-bit wide FGRs */
        FGR[fd] ← undefined^32 || mem_{31+8*bytesel..8*bytesel}
    else                      /* 32-bit wide FGRs */
        FGR[fd] ← mem_{31+8*bytesel..8*bytesel}
    endif
Exceptions: TLB Refill; TLB Invalid; Address Error; Reserved Instruction; Coprocessor Unusable
Encoding (bits 31..0): COP1X = 010011 [31..26] | base [25..21] | index [20..16] | 0 = 00000 [15..11] | fd [10..6] | LWXC1 = 000000 [5..0]

MADD.fmt  Floating-Point Multiply Add
Format: MADD.S fd, fr, fs, ft    (MIPS IV)
        MADD.D fd, fr, fs, ft
Purpose: To perform a combined multiply-then-add of FP values.
Description: fd ← (fs × ft) + fr
The value in FPR fs is multiplied by the value in FPR ft to produce a product. The value in FPR fr is added to the product. The result sum is calculated to infinite precision, rounded according to the current rounding mode in FCSR, and placed into FPR fd. The operands and result are values in format fmt.
The accuracy of the result depends on which of two alternative arithmetic models is used by the implementation for the computation. The numeric models are explained in Arithmetic Instructions on page B-21.
Restrictions: The fields fr, fs, ft, and fd must specify FPRs valid for operands of type fmt; see Floating-Point Registers on page B-6. If they are not valid, the result is undefined. The operands must be values in format fmt; see section B 7 on page B-24. If they are not, the result is undefined and the value of the operand FPRs becomes undefined.
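To see why the arithmetic model matters for MADD, the following Python sketch compares two ways of computing (fs × ft) + fr in double precision. It assumes the two models differ in whether the product is rounded before the addition; the authoritative definitions are in Arithmetic Instructions on page B-21. The function names are hypothetical, and exact rational arithmetic stands in for "infinite precision".

```python
from fractions import Fraction

def madd_single_rounding(fr, fs, ft):
    """Compute (fs * ft) + fr exactly, then round once to a double."""
    exact = Fraction(fs) * Fraction(ft) + Fraction(fr)
    return float(exact)

def madd_two_roundings(fr, fs, ft):
    """Round the product to a double first, then round the sum."""
    return fs * ft + fr

fs = 1.0 + 2.0**-30
ft = 1.0 - 2.0**-30
fr = -1.0
print(madd_single_rounding(fr, fs, ft))   # keeps the tiny residual -2**-60
print(madd_two_roundings(fr, fs, ft))     # the product rounds to 1.0, so the sum is 0.0
```

The exact product here is 1 - 2^-60, which is closer to 1.0 than to any other double, so rounding it before the add discards the entire answer.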
Operation:
    vfr ← ValueFPR(fr, fmt)
    vfs ← ValueFPR(fs, fmt)
    vft ← ValueFPR(ft, fmt)
    StoreFPR(fd, fmt, vfr + vfs * vft)
Exceptions: Coprocessor Unusable; Reserved Instruction; Floating-Point (Inexact, Unimplemented Operation, Invalid Operation, Overflow, Underflow)
Encoding (bits 31..0): COP1X = 010011 [31..26] | fr [25..21] | ft [20..16] | fs [15..11] | fd [10..6] | MADD = 100 [5..3] | fmt [2..0]

MFC1  Move Word From Floating-Point
Format: MFC1 rt, fs    (MIPS I)
Purpose: To copy a word from an FPU (CP1) general register to a GPR.
Description: rt ← fs
The low word from FPR fs is placed into the low word of GPR rt. If GPR rt is 64 bits wide, then the value is sign extended. See Floating-Point Registers on page B-6.
Restrictions: For MIPS I, MIPS II, and MIPS III the contents of GPR rt are undefined for the instruction immediately following MFC1.
Operation: MIPS I - III
    I:     word ← FGR[fs]_{31..0}
    I + 1: GPR[rt] ← sign_extend(word)
Operation: MIPS IV
    word ← FGR[fs]_{31..0}
    GPR[rt] ← sign_extend(word)
Exceptions: Coprocessor Unusable
Encoding (bits 31..0): COP1 = 010001 [31..26] | MF = 00000 [25..21] | rt [20..16] | fs [15..11] | 0 = 00000000000 [10..0]

MOV.fmt  Floating-Point Move
Format: MOV.S fd, fs    (MIPS I)
        MOV.D fd, fs
Purpose: To move an FP value between FPRs.
Description: fd ← fs
The value in FPR fs is placed into FPR fd. The source and destination are values in format fmt. The move is non-arithmetic; it causes no IEEE 754 exceptions.
Restrictions: The fields fs and fd must specify FPRs valid for operands of type fmt; see Floating-Point Registers on page B-6. If they are not valid, the result is undefined. The operand must be a value in format fmt; see section B 7 on page B-24. If it is not, the result is undefined and the value of the operand FPR becomes undefined.
Operation:
    StoreFPR(fd, fmt, ValueFPR(fs, fmt))
Exceptions: Coprocessor Unusable; Reserved Instruction; Floating-Point (Unimplemented Operation)
Encoding (bits 31..0): COP1 = 010001 [31..26] | fmt [25..21] | 0 = 00000 [20..16] | fs [15..11] | fd [10..6] | MOV = 000110 [5..0]

MOVF  Move Conditional on FP False
Format: MOVF rd, rs, cc    (MIPS IV)
Purpose: To test an FP condition code then conditionally move a GPR.
Description: if (cc = 0) then rd ← rs
If the floating-point condition code specified by cc is zero, then the contents of GPR rs are placed into GPR rd.
Restrictions: None
Operation:
    active ← FCC[cc] = tf
    if active then
        GPR[rd] ← GPR[rs]
    endif
Exceptions: Reserved Instruction; Coprocessor Unusable
Encoding (bits 31..0): SPECIAL = 000000 [31..26] | rs [25..21] | cc [20..18] | 0 [17] | tf = 0 [16] | rd [15..11] | 0 = 00000 [10..6] | MOVCI = 000001 [5..0]

MOVF.fmt  Floating-Point Move Conditional on FP False
Format: MOVF.S fd, fs, cc    (MIPS IV)
        MOVF.D fd, fs, cc
Purpose: To test an FP condition code then conditionally move an FP value.
Description: if (cc = 0) then fd ← fs
If the floating-point condition code specified by cc is zero, then the value in FPR fs is placed into FPR fd. The source and destination are values in format fmt.
If the condition code is not zero, then FPR fs is not copied and FPR fd contains its previous value in format fmt. If fd did not contain a value either in format fmt or previously unused data from a load or move-to operation that could be interpreted in format fmt, then the value of fd becomes undefined.
The move is non-arithmetic; it causes no IEEE 754 exceptions.
Restrictions: The fields fs and fd must specify FPRs valid for operands of type fmt; see Floating-Point Registers on page B-6. If they are not valid, the result is undefined. The operand must be a value in format fmt; see section B 7 on page B-24. If it is not, the result is undefined and the value of the operand FPR becomes undefined.
Operation:
    if FCC[cc] = tf then
        StoreFPR(fd, fmt, ValueFPR(fs, fmt))
    else
        StoreFPR(fd, fmt, ValueFPR(fd, fmt))
    endif
Exceptions: Coprocessor Unusable; Reserved Instruction; Floating-Point (Unimplemented Operation)
Encoding (bits 31..0): COP1 = 010001 [31..26] | fmt [25..21] | cc [20..18] | 0 [17] | tf = 0 [16] | fs [15..11] | fd [10..6] | MOVCF = 010001 [5..0]

MOVN.fmt  Floating-Point Move Conditional on Not Zero
Format: MOVN.S fd, fs, rt    (MIPS IV)
        MOVN.D fd, fs, rt
Purpose: To test a GPR then conditionally move an FP value.
Description: if (rt ≠ 0) then fd ← fs
If the value in GPR rt is not equal to zero, then the value in FPR fs is placed in FPR fd. The source and destination are values in format fmt.
If GPR rt contains zero, then FPR fs is not copied and FPR fd contains its previous value in format fmt. If fd did not contain a value either in format fmt or previously unused data from a load or move-to operation that could be interpreted in format fmt, then the value of fd becomes undefined.
The move is non-arithmetic; it causes no IEEE 754 exceptions.
Restrictions: The fields fs and fd must specify FPRs valid for operands of type fmt; see Floating-Point Registers on page B-6. If they are not valid, the result is undefined. The operand must be a value in format fmt; see section B 7 on page B-24. If it is not, the result is undefined and the value of the operand FPR becomes undefined.
Operation:
    if GPR[rt] ≠ 0 then
        StoreFPR(fd, fmt, ValueFPR(fs, fmt))
    else
        StoreFPR(fd, fmt, ValueFPR(fd, fmt))
    endif
Exceptions: Coprocessor Unusable; Reserved Instruction; Floating-Point (Unimplemented Operation)
Encoding (bits 31..0): COP1 = 010001 [31..26] | fmt [25..21] | rt [20..16] | fs [15..11] | fd [10..6] | MOVN = 010011 [5..0]

MOVT  Move Conditional on FP True
Format: MOVT rd, rs, cc    (MIPS IV)
Purpose: To test an FP condition code then conditionally move a GPR.
Description: if (cc = 1) then rd ← rs
If the floating-point condition code specified by cc is one, then the contents of GPR rs are placed into GPR rd.
Restrictions: None
Operation:
    if FCC[cc] = tf then
        GPR[rd] ← GPR[rs]
    endif
Exceptions: Reserved Instruction; Coprocessor Unusable
Encoding (bits 31..0): SPECIAL = 000000 [31..26] | rs [25..21] | cc [20..18] | 0 [17] | tf = 1 [16] | rd [15..11] | 0 = 00000 [10..6] | MOVCI = 000001 [5..0]

MOVT.fmt  Floating-Point Move Conditional on FP True
Format: MOVT.S fd, fs, cc    (MIPS IV)
        MOVT.D fd, fs, cc
Purpose: To test an FP condition code then conditionally move an FP value.
Description: if (cc = 1) then fd ← fs
If the floating-point condition code specified by cc is one, then the value in FPR fs is placed into FPR fd. The source and destination are values in format fmt.
If the condition code is not one, then FPR fs is not copied and FPR fd contains its previous value in format fmt. If fd did not contain a value either in format fmt or previously unused data from a load or move-to operation that could be interpreted in format fmt, then the value of fd becomes undefined.
The move is non-arithmetic; it causes no IEEE 754 exceptions.
Restrictions: The fields fs and fd must specify FPRs valid for operands of type fmt; see Floating-Point Registers on page B-6. If they are not valid, the result is undefined. The operand must be a value in format fmt; see section B 7 on page B-24. If it is not, the result is undefined and the value of the operand FPR becomes undefined.
Operation:
    if FCC[cc] = tf then
        StoreFPR(fd, fmt, ValueFPR(fs, fmt))
    else
        StoreFPR(fd, fmt, ValueFPR(fd, fmt))
    endif
Exceptions: Coprocessor Unusable; Reserved Instruction; Floating-Point (Unimplemented Operation)
Encoding (bits 31..0): COP1 = 010001 [31..26] | fmt [25..21] | cc [20..18] | 0 [17] | tf = 1 [16] | fs [15..11] | fd [10..6] | MOVCF = 010001 [5..0]

MOVZ.fmt  Floating-Point Move Conditional on Zero
Format: MOVZ.S fd, fs, rt    (MIPS IV)
        MOVZ.D fd, fs, rt
Purpose: To test a GPR then conditionally move an FP value.
Description: if (rt = 0) then fd ← fs
If the value in GPR rt is equal to zero, then the value in FPR fs is placed in FPR fd. The source and destination are values in format fmt.
If GPR rt is not zero, then FPR fs is not copied and FPR fd contains its previous value in format fmt. If fd did not contain a value either in format fmt or previously unused data from a load or move-to operation that could be interpreted in format fmt, then the value of fd becomes undefined.
The move is non-arithmetic; it causes no IEEE 754 exceptions.
Restrictions: The fields fs and fd must specify FPRs valid for operands of type fmt; see Floating-Point Registers on page B-6. If they are not valid, the result is undefined. The operand must be a value in format fmt; see section B 7 on page B-24. If it is not, the result is undefined and the value of the operand FPR becomes undefined.
Operation:
    if GPR[rt] = 0 then
        StoreFPR(fd, fmt, ValueFPR(fs, fmt))
    else
        StoreFPR(fd, fmt, ValueFPR(fd, fmt))
    endif
Exceptions: Coprocessor Unusable; Reserved Instruction; Floating-Point (Unimplemented Operation)
Encoding (bits 31..0): COP1 = 010001 [31..26] | fmt [25..21] | rt [20..16] | fs [15..11] | fd [10..6] | MOVZ = 010010 [5..0]

MSUB.fmt  Floating-Point Multiply Subtract
Format: MSUB.S fd, fr, fs, ft    (MIPS IV)
        MSUB.D fd, fr, fs, ft
Purpose: To perform a combined multiply-then-subtract of FP values.
Description: fd ← (fs × ft) - fr
The value in FPR fs is multiplied by the value in FPR ft to produce an intermediate product. The value in FPR fr is subtracted from the product. The subtraction result is calculated to infinite precision, rounded according to the current rounding mode in FCSR, and placed into FPR fd. The operands and result are values in format fmt.
The accuracy of the result depends on which of two alternative arithmetic models is used by the implementation for the computation. The numeric models are explained in Arithmetic Instructions on page B-21.
Restrictions: The fields fr, fs, ft, and fd must specify FPRs valid for operands of type fmt; see Floating-Point Registers on page B-6. If they are not valid, the result is undefined. The operands must be values in format fmt; see section B 7 on page B-24. If they are not, the result is undefined and the value of the operand FPRs becomes undefined.
Operation:
    vfr ← ValueFPR(fr, fmt)
    vfs ← ValueFPR(fs, fmt)
    vft ← ValueFPR(ft, fmt)
    StoreFPR(fd, fmt, (vfs * vft) - vfr)
Exceptions: Reserved Instruction; Coprocessor Unusable; Floating-Point (Inexact, Unimplemented Operation, Invalid Operation, Overflow, Underflow)
Encoding (bits 31..0): COP1X = 010011 [31..26] | fr [25..21] | ft [20..16] | fs [15..11] | fd [10..6] | MSUB = 101 [5..3] | fmt [2..0]

MTC1  Move Word to Floating-Point
Format: MTC1 rt, fs    (MIPS I)
Purpose: To copy a word from a GPR to an FPU (CP1) general register.
Description: fs ← rt
The low word in GPR rt is placed into the low word of floating-point (coprocessor 1) general register fs. If coprocessor 1 general registers are 64 bits wide, bits 63..32 of register fs become undefined. See Floating-Point Registers on page B-6.
Restrictions: For MIPS I, MIPS II, and MIPS III the value of FPR fs is undefined for the instruction immediately following MTC1.
Operation: MIPS I - III
    I:     data ← GPR[rt]_{31..0}
    I + 1: if SizeFGR() = 64 then    /* 64-bit wide FGRs */
               FGR[fs] ← undefined^32 || data
           else                      /* 32-bit wide FGRs */
               FGR[fs] ← data
           endif
Operation: MIPS IV
    data ← GPR[rt]_{31..0}
    if SizeFGR() = 64 then    /* 64-bit wide FGRs */
        FGR[fs] ← undefined^32 || data
    else                      /* 32-bit wide FGRs */
        FGR[fs] ← data
    endif
Exceptions: Coprocessor Unusable
Encoding (bits 31..0): COP1 = 010001 [31..26] | MT = 00100 [25..21] | rt [20..16] | fs [15..11] | 0 = 00000000000 [10..0]

MUL.fmt  Floating-Point Multiply
Format: MUL.S fd, fs, ft    (MIPS I)
        MUL.D fd, fs, ft
Purpose: To multiply FP values.
Description: fd ← fs × ft
The value in FPR fs is multiplied by the value in FPR ft. The result is calculated to infinite precision, rounded according to the current rounding mode in FCSR, and placed into FPR fd. The operands and result are values in format fmt.
Restrictions: The fields fs, ft, and fd must specify FPRs valid for operands of type fmt; see Floating-Point Registers on page B-6. If they are not valid, the result is undefined. The operands must be values in format fmt; see section B 7 on page B-24. If they are not, the result is undefined and the value of the operand FPRs becomes undefined.
Operation:
    StoreFPR(fd, fmt, ValueFPR(fs, fmt) * ValueFPR(ft, fmt))
Exceptions: Coprocessor Unusable; Reserved Instruction; Floating-Point (Inexact, Unimplemented Operation, Invalid Operation, Overflow, Underflow)
Encoding (bits 31..0): COP1 = 010001 [31..26] | fmt [25..21] | ft [20..16] | fs [15..11] | fd [10..6] | MUL = 000010 [5..0]

NEG.fmt  Floating-Point Negate
Format: NEG.S fd, fs    (MIPS I)
        NEG.D fd, fs
Purpose: To negate an FP value.
Description: fd ← - (fs)
The value in FPR fs is negated and placed into FPR fd. The value is negated by changing the sign bit value. The operand and result are values in format fmt. This operation is arithmetic; a NaN operand signals invalid operation.
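Negation "by changing the sign bit value", as described above for NEG.D, can be illustrated in Python by flipping bit 63 of the IEEE 754 double encoding. The function name neg_d is hypothetical, and this sketch deliberately omits the NaN check that makes the real instruction arithmetic.

```python
import struct

def neg_d(x):
    """Negate a double by toggling its sign bit (bit 63 of the encoding)."""
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]   # float -> raw 64 bits
    bits ^= 1 << 63                                       # flip only the sign bit
    return struct.unpack("<d", struct.pack("<Q", bits))[0]

print(neg_d(1.5))
print(neg_d(0.0))    # flipping the sign bit of +0.0 gives -0.0
```

Because only the sign bit changes, the magnitude is untouched and zero keeps its sign information, which a subtraction from zero would not preserve.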
Restrictions: The fields fs and fd must specify FPRs valid for operands of type fmt; see Floating-Point Registers on page B-6. If they are not valid, the result is undefined. The operand must be a value in format fmt; see section B 7 on page B-24. If it is not, the result is undefined and the value of the operand FPR becomes undefined.
Operation:
    StoreFPR(fd, fmt, Negate(ValueFPR(fs, fmt)))
Exceptions: Coprocessor Unusable; Reserved Instruction; Floating-Point (Unimplemented Operation, Invalid Operation)
Encoding (bits 31..0): COP1 = 010001 [31..26] | fmt [25..21] | 0 = 00000 [20..16] | fs [15..11] | fd [10..6] | NEG = 000111 [5..0]

NMADD.fmt  Floating-Point Negative Multiply Add
Format: NMADD.S fd, fr, fs, ft    (MIPS IV)
        NMADD.D fd, fr, fs, ft
Purpose: To negate a combined multiply-then-add of FP values.
Description: fd ← - ((fs × ft) + fr)
The value in FPR fs is multiplied by the value in FPR ft to produce an intermediate product. The value in FPR fr is added to the product. The result sum is calculated to infinite precision, rounded according to the current rounding mode in FCSR, negated by changing the sign bit, and placed into FPR fd. The operands and result are values in format fmt.
The accuracy of the result depends on which of two alternative arithmetic models is used by the implementation for the computation. The numeric models are explained in Arithmetic Instructions on page B-21.
Restrictions: The fields fr, fs, ft, and fd must specify FPRs valid for operands of type fmt; see Floating-Point Registers on page B-6. If they are not valid, the result is undefined. The operands must be values in format fmt; see section B 7 on page B-24. If they are not, the result is undefined and the value of the operand FPRs becomes undefined.
Operation:
    vfr ← ValueFPR(fr, fmt)
    vfs ← ValueFPR(fs, fmt)
    vft ← ValueFPR(ft, fmt)
    StoreFPR(fd, fmt, -(vfr + vfs * vft))
Exceptions: Coprocessor Unusable; Reserved Instruction; Floating-Point (Inexact, Unimplemented Operation, Invalid Operation, Overflow, Underflow)
Encoding (bits 31..0): COP1X = 010011 [31..26] | fr [25..21] | ft [20..16] | fs [15..11] | fd [10..6] | NMADD = 110 [5..3] | fmt [2..0]

NMSUB.fmt  Floating-Point Negative Multiply Subtract
Format: NMSUB.S fd, fr, fs, ft    (MIPS IV)
        NMSUB.D fd, fr, fs, ft
Purpose: To negate a combined multiply-then-subtract of FP values.
Description: fd ← - ((fs × ft) - fr)
The value in FPR fs is multiplied by the value in FPR ft to produce an intermediate product. The value in FPR fr is subtracted from the product. The result is calculated to infinite precision, rounded according to the current rounding mode in FCSR, negated by changing the sign bit, and placed into FPR fd. The operands and result are values in format fmt.
The accuracy of the result depends on which of two alternative arithmetic models is used by the implementation for the computation. The numeric models are explained in Arithmetic Instructions on page B-21.
Restrictions: The fields fr, fs, ft, and fd must specify FPRs valid for operands of type fmt; see Floating-Point Registers on page B-6. If they are not valid, the result is undefined. The operands must be values in format fmt; see section B 7 on page B-24. If they are not, the result is undefined and the value of the operand FPRs becomes undefined.
Operation:
    vfr ← ValueFPR(fr, fmt)
    vfs ← ValueFPR(fs, fmt)
    vft ← ValueFPR(ft, fmt)
    StoreFPR(fd, fmt, -((vfs * vft) - vfr))
Exceptions: Reserved Instruction; Coprocessor Unusable; Floating-Point (Inexact, Unimplemented Operation, Invalid Operation, Overflow, Underflow)
Encoding (bits 31..0): COP1X = 010011 [31..26] | fr [25..21] | ft [20..16] | fs [15..11] | fd [10..6] | NMSUB = 111 [5..3] | fmt [2..0]

PREFX  Prefetch Indexed
Format: PREFX hint, index(base)    (MIPS IV)
Purpose: To prefetch locations from memory (GPR+GPR addressing).
Description: prefetch_memory[base+index]
PREFX adds the contents of GPR index to the contents of GPR base to form an effective byte address. It advises that data at the effective address may be used in the near future. The hint field supplies information about the way that the data is expected to be used.
PREFX is an advisory instruction. It may change the performance of the program. For all hint values, it neither changes architecturally-visible state nor alters the meaning of the program. An implementation may do nothing when executing a PREFX instruction.
If MIPS IV instructions are supported and enabled and Coprocessor 1 is enabled (allowing access to CP1X), PREFX does not cause addressing-related exceptions. If it raises an exception condition, the exception condition is ignored. If an addressing-related exception condition is raised and ignored, no data will be prefetched. Even if no data is prefetched in such a case, some action that is not architecturally-visible, such as writeback of a dirty cache line, might take place.
PREFX will never generate a memory operation for a location with an uncached memory access type (see Memory Access Types on page A-12).
If PREFX results in a memory operation, the memory access type used for the operation is determined by the memory access type of the effective address, just as it would be if the memory operation had been caused by a load or store to the effective address.
PREFX enables the processor to take some action, typically prefetching the data into cache, to improve program performance. The action taken for a specific PREFX instruction is both system and context dependent. Any action, including doing nothing, is permitted that does not change architecturally-visible state or alter the meaning of a program.
It is expected that implementations will either do nothing or take an action that will increase the performance of the program. For a cached location, the expected, and useful, action is for the processor to prefetch a block of data that includes the effective address. The size of the block, and the level of the memory hierarchy it is fetched into, are implementation specific.
Encoding (bits 31..0): COP1X = 010011 [31..26] | base [25..21] | index [20..16] | hint [15..11] | 0 = 00000 [10..6] | PREFX = 001111 [5..0]
The hint field supplies information about the way the data is expected to be used. No hint value causes an action that modifies architecturally-visible state. A processor may use a hint value to improve the effectiveness of the prefetch action. The defined hint values and the recommended prefetch action are shown in the table below. The hint table may be extended in future implementations.
Restrictions: The Region bits of the effective address must be supplied by the contents of base. If EffectiveAddress_{63..62} ≠ base_{63..62}, the result of the instruction is undefined.
Operation:
    vAddr ← GPR[base] + GPR[index]
    (pAddr, uncached) ← AddressTranslation(vAddr, DATA, LOAD)
    Prefetch(uncached, pAddr, vAddr, DATA, hint)

Table B-22 Values of Hint Field for Prefetch Instruction
Value | Name | Data use and desired prefetch action
0 | load | Data is expected to be loaded (not modified). Fetch data as if for a load.
1 | store | Data is expected to be stored or modified. Fetch data as if for a store.
2-3 | | Not yet defined.
4 | load_streamed | Data is expected to be loaded (not modified) but not reused extensively; it will “stream” through cache. Fetch data as if for a load and place it in the cache so that it will not displace data prefetched as “retained”.
5 | store_streamed | Data is expected to be stored or modified but not reused extensively; it will “stream” through cache.
                       Fetch data as if for a store and place it in the cache so that it will not displace data prefetched as “retained”.
6      load_retained   Data is expected to be loaded (not modified) and reused extensively; it should be “retained” in the cache. Fetch data as if for a load and place it in the cache so that it will not be displaced by data prefetched as “streamed”.
7      store_retained  Data is expected to be stored or modified and reused extensively; it should be “retained” in the cache. Fetch data as if for a store and place it in the cache so that it will not be displaced by data prefetched as “streamed”.
8-31                   Not yet defined.
Exceptions:
    Reserved Instruction
    Coprocessor Unusable
Programming Notes:
Prefetch cannot prefetch data from a mapped location unless the translation for that location is present in the TLB. Locations in memory pages that have not been accessed recently may not have translations in the TLB, so prefetch may not be effective for such locations.
Prefetch does not cause addressing exceptions. It will not cause an exception to prefetch using an address pointer value before the validity of a pointer is determined.
Implementation Notes:
It is recommended that a reserved hint field value either cause a default prefetch action that is expected to be useful for most cases of data use, such as the “load” hint, or cause the instruction to be treated as a NOP.

RECIP.fmt  Reciprocal Approximation
Format: RECIP.S fd, fs    MIPS IV
        RECIP.D fd, fs
Purpose: To approximate the reciprocal of an FP value (quickly).
Description: fd ← 1.0 / fs
The reciprocal of the value in FPR fs is approximated and placed into FPR fd. The operand and result are values in format fmt.
The numeric accuracy of this operation is implementation dependent; it does not meet the accuracy specified by the IEEE 754 Floating-Point standard.
The computed result differs from both the exact result and the IEEE-mandated representation of the exact result by no more than one unit in the least-significant place (ulp).
It is implementation dependent whether the result is affected by the current rounding mode in FCSR.
Restrictions:
The fields fs and fd must specify FPRs valid for operands of type fmt; see Floating-Point Registers on page B-6. If they are not valid, the result is undefined.
The operand must be a value in format fmt; see section B.7 on page B-24. If it is not, the result is undefined and the value of the operand FPR becomes undefined.
Operation:
    StoreFPR(fd, fmt, 1.0 / ValueFPR(fs, fmt))
Exceptions:
    Coprocessor Unusable
    Reserved Instruction
    Floating-Point: Inexact, Unimplemented Operation, Division-by-zero, Invalid Operation, Overflow, Underflow
Encoding: bits 31..26 COP1 (010001) | bits 25..21 fmt | bits 20..16 0 (00000) | bits 15..11 fs | bits 10..6 fd | bits 5..0 RECIP (010101)

ROUND.L.fmt  Floating-Point Round to Long Fixed-Point
Format: ROUND.L.S fd, fs    MIPS III
        ROUND.L.D fd, fs
Purpose: To convert an FP value to 64-bit fixed-point, rounding to nearest.
Description: fd ← convert_and_round(fs)
The value in FPR fs, in format fmt, is converted to a value in 64-bit long fixed-point format, rounding to nearest/even (rounding mode 0). The result is placed in FPR fd.
When the source value is Infinity, NaN, or rounds to an integer outside the range -2^63 to 2^63-1, the result cannot be represented correctly and an IEEE Invalid Operation condition exists. The result depends on the FP exception model currently active.
• Precise exception model: The Invalid Operation flag is set in the FCSR. If the Invalid Operation enable bit is set in the FCSR, no result is written to fd and an Invalid Operation exception is taken immediately. Otherwise, the default result, 2^63-1, is written to fd.
• Imprecise exception model (R8000 normal mode): The default result, 2^63-1, is written to fd.
  No FCSR flag is set. If the Invalid Operation enable bit is set in the FCSR, an Invalid Operation exception is taken, imprecisely, at some future time.
Restrictions:
The fields fs and fd must specify valid FPRs; fs for type fmt and fd for long fixed-point; see Floating-Point Registers on page B-6. If they are not valid, the result is undefined.
The operand must be a value in format fmt; see section B.7 on page B-24. If it is not, the result is undefined and the value of the operand FPR becomes undefined.
Operation:
    StoreFPR(fd, L, ConvertFmt(ValueFPR(fs, fmt), fmt, L))
Exceptions:
    Coprocessor Unusable
    Reserved Instruction
    Floating-Point: Inexact, Unimplemented Operation, Overflow, Invalid Operation
Encoding: bits 31..26 COP1 (010001) | bits 25..21 fmt | bits 20..16 0 (00000) | bits 15..11 fs | bits 10..6 fd | bits 5..0 ROUND.L (001000)

ROUND.W.fmt  Floating-Point Round to Word Fixed-Point
Format: ROUND.W.S fd, fs    MIPS II
        ROUND.W.D fd, fs
Purpose: To convert an FP value to 32-bit fixed-point, rounding to nearest.
Description: fd ← convert_and_round(fs)
The value in FPR fs, in format fmt, is converted to a value in 32-bit word fixed-point format, rounding to nearest/even (rounding mode 0). The result is placed in FPR fd.
When the source value is Infinity, NaN, or rounds to an integer outside the range -2^31 to 2^31-1, the result cannot be represented correctly and an IEEE Invalid Operation condition exists. The result depends on the FP exception model currently active.
• Precise exception model: The Invalid Operation flag is set in the FCSR. If the Invalid Operation enable bit is set in the FCSR, no result is written to fd and an Invalid Operation exception is taken immediately. Otherwise, the default result, 2^31-1, is written to fd.
• Imprecise exception model (R8000 normal mode): The default result, 2^31-1, is written to fd. No FCSR flag is set.
  If the Invalid Operation enable bit is set in the FCSR, an Invalid Operation exception is taken, imprecisely, at some future time.
Restrictions:
The fields fs and fd must specify valid FPRs; fs for type fmt and fd for word fixed-point; see Floating-Point Registers on page B-6. If they are not valid, the result is undefined.
The operand must be a value in format fmt; see section B.7 on page B-24. If it is not, the result is undefined and the value of the operand FPR becomes undefined.
Operation:
    StoreFPR(fd, W, ConvertFmt(ValueFPR(fs, fmt), fmt, W))
Exceptions:
    Coprocessor Unusable
    Reserved Instruction
    Floating-Point: Inexact, Unimplemented Operation, Invalid Operation, Overflow
Encoding: bits 31..26 COP1 (010001) | bits 25..21 fmt | bits 20..16 0 (00000) | bits 15..11 fs | bits 10..6 fd | bits 5..0 ROUND.W (001100)

RSQRT.fmt  Reciprocal Square Root Approximation
Format: RSQRT.S fd, fs    MIPS IV
        RSQRT.D fd, fs
Purpose: To approximate the reciprocal of the square root of an FP value (quickly).
Description: fd ← 1.0 / sqrt(fs)
The reciprocal of the positive square root of the value in FPR fs is approximated and placed into FPR fd. The operand and result are values in format fmt.
The numeric accuracy of this operation is implementation dependent; it does not meet the accuracy specified by the IEEE 754 Floating-Point standard. The computed result differs from both the exact result and the IEEE-mandated representation of the exact result by no more than two units in the least-significant place (ulp).
It is implementation dependent whether the result is affected by the current rounding mode in FCSR.
Restrictions:
The fields fs and fd must specify FPRs valid for operands of type fmt; see Floating-Point Registers on page B-6. If they are not valid, the result is undefined.
The operand must be a value in format fmt; see section B.7 on page B-24. If it is not, the result is undefined and the value of the operand FPR becomes undefined.
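To make the two-ulp accuracy bound stated for RSQRT concrete, the following Python sketch measures how far a candidate result lies from a reference value, in ulps. It is only an illustration under stated assumptions: the helper names `ulp_error` and `within_rsqrt_bound` are hypothetical, Python floats stand in for the D format, and the reference `1.0 / math.sqrt(x)` is itself a rounded double, so it only approximates the exact result.

```python
import math


def ulp_error(approx: float, reference: float) -> float:
    """Return |approx - reference| in units of the reference's last place."""
    return abs(approx - reference) / math.ulp(reference)


def within_rsqrt_bound(approx: float, x: float, max_ulps: float = 2.0) -> bool:
    """Check a candidate reciprocal square root of x against an ulp bound.

    Assumption: x is a positive, finite double. The reference value is
    computed in double precision and is therefore only nearly exact.
    """
    reference = 1.0 / math.sqrt(x)
    return ulp_error(approx, reference) <= max_ulps
```

A hardware approximation would be fed in as `approx`; a value one ulp away from the reference passes the check, while one four ulps away does not.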
Operation:
    StoreFPR(fd, fmt, 1.0 / SquareRoot(ValueFPR(fs, fmt)))
Exceptions:
    Coprocessor Unusable
    Reserved Instruction
    Floating-Point: Inexact, Unimplemented Operation, Division-by-zero, Invalid Operation, Overflow, Underflow
Encoding: bits 31..26 COP1 (010001) | bits 25..21 fmt | bits 20..16 0 (00000) | bits 15..11 fs | bits 10..6 fd | bits 5..0 RSQRT (010110)

SDC1  Store Doubleword from Floating-Point
Format: SDC1 ft, offset(base)    MIPS II
Purpose: To store a doubleword from an FPR to memory.
Description: memory[base+offset] ← ft
The 64-bit doubleword in FPR ft is stored in memory at the location specified by the aligned effective address. The 16-bit signed offset is added to the contents of GPR base to form the effective address.
If coprocessor 1 general registers are 32 bits wide (a native 32-bit processor or 32-bit register emulation mode in a 64-bit processor), FPR ft is held in an even/odd register pair. The low word is taken from the even register ft and the high word is from ft+1.
Restrictions:
If ft does not specify an FPR that can contain a doubleword, the result is undefined; see Floating-Point Registers on page B-6.
An Address Error exception occurs if EffectiveAddress_{2..0} ≠ 0 (not doubleword-aligned).
MIPS IV: The low-order 3 bits of the offset field must be zero. If they are not, the result of the instruction is undefined.
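The address calculation and alignment restriction described for SDC1 can be sketched in Python as follows. This is a minimal model, not simulator code: `AddressError`, `sign_extend16`, and `sdc1_effective_address` are hypothetical names, and the 64-bit address width is an assumed default.

```python
class AddressError(Exception):
    """Stands in for the Address Error exception (hypothetical name)."""


def sign_extend16(value: int) -> int:
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def sdc1_effective_address(base_value: int, offset: int, addr_bits: int = 64) -> int:
    """Form base + sign_extend(offset) and enforce doubleword alignment."""
    vaddr = (base_value + sign_extend16(offset)) & ((1 << addr_bits) - 1)
    if vaddr & 0x7:  # the low three address bits must all be zero
        raise AddressError(f"0x{vaddr:x} is not doubleword-aligned")
    return vaddr
```

For example, a base of 0x1000 with the offset field 0xFFF8 (that is, -8) gives the aligned address 0xFF8, whereas an offset of 4 raises `AddressError`.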
Operation:
    vAddr ← sign_extend(offset) + GPR[base]
    if vAddr_{2..0} ≠ 0^3 then SignalException(AddressError) endif
    (pAddr, uncached) ← AddressTranslation(vAddr, DATA, STORE)
    if SizeFGR() = 64 then      /* 64-bit wide FGRs */
        data ← FGR[ft]
    elseif ft_0 = 0 then        /* valid specifier, 32-bit wide FGRs */
        data ← FGR[ft+1] || FGR[ft]
    else                        /* undefined for odd 32-bit FGRs */
        UndefinedResult()
    endif
    StoreMemory(uncached, DOUBLEWORD, data, pAddr, vAddr, DATA)
Exceptions:
    Coprocessor Unusable
    Reserved Instruction
    TLB Refill, TLB Invalid
    TLB Modified
    Address Error
Encoding: bits 31..26 SDC1 (111101) | bits 25..21 base | bits 20..16 ft | bits 15..0 offset

SDXC1  Store Doubleword Indexed from Floating-Point
Format: SDXC1 fs, index(base)    MIPS IV
Purpose: To store a doubleword from an FPR to memory (GPR+GPR addressing).
Description: memory[base+index] ← fs
The 64-bit doubleword in FPR fs is stored in memory at the location specified by the aligned effective address. The contents of GPR index and GPR base are added to form the effective address.
If coprocessor 1 general registers are 32 bits wide (a native 32-bit processor or 32-bit register emulation mode in a 64-bit processor), FPR fs is held in an even/odd register pair. The low word is taken from the even register fs and the high word is from fs+1.
Restrictions:
If fs does not specify an FPR that can contain a doubleword, the result is undefined; see Floating-Point Registers on page B-6.
The Region bits of the effective address must be supplied by the contents of base. If EffectiveAddress_{63..62} ≠ base_{63..62}, the result is undefined.
An Address Error exception occurs if EffectiveAddress_{2..0} ≠ 0 (not doubleword-aligned).
MIPS IV: The low-order 3 bits of the offset field must be zero. If they are not, the result of the instruction is undefined.
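Two points from the SDXC1 description lend themselves to a short Python sketch: the Region-bits restriction on base + index addressing, and the way an even/odd register pair forms one doubleword. The function names below are hypothetical, and a 64-bit address space is assumed.

```python
from typing import Tuple

MASK64 = (1 << 64) - 1


def region_bits(addr: int) -> int:
    """Return bits 63..62 of a 64-bit address."""
    return (addr >> 62) & 0x3


def indexed_address(base_value: int, index_value: int) -> Tuple[int, bool]:
    """Add base and index; report whether the Region bits still match base."""
    vaddr = (base_value + index_value) & MASK64
    return vaddr, region_bits(vaddr) == region_bits(base_value & MASK64)


def doubleword_from_pair(even_word: int, odd_word: int) -> int:
    """Join a 32-bit register pair: the odd register supplies the high word."""
    return ((odd_word & 0xFFFFFFFF) << 32) | (even_word & 0xFFFFFFFF)
```

When the second element returned by `indexed_address` is False, the index has carried the sum into a different region, which is the case the restriction declares undefined.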
Operation:
    vAddr ← GPR[base] + GPR[index]
    if vAddr_{2..0} ≠ 0^3 then SignalException(AddressError) endif
    (pAddr, uncached) ← AddressTranslation(vAddr, DATA, STORE)
    if SizeFGR() = 64 then      /* 64-bit wide FGRs */
        data ← FGR[fs]
    elseif fs_0 = 0 then        /* valid specifier, 32-bit wide FGRs */
        data ← FGR[fs+1] || FGR[fs]
    else                        /* undefined for odd 32-bit FGRs */
        UndefinedResult()
    endif
    StoreMemory(uncached, DOUBLEWORD, data, pAddr, vAddr, DATA)
Encoding: bits 31..26 COP1X (010011) | bits 25..21 base | bits 20..16 index | bits 15..11 fs | bits 10..6 0 | bits 5..0 SDXC1 (001001)
Exceptions:
    TLB Refill, TLB Invalid
    TLB Modified
    Address Error
    Reserved Instruction
    Coprocessor Unusable

SQRT.fmt  Floating-Point Square Root
Format: SQRT.S fd, fs    MIPS II
        SQRT.D fd, fs
Purpose: To compute the square root of an FP value.
Description: fd ← SQRT(fs)
The square root of the value in FPR fs is calculated to infinite precision, rounded according to the current rounding mode in FCSR, and placed into FPR fd. The operand and result are values in format fmt.
If the value in FPR fs corresponds to -0, the result will be -0.
Restrictions:
If the value in FPR fs is less than 0, an Invalid Operation condition is raised.
The fields fs and fd must specify FPRs valid for operands of type fmt; see Floating-Point Registers on page B-6. If they are not valid, the result is undefined.
The operand must be a value in format fmt; see section B.7 on page B-24. If it is not, the result is undefined and the value of the operand FPR becomes undefined.
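The two special cases called out for SQRT (negative zero and negative operands) can be modelled in a few lines of Python. This is a sketch under assumptions: `InvalidOperation` and `fp_sqrt` are hypothetical names, Python floats stand in for the D format, only round-to-nearest is modelled, and the FCSR flag and enable bits are not represented.

```python
import math


class InvalidOperation(ArithmeticError):
    """Stands in for the IEEE Invalid Operation condition (hypothetical name)."""


def fp_sqrt(value: float) -> float:
    """Model SQRT.D: operands less than zero are invalid, -0.0 yields -0.0."""
    if value < 0.0:  # -0.0 < 0.0 is False, so negative zero passes through
        raise InvalidOperation(f"square root of negative value {value!r}")
    return math.sqrt(value)
```

Calling `fp_sqrt(-0.0)` returns a zero whose sign bit is still set, which can be confirmed with `math.copysign(1.0, result)`.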
Operation:
    StoreFPR(fd, fmt, SquareRoot(ValueFPR(fs, fmt)))
Exceptions:
    Coprocessor Unusable
    Reserved Instruction
    Floating-Point: Unimplemented Operation, Invalid Operation, Inexact
Encoding: bits 31..26 COP1 (010001) | bits 25..21 fmt | bits 20..16 0 (00000) | bits 15..11 fs | bits 10..6 fd | bits 5..0 SQRT (000100)

SUB.fmt  Floating-Point Subtract
Format: SUB.S fd, fs, ft    MIPS I
        SUB.D fd, fs, ft
Purpose: To subtract FP values.
Description: fd ← fs - ft
The value in FPR ft is subtracted from the value in FPR fs. The result is calculated to infinite precision, rounded according to the current rounding mode in FCSR, and placed into FPR fd. The operands and result are values in format fmt.
Restrictions:
The fields fs, ft, and fd must specify FPRs valid for operands of type fmt; see Floating-Point Registers on page B-6. If they are not valid, the result is undefined.
The operands must be values in format fmt; see section B.7 on page B-24. If they are not, the result is undefined and the value of the operand FPRs becomes undefined.
Operation:
    StoreFPR(fd, fmt, ValueFPR(fs, fmt) - ValueFPR(ft, fmt))
Exceptions:
    Coprocessor Unusable
    Reserved Instruction
    Floating-Point: Inexact, Unimplemented Operation, Invalid Operation, Overflow, Underflow
Encoding: bits 31..26 COP1 (010001) | bits 25..21 fmt | bits 20..16 ft | bits 15..11 fs | bits 10..6 fd | bits 5..0 SUB (000001)

SWC1  Store Word from Floating-Point
Format: SWC1 ft, offset(base)    MIPS I
Purpose: To store a word from an FPR to memory.
Description: memory[base+offset] ← ft
The low 32-bit word from FPR ft is stored in memory at the location specified by the aligned effective address. The 16-bit signed offset is added to the contents of GPR base to form the effective address.
Restrictions:
An Address Error exception occurs if EffectiveAddress_{1..0} ≠ 0 (not word-aligned).
MIPS IV: The low-order 2 bits of the offset field must be zero. If they are not, the result of the instruction is undefined.
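As a sketch of how the assembly form `SWC1 ft, offset(base)` maps onto a 32-bit machine word, the Python functions below pack and unpack the fields of the SWC1 encoding (opcode 111001 in bits 31..26, then base, ft, and a 16-bit offset). The function names are hypothetical and the range checks are an assumption about what an assembler would enforce.

```python
SWC1_OPCODE = 0b111001  # primary opcode value, bits 31..26


def encode_swc1(ft: int, base: int, offset: int) -> int:
    """Pack SWC1 ft, offset(base) into a 32-bit instruction word."""
    if not (0 <= ft < 32 and 0 <= base < 32):
        raise ValueError("register numbers must be in 0..31")
    if not -0x8000 <= offset <= 0x7FFF:
        raise ValueError("offset must fit in 16 signed bits")
    return (SWC1_OPCODE << 26) | (base << 21) | (ft << 16) | (offset & 0xFFFF)


def decode_immediate(word: int) -> tuple:
    """Split an immediate-format word into (opcode, base, ft, signed offset)."""
    opcode = (word >> 26) & 0x3F
    base = (word >> 21) & 0x1F
    ft = (word >> 16) & 0x1F
    offset = word & 0xFFFF
    if offset & 0x8000:  # sign-extend the 16-bit offset
        offset -= 0x10000
    return opcode, base, ft, offset
```

For instance, `encode_swc1(ft=2, base=29, offset=-8)` yields 0xE7A2FFF8, and `decode_immediate` recovers the same four fields from that word.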
Operation: 32-bit Processors
    vAddr ← sign_extend(offset) + GPR[base]
    if vAddr_{1..0} ≠ 0^2 then SignalException(AddressError) endif
    (pAddr, uncached) ← AddressTranslation(vAddr, DATA, STORE)
    data ← FGR[ft]
    StoreMemory(uncached, WORD, data, pAddr, vAddr, DATA)
Operation: 64-bit Processors
    vAddr ← sign_extend(offset) + GPR[base]
    if vAddr_{1..0} ≠ 0^2 then SignalException(AddressError) endif
    (pAddr, uncached) ← AddressTranslation(vAddr, DATA, STORE)
    pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor (ReverseEndian || 0^2))
    bytesel ← vAddr_{2..0} xor (BigEndianCPU || 0^2)
    /* the bytes of the word are moved into the correct byte lanes */
    if SizeFGR() = 64 then      /* 64-bit wide FGRs */
        data ← 0^{32-8*bytesel} || FGR[ft]_{31..0} || 0^{8*bytesel}    /* top or bottom wd of 64-bit data */
    else                        /* 32-bit wide FGRs */
        data ← 0^{32-8*bytesel} || FGR[ft] || 0^{8*bytesel}    /* top or bottom wd of 64-bit data */
    endif
    StoreMemory(uncached, WORD, data, pAddr, vAddr, DATA)
Exceptions:
    Coprocessor Unusable
    Reserved Instruction
    TLB Refill, TLB Invalid
    TLB Modified
    Address Error
Encoding: bits 31..26 SWC1 (111001) | bits 25..21 base | bits 20..16 ft | bits 15..0 offset

SWXC1  Store Word Indexed from Floating-Point
Format: SWXC1 fs, index(base)    MIPS IV
Purpose: To store a word from an FPR to memory (GPR+GPR addressing).
Description: memory[base+index] ← fs
The low 32-bit word from FPR fs is stored in memory at the location specified by the aligned effective address. The contents of GPR index and GPR base are added to form the effective address.
Restrictions:
The Region bits of the effective address must be supplied by the contents of base. If EffectiveAddress_{63..62} ≠ base_{63..62}, the result is undefined.
An Address Error exception occurs if EffectiveAddress_{1..0} ≠ 0 (not word-aligned).
MIPS IV: The low-order 2 bits of the offset field must be zero. If they are not, the result of the instruction is undefined.
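The byte-lane step used by the store-word operations on 64-bit processors (the `bytesel` and `data` assignments in the pseudocode) can be sketched in Python as below. The function name `store_word_lane` is hypothetical, and the sketch models only where the 32-bit word lands inside the 64-bit data value, not address translation or the memory access itself.

```python
def store_word_lane(vaddr: int, word: int, big_endian_cpu: bool) -> int:
    """Place a 32-bit word in the top or bottom half of a 64-bit data value.

    bytesel is the low three address bits xor (BigEndianCPU || 00); the word
    is then shifted left by 8 * bytesel bits, with zeros everywhere else.
    """
    if vaddr & 0x3:
        raise ValueError("address must be word-aligned")
    bytesel = (vaddr & 0x7) ^ (int(big_endian_cpu) << 2)
    return (word & 0xFFFFFFFF) << (8 * bytesel)
```

With a little-endian CPU, a word stored at an address ending in binary 000 occupies the low half of the doubleword and one ending in 100 occupies the high half; for a big-endian CPU the halves are swapped.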
Operation:
    vAddr ← GPR[base] + GPR[index]
    if vAddr_{1..0} ≠ 0^2 then SignalException(AddressError) endif
    (pAddr, uncached) ← AddressTranslation(vAddr, DATA, STORE)
    pAddr ← pAddr_{PSIZE-1..3} || (pAddr_{2..0} xor (ReverseEndian || 0^2))
    bytesel ← vAddr_{2..0} xor (BigEndianCPU || 0^2)
    /* the bytes of the word are moved into the correct byte lanes */
    if SizeFGR() = 64 then      /* 64-bit wide FGRs */
        data ← 0^{32-8*bytesel} || FGR[fs]_{31..0} || 0^{8*bytesel}    /* top or bottom wd of 64-bit data */
    else                        /* 32-bit wide FGRs */
        data ← 0^{32-8*bytesel} || FGR[fs] || 0^{8*bytesel}    /* top or bottom wd of 64-bit data */
    endif
    StoreMemory(uncached, WORD, data, pAddr, vAddr, DATA)
Exceptions:
    TLB Refill, TLB Invalid
    TLB Modified
    Address Error
    Reserved Instruction
    Coprocessor Unusable
Encoding: bits 31..26 COP1X (010011) | bits 25..21 base | bits 20..16 index | bits 15..11 fs | bits 10..6 0 | bits 5..0 SWXC1 (001000)

TRUNC.L.fmt  Floating-Point Truncate to Long Fixed-Point
Format: TRUNC.L.S fd, fs    MIPS III
        TRUNC.L.D fd, fs
Purpose: To convert an FP value to 64-bit fixed-point, rounding toward zero.
Description: fd ← convert_and_round(fs)
The value in FPR fs, in format fmt, is converted to a value in 64-bit long fixed-point format, rounding toward zero (rounding mode 1). The result is placed in FPR fd.
When the source value is Infinity, NaN, or rounds to an integer outside the range -2^63 to 2^63-1, the result cannot be represented correctly and an IEEE Invalid Operation condition exists. The result depends on the FP exception model currently active.
• Precise exception model: The Invalid Operation flag is set in the FCSR. If the Invalid Operation enable bit is set in the FCSR, no result is written to fd and an Invalid Operation exception is taken immediately. Otherwise, the default result, 2^63-1, is written to fd.
• Imprecise exception model (R8000 normal mode): The default result, 2^63-1, is written to fd. No FCSR flag is set.
  If the Invalid Operation enable bit is set in the FCSR, an Invalid Operation exception is taken, imprecisely, at some future time.
Restrictions:
The fields fs and fd must specify valid FPRs; fs for type fmt and fd for long fixed-point; see Floating-Point Registers on page B-6. If they are not valid, the result is undefined.
The operand must be a value in format fmt; see section B.7 on page B-24. If it is not, the result is undefined and the value of the operand FPR becomes undefined.
Operation:
    StoreFPR(fd, L, ConvertFmt(ValueFPR(fs, fmt), fmt, L))
Exceptions:
    Coprocessor Unusable
    Reserved Instruction
    Floating-Point: Inexact, Unimplemented Operation, Invalid Operation, Overflow
Encoding: bits 31..26 COP1 (010001) | bits 25..21 fmt | bits 20..16 0 (00000) | bits 15..11 fs | bits 10..6 fd | bits 5..0 TRUNC.L (001001)

TRUNC.W.fmt  Floating-Point Truncate to Word Fixed-Point
Format: TRUNC.W.S fd, fs    MIPS II
        TRUNC.W.D fd, fs
Purpose: To convert an FP value to 32-bit fixed-point, rounding toward zero.
Description: fd ← convert_and_round(fs)
The value in FPR fs, in format fmt, is converted to a value in 32-bit word fixed-point format, rounding toward zero (rounding mode 1). The result is placed in FPR fd.
When the source value is Infinity, NaN, or rounds to an integer outside the range -2^31 to 2^31-1, the result cannot be represented correctly and an IEEE Invalid Operation condition exists. The result depends on the FP exception model currently active.
• Precise exception model: The Invalid Operation flag is set in the FCSR. If the Invalid Operation enable bit is set in the FCSR, no result is written to fd and an Invalid Operation exception is taken immediately. Otherwise, the default result, 2^31-1, is written to fd.
• Imprecise exception model (R8000 normal mode): The default result, 2^31-1, is written to fd. No FCSR flag is set.
  If the Invalid Operation enable bit is set in the FCSR, an Invalid Operation exception is taken, imprecisely, at some future time.
Restrictions:
The fields fs and fd must specify valid FPRs; fs for type fmt and fd for word fixed-point; see Floating-Point Registers on page B-6. If they are not valid, the result is undefined.
The operand must be a value in format fmt; see section B.7 on page B-24. If it is not, the result is undefined and the value of the operand FPR becomes undefined.
Operation:
    StoreFPR(fd, W, ConvertFmt(ValueFPR(fs, fmt), fmt, W))
Exceptions:
    Coprocessor Unusable
    Reserved Instruction
    Floating-Point: Inexact, Invalid Operation, Overflow, Unimplemented Operation
Encoding: bits 31..26 COP1 (010001) | bits 25..21 fmt | bits 20..16 0 (00000) | bits 15..11 fs | bits 10..6 fd | bits 5..0 TRUNC.W (001101)

B.11 FPU Instruction Formats
An FPU instruction is a single 32-bit aligned word. The distinct FP instruction layouts are shown in Figure B-16. Variable information is in lower-case labels, such as “offset”. Upper-case labels and any numbers indicate constant data. A table follows all the layouts that explains the fields used in them. Note that the same field may have different names in different instruction layout pictures. The field name is mnemonic to the function of that field in the instruction layout.
The opcode tables and the instruction decode discussion use the canonical field names: opcode, fmt, nd, tf, and function. The other fields are not used for instruction decode.
Figure B-16 FPU Instruction Formats
Immediate: load/store using register + offset addressing.
    opcode (bits 31..26) | base (25..21) | ft (20..16) | offset (15..0)
Register: 2-register and 3-register formatted arithmetic operations.
    COP1 (bits 31..26) | fmt (25..21) | ft (20..16) | fs (15..11) | fd (10..6) | function (5..0)
Register Immediate: data transfer -- CPU ↔ FPU register.
    COP1 (bits 31..26) | sub (25..21) | rt (20..16) | fs (15..11) | 0 (10..0)
Condition code, Immediate: conditional branches on FPU cc using PC + offset.
    COP1 (bits 31..26) | BC (25..21) | cc (20..18) | nd (17) | tf (16) | offset (15..0)
Register to Condition Code: formatted FP compare.
    COP1 (bits 31..26) | fmt (25..21) | ft (20..16) | fs (15..11) | cc (10..8) | 0 (7..6) | function (5..0)
Condition Code, Register FP: FPU register move-conditional on FP cc.
    COP1 (bits 31..26) | fmt (25..21) | cc (20..18) | 0 (17) | tf (16) | fs (15..11) | fd (10..6) | MOVCF (5..0)
Register-4: 4-register formatted arithmetic operations.
    COP1X (bits 31..26) | fr (25..21) | ft (20..16) | fs (15..11) | fd (10..6) | op4 (5..3) | fmt3 (2..0); op4 and fmt3 together form the function field
Register Index: Load/store using register + register addressing.
    COP1X (bits 31..26) | base (25..21) | index (20..16) | 0 (15..11) | fd (10..6) | function (5..0)
Register Index hint: Prefetch using register + register addressing.
    COP1X (bits 31..26) | base (25..21) | index (20..16) | hint (15..11) | 0 (10..6) | PREFX (5..0)
Condition Code, Register Integer: CPU register move-conditional on FP cc.
    SPECIAL (bits 31..26) | rs (25..21) | cc (20..18) | 0 (17) | tf (16) | rd (15..11) | 0 (10..6) | MOVCI (5..0)
Fields used in the instruction layouts:
BC        Branch Conditional instruction subcode (op=COP1)
base      CPU register: base address for address calculations
COP1      Coprocessor 1 primary opcode value in op field.
COP1X     Coprocessor 1 eXtended primary opcode value in op field.
cc        condition code specifier. For architecture levels prior to MIPS IV it must be zero.
fd        FPU register: destination (arithmetic, loads, move-to) or source (stores, move-from)
fmt       destination and/or operand type (“format”) specifier
fr        FPU register: source
fs        FPU register: source
ft        FPU register: source (for stores, arithmetic) or destination (for loads)
function  function field specifying a function within a particular op operation code.
function: op4 + fmt3
          op4 is a 3-bit function field specifying which 4-register arithmetic operation for COP1X, fmt3 is a 3-bit field specifying the format of the operands and destination.
          The combinations are shown as several distinct instructions in the opcode tables.
hint      hint field made available to cache controller for prefetch operation
index     CPU register, holds index address component for address calculations
MOVC      Value in function field for conditional move. There is one value for the instruction with op=COP1, another for the instruction with op=SPECIAL.
nd        nullify delay. If set, branch is Likely and delay slot instruction is not executed. This must be zero for MIPS I.
offset    signed offset field used in address calculations
op        primary operation code (COP1, COP1X, LWC1, SWC1, LDC1, SDC1, SPECIAL)
PREFX     Value in function field for prefetch instruction for op=COP1X
rd        CPU register: destination
rs        CPU register: source
rt        CPU register: source / destination
SPECIAL   SPECIAL primary opcode value in op field.
sub       Operation subcode field for COP1 register immediate mode instructions.
tf        true/false. The condition from FP compare is tested for equality with tf bit.

B.12 FPU (CP1) Instruction Opcode Bit Encoding
This section describes the encoding of the Floating-Point Unit (FPU) instructions for the four levels of the MIPS architecture, MIPS I through MIPS IV. Each architecture level includes the instructions in the previous level;† MIPS IV includes all instructions in MIPS I, MIPS II, and MIPS III. This section presents eight different views of the instruction encoding.
• Separate encoding tables for each architecture level.
• A MIPS IV encoding table showing the architecture level at which each opcode was originally defined and subsequently modified (if modified).
• Separate encoding tables for each architecture revision showing the changes made during that revision.
B.12.1 Instruction Decode
Instruction field names are printed in bold in this section.
The primary opcode field is decoded first. The opcode values LWC1, SWC1, LDC1, and SDC1 fully specify FPU load and store instructions.
The opcode values COP1, COP1X, and SPECIAL specify instruction classes. Instructions within a class are further specified by values in other fields.
B.12.1.1 COP1 Instruction Class
The opcode=COP1 instruction class encodes most of the FPU instructions. The class is further decoded by examining the fmt field. The fmt values fully specify the CPU ↔ FPU register move instructions and specify the S, D, W, L, and BC instruction classes.
The opcode=COP1 + fmt=BC instruction class encodes the conditional branch instructions. The class is further decoded, and the instructions fully specified, by examining the nd and tf fields.
The opcode=COP1 + fmt=(S, D, W, or L) instruction classes encode instructions that operate on formatted (typed) operands. Each of these instruction classes is further decoded by examining the function field. With one exception the function values fully specify instructions. The exception is the MOVCF instruction class.
The opcode=COP1 + fmt=(S or D) + function=MOVCF instruction class encodes the MOVT.fmt and MOVF.fmt conditional move instructions (to move FP values based on FP condition codes). The class is further decoded, and the instructions fully specified, by examining the tf field.
† An exception to this rule is that the reserved, but never implemented, Coprocessor 3 instructions were removed or changed to another use starting in MIPS III.
B.12.1.2 COP1X Instruction Class
The opcode=COP1X instruction class encodes the indexed load/store instructions, the indexed prefetch, and the multiply accumulate instructions. The class is further decoded, and the instructions fully specified, by examining the function field.
B.12.1.3 SPECIAL Instruction Class
The opcode=SPECIAL instruction class is further decoded by examining the function field. The only function value that applies to FPU instruction encoding is the MOVCI instruction class.
The remainder of the function values encode CPU instructions.
The opcode=SPECIAL + function=MOVCI instruction class encodes the MOVT and MOVF conditional move instructions (to move CPU registers based on FP condition codes). The class is further decoded, and the instructions fully specified, by examining the tf field.
B.12.2 Instruction Subsets of MIPS III and MIPS IV Processors
MIPS III processors, such as the R4000, R4200, R4300, R4400, and R4600, have a processor mode in which only the MIPS II instructions are valid. The MIPS II encoding table describes the MIPS II-only mode.
MIPS IV processors, such as the R8000 and R10000, have processor modes in which only the MIPS II or MIPS III instructions are valid. The MIPS II encoding table describes the MIPS II-only mode. The MIPS III encoding table describes the MIPS III-only mode.

Table B-23 FPU (CP1) Instruction Encoding - MIPS I Architecture
Instructions encoded by the opcode field (bits 31..26). Rows are bits 31..29; columns are bits 28..26.
0 000   χ
1 001
2 010   COP1 δ (column 1, 001)
3 011
4 100
5 101
6 110   LWC1 (column 1, 001)
7 111   SWC1 (column 1, 001)
Instructions encoded by the fmt field (bits 25..21) when opcode=COP1. Rows are bits 25..24; columns are bits 23..21.
        0 000     1 001     2 010     3 011     4 100     5 101     6 110     7 111
0 00    MFC1      *         CFC1      *         MTC1      *         CTC1      *
1 01    BC δ      *         *         *         *         *         *         *
2 10    S δ       D δ       *         *         W δ       *         *         *
3 11    *         *         *         *         *         *         *         *
Instructions encoded by the tf field (bit 16) when opcode=COP1 and fmt=BC.
tf = 0: BC1F
tf = 1: BC1T
Table B-23 (cont.)
FPU (CP1) Instruction Encoding - MIPS I Architecture
Instructions encoded by the function field (bits 5..0) when opcode=COP1 and fmt = S, D, or W. In each table, rows are bits 5..3 and columns are bits 2..0.
encoding when fmt = S
        0 000     1 001     2 010     3 011     4 100     5 101     6 110     7 111
0 000   ADD       SUB       MUL       DIV       *         ABS       MOV       NEG
1 001   *         *         *         *         *         *         *         *
2 010   *         *         *         *         *         *         *         *
3 011   *         *         *         *         *         *         *         *
4 100   *         CVT.D     *         *         CVT.W     *         *         *
5 101   *         *         *         *         *         *         *         *
6 110   C.F α     C.UN α    C.EQ α    C.UEQ α   C.OLT α   C.ULT α   C.OLE α   C.ULE α
7 111   C.SF α    C.NGLE α  C.SEQ α   C.NGL α   C.LT α    C.NGE α   C.LE α    C.NGT α
encoding when fmt = D
        0 000     1 001     2 010     3 011     4 100     5 101     6 110     7 111
0 000   ADD       SUB       MUL       DIV       *         ABS       MOV       NEG
1 001   *         *         *         *         *         *         *         *
2 010   *         *         *         *         *         *         *         *
3 011   *         *         *         *         *         *         *         *
4 100   CVT.S     *         *         *         CVT.W     *         *         *
5 101   *         *         *         *         *         *         *         *
6 110   C.F α     C.UN α    C.EQ α    C.UEQ α   C.OLT α   C.ULT α   C.OLE α   C.ULE α
7 111   C.SF α    C.NGLE α  C.SEQ α   C.NGL α   C.LT α    C.NGE α   C.LE α    C.NGT α
encoding when fmt = W
        0 000     1 001     2 010     3 011     4 100     5 101     6 110     7 111
0 000   *         *         *         *         *         *         *         *
1 001   *         *         *         *         *         *         *         *
2 010   *         *         *         *         *         *         *         *
3 011   *         *         *         *         *         *         *         *
4 100   CVT.S     CVT.D     *         *         *         *         *         *
5 101   *         *         *         *         *         *         *         *
6 110   *         *         *         *         *         *         *         *
7 111   *         *         *         *         *         *         *         *

Table B-24 FPU (CP1) Instruction Encoding - MIPS II Architecture
Instructions encoded by the opcode field (bits 31..26). Rows are bits 31..29; columns are bits 28..26.
0 000   χ
1 001
2 010   COP1 δ (column 1, 001)
3 011
4 100
5 101
6 110   LWC1 (column 1, 001), LDC1 (column 5, 101)
7 111   SWC1 (column 1, 001), SDC1 (column 5, 101)
Instructions encoded by the fmt field when opcode=COP1.
fmt (bits 25..21): rows are bits 25..24; columns are bits 23..21.
        0 000     1 001     2 010     3 011     4 100     5 101     6 110     7 111
0 00    MFC1      *         CFC1      *         MTC1      *         CTC1      *
1 01    BC δ      *         *         *         *         *         *         *
2 10    S δ       D δ       *         *         W δ       *         *         *
3 11    *         *         *         *         *         *         *         *
Instructions encoded by the nd and tf fields when opcode=COP1 and fmt=BC.
                tf (bit 16)
nd (bit 17)     0         1
0               BC1F      BC1T
1               BC1FL     BC1TL
Table B-24 (cont.) FPU (CP1) Instruction Encoding - MIPS II Architecture
Instructions encoded by the function field (bits 5..0) when opcode=COP1 and fmt = S, D, or W. In each table, rows are bits 5..3 and columns are bits 2..0.
encoding when fmt = S
        0 000     1 001     2 010     3 011     4 100     5 101     6 110     7 111
0 000   ADD       SUB       MUL       DIV       SQRT      ABS       MOV       NEG
1 001   *         *         *         *         ROUND.W   TRUNC.W   CEIL.W    FLOOR.W
2 010   *         *         *         *         *         *         *         *
3 011   *         *         *         *         *         *         *         *
4 100   *         CVT.D     *         *         CVT.W     *         *         *
5 101   *         *         *         *         *         *         *         *
6 110   C.F α     C.UN α    C.EQ α    C.UEQ α   C.OLT α   C.ULT α   C.OLE α   C.ULE α
7 111   C.SF α    C.NGLE α  C.SEQ α   C.NGL α   C.LT α    C.NGE α   C.LE α    C.NGT α
encoding when fmt = D
        0 000     1 001     2 010     3 011     4 100     5 101     6 110     7 111
0 000   ADD       SUB       MUL       DIV       SQRT      ABS       MOV       NEG
1 001   *         *         *         *         ROUND.W   TRUNC.W   CEIL.W    FLOOR.W
2 010   *         *         *         *         *         *         *         *
3 011   *         *         *         *         *         *         *         *
4 100   CVT.S     *         *         *         CVT.W     *         *         *
5 101   *         *         *         *         *         *         *         *
6 110   C.F α     C.UN α    C.EQ α    C.UEQ α   C.OLT α   C.ULT α   C.OLE α   C.ULE α
7 111   C.SF α    C.NGLE α  C.SEQ α   C.NGL α   C.LT α    C.NGE α   C.LE α    C.NGT α
encoding when fmt = W
        0 000     1 001     2 010     3 011     4 100     5 101     6 110     7 111
0 000   *         *         *         *         *         *         *         *
1 001   *         *         *         *         *         *         *         *
2 010   *         *         *         *         *         *         *         *
3 011   *         *         *         *         *         *         *         *
4 100   CVT.S     CVT.D     *         *         *         *         *         *
5 101   *         *         *         *         *         *         *         *
6 110   *         *         *         *         *         *         *         *
7 111   *         *         *         *         *         *         *         *
Table B-25 FPU (CP1) Instruction Encoding - MIPS III Architecture

Instructions encoded by the opcode field.

opcode      bits 28..26
bits 31..29 0 000      1 001      2 010      3 011      4 100      5 101      6 110      7 111
0 000       χ
1 001
2 010                  COP1 δ
3 011
4 100
5 101
6 110                  LWC1                                        LDC1
7 111                  SWC1                                        SDC1

Instructions encoded by the fmt field when opcode=COP1.

fmt         bits 23..21
bits 25..24 0 000      1 001      2 010      3 011      4 100      5 101      6 110      7 111
0 00        MFC1       DMFC1      CFC1       *          MTC1       DMTC1      CTC1       *
1 01        BC δ       *          *          *          *          *          *          *
2 10        S δ        D δ        *          *          W δ        L δ        *          *
3 11        *          *          *          *          *          *          *          *

Instructions encoded by the nd and tf fields when opcode=COP1 and fmt=BC.

            tf (bit 16)
nd (bit 17) 0          1
0           BC1F       BC1T
1           BC1FL      BC1TL

Field layouts for the opcode, fmt, and nd/tf tables:
  opcode table: opcode = bits 31..26.
  fmt table:    opcode = bits 31..26 (COP1); fmt = bits 25..21.
  nd/tf table:  opcode = bits 31..26 (COP1); fmt = bits 25..21 (BC); nd = bit 17; tf = bit 16.

Table B-25 (cont.) FPU (CP1) Instruction Encoding - MIPS III Architecture

Instructions encoded by the function field when opcode=COP1 and fmt = S, D, W, or L

encoding when fmt = S

function    bits 2..0
bits 5..3   0 000      1 001      2 010      3 011      4 100      5 101      6 110      7 111
0 000       ADD        SUB        MUL        DIV        SQRT       ABS        MOV        NEG
1 001       ROUND.L    TRUNC.L    CEIL.L     FLOOR.L    ROUND.W    TRUNC.W    CEIL.W     FLOOR.W
2 010       *          *          *          *          *          *          *          *
3 011       *          *          *          *          *          *          *          *
4 100       *          CVT.D      *          *          CVT.W      CVT.L      *          *
5 101       *          *          *          *          *          *          *          *
6 110       C.F α      C.UN α     C.EQ α     C.UEQ α    C.OLT α    C.ULT α    C.OLE α    C.ULE α
7 111       C.SF α     C.NGLE α   C.SEQ α    C.NGL α    C.LT α     C.NGE α    C.LE α     C.NGT α

encoding when fmt = D

function    bits 2..0
bits 5..3   0 000      1 001      2 010      3 011      4 100      5 101      6 110      7 111
0 000       ADD        SUB        MUL        DIV        SQRT       ABS        MOV        NEG
1 001       ROUND.L    TRUNC.L    CEIL.L     FLOOR.L    ROUND.W    TRUNC.W    CEIL.W     FLOOR.W
2 010       *          *          *          *          *          *          *          *
3 011       *          *          *          *          *          *          *          *
4 100       CVT.S      *          *          *          CVT.W      CVT.L      *          *
5 101       *          *          *          *          *          *          *          *
6 110       C.F α      C.UN α     C.EQ α     C.UEQ α    C.OLT α    C.ULT α    C.OLE α    C.ULE α
7 111       C.SF α     C.NGLE α   C.SEQ α    C.NGL α    C.LT α     C.NGE α    C.LE α     C.NGT α

Field layout (fmt = S): opcode = bits 31..26 (COP1); fmt = bits 25..21 (S); function = bits 5..0.
Field layout (fmt = D): opcode = bits 31..26 (COP1); fmt = bits 25..21 (D); function = bits 5..0.
encoding when fmt = W or L

function    bits 2..0
bits 5..3   0 000      1 001      2 010      3 011      4 100      5 101      6 110      7 111
0 000       *          *          *          *          *          *          *          *
1 001       *          *          *          *          *          *          *          *
2 010       *          *          *          *          *          *          *          *
3 011       *          *          *          *          *          *          *          *
4 100       CVT.S      CVT.D      *          *          *          *          *          *
5 101       *          *          *          *          *          *          *          *
6 110       *          *          *          *          *          *          *          *
7 111       *          *          *          *          *          *          *          *

Field layout (fmt = W or L): opcode = bits 31..26 (COP1); fmt = bits 25..21 (W, L); function = bits 5..0.

Table B-26 FPU (CP1) Instruction Encoding - MIPS IV Architecture

Instructions encoded by the opcode field.

opcode      bits 28..26
bits 31..29 0 000        1 001        2 010        3 011        4 100        5 101        6 110        7 111
0 000       SPECIAL δ,β  χ
1 001
2 010                    COP1 δ                    COP1X δ,λ
3 011
4 100
5 101
6 110                    LWC1                                                LDC1
7 111                    SWC1                                                SDC1

Instructions encoded by the fmt field when opcode=COP1.

fmt         bits 23..21
bits 25..24 0 000      1 001      2 010      3 011      4 100      5 101      6 110      7 111
0 00        MFC1       DMFC1      CFC1       *          MTC1       DMTC1      CTC1       *
1 01        BC δ       *          *          *          *          *          *          *
2 10        S δ        D δ        *          *          W δ        L δ        *          *
3 11        *          *          *          *          *          *          *          *

Instructions encoded by the nd and tf fields when opcode=COP1 and fmt=BC.

            tf (bit 16)
nd (bit 17) 0          1
0           BC1F       BC1T
1           BC1FL      BC1TL

Field layouts for the opcode, fmt, and nd/tf tables:
  opcode table: opcode = bits 31..26.
  fmt table:    opcode = bits 31..26 (COP1); fmt = bits 25..21.
  nd/tf table:  opcode = bits 31..26 (COP1); fmt = bits 25..21 (BC); nd = bit 17; tf = bit 16.

Table B-26 (cont.)
FPU (CP1) Instruction Encoding - MIPS IV Architecture

Instructions encoded by the function field when opcode=COP1 and fmt = S, D, W, or L

encoding when fmt = S

function    bits 2..0
bits 5..3   0 000      1 001      2 010      3 011      4 100      5 101      6 110      7 111
0 000       ADD        SUB        MUL        DIV        SQRT       ABS        MOV        NEG
1 001       ROUND.L    TRUNC.L    CEIL.L     FLOOR.L    ROUND.W    TRUNC.W    CEIL.W     FLOOR.W
2 010       *          MOVCF δ    MOVZ       MOVN       *          RECIP      RSQRT      *
3 011       *          *          *          *          *          *          *          *
4 100       *          CVT.D      *          *          CVT.W      CVT.L      *          *
5 101       *          *          *          *          *          *          *          *
6 110       C.F α      C.UN α     C.EQ α     C.UEQ α    C.OLT α    C.ULT α    C.OLE α    C.ULE α
7 111       C.SF α     C.NGLE α   C.SEQ α    C.NGL α    C.LT α     C.NGE α    C.LE α     C.NGT α

encoding when fmt = D

function    bits 2..0
bits 5..3   0 000      1 001      2 010      3 011      4 100      5 101      6 110      7 111
0 000       ADD        SUB        MUL        DIV        SQRT       ABS        MOV        NEG
1 001       ROUND.L    TRUNC.L    CEIL.L     FLOOR.L    ROUND.W    TRUNC.W    CEIL.W     FLOOR.W
2 010       *          MOVCF δ    MOVZ       MOVN       *          RECIP      RSQRT      *
3 011       *          *          *          *          *          *          *          *
4 100       CVT.S      *          *          *          CVT.W      CVT.L      *          *
5 101       *          *          *          *          *          *          *          *
6 110       C.F α      C.UN α     C.EQ α     C.UEQ α    C.OLT α    C.ULT α    C.OLE α    C.ULE α
7 111       C.SF α     C.NGLE α   C.SEQ α    C.NGL α    C.LT α     C.NGE α    C.LE α     C.NGT α

Field layout (fmt = S): opcode = bits 31..26 (COP1); fmt = bits 25..21 (S); function = bits 5..0.
Field layout (fmt = D): opcode = bits 31..26 (COP1); fmt = bits 25..21 (D); function = bits 5..0.

Table B-26 (cont.) FPU (CP1) Instruction Encoding - MIPS IV Architecture

encoding when fmt = W or L

function    bits 2..0
bits 5..3   0 000      1 001      2 010      3 011      4 100      5 101      6 110      7 111
0 000       *          *          *          *          *          *          *          *
1 001       *          *          *          *          *          *          *          *
2 010       *          *          *          *          *          *          *          *
3 011       *          *          *          *          *          *          *          *
4 100       CVT.S      CVT.D      *          *          *          *          *          *
5 101       *          *          *          *          *          *          *          *
6 110       *          *          *          *          *          *          *          *
7 111       *          *          *          *          *          *          *          *

Instructions encoded by the function field when opcode=COP1X.
function    bits 2..0
bits 5..3   0 000      1 001      2 010      3 011      4 100      5 101      6 110      7 111
0 000       LWXC1      LDXC1      *          *          *          *          *          *
1 001       SWXC1      SDXC1      *          *          *          *          *          PREFX
2 010       *          *          *          *          *          *          *          *
3 011       *          *          *          *          *          *          *          *
4 100       MADD.S     MADD.D     *          *          *          *          *          *
5 101       MSUB.S     MSUB.D     *          *          *          *          *          *
6 110       NMADD.S    NMADD.D    *          *          *          *          *          *
7 111       NMSUB.S    NMSUB.D    *          *          *          *          *          *

Instructions encoded by the tf field when opcode=COP1, fmt = S or D, and function=MOVCF.

tf (bit 16) 0             1
            MOVF (fmt)    MOVT (fmt)

These are the MOVF.fmt and MOVT.fmt instructions. They should not be confused with MOVF and MOVT.

Field layouts:
  fmt = W or L function table: opcode = bits 31..26 (COP1); fmt = bits 25..21 (W, L); function = bits 5..0.
  COP1X function table:        opcode = bits 31..26 (COP1X); function = bits 5..0.
  MOVCF tf table:              opcode = bits 31..26 (COP1); fmt = bits 25..21 (S, D); tf = bit 16; function = bits 5..0 (MOVCF).

Instruction class encoded by the function field when opcode=SPECIAL.

function    bits 2..0
bits 5..3   0 000      1 001      2 010      3 011      4 100      5 101      6 110      7 111
0 000                  MOVCI δ    χ
...
7 111

Instructions encoded by the tf field when opcode = SPECIAL and function=MOVCI.

tf (bit 16) 0             1
            MOVF          MOVT

These are the MOVF and MOVT instructions. They should not be confused with MOVF.fmt and MOVT.fmt.

Field layouts:
  SPECIAL function table: opcode = bits 31..26 (SPECIAL); function = bits 5..0.
  MOVCI tf table:         opcode = bits 31..26 (SPECIAL); tf = bit 16; function = bits 5..0 (MOVCI).

Table B-27 Architecture Level In Which FPU Instructions are Defined or Extended.

The architecture level in which each MIPS IV encoding was defined is indicated by a subscript 1, 2, 3, or 4 (for architecture level I, II, III, or IV). If an instruction or instruction class was later extended, the extending level is indicated after the defining level.

Instructions encoded by the opcode field.

Architecture level is shown by the subscript 1, 2, 3, or 4 written after each entry.

opcode      bits 28..26
bits 31..29 0 000         1 001         2 010         3 011         4 100         5 101         6 110         7 111
0 000       SPECIAL β 4   χ
1 001
2 010                     COP1 1,2,3,4                COP1X 4
3 011
4 100
5 101
6 110                     LWC1 1                                                  LDC1 2
7 111                     SWC1 1                                                  SDC1 2

Instructions encoded by the fmt field when opcode=COP1.
Architecture level is shown by the subscript 1, 2, 3, or 4 written after each entry.

fmt         bits 23..21
bits 25..24 0 000       1 001       2 010       3 011       4 100       5 101       6 110       7 111
0 00        MFC1 1      DMFC1 3     CFC1 1      * 1         MTC1 1      DMTC1 3     CTC1 1      * 1
1 01        BC 1,2,4    * 1         * 1         * 1         * 1         * 1         * 1         * 1
2 10        S 1,2,3,4   D 1,2,3,4   * 1         * 1         W 1,2,3,4   L 3,4       * 1         * 1
3 11        * 1         * 1         * 1         * 1         * 1         * 1         * 1         * 1

Instructions encoded by the nd and tf fields when opcode=COP1 and fmt=BC.

Architecture level is shown by the subscript 1, 2, 3, or 4 written after each entry.

            tf (bit 16)
nd (bit 17) 0           1
0           BC1F 1,4    BC1T 1,4
1           BC1FL 2,4   BC1TL 2,4

Field layouts for the opcode, fmt, and nd/tf tables:
  opcode table: opcode = bits 31..26.
  fmt table:    opcode = bits 31..26 (COP1); fmt = bits 25..21.
  nd/tf table:  opcode = bits 31..26 (COP1); fmt = bits 25..21 (BC); nd = bit 17; tf = bit 16.

Table B-27 (cont.) Architecture Level (I-IV) In Which FPU Instructions are Defined or Extended

Instructions encoded by the function field when opcode=COP1 and fmt = S, D, W, or L

encoding when fmt = S

Architecture level is shown by the subscript 1, 2, 3, or 4 written after each entry.

function    bits 2..0
bits 5..3   0 000       1 001       2 010       3 011       4 100       5 101       6 110       7 111
0 000       ADD 1       SUB 1       MUL 1       DIV 1       SQRT 2      ABS 1       MOV 1       NEG 1
1 001       ROUND.L 3   TRUNC.L 3   CEIL.L 3    FLOOR.L 3   ROUND.W 2   TRUNC.W 2   CEIL.W 2    FLOOR.W 2
2 010       * 1         MOVCF 4     MOVZ 4      MOVN 4      * 1         RECIP 4     RSQRT 4     * 1
3 011       * 1         * 1         * 1         * 1         * 1         * 1         * 1         * 1
4 100       * 1         CVT.D 1,3   * 1         * 1         CVT.W 1     CVT.L 3     * 1         * 1
5 101       * 1         * 1         * 1         * 1         * 1         * 1         * 1         * 1
6 110       C.F 1,4     C.UN 1,4    C.EQ 1,4    C.UEQ 1,4   C.OLT 1,4   C.ULT 1,4   C.OLE 1,4   C.ULE 1,4
7 111       C.SF 1,4    C.NGLE 1,4  C.SEQ 1,4   C.NGL 1,4   C.LT 1,4    C.NGE 1,4   C.LE 1,4    C.NGT 1,4

encoding when fmt = D

Architecture level is shown by the subscript 1, 2, 3, or 4 written after each entry.
function    bits 2..0
bits 5..3   0 000       1 001       2 010       3 011       4 100       5 101       6 110       7 111
0 000       ADD 1       SUB 1       MUL 1       DIV 1       SQRT 2      ABS 1       MOV 1       NEG 1
1 001       ROUND.L 3   TRUNC.L 3   CEIL.L 3    FLOOR.L 3   ROUND.W 2   TRUNC.W 2   CEIL.W 2    FLOOR.W 2
2 010       * 1         MOVCF 4     MOVZ 4      MOVN 4      * 1         RECIP 4     RSQRT 4     * 1
3 011       * 1         * 1         * 1         * 1         * 1         * 1         * 1         * 1
4 100       CVT.S 1,3   * 1         * 1         * 1         CVT.W 1     CVT.L 3     * 1         * 1
5 101       * 1         * 1         * 1         * 1         * 1         * 1         * 1         * 1
6 110       C.F 1,4     C.UN 1,4    C.EQ 1,4    C.UEQ 1,4   C.OLT 1,4   C.ULT 1,4   C.OLE 1,4   C.ULE 1,4
7 111       C.SF 1,4    C.NGLE 1,4  C.SEQ 1,4   C.NGL 1,4   C.LT 1,4    C.NGE 1,4   C.LE 1,4    C.NGT 1,4

Field layout (fmt = S): opcode = bits 31..26 (COP1); fmt = bits 25..21 (S); function = bits 5..0.
Field layout (fmt = D): opcode = bits 31..26 (COP1); fmt = bits 25..21 (D); function = bits 5..0.

Table B-27 (cont.) Architecture Level (I-IV) In Which FPU Instructions are Defined or Extended

encoding when fmt = W or L

Architecture level is shown by the subscript 1, 2, 3, or 4 written after each entry.

function    bits 2..0
bits 5..3   0 000       1 001       2 010       3 011       4 100       5 101       6 110       7 111
0 000       * 1         * 1         * 1         * 1         * 1         * 1         * 1         * 1
1 001       * 1         * 1         * 1         * 1         * 1         * 1         * 1         * 1
2 010       * 1         * 1         * 1         * 1         * 1         * 1         * 1         * 1
3 011       * 1         * 1         * 1         * 1         * 1         * 1         * 1         * 1
4 100       CVT.S 1,3   CVT.D 1,3   * 1         * 1         * 1         * 1         * 1         * 1
5 101       * 1         * 1         * 1         * 1         * 1         * 1         * 1         * 1
6 110       * 1         * 1         * 1         * 1         * 1         * 1         * 1         * 1
7 111       * 1         * 1         * 1         * 1         * 1         * 1         * 1         * 1

Instructions encoded by the function field when opcode=COP1X.

Architecture level is shown by the subscript 1, 2, 3, or 4 written after each entry.

function    bits 2..0
bits 5..3   0 000       1 001       2 010       3 011       4 100       5 101       6 110       7 111
0 000       LWXC1 4     LDXC1 4     * 4         * 4         * 4         * 4         * 4         * 4
1 001       SWXC1 4     SDXC1 4     * 4         * 4         * 4         * 4         * 4         PREFX 4
2 010       * 4         * 4         * 4         * 4         * 4         * 4         * 4         * 4
3 011       * 4         * 4         * 4         * 4         * 4         * 4         * 4         * 4
4 100       MADD.S 4    MADD.D 4    * 4         * 4         * 4         * 4         * 4         * 4
5 101       MSUB.S 4    MSUB.D 4    * 4         * 4         * 4         * 4         * 4         * 4
6 110       NMADD.S 4   NMADD.D 4   * 4         * 4         * 4         * 4         * 4         * 4
7 111       NMSUB.S 4   NMSUB.D 4   * 4         * 4         * 4         * 4         * 4         * 4

Instructions encoded by the tf field when opcode=COP1, fmt = S or D, and function=MOVCF.
tf (bit 16) 0               1
            MOVF (fmt) 4    MOVT (fmt) 4

These are the MOVF.fmt and MOVT.fmt instructions. They should not be confused with MOVF and MOVT.

Field layouts:
  fmt = W or L function table: opcode = bits 31..26 (COP1); fmt = bits 25..21 (W, L); function = bits 5..0.
  COP1X function table:        opcode = bits 31..26 (COP1X); function = bits 5..0.
  MOVCF tf table:              opcode = bits 31..26 (COP1); fmt = bits 25..21 (S, D); tf = bit 16; function = bits 5..0 (MOVCF).

Instruction class encoded by the function field when opcode=SPECIAL.

Architecture level is shown by the subscript 1, 2, 3, or 4 written after each entry.

function    bits 2..0
bits 5..3   0 000      1 001      2 010      3 011      4 100      5 101      6 110      7 111
0 000                  MOVCI 4    χ
...
7 111

Instructions encoded by the tf field when opcode = SPECIAL and function=MOVCI.

tf (bit 16) 0               1
            MOVF 4          MOVT 4

These are the MOVF and MOVT instructions. They should not be confused with MOVF.fmt and MOVT.fmt.

Field layouts:
  SPECIAL function table: opcode = bits 31..26 (SPECIAL); function = bits 5..0.
  MOVCI tf table:         opcode = bits 31..26 (SPECIAL); tf = bit 16; function = bits 5..0 (MOVCI).

Table B-28 FPU Instruction Encoding Changes - MIPS II Architecture Revision.

An instruction encoding is shown if the instruction is added or extended in this architecture revision. An instruction class, like COP1, is shown if the instruction class is added in this architecture revision.

Instructions encoded by the opcode field.

opcode      bits 28..26
bits 31..29 0 000      1 001      2 010      3 011      4 100      5 101      6 110      7 111
0 000
1 001
2 010
3 011
4 100
5 101
6 110                                                              LDC1
7 111                                                              SDC1

Instructions encoded by the fmt field when opcode=COP1.

fmt         bits 23..21
bits 25..24 0 000      1 001      2 010      3 011      4 100      5 101      6 110      7 111
0 00
1 01
2 10
3 11

Instructions encoded by the nd and tf fields when opcode=COP1 and fmt=BC.

            tf (bit 16)
nd (bit 17) 0          1
0
1           BC1FL      BC1TL

Field layouts for the opcode, fmt, and nd/tf tables:
  opcode table: opcode = bits 31..26.
  fmt table:    opcode = bits 31..26 (COP1); fmt = bits 25..21.
  nd/tf table:  opcode = bits 31..26 (COP1); fmt = bits 25..21 (BC); nd = bit 17; tf = bit 16.

Table B-28 (cont.) FPU Instruction Encoding Changes - MIPS II Revision.
Instructions encoded by the function field when opcode=COP1 and fmt = S, D, or W

encoding when fmt = S

function    bits 2..0
bits 5..3   0 000      1 001      2 010      3 011      4 100      5 101      6 110      7 111
0 000                                                   SQRT
1 001                                                   ROUND.W    TRUNC.W    CEIL.W     FLOOR.W
2 010
3 011
4 100
5 101
6 110
7 111

encoding when fmt = D

function    bits 2..0
bits 5..3   0 000      1 001      2 010      3 011      4 100      5 101      6 110      7 111
0 000                                                   SQRT
1 001                                                   ROUND.W    TRUNC.W    CEIL.W     FLOOR.W
2 010
3 011
4 100
5 101
6 110
7 111

Field layout (fmt = S): opcode = bits 31..26 (COP1); fmt = bits 25..21 (S); function = bits 5..0.
Field layout (fmt = D): opcode = bits 31..26 (COP1); fmt = bits 25..21 (D); function = bits 5..0.

encoding when fmt = W

function    bits 2..0
bits 5..3   0 000      1 001      2 010      3 011      4 100      5 101      6 110      7 111
0 000
1 001
2 010
3 011
4 100
5 101
6 110
7 111

Field layout (fmt = W): opcode = bits 31..26 (COP1); fmt = bits 25..21 (W); function = bits 5..0.

Table B-29 FPU Instruction Encoding Changes - MIPS III Revision.

An instruction encoding is shown if the instruction is added or extended in this architecture revision. An instruction class, like COP1, is shown if the instruction class is added in this architecture revision.

Instructions encoded by the opcode field.

opcode      bits 28..26
bits 31..29 0 000      1 001      2 010      3 011      4 100      5 101      6 110      7 111
0 000
1 001
2 010
3 011
4 100
5 101
6 110
7 111

Instructions encoded by the fmt field when opcode=COP1.

fmt         bits 23..21
bits 25..24 0 000      1 001      2 010      3 011      4 100      5 101      6 110      7 111
0 00                   DMFC1                                       DMTC1
1 01
2 10                                                               L δ
3 11

Instructions encoded by the nd and tf fields when opcode=COP1 and fmt=BC.

            tf (bit 16)
nd (bit 17) 0          1
0
1           BC1FL      BC1TL

Field layouts for the opcode, fmt, and nd/tf tables:
  opcode table: opcode = bits 31..26.
  fmt table:    opcode = bits 31..26 (COP1); fmt = bits 25..21.
  nd/tf table:  opcode = bits 31..26 (COP1); fmt = bits 25..21 (BC); nd = bit 17; tf = bit 16.

Table B-29 (cont.) FPU Instruction Encoding Changes - MIPS III Revision.

Instructions encoded by the function field when opcode=COP1 and fmt = S, D, or L.
encoding when fmt = S

function    bits 2..0
bits 5..3   0 000      1 001      2 010      3 011      4 100      5 101      6 110      7 111
0 000
1 001       ROUND.L    TRUNC.L    CEIL.L     FLOOR.L
2 010
3 011
4 100                                                              CVT.L
5 101
6 110
7 111

encoding when fmt = D

function    bits 2..0
bits 5..3   0 000      1 001      2 010      3 011      4 100      5 101      6 110      7 111
0 000
1 001       ROUND.L    TRUNC.L    CEIL.L     FLOOR.L
2 010
3 011
4 100                                                              CVT.L
5 101
6 110
7 111

Field layout (fmt = S): opcode = bits 31..26 (COP1); fmt = bits 25..21 (S); function = bits 5..0.
Field layout (fmt = D): opcode = bits 31..26 (COP1); fmt = bits 25..21 (D); function = bits 5..0.

encoding when fmt = L

function    bits 2..0
bits 5..3   0 000      1 001      2 010      3 011      4 100      5 101      6 110      7 111
0 000       *          *          *          *          *          *          *          *
1 001       *          *          *          *          *          *          *          *
2 010       *          *          *          *          *          *          *          *
3 011       *          *          *          *          *          *          *          *
4 100       CVT.S      CVT.D      *          *          *          *          *          *
5 101       *          *          *          *          *          *          *          *
6 110       *          *          *          *          *          *          *          *
7 111       *          *          *          *          *          *          *          *

Field layout (fmt = L): opcode = bits 31..26 (COP1); fmt = bits 25..21 (L); function = bits 5..0.

Table B-30 FPU Instruction Encoding Changes - MIPS IV Revision.

An instruction encoding is shown if the instruction is added or extended in this architecture revision. An instruction class, like COP1X, is shown if the instruction class is added in this architecture revision.

Instructions encoded by the opcode field.

opcode      bits 28..26
bits 31..29 0 000      1 001      2 010      3 011      4 100      5 101      6 110      7 111
0 000
1 001
2 010                                        COP1X δ
3 011
4 100
5 101
6 110
7 111

Instructions encoded by the fmt field when opcode=COP1.

fmt         bits 23..21
bits 25..24 0 000      1 001      2 010      3 011      4 100      5 101      6 110      7 111
0 00
1 01
2 10
3 11

Instructions encoded by the nd and tf fields when opcode=COP1 and fmt=BC.

            tf (bit 16)
nd (bit 17) 0          1
0           BC1F       BC1T
1           BC1FL      BC1TL

Field layouts for the opcode, fmt, and nd/tf tables:
  opcode table: opcode = bits 31..26.
  fmt table:    opcode = bits 31..26 (COP1); fmt = bits 25..21.
  nd/tf table:  opcode = bits 31..26 (COP1); fmt = bits 25..21 (BC); nd = bit 17; tf = bit 16.

Table B-30 (cont.) FPU Instruction Encoding Changes - MIPS IV Revision.

Instructions encoded by the function field when opcode=COP1 and fmt = S, D, W, or L.
encoding when fmt = S

function    bits 2..0
bits 5..3   0 000      1 001      2 010      3 011      4 100      5 101      6 110      7 111
0 000
1 001
2 010                  MOVCF δ    MOVZ       MOVN                  RECIP      RSQRT
3 011
4 100
5 101
6 110       C.F        C.UN       C.EQ       C.UEQ      C.OLT      C.ULT      C.OLE      C.ULE
7 111       C.SF       C.NGLE     C.SEQ      C.NGL      C.LT       C.NGE      C.LE       C.NGT

encoding when fmt = D

function    bits 2..0
bits 5..3   0 000      1 001      2 010      3 011      4 100      5 101      6 110      7 111
0 000
1 001
2 010                  MOVCF δ    MOVZ       MOVN                  RECIP      RSQRT
3 011
4 100
5 101
6 110       C.F        C.UN       C.EQ       C.UEQ      C.OLT      C.ULT      C.OLE      C.ULE
7 111       C.SF       C.NGLE     C.SEQ      C.NGL      C.LT       C.NGE      C.LE       C.NGT

encoding when fmt = W or L

function    bits 2..0
bits 5..3   0 000      1 001      2 010      3 011      4 100      5 101      6 110      7 111
0 000
1 001
2 010
3 011
4 100
5 101
6 110
7 111

Field layout (fmt = S): opcode = bits 31..26 (COP1); fmt = bits 25..21 (S); function = bits 5..0.
Field layout (fmt = D): opcode = bits 31..26 (COP1); fmt = bits 25..21 (D); function = bits 5..0.
Field layout (fmt = W or L): opcode = bits 31..26 (COP1); fmt = bits 25..21 (W, L); function = bits 5..0.

Table B-30 (cont.) FPU Instruction Encoding Changes - MIPS IV Revision.

Instructions encoded by the function field when opcode=COP1X.

function    bits 2..0
bits 5..3   0 000      1 001      2 010      3 011      4 100      5 101      6 110      7 111
0 000       LWXC1      LDXC1      *          *          *          *          *          *
1 001       SWXC1      SDXC1      *          *          *          *          *          PREFX
2 010       *          *          *          *          *          *          *          *
3 011       *          *          *          *          *          *          *          *
4 100       MADD.S     MADD.D     *          *          *          *          *          *
5 101       MSUB.S     MSUB.D     *          *          *          *          *          *
6 110       NMADD.S    NMADD.D    *          *          *          *          *          *
7 111       NMSUB.S    NMSUB.D    *          *          *          *          *          *

Instructions encoded by the tf field when opcode=COP1, fmt = S or D, and function=MOVCF.

tf (bit 16) 0             1
            MOVF (fmt)    MOVT (fmt)

These are the MOVF.fmt and MOVT.fmt instructions. They should not be confused with MOVF and MOVT.

Instruction class encoded by the function field when opcode=SPECIAL.

function    bits 2..0
bits 5..3   0 000      1 001      2 010      3 011      4 100      5 101      6 110      7 111
0 000                  MOVCI δ    χ
...
7 111

Field layouts:
  COP1X function table:   opcode = bits 31..26 (COP1X); function = bits 5..0.
  MOVCF tf table:         opcode = bits 31..26 (COP1); fmt = bits 25..21 (S, D); tf = bit 16; function = bits 5..0 (MOVCF).
  SPECIAL function table: opcode = bits 31..26 (SPECIAL); function = bits 5..0.
Instructions encoded by the tf field when opcode = SPECIAL and function=MOVCI.

tf (bit 16) 0             1
            MOVF          MOVT

These are the MOVF and MOVT instructions. They should not be confused with MOVF.fmt and MOVT.fmt.

Field layout: opcode = bits 31..26 (SPECIAL); tf = bit 16; function = bits 5..0 (MOVCI).

Key to all FPU (CP1) instruction encoding tables:

*      This opcode is reserved for future use. An attempt to execute it causes either a Reserved Instruction exception or a Floating Point Unimplemented Operation Exception. The choice of exception is implementation specific.

α      The table shows 16 compare instructions with values named C.condition, where “condition” is a comparison condition such as “EQ”. These encoding values are all documented in the instruction description titled “C.cond.fmt”.

β      The SPECIAL instruction class was defined in MIPS I for CPU instructions. An FPU instruction was first added to the instruction class in MIPS IV.

δ      (also italic opcode name) This opcode indicates an instruction class. The instruction word must be further decoded by examining additional tables that show values for another instruction field.

λ      The COP1X opcode in MIPS IV was the COP3 opcode in MIPS I and II and a reserved instruction in MIPS III.

χ      These opcodes are not FPU operations. For further information on them, look in the CPU Instruction Encoding information section A 8.

(fmt)  This opcode is a conditional move of formatted FP registers - either MOVF.D, MOVF.S, MOVT.D, or MOVT.S. It should not be confused with the similarly-named MOVF or MOVT instruction that moves CPU registers.
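
The distinction the key draws between MOVF.fmt/MOVT.fmt and the CPU-register MOVF/MOVT, together with the nd/tf branch table, can be illustrated with a short Python sketch. It is a hedged illustration only: the function name classify is hypothetical, and the numeric values for COP1 (010001), SPECIAL (000000), MOVCF (010001), and MOVCI (000001) are the standard MIPS IV encodings, stated here as assumptions because the flattened tables do not preserve every column position.

```python
# Hypothetical sketch: classify the instructions selected by the nd and tf bits.
COP1, SPECIAL = 0b010001, 0b000000            # opcode field, bits 31..26 (assumed values)
FMT_BC, FMT_S, FMT_D = 0b01000, 0b10000, 0b10001  # fmt field, bits 25..21
MOVCF, MOVCI = 0b010001, 0b000001             # function field, bits 5..0 (assumed values)


def classify(word):
    """Return the mnemonic chosen by nd/tf for a 32-bit word, or None if the
    word is not a BC1x, MOVF/MOVT, or MOVF.fmt/MOVT.fmt instruction."""
    opcode = (word >> 26) & 0x3F
    fmt = (word >> 21) & 0x1F
    function = word & 0x3F
    nd = (word >> 17) & 1   # bit 17
    tf = (word >> 16) & 1   # bit 16

    if opcode == COP1 and fmt == FMT_BC:
        # Rows are nd, columns are tf, as in the nd/tf table.
        return [["BC1F", "BC1T"], ["BC1FL", "BC1TL"]][nd][tf]
    if opcode == COP1 and fmt in (FMT_S, FMT_D) and function == MOVCF:
        # Conditional move of formatted FP registers: MOVF.fmt / MOVT.fmt.
        return ("MOVT" if tf else "MOVF") + (".S" if fmt == FMT_S else ".D")
    if opcode == SPECIAL and function == MOVCI:
        # Conditional move of CPU registers: MOVF / MOVT.
        return "MOVT" if tf else "MOVF"
    return None
```

The same tf bit therefore selects between the false and true variants in all three classes; only the opcode, fmt, and function fields tell the classes apart.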