Brief Solution to Assignment 2

1(a)
(i) xyz’ + x’yz’ = (x’ + x) yz’ = yz’
(ii) xy’z’ + x’y’z’ + xyz’ + x’yz’ = (xy’ + x’y’ + xy + x’y) z’ = z’

(b)
The half-adder functions (Table 3.9 of your text) can be expressed as:
Sum = xy’ + x’y
Carry = xy
The Sum function requires 3 NAND gates (with the complemented inputs x’ and y’ available) and the Carry function requires 2 NAND gates [inputs x and y feed the first NAND, whose output is followed by another NAND that serves as a NOT gate].

(c)
a b c   x y z
0 0 0   0 1 0
0 0 1   0 1 1
0 1 0   1 0 0
0 1 1   1 0 1
1 0 0   1 1 0
1 0 1   1 1 1
1 1 0   0 0 0
1 1 1   0 0 1

Using the Karnaugh map technique, we get:
x = a’b + ab’
y = b’
z = c
With inverted inputs available, only x needs any gates, and it takes 3 NAND gates to implement.

(d)
D1 = F = Q1’ + Q2’ + Q3’
D2 = Q1
D3 = Q2
leading to the following state transition diagram (recall that Qi = Di when the clock rises):
State = <Q1, Q2, Q3>
[State transition diagram not reproduced. Applying the equations above to the states shown gives: 000 → 100 → 110 → 111 → 011 → 101 → 110.]
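The next-state equations in (d) can be stepped through mechanically. The following is a minimal Python sketch, not part of the original solution; the function names `next_state` and `trace` are illustrative choices.

```python
def next_state(state):
    """Apply D1 = Q1' + Q2' + Q3', D2 = Q1, D3 = Q2 to a state (Q1, Q2, Q3)."""
    q1, q2, q3 = state
    # Q1' + Q2' + Q3' equals NAND(Q1, Q2, Q3) by De Morgan's law.
    d1 = int(not (q1 and q2 and q3))
    return (d1, q1, q2)


def trace(start, steps):
    """Return the list of states visited from `start` over `steps` clock edges."""
    states = [start]
    for _ in range(steps):
        states.append(next_state(states[-1]))
    return states


if __name__ == "__main__":
    for s in trace((0, 0, 0), 6):
        print("".join(str(bit) for bit in s))
```

Starting from 000, the trace passes through the states listed in the diagram and then re-enters the cycle at 110.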
[Note: with no external input (other than the clock), the state transition edges are not labeled. If there are external inputs that select one edge over others leaving the same state, the transition edges should be labeled with the corresponding external input values.]

(e)
A combinational circuit contains only logic gates, whereas a sequential circuit must contain some memory element (such as flip-flops) to store the state. Some sequential circuits do not operate with a clock (and are known as asynchronous sequential circuits). In such a case, the memory elements may change state without the use of a clock.
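The distinction in (e) can be illustrated in software. This is only a sketch under stated assumptions: the half adder from 1(b) is modeled as a pure function of NAND gates with complemented inputs assumed available, and a clocked D flip-flop is modeled as an object holding one bit. The names `nand`, `half_adder`, and `DFlipFlop` are illustrative.

```python
def nand(a, b):
    """Two-input NAND on bits 0/1."""
    return 1 - (a & b)


def half_adder(x, y):
    """Combinational: the outputs depend only on the current inputs."""
    nx, ny = 1 - x, 1 - y                    # complemented inputs, assumed available
    total = nand(nand(x, ny), nand(nx, y))   # 3 NANDs realize xy' + x'y
    t = nand(x, y)
    carry = nand(t, t)                       # second NAND serves as a NOT gate
    return total, carry


class DFlipFlop:
    """Sequential: the stored bit persists between clock edges."""

    def __init__(self):
        self.q = 0

    def clock(self, d):
        """On a rising clock edge, Q takes the value of D."""
        self.q = d
        return self.q


if __name__ == "__main__":
    for x in (0, 1):
        for y in (0, 1):
            print(x, y, half_adder(x, y))
    ff = DFlipFlop()
    ff.clock(1)
    print(ff.q)  # the bit is still held after the clock edge
```

Calling `half_adder` twice with the same inputs always gives the same outputs, while the value read from `DFlipFlop.q` depends on what was clocked in earlier.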
2(a)
(b) (c)
3(a) (b) (c) (d) (e)
(f) (g)
(h)
4(a)
200 opcodes require at least 8 bits to encode: 2^7 ≤ 200 ≤ 2^8. With an 8-bit opcode, the computer can have up to 256 distinct opcodes. The memory must be large enough to hold the largest program stated, which is 64M words or locations. Hence an address needs 26 bits (2^26 = 64M). So we need 8 + 26 = 34 bits to specify each instruction. If a word must contain an integer multiple of bytes, then it must be at least 5 bytes (40 bits > 34 bits). If that is the case, the address can extend from 26 bits to 32 bits, or 4G locations.

Extending the machine to 2-address, each instruction would need 8 + 26 + 26 = 60 bits. This increases the size of a word from 34 bits to 60 bits, which nearly doubles the memory cost of the system. Generally 2-address machines are more programmable (easier to program), but the cost increase cannot be ignored.

Number of RAM chips: (2G x 4) / (512M x 1) = 4 x 4 = 16.
2G = 2^31, hence 31 address bits are needed for the memory.
512M = 2^29, hence 29 address bits are needed for each chip.

For each memory access, 4 RAM chips are selected to provide the 4 bytes of data from the memory, one byte from each of the 4 selected chips. Details are not shown here; the chips are organized in a 4 x 4 array (as drawn in the lecture slide but with 4 rows and 4 columns).

The address 25 is equivalent to 0000…011001. The low-order 29 bits of the address are connected to the 29 address inputs of each RAM chip. The most significant two bits (in this case 00) select one of the four rows (say row 0), so that memory location 00…011001 of the selected row of 4 chips provides the 4 data bytes to be sent to the processor.

Memory interleaving allows multiple memory banks to operate concurrently. This enhances performance because the banks act as concurrent servers. Low-order interleaving places sequential memory addresses in distinct memory banks. Hence if a sequence of data words from consecutive memory addresses is needed, they can be accessed in parallel. High-order interleaving does not provide this convenience.
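The opcode, address, and chip-count arithmetic in this answer can be reproduced with a few lines of Python. This is a sketch for checking the numbers only; the helper names `bits_needed` and `decode` are illustrative and not from the assignment.

```python
def bits_needed(n):
    """Smallest number of bits that can encode n distinct values (n >= 1)."""
    return (n - 1).bit_length()


M, G = 2**20, 2**30

# Instruction format: 200 opcodes, 64M-word memory, one address per instruction.
opcode_bits = bits_needed(200)
address_bits = bits_needed(64 * M)
instruction_bits = opcode_bits + address_bits
word_bytes = -(-instruction_bits // 8)   # round up to a whole number of bytes

# Memory built from chips: 2G x 4 bytes in total, 512M x 1 byte per chip.
chips = (2 * G * 4) // (512 * M * 1)
memory_address_bits = bits_needed(2 * G)
chip_address_bits = bits_needed(512 * M)


def decode(address):
    """Split a memory address into (row select, address within each chip)."""
    row = address >> chip_address_bits
    offset = address & ((1 << chip_address_bits) - 1)
    return row, offset


if __name__ == "__main__":
    print(opcode_bits, address_bits, instruction_bits, word_bytes)
    print(chips, memory_address_bits, chip_address_bits)
    print(decode(25))
```

For address 25 the two high-order bits are 00, so `decode` selects row 0 and passes 25 to the chips in that row.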
Unfortunately, in this case the addresses 1000, 1004, 1008, ... all map to the same memory module under low-order interleaving, because the two least significant bits are 00 in each of these addresses. So it is not possible to overlap these accesses (i.e., perform them in parallel). As a result, 100 distinct (non-overlapped) memory cycles will be needed.

Processor cycle refers to the basic cycle driving a processor, typically also called the clock cycle. It is the smallest unit of time during which the processor does some basic unit of work.
Memory cycle is the time taken by the memory to perform a memory operation (read/write). Usually it is several times the processor cycle.
Bus cycle is the time taken by the bus to perform a bus transaction. This can take many clock cycles, and it is potentially the longest of the three mentioned here.
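The low-order interleaving conflict described above (addresses 1000, 1004, 1008, ...) can be checked with a short sketch. It assumes 4 memory modules selected by the two least significant address bits, as the answer implies; the function name `low_order_module` is illustrative.

```python
def low_order_module(address, num_modules=4):
    """Module index under low-order interleaving: the low address bits pick the module."""
    return address % num_modules


# Consecutive addresses spread across all modules.
consecutive = [low_order_module(a) for a in range(1000, 1004)]

# A stride of 4 starting at 1000 hits the same module every time.
strided = [low_order_module(1000 + 4 * i) for i in range(100)]

if __name__ == "__main__":
    print(consecutive)
    print(set(strided))
```

The consecutive addresses land in four different modules, while all 100 strided addresses fall in a single module, so none of those accesses can overlap.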
101
(b)
(c) (d)
The execution time of a program (in the same machine language) depends on three essential factors: the average number of clocks per instruction, the total number of instructions executed, and the clock cycle time. A faster clock may not actually give a faster system when the system requires more clocks (on average) to execute each instruction. This could arise from the internal design of the processor (for the same ISA and hence control layer details) as well as from other system components, such as memory or bus speed.

Centralized arbitration is likely to be the fastest: the decision need not involve propagating information among all the devices/components.

An I/O interrupt is a mechanism that permits an I/O device to alert the processor for attention whenever the need arises, at which point the processor may suspend its current program and respond to the device by executing a service routine. Without this feature, an (interactive) user sitting at the computer terminal may feel helpless when she wants the computer to react to her needs via terminal devices such as the mouse or keyboard.
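The three-factor execution time relation stated at the start of this answer can be made concrete with a small sketch. The machine parameters below are hypothetical numbers chosen for illustration, not data from the assignment, and the function name `execution_time` is likewise illustrative.

```python
def execution_time(instruction_count, cpi, clock_hz):
    """Execution time in seconds = instructions x clocks per instruction x cycle time."""
    cycle_time = 1.0 / clock_hz
    return instruction_count * cpi * cycle_time


# Hypothetical machines running the same program of 10**9 instructions.
slow_clock = execution_time(10**9, cpi=2, clock_hz=1.0e9)   # 1.0 GHz, 2 clocks/instruction
fast_clock = execution_time(10**9, cpi=4, clock_hz=1.5e9)   # 1.5 GHz, 4 clocks/instruction

if __name__ == "__main__":
    print(slow_clock, fast_clock)
```

With these assumed values the machine with the faster clock takes longer overall, because its higher average clocks per instruction outweighs the shorter cycle time.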