Advanced Computer Networks EE / CS 6713 End-to-end Protocols Dr. Amir Qayyum
The Big Picture
[Figure: layered protocol stack of application, end-to-end, IP, data link, and physical layers. The end-to-end layer is today's topic ("you are here"); the layers below it received most of the coverage until now; the final exam lies ahead.]
End-to-end Protocols Outline (reading: Peterson and Davie, Ch. 5)
• End-to-end service model
• Protocol examples
  – User Datagram Protocol (UDP)
  – Transmission Control Protocol (TCP): connection establishment/termination, sliding window revisited, flow control, and adaptive timeout
  – Remote Procedure Call (RPC)
Where we are now …
• Understand how to
  – Build a network on one physical medium
  – Connect networks together (with switches)
  – Implement a reliable byte stream on a variable network (like the Internet)
  – Implement a UDP/TCP connection/channel
  – Address network heterogeneity
  – Address global scale
• Today's topic
  – End-to-end issues and common protocols
End-to-End Service Model • Recall user perspective of network – Define required functionality/services – Implementation is irrelevant
• Focus of end-to-end protocols (transport layer) – Communication between applications (users) – Translating from host-to-host services (network layer)
• Services implemented in end-to-end protocols – Those that cannot be done well in lower layers (i.e., on a per-hop basis); duplicate effort should be avoided – Those not needed by all applications 6
End-to-End Service Model • Services provided by underlying network: IP - “best effort” delivery – Messages sent from a host, delivered to a host (no distinction between entities sharing a host) – Drops some messages – Reorders messages – Delivers duplicate copies of a message – Limits messages to some finite size – Delivers messages after an arbitrarily long delay 7
End-to-End Service Model
• Common end-to-end services demanded by applications
  – Multiple connections (application processes) per host
  – Guaranteed message delivery
  – Messages delivered in the order they are sent
  – Messages delivered at most once
  – Arbitrarily large message support
  – Synchronization between sender and receiver
  – Flow control by the receiver
End-to-End Protocol Challenge • Given IP service model • Provide service model demanded by applications • Service models to consider – Demultiplexing only (UDP) – Everything on the previous list (TCP) – Reliable request/response (RPC) 9
User Datagram Protocol (UDP)
• Thin veneer over IP services
• Addresses multiplexing of multiple connections
• Unreliable and unordered datagram service
• No flow control
• Endpoints identified by ports (multiplexing)
  – 16-bit port space
  – Well-known ports for certain services (see /etc/services)
• Checksum to validate header – Optional in IPv4, but mandatory in IPv6 (why ?) 10
UDP Header Format
Header (two 32-bit words; bit positions 0, 16, 31):
  source port (bits 0–15) | destination port (bits 16–31)
  UDP length (bits 0–15) | UDP checksum (bits 16–31)
• Length includes 8-byte header and data
• Checksum (purpose ?)
  – Uses IP checksum algorithm
  – Computed on pseudo-header, UDP header and data
Pseudo-header:
  source IP address
  destination IP address
  zero (bits 0–7) | protocol 17 = UDP (bits 8–15) | UDP length (bits 16–31)
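As a concrete illustration of the checksum just described, here is a minimal sketch (not from the lecture; function and variable names are hypothetical) of the Internet checksum computed over the IPv4 pseudo-header, UDP header, and data:

#include <stdint.h>
#include <stddef.h>

/* Sketch: ones'-complement Internet checksum over the IPv4 pseudo-header,
 * the UDP header, and the payload. Names are illustrative. */
static uint32_t sum16(const uint8_t *p, size_t len, uint32_t sum) {
    while (len > 1) { sum += ((uint32_t)p[0] << 8) | p[1]; p += 2; len -= 2; }
    if (len) sum += (uint32_t)p[0] << 8;        /* pad a trailing odd byte */
    return sum;
}

uint16_t udp_checksum(uint32_t src_ip, uint32_t dst_ip,
                      const uint8_t *udp_segment, uint16_t udp_len) {
    uint32_t sum = 0;
    /* pseudo-header: source IP, destination IP, zero, protocol 17, UDP length */
    sum += (src_ip >> 16) + (src_ip & 0xffff);
    sum += (dst_ip >> 16) + (dst_ip & 0xffff);
    sum += 17;                                   /* IP protocol number for UDP */
    sum += udp_len;
    /* UDP header (checksum field zeroed) plus data */
    sum = sum16(udp_segment, udp_len, sum);
    while (sum >> 16)                            /* fold carries back in */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}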
Reliable Byte-Stream (TCP) Outline
• Connection Establishment/Termination
• Sliding Window Revisited
• Flow Control
• Adaptive Timeout
Transmission Control Protocol (TCP)
• Service model implements requirements listed earlier
  – Multiple connections per host
  – Guaranteed and in-order delivery
  – Messages delivered at most once
  – Arbitrarily large messages
  – Synchronization between sender and receiver
  – Flow control
• Multiplexing mechanism equivalent to that of UDP • Checksum mechanism also equivalent, but mandatory 13
TCP Overview • Flow control: restricts rate to something manageable by receiver • Congestion control: restricts rate to something manageable by network • Connection-oriented: setup and teardown required • Full duplex – Data flows in both directions simultaneously – Point-to-point communication
• Byte stream abstraction: no boundaries in data 14
TCP Byte Stream
• Application writes bytes
• TCP sends segments
• Application reads bytes
[Figure: the sending application writes bytes into TCP's send buffer; TCP transmits segments; the receiving TCP places arriving segments in its receive buffer, from which the application reads bytes]
Data Link versus Transport • Potentially connects many different hosts – Need explicit connection establishment and termination
• Potentially different RTT – Need adaptive timeout mechanism
• Potentially long delay in network – Need to be prepared for arrival of very old packets
• Potentially different capacity at destination – Need to accommodate different node capacity
• Potentially different network capacity – Need to be prepared for network congestion 16
TCP Outline
• TCP vs. a sliding window on a direct link
• Model of use
• Segment header format and options
• States and state diagram (think-pair-share)
• Sliding window implementation details
• Flow control issues
• Bit allocation limitations
• Adaptive retransmission algorithms
TCP vs. Sliding Window on Direct Link • RTT varies – Among peers (hosts at other end of connections) – Over time – Requires adaptive approach to retransmission (and window sizes)
• Packets can be – Delayed for long periods (assume 60 seconds) – Reordered in the network – Must be prepared for arrival of very old packets 18
TCP vs. Sliding Window on Direct Link • Peer capabilities vary – Minimum link speed on route – Buffering capacity at destination – Requires adaptive approach to window sizes
• Network capacity varies – Other traffic competes for most links – Requires (global) congestion control strategy
• Why not implement more functionality in IP, e.g., ordering guarantees or congestion control ? 19
End-to-End Argument
• A function should not be provided at a given layer unless it can be completely and correctly implemented at that layer
• In-order delivery: hop-by-hop ordering guarantee is not robust to path changes or multiple paths
[Figure: example topology with nodes A through F, where traffic between two hosts can follow more than one path]
End-to-End Argument
• Congestion control
  – Should be stopped at source
  – But network can provide feedback
[Figure: example topology with hosts A through F connected by links of 1, 5, 10, and 100 Mbps; the blue flow should get 9 Mbps, but gets only 5 Mbps with hop-by-hop drops]
TCP Model of Use • Connection setup via 3-way handshake • Data transport – Sender writes some data – TCP • Breaks data into segments • Sends each segment over IP • Retransmits, reorders, and removes duplicates as necessary – Receiver reads some data
• Teardown through 4-step exchange 22
TCP Connection Setup
• TCP connection setup via 3-way handshake
  – J and K are (different) sequence numbers for messages
  – Sequence numbers need not start at zero
1. Active participant (client) → passive participant (server): SYN, SequenceNum = J
2. Server → client: SYN, SequenceNum = K, ACK = J+1
3. Client → server: ACK = K+1
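For reference (not part of the slides), the active and passive roles map onto the BSD socket API roughly as sketched below; error handling is omitted and the port number is arbitrary:

#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Sketch: the passive participant (server) listens and accepts; the active
 * participant (client) connects. The kernel's TCP performs the
 * SYN / SYN+ACK / ACK exchange underneath these calls. */
int passive_open(uint16_t port) {
    int s = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in a;
    memset(&a, 0, sizeof(a));
    a.sin_family = AF_INET;
    a.sin_addr.s_addr = htonl(INADDR_ANY);
    a.sin_port = htons(port);
    bind(s, (struct sockaddr *)&a, sizeof(a));
    listen(s, 5);                       /* enter LISTEN state */
    return accept(s, NULL, NULL);       /* returns once ESTABLISHED */
}

int active_open(const char *server_ip, uint16_t port) {
    int s = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in a;
    memset(&a, 0, sizeof(a));
    a.sin_family = AF_INET;
    a.sin_port = htons(port);
    inet_pton(AF_INET, server_ip, &a.sin_addr);
    connect(s, (struct sockaddr *)&a, sizeof(a));   /* sends the SYN */
    return s;                                       /* ESTABLISHED on success */
}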
TCP Data Transport Model
• Data broken into segments
  – Limited by maximum segment size (MSS)
  – Defaults to 536 bytes
  – Negotiable during connection setup
  – Typically set to MTU of directly connected network minus size of IP and TCP headers (40 bytes)
• Three events trigger sending a segment
  – >= MSS bytes of data ready to be sent
  – Explicit PUSH operation by application
  – Periodic timeout
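A quick illustrative calculation (not on the slide): over Ethernet with a 1500-byte MTU, MSS = 1500 - 20 (IP header) - 20 (TCP header) = 1460 bytes.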
TCP Connection Teardown
• TCP connection teardown in 4 steps
  – Either client or server can initiate connection teardown
  – FIN is associated with sequence number space (why?)
1. Client (active close) → server: FIN, SequenceNum = J
2. Server → client: ACK = J+1 (passive close; server closes connection)
3. Server → client: FIN, SequenceNum = K
4. Client → server: ACK = K+1
TCP Segment Header & Pseudo-Header
Segment header (32-bit words; bit positions 0, 4, 10, 16, 31):
  source port | destination port
  sequence number
  ACK sequence number
  hdrlen | 0 | flags | advertised window
  TCP checksum | urgent pointer
  options (variable)
Pseudo-header:
  source IP address
  destination IP address
  zero (bits 0–7) | protocol 6 = TCP (bits 8–15) | TCP length (bits 16–31)
TCP Segment Header
• 16-bit source and destination ports
• 32-bit send and ACK sequence numbers
• 4-bit header length in 4-byte words (minimum 5)
• Six 1-bit flags
  – URG: segment contains urgent data
  – ACK: ACK sequence number is valid
  – PSH: do not delay delivery of data
  – RST: reset connection (rejection or abnormal termination); no sequence number associated
  – SYN: synchronize segment for setup
  – FIN: final segment for teardown
TCP Segment Header • 16-bit advertised window – Space remaining in receive window
• 16-bit checksum – Uses IP checksum algorithm – Computed on header, data, and pseudo-header
• 16-bit urgent data pointer – If URG=1 – Index of last byte of urgent data in segment 28
TCP Segment Format
• Each connection identified with 4-tuple:
  – (SrcPort, SrcIPAddr, DstPort, DstIPAddr)
• Sliding window + flow control
  – ACK, SequenceNum, AdvertisedWindow
  [Figure: sender transmits Data(SequenceNum); receiver returns Acknowledgment + AdvertisedWindow]
• Flags
  – SYN, FIN, RESET, PUSH, URG, ACK
TCP Options – Existing & Proposed • Negotiate maximum segment size (MSS) – Each host suggests a value – Minimum of two values is chosen – Prevents IP fragmentation over first/last hops
• Packet timestamp – Allows RTT calculation for retransmitted packets – Extends sequence number space for identification of stray packets (packets arriving very late)
• Negotiate advertised window granularity – Allows larger windows – Good for routes with large bandwidth-delay products 30
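As an illustrative example of the window-granularity option (the numbers are mine, not from the slide): a scale shift of 7 makes the 16-bit window field count 128-byte units, so the maximum window grows from 64 KB to 64 KB x 128 = 8 MB, enough for the STS-12 delay x bandwidth product shown on a later slide.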
TCP State Description
• CLOSED: disconnected
• LISTEN: waiting for incoming connection
• SYN_RCVD: connection request received
• SYN_SENT: connection request sent
• ESTABLISHED: ready for data transport
• CLOSE_WAIT: connection closed by peer
• LAST_ACK: closed by peer, closed locally, await ACK
• FIN_WAIT_1: connection closed locally
• FIN_WAIT_2: closed locally and ACK'd
• CLOSING: closed by both sides "simultaneously"
• TIME_WAIT: wait for network to discard related packets
State Transition Diagram
[Figure: TCP state transition diagram. CLOSED → LISTEN on passive open; CLOSED → SYN_SENT on active open/SYN; LISTEN → SYN_RCVD on SYN/SYN+ACK; LISTEN → SYN_SENT on send/SYN; LISTEN and SYN_SENT → CLOSED on close; SYN_SENT → ESTABLISHED on SYN+ACK/ACK; SYN_RCVD → ESTABLISHED on ACK; SYN_RCVD → FIN_WAIT_1 on close/FIN; ESTABLISHED → FIN_WAIT_1 on close/FIN; ESTABLISHED → CLOSE_WAIT on FIN/ACK; FIN_WAIT_1 → FIN_WAIT_2 on ACK, → CLOSING on FIN/ACK, or → TIME_WAIT on FIN+ACK/ACK; FIN_WAIT_2 → TIME_WAIT on FIN/ACK; CLOSING → TIME_WAIT on ACK; CLOSE_WAIT → LAST_ACK on close/FIN; LAST_ACK → CLOSED on ACK; TIME_WAIT → CLOSED after a timeout of two segment lifetimes]
Think-Pair-Share • Describe the path taken – By a server under normal conditions, and – By a client under normal conditions, – Assuming that the client closes the connection first.
• Consider the TIME_WAIT state – What purpose does this state serve ? – Prove that at least one side of a connection enters this state before returning to CLOSED – Explain how both sides might enter this state 33
Sliding Window Implementation • Sequence numbers are indices into byte stream • ACK sequence number is actually next byte expected (as opposed to last byte received) • Receiver buffers contain – Data ready for delivery to application until requested – Out-of-order data out to maximum buffer capacity
• Sender buffers contain – Unacknowledged data – Unsent data out to maximum buffer capacity 34
Sliding Window
• Sender side (pointers into the send buffer, up to the maximum buffer size): LastByteAcked <= LastByteSent <= LastByteWritten
  – Green: sent and acknowledged
  – Red: sent (or can be sent) but not acknowledged
  – Blue: available but not within send window
[Figure: the send window advances over time as the application writes data and the receiver acknowledges it]
Sliding Window
• Receiver side (pointers into the receive buffer, up to the maximum buffer size): NextByteRead, NextByteExpected, LastByteReceived; the advertised window covers the remaining buffer space
  – Green: received and ready to be delivered
  – Red: received and buffered
  – Blue: received and discarded
[Figure: the receive window advances over time as data arrives and the application reads it]
Sliding Window Math
[Figure: the sending application writes into the sending TCP's buffer (LastByteAcked, LastByteSent, LastByteWritten); the receiving TCP's buffer (LastByteRead, NextByteExpected, LastByteRcvd) feeds the receiving application]
• Sending side
  – LastByteAcked <= LastByteSent
  – LastByteSent <= LastByteWritten
  – buffer bytes between LastByteAcked and LastByteWritten
• Receiving side
  – LastByteRead < NextByteExpected
  – NextByteExpected <= LastByteRcvd + 1
  – buffer bytes between NextByteRead and LastByteRcvd
Flow vs. Congestion Control • Flow control prevents buffer overflow at receiver (only source and dest. are relevant) • Congestion control addresses bandwidth interactions between distinct packet flows • TCP provides both – flow control based on advertised window – congestion control will be discussed later … 38
TCP Flow Control Issues
• Problem: slow receiver application
  – Advertised window goes to 0
  – Sender cannot send more data
  – Non-data packets used to update window ?
  – Receiver may not spontaneously generate an update, or the update may be lost
• Solution: smart sender/dumb receiver – Sender periodically sends a 1-byte segment, ignoring advertised window of 0 – Eventually, window opens – Sender learns of opening from next ACK of 1-byte segment 39
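A hedged sketch of that sender-side behavior (all names and the timer handling are illustrative, not mandated by TCP):

/* Sketch of the "smart sender" zero-window probe: when the advertised
 * window is 0, periodically send a 1-byte segment; the ACK for the probe
 * reports any reopened window. Names are hypothetical. */
struct probe_state {
    int advertised_window;   /* last window advertised by the receiver */
    int persist_timer;       /* ticks until the next 1-byte probe */
    int persist_interval;    /* probe period */
};

void send_probe_segment(void);      /* transmits a 1-byte segment (stub) */
void send_within_window(void);      /* normal windowed transmission (stub) */

void sender_tick(struct probe_state *p) {
    if (p->advertised_window == 0) {
        if (--p->persist_timer <= 0) {
            send_probe_segment();
            p->persist_timer = p->persist_interval;
        }
    } else {
        send_within_window();
    }
}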
TCP Flow Control Issues • Problem: app. delivers tiny pieces of data to TCP – e.g., telnet in character mode – Each piece sent as segment, returned as ACK – Very inefficient
• Solutions – Delay transmission to accumulate more data – Nagle’s algorithm • Send first piece • Accumulate data until first piece ACK’d • Send accumulated data and restart accumulation • Not ideal for some traffic, e.g., mouse motion 40
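A simplified sketch of Nagle's algorithm as stated above (the structure and names are illustrative):

/* Nagle's algorithm, simplified: a small segment may be sent only when
 * nothing previously sent is still unacknowledged; otherwise data
 * accumulates until the outstanding data is ACK'd or a full MSS is ready. */
struct nagle_state {
    int unacked_bytes;   /* bytes sent but not yet acknowledged */
    int queued_bytes;    /* application data waiting to be sent */
    int mss;             /* maximum segment size */
};

int nagle_may_send(const struct nagle_state *s) {
    if (s->queued_bytes >= s->mss)
        return 1;            /* a full segment is always worth sending */
    if (s->unacked_bytes == 0)
        return 1;            /* nothing in flight: send the small piece now */
    return 0;                /* otherwise keep accumulating */
}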
TCP Flow Control Issues • Problem: slow application reads data in tiny pieces – Receiver advertises tiny window – Sender fills tiny window – Known as silly window syndrome
• Solution: due to Clark – Advertise window opening only when MSS or ½ of buffer is available – Sender delays sending until window is MSS or ½ of receiver’s buffer (estimated) – Overridden by using PUSH 41
TCP Flow Control Math
• Send buffer size: MaxSendBuffer
• Receive buffer size: MaxRcvBuffer
• Receiving side
  – LastByteRcvd - LastByteRead <= MaxRcvBuffer
  – AdvertisedWindow = MaxRcvBuffer - (NextByteExpected - NextByteRead)
• Sending side
  – LastByteSent - LastByteAcked <= AdvertisedWindow
  – EffectiveWindow = AdvertisedWindow - (LastByteSent - LastByteAcked)
  – LastByteWritten - LastByteAcked <= MaxSendBuffer
  – block sender if (LastByteWritten - LastByteAcked) + y > MaxSendBuffer, where y is the size of the new write
• Always send ACK in response to arriving data segment
• Persist when AdvertisedWindow = 0
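The relations above translate almost directly into code; a minimal sketch using the slide's variable names (plain byte counters only, no real buffers):

/* Sketch of the receive- and send-side window arithmetic from this slide. */
struct tcp_fc {
    /* receiving side */
    long MaxRcvBuffer, NextByteExpected, NextByteRead;
    /* sending side */
    long MaxSendBuffer, LastByteAcked, LastByteSent, LastByteWritten;
};

long advertised_window(const struct tcp_fc *t) {
    return t->MaxRcvBuffer - (t->NextByteExpected - t->NextByteRead);
}

long effective_window(const struct tcp_fc *t, long adv_window) {
    return adv_window - (t->LastByteSent - t->LastByteAcked);
}

/* Nonzero if writing y more bytes would overflow the send buffer,
 * i.e. the sending application must block. */
int sender_must_block(const struct tcp_fc *t, long y) {
    return (t->LastByteWritten - t->LastByteAcked) + y > t->MaxSendBuffer;
}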
TCP Bit Allocation Limitations
• Sequence numbers vs. packet lifetime
  – Assumed that IP packets live less than 60 seconds
  – Can we send 2^32 (4G) bytes in 60 seconds ?
  – Only need a data rate of 573 Mbps!
  – Less than an STS-12 line... (< Gigabit Ethernet)
• Advertised window vs. delay-bandwidth – Only 16 bits (64kB) for advertised window – For cross-country RTT of 100 milliseconds, adequate for a mere 5.24 Mbps! 43
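Spelling out the arithmetic behind these two numbers (a quick check, not on the slide): 2^32 bytes in 60 seconds is (2^32 x 8) / 60 ≈ 573 Mbps, and a 64 KB window sent once per 100 ms RTT sustains at most (65,535 x 8) / 0.1 ≈ 5.24 Mbps.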
Protection Against Wrap Around
• 32-bit SequenceNum

Bandwidth            Time Until Wrap Around
T1 (1.5 Mbps)        6.4 hours
Ethernet (10 Mbps)   57 minutes
T3 (45 Mbps)         13 minutes
FDDI (100 Mbps)      6 minutes
STS-3 (155 Mbps)     4 minutes
STS-12 (622 Mbps)    55 seconds
STS-24 (1.2 Gbps)    28 seconds
Keeping the Pipe Full
• 16-bit AdvertisedWindow (values below assume the 100 ms cross-country RTT used earlier)

Bandwidth            Delay x Bandwidth Product
T1 (1.5 Mbps)        18 KB
Ethernet (10 Mbps)   122 KB
T3 (45 Mbps)         549 KB
FDDI (100 Mbps)      1.2 MB
STS-3 (155 Mbps)     1.8 MB
STS-12 (622 Mbps)    7.4 MB
STS-24 (1.2 Gbps)    14.8 MB
TCP Extensions • Implemented as header options • Store timestamp in outgoing segments • Extend sequence space with 32-bit timestamp (PAWS) • Shift (scale) advertised window
Adaptive Retransmission Algorithm
• Original algorithm used only RTT estimate
• Theory: measure RTT for each segment + its ACK
  – Estimate RTT
  – Timeout is 2 * RTT to allow for variations
• Practice
  – Use exponential moving average (α = 0.8 to 0.9)
  – Estimate = α * estimate + (1 - α) * measurement
[Figure: measured RTT samples over time, with the smoothed estimate tracking them; how closely depends on α]
Adaptive Retransmission Algorithm
• Problem: it did not handle variations well
[Figure: measured RTT varying widely over time]
• Ambiguity for retransmitted packets: was the ACK in response to the first, second, etc. transmission ?
[Figure: after a transmission and a retransmission, it is unclear which one the arriving ACK acknowledges, so the measured RTT is ambiguous]
Adaptive Retransmission (Original Algorithm)
• Measure SampleRTT for each segment/ACK pair
• Compute weighted average of RTT
  – EstRTT = α x EstRTT + β x SampleRTT
  – where α + β = 1
  – α between 0.8 and 0.9
  – β between 0.1 and 0.2
• Set timeout based on EstRTT
  – TimeOut = 2 x EstRTT
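A compact sketch of this original estimator (floating point for clarity; a real implementation would use fixed-point arithmetic and the kernel's clock):

/* Original adaptive retransmission: exponential moving average of
 * SampleRTT, with a timeout of twice the estimate. */
struct rtt_orig {
    double est_rtt;   /* EstRTT, the smoothed estimate */
    double alpha;     /* typically between 0.8 and 0.9 */
};

void rtt_orig_sample(struct rtt_orig *r, double sample_rtt) {
    /* EstRTT = alpha x EstRTT + (1 - alpha) x SampleRTT */
    r->est_rtt = r->alpha * r->est_rtt + (1.0 - r->alpha) * sample_rtt;
}

double rtt_orig_timeout(const struct rtt_orig *r) {
    return 2.0 * r->est_rtt;   /* TimeOut = 2 x EstRTT */
}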
Karn/Partridge Algorithm
[Figure: two sender/receiver timelines showing the retransmission ambiguity: if SampleRTT is measured from the original transmission but the ACK really belongs to the retransmission, the sample is too long; if measured from the retransmission but the ACK belongs to the original, it is too short]
• Do not sample RTT when retransmitting
• Double timeout (RTT estimate) after each retransmission
• Still did not handle variations well
• Did not solve network congestion problems as desired
Jacobson/Karels Algorithm
• Estimate variance of RTT
  – Calculate mean inter-packet RTT deviation as an approximation of variance (Diff = RTT_estimate - measurement)
  – RTT_estimate = α x RTT_estimate + (1 - α) x measurement
  – Deviation = β x |RTT_estimate - measurement| + (1 - β) x deviation (another exponential moving average)
  – β = 0.25; α = 0.875 in RTT_estimate
• Use variance estimate as a component of the timeout
  – TimeOut = RTT_estimate + 4 x deviation
• Protects against high jitter
• Algorithm only as good as clock granularity (500 ms on Unix)
• Accurate timeout mechanism important to congestion control
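A sketch of the Jacobson/Karels computation with the α and β values from the slide (again floating point for clarity):

/* Jacobson/Karels: smooth both the RTT and its mean deviation, and set
 * the timeout from their combination. */
struct rtt_jk {
    double est_rtt;    /* smoothed RTT (alpha = 0.875) */
    double deviation;  /* smoothed mean deviation (beta = 0.25) */
};

void rtt_jk_sample(struct rtt_jk *r, double sample_rtt) {
    double diff = sample_rtt - r->est_rtt;
    if (diff < 0) diff = -diff;                          /* |difference| */
    r->est_rtt   = 0.875 * r->est_rtt + 0.125 * sample_rtt;
    r->deviation = 0.75  * r->deviation + 0.25 * diff;
}

double rtt_jk_timeout(const struct rtt_jk *r) {
    return r->est_rtt + 4.0 * r->deviation;   /* TimeOut = EstRTT + 4 x Deviation */
}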
Remote Procedure Call Outline Protocol Stack
Remote Procedure Call (RPC) • Central premise: apply the familiar to the unfamiliar – programmers understand procedure calls – programmers do not understand client-server interactions
• Idea – translate some calls into server request messages – wait for reply, return reply value – transparent to programmer!
• Transport protocol matching the needs of applications involved in a request / reply message exchange 53
RPC Timeline
[Figure: the client sends a Request and blocks; the server, which was blocked waiting, computes and sends a Reply; the client resumes while the server blocks again awaiting the next request]
Remote Procedure Call (RPC) • More than just a protocol: popular mechanism for structuring distributed systems • More complicated than local procedure calls – Complex network between calling and called process – Different architecture and data representation formats
• Complete RPC mechanism has two components – Dealing with undesirable properties of network – Packaging arguments into messages and vice-versa (stub compiler) 55
Remote Procedure Call Semantics write_check (CASE, 13500, “tuition payment”); • your machine looks up the bank server • asks the server about the checking service • receives port number for checking service • packages up arguments with type write_check • tags package (message) with your identification • sends request to checking service on bank server (potentially as several UDP packets) • waits for a reply 56
Remote Procedure Call Semantics write_check (CASE, 13500, “tuition payment”); • bank’s checking service receives request from your home machine (possibly reassembling UDP packets) • verifies your identity (possibly via RPC to an authority) • identifies request as a write_check request • unpackages arguments, possibly converting data to another format • calls write_check handler function with arguments 57
Remote Procedure Call Semantics write_check (CASE, 13500, “tuition payment”); • At bank’s checking service, RPC layer receives return value from handler function • packages up arguments with type write_check_reply • tags package (message) with your bank’s identification and unique ID for previous request • sends reply to your machine (potentially as several UDP packets) 58
Remote Procedure Call Semantics write_check (CASE, 13500, “tuition payment”); • your machine receives reply from bank (possibly reassembling UDP packets) • verifies bank’s identity (possibly via RPC to an authority) • identifies reply as a write_check_reply • finds corresponding request • unpackages return value, possibly converting data to another format • returns value to caller 59
RPC Components
• Protocol Stack
  – BLAST: fragments and reassembles large messages
  – CHAN: synchronizes request and reply messages
  – SELECT: dispatches request to the correct process
• Stubs
[Figure: the caller (client) passes arguments to the client stub, which builds a request for the RPC protocol; the RPC protocol carries the request to the server, where the server stub unpacks the arguments for the callee; the return value travels back through the stubs as the reply]
Bulk Transfer (BLAST) • Unlike AAL and IP, tries to recover from lost fragments • Strategy – selective retransmission – aka partial acknowledgements
• Sender: – after sending all fragments, set timer DONE – if receive SRR, send missing fragments and reset DONE – if timer DONE expires, free fragments
[Figure: the sender transmits fragments 1 through 6; fragments 3 and 5 do not arrive, so the receiver's SRR requests them; the sender retransmits fragments 3 and 5 and the receiver responds with a final SRR]
BLAST: Receiver Details • When first fragments arrives, set timer LAST_FRAG • When all fragments present, reassemble and pass up • Four exceptional conditions: – if last fragment arrives but message not complete • send SRR and set timer RETRY – if timer LAST_FRAG expires • send SRR and set timer RETRY – if timer RETRY expires for first or second time • send SRR and set timer RETRY – if timer RETRY expires a third time • give up and free partial message 62
BLAST Header Format
• MID must protect against wrap around
• TYPE = DATA or SRR
• NumFrags indicates number of fragments
• FragMask distinguishes among fragments
  – if Type=DATA, identifies this fragment
  – if Type=SRR, identifies missing fragments
[Header layout (32-bit words): ProtNum; MID; Length; NumFrags | Type; FragMask; then Data]
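To make the FragMask usage concrete, here is a hedged sketch (names hypothetical; it assumes at most 32 fragments) of how a receiver could track arrivals and build the mask for an SRR:

#include <stdint.h>

/* Sketch: one bit per fragment; DATA fragments set their bit, and an SRR's
 * FragMask reports which of the NumFrags fragments are still missing. */
typedef struct {
    uint32_t received;    /* bit i set => fragment i+1 has arrived */
    uint16_t num_frags;   /* NumFrags from the DATA headers */
} blast_rcv;

static uint32_t all_frags(const blast_rcv *r) {
    return (r->num_frags >= 32) ? 0xffffffffu
                                : (((uint32_t)1 << r->num_frags) - 1);
}

void blast_record(blast_rcv *r, uint16_t frag_no /* 1-based */) {
    r->received |= (uint32_t)1 << (frag_no - 1);
}

int blast_complete(const blast_rcv *r) {
    return (r->received & all_frags(r)) == all_frags(r);
}

uint32_t blast_srr_mask(const blast_rcv *r) {   /* FragMask for an SRR */
    return all_frags(r) & ~r->received;
}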
Request / Reply (CHAN)
• Guarantees message delivery
• Synchronizes client with server
• Supports at-most-once semantics
[Figure: simple case, the client sends a Request, the server ACKs it, the server sends a Reply, and the client ACKs that. With implicit ACKs, Reply 1 acknowledges Request 1 and Request 2 acknowledges Reply 1, so no explicit ACKs are needed]
CHAN Details • Lost message (request, reply, or ACK) – set RETRANSMIT timer – use message id (MID) field to distinguish
• Slow (long running) server – client periodically sends “are you alive” probe, or – server periodically sends “I’m alive” notice
• Want to support multiple outstanding calls – use channel id (CID) field to distinguish
• Machines crash and reboot – use boot id (BID) field to distinguish 65
CHAN Header Format

typedef struct {
    u_short Type;      /* REQ, REP, ACK, PROBE */
    u_short CID;       /* unique channel id */
    int     MID;       /* unique message id */
    int     BID;       /* unique boot id */
    int     Length;    /* length of message */
    int     ProtNum;   /* high-level protocol */
} ChanHdr;

typedef struct {
    u_char    type;       /* CLIENT or SERVER */
    u_char    status;     /* BUSY or IDLE */
    int       retries;    /* number of retries */
    int       timeout;    /* timeout value */
    XkReturn  ret_val;    /* return value */
    Msg      *request;    /* request message */
    Msg      *reply;      /* reply message */
    Semaphore reply_sem;  /* client semaphore */
    int       mid;        /* message id */
    int       bid;        /* boot id */
} ChanState;
Synchronous vs Asynchronous Protocols
• Asynchronous interface
  xPush(Sessn s, Msg *msg)
  xPop(Sessn s, Msg *msg, void *hdr)
  xDemux(Protl hlp, Sessn s, Msg *msg)
• Synchronous interface
  xCall(Sessn s, Msg *req, Msg *rep)
  xCallPop(Sessn s, Msg *req, Msg *rep, void *hdr)
  xCallDemux(Protl hlp, Sessn s, Msg *req, Msg *rep)
• CHAN is a hybrid protocol
  – synchronous from above: xCall
  – asynchronous from below: xPop/xDemux
chanCall(Sessn self, Msg *msg, Msg *rmsg) {
    ChanState *state = (ChanState *)self->state;
    ChanHdr   *hdr;
    char      *buf;

    /* ensure only one transaction per channel */
    if ((state->status != IDLE))
        return XK_FAILURE;
    state->status = BUSY;

    /* save copy of req msg and ptr to rep msg */
    msgConstructCopy(&state->request, msg);
    state->reply = rmsg;

    /* fill out header fields */
    hdr = state->hdr_template;
    hdr->Length = msgLen(msg);
    if (state->mid == MAX_MID)
        state->mid = 0;
    hdr->MID = ++state->mid;

    /* attach header to msg and send it */
    buf = msgPush(msg, HDR_LEN);
    chan_hdr_store(hdr, buf, HDR_LEN);
    xPush(xGetDown(self, 0), msg);

    /* schedule first timeout event */
    state->retries = 1;
    state->event = evSchedule(retransmit, self, state->timeout);

    /* wait for the reply msg */
    semWait(&state->reply_sem);

    /* clean up state and return */
    flush_msg(state->request);
    state->status = IDLE;
    return state->ret_val;
}
retransmit(Event ev, int *arg) {
    Sessn      s = (Sessn)arg;
    ChanState *state = (ChanState *)s->state;
    Msg        tmp;

    /* see if event was cancelled */
    if (evIsCancelled(ev))
        return;

    /* unblock client if we've retried 4 times */
    if (++state->retries > 4) {
        state->ret_val = XK_FAILURE;
        semSignal(&state->reply_sem);
        return;
    }

    /* retransmit request message */
    msgConstructCopy(&tmp, &state->request);
    xPush(xGetDown(s, 0), &tmp);

    /* reschedule event with exponential backoff */
    evDetach(state->event);
    state->timeout = 2 * state->timeout;
    state->event = evSchedule(retransmit, s, state->timeout);
}
chanPop(Sessn self, Sessn lls, Msg *msg, void *inHdr) {
    /* see if this is a CLIENT or SERVER session */
    if (self->state->type == SERVER)
        return (chanServerPop(self, lls, msg, inHdr));
    else
        return (chanClientPop(self, lls, msg, inHdr));
}
chanClientPop(Sessn self, Sessn lls, Msg *msg, void *inHdr) {
    ChanState *state = (ChanState *)self->state;
    ChanHdr   *hdr = (ChanHdr *)inHdr;

    /* verify correctness of msg header */
    if (!clnt_msg_ok(state, hdr))
        return XK_FAILURE;

    /* cancel retransmit timeout event */
    evCancel(state->event);

    /* if ACK, then schedule PROBE and exit */
    if (hdr->Type == ACK) {
        state->event = evSchedule(probe, self, PROBE);
        return XK_SUCCESS;
    }

    /* save reply and signal client */
    msgAssign(state->reply, msg);
    state->ret_val = XK_SUCCESS;
    semSignal(&state->reply_sem);
    return XK_SUCCESS;
}
Dispatcher (SELECT)
• Dispatch to appropriate procedure
• Synchronous counterpart to UDP
• Address space for procedures
  – flat: unique id for each possible procedure
  – hierarchical: program + procedure number
[Figure: on the client, the caller invokes xCall on SELECT, which invokes xCall on CHAN, which uses xPush/xDemux below it; on the server, CHAN delivers the request upward via xCallDemux to SELECT, which dispatches it via xCallDemux to the callee]
Example Code

Client side:
static XkReturn
selectCall(Sessn self, Msg *req, Msg *rep) {
    SelectState *state = (SelectState *)self->state;
    char *buf;
    buf = msgPush(req, HLEN);
    select_hdr_store(state->hdr, buf, HLEN);
    return xCall(xGetDown(self, 0), req, rep);
}

Server side:
static XkReturn
selectCallPop(Sessn s, Sessn lls, Msg *req, Msg *rep, void *inHdr) {
    return xCallDemux(xGetUp(s), s, req, rep);
}
Simple RPC Stack (top to bottom): SELECT, CHAN, BLAST, IP, ETH
VCHAN: A Virtual Protocol

static XkReturn
vchanCall(Sessn s, Msg *req, Msg *rep) {
    Sessn chan;
    XkReturn result;
    VchanState *state = (VchanState *)s->state;

    /* wait for an idle channel */
    semWait(&state->available);
    chan = state->stack[--state->tos];

    /* use the channel */
    result = xCall(chan, req, rep);

    /* free the channel */
    state->stack[state->tos++] = chan;
    semSignal(&state->available);
    return result;
}
SunRPC
• IP implements BLAST-equivalent
  – except no selective retransmit
• SunRPC implements CHAN-equivalent
  – except not at-most-once
• UDP + SunRPC implement SELECT-equivalent
  – UDP dispatches to program (ports bound to programs)
  – SunRPC dispatches to procedure within program
[Protocol stack, top to bottom: SunRPC, UDP, IP, ETH]
SunRPC Header Format
• XID (transaction id) is similar to CHAN's MID
• Server does not remember last XID it serviced
• Problem if client retransmits request while reply is in transit
[Call header: XID; MsgType = CALL; RPCVersion = 2; Program; Version; Procedure; Credentials (variable); Verifier (variable); Data. Reply header: XID; MsgType = REPLY; Status = ACCEPTED; Data]