Date: Wed, 19 Feb 1992 18:13 EST
The following concerns some problems with Symbolics' implementation of
the ICMP portion of IP-TCP. I'm looking for anyone who might have had (and
solved) similar problems.
The code with problems is in the file SYS:IP-TCP;ICMP.LISP.2624 and is the
implementation of such things as ICMP-SEND-ECHO (the host pinger) and
SEND-MISLEADING-REDIRECT (an answer to a ping that says try another place
:SEND-MISLEADING-REDIRECT is completely unused. I think it's some kind
of debugging tool, to allow you to manually send an ICMP Redirect
message. What did you base your description of it on?
ICMP messages are sent with a sequence number that comes back
in the reply. The Symbolics implementation increments this number without
inhibiting process switching. Ergo, it is possible for two processes on
the same host to originate pings with the same sequence number.
Yes, that would be a problem if you were pinging simultaneously from two
processes. Why would you do that?
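To make the race concrete, here is a minimal sketch in portable Common Lisp that replays one unlucky schedule by hand. It is not the Symbolics source: `*sequence*`, `next-sequence`, and `simulate-interleaved-pings` are hypothetical names, and the process switch is simulated by ordering the reads and writes explicitly rather than by running real processes.

```lisp
;; Hypothetical names throughout; this is not the Symbolics source.
(defvar *sequence* 0
  "Stand-in for the shared ICMP echo sequence counter.")

(defun next-sequence (seq)
  "Return the sequence number after SEQ, wrapping from 65535 to 0."
  (if (< seq 65535) (1+ seq) 0))

(defun simulate-interleaved-pings ()
  "Replay one unlucky schedule: both pingers read the counter before
either writes it back. Return the two sequence numbers they end up using."
  (setq *sequence* 41)
  (let ((seen-by-a *sequence*)                    ; process A reads 41
        (seen-by-b *sequence*))                   ; switch: B also reads 41
    (setq *sequence* (next-sequence seen-by-a))   ; A stores 42
    (let ((a-seq *sequence*))
      (setq *sequence* (next-sequence seen-by-b)) ; B stores 42 again
      (list a-seq *sequence*))))
```

Calling `(simulate-interleaved-pings)` returns a list whose two elements are equal, which is the duplicate-sequence-number case described above. Making the read and the write a single uninterruptible step removes that schedule.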
Actually, a somewhat more serious synchronization problem is in the
:ECHO-REPLY and :GET-ECHO-REPLY methods. It's possible for *both*
pingers to see the first reply and both of them would then try to remove
it from the list of replies, possibly screwing up royally due to the use
of the destructive DELETE function. In fact, this problem exists even
without the common sequence numbers; it could happen any time two ping
responses are received close together.
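One way to picture the fix is a "claim" operation that checks for a reply and removes it in a single step, so that only the first pinger to ask gets it. The sketch below is an illustration under assumptions, not the real method: `*replies*` and `claim-reply` are hypothetical names, the replies are reduced to bare sequence numbers, and it swaps in the non-destructive REMOVE where the original uses DELETE. Portable Common Lisp has no way to inhibit process switching, so the caller is assumed to wrap the call in whatever the host system provides for that.

```lisp
;; Hypothetical names; replies are reduced to bare sequence numbers.
(defvar *replies* '()
  "Stand-in for the list of received echo replies.")

(defun claim-reply (sequence)
  "Remove SEQUENCE from *REPLIES* and return T, or return NIL if it is
not there (for example because another pinger already claimed it).
Assumed to run with process switching inhibited by the caller."
  (when (member sequence *replies*)
    ;; REMOVE builds a new list instead of rewriting cons cells that
    ;; another process may still be walking.
    (setq *replies* (remove sequence *replies* :count 1))
    t))
```

With `*replies*` holding 7 and 8, the first `(claim-reply 7)` succeeds and a second one returns NIL, leaving only 8 on the list.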
Further, the receipt of a ping only looks to see if the same sequence number
is in a return packet. This causes a problem since the answer to a ping
might be a SEND-MISLEADING-REDIRECT from a host other than the one that
was pinged. In other words, the Symbolics receiver ignores message type.
That's completely wrong. Look at (FLAVOR:METHOD :RECEIVE-IP-PACKET
ICMP-PROTOCOL). After validating the checksum, the first thing it does
is dispatch on the message type.
What's true is that it ignores the source address, so if you're pinging
two hosts simultaneously, responses from one host might be received by
the pinger of the other host.
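A receiver that also checks the source address would match each reply against both the host that was pinged and the sequence number. The following is a minimal sketch of that idea in portable Common Lisp; `*outstanding*`, `note-echo-sent`, and `match-echo-reply` are hypothetical names, and addresses are shown as strings purely for illustration.

```lisp
;; Hypothetical names; addresses are plain strings for illustration only.
(defvar *outstanding* '()
  "List of (destination-address . sequence) pairs awaiting a reply.")

(defun note-echo-sent (destination sequence)
  "Record that an echo request with SEQUENCE was sent to DESTINATION."
  (push (cons destination sequence) *outstanding*))

(defun match-echo-reply (source sequence)
  "Return the outstanding entry answered by a reply from SOURCE that
carries SEQUENCE, or NIL if no ping to that host used that number."
  (find-if #'(lambda (entry)
               (and (equal (car entry) source)
                    (eql (cdr entry) sequence)))
           *outstanding*))
```

After `(note-echo-sent "10.0.0.1" 5)`, a reply with sequence 5 from "10.0.0.2" matches nothing, while the same reply from "10.0.0.1" matches the recorded entry.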
For both of the above reasons, I am getting positive responses to pings from
hosts that haven't been powered up in months.
I don't see how duplicate sequence numbers can have this result. Not
checking the ICMP type could, but it doesn't have that bug.
It seems that the proper fix is to (1) encapsulate the sequence number
incrementation against process switching and (2) make the receiver use
redirect data to update route tables (optional) but NIL the ping response.
It does (2); here's the section of (FLAVOR:METHOD :RECEIVE-IP-PACKET
ICMP-PROTOCOL) that does it:
    (5 (send network :icmp-redirect source
             (neti:get-sub-packet icmp 'sys:art-8b icmp-size)
             (load-internet-address icmp 4)
             (case (icmp-code icmp)
               ((0 2) nil)
               ((1 3) t))))
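For readers without RFC 792 at hand: ICMP type 5 is Redirect, and its code field says whether the redirect applies to a whole network (codes 0 and 2) or to a single host (codes 1 and 3). The CASE above groups the codes the same way, which suggests its NIL/T result tells the network object which kind of redirect it received; that reading is an assumption. A small sketch of the mapping, with `redirect-scope` as a hypothetical name:

```lisp
;; Hypothetical helper; the code values are those defined by RFC 792.
(defun redirect-scope (code)
  "Return :NETWORK or :HOST for an ICMP Redirect CODE, or NIL if CODE
is not one of 0 through 3."
  (case code
    ((0 2) :network)   ; 0 = network, 2 = type of service and network
    ((1 3) :host)))    ; 1 = host,    3 = type of service and host
```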
A. Does anybody know whether Symbolics has dealt (or will deal) with these problems?
Only Symbolics can answer that.
B. Has anybody else fixed this stuff? (If so, will you share?)
C. If answers to A and B are negative, I will try to fix this myself.
Does anyone know the code well enough to guess that the only code that
needs to be fixed is in the file SYS:IP-TCP;ICMP.LISP.2624?
I just implemented the necessary fixes, and they seem to work (but I
wasn't able to get the original code to fail -- I guess my 3650 doesn't
process switch often enough to lose). In (FLAVOR:METHOD :ICMP-SEND-ECHO
(if (< *icmp-echo-sequence* 65535)
(setq *icmp-echo-sequence* 0))
(if (< x 65535)
and replace "push" with "process:atomic-push".
And in (FLAVOR:METHOD :GET-ECHO-REPLY ICMP-PROTOCOL), replace:
(setq echoes-outstanding (delete echo echoes-outstanding))
#'(lambda (x) (delete echo x)))
Since these changes guarantee that sequence numbers are unique (unless
you send 64K pings, so that the sequence number wraps around, before
looking for any replies), the problem of not checking the source address
If you'd like, I also have changes to the ICMP echo code that return
the response time in microseconds (the original code records the
reception time in 60ths of a second, but doesn't return that information
to the caller). I also have a Ping CP command that is similar to the
Unix ping command.