IP(3)                      Library Functions Manual                      IP(3)



NAME
       ip,  esp,  gre, icmp, icmpv6, ipmux, rudp, tcp, udp - network protocols
       over IP

SYNOPSIS
       bind -a #Ispec /net
       /net/ipifc
       /net/ipifc/clone
       /net/ipifc/stats
       /net/ipifc/n
       /net/ipifc/n/status
       /net/ipifc/n/ctl
       ...
       /net/arp
       /net/bootp
       /net/iproute
       /net/ipselftab
       /net/log
       /net/ndb
       /net/esp
       /net/gre
       /net/icmp
       /net/icmpv6
       /net/ipmux
       /net/rudp
       /net/tcp
       /net/udp
       /net/tcp/clone
       /net/tcp/stats
       /net/tcp/n
       /net/tcp/n/data
       /net/tcp/n/ctl
       /net/tcp/n/local
       /net/tcp/n/remote
       /net/tcp/n/status
       /net/tcp/n/listen
       ...

DESCRIPTION
       The ip device provides the interface to Internet Protocol stacks.  Spec
       is  an integer from 0 to 15 identifying a stack.  Each stack implements
       IPv4 and IPv6.  Each stack is independent of all others: the  only  in‐
       formation  transfer  between  them  is via programs that mount multiple
       stacks.  Normally a system  uses  only  one  stack.   However  multiple
       stacks  can be used for debugging new IP networks or implementing fire‐
       walls or proxy services.

       All addresses used are 16-byte IPv6 addresses.  IPv4  addresses  are  a
       subset  of  the  IPv6 addresses and both standard ASCII formats are ac‐
       cepted.  In binary representation, all v4 addresses start with  the  12
       bytes, in hex:

              00 00 00 00 00 00 00 00 00 00 ff ff
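
       The mapping above can be sketched in portable C.  This is only an
       illustration: the names v4_to_v6 and is_v4 are hypothetical and are
       not the library routines described in ip(2).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { IPaddrlen = 16, IPv4addrlen = 4, IPv4off = 12 };

/* The 12-byte prefix that marks an IPv4 address stored in 16 bytes. */
static const unsigned char v4prefix[IPv4off] = {
	0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xff, 0xff
};

/* Expand a 4-byte IPv4 address into the 16-byte form. */
void
v4_to_v6(unsigned char v6[IPaddrlen], const unsigned char v4[IPv4addrlen])
{
	assert(v6 != NULL && v4 != NULL);
	memcpy(v6, v4prefix, IPv4off);
	memcpy(v6 + IPv4off, v4, IPv4addrlen);
}

/* Return 1 if the 16-byte address carries an IPv4 address, else 0. */
int
is_v4(const unsigned char v6[IPaddrlen])
{
	assert(v6 != NULL);
	return memcmp(v6, v4prefix, IPv4off) == 0;
}
```

       For example, 192.168.1.2 becomes twelve prefix bytes followed by
       c0 a8 01 02.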

   Configuring interfaces
       Each  stack  may  have  multiple interfaces and each interface may have
       multiple addresses.  The /net/ipifc directory contains a clone file,  a
       stats file, and numbered subdirectories for each physical interface.

       Opening  the clone file reserves an interface.  The file descriptor re‐
       turned from the open(2) will point to the control  file,  ctl,  of  the
       newly  allocated  interface.   Reading ctl returns a text string repre‐
       senting the number of the interface.  Writing ctl alters aspects of the
       interface.   The possible ctl messages are those described under Proto‐
       col directories below and these:

       bind ether path
              Treat the device mounted at path as an Ethernet medium  carrying
              IP  and  ARP  packets and associate it with this interface.  The
              kernel will dial(2) path!0x800, path!0x806 and  path!0x86dd  and
              use the connections for IPv4, ARP and IPv6 respectively.

       bind pkt
              Treat  this interface as a packet interface.  Assume a user pro‐
              gram will read and write the data file to receive  and  transmit
              IP  packets  to  the  kernel.   This is used by programs such as
              ppp(8) to mediate IP packet transfer between the  kernel  and  a
              PPP encoded device.

       bind netdev path
              Treat  this  interface  as  a packet interface.  The kernel will
              open path and read and write the resulting  file  descriptor  to
              receive and transmit IP packets.

       bind loopback
              Treat  this  interface as a local loopback.  Anything written to
              it will be looped back.

       unbind Disassociate the physical device from an IP interface.

       add local [ mask remote mtu proxy ]
       try local [ mask remote mtu proxy ]
              Add a local IP address to the interface.  Try adds the local ad‐
              dress as a tentative address if it's an IPv6 address.  The mask,
              remote, mtu, and proxy arguments are all optional.  The  default
              mask  is  the class mask for the local address.  The default re‐
              mote address is local ANDed with mask.  The default mtu (maximum
              transmission  unit) is 1514 for Ethernet and 4096 for packet me‐
              dia.  The mtu is the size in bytes of the  largest  packet  that
              this  interface  can send.  Proxy, if specified, means that this
              machine should answer  ARP  requests  for  the  remote  address.
              Ppp(8)  does this to make remote machines appear to be connected
              to the local Ethernet.

       remove local mask
              Remove a local IP address from an interface.

       mtu n  Set the maximum transfer unit for this device to n.  The mtu  is
              the  maximum  size  of  the packet including any medium-specific
              headers.

       reassemble
              Reassemble IP fragments before forwarding to this interface.

       iprouting n
              Allow (n is missing or non-zero) or disallow (n is 0) forwarding
              packets between this interface and others.

       bridge Enable bridging (see bridge(3)).

       promiscuous
              Set  the  interface into promiscuous mode, which makes it accept
              all incoming packets, whether addressed to it or not.

       connect type
              Mark the Ethernet packet type as being in use, if  not  already
              in  use on this interface.  A type of -1 means `all' but appears
              to be a no-op.

       addmulti Media-addr
              Treat the multicast Media-addr on this interface as a local  ad‐
              dress.

       remmulti Media-addr
              Remove the multicast address Media-addr from this interface.

       scanbs Make the wireless interface scan for base stations.

       headersonly
              Set the interface to pass only packet headers, not data too.

       add6 v6addr pfx-len [onlink auto validlt preflt]
              Add  the local IPv6 address v6addr with prefix length pfx-len to
              this interface.  See RFC 2461 §6.2.1 for more detail.  The  re‐
              maining arguments are optional:

              onlink flag: address is `on-link'

              auto   flag: autonomous

              validlt
                     valid life-time in seconds

              preflt preferred life-time in seconds

       ra6 keyword value ...
              Set  IPv6  router  advertisement (RA) parameter keyword's value.
              Known keywords and the meanings of their values follow.  See RFC
              2461 §6.2.1 for more detail.  Flags are true iff non-zero.

              recvra flag: receive and process RAs.

              sendra flag: generate and send RAs.

              mflag  flag: ``Managed address configuration'', goes into RAs.

              oflag  flag: ``Other stateful configuration'', goes into RAs.

              maxraint
                     ``maximum time allowed between sending unsolicited multi‐
                     cast'' RAs from the interface, in ms.

              minraint
                     ``minimum time allowed between sending unsolicited multi‐
                     cast'' RAs from the interface, in ms.

              linkmtu
                     ``value to be placed in MTU options sent by the router.''
                     Zero indicates none.

              reachtime
                     sets the Reachable Time field in RAs sent by the  router.
                     ``Zero means unspecified (by this router).''

              rxmitra
                     sets  the  Retrans Timer field in RAs sent by the router.
                     ``Zero means unspecified (by this router).''

              ttl    default value of the Cur Hop Limit field in RAs  sent  by
                     the  router.   Should be set to the ``current diameter of
                     the  Internet.''   ``Zero  means  unspecified  (by   this
                     router).''

              routerlt
                     sets  the  Router Lifetime field of RAs sent from the in‐
                     terface, in ms.  Zero means the router is not to be  used
                     as a default router.
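
       The defaults that the add message applies to an IPv4 address can be
       sketched as follows.  This is a minimal illustration that holds
       addresses as host-order 32-bit values; the names class_mask and
       add_defaults are hypothetical, and the all-ones mask returned for
       class D and E addresses is an assumption, not taken from this page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Classful mask for an IPv4 address held as a host-order 32-bit value. */
uint32_t
class_mask(uint32_t addr)
{
	uint32_t top = addr >> 24;

	if (top < 128)
		return 0xff000000u;	/* class A */
	if (top < 192)
		return 0xffff0000u;	/* class B */
	if (top < 224)
		return 0xffffff00u;	/* class C */
	return 0xffffffffu;		/* assumption: host mask for class D/E */
}

/* Fill in the defaults used when "add local" omits mask and remote. */
void
add_defaults(uint32_t local, uint32_t *mask, uint32_t *remote)
{
	assert(mask != NULL && remote != NULL);
	*mask = class_mask(local);
	*remote = local & *mask;	/* local ANDed with mask */
}
```

       With local 192.168.1.2 this gives mask 255.255.255.0 and remote
       192.168.1.0.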

       Reading  the  interface's status file returns information about the in‐
       terface, one line for each local address on that interface.  The  first
       line  has  9  white-space-separated fields: device, mtu, local address,
       mask, remote or network address, packets in, packets out, input errors,
       output  errors.   Each  subsequent line contains all but the device and
       mtu.  See readipifc in ip(2).
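
       A program that prefers not to use readipifc could split the first
       status line itself.  The sketch below assumes only the nine
       white-space-separated fields listed above; the Ifcstatus structure
       and parse_status are hypothetical names, and the addresses and mask
       are kept as text because their exact spelling is not specified here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

typedef struct Ifcstatus Ifcstatus;
struct Ifcstatus
{
	char	dev[64];	/* device */
	int	mtu;
	char	local[64];	/* local address */
	char	mask[64];
	char	remote[64];	/* remote or network address */
	unsigned long	pktin, pktout, errin, errout;
};

/* Parse the first line of a status file; return 0 on success, -1 otherwise. */
int
parse_status(const char *line, Ifcstatus *s)
{
	int n;

	assert(line != NULL && s != NULL);
	n = sscanf(line, "%63s %d %63s %63s %63s %lu %lu %lu %lu",
		s->dev, &s->mtu, s->local, s->mask, s->remote,
		&s->pktin, &s->pktout, &s->errin, &s->errout);
	return n == 9 ? 0 : -1;
}
```

       Subsequent lines omit the device and mtu, so they would need a
       seven-field variant of the same format string.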

   Routing
       The file iproute controls information about IP routing.  When read,  it
       returns  one  line  per  routing  entry.  Each line contains six white-
       space-separated fields: target address, target mask,  address  of  next
       hop,  flags,  tag, and interface number.  The entry used for routing an
       IP packet is the one with the longest mask for  which  destination  ad‐
       dress  ANDed with target mask equals the target address.  The one-char‐
       acter flags are:

       4      IPv4 route

       6      IPv6 route

       i      local interface

       b      broadcast address

       u      local unicast address

       m      multicast route

       p      point-to-point route

       The tag is an arbitrary, up to 4 character,  string.   It  is  normally
       used to indicate what routing protocol originated the route.
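
       The selection rule described above (longest mask for which the
       destination ANDed with the mask equals the target) can be sketched
       as a linear scan over 16-byte addresses.  This is an illustration
       only, not the kernel's data structure; Route and route_lookup are
       hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

enum { IPaddrlen = 16 };

typedef struct Route Route;
struct Route
{
	unsigned char	target[IPaddrlen];
	unsigned char	mask[IPaddrlen];
};

/* Number of one bits in a mask; a longer mask has more of them. */
static int
mask_len(const unsigned char mask[IPaddrlen])
{
	int i, n = 0;
	unsigned char b;

	for (i = 0; i < IPaddrlen; i++)
		for (b = mask[i]; b != 0; b &= b - 1)
			n++;
	return n;
}

/* Return 1 if dst ANDed with the route's mask equals its target. */
static int
route_matches(const Route *r, const unsigned char dst[IPaddrlen])
{
	int i;

	for (i = 0; i < IPaddrlen; i++)
		if ((dst[i] & r->mask[i]) != r->target[i])
			return 0;
	return 1;
}

/* Return the index of the matching route with the longest mask, or -1. */
int
route_lookup(const Route *tab, int n, const unsigned char dst[IPaddrlen])
{
	int i, len, best = -1, bestlen = -1;

	assert(tab != NULL && dst != NULL);
	for (i = 0; i < n; i++) {
		if (!route_matches(&tab[i], dst))
			continue;
		len = mask_len(tab[i].mask);
		if (len > bestlen) {
			best = i;
			bestlen = len;
		}
	}
	return best;
}
```

       A default route (all-zero target and mask) matches every
       destination but loses to any more specific entry that also matches.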

       Writing to /net/iproute changes the route table.  The messages are:

       flush  Remove all routes.

       tag string
              Associate  the tag, string, with all subsequent routes added via
              this file descriptor.

       add target mask nexthop
              Add the route to the table.  If one already exists with the same
              target and mask, replace it.

       remove target mask
              Remove a route with a matching target and mask.

       route target
              Print  on the console the route to address target, if any.  Pri‐
              marily a debugging aid.

   Address resolution
       The file /net/arp controls information about address  resolution.   The
       kernel  automatically updates the v4 ARP and v6 Neighbour Discovery in‐
       formation for Ethernet interfaces.  When read,  the  file  returns  one
       line per address containing the type of medium, the status of the entry
       (OK, WAIT), the  IP  address,  and  the  medium  address.   Writing  to
       /net/arp administers the ARP information.  The control messages are:

       flush  Remove all entries.

       add type IP-addr Media-addr
              Add an entry or replace an existing one for the same IP address.

       del IP-addr
              Delete an individual entry.

       ARP  entries do not time out.  The ARP table is a cache with an LRU re‐
       placement policy.  The IP stack listens for all ARP  requests  and,  if
       the  requester is in the table, the entry is updated.  Also, whenever a
       new address is configured onto an Ethernet, an ARP request is  sent  to
       help update the table on other systems.

       Currently, the only medium type is ether.

   Debugging and stack information
       If  any process is holding /net/log open, the IP stack queues debugging
       information to it.  This is intended primarily  for  debugging  the  IP
       stack.   The  information  provided  is implementation-defined; see the
       source for details.  Generally, what  is  returned  is  error  messages
       about bad packets.

       Writing to /net/log controls debugging.  The control messages are:

       set arglist
              Arglist  is  a space-separated list of items for which to enable
              debugging.  The possible items are: ppp, ip, fs, tcp, icmp, udp,
              compress, gre, tcpwin, tcprxmt, udpmsg, ipmsg, and esp.

       clear arglist
              Arglist  is a space-separated list of items for which to disable
              debugging.

       only addr
              If addr is non-zero, restrict debugging to  only  those  packets
              whose source or destination is that address.

       The  file  /net/ndb can be read or written by programs.  It is normally
       used by ipconfig(8) to leave configuration information for  other  pro‐
       grams such as dns and cs (see ndb(8)).  /net/ndb may contain up to 1024
       bytes.

       The file /net/ipselftab is a read-only file containing all the  IP  ad‐
       dresses  considered local.  Each line in the file contains three white-
       space-separated fields: IP address, usage count, and flags.  The  usage
       count  is  the  number of interfaces to which the address applies.  The
       flags are the same as for routing entries.  Note that the `IPv4  route'
       flag will never be set.

   Protocol directories
       The  ip  device  supports IP as well as several protocols that run over
       it: TCP, UDP, RUDP, ICMP, GRE, and ESP.  TCP and UDP provide the  stan‐
       dard  Internet  protocols  for  reliable stream and unreliable datagram
       communication.  RUDP is a locally-developed reliable datagram  protocol
       based on UDP.  ICMP is IP's catch-all control protocol used to send low
       level error messages and to implement ping(8).  GRE is a general encap‐
       sulation  protocol.   ESP  is the encapsulation protocol for IPsec.  IL
       provided a reliable datagram service for communication between  Plan  9
       machines over IPv4, but is no longer part of the system.

       Each  protocol is a subdirectory of the IP stack.  The top level direc‐
       tory of each protocol contains a clone file, a stats file,  and  subdi‐
       rectories  numbered  from  zero to the number of connections opened for
       this protocol.

       Opening the clone file reserves a connection.  The file descriptor  re‐
       turned  from  the  open(2)  will point to the control file, ctl, of the
       newly allocated connection.  Reading ctl returns a text  string  repre‐
       senting  the  number of the connection.  Connections may be used either
       to listen for incoming calls or to initiate calls to other machines.

       A connection is controlled by writing text strings  to  the  associated
       ctl  file.   After  a  connection has been established data may be read
       from and written to data.  A connection can be actively established us‐
       ing the connect message (see also dial(2)).  A connection can be estab‐
       lished passively by first using an announce message  (see  dial(2))  to
       bind  to a local port and then opening the listen file (see dial(2)) to
       receive incoming calls.

       The following control messages are supported:

       connect ip-address!port!r local
              Establish a connection to the remote ip-address  and  port.   If
              local is specified, it is used as the local port number.  If lo‐
              cal is not specified but !r is, the system will allocate  a  re‐
              stricted  port number (less than 1024) for the connection to al‐
              low communication with Unix login and exec services.   Otherwise
              a  free  port  number  starting  at 5000 is chosen.  The connect
              fails if the combination of local and remote address/port  pairs
              is already assigned to another port.

       announce X
              X is a decimal port number or *.  Set the local port number to X
              and accept calls to X.  If X is *, accept calls for any port
              that no process has explicitly announced.  The local IP address
              cannot be set.  Announce fails if the connection is already
              announced or connected.

       bind X X is a decimal port number or *.  Set the local port number to X.
              This exists to support emulation of BSD sockets by the APE
              libraries (see pcc(1)) and is not otherwise used.

       ttl n  Set the time to live IP field in outgoing packets to n.

       tos n  Set the service type IP field in outgoing packets to n.

       ignoreadvice
              Don't break (UDP) connections because of ICMP errors.

       addmulti ifc-ip [ mcast-ip ]
              Treat ifc-ip on this multicast interface as a local address.  If
              mcast-ip is present, use it as  the  interface's  multicast  ad‐
              dress.

       remmulti ip
              Remove the address ip from this multicast interface.

       Port numbers must be in the range 1 to 32767.

       Several  files report the status of a connection.  The remote and local
       files contain the IP address and port number for the remote  and  local
       side  of  the  connection.  The status file contains protocol-dependent
       information to help debug network connections.  On receiving an error
       or EOF reading or writing the data file, the err file contains the
       reason for the error.

       A process may accept incoming  connections  by  open(2)ing  the  listen
       file.   The  open  will  block  until a new connection request arrives.
       Then open will return an open file descriptor which points to the  con‐
       trol file of the newly accepted connection.  This procedure will accept
       all calls for the given protocol.  See dial(2).
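
       The text handling around clone and ctl can be sketched in portable
       C.  The helpers below only build strings: one turns the number read
       back from ctl into the path of a file in the connection directory,
       the other formats a connect message and enforces the port range
       given above.  The names conn_path and connect_msg are hypothetical;
       most programs would use dial(2) instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Build the path of a file in a connection directory from the text
 * read back from ctl, e.g. ("/net/tcp", "4", "data") gives
 * "/net/tcp/4/data".  Returns 0 on success, -1 if ctltext is not a
 * number or buf is too small.
 */
int
conn_path(char *buf, size_t len, const char *proto, const char *ctltext,
	const char *file)
{
	char *end;
	long n;
	int w;

	assert(buf != NULL && proto != NULL && ctltext != NULL && file != NULL);
	n = strtol(ctltext, &end, 10);
	if (end == ctltext || n < 0)
		return -1;
	w = snprintf(buf, len, "%s/%ld/%s", proto, n, file);
	return (w < 0 || (size_t)w >= len) ? -1 : 0;
}

/*
 * Format a connect message.  Ports must lie in 1..32767.
 * Returns the message length, or -1 on a bad port or short buffer.
 */
int
connect_msg(char *buf, size_t len, const char *ip, int port)
{
	int w;

	assert(buf != NULL && ip != NULL);
	if (port < 1 || port > 32767)
		return -1;
	w = snprintf(buf, len, "connect %s!%d", ip, port);
	return (w < 0 || (size_t)w >= len) ? -1 : w;
}
```

       A caller would write the result of connect_msg to the descriptor
       obtained by opening clone, then open the path from conn_path to
       exchange data.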

   TCP
       TCP connections are reliable point-to-point byte streams; there are  no
       message delimiters.  A connection is determined by the address and port
       numbers of the two ends.  TCP ctl files  support  the  following  addi‐
       tional messages:

       hangup close down this TCP connection

       keepalive n
              turn  on  keep alive messages.  N, if given, is the milliseconds
              between keepalives (default 30000).

       checksum n
              emit TCP checksums of zero if n is zero; otherwise, and  by  de‐
              fault, TCP checksums are computed and sent normally.

       tcpporthogdefense onoff
              onoff of on enables the TCP port-hog defense for all TCP
              connections; onoff of off disables it.  The defense is a
              solution to hijacked systems staking out ports as a form of
              denial-of-service attack.  To avoid stateless TCP conversation
              hogs, ip picks a TCP sequence number at random for keepalives.
              If that number gets acked by the other end, ip shuts down the
              connection.  Some firewalls, notably ones that perform stateful
              inspection, discard such out-of-specification keepalives, so
              connections through such firewalls will be killed after five
              minutes by the lack of keepalives.

   UDP
       UDP connections carry unreliable and unordered datagrams.  A read  from
       data  will  return  the next datagram, discarding anything that doesn't
       fit in the read buffer.  A write is sent as a single datagram.

       By default, a UDP connection is a point-to-point link.  Either  a  con‐
       nect  establishes  a local and remote address/port pair or after an an‐
       nounce, each datagram coming from a different remote address/port  pair
       establishes  a new incoming connection.  However, many-to-one semantics
       is also possible.

       If, after an announce, the message headers is written to ctl, then all
       messages sent to the announced port are received on the announced
       connection prefixed with the corresponding structure, declared in
       <ip.h>:

              typedef struct Udphdr Udphdr;
              struct Udphdr
              {
                   uchar     raddr[16];     /* V6 remote address and port */
                   uchar     laddr[16];     /* V6 local address and port */
                   uchar     ifcaddr[16];   /* V6 interface address (receive only) */
                   uchar     rport[2]; /* remote port */
                   uchar     lport[2]; /* local port */
              };

       Before a write, a user must prefix a similar structure to each message.
       The  system  overrides the user specified local port with the announced
       one.  If the user specifies an address that isn't a unicast address  in
       /net/ipselftab,  that  too is overridden.  Since the prefixed structure
       is the same in read and write, it is relatively easy to write a  server
       that responds to client requests by just copying new data into the mes‐
       sage body and then writing back the same buffer that was read.

       In this case (writing headers to the ctl file), neither listen nor
       accept is needed; otherwise, the usual sequence of announce, listen,
       accept must be executed before performing I/O on the corresponding
       data file.
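
       The reply-in-place technique described above can be sketched in
       portable C.  The structure repeats the declaration shown earlier so
       that the fragment stands alone; get_port and make_reply are
       hypothetical names, and the big-endian byte order of the port
       fields is an assumption (network order), not stated on this page.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef unsigned char uchar;

typedef struct Udphdr Udphdr;
struct Udphdr
{
	uchar	raddr[16];	/* V6 remote address */
	uchar	laddr[16];	/* V6 local address */
	uchar	ifcaddr[16];	/* V6 interface address (receive only) */
	uchar	rport[2];	/* remote port */
	uchar	lport[2];	/* local port */
};

enum { Udphdrsize = 52 };	/* 3*16 + 2*2 bytes, no padding */

/* Ports are stored as two bytes, most significant first (an assumption). */
unsigned
get_port(const uchar p[2])
{
	return (unsigned)p[0] << 8 | p[1];
}

/*
 * Turn a buffer that was read from data into a reply in place:
 * keep the header, overwrite the body.  Returns the number of
 * bytes to write back, or 0 if the reply does not fit.
 */
size_t
make_reply(uchar *buf, size_t bufsize, const void *body, size_t n)
{
	assert(buf != NULL && (body != NULL || n == 0));
	if (bufsize < Udphdrsize || n > bufsize - Udphdrsize)
		return 0;
	if (n > 0)
		memcpy(buf + Udphdrsize, body, n);
	return Udphdrsize + n;
}
```

       Because the header is left untouched, the remote address and port
       that arrived with the request address the reply.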

   RUDP
       RUDP is a reliable datagram protocol based on UDP, currently  only  for
       IPv4.  Packets are delivered in order.  RUDP does not support listen.
       One must write either connect or announce, followed immediately by
       headers, to ctl.

       Unlike TCP, the reboot of one end of a  connection  does  not  force  a
       closing  of  the  connection.   Communications will resume when the re‐
       booted machine resumes talking.  Any unacknowledged packets queued  be‐
       fore  the reboot will be lost.  A reboot can be detected by reading the
       err file.  It will contain the message

              hangup address!port

       where address and port are of the far side of the connection.  Retrans‐
       mitting  a  datagram  more  than 10 times is treated like a reboot: all
       queued messages are dropped, an error is queued to the  err  file,  and
       the conversation resumes.
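
       A program watching the err file might pick the message apart as
       sketched below.  The function name parse_hangup is hypothetical,
       and the sketch assumes the message is exactly the word hangup, a
       space, and address!port with a decimal port.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse "hangup address!port".  On success copy the address into
 * addr (at most len-1 bytes), store the port, and return 0.
 * Return -1 if the text is not a well-formed hangup message.
 */
int
parse_hangup(const char *msg, char *addr, size_t len, int *port)
{
	static const char prefix[] = "hangup ";
	const char *a, *bang;
	char *end;
	long p;
	size_t n;

	assert(msg != NULL && addr != NULL && port != NULL);
	if (strncmp(msg, prefix, sizeof prefix - 1) != 0)
		return -1;
	a = msg + sizeof prefix - 1;
	bang = strrchr(a, '!');		/* the port follows the last '!' */
	if (bang == NULL || bang == a)
		return -1;
	n = (size_t)(bang - a);
	if (n >= len)
		return -1;
	p = strtol(bang + 1, &end, 10);
	if (end == bang + 1 || p < 0 || p > 65535)
		return -1;
	memcpy(addr, a, n);
	addr[n] = '\0';
	*port = (int)p;
	return 0;
}
```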

       RUDP ctl files accept the following messages:

       headers
              Corresponds to the headers format of UDP.

       hangup IP port
              Drop the connection to address IP and port.

       randdrop [ percent ]
              Randomly drop percent of outgoing packets.  Default is 10%.

   ICMP
       ICMP  is a datagram protocol for IPv4 used to exchange control requests
       and their responses with other machines' IP implementations.   ICMP  is
       primarily  a  kernel-to-kernel protocol, but it is possible to generate
       `echo request' and read `echo reply' packets from user programs.

   ICMPV6
       ICMPv6 is the IPv6 equivalent of ICMP.  If, after an announce, the
       message headers is written to ctl, then before a write, a user must
       prefix each message with a corresponding structure, declared in <ip.h>:

              /*
               *  user level icmpv6 with control message "headers"
               */
              typedef struct Icmp6hdr Icmp6hdr;
              struct Icmp6hdr {
                   uchar     unused[8];
                   uchar     laddr[IPaddrlen];   /* local address */
                   uchar     raddr[IPaddrlen];   /* remote address */
              };

       In this case (writing headers to the ctl file), neither listen nor
       accept is needed; otherwise, the usual sequence of announce, listen,
       accept must be executed before performing I/O on the corresponding
       data file.

   GRE
       GRE is the encapsulation protocol used by PPTP.  The kernel  implements
       just enough of the protocol to multiplex it.  Our implementation encap‐
       sulates in IPv4, per RFC 1702.  Announce is not allowed  in  GRE,  only
       connect.  Since GRE has no port numbers, the port number in the connect
       is actually the 16 bit eproto field in the GRE header.

       Reads and writes transfer a GRE datagram starting at  the  GRE  header.
       On  write,  the  kernel  fills in the eproto field with the port number
       specified in the connect message.
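
       Since reads begin at the GRE header, a reader can check which
       eproto a datagram carries.  The sketch below assumes the layout
       given in the GRE RFCs: two bytes of flags and version followed by
       the 16-bit protocol type, most significant byte first.  The name
       gre_eproto is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { Grehdrmin = 4 };	/* flags/version (2 bytes) + protocol type (2 bytes) */

/* Return the eproto field of a GRE datagram, or -1 if it is too short. */
long
gre_eproto(const unsigned char *pkt, size_t n)
{
	assert(pkt != NULL);
	if (n < Grehdrmin)
		return -1;
	return (long)pkt[2] << 8 | pkt[3];
}
```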

   ESP
       ESP is the Encapsulating Security Payload (RFC 1827, obsoleted  by  RFC
       4303)  for  IPsec (RFC 4301).  We currently implement only tunnel mode,
       not transport mode.  It is used to set up an encrypted  tunnel  between
       machines.  Like GRE, ESP has no port numbers.  Instead, the port number
       in the connect message is  the  SPI  (Security  Association  Identifier
       (sic)).   IP packets are written to and read from data.  The kernel en‐
       crypts any packets written to data, appends a MAC, and prefixes an  ESP
       header before sending to the other end of the tunnel.  Received packets
       are checked against their MACs, decrypted, and queued for reading from
       data.  In the following, secret is the hexadecimal encoding of a key,
       without a leading 0x.  The control messages are:

       esp alg secret
              Encrypt with the algorithm, alg, using secret as the key.   Pos‐
              sible algorithms are: null, des_56_cbc, des3_cbc, and eventually
              aes_128_cbc, and aes_ctr.

       ah alg secret
              Use the hash algorithm, alg, with secret as the key for generat‐
              ing  the  MAC.   Possible  algorithms  are:  null, hmac_sha1_96,
              hmac_md5_96, and eventually aes_xcbc_mac_96.

       header Turn on header mode.  Every buffer read from data starts with  4
              unused  bytes,  and the first 4 bytes of every buffer written to
              data are ignored.

       noheader
              Turn off header mode.
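
       Building an esp or ah message from a binary key amounts to writing
       the key as plain hexadecimal digits after the algorithm name.  The
       sketch below shows one way to do that; key_msg is a hypothetical
       name, and the use of lower-case digits is a choice made here, not
       something this page specifies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format "esp alg secret" (or "ah alg secret") with the key written
 * as hexadecimal digits.  Returns the message length, or -1 if buf
 * is too small.
 */
int
key_msg(char *buf, size_t len, const char *verb, const char *alg,
	const unsigned char *key, size_t keylen)
{
	static const char hex[] = "0123456789abcdef";
	size_t i, off;
	int w;

	assert(buf != NULL && verb != NULL && alg != NULL);
	assert(key != NULL || keylen == 0);
	w = snprintf(buf, len, "%s %s ", verb, alg);
	if (w < 0 || (size_t)w >= len)
		return -1;
	off = (size_t)w;
	if (keylen > (len - off - 1) / 2)	/* two digits per byte plus NUL */
		return -1;
	for (i = 0; i < keylen; i++) {
		buf[off++] = hex[key[i] >> 4];
		buf[off++] = hex[key[i] & 0x0f];
	}
	buf[off] = '\0';
	return (int)off;
}
```

       The key length must of course suit the chosen algorithm; the
       function does not check that.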

   IP packet filter
       The directory /net/ipmux looks like another protocol directory.  It  is
       a  packet filter built on top of IP.  Each numbered subdirectory repre‐
       sents a different filter.  The connect messages written to the ctl file
       describe  the  filter.  Packets  matching the filter can be read on the
       data file.  Packets written to the data file are routed to an interface
       and transmitted.

       A filter is a semicolon-separated list of relations.  Each relation de‐
       scribes a portion of a packet to match.  The possible relations are:

       proto=n
              the IP protocol number must be n.

       data[n:m]=expr
              bytes n through m following the IP packet header must match expr.

       iph[n:m]=expr
              bytes n through m of the IP packet header must match expr.

       ifc=expr
              the packet must have been received on an interface whose address
              matches expr.

       src=expr
              The source address in the packet must match expr.

       dst=expr
              The destination address in the packet must match expr.

       Expr is of the form:

            value

            value|value|...

            value&mask

            value|value&mask

       If  a  mask  is given, the relevant field is first ANDed with the mask.
       The result is compared against the value or list of values for a match.
       In  the  case  of ifc, dst, and src the value is a dot-formatted IP ad‐
       dress and the mask is a dot-formatted IP mask.  In the  case  of  data,
       iph  and proto, both value and mask are strings of 2 hexadecimal digits
       representing 8-bit values.
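
       The matching rule for a single 8-bit field (as used by data, iph
       and proto) can be sketched as follows: AND the field with the mask,
       then compare the result with each listed value.  This illustrates
       the rule only and is not the kernel's parser; expr_match is a
       hypothetical name, and it is more lenient than the two-digit form
       described above because it relies on strtoul.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Match one 8-bit field against an expression of the form
 * value, value|value|..., value&mask or value|value&mask, where every
 * value and the mask are hexadecimal.  Returns 1 on a match, 0 on
 * no match, and -1 if the expression is malformed.
 */
int
expr_match(const char *expr, unsigned char field)
{
	const char *amp, *p;
	char *end;
	unsigned long v, mask = 0xff;
	int matched = 0;

	assert(expr != NULL);
	amp = strchr(expr, '&');
	if (amp != NULL) {
		mask = strtoul(amp + 1, &end, 16);
		if (end == amp + 1 || *end != '\0' || mask > 0xff)
			return -1;
	} else
		amp = expr + strlen(expr);

	/* Walk the '|'-separated values that precede the mask. */
	for (p = expr; ; p = end + 1) {
		v = strtoul(p, &end, 16);
		if (end == p || end > amp || v > 0xff)
			return -1;
		if ((field & mask) == v)
			matched = 1;
		if (end == amp)
			break;
		if (*end != '|')
			return -1;
	}
	return matched;
}
```

       For instance, the expression 40|60&f0 matches a field of 0x6a,
       since 0x6a ANDed with 0xf0 is 0x60, but not a field of 0x5a.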

       A packet is delivered to only one filter.  The filters are merged  into
       a  single  comparison  tree.  If two filters match the same packet, the
       following rules apply in order (here '>' means is preferred to):

       1)     protocol > data > source > destination > interface

       2)     lower data offsets > higher data offsets

       3)     longer matches > shorter matches

       4)     older > younger

       So far this has just been used to implement a version of  OSPF  in  In‐
       ferno and 6to4 tunnelling.

   Statistics
       The  stats files are read only and contain statistics useful to network
       monitoring.

       Reading /net/ipifc/stats returns a list of 19 tagged and  newline-sepa‐
       rated fields representing:
         forwarding status (0 and 2 mean forwarding off, 1 means on)
         default TTL
         input packets
         input header errors
         input address errors
         packets forwarded
         input packets for unknown protocols
         input packets discarded
         input packets delivered to higher level protocols
         output packets
         output packets discarded
         output packets with no route
         timed out fragments in reassembly queue
         requested reassemblies
         successful reassemblies
         failed reassemblies
         successful fragmentations
         unsuccessful fragmentations
         fragments created

       Reading  /net/icmp/stats  returns a list of 26 tagged and newline-sepa‐
       rated fields representing:
         messages received
         bad received messages
         unreachables received
         time exceededs received
         input parameter problems received
         source quenches received
         redirects received
         echo requests received
         echo replies received
         timestamps received
         timestamp replies received
         address mask requests received
         address mask replies received
         messages sent
         transmission errors
         unreachables sent
         time exceededs sent
         input parameter problems sent
         source quenches sent
         redirects sent
         echo requests sent
         echo replies sent
         timestamps sent
         timestamp replies sent
         address mask requests sent
         address mask replies sent

       Reading /net/tcp/stats returns a list of 11  tagged  and  newline-sepa‐
       rated fields representing:
         maximum number of connections
         total outgoing calls
         total incoming calls
         number of established connections to be reset
         number of currently established connections
         segments received
         segments sent
         segments retransmitted
         retransmit timeouts
         bad received segments
         transmission failures

       Reading /net/udp/stats returns a list of 4 tagged and newline-separated
       fields representing:
         datagrams received
         datagrams received for bad ports
         malformed datagrams received
         datagrams sent

       Reading /net/gre/stats returns a list of 1 tagged number representing:
         header length errors

SEE ALSO
       dial(2), ip(2), bridge(3), ndb(6), listen(8)
       /lib/rfc/rfc2460
              IPv6
       /lib/rfc/rfc4291
              IPv6 address architecture
       /lib/rfc/rfc4443
              ICMPv6
SOURCE
       /sys/src/9/ip
BUGS
       Ipmux has not been heavily used and should be considered  experimental.
       It  may  disappear  in favor of a more traditional packet filter in the
       future.



                                                                         IP(3)