IP(3)                      Library Functions Manual                      IP(3)

NAME
       ip,  esp,  gre, icmp, icmpv6, ipmux, rudp, tcp, udp - network protocols
       over IP

SYNOPSIS
       bind -a #Ispec /net
       /net/ipifc
       /net/ipifc/clone
       /net/ipifc/stats
       /net/ipifc/n
       /net/ipifc/n/status
       /net/ipifc/n/ctl
       ...
       /net/arp
       /net/bootp
       /net/iproute
       /net/ipselftab
       /net/log
       /net/ndb
       /net/esp
       /net/gre
       /net/icmp
       /net/icmpv6
       /net/ipmux
       /net/rudp
       /net/tcp
       /net/udp
       /net/tcp/clone
       /net/tcp/stats
       /net/tcp/n
       /net/tcp/n/data
       /net/tcp/n/ctl
       /net/tcp/n/local
       /net/tcp/n/remote
       /net/tcp/n/status
       /net/tcp/n/listen
       ...

DESCRIPTION
       The ip device provides the interface to Internet Protocol stacks.  Spec
       is an integer from 0 to 15 identifying a stack.  Each stack  implements
       IPv4  and  IPv6.  Each stack is independent of all others: the only in‐
       formation transfer between them is via  programs  that  mount  multiple
       stacks.   Normally  a  system  uses  only  one stack.  However multiple
       stacks can be used for debugging new IP networks or implementing  fire‐
       walls or proxy services.

       All  addresses  used  are 16-byte IPv6 addresses.  IPv4 addresses are a
       subset of the IPv6 addresses and both standard ASCII  formats  are  ac‐
       cepted.   In  binary representation, all v4 addresses start with the 12
       bytes, in hex:

              00 00 00 00 00 00 00 00 00 00 ff ff
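
       The prefix rule above can be sketched in portable C as follows.  The
       names isv4mapped and mapv4 are hypothetical and chosen only for this
       illustration; this is not the system's library code.

```c
#include <assert.h>
#include <string.h>

enum { IPaddrlen = 16, IPv4addrlen = 4, IPv4off = 12 };

/* The 12-byte prefix shared by every IPv4 address in 16-byte form. */
static const unsigned char v4prefix[IPv4off] = {
	0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xff, 0xff
};

/* Return 1 if the 16-byte address carries an IPv4 address, else 0. */
int
isv4mapped(const unsigned char ip[IPaddrlen])
{
	assert(ip != NULL);
	return memcmp(ip, v4prefix, IPv4off) == 0;
}

/* Expand a 4-byte IPv4 address into the 16-byte form (hypothetical helper). */
void
mapv4(unsigned char v6[IPaddrlen], const unsigned char v4[IPv4addrlen])
{
	assert(v6 != NULL && v4 != NULL);
	memcpy(v6, v4prefix, IPv4off);
	memcpy(v6 + IPv4off, v4, IPv4addrlen);
}
```

       The IPv4 address occupies the last four bytes, so a caller can
       recover it from offset 12 once isv4mapped reports a match.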

   Configuring interfaces
       Each stack may have multiple interfaces and  each  interface  may  have
       multiple  addresses.  The /net/ipifc directory contains a clone file, a
       stats file, and numbered subdirectories for each physical interface.

       Opening the clone file reserves an interface.  The file descriptor  re‐
       turned  from  the  open(2)  will point to the control file, ctl, of the
       newly allocated interface.  Reading ctl returns a  text  string  repre‐
       senting the number of the interface.  Writing ctl alters aspects of the
       interface.   The possible ctl messages are those described under Proto‐
       col directories below and these:

       bind ether path
              Treat the device mounted at path as an Ethernet medium  carrying
              IP  and  ARP  packets and associate it with this interface.  The
              kernel will dial(2) path!0x800, path!0x806 and  path!0x86dd  and
              use the connections for IPv4, ARP and IPv6 respectively.

       bind pkt
              Treat  this interface as a packet interface.  Assume a user pro‐
              gram will read and write the data file to receive  and  transmit
              IP  packets  to  the  kernel.   This is used by programs such as
              ppp(8) to mediate IP packet transfer between the  kernel  and  a
              PPP encoded device.

       bind netdev path
              Treat  this  interface  as  a packet interface.  The kernel will
              open path and read and write the resulting  file  descriptor  to
              receive and transmit IP packets.

       bind loopback
              Treat  this  interface as a local loopback.  Anything written to
              it will be looped back.

       unbind Disassociate the physical device from an IP interface.

       add local [ mask remote mtu proxy ]
       try local [ mask remote mtu proxy ]
              Add a local IP address to the interface.  Try adds the local ad‐
              dress as a tentative address if it's an IPv6 address.  The mask,
              remote, mtu, and proxy arguments are all optional.  The  default
              mask  is  the class mask for the local address.  The default re‐
              mote address is local ANDed with mask.  The default mtu (maximum
              transmission unit) is 1514 for Ethernet and 4096 for packet  me‐
              dia.   The  mtu  is the size in bytes of the largest packet that
              this interface can send.  Proxy, if specified, means  that  this
              machine  should  answer  ARP  requests  for  the remote address.
              Ppp(8) does this to make remote machines appear to be  connected
              to the local Ethernet.

       remove local mask
              Remove a local IP address from an interface.

       mtu n  Set the maximum transmission unit for this device to n.  The mtu
              is the maximum size of the packet including any medium-specific
              headers.

       reassemble
              Reassemble IP fragments before forwarding to this interface.

       iprouting n
              Allow (n is missing or non-zero) or disallow (n is 0) forwarding
              packets between this interface and others.

       bridge Enable bridging (see bridge(3)).

       promiscuous
              Set  the  interface into promiscuous mode, which makes it accept
              all incoming packets, whether addressed to it or not.

       connect type
              Mark the Ethernet packet type as being in use, if not already
              in use on this interface.  A type of -1 means ‘all' but appears
              to be a no-op.

       addmulti Media-addr
              Treat the multicast Media-addr on this interface as a local  ad‐
              dress.

       remmulti Media-addr
              Remove the multicast address Media-addr from this interface.

       scanbs Make the wireless interface scan for base stations.

       headersonly
              Set the interface to pass only packet headers, not data too.

       add6 v6addr pfx-len [onlink auto validlt preflt]
              Add  the local IPv6 address v6addr with prefix length pfx-len to
              this interface.  See RFC 2461 §6.2.1 for more detail.  The  re‐
              maining arguments are optional:

              onlink flag: address is ‘on-link'

              auto   flag: autonomous

              validlt
                     valid life-time in seconds

              preflt preferred life-time in seconds

       ra6 keyword value ...
              Set  IPv6  router  advertisement (RA) parameter keyword's value.
              Known keywords and the meanings of their values follow.  See RFC
              2461 §6.2.1 for more detail.  Flags are true iff non-zero.

              recvra flag: receive and process RAs.

              sendra flag: generate and send RAs.

              mflag  flag: ‘‘Managed address configuration'', goes into RAs.

              oflag  flag: ‘‘Other stateful configuration'', goes into RAs.

              maxraint
                     ‘‘maximum time allowed between sending unsolicited multi‐
                     cast'' RAs from the interface, in ms.

              minraint
                     ‘‘minimum time allowed between sending unsolicited multi‐
                     cast'' RAs from the interface, in ms.

              linkmtu
                     ‘‘value to be placed in MTU options sent by the router.''
                     Zero indicates none.

              reachtime
                     sets the Reachable Time field in RAs sent by the  router.
                     ‘‘Zero means unspecified (by this router).''

              rxmitra
                     sets  the  Retrans Timer field in RAs sent by the router.
                     ‘‘Zero means unspecified (by this router).''

              ttl    default value of the Cur Hop Limit field in RAs  sent  by
                     the  router.   Should be set to the ‘‘current diameter of
                     the  Internet.''   ‘‘Zero  means  unspecified  (by   this
                     router).''

              routerlt
                     sets  the  Router Lifetime field of RAs sent from the in‐
                     terface, in ms.  Zero means the router is not to be  used
                     as a default router.
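
       The defaults described for the add and try messages can be worked
       through for an IPv4 address as below.  This is a sketch in portable
       C; classmask and defremote are hypothetical names, and treating
       class D and E addresses like class C is an assumption of the sketch
       rather than something this page specifies.

```c
#include <assert.h>
#include <stddef.h>

/* Fill mask[] with the classful mask for the IPv4 address local[]. */
void
classmask(unsigned char mask[4], const unsigned char local[4])
{
	int i, n;

	assert(mask != NULL && local != NULL);
	if(local[0] < 128)
		n = 1;		/* class A: 255.0.0.0 */
	else if(local[0] < 192)
		n = 2;		/* class B: 255.255.0.0 */
	else
		n = 3;		/* class C; assumed here for D and E as well */
	for(i = 0; i < 4; i++)
		mask[i] = i < n ? 0xff : 0;
}

/* Default remote address: local ANDed with mask. */
void
defremote(unsigned char remote[4], const unsigned char local[4],
	const unsigned char mask[4])
{
	int i;

	assert(remote != NULL && local != NULL && mask != NULL);
	for(i = 0; i < 4; i++)
		remote[i] = local[i] & mask[i];
}
```

       For a local address of 135.104.9.31 this yields a mask of
       255.255.0.0 and a default remote address of 135.104.0.0.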

       Reading  the  interface's status file returns information about the in‐
       terface, one line for each local address on that interface.  The  first
       line  has  9  white-space-separated fields: device, mtu, local address,
       mask, remote or network address, packets in, packets out, input errors,
       output errors.  Each subsequent line contains all but  the  device  and
       mtu.  See readipifc in ip(2).

   Routing
       The  file iproute controls information about IP routing.  When read, it
       returns one line per routing entry.   Each  line  contains  six  white-
       space-separated  fields:  target  address, target mask, address of next
       hop, flags, tag, and interface number.  The entry used for  routing  an
       IP  packet  is  the one with the longest mask for which destination ad‐
       dress ANDed with target mask equals the target address.  The  one-char‐
       acter flags are:

       4      IPv4 route

       6      IPv6 route

       i      local interface

       b      broadcast address

       u      local unicast address

       m      multicast route

       p      point-to-point route

       The  tag  is  an  arbitrary, up to 4 character, string.  It is normally
       used to indicate what routing protocol originated the route.
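
       The selection rule stated above (longest mask for which the
       destination ANDed with the mask equals the target) can be sketched
       as a linear scan.  The Route structure and the lookup function are
       hypothetical and hold only the two fields the rule needs; the
       kernel's own table is organized differently.

```c
#include <assert.h>
#include <stddef.h>

enum { IPaddrlen = 16 };

typedef struct Route Route;
struct Route
{
	unsigned char target[IPaddrlen];
	unsigned char mask[IPaddrlen];
};

/* Count the one bits in a mask. */
static int
masklen(const unsigned char mask[IPaddrlen])
{
	int i, n;
	unsigned char b;

	n = 0;
	for(i = 0; i < IPaddrlen; i++)
		for(b = mask[i]; b != 0; b &= b - 1)
			n++;
	return n;
}

/* Return 1 if dst ANDed with the route's mask equals its target. */
static int
matches(const Route *r, const unsigned char dst[IPaddrlen])
{
	int i;

	for(i = 0; i < IPaddrlen; i++)
		if((dst[i] & r->mask[i]) != r->target[i])
			return 0;
	return 1;
}

/* Return the index of the matching route with the longest mask, or -1. */
int
lookup(const Route *tab, int nroute, const unsigned char dst[IPaddrlen])
{
	int i, len, best, bestlen;

	assert(tab != NULL || nroute == 0);
	best = -1;
	bestlen = -1;
	for(i = 0; i < nroute; i++){
		if(!matches(&tab[i], dst))
			continue;
		len = masklen(tab[i].mask);
		if(len > bestlen){
			best = i;
			bestlen = len;
		}
	}
	return best;
}
```

       A default route, whose target and mask are all zero, matches every
       destination but loses to any more specific entry.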

       Writing to /net/iproute changes the route table.  The messages are:

       flush  Remove all routes.

       tag string
              Associate the tag, string, with all subsequent routes added  via
              this file descriptor.

       add target mask nexthop
              Add the route to the table.  If one already exists with the same
              target and mask, replace it.

       remove target mask
              Remove a route with a matching target and mask.

       route target
              Print  on the console the route to address target, if any.  Pri‐
              marily a debugging aid.

   Address resolution
       The file /net/arp controls information about address  resolution.   The
       kernel  automatically updates the v4 ARP and v6 Neighbour Discovery in‐
       formation for Ethernet interfaces.  When read,  the  file  returns  one
       line per address containing the type of medium, the status of the entry
       (OK,  WAIT),  the  IP  address,  and  the  medium  address.  Writing to
       /net/arp administers the ARP information.  The control messages are:

       flush  Remove all entries.

       add type IP-addr Media-addr
              Add an entry or replace an existing one for the same IP address.

       del IP-addr
              Delete an individual entry.

       ARP entries do not time out.  The ARP table is a cache with an LRU  re‐
       placement  policy.   The  IP stack listens for all ARP requests and, if
       the requester is in the table, the entry is updated.  Also, whenever  a
       new  address  is configured onto an Ethernet, an ARP request is sent to
       help update the table on other systems.

       Currently, the only medium type is ether.

   Debugging and stack information
       If any process is holding /net/log open, the IP stack queues  debugging
       information  to  it.   This  is intended primarily for debugging the IP
       stack.  The information provided  is  implementation-defined;  see  the
       source  for  details.   Generally,  what  is returned is error messages
       about bad packets.

       Writing to /net/log controls debugging.  The control messages are:

       set arglist
              Arglist is a space-separated list of items for which  to  enable
              debugging.  The possible items are: ppp, ip, fs, tcp, icmp, udp,
              compress, gre, tcpwin, tcprxmt, udpmsg, ipmsg, and esp.

       clear arglist
              Arglist  is a space-separated list of items for which to disable
              debugging.

       only addr
              If addr is non-zero, restrict debugging to  only  those  packets
              whose source or destination is that address.

       The  file  /net/ndb can be read or written by programs.  It is normally
       used by ipconfig(8) to leave configuration information for  other  pro‐
       grams such as dns and cs (see ndb(8)).  /net/ndb may contain up to 1024
       bytes.

       The  file  /net/ipselftab is a read-only file containing all the IP ad‐
       dresses considered local.  Each line in the file contains three  white-
       space-separated  fields: IP address, usage count, and flags.  The usage
       count is the number of interfaces to which the  address  applies.   The
       flags  are the same as for routing entries.  Note that the ‘IPv4 route'
       flag will never be set.

   Protocol directories
       The ip device supports IP as well as several protocols  that  run  over
       it:  TCP, UDP, RUDP, ICMP, GRE, and ESP.  TCP and UDP provide the stan‐
       dard Internet protocols for reliable  stream  and  unreliable  datagram
       communication.   RUDP is a locally-developed reliable datagram protocol
       based on UDP.  ICMP is IP's catch-all control protocol used to send low
       level error messages and to implement ping(8).  GRE is a general encap‐
       sulation protocol.  ESP is the encapsulation protocol  for  IPsec.   IL
       provided  a  reliable datagram service for communication between Plan 9
       machines over IPv4, but is no longer part of the system.

       Each protocol is a subdirectory of the IP stack.  The top level  direc‐
       tory  of  each protocol contains a clone file, a stats file, and subdi‐
       rectories numbered from zero to the number of  connections  opened  for
       this protocol.

       Opening  the clone file reserves a connection.  The file descriptor re‐
       turned from the open(2) will point to the control  file,  ctl,  of  the
       newly  allocated  connection.  Reading ctl returns a text string repre‐
       senting the number of the connection.  Connections may be  used  either
       to listen for incoming calls or to initiate calls to other machines.

       A  connection  is  controlled by writing text strings to the associated
       ctl file.  After a connection has been established  data  may  be  read
       from and written to data.  A connection can be actively established us‐
       ing the connect message (see also dial(2)).  A connection can be estab‐
       lished  passively  by  first using an announce message (see dial(2)) to
       bind to a local port and then opening the listen file (see dial(2))  to
       receive incoming calls.
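
       Because reading ctl yields the connection number as text, a program
       can derive the names of the connection's other files from it.  The
       following portable C sketch shows only that naming step; connfile
       is a hypothetical helper, and the actual open and read calls are
       left out.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Build the name of a per-connection file, such as data or listen,
 * from the text read back from a freshly opened clone file.
 * Returns 0 on success, -1 if the text is not a number, the file
 * name is empty, or buf is too small.
 */
int
connfile(char *buf, size_t nbuf, const char *protodir,
	const char *ctltext, const char *file)
{
	char *end;
	long conv;
	int n;

	assert(buf != NULL && protodir != NULL && ctltext != NULL && file != NULL);
	if(strlen(file) == 0)
		return -1;
	conv = strtol(ctltext, &end, 10);
	if(end == ctltext || conv < 0)
		return -1;
	n = snprintf(buf, nbuf, "%s/%ld/%s", protodir, conv, file);
	if(n < 0 || (size_t)n >= nbuf)
		return -1;
	return 0;
}
```

       Given the protocol directory /net/tcp and the text 3 read from ctl,
       the helper produces /net/tcp/3/data for the file name data.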

       The following control messages are supported:

       connect ip-address!port!r local
              Establish  a  connection  to the remote ip-address and port.  If
              local is specified, it is used as the local port number.  If lo‐
              cal is not specified but !r is, the system will allocate  a  re‐
              stricted  port number (less than 1024) for the connection to al‐
              low communication with Unix login and exec services.   Otherwise
              a free port number starting at 5000 is chosen.  The connect
              fails if the combination of local and remote address/port pairs
              is already assigned to another port.

       announce X
              X is a decimal port number or *.  Set the local port number to X
              and accept calls to X.  If X is *, accept calls for any port
              that no process has explicitly announced.  The local IP address
              cannot be set.  Announce fails if the connection is already
              announced or connected.

       bind X X is a decimal port number or *.  Set the local port number to X.
              This exists to support emulation of BSD sockets by the APE
              libraries (see pcc(1)) and is not otherwise used.

       ttl n  Set the time to live IP field in outgoing packets to n.

       tos n  Set the service type IP field in outgoing packets to n.

       ignoreadvice
              Don't break (UDP) connections because of ICMP errors.

       addmulti ifc-ip [ mcast-ip ]
              Treat ifc-ip on this multicast interface as a local address.  If
              mcast-ip is present, use it as  the  interface's  multicast  ad‐
              dress.

       remmulti ip
              Remove the address ip from this multicast interface.

       Port numbers must be in the range 1 to 32767.
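
       As an illustration of the connect syntax and the port range given
       above, the following portable C sketch formats a connect message
       into a buffer.  The function fmtconnect and its argument
       conventions (0 meaning no local port) are hypothetical; writing the
       result to ctl is left to the caller.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum { Minport = 1, Maxport = 32767 };

/*
 * Format a connect control message into buf.
 * restricted asks for the !r form; local is a local port, or 0 for none.
 * Returns the message length, or -1 on a bad argument or short buffer.
 */
int
fmtconnect(char *buf, size_t nbuf, const char *addr, int port,
	int restricted, int local)
{
	int n;

	assert(buf != NULL && addr != NULL);
	if(strlen(addr) == 0)
		return -1;
	if(port < Minport || port > Maxport)
		return -1;
	if(local != 0 && (local < Minport || local > Maxport))
		return -1;
	if(local != 0)
		n = snprintf(buf, nbuf, "connect %s!%d%s %d", addr, port,
			restricted ? "!r" : "", local);
	else
		n = snprintf(buf, nbuf, "connect %s!%d%s", addr, port,
			restricted ? "!r" : "");
	if(n < 0 || (size_t)n >= nbuf)
		return -1;
	return n;
}
```

       Rejecting out-of-range ports before the write keeps the error close
       to its cause instead of surfacing later as a failed ctl write.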

       Several  files report the status of a connection.  The remote and local
       files contain the IP address and port number for the remote  and  local
       side  of  the  connection.  The status file contains protocol-dependent
       information to help debug network connections.  On receiving an error
       or EOF reading or writing the data file, the err file contains the
       reason for the error.

       A  process  may  accept  incoming  connections by open(2)ing the listen
       file.  The open will block until  a  new  connection  request  arrives.
       Then  open will return an open file descriptor which points to the con‐
       trol file of the newly accepted connection.  This procedure will accept
       all calls for the given protocol.  See dial(2).

   TCP
       TCP connections are reliable point-to-point byte streams; there are  no
       message delimiters.  A connection is determined by the address and port
       numbers  of  the  two  ends.  TCP ctl files support the following addi‐
       tional messages:

       hangup close down this TCP connection

       keepalive n
              turn on keep alive messages.  N, if given, is  the  milliseconds
              between keepalives (default 30000).

       checksum n
              emit  TCP  checksums of zero if n is zero; otherwise, and by de‐
              fault, TCP checksums are computed and sent normally.

       tcpporthogdefense onoff
              An onoff of on enables the TCP port-hog defense for all TCP
              connections; an onoff of off disables it.  The defense is a
              solution to hijacked systems staking out ports as a form of
              denial-of-service attack.  To avoid stateless TCP conversation
              hogs, ip picks a TCP sequence number at random for keepalives.
              If that number gets acked by the other end, ip shuts down the
              connection.  Some firewalls, notably ones that perform stateful
              inspection, discard such out-of-specification keepalives, so
              connections through such firewalls will be killed after five
              minutes by the lack of keepalives.

   UDP
       UDP connections carry unreliable and unordered datagrams.  A read  from
       data  will  return  the next datagram, discarding anything that doesn't
       fit in the read buffer.  A write is sent as a single datagram.

       By default, a UDP connection is a point-to-point link.  Either  a  con‐
       nect  establishes  a local and remote address/port pair or after an an‐
       nounce, each datagram coming from a different remote address/port  pair
       establishes  a new incoming connection.  However, many-to-one semantics
       is also possible.

       If, after an announce, the message headers is written to ctl, then all
       messages sent to the announced port are received on the announced
       connection prefixed with the corresponding structure, declared in
       <ip.h>:

              typedef struct Udphdr Udphdr;
              struct Udphdr
              {
                   uchar     raddr[16];     /* V6 remote address and port */
                   uchar     laddr[16];     /* V6 local address and port */
                   uchar     ifcaddr[16];   /* V6 interface address (receive only) */
                   uchar     rport[2]; /* remote port */
                   uchar     lport[2]; /* local port */
              };

       Before a write, a user must prefix a similar structure to each message.
       The  system  overrides the user specified local port with the announced
       one.  If the user specifies an address that isn't a unicast address  in
       /net/ipselftab,  that  too is overridden.  Since the prefixed structure
       is the same in read and write, it is relatively easy to write a  server
       that responds to client requests by just copying new data into the mes‐
       sage body and then writing back the same buffer that was read.

       In this case (writing headers to the ctl file), neither listen nor
       accept is needed; otherwise, the usual sequence of announce, listen,
       accept must be executed before performing I/O on the corresponding
       data file.
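
       The reply technique mentioned above (reuse the buffer that was read
       and replace only the body) might look like the following portable C
       sketch.  The structure layout follows the declaration shown
       earlier; mkreply and Udphdrsize are hypothetical names, and the
       read and write calls on data are omitted.

```c
#include <assert.h>
#include <string.h>

typedef unsigned char uchar;

typedef struct Udphdr Udphdr;
struct Udphdr
{
	uchar	raddr[16];	/* V6 remote address */
	uchar	laddr[16];	/* V6 local address */
	uchar	ifcaddr[16];	/* V6 interface address (receive only) */
	uchar	rport[2];	/* remote port */
	uchar	lport[2];	/* local port */
};

/* 52 bytes: every member is a uchar array, so there is no padding. */
enum { Udphdrsize = sizeof(Udphdr) };

/*
 * Turn a buffer just read from data into a reply in place.
 * The prefixed header is left untouched, so the reply is addressed
 * to the original sender; only the body is replaced.
 * Returns the number of bytes to write, or -1 on bad sizes.
 */
long
mkreply(uchar *buf, long nbuf, long nread, const uchar *body, long nbody)
{
	assert(buf != NULL);
	if(nread < Udphdrsize || nbody < 0 || Udphdrsize + nbody > nbuf)
		return -1;
	if(nbody > 0)
		memcpy(buf + Udphdrsize, body, nbody);
	return Udphdrsize + nbody;
}
```

       A server loop would read a datagram into buf, call mkreply with the
       new body, and write the returned number of bytes back to data.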

   RUDP
       RUDP is a reliable datagram protocol based on UDP, currently  only  for
       IPv4.   Packets  are delivered in order.  RUDP does not support listen.
       One must write either connect or announce followed immediately by
       headers to ctl.

       Unlike TCP, the reboot of one end of a  connection  does  not  force  a
       closing  of  the  connection.   Communications will resume when the re‐
       booted machine resumes talking.  Any unacknowledged packets queued  be‐
       fore  the reboot will be lost.  A reboot can be detected by reading the
       err file.  It will contain the message

              hangup address!port

       where address and port are of the far side of the connection.  Retrans‐
       mitting a datagram more than 10 times is treated  like  a  reboot:  all
       queued  messages  are  dropped, an error is queued to the err file, and
       the conversation resumes.

       RUDP ctl files accept the following messages:

       headers
              Corresponds to the format of UDP.

       hangup IP port
              Drop the connection to address IP and port.

       randdrop [ percent ]
              Randomly drop percent of outgoing packets.  Default is 10%.

   ICMP
       ICMP is a datagram protocol for IPv4 used to exchange control  requests
       and  their  responses with other machines' IP implementations.  ICMP is
       primarily a kernel-to-kernel protocol, but it is possible  to  generate
       ‘echo request' and read ‘echo reply' packets from user programs.

   ICMPV6
       ICMPv6 is the IPv6 equivalent of ICMP.  If, after an announce, the
       message headers is written to ctl, then before a write, a user must
       prefix each message with a corresponding structure, declared in <ip.h>:

              /*
               *  user level icmpv6 with control message "headers"
               */
              typedef struct Icmp6hdr Icmp6hdr;
              struct Icmp6hdr {
                   uchar     unused[8];
                   uchar     laddr[IPaddrlen];   /* local address */
                   uchar     raddr[IPaddrlen];   /* remote address */
              };

       In this case (writing headers to the ctl file), neither listen nor
       accept is needed; otherwise, the usual sequence of announce, listen,
       accept must be executed before performing I/O on the corresponding
       data file.

   GRE
       GRE  is the encapsulation protocol used by PPTP.  The kernel implements
       just enough of the protocol to multiplex it.  Our implementation encap‐
       sulates in IPv4, per RFC 1702.  Announce is not allowed  in  GRE,  only
       connect.  Since GRE has no port numbers, the port number in the connect
       is actually the 16 bit eproto field in the GRE header.

       Reads  and  writes  transfer a GRE datagram starting at the GRE header.
       On write, the kernel fills in the eproto field  with  the  port  number
       specified in the connect message.

   ESP
       ESP  is  the Encapsulating Security Payload (RFC 1827, obsoleted by RFC
       4303) for IPsec (RFC 4301).  We currently implement only  tunnel  mode,
       not  transport  mode.  It is used to set up an encrypted tunnel between
       machines.  Like GRE, ESP has no port numbers.  Instead, the port number
       in the connect message is  the  SPI  (Security  Association  Identifier
       (sic)).   IP packets are written to and read from data.  The kernel en‐
       crypts any packets written to data, appends a MAC, and prefixes an  ESP
       header before sending to the other end of the tunnel.  Received packets
       are checked against their MAC's, decrypted, and queued for reading from
       data.  In the following, secret is the hexadecimal encoding of a key,
       without a leading 0x.  The control messages are:

       esp alg secret
              Encrypt with the algorithm, alg, using secret as the key.   Pos‐
              sible algorithms are: null, des_56_cbc, des3_cbc, and eventually
              aes_128_cbc, and aes_ctr.

       ah alg secret
              Use the hash algorithm, alg, with secret as the key for generat‐
              ing  the  MAC.   Possible  algorithms  are:  null, hmac_sha1_96,
              hmac_md5_96, and eventually aes_xcbc_mac_96.

       header Turn on header mode.  Every buffer read from data starts with  4
              unused  bytes,  and the first 4 bytes of every buffer written to
              data are ignored.

       noheader
              Turn off header mode.

   IP packet filter
       The directory /net/ipmux looks like another protocol directory.  It  is
       a  packet filter built on top of IP.  Each numbered subdirectory repre‐
       sents a different filter.  The connect messages written to the ctl file
       describe the filter. Packets matching the filter can  be  read  on  the
       data file.  Packets written to the data file are routed to an interface
       and transmitted.

       A filter is a semicolon-separated list of relations.  Each relation de‐
       scribes a portion of a packet to match.  The possible relations are:

       proto=n
              the IP protocol number must be n.

       data[n:m]=expr
              bytes n through m following the IP packet header must match
              expr.

       iph[n:m]=expr
              bytes n through m of the IP packet header must match expr.

       ifc=expr
              the packet must have been received on an interface whose address
              matches expr.

       src=expr
              The source address in the packet must match expr.

       dst=expr
              The destination address in the packet must match expr.

       Expr is of the form:

            value

            value|value|...

            value&mask

            value|value&mask

       If  a  mask  is given, the relevant field is first ANDed with the mask.
       The result is compared against the value or list of values for a match.
       In the case of ifc, dst, and src the value is a  dot-formatted  IP  ad‐
       dress  and  the  mask is a dot-formatted IP mask.  In the case of data,
       iph and proto, both value and mask are strings of 2 hexadecimal  digits
       representing 8-bit values.
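
       The matching rule for a single 8-bit field (as used by data, iph
       and proto) can be sketched in portable C as below.  The function
       exprmatch is hypothetical and models only the comparison, not the
       parsing of the filter text or the kernel's comparison tree.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Match one 8-bit field against an Expr of the form value|value|...&mask.
 * Pass a mask of 0xff when the expression has no mask.
 * Returns 1 on a match, 0 otherwise.
 */
int
exprmatch(unsigned char field, unsigned char mask,
	const unsigned char *vals, int nvals)
{
	int i;

	assert(vals != NULL || nvals == 0);
	field &= mask;		/* the field is ANDed with the mask first */
	for(i = 0; i < nvals; i++)
		if(field == vals[i])
			return 1;
	return 0;
}
```

       For example, a relation written as proto=06|11 corresponds to a
       value list of 0x06 and 0x11 with a mask of 0xff.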

       A  packet is delivered to only one filter.  The filters are merged into
       a single comparison tree.  If two filters match the  same  packet,  the
       following rules apply in order (here '>' means is preferred to):

       1)     protocol > data > source > destination > interface

       2)     lower data offsets > higher data offsets

       3)     longer matches > shorter matches

       4)     older > younger

       So  far  this  has just been used to implement a version of OSPF in In‐
       ferno and 6to4 tunnelling.

   Statistics
       The stats files are read only and contain statistics useful to  network
       monitoring.

       Reading  /net/ipifc/stats returns a list of 19 tagged and newline-sepa‐
       rated fields representing:
         forwarding status (0 and 2 mean forwarding off,
              1 means on)
         default TTL
         input packets
         input header errors
         input address errors
         packets forwarded
         input packets for unknown protocols
         input packets discarded
         input packets delivered to higher level protocols
         output packets
         output packets discarded
         output packets with no route
         timed out fragments in reassembly queue
         requested reassemblies
         successful reassemblies
         failed reassemblies
         successful fragmentations
         unsuccessful fragmentations
         fragments created

       Reading /net/icmp/stats returns a list of 26 tagged  and  newline-sepa‐
       rated fields representing:
         messages received
         bad received messages
         unreachables received
         time exceededs received
         input parameter problems received
         source quenches received
         redirects received
         echo requests received
         echo replies received
         timestamps received
         timestamp replies received
         address mask requests received
         address mask replies received
         messages sent
         transmission errors
         unreachables sent
         time exceededs sent
         input parameter problems sent
         source quenches sent
         redirects sent
         echo requests sent
         echo replies sent
         timestamps sent
         timestamp replies sent
         address mask requests sent
         address mask replies sent

       Reading  /net/tcp/stats  returns  a list of 11 tagged and newline-sepa‐
       rated fields representing:
         maximum number of connections
         total outgoing calls
         total incoming calls
         number of established connections to be reset
         number of currently established connections
         segments received
         segments sent
         segments retransmitted
         retransmit timeouts
         bad received segments
         transmission failures

       Reading /net/udp/stats returns a list of 4 tagged and newline-separated
       fields representing:
         datagrams received
         datagrams received for bad ports
         malformed datagrams received
         datagrams sent

       Reading /net/gre/stats returns a list of 1 tagged number representing:
         header length errors

SEE ALSO
       dial(2), ip(2), bridge(3), ndb(6), listen(8)
       /lib/rfc/rfc2460
              IPv6
       /lib/rfc/rfc4291
              IPv6 address architecture
       /lib/rfc/rfc4443
              ICMPv6
SOURCE
       /sys/src/9/ip
BUGS
       Ipmux has not been heavily used and should be considered  experimental.
       It  may  disappear  in favor of a more traditional packet filter in the
       future.

                                                                         IP(3)