glenda.party
term% ls -F
term% pwd
$home/manuals/9front/2/html
term% cat index.txt
HTML(2)                       System Calls Manual                      HTML(2)

NAME
       parsehtml,  printitems,  validitems, freeitems, freedocinfo, dimenkind,
       dimenspec, targetid, targetname, fromStr, toStr - HTML parser

SYNOPSIS
       #include <u.h>
       #include <libc.h>
       #include <html.h>

       Item*  parsehtml(uchar* data, int datalen, Rune* src, int mtype,
              int chset, Docinfo** pdi)

       void   printitems(Item* items, char* msg)

       int    validitems(Item* items)

       void   freeitems(Item* items)

       void   freedocinfo(Docinfo* d)

       int    dimenkind(Dimen d)

       int    dimenspec(Dimen d)

       int    targetid(Rune* s)

       Rune*  targetname(int targid)

       uchar* fromStr(Rune* buf, int n, int chset)

       Rune*  toStr(uchar* buf, int n, int chset)

DESCRIPTION
       This library implements a parser for HTML 4.0  documents.   The  parsed
       HTML  is  converted  into an intermediate representation that describes
       how the formatted HTML should be laid out.

       Parsehtml parses an entire HTML document contained in the  buffer  data
       and having length datalen.  The URL of the document should be passed in
       as  src.   Mtype is the media type of the document, which should be ei‐
       ther TextHtml or TextPlain.  The character set of the document  is  de‐
       scribed  in  chset,  which can be one of US_Ascii, ISO_8859_1, UTF_8 or
       Unicode.  The return value is a linked list  of  Item  structures,  de‐
       scribed  in  detail below.  As a side effect, *pdi is set to point to a
       newly created Docinfo structure, containing information  pertaining  to
       the entire document.

       The  library  expects  two  allocation  routines  to be provided by the
       caller, emalloc and erealloc.  These  routines  are  analogous  to  the
       standard  malloc  and realloc routines, except that they should not re‐
       turn if the memory allocation fails.  In addition, emalloc is  required
       to zero the memory.

       For  debugging  purposes,  printitems may be called to display the con‐
       tents of an item list; individual items may be  printed  using  the  %I
       print  verb, installed on the first call to parsehtml.  validitems tra‐
       verses the item list, checking that all of the pointers are valid.   It
       returns  1 is everything is ok, and 0 if an error was found.  Normally,
       one would not call these routines  directly.   Instead,  one  sets  the
       global variable dbgbuild and the library calls them automatically.  One
       can  also set warn, to cause the library to print a warning whenever it
       finds a problem with the input document, and dbglex, to print debugging
       information in the lexer.

       When an item list is finished with, it should be freed with  freeitems.
       Then, freedocinfo should be called on the pointer returned in *pdi.

       Dimenkind  and  dimenspec  are provided to interpret the Dimen type, as
       described in the section Dimension Specifications.

       Frame target names are mapped to integer ids via  a  global,  permanent
       mapping.   To find the value for a given name, call targetid, which al‐
       locates a new id if the name hasn't been seen before.  The  name  of  a
       given,  known id may be retrieved using targetname.  The library prede‐
       fines FTtop, FTself, FTparent and FTblank.

       The library handles all text as Unicode strings (type Rune*).   Charac‐
       ter  set  conversion is provided by fromStr and toStr.  FromStr takes n
       Unicode characters from buf and converts them to the character set  de‐
       scribed  by  chset.   ToStr takes n bytes from buf, interpretted as be‐
       longing to character set chset, and converts them to a Unicode  string.
       Both  routines  null-terminate  the result, and use emalloc to allocate
       space for it.

   Items
       The return value of parsehtml is a linked list of  variant  structures,
       with the generic portion described by the following definition:

       typedef struct Item Item;
       struct Item
       {
             Item*    next;
             int      width;
             int      height;
             int      ascent;
             int      anchorid;
             int      state;
             Genattr* genattr;
             int      tag;
       };

       The  field  next  points  to the successor in the linked list of items,
       while width, height, and ascent are intended for use by the  caller  as
       part  of  the layout process.  Anchorid, if non-zero, gives the integer
       id assigned by the parser to the anchor that this item is in (see  sec‐
       tion  Anchors).  State is a collection of flags and values described as
       follows:

       enum
       {
             IFbrk =         0x80000000,
             IFbrksp =       0x40000000,
             IFnobrk =       0x20000000,
             IFcleft =       0x10000000,
             IFcright =      0x08000000,
             IFwrap =        0x04000000,
             IFhang =        0x02000000,
             IFrjust =       0x01000000,
             IFcjust =       0x00800000,
             IFsmap =        0x00400000,
             IFindentshift = 8,
             IFindentmask =  (255<<IFindentshift),
             IFhangmask =    255
       };

       IFbrk is set if a break is to be forced before placing this item.   IF‐
       brksp  is  set if a 1 line space should be added to the break (in which
       case IFbrk is also set).  IFnobrk is set if a break  is  not  permitted
       before the item.  IFcleft is set if left floats should be cleared (that
       is,  if  the  list of pending left floats should be placed) before this
       item is placed, and IFcright is set for right floats.  In  both  cases,
       IFbrk  is  also set.  IFwrap is set if the line containing this item is
       allowed to wrap.  IFhang is set if this item hangs into  the  left  in‐
       dent.   IFrjust is set if the line containing this item should be right
       justified, and IFcjust is set for center justified  lines.   IFsmap  is
       used  to  indicate that an image is a server-side map.  The low 8 bits,
       represented by IFhangmask, indicate the current hang into left  indent,
       in  tenths  of a tabstop.  The next 8 bits, represented by IFindentmask
       and IFindentshift, indicate the current indent in tab stops.

       The field genattr is an optional pointer to an auxiliary structure, de‐
       scribed in the section Generic Attributes.

       Finally, tag describes which variant type this item has.  It  can  have
       one of the values Itexttag, Iruletag, Iimagetag, Iformfieldtag, Itable‐
       tag,  Ifloattag  or  Ispacertag.  For each of these values, there is an
       additional structure defined, which includes Item as an unnamed initial
       substructure, and then defines additional fields.

       Items of type Itexttag represent a piece of text, using  the  following
       structure:

       struct Itext
       {
             Item;
             Rune* s;
             int   fnt;
             int   fg;
             uchar voff;
             uchar ul;
       };

       Here  s  is  a  null-terminated Unicode string of the actual characters
       making up this text item, fnt is the font number (described in the sec‐
       tion Font Numbers), and fg is the RGB encoded color for the text.  Voff
       measures the vertical offset from the baseline;  subtract  Voffbias  to
       get the actual value (negative values represent a displacement down the
       page).   The  field  ul is the underline style: ULnone if no underline,
       ULunder for conventional underline, and ULmid for strike-through.

       Items of type Iruletag represent a horizontal rule, as follows:

       struct Irule
       {
             Item;
             uchar align;
             uchar noshade;
             int   size;
             Dimen wspec;
       };

       Here align is the alignment specification (described in the correspond‐
       ing section), noshade is set if the rule should not be shaded, size  is
       the height of the rule (as set by the size attribute), and wspec is the
       desired width (see section Dimension Specifications).

       Items of type Iimagetag describe embedded images, for which the follow‐
       ing structure is defined:

       struct Iimage
       {
             Item;
             Rune*   imsrc;
             int     imwidth;
             int     imheight;
             Rune*   altrep;
             Map*    map;
             int     ctlid;
             uchar   align;
             uchar   hspace;
             uchar   vspace;
             uchar   border;
             Iimage* nextimage;
       };

       Here  imsrc  is  the  URL of the image source, imwidth and imheight, if
       non-zero, contain the specified width and height for the image, and al‐
       trep is the text to use as an alternative to the image, if the image is
       not displayed.  Map, if set, points to a structure describing an  asso‐
       ciated  client-side image map.  Ctlid is reserved for use by the appli‐
       cation, for handling animated  images.   Align  encodes  the  alignment
       specification  of  the  image.  Hspace contains the number of pixels to
       pad the image with on either side, and Vspace the padding above and be‐
       low.  Border is the width of the border to draw around the image.  Nex‐
       timage points to the next image in the document (the head of this  list
       is Docinfo.images).

       For items of type Iformfieldtag, the following structure is defined:

       struct Iformfield
       {
             Item;
             Formfield* formfield;
       };

       This  adds  a  single field, formfield, which points to a structure de‐
       scribing a field in a form, described in section Forms.

       For items of type Itabletag, the following structure is defined:

       struct Itable
       {
             Item;
             Table* table;
       };

       Table points to a structure describing the table, described in the sec‐
       tion Tables.

       For items of type Ifloattag, the following structure is defined:

       struct Ifloat
       {
             Item;
             Item*   item;
             int     x;
             int     y;
             uchar   side;
             uchar   infloats;
             Ifloat* nextfloat;
       };

       The item points to a single item (either a  table  or  an  image)  that
       floats  (the  text of the document flows around it), and side indicates
       the margin that this float sticks to; it is either ALleft  or  ALright.
       X  and  y  are reserved for use by the caller; these are typically used
       for the coordinates of the top of the float.  Infloats is used  by  the
       caller  to keep track of whether it has placed the float.  Nextfloat is
       used by the caller to link together all  of  the  floats  that  it  has
       placed.

       For items of type Ispacertag, the following structure is defined:

       struct Ispacer
       {
             Item;
             int   spkind;
       };

       Spkind  encodes  the  kind  of  spacer, and may be one of ISPnull (zero
       height and width), ISPvline (takes on height and ascent of the  current
       font),  ISPhspace  (has  the  width of a space in the current font) and
       ISPgeneral (for all other purposes, such as between markers and lists).

   Generic Attributes
       The genattr field of an item, if non-nil, points to  a  structure  that
       holds  the  values  of  attributes  not specific to any particular item
       type, as they occur on a wide variety of  underlying  HTML  tags.   The
       structure is as follows:

       typedef struct Genattr Genattr;
       struct Genattr
       {
             Rune*   id;
             Rune*   class;
             Rune*   style;
             Rune*   title;
             SEvent* events;
       };

       Fields id, class, style and title, when non-nil, contain values of cor‐
       respondingly  named  attributes  of  the  HTML tag associated with this
       item.  Events is a linked list of events (with  corresponding  scripted
       actions) associated with the item:

       typedef struct SEvent SEvent;
       struct SEvent
       {
             SEvent* next;
             int     type;
             Rune*   script;
       };

       Here,  next  points to the next event in the list, type is one of SEon‐
       blur, SEonchange,  SEonclick,  SEondblclick,  SEonfocus,  SEonkeypress,
       SEonkeyup,  SEonload, SEonmousedown, SEonmousemove, SEonmouseout, SEon‐
       mouseover, SEonmouseup, SEonreset, SEonselect,  SEonsubmit  or  SEonun‐
       load, and script is the text of the associated script.

   Dimension Specifications
       Some  structures include a dimension specification, used where a number
       can be followed by a % or a * to indicate percentage of total or  rela‐
       tive weight.  This is encoded using the following structure:

       typedef struct Dimen Dimen;
       struct Dimen
       {
             int kindspec;
       };

       Separate  kind and spec values are extracted using dimenkind and dimen‐
       spec.  Dimenkind returns one of Dnone, Dpixels, Dpercent or  Drelative.
       Dnone  means  that no dimension was specified.  In all other cases, di‐
       menspec should be called to find the absolute  number  of  pixels,  the
       percentage of total, or the relative weight.

   Background Specifications
       It  is  possible to set the background of the entire document, and also
       for some parts of the document (such as tables).  This  is  encoded  as
       follows:

       typedef struct Background Background;
       struct Background
       {
             Rune* image;
             int   color;
       };

       Image, if non-nil, is the URL of an image to use as the background.  If
       this  is  nil, color is used instead, as the RGB value for a solid fill
       color.

   Alignment Specifications
       Certain items have alignment specifiers taken from the  following  enu‐
       merated type:

       enum
       {
             ALnone = 0, ALleft, ALcenter, ALright, ALjustify,
             ALchar, ALtop, ALmiddle, ALbottom, ALbaseline
       };

       These  values  correspond  to  the various alignment types named in the
       HTML 4.0 standard.  If an item has an alignment of ALleft  or  ALright,
       the library automatically encapsulates it inside a float item.

       Tables,  and  the  various  rows, columns and cells within them, have a
       more complex alignment specification, composed of separate vertical and
       horizontal alignments:

       typedef struct Align Align;
       struct Align
       {
             uchar halign;
             uchar valign;
       };

       Halign can be one of ALnone, ALleft, ALcenter,  ALright,  ALjustify  or
       ALchar.   Valign can be one of ALnone, ALmiddle, ALbottom, ALtop or AL‐
       baseline.

   Font Numbers
       Text items have an associated font number (the fnt field), which is en‐
       coded as style*NumSize+size.  Here, style is one of FntR, FntI, FntB or
       FntT, for roman, italic, bold and typewriter font styles, respectively,
       and size is Tiny, Small, Normal, Large or Verylarge.  The total  number
       of  possible  font  numbers  is  NumFnt, and the default font number is
       DefFnt (which is roman style, normal size).

   Document Info
       Global information about an HTML page is stored in the following struc‐
       ture:

       typedef struct Docinfo Docinfo;
       struct Docinfo
       {
             // stuff from HTTP headers, doc head, and body tag
             Rune*       src;
             Rune*       base;
             Rune*       doctitle;
             Background  background;
             Iimage*     backgrounditem;
             int         text;
             int         link;
             int         vlink;
             int         alink;
             int         target;
             int         chset;
             int         mediatype;
             int         scripttype;
             int         hasscripts;
             Rune*       refresh;
             Kidinfo*    kidinfo;
             int         frameid;

             // info needed to respond to user actions
             Anchor*     anchors;
             DestAnchor* dests;
             Form*       forms;
             Table*      tables;
             Map*        maps;
             Iimage*     images;
       };

       Src gives the URL of the original source of the document, and  base  is
       the  base  URL.   Doctitle is the document's title, as set by a <title>
       element.  Background is as described in the section Background Specifi‐
       cations, and backgrounditem is set to be an image item  for  the  docu‐
       ment's  background  image (if given as a URL), or else nil.  Text gives
       the default foregound text color of the document,  link  the  unvisited
       hyperlink color, vlink the visited hyperlink color, and alink the color
       for  highlighting hyperlinks (all in 24-bit RGB format).  Target is the
       default target frame id.  Chset and mediatype are as for the chset  and
       mtype  parameters  to parsehtml.  Scripttype is the type of any scripts
       contained in the document, and is always TextJavascript.  Hasscripts is
       set if the document contains any scripts.  Scripting is  currently  un‐
       supported.   Refresh is the contents of a <meta http-equiv=Refresh ...>
       tag, if any.  Kidinfo is set if this document is a frameset  (see  sec‐
       tion Frames).  Frameid is this document's frame id.

       Anchors is a list of hyperlinks contained in the document, and dests is
       a  list  of  hyperlink  destinations within the page (see the following
       section for details).  Forms, tables and maps are lists of the  various
       forms,  tables  and  client-side maps contained in the document, as de‐
       scribed in subsequent sections.  Images is a  list  of  all  the  image
       items in the document.

   Anchors
       The library builds two lists for all of the <a> elements (anchors) in a
       document.   Each anchor is assigned a unique anchor id within the docu‐
       ment.  For anchors which are hyperlinks (the href  attribute  was  sup‐
       plied), the following structure is defined:

       typedef struct Anchor Anchor;
       struct Anchor
       {
             Anchor* next;
             int     index;
             Rune*   name;
             Rune*   href;
             int     target;
       };

       Next  points  to  the next anchor in the list (the head of this list is
       Docinfo.anchors).  Index is the anchor id; each item within this hyper‐
       link is tagged with this value in its anchorid field.   Name  and  href
       are  the  values  of the correspondingly named attributes of the anchor
       (in particular, href is the URL to go to).  Target is the value of  the
       target attribute (if provided) converted to a frame id.

       Destinations  within the document (anchors with the name attribute set)
       are held in the Docinfo.dests list, using the following structure:

       typedef struct DestAnchor DestAnchor;
       struct DestAnchor
       {
             DestAnchor* next;
             int         index;
             Rune*       name;
             Item*       item;
       };

       Next is the next element of the list, index is the anchor id,  name  is
       the  value of the name attribute, and item is points to the item within
       the parsed document that should be considered to be the destination.

   Forms
       Any  forms  within  a  document  are  kept  in  a   list,   headed   by
       Docinfo.forms.  The elements of this list are as follows:

       typedef struct Form Form;
       struct Form
       {
             Form*      next;
             int        formid;
             Rune*      name;
             Rune*      action;
             int        target;
             int        method;
             int        nfields;
             Formfield* fields;
       };

       Next  points  to  the next form in the list.  Formid is a serial number
       for the form within the document.  Name is the value of the form's name
       or id attribute.  Action is the value of any action attribute.   Target
       is the value of the target attribute (if any) converted to a frame tar‐
       get  id.   Method  is  one  of HGet or HPost.  Nfields is the number of
       fields in the form, and fields is a linked list of the actual fields.

       The individual fields in a form are described by the  following  struc‐
       ture:

       typedef struct Formfield Formfield;
       struct Formfield
       {
             Formfield* next;
             int        ftype;
             int        fieldid;
             Form*      form;
             Rune*      name;
             Rune*      value;
             int        size;
             int        maxlength;
             int        rows;
             int        cols;
             uchar      flags;
             Option*    options;
             Item*      image;
             int        ctlid;
             SEvent*    events;
       };

       Here,  next points to the next field in the list.  Ftype is the type of
       the field, which can be one of  Ftext,  Fpassword,  Fcheckbox,  Fradio,
       Fsubmit, Fhidden, Fimage, Freset, Ffile, Fbutton, Fselect or Ftextarea.
       Fieldid  is a serial number for the field within the form.  Form points
       back to the form containing this field.  Name, value, size,  maxlength,
       rows  and  cols  each contain the values of corresponding attributes of
       the field, if  present.   Flags  contains  per-field  flags,  of  which
       FFchecked and FFmultiple are defined.  Image is only used for fields of
       type Fimage; it points to an image item containing the image to be dis‐
       played.   Ctlid is reserved for use by the caller, typically to store a
       unique id of an associated control used to implement the field.  Events
       is the same as the corresponding field of the generic attributes  asso‐
       ciated  with  the  item containing this field.  Options is only used by
       fields of type Fselect; it consists of a list of possible options  that
       may be selected for that field, using the following structure:

       typedef struct Option Option;
       struct Option
       {
             Option* next;
             int     selected;
             Rune*   value;
             Rune*   display;
       };

       Next  points  to the next element of the list.  Selected is set if this
       option is to be displayed initially.  Value is the value to  send  when
       the  form  is  submitted  if  this  option is selected.  Display is the
       string to display on the screen for this option.

   Tables
       The library builds a list of all the tables in the document, headed  by
       Docinfo.tables.  Each element of this list has the following format:

       typedef struct Table Table;
       struct Table
       {
             Table*       next;
             int          tableid;
             Tablerow*    rows;
             int          nrow;
             Tablecol*    cols;
             int          ncol;
             Tablecell*   cells;
             int          ncell;
             Tablecell*** grid;
             Align        align;
             Dimen        width;
             int          border;
             int          cellspacing;
             int          cellpadding;
             Background   background;
             Item*        caption;
             uchar        caption_place;
             Lay*         caption_lay;
             int          totw;
             int          toth;
             int          caph;
             int          availw;
             Token*       tabletok;
             uchar        flags;
       };

       Next  points  to  the next element in the list of tables.  Tableid is a
       serial number for the table within the document.  Rows is an  array  of
       row specifications (described below) and nrow is the number of elements
       in  this  array.  Similarly, cols is an array of column specifications,
       and ncol the size of this array.  Cells is a list of all  cells  within
       the  table  (structure described below) and ncell is the number of ele‐
       ments in this list.  Note that a cell may  span  multiple  rows  and/or
       columns,  thus  ncell may be smaller than nrow*ncol.  Grid is a two-di‐
       mensional array of cells within the table; the cell at row i and column
       j is Table.grid[i][j].  A cell that spans multiple rows and/or  columns
       will  be  referenced by grid multiple times, however it will only occur
       once in cells.  Align gives the alignment specification for the  entire
       table,  and  width  gives the requested width as a dimension specifica‐
       tion.  Border, cellspacing and cellpadding give the values of the  cor‐
       responding attributes for the table, and background gives the requested
       background for the table.  Caption is a linked list of items to be dis‐
       played  as the caption of the table, either above or below depending on
       whether caption_place is ALtop or  ALbottom.   Most  of  the  remaining
       fields  are  reserved  for use by the caller, except tabletok, which is
       reserved for internal use.  The type Lay is not defined by the library;
       the caller can provide its own definition.

       The Tablecol structure is defined for use by the caller.   The  library
       ensures  that the correct number of these is allocated, but leaves them
       blank.  The fields are as follows:

       typedef struct Tablecol Tablecol;
       struct Tablecol
       {
             int   width;
             Align align;
             Point pos;
       };

       The rows in the table are specified as follows:

       typedef struct Tablerow Tablerow;
       struct Tablerow
       {
             Tablerow*  next;
             Tablecell* cells;
             int        height;
             int        ascent;
             Align      align;
             Background background;
             Point      pos;
             uchar      flags;
       };

       Next is only used during parsing; it should be ignored by  the  caller.
       Cells  provides  a list of all the cells in a row, linked through their
       nextinrow fields (see below).  Height, ascent and pos are reserved  for
       use  by  the caller.  Align is the alignment specification for the row,
       and background is the background to use, if specified.  Flags  is  used
       by the parser; ignore this field.

       The individual cells of the table are described as follows:

       typedef struct Tablecell Tablecell;
       struct Tablecell
       {
             Tablecell* next;
             Tablecell* nextinrow;
             int        cellid;
             Item*      content;
             Lay*       lay;
             int        rowspan;
             int        colspan;
             Align      align;
             uchar      flags;
             Dimen      wspec;
             int        hspec;
             Background background;
             int        minw;
             int        maxw;
             int        ascent;
             int        row;
             int        col;
             Point      pos;
       };

       Next is used to link together the list of all cells within a table (Ta‐
       ble.cells),  whereas  nextinrow  is used to link together all the cells
       within a single row (Tablerow.cells).  Cellid provides a serial  number
       for  the  cell within the table.  Content is a linked list of the items
       to be laid out within the cell.  Lay is reserved for the  user  to  de‐
       scribe how these items have been laid out.  Rowspan and colspan are the
       number  of  rows and columns spanned by this cell, respectively.  Align
       is the alignment specification for the cell.  Flags is some combination
       of TFparsing, TFnowrap and TFisth or'd  together.   Here  TFparsing  is
       used  internally  by the parser, and should be ignored.  TFnowrap means
       that the contents of the cell should not be wrapped if they  don't  fit
       the  available  width,  rather, the table should be expanded if need be
       (this is set when the nowrap attribute is supplied).  TFisth means that
       the cell was created by the <th> element (rather  than  the  <td>  ele‐
       ment),  indicating  that  it  is a header cell rather than a data cell.
       Wspec provides a suggested width  as  a  dimension  specification,  and
       hspec  provides a suggested height in pixels.  Background gives a back‐
       ground specification for the individual cell.  Minw, maxw,  ascent  and
       pos are reserved for use by the caller during layout.  Row and col give
       the  indices  of  the row and column of the top left-hand corner of the
       cell within the table grid.

   Client-side Maps
       The library builds a list of client-side maps, headed by  Docinfo.maps,
       and having the following structure:

       typedef struct Map Map;
       struct Map
       {
             Map*  next;
             Rune* name;
             Area* areas;
       };

       Next  points  to  the next element in the list, name is the name of the
       map (use to bind it to an image), and areas is  a  list  of  the  areas
       within the image that comprise the map, using the following structure:

       typedef struct Area Area;
       struct Area
       {
             Area*  next;
             int    shape;
             Rune*  href;
             int    target;
             Dimen* coords;
             int    ncoords;
       };

       Next  points to the next element in the map's list of areas.  Shape de‐
       scribes the shape of the area, and is one of SHrect,  SHcircle  or  SH‐
       poly.   Href  is the URL associated with this area in its role as a hy‐
       pertext link, and target is the target frame it should  be  loaded  in.
       Coords  is  an  array  of coordinates for the shape, and ncoords is the
       size of this array (number of elements).

   Frames
       If the Docinfo.kidinfo field is set, the document is  a  frameset.   In
       this  case,  it  is  typical for parsehtml to return nil, as a document
       which is a frameset should have no actual items that need  to  be  laid
       out  (such  will  appear only in subsidiary documents).  It is possible
       that items will be returned by a malformed document; the caller  should
       check for this and free any such items.

       The  Kidinfo  structure  itself reflects the fact that framesets can be
       nested within a document.  If is defined as follows:

       typedef struct Kidinfo Kidinfo;
       struct Kidinfo
       {
             Kidinfo* next;
             int      isframeset;

             // fields for "frame"
             Rune*    src;
             Rune*    name;
             int      marginw;
             int      marginh;
             int      framebd;
             int      flags;

             // fields for "frameset"
             Dimen*   rows;
             int      nrows;
             Dimen*   cols;
             int      ncols;
             Kidinfo* kidinfos;
             Kidinfo* nextframeset;
       };

       Next is only used if this structure is part of a  containing  frameset;
       it points to the next element in the list of children of that frameset.
       Isframeset  is set when this structure represents a frameset; if clear,
       it is an individual frame.

       Some fields are used only for framesets.  Rows is an array of dimension
       specifications for rows in the frameset, and nrows  is  the  length  of
       this  array.   Cols  is  the corresponding array for columns, of length
       ncols.  Kidinfos points to a list of components contained  within  this
       frameset,  each of which may be a frameset or a frame.  Nextframeset is
       only used during parsing, and should be ignored.

       The remaining fields are used if the structure describes a frame, not a
       frameset.  Src provides the URL for the document that  should  be  ini‐
       tially  loaded  into this frame.  Note that this may be a relative URL,
       in which case it should be interpretted using the containing document's
       URL as the base.  Name gives the name of the frame, typically  supplied
       via  a  name  attribute in the HTML.  If no name was given, the library
       allocates one.  Marginw, marginh and framebd are the values of the mar‐
       ginwidth, marginheight and frameborder attributes, respectively.  Flags
       can contain some combination of the following:  FRnoresize  (the  frame
       had  the  noresize attribute set, and the user should not be allowed to
       resize it), FRnoscroll (the frame should not  have  any  scroll  bars),
       FRhscroll  (the  frame  should have a horizontal scroll bar), FRvscroll
       (the frame should have a vertical scroll bar), FRhscrollauto (the frame
       should be automatically given a horizontal scroll bar if  its  contents
       would  not otherwise fit), and FRvscrollauto (the frame gets a vertical
       scrollbar only if required).

SOURCE
       /sys/src/libhtml

SEE ALSO
       fmt(1)

       W3C World Wide Web Consortium, ‘‘HTML 4.01 Specification''.

BUGS
       The entire HTML document must be loaded into memory before  any  of  it
       can be parsed.

                                                                       HTML(2)