COLLGUI(1)							   COLLGUI(1)
NAME
  collgui - a graphical front-end for collect
SYNOPSIS
  collgui [-d] [-vga] [-size ] [ [ [...]]]
FLAGS
  -vga
      allows collgui to be used on a VGA-sized (640x480) display.
  -size 
      sets the font size to the given value, which influences the size of
      the window.  The default size is 12.
  -d  Turns on debugging output
DESCRIPTION
  collgui is intended to help evaluate data collected by collect.  It
  operates as a go-between among collect, cfilt, and gnuplot.  That is why it
  is a good idea to understand cfilt, especially if you want to do
  complicated or non-standard things.  Help on cfilt can be obtained by
  reading the cfilt(1) manpage or running cfilt -h.
  collgui automates the extraction of information from a binary data file
  written by collect, and feeds it to gnuplot to produce a (hopefully) useful
  graphical rendition of the data.  It saves you having to type the same
  stuff repeatedly.
  collgui, and, more importantly, cfilt, are dogs.  They just don't run very
  fast.  As far as collgui is concerned, speed is not important -- it just
  takes a while to start!  However, cfilt must wade through oodles of text to
  get the data it wants.  Perl is very nice for wading through data, though.
  I also realize that cfilt is unreadable and very much of a hack, however
  I've made a pretty serious attempt to make sure that it does what you tell
  it to.
  As you will see, collgui offers two different methods for selecting samples
  for graphing. Close to the top, you can set the START: and END: times.
  These arguments get passed to collect, so, if your data-file contains lots
  of samples, but you only want a fraction of them, using START: and END:
  will substantially speed up your extraction because the selection is han-
  dled by collect itself.  Further down, you can set an X Range for gnuplot.
  This has more-or-less the same effect as setting START: and END:, but col-
  lect provides all samples, and cfilt must extract for all selected subsys-
  tems from this data.	This is slower.	 The difference is where the time-
  selection is done.  Of course, you may want to set START: and END:, and
  still set the X Range in order to give gnuplot explicit instructions as to
  what should be displayed -- gnuplot tends to want to use round numbers for
  the beginning and end of ranges.
  When you save a user-defined setting/configuration, a unique ID is saved
  with it, consisting of filename (no path) plus file size.  When you recall
  this setting, if the unique ID of your current 'open' data-file matches the
  saved one, things like START:, END:, X-range, Y-range, average samples,
  X-units, and 'samples w/process data' are also restored.  If the unique IDs
  don't match, then only the subsystem settings are restored.
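  The matching rule above can be sketched in a few lines of Python.  This is
  only an illustration, not collgui's own code: the function names, the ':'
  separator inside the ID, and the group names are all assumptions made for
  the example.

```python
import os

def unique_id(path, size):
    """Build an ID from the file name (no path) plus the file size.

    The ':' separator is an assumption made for this sketch.
    """
    return f"{os.path.basename(path)}:{size}"

def groups_to_restore(saved_id, current_id):
    """Return which groups of settings a recalled configuration restores."""
    groups = ["subsystems"]  # subsystem settings are always restored
    if saved_id == current_id:
        # Same data file: the file-specific settings are restored as well.
        groups += ["start_end", "x_range", "y_range",
                   "average_samples", "x_units", "process_samples"]
    return groups

saved = unique_id("/data/run1.cgd", 2048)    # hypothetical file and size
current = unique_id("/tmp/run1.cgd", 2048)   # same name and size, other path
print(groups_to_restore(saved, current))
```

  Because the path is not part of the ID, a data file that has been moved to
  another directory still matches its saved settings.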
SELECTION MECHANISMS
  There are certain 'features' of collgui that are worthy of mention.  In
  particular, the mechanism by which one of many objects (such as LSM
  Volumes, Disks, Tapes, Single CPUs) can be selected is a bit particular.
  If there are fewer than a fixed number of objects (~30), a MenuButton is
  created (when 'Add' is pressed, a vertical list is presented).  If the
  number is greater than this constant, a separate window is created with a
  listbox containing all possible objects.  A double-click on an object in
  the listbox will add it to the selection listbox.
  The selection mechanism for processes deserves special mention: this is
  always a separate window with a listbox, a slider marked 'sample', and a
  button marked 'List Processes' next to it.  Using the slider, a sample
  (record) can be selected from the collection period.  Double-clicking on a
  process will enter its PID in the selection listbox.
  Currently, it is only possible to select processes using their PIDs.	In
  the future it may be possible to select using usernames or commands.	At
  the top of each column is a button which turns red when the mouse is over
  it.  Pressing the button will cause the list to be sorted using the values
  in the button's column.
INTERFACE
  A look at the main window of collgui, from top to bottom:
  Menu
  File
  Open	    'opens' a collect binary data file and gets some info from it
  Exit	    guess!
  Options
  Legend Position
	    set the position of the labels for the various lines graphed
  X Axis Label
	    Label (or not) the X Axis. This can save valuable graph real
	    estate.
  X Time Format
	    controls the format of the X-Axis tic-mark labels.
  Set Y Label
	    allows user to specify the label for the Y-Axis, rather than
	    using the default, "KB/Transfers/Packets/Pages/etc".
  Image Format
	    allows users to choose JPEG, PPM (Portable Pixmap - Color), or PBM
	    (Portable Bitmap - Grayscale).
  Settings
  Save	    saves the current configuration -- that is, all the information
	    needed to reproduce the current graph.  The information is saved
	    in $HOME/.collguirc and read in on start.
  Delete    presents a list of user-defined settings.  Double-clicking on
	    an entry will remove it from the list; clicking on 'commit' will
	    save your changes.
  Built In  I have defined some basic settings for looking at data, not
	    necessarily very useful.
  User Defined
	    Here you can choose settings previously stored.
  FILE indicator, START: and END:
  The FILE indicator shows you the name of the currently 'open' collect
  binary data file.
  The START: and END: areas allow you to specify a time-range for samples to
  be extracted from the binary data file.  They are set to the times of the
  first and last sample respectively (i.e., the whole run) when you open a
  file.	 The 'RESET' button at the bottom of the window will restore these to
  their default values.	 These values are passed directly to collect during
  playback, so this is the fastest method for extracting a sub-range from
  your collection period.
  X: From: To: and Y: From: To:
  These areas allow you to specify gnuplot X and Y ranges.  The X
  entry-windows are wider than those of Y because you can specify a time
  when 'time' is chosen for X Axis Units.
  Average Samples and X Units:
  The average samples area allows you to cause cfilt to take an average over
  N samples. That is, for N samples read from the binary data file, cfilt
  will produce 1 output line with average values.  X Units allows you to
  specify either 'time' or 'samples' for the horizontal axis of the graph.
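  As a rough illustration of what averaging over N samples means, here is a
  minimal Python sketch.  It is not cfilt's implementation; the function name
  is made up, and averaging a short trailing group over the samples that
  remain is an assumption of this example.

```python
def average_samples(values, n):
    """Collapse each run of n consecutive samples into one averaged value."""
    averaged = []
    for start in range(0, len(values), n):
        group = values[start:start + n]
        # Assumption: a short final group is averaged over what is left.
        averaged.append(sum(group) / len(group))
    return averaged

# Six one-second samples averaged in groups of three give two output values.
print(average_samples([1, 2, 3, 4, 5, 6], 3))
```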
  Samples w/Process Data:
  This solves the problem of graphing intermittently gathered process data
  against constantly gathered other data.  For example, if you specified an
  interval for collect of -i1,4, thereby collecting process data only every
  four seconds, and you try to graph that against, say cpu idle time, which
  was gathered every second, you will get zeros for the process data for sam-
  ples in which no process data appears.  Clicking on the 'Samples w/Process
  Data' checkbutton will cause, via 'cfilt -p', only samples with process
  data to be used.  In the above example, it would be the equivalent of
  taking every 4th sample (actually it would mean taking samples
  #1,5,9,13,17,etc).
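  The -i1,4 example can be reproduced with a small Python sketch.  The record
  layout (a dict with 'n' and 'procs' keys) is invented for illustration and
  says nothing about the real binary format or about how cfilt -p works
  internally.

```python
def keep_process_samples(samples):
    """Keep only the samples that carry process data."""
    return [s for s in samples if s["procs"] is not None]

# Simulate 17 one-second samples in which process data is gathered only
# every fourth second, as with an interval of -i1,4.
samples = [
    {"n": n, "procs": ["some-process"] if (n - 1) % 4 == 0 else None}
    for n in range(1, 18)
]

kept = [s["n"] for s in keep_process_samples(samples)]
print(kept)  # [1, 5, 9, 13, 17]
```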
  Subsystem Widgets:
  The first 6 subsystems have multiline output from collect.  Therefore it
  is possible to select using certain criteria (see cfilt).  The last 2 only
  have a single output line from collect.  When you 'open' a binary data
  file, the 'Add' and 'Expressions' controls are enabled for subsystems for
  which data was collected or that exist.  Otherwise 'Expressions' will be
  grayed-out, and the 'Add' button will say 'No Data'.
  The widget controls do nothing more than give you a graphical front-end to
  cfilt.  The 'Expressions' menu offers a comfortable way of choosing cfilt
  expressions.	The 'sum' checkbutton corresponds to the '+' sign at the end
  of the subsystem name in an expression.  The values in the listbox
  correspond to the selection criterion for cfilt, that is
  =,,...,.  You are perfectly welcome to type in
  the expression window itself.	 If you specify an illegal expression, you
  will get an error message.
  SUBSYSTEM EXPRESSIONS OF NOTE
  Single CPU
  SMP Stack This will create a graph consisting of a set of horizontal
	    'bands', one per CPU.  It is recommended to select all CPUs when
	    using this display option.
  CPU Summary
  State Stack
	    This graphs four values for all CPUs: USER+SYS+IDLE+WAIT,
	    USER+SYS+IDLE, USER+SYS, and USER, as in 'SMP Stack' under 'Sin-
	    gle CPU'.
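  The four 'State Stack' values are running totals of the CPU states.  A
  minimal Python sketch of that arithmetic follows; the function name and the
  sample percentages are made up for the example.

```python
from itertools import accumulate

def state_stack(user, system, idle, wait):
    """Return the four stacked values, from the topmost curve down to USER."""
    u, u_s, u_s_i, total = accumulate([user, system, idle, wait])
    # Order as listed above: USER+SYS+IDLE+WAIT, USER+SYS+IDLE, USER+SYS, USER
    return [total, u_s_i, u_s, u]

# Hypothetical sample: 10% user, 5% system, 80% idle, 5% wait.
print(state_stack(10, 5, 80, 5))  # [100, 95, 15, 10]
```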
  DISPLAY, PRINT, {JPEG|PPM|PBM}, and RESET buttons:
  DISPLAY will graph your selections to the gnuplot 'x11' output device,
  whereas PRINT will set the device to 'postscript', and ask you for an out-
  put filename (which may be "|lpr -Pfoobar", to route directly to printer).
  You can also set the environment variable COLLGUI_PRINT to a default you
  like (you still get prompted for a filename, but get the contents of this
  variable COLLGUI_PRINT as default).  The {JPEG|PPM|PBM} button produces an
  image file in the corresponding format.  RESET clears most settings, and
  sets the rest to reasonable startup values (like START: and END: time).
QUICK START
  Here is a quick guide for those who want to jump in without looking at the
  cfilt readme/help:
  Let's take 'Disks' for an example.  If you click on 'Expressions' and
  select 'KB/Sec', and then, without selecting any specific disks, click on
  the 'DISPLAY' button, you will get the TOTAL KiloBytes/Second throughput
  for all disks for which data was collected. Data is totalled because the
  list on the right is empty.  cfilt assumes, since you have not selected any
  particular disk(s), you want a grand total.  If, however, you now add 'rz0'
  and 'rz1' (assuming these disks exist on your system, and you collected
  data for them), you will now get two lines graphed, KB/Sec for rz0 and
  KB/Sec for rz1.  Now if you click on Expressions and select '%Busy', you
  will get 4 lines: 'KB/Sec' and '%Busy' for rz0, and 'KB/Sec' and '%Busy'
  for rz1.  Now if you click on the 'sum' Checkbutton, (and the DISPLAY), you
  will get only 2 lines this time: 'KB/Sec' for rz0+rz1, and '%Busy' for
  rz0+rz1. 'sum' sums over all objects in the listbox, or over all objects
  for which data was collected if no specific object has been selected (i.e.,
  the listbox is empty).
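  The rules in this walk-through can be summarized in a short Python sketch
  that lists which lines end up on the graph.  The function and the label
  format are illustrative only and are not taken from collgui or cfilt.

```python
def graphed_lines(expressions, objects, sum_selected=False):
    """Return a label for every line that would be graphed."""
    if not objects:
        # Nothing selected: one grand total per expression.
        return [f"{e} (total)" for e in expressions]
    if sum_selected:
        # 'sum' checked: one line per expression, summed over the objects.
        return [f"{e} for {'+'.join(objects)}" for e in expressions]
    # Otherwise: one line per expression and object.
    return [f"{e} for {o}" for o in objects for e in expressions]

print(graphed_lines(["KB/Sec"], []))
print(graphed_lines(["KB/Sec", "%Busy"], ["rz0", "rz1"]))
print(graphed_lines(["KB/Sec", "%Busy"], ["rz0", "rz1"], sum_selected=True))
```

  The three calls correspond to the one-line, four-line, and two-line graphs
  described above.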
  It is sometimes useful to graph dissimilar data together, for example cpu
  idle and disk KB/sec.	 Using gnuplot (at the moment) one only has one vert-
  ical scale.  In order to get such incongruous data together in a reasonable
  fashion on the same graph, data may have to be 'normalized' ('scaled' to
  fit into a particular range, typically 0-100).  Sticking a percent sign
  ('%') on the end of an expression will cause this data to be normalized.
  For all reasonable expression possibilities, I have offered 'Normalized'
  and 'Raw' options. The only difference is the '%' on the end.	 You can also
  choose the end of the normalized range yourself by giving that value after
  the '%', for example: disk:rkb/s+wkb/s%150
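  How cfilt computes the scaling is not described here.  The following
  Python sketch shows one plausible reading, under the assumption that values
  are scaled linearly so that the largest one lands on the top of the range
  (100 by default, or the value given after the '%').  The function name is
  hypothetical.

```python
def normalize(values, top=100.0):
    """Scale values linearly so that the largest maps to `top`.

    Assumption: the range starts at 0 and the peak value defines the scale.
    """
    peak = max(values, default=0)
    if peak == 0:
        return [0.0 for _ in values]
    return [v * top / peak for v in values]

print(normalize([0, 50, 200]))     # [0.0, 25.0, 100.0]
print(normalize([1, 2, 4], 150))   # [37.5, 75.0, 150.0]
```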
X RESOURCES
  collgui relies on the default colors of Tk, which is usually OK.  However,
  under CDE there are problems.  If you have the problem that you cannot see
  any text in the entry widgets, try putting:
  Collgui*foreground: black
  Collgui*background: white
  in your ~/.Xdefaults file.  Now merge this change into your in-memory
  resource database using:
  xrdb -merge ~/.Xdefaults
  That should do it.
ENVIRONMENT VARIABLES
  COLLGUI_PRINT
	    default printer file, can also be "|lpr -lpr -flags -here" (see
	    gnuplot help "set output")
  COLLECT   the name/path of collect, if not collect, or if collect is not in
	    your path (for example 'collect3' or '/usr/foo/bin/collect4')
  CFILT	    the name/path for cfilt, if not cfilt, or if cfilt is not in your
	    path
  GNUPLOT   the name/path for gnuplot, if not gnuplot, or if gnuplot is not
	    in your path
  CJPEG	    the name/path for the cjpeg program, which is used to convert PPM
	    image files to JPEG.
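  The lookup pattern these variables imply (use the variable if set, else
  fall back to the plain program name) can be sketched in Python as follows.
  This is not collgui's code; the default 'cjpeg' for CJPEG and the treatment
  of an empty variable as unset are assumptions of this example.

```python
import os

# Fallback program names, used when the variable is not set.
DEFAULTS = {
    "COLLECT": "collect",
    "CFILT": "cfilt",
    "GNUPLOT": "gnuplot",
    "CJPEG": "cjpeg",  # assumed default name
}

def resolve_programs(env=None):
    """Map each variable to its value, or to the default program name."""
    env = os.environ if env is None else env
    # Assumption: an empty value is treated the same as an unset variable.
    return {name: env.get(name) or default
            for name, default in DEFAULTS.items()}

programs = resolve_programs({"COLLECT": "/usr/foo/bin/collect4"})
print(programs["COLLECT"], programs["CFILT"])
```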