Voigt profile fitting program

Copyright (C) 1992 - 2008   R.F.Carswell, J.K.Webb


Last update: 05 Sept 2011


VERSIONS 9.5 & 10.0





This program is free software, distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. For a copy of the GNU General Public License see the information which comes e.g. with the GNU Emacs editor, or write to the Free Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.

Contact: R.F.Carswell, Institute of Astronomy, Madingley Rd, Cambridge CB3 0HA, UK
  rfc(at)ast.cam.ac.uk

There are versions of the documentation which do not involve navigating through this series of unstructured web pages, and which are certainly more up-to-date (just look at the web page dates!). You should look at vpfit9.5.pdf, which is also contained in the gzipped tarfile with the source code, or, if you are using the current (slowly evolving) version, vpfit10.pdf. There are certainly some things there which are not covered in these web pages, and it is possible, but increasingly unlikely, that the web pages have some things which are not in the .pdf documentation. If so, anything which appears only here may not work - it won't have been tested for quite a while!

And there is probably a version which is still evolving (slowly), with a few enhancements and minor bugs corrected. Try version 10.0. Documentation for VPFIT10 and RDGEN10 is also in the public ftp area.

CONTENTS & LINKS:
 INTRODUCTION
 COPYING THE CODE
 ENVIRONMENT VARIABLES
 DATA REQUIRED
 OUTPUT FILES
 INTERACTIVE USE
    Single Features, (An example from UVES)
    Complex features
    Multiple lines
 RUNNING FROM FILE
 CONTINUUM ABSORPTION
 CONSTRAINTS ON LINE PARAMETERS
 ZERO LEVEL
 CONTINUUM LEVEL
 UPPER LIMITS
 OPTIONAL EXTRAS
 EMISSION LINES
 ADDING EXTRA LINES
 AUTO CONTINUUM
 OTHER FEATURES & IMPROVEMENTS
 .. AND FINALLY

INTRODUCTION: The VPFIT program enables you to fit multiple Voigt profiles (convolved with the instrument profiles) to spectroscopic data. If you don't want to do this, read no further. If you want to display data, play around with data and models, try "chi-by-eye" fits, display the result of a proper fit, do pretty plots, etc., you might prefer a program written by Joe Liske called vpguess. The vpguess program also has the great merit of providing initial estimates for the VPFIT program.

Before you embark on Voigt profile fitting with VPFIT, the program needs quite a bit of information. The most obvious is the data file(s), with wavelengths and data values. These may be ASCII files, FITS files or IRAF .imh and .pix files. Since it fits absorption lines it also needs continuum estimates, created using IRAF 'continuum' (or equivalent) in the same format as the data, so that the model lines can be placed on the continuum for comparison with the data. It convolves the model profiles with the instrument profile for comparison with the data, so you also need estimates of the instrument profile. Depending on observing conditions and slit width, this may be adequately approximated by Gaussian fits to arc lines which have been extracted and processed in the same way as the data; you need to know the instrumental profile width in the regions of interest, though the program will take fits to the resolution as a function of wavelength, or even pixel-by-pixel instrument profiles if you want. The other thing which is vital is an array of data error estimates in the same format as the data, so that an objective goodness-of-fit assessment may be made. There are additional minor peculiarities of various datasets which have to be accommodated, but that covers the important ones.

Basic use involves marking off regions of the data to be fitted, and putting in first guesses for the line parameters for the absorption lines in these regions. Generally this is most easily done by marking things off with a cursor. You can fit several lines from a single ion, or several different ions which may overlap in various regions, simultaneously.

When the program is run it first looks for an atomic data file (which is provided), and then starts to ask obscure questions about what you might want to do. You have to tell it which spectra you want to fit, which regions of those spectra, and what ions you want to use with initial guesses at the column density, Doppler parameter and redshift for each system. The program then puts all lines in from those ions in the regions you have selected, and wanders off and finds a minimum chi-squared fit, displays statistics for this fit, and allows you to look at the results. About the only restriction is that at least one line from each ion selected at whatever guessed redshift is within at least one wavelength region, otherwise there is no constraint on that ion at all. If the fit is unacceptable, you can start again with more redshift systems and see if it gets any better, or use an option to add lines one at a time at the most discrepant parts of the fit so far (see below).

For example data, you might use the Hubble Deep Field South QSO echelle spectrum from the AAT, which, despite being obtained sometime in a previous millennium, is still available at http://www.aao.gov.au/hdfs. Alternatively, later versions of the vpfit tarfile may have a gzipped example data file for you to try. It is probably worth your going through fitting some of the lines independently (solutions are published for all) before embarking on a new dataset.

BUT NOTE: Inevitably, there are physical and programming assumptions made which can result in restrictions to the range of validity of the program. Some of the ones we have found so far are listed here. That does not mean there are no more ...

COPYING THE CODE
Inevitably, the VPFIT program evolves, and you should make sure the description you are using applies to the version you have. This general description should apply to all versions in the anonymous ftp space pub/rfc on ftp.ast.cam.ac.uk. The tarfile comes with two .pdf files which (still somewhat unevenly) supersede the online documentation, which has become somewhat out of control. The main gain with vpfit9.5.pdf and rdgen9.5.pdf is that they have tables of contents and each has an index to help you find things, which might be easier than chasing obscure nested links through web pages. The tarfile also comes with its own sample data file, which replaces the very out-of-date one above. Otherwise the main difference from earlier versions is that almost all operations in vpfit9 use double precision. There may be more recent versions (normally with 'dev' somewhere in the filename) in the ftp space as well. You are welcome to try them, but they come with a bigger health warning than the others...

For those who have older versions an earlier description may be more appropriate. I hope and believe things are upward compatible through versions, so anything which works on an older one should work with a later version, and give close to the same answers. The tarfile also contains the RDGEN program for general spectral plotting, modifying, visualizing and interfacing with VPFIT. It is just a collection of routines which I've found useful. If you don't want it ignore it, otherwise try looking at the rudimentary documentation.

After you gunzip and extract the tarfile, you will need to edit the makefile to reflect where the PGPLOT and CFITSIO libraries are held. Since different incantations are used in different places to access these, the makefile shows the IoA version (where PGPLOT can migrate, but is aliased), the ESO version (now somewhat dated, where PGPLOT and CFITSIO are in /scisoft), and a Linux version where they are in some arbitrary area. Thus, at least on the machines I use, 'make vpsol' makes a Solaris version on the IoA cluster, 'make vpfit' used to make the same thing at ESO, and 'make vpflx' makes the Linux version. You may need to change the various system libraries in the makefile to reflect what is normally used for a Fortran compilation at your site. No other external libraries are needed. For RDGEN, 'make rdsol', 'make rdgen' and 'make rdglx' give the corresponding versions.

Makefile will yield the executable code (after you get it right). At this stage it is worth trying to run it just to see if there are any problems, like the instant response: 'vpfit: Not enough memory'. If this happens, then ask to have your swap space increased. Or you could edit the 'vp_sizes.f' file to have smaller array sizes - in the default version they are large enough to handle anything I am aware of. In particular the data arrays are 208000 pixels long, which hopefully is enough.

ENVIRONMENT VARIABLES
 The program uses PGPLOT for all graphics, and PGPLOT programs read a binary font file at run time. They need to be told where to find this file by the environment variable PGPLOT_FONT (or maybe PGPLOT_DIR). You can define this in your .cshrc file, e.g.

setenv PGPLOT_FONT /usr/local/vlb/pgplot/grfont.dat

.. or wherever this is on your system. If the variable is not defined correctly, you will get a warning message at run time, e.g.

%PGPLOT, Unable to read font file: /usr/local/vlb/pgplot/junk

The program will continue, but all alphanumeric characters will be omitted from the plot.

Other environment variables are ATOMDIR, which should point to the atomic data file (atom.dat, provided), and, optionally, VPFSETUP for a setup file of various program parameters and VPFPLOTS to override display defaults.

DATA REQUIREMENTS

The data can be in FITS format, in IRAF .imh and .pix form, or in an ASCII file.
 

FITS and IRAF Data:

1. data file in IRAF 1-d or echelle format, as <file>.fits or <file>.imh and .pix.

2. Wavelength values. These can be:

.. Coefficients as Chebyshev polynomials or linear coefficients. These can be held in the header as IRAF v2.10 coefficients stored as WAT2_nnn character strings, in a separate file in database/ec<NAME> in two-dimensional Chebyshev form as from IRAF (echelle) ecidentify, or, for historical reasons, as linear coefficients under APNUMn (lam0=#3, dlam=#4). [Note that the internal coefficients refer to channel zero and an increment, while IRAF uses channel 1 and delta lambda. This is sorted out when the coefficients are read by the program.] Uniform wavelength bins are supported (CVAL1 as the wavelength of the first pixel, and CDELT1 the wavelength increment per pixel, in the header).

.. A table containing a wavelength for each pixel in a data-like .fits (or .imh/.pix) file is also OK, and the file containing these wavelengths is specified by the filename associated with WAVFILE in the data file header.  The values are just the wavelength for the corresponding pixel, so it is just a one-dimensional array.

.. Velocity-binned wavelengths are also OK. WCFTYPE in the header has the value 'velocity' (or 'VELOCITY', or 'VE<anything>') if uniform velocity bins are used, in which case only the first two coefficients are used: wavelength = cft1*(1.0+cft2)**n for the center of the nth channel. Note that cft2 is v/c, NOT v! If WCFTYPE is absent then the wavelengths will be calculated as linear values and you might wonder why you have strange numbers. (A short worked sketch is given at the end of this list.)

3. Error file, in same format as data file. This can have any name, and is searched for first as the value of SIGFILE in the data header (if you have set SIGFILE in the header of course). If that fails, then it tries <NAME>.var, <NAME>.sig, <NAME>.err before giving up and prompting. If you don't have an error file, then a <CR> response to the prompt here results in sigma(i)=sqrt(max(data(i),0.2*mean(data))) for each channel -- this is not desperately useful, but is better than nothing. If the error file is absent, you will have to be very lucky (or unlucky) to get chi-squared values which look at all sensible. There is no provision for a quality array, but bad data pixels may be flagged by giving them a negative error (or variance), and then the corresponding data is ignored entirely.

4. Continuum file, as for data, specified by CONTFILE in header, or, if that is absent, <name>.cont. It prompts if that fails, and a <CR> response to the prompt gives a unit continuum (useful for renormalized data). If you have normalized your data to the continuum, then CONTFILE set to 'unity' (or 'UNITY') stops that prompt, and gives unit continuum (so don't call a data file 'unity', it will never find it).

5. Polynomial fit to resolution FWHM (in A) vs wavelength (also in A) in double precision ASCII file, specified by RESFILE in data header. File should be in database/(resname). Often you don't want to be as detailed as this, so instead in the header you can enter RESVEL nnn, where nnn is the instrumental FWHM in km/s. This value is taken if RESFILE is absent. If both are absent (or the file corresponding to RESFILE does not exist) you are prompted for a FWHM (in km/s or A).

6. Heliocentric correction (if required) in data header (in km/s). The VHELIO value from the IRAF rvcorrect task is sought first (the wavelengths are multiplied by 1+v/c). If that is absent, then it tries HELVEL and then the wavelengths are multiplied by (1-v/c) [for those who like a different sign convention!]. If neither is present, zero is assumed. Several extraction packages do the heliocentric correction for you -- in such cases VHELIO and HELVEL should be absent, or have the variable zero associated with them.

7. Sigma scale file for corrections due to rebinning of the data (if necessary). SCLFILE in data header, again database/(sclname) or in the header as SIGSCLn=(real number). If neither are present, unity is assumed (i.e. the scaling appropriate for independent channels). If you are reading ASCII data this option is not yet available.

8. The program assumes vacuum wavelengths for everything. If they are not, then an item AIR in the header will result in an automatic internal air -> vacuum conversion. This is not yet done for the database/ 2D Chebyshev polynomials made by IRAF. If you have a header item whose name contains 'AIR' for some other reason (like an air temperature; some observatories put all sorts of rubbish in the header), and have vacuum wavelengths, then an item 'VACUUM' anywhere (with any value or text string) will force the use of vacuum wavelengths.

9. Other header items: REFSPEC1 -- wavelength coefficients are picked up from this file if it exists, but only as a last resort if the standard IRAF wavelength header is absent.

10. An atomic data file, specified by the environment variable ATOMDIR. If that is not set, then, unless you have something called /home1/rfc/atom.dat (which it will then try to read), it asks for a filename. This is vital.

The downloadable VPFIT tarfile contains an atomic data file. You should check that the wavelengths and oscillator strengths are up-to-date. Also included in the tarfile is a standalone Fortran program (atomtab.f) for taking Jason Prochaska's most recent atomic data and converting it to VPFIT format. There are two versions of the atomic data: Morton_atom.dat, from Morton (2003), and Prochaska_atom.dat, which has the version of his data available at the time.
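To make the velocity-binned wavelength recipe in item 2 above concrete (see the note there), here is a minimal sketch in Python; the coefficient values, channel count and variable names are made up for illustration, and VPFIT does this internally in any case.

# Sketch: reconstruct a velocity-binned wavelength scale from the two
# header coefficients described in item 2. cft1 is the wavelength at
# n = 0 and cft2 is v/c per channel (NOT v in km/s). Numbers made up.
import numpy as np

cft1 = 4990.0          # wavelength at channel n = 0, in Angstroms
cft2 = 2.2e-5          # v/c per channel (about 6.6 km/s)
nchan = 8

n = np.arange(nchan)
wavelength = cft1 * (1.0 + cft2) ** n    # center of the nth channel
print(wavelength)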

 

ASCII DATA:

The data may be in an ASCII table consisting of up to four columns of  ascending wavelength, spectrum, error estimate, continuum. It is assumed that the wavelengths are vacuum, heliocentric corrected, and any scaling of the error estimate has been applied. You can specify the spectral resolution by inserting a line 'resvel nnn' anywhere in the file, where nnn is the FWHM in km/s. The input terminates on detecting the end-of-file. Sample input might look like:

resvel 6.6
4990.  0.946703 0.03102309
4990.04  0.9681857 0.03154853
4990.08  0.9858243 0.03178264
4990.12  0.9543419 0.03165093
4990.16  0.918863 0.03112572
4990.2  0.9460486 0.03140076
4990.24  0.9262011 0.03116661
4990.28  0.9371514 0.03115548
4990.32  0.9425622 0.03101928
........

Note that this example has only three columns. If the continuum column is blank, then the continuum is taken to be unity.
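If you are generating such a table from your own reduction, a short script along the following lines (a sketch only; the file name, resolution and numbers are made up) will produce the format shown above:

# Sketch: write a three-column ASCII table (wavelength, data, error) of
# the form shown above, preceded by a 'resvel' line giving the FWHM in
# km/s. The continuum column is omitted, so it is taken to be unity.
import numpy as np

wav = 4990.0 + 0.04 * np.arange(9)     # ascending vacuum wavelengths
flux = np.full(wav.size, 0.95)         # made-up normalized data
err = np.full(wav.size, 0.031)         # made-up 1-sigma errors

with open("spectrum.txt", "w") as f:   # hypothetical file name
    f.write("resvel 6.6\n")
    for w, d, s in zip(wav, flux, err):
        f.write(f"{w:.2f} {d:.6f} {s:.6f}\n")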

IMPORTANT: If you are using this option, it is vital to make sure that the wavelength array length is large enough to hold a table of wavelengths. Edit vp_sizes.f before compiling so that the last line of the section

* MAXimum Wavelength Coefficient Order (and resolution polynomial!)
* This is also the array size for wavelength tables, so should be
* the same  maxfis if that is the way the wavelengths are stored.
      integer maxwco
      parameter ( maxwco = 30 )

becomes

      parameter ( maxwco = 125000 )

(where 125000 is an example maxfis).
 

FURTHER WARNING: With wavelength tables like this you could in principle store things in any order. Please don't - the program assumes an ordered sequence of wavelengths. The other thing you could do is have large gaps in the data, like finish at 5000 and restart at 8000 because you don't see any point in keeping information in between. This is not handled at all either, and it is better to keep the data as separate files. VPFIT is fine with files from all over, but does not handle gaps at all well (since in generating narrow profiles it resamples the pixels to make the Voigt profile, and assumes slow changes in pixel width as you go from one pixel to the next, estimating the width by looking at the wavelength midpoints between pixel centres).

 

INTERACTIVE USE:

Single Features
Get into the directory with the data in it (otherwise you'll wind up typing pathnames a lot), and start the program. Then respond to the prompts, which may or may not have obvious answers. Where the answer is not obvious, a carriage-return normally gives something sensible.

If you have copied the Hubble Deep Field quasar spectrum there, then you might like to try using it to see how things work in practice. An example is given here.  If you have access to the UVES spectrum of 1122-1648, you might prefer this one.

Note that if you type the line parameters or IDs in the VPFIT command window, as opposed to the PGPLOT window, the program will not treat them as line IDs and will try to interpret them as commands instead. This is not useful, and can be confusing. It was done this way so that you don't have to switch between windows, and between keyboard and mouse, when entering a number of lines. What you type is echoed in the command window so you can see what you are typing. The backspace or del keys can be used to correct mistakes, though if you do use these then the ID line being typed goes on to a new line in the command window .. so the last line you see there is the one which is being entered as the line ID, preceded by a history of your (ignored) mistakes!


OUTPUT FILES:

The program generates a number of files:

fort.18: essentially a summary of the fit, containing the information given on the screen while running the program but without the questions.

fort.13: the starting guesses in case you have to restart. You can edit this to have closer values if you want, or to put another line in. In the example  case it looks like:
   *
 q2233   1    3809.08    3814.04
  *
   H I    1.340E+01    2.135245      24.10       0.00   0.00E+00
 

fort.26: summary of the final results, which is described here.

If you have tried the HDF-S quasar example you will notice that the fit probability is amazingly high, and (equivalently) that the (normalized) chi-squared is amazingly low, at 0.467 when it should be near unity. This is because the error estimate array is too large by about a factor 1.5, so the whole array should be divided by 1.5 (use IRAF, or, alternatively, create a file called 'vp_setup.dat' in the same directory as you are running the program from, containing the line  'sigscalemult 1.5', which will cause VPFIT to do it for you for any incoming error array). When you do this you find that the normalized chi-squared is 1.049, and the fit probability 0.294.

This may seem to be a totally arbitrary thing to do, and as presented here, it is. However, the chi-squared statistic assumes that the data bins are independent, and the data have been rebinned in this example, so there is a correlation between neighboring channels -- effectively a smoothing on a scale of up to two channels if the rebinning is to a uniform binsize which is an average from the (usually not completely linear) raw data. As a consequence the fluctuations in the rebinned data are smaller than in the raw data, and it is the raw data which is generally used to estimate the 1-sigma error array. Thus the chi-squared values, which are the sum of [(data-fit)/sigma]^2, will be underestimated because sigma is an overestimate given the fluctuations. A way of determining the appropriate scalefactor is to measure the rms about the mean using IRAF (in splot) and comparing that with the sigma value for the same region.
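As a rough illustration of that last check (a sketch only, not part of VPFIT; the file name and column layout are assumptions), you could compare the scatter in a line-free stretch with the quoted errors:

# Sketch: estimate the error rescaling factor by comparing the quoted
# sigma with the rms about the mean in a line-free region. Assumed
# columns: wavelength, flux, sigma.
import numpy as np

wav, flux, sigma = np.loadtxt("lineFree_region.txt", unpack=True)

rms = flux.std(ddof=1)           # scatter actually present in the data
scale = np.median(sigma) / rms   # >1 means the quoted errors are too large
print("divide the error array by about", scale)

In the HDF-S example above this ratio comes out at about 1.5.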

The error estimates for the parameters are based on an approximation which should work reasonably well for unsaturated lines, and which allows for errors introduced by uncertainties in the parameters for other lines in the blend which make up the feature you are fitting. A little more information on error estimates in other circumstances is given here.


Complex features

You might now try something a bit more ambitious than a single  line fit to a single feature, and you can enter multiple lines in complex features. Another example illustrates this.

You don't normally know how many components are needed to fit complex features from the start, and there are two approaches to this.

One is to give starting values interactively for as many as you think you might need, and then when you find you got it wrong add more lines where the fit looks bad and restart the program. This is normally done by editing the file containing the best guess output from the previous run to contain more lines where you think you want them, and restarting the program now running from a file  (see below for running from a file of first guesses).

The other approach is for the program itself to try to put extra lines in where the fit is worst. You can make it do this by extending the first input line giving the options from 'I' to, for example, 'I ad il'. In this case the 'ad' part  results in lines being added where the fit is worst, one at a time, until the fit probability (chi^2) is greater than 0.01 and the K-S probability is also greater than 0.01, at least in an ideal world. Adding new lines for the fit reduces the number of statistical degrees of freedom, and it is possible that addition of lines does not improve the fit statistically. If adding new lines causes the normalized chi^2 to increase, then the iteration cycle stops and you will probably have to rescue the program by hand. The 'il' part results in the program removing ill-conditioned lines, i.e. those which are not constrained by the data (because they are too weak or have moved outside the fitting region selected from the spectrum).

[If you don't like the default probabilities of 0.01 you can change them by specifying them after the 'ad' on the same line -- so 'I ad .02 .001 il' has the same effect as above, except that the process is terminated if the chi^2 probability is >0.02 and the K-S probability is >0.001, so a better fit is required from the chi^2 test, while a worse one is tolerated from the K-S test.]
 

Multiple lines

And you don't need to have all the lines of interest in only one region. For example, there is a SiIV doublet at z=1.87 in the spectrum of the Hubble Deep Field South Quasar, and it is clearly best to fit both lines simultaneously. There is also a MgII doublet in the Ly-a forest which would benefit. An example is based on a CIV doublet in the same spectrum.

This example does illustrate a deficiency in the automatic 'add more lines' approach, and that is that there is no algorithm to decide which of the possible lines it could add should be added. The approach taken here is to look to see what the ID is for the nearest feature to the place where one is to be added, and call it that. While this will work if only a single ion is present, it can lead you astray if there is more than one. If you are dealing with heavy elements in the Ly-alpha forest, or with high order Lyman lines, then you can make sure that any new lines added are Ly-alpha rather than the nearest neighbour by specifying 'I ad Ly-a il' in the options line, but this forces any new line added to be Ly-alpha whether or not it should be. If you don't like the Ly-a ID you can always change the output file so the HI becomes something else at a suitable redshift and rerun the program from file.

This example would also benefit from automatic local continuum fitting when the normalized chi-squared dropped below about 2.

 

RUNNING FROM FILE:

While the interactive mode is good for learning what the program does, it is tedious having to use it every time. If you have first guesses then it is almost invariably more convenient to run directly from a file. The first guesses need not come from interactive runs of the program, but could come from anywhere. Joe Liske's program vpguess provides a more convenient way of setting up an initial file.

The results are the same as for the interactive mode, but now without the hassle of having to re-mark the regions and re-enter guesses for something you tried before when things were not quite right. The output file from the program, either fort.13 or fort.26, can be used as input for a subsequent fit, after editing if you wish. This can have the same format as the output file written by the program, but a freer format is also accepted. Thus an acceptable fort.13 is:

*
  sect4.ec 1  7180.15 7188.2
sect4.ec 1   7005.63 7014.83
sect4.ec 1  7020      7025.3
sect7.ec 1  8216.73 8222.84
*
C II 15.19 4.383024b   1.63m  0.0 1.0e4
C II 1.4043E+01 4.382e 4.25p 0.00 1.00E+00 0
C II 1.4043E+01 4.382893j 4.25u 0.00 1.00E+00 0
CII 15.56 4.383865c       11.32n 0.00 1.00E+00 0
C II 1.2242E+01 4.383487h 13.04s 0.00 1.00E+00 0
C II 1.4093E+01 4.384351f 17.61q 0.00 1.0
C II 1.2890E+01 4.385030g 11.93r 0.00 1.00E+00 0
OI 1.8209E+01 4.383024B 1.63M
O I 1.4043E+01 4.382795E 4.25P
O I 1.4043E+01  4.382893J 4.25U
O I 1.5047E+01    4.382820E 14.25P
..etc
 

Note that a blank line is a terminator for a dataset.

This enables you to write a fort.13 by hand if you wish to, rather than the easier method of using the program to generate it. Note that a space between element and ionization is allowed, or may be omitted. You can also treat a fort.26 summary file as a fort.13 file if you want to -- it will be interpreted correctly unless you have removed the %% markers before the filenames.

To use this file just type 'F' (from file) instead of 'I' (interactive) as the option on startup:
 

.....
  options:   <CR> for previous value
  I - interactive setup and fit
  F - run from an input file
  D - display profiles from input file
  ? for help
  option (key) (key)...
> F                       (and any other parameters you want on that line)

  setup: ? help, <CR> defaults, n,z,b,cs,sf,il,w,me,p,d,v to change
>                           <--

 Column density (n), logN (l) or emission (e), scalefactor
>                             <--
  logN scalefactor is     1.00000
 Parameter input file, # entries? [fort.13,1]
>                            <--

 filename       : whatever
 echelle order  :  1
  Continuum set to     1.00000  everywhere
  sigma scale (errors scaled) :    1.00000
 FWHM : 0.095A
  Start & end chans:   1610  1738
   1 regions fitted
   3 systems

  no. of ions for fitting is   3

......

 If you have a file from trials like the above, you can try it and see what happens.


LYMAN CONTINUUM ABSORPTION: The Lyman continuum optical depth is occasionally useful for setting total HI column densities for a complex, though it does not constrain the Doppler parameter or the redshift. Lyman continuum absorption is included if it falls in any of the wavelength regions chosen, and if logN(HI)>15.0 (this is only to save time, and is adjustable -- see optional extras below). There is a complication that the atom.dat table does not contain the full Lyman series (since it is infinite), so for the region between the Lyman limit and the highest order line in the table the spectrum is interpolated. This should be adequate.


ZERO LEVEL: There are two ways of adjusting this. If you use the line __ (1215.67 assumed wavelength, to make life easy (?), but you can change it in the atom.dat file), then the continuum is left where it was, and the fitted spectrum stretched linearly to the new zero level. Note that the first parameter <zero level> is now expressed as a fraction of the continuum, rather than put on a scale where it is forced to look like a column density, so the line in fort.13 now looks like:

__ <zero level> <redshift>N <b-value> 0.00 0.00E+00 n

where the zero level adjustment will normally be around zero (it may be negative), and will be less than 1! As before, N is a sample letter to fix the redshift, which should be such that 1215.67*(1+z) is within the region of interest. The number n is the dataset number for which the zero level is to be adjusted (1 for first region, 2 for second, etc.), and this is not needed if there is only one region which contains the wavelength 1215.67*(1+<redshift>). It performs a linear fit of the form <zero level> + <b-value>*(delta lambda)/(1215.67*(1+<redshift>)), and you can fix either of the other two values (indeed, it is usually best to set the <b-value> to zero and fix it, but occasionally you might want a slope).
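As an illustration of that linear form (a sketch only, with made-up numbers; reading delta lambda as measured from the reference wavelength 1215.67*(1+<redshift>) is an assumption here):

# Sketch: the zero level as a linear function of wavelength, anchored
# at the reference wavelength 1215.67*(1+z). z0 is <zero level> and
# slope is the <b-value>; in most fits slope is fixed at zero.
zval = 2.135        # fixed redshift of the '__' line
z0 = 0.01           # <zero level>, a fraction of the continuum
slope = 0.0         # <b-value>

ref = 1215.67 * (1.0 + zval)

def zero_level(wavelength):
    return z0 + slope * (wavelength - ref) / ref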

Note that the zero level adjustment applies to the input spectrum received at that point, so all the real ions with lines in that region should appear higher in the list than the '__' one. Continuum adjustment matters less, but it is probably best to adopt a policy of putting the '__' last just for internal consistency.

If you use the AD option (ladd when it asks for linear, logs or emission) then the baseline, or indeed whatever profile you wish to choose, is added to the whole spectrum, so the continuum moves as well. Since this was written to allow for partially covered emission lines it is probably less appropriate. Both are matters of last resort, because the few trials done suggest that the process is pretty unstable when you start with bad guesses and try to get the zero level to help sort them out. Both work from fort.13 files, but are not wonderfully reliable from interactive input.
 

CONTINUUM LEVEL: You can adjust the continuum level using a linear multiplicative factor. It is based on false elements as tags, and the line in the fort.13 file looks like: <> <level> <redshift>N <linterm>(M) 0.0 0.00E+00 <region no>. The value of <level> is usually near unity [it is what the continuum is multiplied by at the wavelength corresponding to (1+<redshift>) * 1215.67 = reference wavelength, which must be within the relevant wavelength chunk], and the redshift MUST be fixed. The <linterm> value may be fixed (usually at zero, for a constant rescaling of the continuum) or may be left to be determined by the program. The input continuum through the fitting region is multiplied by (<level> + <linterm>*(wavelength/reference wavelength - 1)), so for small adjustments <level> is somewhere vaguely near unity. You can input this interactively as well. In cursor mode it asks if you want to fit the slope or not: you fix it (at zero) by pressing the left or central mouse button, and allow it to vary by pressing the right mouse button. The warning regarding the zero level adjustment applies here as well -- it is all too easy to have a continuum change trade off against a broad line, so it is probably best to use this only for (small) final adjustments to the parameters. If these final adjustments are not small, worry about the reliability of your results.
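A minimal sketch of that multiplicative adjustment (illustrative values only; the variable names are not VPFIT's):

# Sketch: the input continuum through the fitting region is multiplied
# by <level> + <linterm>*(wavelength/reference - 1), where the reference
# wavelength is 1215.67*(1+<redshift>). All numbers are made up.
import numpy as np

level, linterm, z = 1.02, 0.0, 2.135
ref = 1215.67 * (1.0 + z)

wav = np.linspace(3805.0, 3815.0, 6)   # made-up wavelength chunk
cont_in = np.ones(wav.size)            # e.g. a unit continuum
cont_out = cont_in * (level + linterm * (wav / ref - 1.0))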

 

UPPER LIMITS: If a line isn't there the program has the annoying habit of throwing it out. This is fine if you are fitting, but something of a nuisance if you are interested in an upper limit to the column density for some ion -- where you might hope to use the error estimate as a 1-sigma limit. To do this you need to know the redshift and Doppler parameter, so need information from elsewhere anyway. Then the way to get an upper limit is to create an input (fort.13) file along the lines of:
*
spect 1 5698.96 5702.88 ! CIV 1548
spect 1 5708.48 5711.60 ! CIV 1550
spect 1 5162.27 5169.1 ! SiIV 1402
*
C IV 12.7591 2.682135aa 13.87ab 0.00 1.00E+04 0 ! 1
SiIV 11.0000 2.682135AA 13.58AB 0.00 0.00E+00 0 ! 2

..if you are interested in an upper limit for SiIV, and the CIV lines are there. Then, at the start type the letters against the > prompt options:

<CR> for previous value
I - interactive setup and fit
F - run from an input file
D - display profiles from input file
? for help
option (key) (key)...
> e                  <-------------- Error estimates without iteration

lower case characters following values are treated as labels.
if the same upper case character follows another value then it will be tied to the lower case label.
isolated (un-paired) upper case characters fix values.
setup: ? help, <CR> defaults, p,v to change
>
Column density (n), logN (l) or emission (e), scalefactor
> n                  <-------------- Needs to be in linear mode.
N scalefactor is 1.00000E-14
Parameter input file, # entries? [fort.13,1]
>
etc...

The errors are calculated then without iteration, so you can use the error estimate to provide an upper limit to the column density PROVIDED THAT

(i) you have used the 'n' option above. For a barely detectable line the formal lower limit to the log column density is minus infinity, so the approximation that the errors are symmetric about the estimate breaks down.

(ii) you have not put a very low column density in for the one you want the limit for. If a low number goes in here the error estimate is low, and the value seems to flatten off at the correct value at about 1/30 of the local detection limit. We suspect this is because negative Voigt profiles (i.e. in emission) are not catered for, and it hardly seems worth putting them in.

(iii) you have fitted the line if there is one there which is not thrown out. If there really is a fittable line at the position of interest, however weak, then the error estimate alone does NOT give you the basis for an upper limit. You should take the fit value and add the error estimate (times two, or whatever you choose) to obtain  an upper limit to the column density.
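In numbers, a sketch of case (iii) with invented values:

# Sketch: combining the fitted value and its error into an upper limit
# when a weak but fittable line is present. Column densities are linear
# (the 'n' option above); the numbers are purely illustrative.
n_fit = 2.0e11      # fitted column density, cm^-2
n_err = 5.0e11      # 1-sigma error from the no-iteration ('e') run
k = 2.0             # confidence multiplier of your choice

n_upper = n_fit + k * n_err
print("upper limit:", n_upper, "cm^-2")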


OPTIONAL EXTRAS:
 

 1. 'vp_setup.dat' (or the filename set by the environment variable VPFSETUP), if present, overrides internal defaults for several variables. Some of the most useful ones are the minimum Doppler parameter allowed, and options on when to drop systems if the column density or line width becomes too low. The list of variables, and some comments on what they do, is given here.

2. 'vp_splot.dat', or the file named by the environment variable VPFPLOTS (sought in reverse order), if present, can be used to preset PGPLOT parameters. If you want a fairly bizarre color scheme, and the continuum and errors to be shown along with the data, then try:
at da co 7
at co co 5
at er co 2
at ax co 6
at ti co 5
at co st 2
al
<EOF>

.. this gives data color #7 (see PGPLOT manual to see what this actually is), continuum #5, error #2, axes #6 and tick marks #5, plots all arrays, and the continuum is shown as a dashed line (PGPLOT style #2). The 'al' means the data, error and continuum are plotted. For a list of recognized parameters type 'he' when the program asks for plot parameters (e.g. in interactive mode, or when displaying profiles after fitting).

3. 'vp_files.dat' in the current directory (or a file corresponding to the environment variable VPFFILES) can contain filenames and wavelength ranges, and can be used to search for regions of interest. Details are here, but it is usually better to have thought about which lines you want to cover beforehand.

4. Complex fits take a lot of time, and your patience may run out before the program thinks it has converged. If you want to stop things early, then, using another window set to be in the same directory as the running program, type touch stop. A zero-length file called 'stop' will be created there. At the end of each iteration VPFIT checks to see if this file exists, and if it does it deletes the 'stop' file, and halts the iterations in precisely the same way as if it had converged to the final answer. This means you can look at error estimates on the basis of where it has got to so far, and look at the plots. To restart, copy the last iteration (from screen or fort.18) to fort.13, and set it all off again. If you change your mind about stopping the iterations, just 'rm stop' before VPFIT does the check.
 


ADDITIONAL FEATURE(!): Because the imfort header is restricted in length, if the header is more than about 200 lines long then not all of the header information is accessible to older versions of the program. This was unfortunate, since wavelength coefficients and other information are appended to the header which comes from the telescope, and some places put vast amounts of information in the header (I suppose for engineering reasons). The only general way around this is to strip out the rubbish if there are too many lines .. the usual signature of the problem is VPFIT's failure to pick up wavelength or scale coefficients. If there is a similar restriction using the CFITSIO routines we have yet to discover it.
 
 
 

EMISSION LINES:

Clearly a minor variant of the program could be used for fitting components to emission line profiles as well. A description of how you can do this for Gaussian emission lines is available.
 
 
 

ADDING EXTRA LINES:

If the profile generated is not a good fit to the data then what you would do by hand is add one or more extra components until you are satisfied with the match between model and data. The program can do this for you by adding lines where there is the most significant departure from the fit, and splitting single features which have a significant high component within them. To invoke this mode simply include 'ad' in the first command line. So:

>vpsol

 VPFIT 2.04

 .....

  options:   <CR> for previous value
  I - interactive setup and fit
  F - run from an input file
  D - display profiles from input file
  ? for help
  option (key) (key)...
> I ad                      <----------- the 'ad' flag activates this bit.

 Add lines if prob(tot) < 0.0100   or prob(KS) < 0.0100

.....
.. then choose a messy single region, stick one line in it as a starter, and watch the fit develop. What you should see is the fit improving before your eyes..

If you would like to stop adding lines after a particular normalized chi-squared, or fit probability, is reached, then just use e.g.

> I ad 1.25

and it will stop adding lines when the normalized chi-squared is less than 1.25. If the number following the 'ad' is less than 1, it is interpreted as a probability, so you can modify the default of 0.01 if you wish. Note that if you have a version of the program dated before 12 Sept, 2000, this option was not implemented so if you try it lines will be added forever......
 

If you get fed up with an individual iteration, 'touch stop' in the same directory as the one you are running the program from will terminate the current iteration. 'touch stopit' will end the iteration sequence with the current one.
 

As things stand the default is to add a line with the species and rest wavelength of the nearest existing component, unless the species is hydrogen, in which case Ly-a is added. You can arrange for Ly-a to be the line added every time by putting

>I ad Ly-a

instead of the arrowed line above. If you wish to mix and match you have to do it by hand.

The program either adds absorption lines where they are most needed, or splits them where there is an emission peak relative to the local fit. The choice of which to do depends on an obscure parameter

adsplit 1.75           in vp_setup.dat,

which is the significance ratio an emission peak has to have over the biggest residual absorption before an attempt is made to split a line. This bias is built in for speed -- guesses for line splitting are less easy to get right, so it takes longer to iterate when lines are split, and often adding needed components to shoulders of lines drives the result the same way.


REMOVING UNWANTED SYSTEMS:

You can do this as well as adding new ones, and do both at the same time.  To activate this option you need to expand the first command line to include 're' (for remove lines with large errors). So the command line can be

I ad re

for interactive input, adding and removing system component ions. Of course you can get into giant loops doing this, adding and removing the same ones. Ions which are ill-conditioned - i.e. for which there are no constraints on the fit parameters - are removed automatically.
 
 

AUTO CONTINUUM LEVEL:

The continuum level can be adjusted using the line

<>  <factor1> <redshift>SX <factor2>

in the fort.13 file. This provides a multiplicative linear fit to a factor which multiplies the input continuum. So factor1=1.0 and factor2=0.0 is an excellent start. If you don't want the continuum to have a slope, then factor2=0.0SB (or some other unassociated upper case letter or two) will force zero slope.

The feature which is important for the auto-line fitting is that after a while you may wish to allow the continuum to float free. You can do this automatically if the vp_setup.dat contains the line

adcontf #1 #2 #3 #4

A continuum fit will be tried when EITHER the normalized chi^2 < #1 OR there have been #2 consecutive added lines all with b<#3 and logN<#4. [I'm not sure how well the second option works]. The defaults are 1.8 10 5.0 12.5.
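For example, a vp_setup.dat line spelling out those defaults explicitly would be

adcontf 1.8 10 5.0 12.5

.. i.e. an automatic continuum fit is attempted once the normalized chi^2 drops below 1.8, or after 10 consecutive added lines all with b < 5.0 and logN < 12.5.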
 

 

OTHER FEATURES:

The above list is not exhaustive. For a more comprehensive description see vpfit9.5.pdf. Among other things, this covers:

arbitrary line spread functions which may be specified on a pixel or subpixel basis (Spectral Resolution section)

letting the program set initial guesses for parameters (Fitting simulated spectra section)

dealing with velocity shifts (Atomic data: Special ions)

including known systems before fitting (Getting started: Optional parameters)

As with many programs, there are continual developments and (hopefully) enhancements. If a more recent version exists, and works at least tolerably well, it will be in the anonymous ftp space as vpbeta.tar.gz. There is also a rather less well documented (and less stable) program included in the VPFIT tarfile, RDGEN, which can be used to look at the data. There is some documentation, and what is not there might be clear from typing '?' or 'help' when using the program.

.. AND FINALLY: Of course we do our best to make sure that the program gives accurate results, and try to remove bugs etc., but of course we can't guarantee anything. If you find any problems or inconsistencies, e-mail the contact address at the top. It might also be worth checking the list of changes to make sure that the version you have is not out of date and the problem (hopefully) already solved. If you make any improvements yourself we'd appreciate knowing about them and possibly incorporating them.

We are very grateful to Andrew Cooke, Mike Irwin, Julian King, Joe Liske and Michael Murphy for major contributions to various parts of the program. Also, for the many people who have provided other help, improvements and suggestions for its development -- many thanks.