Spectral model fitting

API

naima.get_sampler(data_table=None, p0=None, model=None, prior=None, nwalkers=500, nburn=100, guess=True, interactive=False, prefit=False, labels=None, threads=None, data_sed=None)[source]

Generate a new MCMC sampler.

Parameters
data_tableTable or list of Table

Table containing the observed spectrum. If multiple tables are passed as a string, they will be concatenated in the order given. Each table needs at least these columns, with the appropriate associated units (with the physical type indicated in brackets below) as either a Unit instance or parseable string:

  • energy: Observed photon energy [energy]

  • flux: Observed fluxes [flux or differential flux]

  • flux_error: 68% CL gaussian uncertainty of the flux [flux or differential flux]. It can also be provided as flux_error_lo and flux_error_hi (see below).

Optional columns:

  • energy_width: Width of the energy bin [energy], or

  • energy_error: Half-width of the energy bin [energy], or

  • energy_error_lo and energy_error_hi: Distance from bin center to lower and upper bin edges [energy], or

  • energy_lo and energy_hi: Energy edges of the corresponding energy bin [energy]

  • flux_error_lo and flux_error_hi: 68% CL gaussian lower and upper uncertainties of the flux.

  • ul: Flag to indicate that a flux measurement is an upper limit.

  • flux_ul: Upper limit to the flux. If not present, the flux column will be taken as an upper limit for those measurements with the ul flag set to True or 1.

The keywords metadata field of the table can be used to provide the confidence level of the upper limits with the keyword cl, which defaults to 90%. The astropy.io.ascii reader can recover all the needed information from ASCII tables in the Ipac and Daophot formats, and everything except the cl keyword from tables in the Sextractor. For the latter format, the cl keyword can be added after reading the table with:

data.meta['keywords']['cl']=0.99
p0array

Initial position vector. The distribution for the nwalkers walkers will be computed as a multidimensional gaussian of width 5% around the initial position vector p0.

modelfunction

A function that takes a vector in the parameter space and the data dictionary, and returns the expected fluxes at the energies in the spectrum. Additional return objects will be saved as blobs in the sampler chain, see the emcee documentation for the format.

priorfunction, optional

A function that takes a vector in the parameter space and returns the log-likelihood of the Bayesian prior. Parameter limits can be specified through a uniform prior, returning 0. if the vector is within the parameter bounds and -np.inf otherwise.

nwalkersint, optional

The number of Goodman & Weare “walkers”. Default is 500.

nburnint, optional

Number of burn-in steps. After nburn steps, the sampler is reset and chain history discarded. It is necessary to settle the sampler into the maximum of the parameter space density. Default is 100.

labelsiterable of strings, optional

Labels for the parameters included in the position vector p0. If not provided ['par1','par2', ... ,'parN'] will be used.

threadsint, optional

Number of parallel processes to use for sampling. Default is the number of CPU cores.

guessbool, optional

Whether to attempt to guess the normalization (first) parameter of the model. Default is True.

interactivebool, optional

Whether to launch the interactive fitting window to set the initial values for the prefitting or the MCMC run. Requires matplotlib. Default is False.

prefitbool, optional

Whether to attempt to find the maximum likelihood parameters with a Nelder-Mead algorithm and use them as starting point of the MCMC run. The parameter values in p0 will be used as starting points for the minimization. Note that the initial optimization is done without taking the prior function into account to avoid the possibility of infinite values in the objective function. If the best-fit parameter vector without prior is forbidden by the prior given, it will be discarded.

data_sedbool, optional

When providing more than one data table, whether to convert them to SED format. If unset or None, all tables will be converted to the format of the first table.

Returns
samplerEnsembleSampler instance

Ensemble sampler with walker positions after nburn burn-in steps.

posndarray

Final position vector array.

naima.run_sampler(nrun=100, sampler=None, pos=None, **kwargs)[source]

Run an MCMC sampler.

If no sampler or initial position vector is provided, extra kwargs are passed to get_sampler to generate a new sampler.

Parameters
nrunint, optional

Number of steps to run

samplerEnsembleSampler instance, optional

Sampler.

posndarray, optional

A list of initial position vectors for the walkers. It should have dimensions of (nwalkers,dim), where dim is the number of free parameters. emcee.utils.sample_ball can be used to generate a multidimensional gaussian distribution around a single initial position.

Returns
samplerEnsembleSampler instance

Sampler containing the paths of the walkers during the nrun steps.

posarray

List of final position vectors after the run.

class naima.InteractiveModelFitter(modelfn, p0, data=None, e_range=None, e_npoints=100, labels=None, sed=True, auto_update=True)[source]

Interactive model fitter using matplotlib widgets

Parameters
modelfnfunction

A function that takes a vector in the parameter space and the data table, and returns the expected fluxes at the energies in the spectrum.

p0array

Initial position vector.

dataTable or list of Table

Table or tables with observed spectra. Must follow the format described in naima.run_sampler.

e_rangelist of Quantity, length 2, optional

Limits in energy for the computation of the model. Note that setting this parameter will mean that the model output is computed twice when data is provided: once for display using e_range and once for computation of the log-probability using the energy values of the spectra.

e_npointsint, optional

How many points to compute for the model if e_range is set. Default is 100.

labelsiterable of strings, optional

Labels for the parameters included in the position vector p0. If not provided ['par1','par2', ... ,'parN'] will be used.

sedbool, optional

Whether to plot SED or differential spectrum.

auto_updatebool, optional

Whether to update the model plot when parameter sliders are changed. Default is True and can also be changed through the GUI.

naima.save_run(filename, sampler, compression=True, clobber=False)[source]

Save the sampler chain, data table, parameter labels, metadata blobs, and run information to a hdf5 file.

The data table and parameter labels stored in the sampler will also be saved to the hdf5 file.

Parameters
filenamestr

Filename for hdf5 file. If the filename extension is not ‘h5’ or ‘hdf5’, the suffix ‘_chain.h5’ will be appended to the filename.

sampleremcee.EnsembleSampler instance

Sampler instance for which chain and run information is saved.

compressionbool, optional

Whether gzip compression is applied to the dataset on write. Default is True.

clobberbool, optional

Whether to overwrite the output filename if it exists.

naima.read_run(filename, modelfn=None)[source]

Read chain from a hdf5 saved with save_run.

This function will also read the labels, data table, and metadata blobs stored in the original sampler. If you want to use the result object with plot_fit and setting the e_range parameter, you must provide the model function with the modelfn argument given that functions cannot be serialized in hdf5 files.

Parameters
filenamestr

Filename of the hdf5 containing the chain, lnprobability, and blob arrays in the group ‘sampler’

modelfnfunction, optional

Model function to be attached to the returned sampler

Returns
resultclass

Container object with same properties as an emcee.EnsembleSampler resulting from a sampling run. This object can be passed onto plot_fit, plot_chain, and plot_corner for analysis as you would do with a emcee.EnsembleSampler instance.

naima.plot_chain(sampler, p=None, **kwargs)[source]

Generate a diagnostic plot of the sampler chains.

Parameters
sampleremcee.EnsembleSampler

Sampler containing the chains to be plotted.

pint (optional)

Index of the parameter to plot. If omitted, all chains are plotted.

last_stepbool (optional)

Whether to plot the last step of the chain or the complete chain (default).

Returns
figurematplotlib.figure.Figure

Figure

naima.plot_corner(sampler, show_ML=True, **kwargs)[source]

A plot that summarizes the parameter samples by showing them as individual histograms and 2D histograms against each other. The maximum likelihood parameter vector is indicated by a cross.

This function is a thin wrapper around corner.corner, found at https://github.com/corner/corner.py.

Parameters
sampleremcee.EnsembleSampler

Sampler with a stored chain.

show_MLbool, optional

Whether to show the maximum likelihood parameter vector as a cross on the 2D histograms.

naima.plot_fit(sampler, modelidx=0, label=None, sed=True, last_step=False, n_samples=100, confs=None, ML_info=False, figure=None, plotdata=None, plotresiduals=None, e_unit=None, e_range=None, e_npoints=100, threads=None, xlabel=None, ylabel=None, ulim_opts={}, errorbar_opts={})[source]

Plot data with fit confidence regions.

Parameters
sampleremcee.EnsembleSampler

Sampler with a stored chain.

modelidxint, optional

Model index to plot.

labelstr, optional

Label for the title of the plot.

sedbool, optional

Whether to plot SED or differential spectrum.

last_stepbool, optional

Whether to use only the samples of the last step in the run when showing either the model samples or the confidence intervals.

n_samplesint, optional

If not None, number of sample models to plot. If None, confidence bands will be plotted instead of samples. Default is 100.

confslist, optional

List of confidence levels (in sigma) to use for generating the confidence intervals. Default is to plot sample models instead of confidence bands.

ML_infobool, optional

Whether to plot information about the maximum likelihood parameters and the standard deviation of their distributions. Default is True.

figurematplotlib.figure.Figure, optional

matplotlib figure to plot on. If omitted a new one will be generated.

plotdatabool, optional

Wheter to plot data on top of model confidence intervals. Default is True if the physical types of the data and the model match.

plotresidualsbool, optional

Wheter to plot the residuals with respect to the maximum likelihood model. Default is True if plotdata is True and either confs or n_samples are set.

e_unitUnit, optional

Units for the energy axis of the plot. The default is to use the units of the energy array of the observed data.

e_rangelist of Quantity, length 2, optional

Limits in energy for the computation of the model samples and ML model. Note that setting this parameter will mean that the samples for the model are recomputed and depending on the model speed might be quite slow.

e_npointsint, optional

How many points to compute for the model samples and ML model if e_range is set.

threadsint, optional

How many parallel processing threads to use when computing the samples. Defaults to the number of available cores.

xlabelstr, optional

Label for the x axis of the plot.

ylabelstr, optional

Label for the y axis of the plot.

ulim_optsdict

Option for upper-limit plotting. Available options are capsize (arrow width) and height_fraction (arrow length in fraction of flux value).

errorbar_optsdict

Addtional options to pass to matplotlib.plt.errorbar for plotting the spectral flux points.

naima.plot_data(input_data, xlabel=None, ylabel=None, sed=True, figure=None, e_unit=None, ulim_opts={}, errorbar_opts={})[source]

Plot spectral data.

Parameters
input_dataemcee.EnsembleSampler, astropy.table.Table, or dict

Spectral data to plot. Can be given as a data table, a dict generated with validate_data_table or a emcee.EnsembleSampler with a data property.

xlabelstr, optional

Label for the x axis of the plot.

ylabelstr, optional

Label for the y axis of the plot.

sedbool, optional

Whether to plot SED or differential spectrum.

figurematplotlib.figure.Figure, optional

matplotlib figure to plot on. If omitted a new one will be generated.

e_unitastropy.unit.Unit, optional

Units for energy axis. Defaults to those of the data.

ulim_optsdict

Options for upper-limit plotting. Available options are capsize (arrow width) and height_fraction (arrow length in fraction of flux value).

errorbar_optsdict

Addtional options to pass to matplotlib.plt.errorbar for plotting the spectral flux points.

naima.plot_blob(sampler, blobidx=0, label=None, last_step=False, figure=None, **kwargs)[source]

Plot a metadata blob as a fit to spectral data or value distribution

Additional kwargs are passed to plot_fit.

Parameters
sampleremcee.EnsembleSampler

Sampler with a stored chain.

blobidxint, optional

Metadata blob index to plot.

labelstr, optional

Label for the value distribution. Labels for the fit plot can be passed as xlabel and ylabel and will be passed to plot_fit.

Returns
figurematplotlib.pyplot.Figure

matplotlib figure instance containing the plot.

naima.save_diagnostic_plots(outname, sampler, modelidxs=None, pdf=False, sed=True, blob_labels=None, last_step=False, dpi=100)[source]

Generate diagnostic plots.

  • A plot for each of the chain parameters showing walker progression, final sample distribution and several statistical measures of this distribution: outname_chain_parN.png (see naima.plot_chain).

  • A corner plot of sample density in the two dimensional parameter space of all parameter pairs of the run, with the Maximum Likelihood parameter vector indicated in blue: outname_corner.png (see corner.corner).

  • A plot for each of the models returned as blobs by the model function. The maximum likelihood model is shown, as well as the 1 and 3 sigma confidence level contours. The first model will be compared with observational data and residuals shown. outname_fit_modelN.png (see naima.plot_fit and naima.plot_blob).

Parameters
outnamestr

Name to be used to save diagnostic plot files.

sampleremcee.EnsembleSampler instance

Sampler instance from which chains, blobs and data are read.

modelidxsiterable of integers, optional

Model numbers to be plotted. Default: All returned in sampler.get_blobs

blob_labelslist of strings, optional

Label for each of the outputs of the model. They will be used as title for the corresponding plot.

pdfbool, optional

Whether to save plots to multipage pdf.

naima.save_results_table(outname, sampler, format='ascii.ecsv', convert_log=True, last_step=False, include_blobs=True)[source]

Save an ASCII table with the results stored in the EnsembleSampler.

The table contains the median, 16th and 84th percentile confidence region (~1sigma) for each parameter.

Parameters
outnamestr

Root name to be used to save the table. _results.dat will be appended for the output filename.

sampleremcee.EnsembleSampler instance

Sampler instance from which chains, blobs and data are read.

formatstr, optional

Format of the saved table. Must be a format string accepted by astropy.table.Table.write, see the astropy unified file read/write interface documentation. Only the ascii.ecsv and ascii.ipac formats are able to preserve all the information stored in the run_info dictionary of the sampler. Defaults to ascii.ecsv if available (only in astropy > v1.0), else ascii.ipac.

convert_logbool, optional

Whether to convert natural or base-10 logarithms into original values in addition to saving the logarithm value.

last_stepbool, optional

Whether to only use the positions in the final step of the run (True, default) or the whole chain (False).

include_blobsbool, optional

Whether to save the distribution properties of the scalar blobs in the sampler. Default is True.

Returns
tableTable

Table with the results.

naima.normal_prior(value, mean, sigma)[source]

Normal prior distribution.

naima.uniform_prior(value, umin, umax)[source]

Uniform prior distribution.