Spectral model fitting¶
- MCMC sampling
- Interactive model fitting
- Saving and retrieving the parameter chain of a run
- Priors
naima.log_uniform_prior
API¶
- naima.get_sampler(data_table=None, p0=None, model=None, prior=None, nwalkers=500, nburn=100, guess=True, interactive=False, prefit=False, labels=None, threads=None, data_sed=None)[source]¶
Generate a new MCMC sampler.
- Parameters:
- data_table
Tableor list ofTable Table containing the observed spectrum. If multiple tables are passed as a string, they will be concatenated in the order given. Each table needs at least these columns, with the appropriate associated units (with the physical type indicated in brackets below) as either a
Unitinstance or parseable string:energy: Observed photon energy [energy]flux: Observed fluxes [fluxordifferential flux]flux_error: 68% CL gaussian uncertainty of the flux [fluxordifferential flux]. It can also be provided asflux_error_loandflux_error_hi(see below).
Optional columns:
energy_width: Width of the energy bin [energy], orenergy_error: Half-width of the energy bin [energy], orenergy_error_loandenergy_error_hi: Distance from bin center to lower and upper bin edges [energy], orenergy_loandenergy_hi: Energy edges of the corresponding energy bin [energy]flux_error_loandflux_error_hi: 68% CL gaussian lower and upper uncertainties of the flux.ul: Flag to indicate that a flux measurement is an upper limit.flux_ul: Upper limit to the flux. If not present, thefluxcolumn will be taken as an upper limit for those measurements with theulflag set to True or 1.
The
keywordsmetadata field of the table can be used to provide the confidence level of the upper limits with the keywordcl, which defaults to 90%. Theastropy.io.asciireader can recover all the needed information from ASCII tables in theIpacandDaophotformats, and everything except theclkeyword from tables in theSextractor. For the latter format, the cl keyword can be added after reading the table with:data.meta['keywords']['cl']=0.99
- p0array
Initial position vector. The distribution for the
nwalkerswalkers will be computed as a multidimensional gaussian of width 5% around the initial position vectorp0.- modelfunction
A function that takes a vector in the parameter space and the data dictionary, and returns the expected fluxes at the energies in the spectrum. Additional return objects will be saved as blobs in the sampler chain, see the emcee documentation for the format.
- priorfunction, optional
A function that takes a vector in the parameter space and returns the log-likelihood of the Bayesian prior. Parameter limits can be specified through a uniform prior, returning 0. if the vector is within the parameter bounds and
-np.infotherwise.- nwalkersint, optional
The number of Goodman & Weare “walkers”. Default is 500.
- nburnint, optional
Number of burn-in steps. After
nburnsteps, the sampler is reset and chain history discarded. It is necessary to settle the sampler into the maximum of the parameter space density. Default is 100.- labelsiterable of strings, optional
Labels for the parameters included in the position vector
p0. If not provided['par1','par2', ... ,'parN']will be used.- threadsint, optional
Number of parallel processes to use for sampling. Default is the number of CPU cores.
- guessbool, optional
Whether to attempt to guess the normalization (first) parameter of the model. Default is True.
- interactivebool, optional
Whether to launch the interactive fitting window to set the initial values for the prefitting or the MCMC run. Requires matplotlib. Default is False.
- prefitbool, optional
Whether to attempt to find the maximum likelihood parameters with a Nelder-Mead algorithm and use them as starting point of the MCMC run. The parameter values in
p0will be used as starting points for the minimization. Note that the initial optimization is done without taking the prior function into account to avoid the possibility of infinite values in the objective function. If the best-fit parameter vector without prior is forbidden by the prior given, it will be discarded.- data_sedbool, optional
When providing more than one data table, whether to convert them to SED format. If unset or None, all tables will be converted to the format of the first table.
- data_table
- Returns:
- sampler
EnsembleSamplerinstance Ensemble sampler with walker positions after
nburnburn-in steps.- pos
ndarray Final position vector array.
- sampler
See also
- naima.run_sampler(nrun=100, sampler=None, pos=None, **kwargs)[source]¶
Run an MCMC sampler.
If no sampler or initial position vector is provided, extra
kwargsare passed toget_samplerto generate a new sampler.- Parameters:
- nrunint, optional
Number of steps to run
- sampler
EnsembleSamplerinstance, optional Sampler.
- pos
ndarray, optional A list of initial position vectors for the walkers. It should have dimensions of
(nwalkers,dim), wheredimis the number of free parameters.emcee.utils.sample_ballcan be used to generate a multidimensional gaussian distribution around a single initial position.
- Returns:
- sampler
EnsembleSamplerinstance Sampler containing the paths of the walkers during the
nrunsteps.- posarray
List of final position vectors after the run.
- sampler
- class naima.InteractiveModelFitter(modelfn, p0, data=None, e_range=None, e_npoints=100, labels=None, sed=True, auto_update=True)[source]¶
Interactive model fitter using matplotlib widgets
- Parameters:
- modelfnfunction
A function that takes a vector in the parameter space and the data table, and returns the expected fluxes at the energies in the spectrum.
- p0array
Initial position vector.
- data
Tableor list ofTable Table or tables with observed spectra. Must follow the format described in
naima.run_sampler.- e_rangelist of
Quantity, length 2, optional Limits in energy for the computation of the model. Note that setting this parameter will mean that the model output is computed twice when
datais provided: once for display usinge_rangeand once for computation of the log-probability using the energy values of the spectra.- e_npointsint, optional
How many points to compute for the model if
e_rangeis set. Default is 100.- labelsiterable of strings, optional
Labels for the parameters included in the position vector
p0. If not provided['par1','par2', ... ,'parN']will be used.- sedbool, optional
Whether to plot SED or differential spectrum.
- auto_updatebool, optional
Whether to update the model plot when parameter sliders are changed. Default is True and can also be changed through the GUI.
- naima.save_run(filename, sampler, compression=True, clobber=False)[source]¶
Save the sampler chain, data table, parameter labels, metadata blobs, and run information to a hdf5 file.
The data table and parameter labels stored in the sampler will also be saved to the hdf5 file.
- Parameters:
- filenamestr
Filename for hdf5 file. If the filename extension is not ‘h5’ or ‘hdf5’, the suffix ‘_chain.h5’ will be appended to the filename.
- sampler
emcee.EnsembleSamplerinstance Sampler instance for which chain and run information is saved.
- compressionbool, optional
Whether gzip compression is applied to the dataset on write. Default is True.
- clobberbool, optional
Whether to overwrite the output filename if it exists.
- naima.read_run(filename, modelfn=None)[source]¶
Read chain from a hdf5 saved with
save_run.This function will also read the labels, data table, and metadata blobs stored in the original sampler. If you want to use the result object with
plot_fitand setting thee_rangeparameter, you must provide the model function with themodelfnargument given that functions cannot be serialized in hdf5 files.- Parameters:
- filenamestr
Filename of the hdf5 containing the chain, lnprobability, and blob arrays in the group ‘sampler’
- modelfnfunction, optional
Model function to be attached to the returned sampler
- Returns:
- resultclass
Container object with same properties as an
emcee.EnsembleSamplerresulting from a sampling run. This object can be passed ontoplot_fit,plot_chain, andplot_cornerfor analysis as you would do with aemcee.EnsembleSamplerinstance.
- naima.plot_chain(sampler, p=None, **kwargs)[source]¶
Generate a diagnostic plot of the sampler chains.
- Parameters:
- sampler
emcee.EnsembleSampler Sampler containing the chains to be plotted.
- pint (optional)
Index of the parameter to plot. If omitted, all chains are plotted.
- last_stepbool (optional)
Whether to plot the last step of the chain or the complete chain (default).
- sampler
- Returns:
- figure
matplotlib.figure.Figure Figure
- figure
- naima.plot_corner(sampler, show_ML=True, **kwargs)[source]¶
A plot that summarizes the parameter samples by showing them as individual histograms and 2D histograms against each other. The maximum likelihood parameter vector is indicated by a cross.
This function is a thin wrapper around
corner.corner, found at https://github.com/corner/corner.py.- Parameters:
- sampler
emcee.EnsembleSampler Sampler with a stored chain.
- show_MLbool, optional
Whether to show the maximum likelihood parameter vector as a cross on the 2D histograms.
- sampler
- naima.plot_fit(sampler, modelidx=0, label=None, sed=True, last_step=False, n_samples=100, confs=None, ML_info=False, figure=None, plotdata=None, plotresiduals=None, e_unit=None, e_range=None, e_npoints=100, threads=None, xlabel=None, ylabel=None, ulim_opts={}, errorbar_opts={})[source]¶
Plot data with fit confidence regions.
- Parameters:
- sampler
emcee.EnsembleSampler Sampler with a stored chain.
- modelidxint, optional
Model index to plot.
- labelstr, optional
Label for the title of the plot.
- sedbool, optional
Whether to plot SED or differential spectrum.
- last_stepbool, optional
Whether to use only the samples of the last step in the run when showing either the model samples or the confidence intervals.
- n_samplesint, optional
If not
None, number of sample models to plot. IfNone, confidence bands will be plotted instead of samples. Default is 100.- confslist, optional
List of confidence levels (in sigma) to use for generating the confidence intervals. Default is to plot sample models instead of confidence bands.
- ML_infobool, optional
Whether to plot information about the maximum likelihood parameters and the standard deviation of their distributions. Default is True.
- figure
matplotlib.figure.Figure, optional matplotlibfigure to plot on. If omitted a new one will be generated.- plotdatabool, optional
Wheter to plot data on top of model confidence intervals. Default is True if the physical types of the data and the model match.
- plotresidualsbool, optional
Wheter to plot the residuals with respect to the maximum likelihood model. Default is True if
plotdatais True and eitherconfsorn_samplesare set.- e_unit
Unit, optional Units for the energy axis of the plot. The default is to use the units of the energy array of the observed data.
- e_rangelist of
Quantity, length 2, optional Limits in energy for the computation of the model samples and ML model. Note that setting this parameter will mean that the samples for the model are recomputed and depending on the model speed might be quite slow.
- e_npointsint, optional
How many points to compute for the model samples and ML model if
e_rangeis set.- threadsint, optional
How many parallel processing threads to use when computing the samples. Defaults to the number of available cores.
- xlabelstr, optional
Label for the
xaxis of the plot.- ylabelstr, optional
Label for the
yaxis of the plot.- ulim_optsdict
Option for upper-limit plotting. Available options are capsize (arrow width) and height_fraction (arrow length in fraction of flux value).
- errorbar_optsdict
Addtional options to pass to
matplotlib.plt.errorbarfor plotting the spectral flux points.
- sampler
- naima.plot_data(input_data, xlabel=None, ylabel=None, sed=True, figure=None, e_unit=None, ulim_opts={}, errorbar_opts={})[source]¶
Plot spectral data.
- Parameters:
- input_data
emcee.EnsembleSampler,astropy.table.Table, ordict Spectral data to plot. Can be given as a data table, a dict generated with
validate_data_tableor aemcee.EnsembleSamplerwith a data property.- xlabelstr, optional
Label for the
xaxis of the plot.- ylabelstr, optional
Label for the
yaxis of the plot.- sedbool, optional
Whether to plot SED or differential spectrum.
- figure
matplotlib.figure.Figure, optional matplotlibfigure to plot on. If omitted a new one will be generated.- e_unit
astropy.unit.Unit, optional Units for energy axis. Defaults to those of the data.
- ulim_optsdict
Options for upper-limit plotting. Available options are capsize (arrow width) and height_fraction (arrow length in fraction of flux value).
- errorbar_optsdict
Addtional options to pass to
matplotlib.plt.errorbarfor plotting the spectral flux points.
- input_data
- naima.plot_blob(sampler, blobidx=0, label=None, last_step=False, figure=None, **kwargs)[source]¶
Plot a metadata blob as a fit to spectral data or value distribution
Additional
kwargsare passed toplot_fit.- Parameters:
- sampler
emcee.EnsembleSampler Sampler with a stored chain.
- blobidxint, optional
Metadata blob index to plot.
- labelstr, optional
Label for the value distribution. Labels for the fit plot can be passed as
xlabelandylabeland will be passed toplot_fit.
- sampler
- Returns:
- figure
matplotlib.pyplot.Figure matplotlibfigure instance containing the plot.
- figure
- naima.save_diagnostic_plots(outname, sampler, modelidxs=None, pdf=False, sed=True, blob_labels=None, last_step=False, dpi=100)[source]¶
Generate diagnostic plots.
A plot for each of the chain parameters showing walker progression, final sample distribution and several statistical measures of this distribution:
outname_chain_parN.png(seenaima.plot_chain).A corner plot of sample density in the two dimensional parameter space of all parameter pairs of the run, with the Maximum Likelihood parameter vector indicated in blue:
outname_corner.png(seecorner.corner).A plot for each of the models returned as blobs by the model function. The maximum likelihood model is shown, as well as the 1 and 3 sigma confidence level contours. The first model will be compared with observational data and residuals shown.
outname_fit_modelN.png(seenaima.plot_fitandnaima.plot_blob).
- Parameters:
- outnamestr
Name to be used to save diagnostic plot files.
- sampler
emcee.EnsembleSamplerinstance Sampler instance from which chains, blobs and data are read.
- modelidxsiterable of integers, optional
Model numbers to be plotted. Default: All returned in sampler.get_blobs
- blob_labelslist of strings, optional
Label for each of the outputs of the model. They will be used as title for the corresponding plot.
- pdfbool, optional
Whether to save plots to multipage pdf.
- naima.save_results_table(outname, sampler, format='ascii.ecsv', convert_log=True, last_step=False, include_blobs=True, overwrite=False)[source]¶
Save an ASCII table with the results stored in the
EnsembleSampler.The table contains the median, 16th and 84th percentile confidence region (~1sigma) for each parameter.
- Parameters:
- outnamestr
Root name to be used to save the table.
_results.datwill be appended for the output filename.- sampler
emcee.EnsembleSamplerinstance Sampler instance from which chains, blobs and data are read.
- formatstr, optional
Format of the saved table. Must be a format string accepted by
astropy.table.Table.write, see the astropy unified file read/write interface documentation. Only theascii.ecsvandascii.ipacformats are able to preserve all the information stored in therun_infodictionary of the sampler. Defaults toascii.ecsvif available (only in astropy > v1.0), elseascii.ipac.- convert_logbool, optional
Whether to convert natural or base-10 logarithms into original values in addition to saving the logarithm value.
- last_stepbool, optional
Whether to only use the positions in the final step of the run (True, default) or the whole chain (False).
- include_blobsbool, optional
Whether to save the distribution properties of the scalar blobs in the sampler. Default is True.
- Returns:
- table
Table Table with the results.
- table