Spectral model fitting¶
- MCMC sampling
- Interactive model fitting
- Saving and retrieving the parameter chain of a run
- Priors
naima.log_uniform_prior
API¶
- naima.get_sampler(data_table=None, p0=None, model=None, prior=None, nwalkers=500, nburn=100, guess=True, interactive=False, prefit=False, labels=None, threads=None, data_sed=None)[source]¶
Generate a new MCMC sampler.
- Parameters
- data_table
Table
or list ofTable
Table containing the observed spectrum. If multiple tables are passed as a string, they will be concatenated in the order given. Each table needs at least these columns, with the appropriate associated units (with the physical type indicated in brackets below) as either a
Unit
instance or parseable string:energy
: Observed photon energy [energy
]flux
: Observed fluxes [flux
ordifferential flux
]flux_error
: 68% CL gaussian uncertainty of the flux [flux
ordifferential flux
]. It can also be provided asflux_error_lo
andflux_error_hi
(see below).
Optional columns:
energy_width
: Width of the energy bin [energy
], orenergy_error
: Half-width of the energy bin [energy
], orenergy_error_lo
andenergy_error_hi
: Distance from bin center to lower and upper bin edges [energy
], orenergy_lo
andenergy_hi
: Energy edges of the corresponding energy bin [energy
]flux_error_lo
andflux_error_hi
: 68% CL gaussian lower and upper uncertainties of the flux.ul
: Flag to indicate that a flux measurement is an upper limit.flux_ul
: Upper limit to the flux. If not present, theflux
column will be taken as an upper limit for those measurements with theul
flag set to True or 1.
The
keywords
metadata field of the table can be used to provide the confidence level of the upper limits with the keywordcl
, which defaults to 90%. Theastropy.io.ascii
reader can recover all the needed information from ASCII tables in theIpac
andDaophot
formats, and everything except thecl
keyword from tables in theSextractor
. For the latter format, the cl keyword can be added after reading the table with:data.meta['keywords']['cl']=0.99
- p0array
Initial position vector. The distribution for the
nwalkers
walkers will be computed as a multidimensional gaussian of width 5% around the initial position vectorp0
.- modelfunction
A function that takes a vector in the parameter space and the data dictionary, and returns the expected fluxes at the energies in the spectrum. Additional return objects will be saved as blobs in the sampler chain, see the emcee documentation for the format.
- priorfunction, optional
A function that takes a vector in the parameter space and returns the log-likelihood of the Bayesian prior. Parameter limits can be specified through a uniform prior, returning 0. if the vector is within the parameter bounds and
-np.inf
otherwise.- nwalkersint, optional
The number of Goodman & Weare “walkers”. Default is 500.
- nburnint, optional
Number of burn-in steps. After
nburn
steps, the sampler is reset and chain history discarded. It is necessary to settle the sampler into the maximum of the parameter space density. Default is 100.- labelsiterable of strings, optional
Labels for the parameters included in the position vector
p0
. If not provided['par1','par2', ... ,'parN']
will be used.- threadsint, optional
Number of parallel processes to use for sampling. Default is the number of CPU cores.
- guessbool, optional
Whether to attempt to guess the normalization (first) parameter of the model. Default is True.
- interactivebool, optional
Whether to launch the interactive fitting window to set the initial values for the prefitting or the MCMC run. Requires matplotlib. Default is False.
- prefitbool, optional
Whether to attempt to find the maximum likelihood parameters with a Nelder-Mead algorithm and use them as starting point of the MCMC run. The parameter values in
p0
will be used as starting points for the minimization. Note that the initial optimization is done without taking the prior function into account to avoid the possibility of infinite values in the objective function. If the best-fit parameter vector without prior is forbidden by the prior given, it will be discarded.- data_sedbool, optional
When providing more than one data table, whether to convert them to SED format. If unset or None, all tables will be converted to the format of the first table.
- data_table
- Returns
- sampler
EnsembleSampler
instance Ensemble sampler with walker positions after
nburn
burn-in steps.- pos
ndarray
Final position vector array.
- sampler
See also
- naima.run_sampler(nrun=100, sampler=None, pos=None, **kwargs)[source]¶
Run an MCMC sampler.
If no sampler or initial position vector is provided, extra
kwargs
are passed toget_sampler
to generate a new sampler.- Parameters
- nrunint, optional
Number of steps to run
- sampler
EnsembleSampler
instance, optional Sampler.
- pos
ndarray
, optional A list of initial position vectors for the walkers. It should have dimensions of
(nwalkers,dim)
, wheredim
is the number of free parameters.emcee.utils.sample_ball
can be used to generate a multidimensional gaussian distribution around a single initial position.
- Returns
- sampler
EnsembleSampler
instance Sampler containing the paths of the walkers during the
nrun
steps.- posarray
List of final position vectors after the run.
- sampler
- class naima.InteractiveModelFitter(modelfn, p0, data=None, e_range=None, e_npoints=100, labels=None, sed=True, auto_update=True)[source]¶
Interactive model fitter using matplotlib widgets
- Parameters
- modelfnfunction
A function that takes a vector in the parameter space and the data table, and returns the expected fluxes at the energies in the spectrum.
- p0array
Initial position vector.
- data
Table
or list ofTable
Table or tables with observed spectra. Must follow the format described in
naima.run_sampler
.- e_rangelist of
Quantity
, length 2, optional Limits in energy for the computation of the model. Note that setting this parameter will mean that the model output is computed twice when
data
is provided: once for display usinge_range
and once for computation of the log-probability using the energy values of the spectra.- e_npointsint, optional
How many points to compute for the model if
e_range
is set. Default is 100.- labelsiterable of strings, optional
Labels for the parameters included in the position vector
p0
. If not provided['par1','par2', ... ,'parN']
will be used.- sedbool, optional
Whether to plot SED or differential spectrum.
- auto_updatebool, optional
Whether to update the model plot when parameter sliders are changed. Default is True and can also be changed through the GUI.
- naima.save_run(filename, sampler, compression=True, clobber=False)[source]¶
Save the sampler chain, data table, parameter labels, metadata blobs, and run information to a hdf5 file.
The data table and parameter labels stored in the sampler will also be saved to the hdf5 file.
- Parameters
- filenamestr
Filename for hdf5 file. If the filename extension is not ‘h5’ or ‘hdf5’, the suffix ‘_chain.h5’ will be appended to the filename.
- sampler
emcee.EnsembleSampler
instance Sampler instance for which chain and run information is saved.
- compressionbool, optional
Whether gzip compression is applied to the dataset on write. Default is True.
- clobberbool, optional
Whether to overwrite the output filename if it exists.
- naima.read_run(filename, modelfn=None)[source]¶
Read chain from a hdf5 saved with
save_run
.This function will also read the labels, data table, and metadata blobs stored in the original sampler. If you want to use the result object with
plot_fit
and setting thee_range
parameter, you must provide the model function with themodelfn
argument given that functions cannot be serialized in hdf5 files.- Parameters
- filenamestr
Filename of the hdf5 containing the chain, lnprobability, and blob arrays in the group ‘sampler’
- modelfnfunction, optional
Model function to be attached to the returned sampler
- Returns
- resultclass
Container object with same properties as an
emcee.EnsembleSampler
resulting from a sampling run. This object can be passed ontoplot_fit
,plot_chain
, andplot_corner
for analysis as you would do with aemcee.EnsembleSampler
instance.
- naima.plot_chain(sampler, p=None, **kwargs)[source]¶
Generate a diagnostic plot of the sampler chains.
- Parameters
- sampler
emcee.EnsembleSampler
Sampler containing the chains to be plotted.
- pint (optional)
Index of the parameter to plot. If omitted, all chains are plotted.
- last_stepbool (optional)
Whether to plot the last step of the chain or the complete chain (default).
- sampler
- Returns
- figure
matplotlib.figure.Figure
Figure
- figure
- naima.plot_corner(sampler, show_ML=True, **kwargs)[source]¶
A plot that summarizes the parameter samples by showing them as individual histograms and 2D histograms against each other. The maximum likelihood parameter vector is indicated by a cross.
This function is a thin wrapper around
corner.corner
, found at https://github.com/corner/corner.py.- Parameters
- sampler
emcee.EnsembleSampler
Sampler with a stored chain.
- show_MLbool, optional
Whether to show the maximum likelihood parameter vector as a cross on the 2D histograms.
- sampler
- naima.plot_fit(sampler, modelidx=0, label=None, sed=True, last_step=False, n_samples=100, confs=None, ML_info=False, figure=None, plotdata=None, plotresiduals=None, e_unit=None, e_range=None, e_npoints=100, threads=None, xlabel=None, ylabel=None, ulim_opts={}, errorbar_opts={})[source]¶
Plot data with fit confidence regions.
- Parameters
- sampler
emcee.EnsembleSampler
Sampler with a stored chain.
- modelidxint, optional
Model index to plot.
- labelstr, optional
Label for the title of the plot.
- sedbool, optional
Whether to plot SED or differential spectrum.
- last_stepbool, optional
Whether to use only the samples of the last step in the run when showing either the model samples or the confidence intervals.
- n_samplesint, optional
If not
None
, number of sample models to plot. IfNone
, confidence bands will be plotted instead of samples. Default is 100.- confslist, optional
List of confidence levels (in sigma) to use for generating the confidence intervals. Default is to plot sample models instead of confidence bands.
- ML_infobool, optional
Whether to plot information about the maximum likelihood parameters and the standard deviation of their distributions. Default is True.
- figure
matplotlib.figure.Figure
, optional matplotlib
figure to plot on. If omitted a new one will be generated.- plotdatabool, optional
Wheter to plot data on top of model confidence intervals. Default is True if the physical types of the data and the model match.
- plotresidualsbool, optional
Wheter to plot the residuals with respect to the maximum likelihood model. Default is True if
plotdata
is True and eitherconfs
orn_samples
are set.- e_unit
Unit
, optional Units for the energy axis of the plot. The default is to use the units of the energy array of the observed data.
- e_rangelist of
Quantity
, length 2, optional Limits in energy for the computation of the model samples and ML model. Note that setting this parameter will mean that the samples for the model are recomputed and depending on the model speed might be quite slow.
- e_npointsint, optional
How many points to compute for the model samples and ML model if
e_range
is set.- threadsint, optional
How many parallel processing threads to use when computing the samples. Defaults to the number of available cores.
- xlabelstr, optional
Label for the
x
axis of the plot.- ylabelstr, optional
Label for the
y
axis of the plot.- ulim_optsdict
Option for upper-limit plotting. Available options are capsize (arrow width) and height_fraction (arrow length in fraction of flux value).
- errorbar_optsdict
Addtional options to pass to
matplotlib.plt.errorbar
for plotting the spectral flux points.
- sampler
- naima.plot_data(input_data, xlabel=None, ylabel=None, sed=True, figure=None, e_unit=None, ulim_opts={}, errorbar_opts={})[source]¶
Plot spectral data.
- Parameters
- input_data
emcee.EnsembleSampler
,astropy.table.Table
, ordict
Spectral data to plot. Can be given as a data table, a dict generated with
validate_data_table
or aemcee.EnsembleSampler
with a data property.- xlabelstr, optional
Label for the
x
axis of the plot.- ylabelstr, optional
Label for the
y
axis of the plot.- sedbool, optional
Whether to plot SED or differential spectrum.
- figure
matplotlib.figure.Figure
, optional matplotlib
figure to plot on. If omitted a new one will be generated.- e_unit
astropy.unit.Unit
, optional Units for energy axis. Defaults to those of the data.
- ulim_optsdict
Options for upper-limit plotting. Available options are capsize (arrow width) and height_fraction (arrow length in fraction of flux value).
- errorbar_optsdict
Addtional options to pass to
matplotlib.plt.errorbar
for plotting the spectral flux points.
- input_data
- naima.plot_blob(sampler, blobidx=0, label=None, last_step=False, figure=None, **kwargs)[source]¶
Plot a metadata blob as a fit to spectral data or value distribution
Additional
kwargs
are passed toplot_fit
.- Parameters
- sampler
emcee.EnsembleSampler
Sampler with a stored chain.
- blobidxint, optional
Metadata blob index to plot.
- labelstr, optional
Label for the value distribution. Labels for the fit plot can be passed as
xlabel
andylabel
and will be passed toplot_fit
.
- sampler
- Returns
- figure
matplotlib.pyplot.Figure
matplotlib
figure instance containing the plot.
- figure
- naima.save_diagnostic_plots(outname, sampler, modelidxs=None, pdf=False, sed=True, blob_labels=None, last_step=False, dpi=100)[source]¶
Generate diagnostic plots.
A plot for each of the chain parameters showing walker progression, final sample distribution and several statistical measures of this distribution:
outname_chain_parN.png
(seenaima.plot_chain
).A corner plot of sample density in the two dimensional parameter space of all parameter pairs of the run, with the Maximum Likelihood parameter vector indicated in blue:
outname_corner.png
(seecorner.corner
).A plot for each of the models returned as blobs by the model function. The maximum likelihood model is shown, as well as the 1 and 3 sigma confidence level contours. The first model will be compared with observational data and residuals shown.
outname_fit_modelN.png
(seenaima.plot_fit
andnaima.plot_blob
).
- Parameters
- outnamestr
Name to be used to save diagnostic plot files.
- sampler
emcee.EnsembleSampler
instance Sampler instance from which chains, blobs and data are read.
- modelidxsiterable of integers, optional
Model numbers to be plotted. Default: All returned in sampler.get_blobs
- blob_labelslist of strings, optional
Label for each of the outputs of the model. They will be used as title for the corresponding plot.
- pdfbool, optional
Whether to save plots to multipage pdf.
- naima.save_results_table(outname, sampler, format='ascii.ecsv', convert_log=True, last_step=False, include_blobs=True)[source]¶
Save an ASCII table with the results stored in the
EnsembleSampler
.The table contains the median, 16th and 84th percentile confidence region (~1sigma) for each parameter.
- Parameters
- outnamestr
Root name to be used to save the table.
_results.dat
will be appended for the output filename.- sampler
emcee.EnsembleSampler
instance Sampler instance from which chains, blobs and data are read.
- formatstr, optional
Format of the saved table. Must be a format string accepted by
astropy.table.Table.write
, see the astropy unified file read/write interface documentation. Only theascii.ecsv
andascii.ipac
formats are able to preserve all the information stored in therun_info
dictionary of the sampler. Defaults toascii.ecsv
if available (only in astropy > v1.0), elseascii.ipac
.- convert_logbool, optional
Whether to convert natural or base-10 logarithms into original values in addition to saving the logarithm value.
- last_stepbool, optional
Whether to only use the positions in the final step of the run (True, default) or the whole chain (False).
- include_blobsbool, optional
Whether to save the distribution properties of the scalar blobs in the sampler. Default is True.
- Returns
- table
Table
Table with the results.
- table