Time Series

CARMA process simulations

A collection of functions for simulating CARMA processes.

eztao.ts.carma_sim.addNoise(y, yerr, seed=None)

Add Gaussian noise to the input simulated time series given the measurement uncertainties.

Parameters:

y (array(float)) – The ‘clean’ time series.
yerr (array(float)) – The measurement uncertainties for the input time series.
seed (int, optional) – Random seed for simulating noise. Defaults to None.

Returns:

A new time series with simulated Gaussian noise added on top.

Return type:

array(float)
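
As a rough illustration of the behavior described above, the sketch below draws one zero-mean normal deviate per data point, scaled by that point's uncertainty. It uses only NumPy; `add_gaussian_noise` is a hypothetical stand-in written for this example and is not the EzTao implementation.

```python
import numpy as np


def add_gaussian_noise(y, yerr, seed=None):
    """Return y plus zero-mean Gaussian noise with per-point sigma yerr.

    Hypothetical helper for illustration; not EzTao's addNoise itself.
    """
    y = np.asarray(y, dtype=float)
    yerr = np.asarray(yerr, dtype=float)
    rng = np.random.default_rng(seed)  # seeded generator for reproducibility
    return y + rng.normal(loc=0.0, scale=yerr)


clean = np.sin(np.linspace(0.0, 10.0, 50))   # a 'clean' time series
errors = np.full_like(clean, 0.05)           # constant measurement uncertainty
noisy = add_gaussian_noise(clean, errors, seed=42)
```

Passing the same seed reproduces the same noise realization, which is useful when comparing fits on identical mock data.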

eztao.ts.carma_sim.gpSimByTime(carmaTerm, SNR, t, factor=10, nLC=1, log_flux=True, lc_seed=None)

Simulate CARMA time series at desired time stamps.

This function uses a ‘factor’ parameter to determine the sampling rate of a full time series to simulate and downsample from. For example, if ‘factor’ = 10, then the full time series will be 10 times denser than the median sampling rate of the provided time stamps.

Parameters:

carmaTerm (object) – An EzTao CARMA kernel.
SNR (float) – Signal-to-noise ratio, defined as the ratio of the CARMA RMS amplitude to the median of the measurement errors (simulated using a log-normal distribution).
t (array(float)) – The desired time stamps (starting from zero).
factor (int, optional) – Parameter to control the ratio in the sampling rate between the simulated full time series and the desired output one. Defaults to 10.
nLC (int, optional) – Number of time series to simulate. Defaults to 1.
log_flux (bool) – Whether the flux/y values are in astronomical magnitude. This argument affects how errors are assigned. Defaults to True.
lc_seed (int) – Random seed for time series simulation. Defaults to None.

Returns:

Time stamps (in days by default), y values, and measurement errors of the simulated time series.

Return type:

(array(float), array(float), array(float))
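
To make the role of 'factor' concrete, the following sketch builds a uniform grid whose spacing is the median cadence of the requested time stamps divided by 'factor'. `dense_time_grid` is a hypothetical helper; how EzTao actually constructs the full series may differ, and the input time stamps are assumed to be sorted.

```python
import numpy as np


def dense_time_grid(t, factor=10):
    """Build a uniform grid about `factor` times denser than t's median cadence.

    Hypothetical helper for illustration only; assumes t is sorted.
    """
    t = np.asarray(t, dtype=float)
    dt_full = np.median(np.diff(t)) / factor  # target spacing of the full series
    n_full = int(np.ceil((t[-1] - t[0]) / dt_full)) + 1
    return np.linspace(t[0], t[-1], n_full)


t_obs = np.arange(0.0, 100.0, 2.0)            # desired time stamps, 2-day cadence
t_full = dense_time_grid(t_obs, factor=10)    # roughly 0.2-day cadence
```

A larger 'factor' means each requested time stamp has a closer neighbor in the full series, at the cost of simulating more points.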

eztao.ts.carma_sim.gpSimFull(carmaTerm, SNR, duration, N, nLC=1, log_flux=True, lc_seed=None)

Simulate CARMA time series using uniform sampling.

Parameters:

carmaTerm (object) – An EzTao CARMA kernel.
SNR (float) – Signal-to-noise ratio, defined as the ratio of the CARMA RMS amplitude to the median of the measurement errors (simulated using a log-normal distribution).
duration (float) – The duration of the simulated time series (default in days).
N (int) – The number of data points in the simulated time series.
nLC (int, optional) – Number of time series to simulate. Defaults to 1.
log_flux (bool) – Whether the flux/y values are in astronomical magnitude. This argument affects how errors are assigned. Defaults to True.
lc_seed (int) – Random seed for time series simulation. Defaults to None.

Raises:

RuntimeError – If the input CARMA term/model is not stable and thus cannot be solved by celerite.

Returns:

Time stamps (in days by default), y values, and measurement errors of the simulated time series.

Return type:

(array(float), array(float), array(float))
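
The SNR definition above fixes the median of the measurement errors at (RMS amplitude) / SNR. A minimal sketch of uniform sampling with log-normal errors follows. `uniform_times_and_errors` is hypothetical, the log-space scatter `sigma_ln` is an assumed value, and starting the time stamps at zero is also an assumption; none of this is taken from the EzTao source.

```python
import numpy as np


def uniform_times_and_errors(rms_amp, snr, duration, n, sigma_ln=0.25, seed=None):
    """Uniform time stamps plus log-normal errors whose median is rms_amp / snr.

    Hypothetical helper; sigma_ln (the log-space scatter) is an assumed value.
    """
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, duration, n)
    # The median of a log-normal distribution is exp(mean of the underlying normal).
    yerr = rng.lognormal(mean=np.log(rms_amp / snr), sigma=sigma_ln, size=n)
    return t, yerr


t, yerr = uniform_times_and_errors(
    rms_amp=0.2, snr=10.0, duration=3650.0, n=20001, seed=1
)
```

With an RMS amplitude of 0.2 and SNR of 10, the median of `yerr` should sit close to 0.02.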

eztao.ts.carma_sim.gpSimRand(carmaTerm, SNR, duration, N, nLC=1, log_flux=True, season=True, full_N=10000, lc_seed=None, downsample_seed=None)

Simulate CARMA time series randomly downsampled from a much denser full time series.

Parameters:

carmaTerm (object) – An EzTao CARMA kernel.
SNR (float) – Signal-to-noise ratio, defined as the ratio of the CARMA RMS amplitude to the median of the measurement errors (simulated using a log-normal distribution).
duration (float) – The duration of the simulated time series (default in days).
N (int) – The number of data points in the simulated time series.
nLC (int, optional) – Number of time series to simulate. Defaults to 1.
log_flux (bool) – Whether the flux/y values are in astronomical magnitude. This argument affects how errors are assigned. Defaults to True.
season (bool, optional) – Whether to simulate 6-month seasonal gaps. Defaults to True.
full_N (int, optional) – The number of data points in the full time series (before downsampling). Defaults to 10_000.
lc_seed (int) – Random seed for full time series simulation. Defaults to None.
downsample_seed (int) – Random seed for downsampling the simulated full time series. Defaults to None.

Returns:

Time stamps (in days by default), y values, and measurement errors of the simulated time series.

Return type:

(array(float), array(float), array(float))

eztao.ts.carma_sim.pred_lc(t, y, yerr, params, p, t_pred, return_var=True)

Generate predicted values at particular time stamps given the initial time series and a best-fit model.

Parameters:

t (array(float)) – Time stamps of the initial time series.
y (array(float)) – y values (i.e., flux) of the initial time series.
yerr (array(float)) – Measurement errors of the initial time series.
params (array(float)) – Best-fit CARMA parameters.
p (int) – The AR order (p) of the given best-fit model.
t_pred (array(float)) – Time stamps to generate predicted time series.
return_var (bool, optional) – Whether to return uncertainties in the mean prediction. Defaults to True.

Returns:

t_pred, mean prediction at t_pred and uncertainties (variance) of the mean prediction.

Return type:

(array(float), array(float), array(float))

Fitting time series to CARMA

A collection of functions to fit/analyze time series using CARMA models.

eztao.ts.carma_fit.carma_fit(t, y, yerr, p, q, init_func=None, neg_lp_func=None, optimizer_func=None, n_opt=20, user_bounds=None, scipy_opt_kwargs={}, scipy_opt_options={}, debug=False)

Fit an arbitrary CARMA model.

The default settings are optimized for normalized light curves (LCs).

Parameters:

t (array(float)) – Time stamps of the input time series (the default unit is day).
y (array(float)) – y values of the input time series.
yerr (array(float)) – Measurement errors for y values.
p (int) – The p order of a CARMA(p, q) model.
q (int) – The q order of a CARMA(p, q) model.
init_func (object, optional) – A user-provided function to generate initial guesses for the optimizer. Defaults to None.
neg_lp_func (object, optional) – A user-provided function to compute the negative log probability given an array of parameters, an array of time series values and a celerite GP instance. Defaults to None.
optimizer_func (object, optional) – A user-provided optimizer function. Defaults to None.
n_opt (int, optional) – Number of optimizers to run. Defaults to 20.
user_bounds (array(float), optional) – Parameter boundaries for the default optimizer. If p > 2, these are boundaries for the coefficients of the factored polynomial. Defaults to None.
scipy_opt_kwargs (dict, optional) – Keyword arguments for scipy.optimize.minimize. Defaults to {}.
scipy_opt_options (dict, optional) – “options” argument for scipy.optimize.minimize. Defaults to {}.
debug (bool, optional) – Turn on/off debug mode. Defaults to False.

Raises:

celerite.solver.LinAlgError – For non-positive definite autocovariance matrices.

Returns:

Best-fit parameters

Return type:

array(float)

eztao.ts.carma_fit.carma_log_fcoeff_init(p, q, ar_range=[-8, 8], ma_range=[-10, 6], ma_mult_range=[-10, 0], size=1)

Randomly generate CARMA coefficients in the space of the factored polynomials

The default ranges are optimized for normalized light curves (with a standard deviation of unity).

Parameters:

p (int) – The p order of a CARMA(p, q) model.
q (int) – The q order of a CARMA(p, q) model.
ar_range (object, optional) – The range (in natural log) for AR polynomial coefficients. Defaults to [-8, 8].
ma_range (object, optional) – The range (in natural log) for MA polynomial coefficients. Defaults to [-10, 6].
ma_mult_range (object, optional) – The range for the MA multiplier (the coefficient of the highest-order term in the MA characteristic polynomial). Defaults to [-10, 0].
size (int, optional) – The number of the set of coefficients to generate. Defaults to 1.

Returns:

An ndarray of coefficients for the factored polynomials, in natural log.

Return type:

array(float)

Note

The notation (index) of the returned coefficients follows that of Jones et al. (1981). The last entry in the returned array is not one of the polynomial coefficients, but rather a simple multiplying factor for the entire polynomial, which is needed to obtain the nominal CARMA representation.
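
One way to picture the random initialization is as independent uniform draws within each log range. The sketch below assumes a column layout of p AR entries, q MA entries, and one trailing multiplier, and it assumes uniform sampling. `log_fcoeff_init_sketch` is a hypothetical helper, and both assumptions should be checked against the EzTao source before relying on them.

```python
import numpy as np


def log_fcoeff_init_sketch(p, q, ar_range=(-8, 8), ma_range=(-10, 6),
                           ma_mult_range=(-10, 0), size=1, seed=None):
    """Draw `size` sets of coefficients uniformly within the given ranges.

    Hypothetical helper; the (AR, MA, multiplier) column order is an assumption.
    """
    rng = np.random.default_rng(seed)
    ar = rng.uniform(*ar_range, size=(size, p))          # p AR entries
    ma = rng.uniform(*ma_range, size=(size, q))          # q MA entries
    mult = rng.uniform(*ma_mult_range, size=(size, 1))   # trailing multiplier
    return np.hstack([ar, ma, mult])


guesses = log_fcoeff_init_sketch(p=3, q=1, size=5, seed=0)  # shape (5, 5)
```

Each row can then serve as one starting point for an optimizer run.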

eztao.ts.carma_fit.dho_fit(t, y, yerr, init_func=None, neg_lp_func=None, optimizer_func=None, n_opt=20, user_bounds=None, scipy_opt_kwargs={}, scipy_opt_options={}, debug=False)

Fit a DHO model to a time series.

The default settings are optimized for normalized light curves (LCs).

Parameters:

t (array(float)) – Time stamps of the input time series (the default unit is day).
y (array(float)) – y values of the input time series.
yerr (array(float)) – Measurement errors for y values.
init_func (object, optional) – A user-provided function to generate initial guesses for the optimizer. Defaults to None.
neg_lp_func (object, optional) – A user-provided function to compute the negative log probability given an array of parameters, an array of time series values and a celerite GP instance. Defaults to None.
optimizer_func (object, optional) – A user-provided optimizer function. Defaults to None.
n_opt (int, optional) – Number of optimizers to run. Defaults to 20.
user_bounds (list, optional) – Parameter boundaries for the default optimizer and the default flat prior. Defaults to None.
scipy_opt_kwargs (dict, optional) – Keyword arguments for scipy.optimize.minimize. Defaults to {}.
scipy_opt_options (dict, optional) – “options” argument for scipy.optimize.minimize. Defaults to {}.
debug (bool, optional) – Turn on/off debug mode. Defaults to False.

Raises:

celerite.solver.LinAlgError – For non-positive definite autocovariance matrices.

Returns:

Best-fit DHO parameters

Return type:

array(float)

eztao.ts.carma_fit.dho_log_param_init(ar_range=[-6, 10], ma_range=[-10, 2], size=1)

Randomly generate DHO coefficients in the space of the factored polynomials

The default ranges are optimized for normalized light curves (with a standard deviation of unity).

Parameters:

ar_range (object, optional) – The range (in natural log) for DHO AR parameters. Defaults to [-6, 10].
ma_range (object, optional) – The range (in natural log) for DHO MA parameters. Defaults to [-10, 2].
size (int, optional) – The number of the set of DHO parameters to generate. Defaults to 1.

Returns:

An ndarray of DHO parameters in natural log.

Return type:

array(float)

eztao.ts.carma_fit.drw_fit(t, y, yerr, init_func=None, neg_lp_func=None, optimizer_func=None, n_opt=10, user_bounds=None, scipy_opt_kwargs={}, scipy_opt_options={}, debug=False)

Fit DRW.

Parameters:

t (array(float)) – Time stamps of the input time series (the default unit is day).
y (array(float)) – y values of the input time series.
yerr (array(float)) – Measurement errors for y values.
init_func (object, optional) – A user-provided function to generate initial guesses for the optimizer. Defaults to None.
neg_lp_func (object, optional) – A user-provided function to compute the negative log probability given an array of parameters, an array of time series values and a celerite GP instance. Defaults to None.
optimizer_func (object, optional) – A user-provided optimizer function. Defaults to None.
n_opt (int, optional) – Number of optimizers to run. Defaults to 10.
user_bounds (list, optional) – Parameter boundaries for the default optimizer. Defaults to None.
scipy_opt_kwargs (dict, optional) – Keyword arguments for scipy.optimize.minimize. Defaults to {}.
scipy_opt_options (dict, optional) – “options” argument for scipy.optimize.minimize. Defaults to {}.
debug (bool, optional) – Turn on/off debug mode. Defaults to False.

Returns:

Best-fit DRW parameters

Return type:

array(float)

eztao.ts.carma_fit.drw_log_param_init(amp_range, log_tau_range, size=1)

Randomly generate DRW parameters.

Parameters:

amp_range (object) – An array containing the range of DRW amplitude to simulate.
log_tau_range (object) – An array containing the range of DRW timescale (in natural log) to simulate.
size (int, optional) – The number of the set of DRW parameters to generate. Defaults to 1.

Returns:

An ndarray of DRW parameters in natural log.

Return type:

array(float)

eztao.ts.carma_fit.flat_prior(log_params, bounds)

A flat prior function. Returns 0 if “log_params” are within the given “bounds”, negative infinity otherwise.

Parameters:

log_params (array(float)) – CARMA parameters in natural log.
bounds (array((float, float))) – An array of boundaries.

Returns:

0 or negative infinity.

Return type:

float
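
The flat prior amounts to a single bounds check. A minimal sketch, assuming inclusive bounds given as (low, high) pairs; `flat_log_prior` is a hypothetical name used only for this example.

```python
import numpy as np


def flat_log_prior(log_params, bounds):
    """Return 0.0 if every parameter lies within its bounds, else -inf.

    Hypothetical helper; treating the bounds as inclusive is an assumption.
    """
    log_params = np.asarray(log_params, dtype=float)
    bounds = np.asarray(bounds, dtype=float)  # shape (n_params, 2): (low, high)
    inside = np.all((log_params >= bounds[:, 0]) & (log_params <= bounds[:, 1]))
    return 0.0 if inside else -np.inf


bounds = [(-1.0, 1.0), (0.0, 2.0)]
inside_value = flat_log_prior([0.0, 1.0], bounds)    # within both bounds
outside_value = flat_log_prior([0.0, 3.0], bounds)   # second parameter too large
```

Because the prior is 0 inside the bounds, the log posterior there equals the log likelihood; outside, the proposal is rejected outright.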

eztao.ts.carma_fit.neg_fcoeff_ll(log_fcoeffs, y, gp)

Negative log likelihood function for CARMA specified in the factored poly space.

This method catches overflow/underflow RuntimeWarning and returns inf as the negative log likelihood.

Parameters:

log_fcoeffs (array(float)) – Coefficients (in natural log) of a CARMA model in the factored polynomial space.
y (array(float)) – y values of the input time series.
gp (object) – celerite GP object with a proper CARMA kernel.

Returns:

Negative log likelihood.

Return type:

float
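
The warning-to-inf behavior can be sketched with the standard `warnings` module: promote RuntimeWarning to an exception inside a context manager and return inf when it fires. `safe_neg_ll` and `toy_neg_ll` are hypothetical names, and the toy likelihood stands in for the real GP evaluation. Note that NumPy only warns on overflow by default; underflow is ignored unless enabled through `np.errstate`.

```python
import warnings

import numpy as np


def toy_neg_ll(params):
    """Stand-in objective; a real one would evaluate the GP likelihood."""
    return 0.5 * np.sum(params ** 2)


def safe_neg_ll(neg_ll, log_params):
    """Evaluate neg_ll(exp(log_params)), returning inf on a RuntimeWarning.

    Hypothetical wrapper for illustration only.
    """
    with warnings.catch_warnings():
        # Turn RuntimeWarning (e.g. overflow in exp) into a catchable exception.
        warnings.simplefilter("error", RuntimeWarning)
        try:
            return float(neg_ll(np.exp(np.asarray(log_params, dtype=float))))
        except RuntimeWarning:
            return np.inf


ok_value = safe_neg_ll(toy_neg_ll, np.array([0.0]))     # exp(0) = 1
bad_value = safe_neg_ll(toy_neg_ll, np.array([800.0]))  # exp(800) overflows
```

Returning inf lets a minimizer treat numerically unusable proposals as very poor rather than aborting the run.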

eztao.ts.carma_fit.neg_lp_flat(log_params, y, gp, bounds=None, mode='fcoeff')

Negative log probability function using a flat prior.

Parameters:

log_params (array(float)) – CARMA parameters (or coefficients of the factored characteristic polynomial) in natural log.
y (array(float)) – y values of the input time series.
gp (object) – celerite GP object with a proper CARMA kernel.
bounds (array((float, float))) – An array of boundaries. Defaults to None.
mode (str, optional) – The parameter space in which proposals are made. The mode determines which log-likelihood function to use. Defaults to “fcoeff”.

Returns:

Negative log probability of the proposed parameters.

Return type:

float

eztao.ts.carma_fit.neg_param_ll(log_params, y, gp)

Negative log likelihood function for CARMA specified in the nominal space.

This method catches overflow/underflow RuntimeWarning and returns inf as the negative log likelihood.

Parameters:

log_params (array(float)) – Natural log of CARMA parameters.
y (array(float)) – y values of the input time series.
gp (object) – celerite GP object with a proper CARMA kernel.

Returns:

Negative log likelihood.

Return type:

float

eztao.ts.carma_fit.sample_carma(p, q)

Randomly generate a stationary CARMA model given the orders (p and q).

Parameters:

p (int) – The p order of a CARMA(p, q) model.
q (int) – The q order of a CARMA(p, q) model.

Returns:

AR and MA coefficients in two separate arrays.
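
A CARMA process is stationary when every root of its AR characteristic polynomial has a negative real part. The sketch below draws such a polynomial by sampling negative real roots directly; this is a simplification (complex-conjugate root pairs are also valid) and is not claimed to be how sample_carma works. `sample_stable_ar` and `is_stationary` are hypothetical helpers, and the coefficient ordering follows NumPy's `np.poly` convention, which may differ from EzTao's.

```python
import numpy as np


def sample_stable_ar(p, seed=None):
    """Draw AR coefficients whose characteristic roots are real and negative.

    Hypothetical helper. Coefficients run from the highest power down with a
    leading 1 (np.poly convention), which may differ from EzTao's ordering.
    """
    rng = np.random.default_rng(seed)
    roots = -np.exp(rng.uniform(-3.0, 1.0, size=p))  # strictly negative roots
    return np.poly(roots)


def is_stationary(ar_poly):
    """True if every root of the AR polynomial has a negative real part."""
    return bool(np.all(np.roots(ar_poly).real < 0))


ar_coeffs = sample_stable_ar(3, seed=0)
```

The same root check can be used to screen user-supplied coefficients before attempting a simulation.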

eztao.ts.carma_fit.scipy_opt(y, gp, init_func, neg_lp_func, n_opt, mode='fcoeff', debug=False, opt_kwargs={}, opt_options={})

A wrapper for scipy.optimize.minimize method.

Parameters:

y (array(float)) – y values of the input time series.
gp (object) – celerite GP object with a proper CARMA kernel.
init_func (object) – A user-provided function to generate initial guesses for the optimizer.
neg_lp_func (object) – A user-provided function to compute the negative log probability given an array of parameters, an array of time series values and a celerite GP instance.
n_opt (int) – Number of iterations to run the optimizer.
mode (str, optional) – The parameter space in which to make proposals; this should be determined in the “_fit” functions based on the value of the p order. Defaults to “fcoeff”.
debug (bool, optional) – Turn on/off debug mode. Defaults to False.
opt_kwargs (dict, optional) – Keyword arguments for scipy.optimize.minimize. Defaults to {}.
opt_options (dict, optional) – “options” argument for scipy.optimize.minimize. Defaults to {}.

Returns:

Best-fit parameters if “debug” is False, an array of scipy.optimize.OptimizeResult objects otherwise.
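
Running several optimizers from different starting points and keeping the lowest objective value is a common multi-start pattern. The sketch below shows it with scipy.optimize.minimize on a toy objective. `multi_start_minimize` is a hypothetical helper and does not reproduce the wrapper's actual arguments or internals.

```python
import numpy as np
from scipy.optimize import minimize


def multi_start_minimize(neg_lp, init_func, n_opt, **minimize_kwargs):
    """Run n_opt independent minimizations and return the best parameters.

    Hypothetical helper: neg_lp(x) is the objective and init_func() returns
    one random starting point per call.
    """
    results = [
        minimize(neg_lp, init_func(), **minimize_kwargs) for _ in range(n_opt)
    ]
    best = min(results, key=lambda res: res.fun)  # lowest objective value wins
    return best.x, results


def objective(x):
    """Toy objective with a single minimum at (1, -2)."""
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2


rng = np.random.default_rng(0)
best_x, all_results = multi_start_minimize(
    objective, lambda: rng.uniform(-5.0, 5.0, size=2), n_opt=5
)
```

Keeping the full list of results, as this sketch does, mirrors the idea of a debug mode in which every optimizer run can be inspected.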

MCMC

A module containing functions to run MCMC using emcee.

eztao.ts.carma_mcmc.mcmc(t, y, yerr, p, q, n_walkers=32, burn_in=500, n_samples=2000, init_param=None)

A simple wrapper to run quick MCMC using emcee.

Parameters:

t (array(float)) – Time stamps of the input time series (the default unit is day).
y (array(float)) – y values of the input time series.
yerr (array(float)) – Measurement errors for y values.
p (int) – The p order of a CARMA(p, q) model.
q (int) – The q order of a CARMA(p, q) model.
n_walkers (int, optional) – Number of MCMC walkers. Defaults to 32.
burn_in (int, optional) – Number of burn in steps. Defaults to 500.
n_samples (int, optional) – Number of MCMC steps to run. Defaults to 2000.
init_param (array(float), optional) – The initial position for the MCMC walker. Defaults to None.

Returns:

The emcee sampler object, followed by the MCMC flatchain with shape (n_walkers*n_samples, dim) and the chain with shape (n_walkers, n_samples, dim) in CARMA space if p > 2, otherwise empty.

Return type:

(object, array(float), array(float))
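
The relation between the two returned shapes can be illustrated with plain NumPy. The array below is random stand-in data, not sampler output, and the percentile summary is just one common way to report results.

```python
import numpy as np

n_walkers, n_samples, dim = 32, 2000, 2
rng = np.random.default_rng(0)

# Stand-in for a sampler's output: one parameter vector per walker per step.
chain = rng.normal(size=(n_walkers, n_samples, dim))

# Flattening merges the walker and step axes into a single sample axis.
flatchain = chain.reshape(-1, dim)

# Median and 16th/84th percentiles per parameter as a simple summary.
p16, p50, p84 = np.percentile(flatchain, [16, 50, 84], axis=0)
```

Here `flatchain` has shape (64000, 2), that is, (n_walkers*n_samples, dim).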

Time series utility functions

A collection of utility functions to assist analysis/simulation of time series data.

eztao.ts.utils.add_season(t, lc_start=0, season_start=90, season_end=270)

Insert seasonal gaps into time series

Parameters:

t (array(float)) – Time stamps of the original time series.
lc_start (float) – Starting day for the output time series, between 0 and 365.25. Defaults to 0.
season_start (float) – Observing season start day within a year. Defaults to 90.
season_end (float) – Observing season end day within a year. Defaults to 270.

Returns:

A 1D array of booleans indicating which data points to keep.
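
A seasonal mask of this kind can be sketched by folding the time stamps onto a 365.25-day year and keeping points that fall inside the observing window. `season_mask` is a hypothetical helper; treating lc_start as an additive offset and the window as half-open are assumptions.

```python
import numpy as np


def season_mask(t, lc_start=0.0, season_start=90.0, season_end=270.0):
    """Boolean mask selecting points inside the yearly observing window.

    Hypothetical helper; the half-open window [start, end) is an assumption.
    """
    t = np.asarray(t, dtype=float)
    day_of_year = (t + lc_start) % 365.25  # fold onto a single year
    return (day_of_year >= season_start) & (day_of_year < season_end)


t_daily = np.arange(0.0, 730.0)    # two years of daily time stamps
keep = season_mask(t_daily)
t_seasonal = t_daily[keep]         # apply the mask to drop the gaps
```

The same mask can be applied to the y values and measurement errors so that all three arrays stay aligned.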

eztao.ts.utils.downsample_byN(t, nObs, seed)

Randomly choose N data points from a given time series.

Parameters:

t (array(float)) – Time stamps of the original time series.
nObs (int) – The number of entries in the final time series.
seed (int) – Random seed for downsampling. Defaults to None.

Returns:

A 1D array of booleans indicating which data points to keep.
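
Random downsampling to a fixed number of points can be sketched as drawing indices without replacement and converting them to a boolean mask. `random_keep_mask` is a hypothetical helper and may not match the library's sampling details.

```python
import numpy as np


def random_keep_mask(t, n_obs, seed=None):
    """Boolean mask that keeps exactly n_obs randomly chosen points of t.

    Hypothetical helper for illustration only.
    """
    t = np.asarray(t)
    rng = np.random.default_rng(seed)
    keep_idx = rng.choice(len(t), size=n_obs, replace=False)  # no duplicates
    mask = np.zeros(len(t), dtype=bool)
    mask[keep_idx] = True
    return mask


t_dense = np.linspace(0.0, 100.0, 1000)
mask = random_keep_mask(t_dense, n_obs=50, seed=3)
t_sparse = t_dense[mask]   # the mask preserves the original time ordering
```

Returning a mask rather than the selected values makes it easy to apply the same selection to several aligned arrays.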

eztao.ts.utils.downsample_byTime(tIn, tOut)

Downsample a time series at desired output time stamps.

Parameters:

tIn (array(float)) – Time stamps of the original time series.
tOut (array(float)) – Time stamps of the output time series.

Returns:

Indices for which the data points should be kept from the original time series. Note that there could be duplicates.

Return type:

array(int)
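
Matching each requested time stamp to the nearest point of the original series explains why duplicates can occur: two output times may share the same closest input time. The sketch below uses np.searchsorted; `nearest_indices` is a hypothetical helper, it assumes the input time stamps are sorted, and nearest-neighbor matching is itself an assumption about the approach.

```python
import numpy as np


def nearest_indices(t_in, t_out):
    """Index of the nearest t_in value for every entry in t_out.

    Hypothetical helper; assumes t_in is sorted in ascending order.
    """
    t_in = np.asarray(t_in, dtype=float)
    t_out = np.asarray(t_out, dtype=float)
    # Candidate neighbor on the right, clipped so that a left neighbor exists.
    right = np.clip(np.searchsorted(t_in, t_out), 1, len(t_in) - 1)
    left = right - 1
    pick_left = (t_out - t_in[left]) <= (t_in[right] - t_out)
    return np.where(pick_left, left, right)


idx = nearest_indices(np.arange(10.0), [0.1, 0.2, 3.6, 9.9])
```

In this example the first two requested times both map to index 0, which is the kind of duplicate mentioned above.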

eztao.ts.utils.median_clip(y, num_sigma=3)

Clip time series using a three-point median filter.

The sigma (standard deviation) for the time series is computed from the median absolute deviation (MAD) to reduce the effect of extreme outliers, where sigma ≈ 1.4826 × MAD. If more than 10% of the data points are removed, the upper bound will be lifted gradually until that fraction drops below 10%.

Parameters:

y (array(float)) – y values of the original time series.
num_sigma (int, optional) – Data points that are more than this number of sigma away from the three-point median will be removed. Defaults to 3.

Returns:

A 1D array of booleans indicating which data points to keep.
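
The clipping procedure described above can be sketched in a few lines of NumPy. `median_clip_sketch` is a hypothetical helper: the edge padding, the use of the MAD of the whole series, and the 0.1-sigma relaxation step are all assumptions made for illustration, not details taken from the EzTao source.

```python
import numpy as np


def median_clip_sketch(y, num_sigma=3.0, max_frac=0.1, step=0.1):
    """Boolean mask of points within num_sigma of a three-point running median.

    Hypothetical helper; padding, MAD definition and step size are assumptions.
    """
    y = np.asarray(y, dtype=float)
    padded = np.pad(y, 1, mode="edge")  # repeat the end points
    med3 = np.median(np.vstack([padded[:-2], padded[1:-1], padded[2:]]), axis=0)
    resid = np.abs(y - med3)
    sigma = 1.4826 * np.median(np.abs(y - np.median(y)))  # robust sigma from MAD
    if sigma == 0:
        return np.ones(len(y), dtype=bool)  # degenerate case: keep everything
    threshold = num_sigma
    keep = resid <= threshold * sigma
    # Relax the threshold until no more than max_frac of the points are removed.
    while (~keep).mean() > max_frac:
        threshold += step
        keep = resid <= threshold * sigma
    return keep


y = np.sin(np.linspace(0.0, 2.0 * np.pi, 100))
y[40] += 10.0                  # inject a single outlier
keep = median_clip_sketch(y)
```

Only the injected outlier is flagged in this example; the smooth sinusoid stays within the threshold everywhere else.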