Module: features.graphs

cesium.features.graphs.amplitude(x)

Half the difference between the maximum and minimum magnitude.

cesium.features.graphs.anderson_darling(x, e)

Anderson-Darling test statistic.

cesium.features.graphs.cad_prob(cads, time)

Given the observed distribution of time lags cads, compute the probability that the next observation occurs within time minutes of an arbitrary epoch.

cesium.features.graphs.delta_t_hist(t[, ...])

Build histogram of all possible |t_i - t_j|'s.

cesium.features.graphs.double_to_single_step(cads)

Ratios (t[i+2] - t[i]) / (t[i+1] - t[i]).

cesium.features.graphs.find_sorted_peaks(x)

Find peaks, i.e. local maxima, of an array.

cesium.features.graphs.flux_percentile_ratio(x, ...)

A ratio of ((50+x) flux percentile - (50-x) flux percentile) / (95 flux percentile - 5 flux percentile), where x = percentile_range/2.

cesium.features.graphs.generate_dask_graph(t, m, e)

cesium.features.graphs.get_fold2P_slope_percentile(...)

Get the alpha-th percentile of the slopes of the period-folded model.

cesium.features.graphs.get_lomb_amplitude(...)

Get the amplitude of the jth harmonic of the ith frequency from a fitted Lomb-Scargle model.

cesium.features.graphs.get_lomb_amplitude_ratio(...)

Get the ratio of the amplitudes of the first harmonic for the ith and first frequencies from a fitted Lomb-Scargle model.

cesium.features.graphs.get_lomb_frequency(...)

Get the ith frequency from a fitted Lomb-Scargle model.

cesium.features.graphs.get_lomb_frequency_ratio(...)

Get the ratio of the ith and first frequencies from a fitted Lomb-Scargle model.

cesium.features.graphs.get_lomb_lambda(...)

Get the regularization parameter of a fitted Lomb-Scargle model.

cesium.features.graphs.get_lomb_rel_phase(...)

Get the relative phase of the jth harmonic of the ith frequency from a fitted Lomb-Scargle model.

cesium.features.graphs.get_lomb_signif(...)

Get the significance (in sigmas) of the first frequency from a fitted Lomb-Scargle model.

cesium.features.graphs.get_lomb_signif_ratio(...)

Get the ratio of the significances (in sigmas) of the ith and first frequencies from a fitted Lomb-Scargle model.

cesium.features.graphs.get_lomb_trend(lomb_model)

Get the linear trend of a fitted Lomb-Scargle model.

cesium.features.graphs.get_lomb_varrat(...)

Get the fraction of the variance explained by the first frequency of a fitted Lomb-Scargle model.

cesium.features.graphs.get_lomb_y_offset(...)

Get the y-intercept of a fitted Lomb-Scargle model.

cesium.features.graphs.get_max_delta_mags(model)

Largest value minus second largest value of fitted Lomb-Scargle model.

cesium.features.graphs.get_medperc90_2p_p(model)

Get ratio of 90th percentiles of residuals for data folded by twice the estimated period and the estimated period, respectively.

cesium.features.graphs.get_min_delta_mags(model)

Second smallest value minus smallest value of fitted Lomb-Scargle model.

cesium.features.graphs.get_model_phi1_phi2(model)

Ratio of distances between the second minimum and first maximum, and the second minimum and second maximum, of the fitted Lomb-Scargle model.

cesium.features.graphs.get_p2p_scatter_2praw(model)

Get ratio of variability (sum of squared differences of consecutive values) of folded and unfolded models.

cesium.features.graphs.get_p2p_scatter_over_mad(model)

Get ratio of variability of folded and unfolded models.

cesium.features.graphs.get_p2p_scatter_pfold_over_mad(model)

Get ratio of median of period-folded data over median absolute deviation of observed values.

cesium.features.graphs.get_p2p_ssqr_diff_over_var(model)

Get sum of squared differences of consecutive values as a fraction of the variance of the data.

cesium.features.graphs.get_qso_log_chi2_qsonu(...)

Natural log of goodness of fit of qso-model given fixed parameters.

cesium.features.graphs.get_qso_log_chi2nuNULL_chi2nu(...)

Natural log of expected chi2/nu for non-qso variable.

cesium.features.graphs.kurtosis(x)

Kurtosis of a dataset.

cesium.features.graphs.lomb_scargle_fast_period(t, m, e)

Fits a simple sinusoidal model.

cesium.features.graphs.lomb_scargle_model(...)

Simultaneous fit of a sum of sinusoids by weighted least squares.

cesium.features.graphs.max_slope(t, x)

Compute the largest rate of change in the observed data.

cesium.features.graphs.maximum(x)

Maximum observed value.

cesium.features.graphs.median(x)

Median of observed values.

cesium.features.graphs.median_absolute_deviation(x)

Median absolute deviation (from the median) of the observed values.

cesium.features.graphs.minimum(x)

Minimum observed value.

cesium.features.graphs.normalize_hist(hist, ...)

Normalize histogram such that integral from t_min to t_max equals 1.

cesium.features.graphs.num_alias(lomb_model)

Here we check for "1-day" aliases in ASAS / Deboss sources.

cesium.features.graphs.p2p_model(x, y, frequency)

Compute features that compare the residuals of data folded by estimated period from Lomb-Scargle model with residuals folded by twice the estimated period.

cesium.features.graphs.peak_bin(peaks, i)

Return the (bin) index of the ith largest peak.

cesium.features.graphs.peak_ratio(peaks, i, j)

Compute the ratio of the values of the ith and jth largest peaks.

cesium.features.graphs.percent_amplitude(x)

Returns the largest distance from the median value, measured as a percentage of the median.

cesium.features.graphs.percent_beyond_1_std(x, e)

Percentage of values more than 1 std. dev. from the weighted average.

cesium.features.graphs.percent_close_to_median(x)

Percentage of values within window_frac*(max(x)-min(x)) of median.

cesium.features.graphs.percent_difference_flux_percentile(x)

Difference between the 95th and 5th percentiles of the data, expressed as a percentage of the median value.

cesium.features.graphs.period_folding(x, y, ...)

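This section is used to calculate Dubath (10. Percentile90:2P/P), which requires regenerating a model using 2P, where P is the originally found period.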

cesium.features.graphs.periodic_model(lomb_model)

Compute features related to the extreme points of the fitted Lomb-Scargle model.

cesium.features.graphs.qso_fit(time, data, error)

Best-fit qso model determined for Sesar Strip82, ugriz-bands (the filter argument defaults to 'g').

cesium.features.graphs.scatter_res_raw(t, m, ...)

From Dubath et al., arXiv:1101.2406v1 (2011-01-12).

cesium.features.graphs.shapiro_wilk(x, e)

Shapiro-Wilk test statistic.

cesium.features.graphs.skew(x)

Skewness of a dataset.

cesium.features.graphs.std(x)

Standard deviation of observed values.

cesium.features.graphs.stetson_j(x[, y, dx, dy])

Robust covariance statistic between pairs of observations x,y whose uncertainties are dx,dy.

cesium.features.graphs.stetson_k(x[, dx])

A robust kurtosis statistic.

cesium.features.graphs.weighted_average(x, e)

Arithmetic mean of observed values, weighted by measurement errors.

amplitude

cesium.features.graphs.amplitude(x)

Half the difference between the maximum and minimum magnitude.
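As a rough illustration of the definition above, the following NumPy sketch computes half the peak-to-peak range. The name amplitude_sketch is hypothetical and this is not necessarily how the library implements the feature.

```python
import numpy as np


def amplitude_sketch(x):
    """Half the peak-to-peak range of x (illustrative, not cesium's code)."""
    x = np.asarray(x, dtype=float)
    return 0.5 * (x.max() - x.min())


mags = [14.2, 14.8, 15.0, 14.5]
amp = amplitude_sketch(mags)  # half of (15.0 - 14.2)
```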

anderson_darling

cesium.features.graphs.anderson_darling(x, e)

Anderson-Darling test statistic.

cad_prob

cesium.features.graphs.cad_prob(cads, time)

Given the observed distribution of time lags cads, compute the probability that the next observation occurs within time minutes of an arbitrary epoch.

delta_t_hist

cesium.features.graphs.delta_t_hist(t, nbins=50, conv_oversample=50)

Build histogram of all possible |t_i - t_j|’s.

For efficiency, we construct the histogram via a convolution of the PDF rather than by actually computing all the differences. For better accuracy we use a factor conv_oversample more bins when performing the convolution and then aggregate the result to have nbins total values.
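The convolution idea can be sketched as follows, under stated assumptions: the times are first binned on a fine grid of nbins * conv_oversample bins, the autocorrelation of those counts gives the number of pairs at each fine-bin lag, and the lags are then summed in groups of conv_oversample. The name delta_t_hist_sketch is hypothetical, the result is left as unnormalized pair counts, and the library's actual binning and normalization may differ.

```python
import numpy as np


def delta_t_hist_sketch(t, nbins=50, conv_oversample=50):
    """Approximate histogram of |t_i - t_j| via autocorrelation of binned times."""
    t = np.asarray(t, dtype=float)
    n_fine = nbins * conv_oversample
    # Fine histogram of the observation times themselves.
    counts, _ = np.histogram(t, bins=n_fine)
    # Autocorrelation: entry k counts pairs whose fine-bin indices differ by k.
    full = np.correlate(counts, counts, mode="full")
    lags = full[n_fine - 1:].astype(float)  # keep lags k = 0 .. n_fine - 1
    # Lag 0 counts ordered pairs including self-pairs; keep distinct unordered pairs.
    lags[0] = (lags[0] - t.size) / 2.0
    # Aggregate each group of conv_oversample fine lags into one coarse bin.
    return lags.reshape(nbins, conv_oversample).sum(axis=1)


rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 100.0, 30))
hist = delta_t_hist_sketch(t, nbins=5, conv_oversample=4)
```

Every distinct pair of observations lands in exactly one lag bin, so the counts sum to N(N-1)/2. The bin assignment is only accurate to within one fine bin, which is why a larger conv_oversample improves accuracy.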

double_to_single_step

cesium.features.graphs.double_to_single_step(cads)

Ratios (t[i+2] - t[i]) / (t[i+1] - t[i]).
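A minimal sketch of the formula above, assuming cads holds the consecutive time lags np.diff(t), so that t[i+2] - t[i] equals cads[i] + cads[i+1]. The name double_to_single_step_sketch is hypothetical and the code follows the docstring formula rather than the library source.

```python
import numpy as np


def double_to_single_step_sketch(cads):
    """Ratios (t[i+2] - t[i]) / (t[i+1] - t[i]) from consecutive lags."""
    cads = np.asarray(cads, dtype=float)
    return (cads[:-1] + cads[1:]) / cads[:-1]


t = np.array([0.0, 1.0, 3.0, 4.0])
cads = np.diff(t)  # consecutive lags: 1, 2, 1
ratios = double_to_single_step_sketch(cads)  # (1 + 2) / 1 and (2 + 1) / 2
```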

find_sorted_peaks

cesium.features.graphs.find_sorted_peaks(x)

Find peaks, i.e. local maxima, of an array. Interior points are peaks if they are greater than both their neighbors, and edge points are peaks if they are greater than their only neighbor. In the case of ties, we (arbitrarily) choose the first index in the sequence of equal values as the peak. Returns a list of tuples (i, x[i]) of peak indices i and values x[i], sorted in decreasing order by peak value.
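The rules above can be sketched in plain Python. Runs of equal values are collapsed to their first index, which implements the tie-breaking rule, and each run is then compared with its neighbors. The name find_sorted_peaks_sketch is hypothetical, and treating a constant or single-element input as having no peaks is an assumption of this sketch.

```python
def find_sorted_peaks_sketch(x):
    """Return [(i, x[i]), ...] for local maxima, sorted by decreasing value."""
    x = list(x)
    if not x:
        return []
    # Collapse runs of equal values, remembering the first index of each run.
    runs = [(0, x[0])]
    for i in range(1, len(x)):
        if x[i] != runs[-1][1]:
            runs.append((i, x[i]))
    if len(runs) == 1:
        return []  # assumption: a flat sequence has no peaks
    peaks = []
    for k, (i, v) in enumerate(runs):
        # Edge runs only need to exceed their single neighbor.
        higher_than_left = k == 0 or v > runs[k - 1][1]
        higher_than_right = k == len(runs) - 1 or v > runs[k + 1][1]
        if higher_than_left and higher_than_right:
            peaks.append((i, v))
    return sorted(peaks, key=lambda p: p[1], reverse=True)


peaks = find_sorted_peaks_sketch([1, 3, 2, 5, 5, 4, 6])
# peaks at index 6 (value 6), index 3 (first of the tied 5s), index 1 (value 3)
```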

flux_percentile_ratio

cesium.features.graphs.flux_percentile_ratio(x, percentile_range, base=10.0, exponent=-0.4)

A ratio of ((50+x) flux percentile - (50-x) flux percentile) / (95 flux percentile - 5 flux percentile), where x = percentile_range/2.

Assumes data is log-scaled; by default we assume inputs are scaled as x=10^(-0.4*y), corresponding to units of magnitudes. Computations are performed on the corresponding linear-scale values.
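To make the log-to-linear conversion concrete, here is a hedged sketch that assumes the inputs are magnitudes and that the linear-scale flux is base ** (exponent * x). The name flux_percentile_ratio_sketch is hypothetical and the library's exact percentile handling may differ.

```python
import numpy as np


def flux_percentile_ratio_sketch(x, percentile_range, base=10.0, exponent=-0.4):
    """Ratio of a central flux percentile range to the 5-95 percentile range."""
    # Convert log-scaled inputs (e.g. magnitudes) to linear-scale flux.
    flux = base ** (exponent * np.asarray(x, dtype=float))
    half = percentile_range / 2.0
    lo, hi = np.percentile(flux, [50.0 - half, 50.0 + half])
    p5, p95 = np.percentile(flux, [5.0, 95.0])
    return (hi - lo) / (p95 - p5)


rng = np.random.default_rng(1)
mags = rng.normal(15.0, 0.3, 500)
r50 = flux_percentile_ratio_sketch(mags, 50)  # 25th to 75th percentile range
r90 = flux_percentile_ratio_sketch(mags, 90)  # same range as the denominator
```

With percentile_range=90 the numerator and denominator cover the same percentiles, so the ratio is 1. Narrower ranges give values between 0 and 1.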

generate_dask_graph

cesium.features.graphs.generate_dask_graph(t, m, e)

get_fold2P_slope_percentile

cesium.features.graphs.get_fold2P_slope_percentile(model, alpha)

Get the alpha-th percentile of the slopes of the period-folded model.

get_lomb_amplitude

cesium.features.graphs.get_lomb_amplitude(lomb_model, i, j)

Get the amplitude of the jth harmonic of the ith frequency from a fitted Lomb-Scargle model.

get_lomb_amplitude_ratio

cesium.features.graphs.get_lomb_amplitude_ratio(lomb_model, i)

Get the ratio of the amplitudes of the first harmonic for the ith and first frequencies from a fitted Lomb-Scargle model.

get_lomb_frequency

cesium.features.graphs.get_lomb_frequency(lomb_model, i)

Get the ith frequency from a fitted Lomb-Scargle model.

get_lomb_frequency_ratio

cesium.features.graphs.get_lomb_frequency_ratio(lomb_model, i)

Get the ratio of the ith and first frequencies from a fitted Lomb-Scargle model.

get_lomb_lambda

cesium.features.graphs.get_lomb_lambda(lomb_model)

Get the regularization parameter of a fitted Lomb-Scargle model.

get_lomb_rel_phase

cesium.features.graphs.get_lomb_rel_phase(lomb_model, i, j)

Get the relative phase of the jth harmonic of the ith frequency from a fitted Lomb-Scargle model.

get_lomb_signif

cesium.features.graphs.get_lomb_signif(lomb_model)

Get the significance (in sigmas) of the first frequency from a fitted Lomb-Scargle model.

get_lomb_signif_ratio

cesium.features.graphs.get_lomb_signif_ratio(lomb_model, i)

Get the ratio of the significances (in sigmas) of the ith and first frequencies from a fitted Lomb-Scargle model.

get_lomb_trend

cesium.features.graphs.get_lomb_trend(lomb_model)

Get the linear trend of a fitted Lomb-Scargle model.

get_lomb_varrat

cesium.features.graphs.get_lomb_varrat(lomb_model)

Get the fraction of the variance explained by the first frequency of a fitted Lomb-Scargle model.

get_lomb_y_offset

cesium.features.graphs.get_lomb_y_offset(lomb_model)

Get the y-intercept of a fitted Lomb-Scargle model.

get_max_delta_mags

cesium.features.graphs.get_max_delta_mags(model)

Largest value minus second largest value of fitted Lomb-Scargle model.

get_medperc90_2p_p

cesium.features.graphs.get_medperc90_2p_p(model)

Get ratio of 90th percentiles of residuals for data folded by twice the estimated period and the estimated period, respectively.

get_min_delta_mags

cesium.features.graphs.get_min_delta_mags(model)

Second smallest value minus smallest value of fitted Lomb-Scargle model.

get_model_phi1_phi2

cesium.features.graphs.get_model_phi1_phi2(model)

Ratio of distances between the second minimum and first maximum, and the second minimum and second maximum, of the fitted Lomb-Scargle model.

get_p2p_scatter_2praw

cesium.features.graphs.get_p2p_scatter_2praw(model)

Get ratio of variability (sum of squared differences of consecutive values) of folded and unfolded models.

get_p2p_scatter_over_mad

cesium.features.graphs.get_p2p_scatter_over_mad(model)

Get ratio of variability of folded and unfolded models.

get_p2p_scatter_pfold_over_mad

cesium.features.graphs.get_p2p_scatter_pfold_over_mad(model)

Get ratio of median of period-folded data over median absolute deviation of observed values.

get_p2p_ssqr_diff_over_var

cesium.features.graphs.get_p2p_ssqr_diff_over_var(model)

Get sum of squared differences of consecutive values as a fraction of the variance of the data.

get_qso_log_chi2_qsonu

cesium.features.graphs.get_qso_log_chi2_qsonu(qso_model)

Natural log of goodness of fit of qso-model given fixed parameters.

get_qso_log_chi2nuNULL_chi2nu

cesium.features.graphs.get_qso_log_chi2nuNULL_chi2nu(qso_model)

Natural log of expected chi2/nu for non-qso variable.

kurtosis

cesium.features.graphs.kurtosis(x)

Kurtosis of a dataset. Approximately 0 for Gaussian data.

lomb_scargle_fast_period

cesium.features.graphs.lomb_scargle_fast_period(t, m, e)

Fits a simple sinusoidal model

y(t) = A sin(2*pi*w*t + phi) + c

and returns the estimated period 1/w. Much faster than fitting the full multi-frequency model used by features.lomb_scargle.
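For intuition only, the single-sinusoid model can be fitted by brute force: for each trial frequency w, the model is linear in the coefficients of sin, cos, and a constant, so a weighted least-squares solve gives the best fit, and the frequency with the smallest chi-squared is kept. This is a plain grid search, not the fast algorithm the function name refers to. The name sinusoid_period_sketch and the explicit freqs grid are assumptions of this sketch.

```python
import numpy as np


def sinusoid_period_sketch(t, m, e, freqs):
    """Grid search for the period of y(t) = A sin(2*pi*w*t + phi) + c."""
    t, m, e = (np.asarray(a, dtype=float) for a in (t, m, e))
    w = 1.0 / e  # weight each residual by the inverse measurement error
    best_freq, best_chi2 = None, np.inf
    for f in freqs:
        arg = 2.0 * np.pi * f * t
        # A sin(arg + phi) + c is linear in (a, b, c): a sin(arg) + b cos(arg) + c.
        design = np.column_stack([np.sin(arg), np.cos(arg), np.ones_like(t)])
        coef, *_ = np.linalg.lstsq(design * w[:, None], m * w, rcond=None)
        chi2 = np.sum(((m - design @ coef) * w) ** 2)
        if chi2 < best_chi2:
            best_freq, best_chi2 = f, chi2
    return 1.0 / best_freq


rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 50.0, 200))
m = 1.5 * np.sin(2 * np.pi * t / 2.5 + 0.3) + 10.0 + rng.normal(0, 0.05, t.size)
e = np.full_like(t, 0.05)
period = sinusoid_period_sketch(t, m, e, np.linspace(0.05, 2.0, 4000))
```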

lomb_scargle_model

cesium.features.graphs.lomb_scargle_model(time, signal, error, sys_err=0.05, nharm=8, nfreq=3, tone_control=5.0, normalize=False, default_order=1, freq_grid=None)

Simultaneous fit of a sum of sinusoids by weighted least squares.

y(t) = Sum_k Ck*t^k + Sum_i Sum_j A_ij sin(2*pi*j*fi*(t-t0)+phi_j), i=[1,nfreq], j=[1,nharm]

Parameters:
time : array_like

Array containing time values.

signal : array_like

Array containing data values.

error : array_like

Array containing measurement error values.

sys_err : float, optional

Defaults to 0.05.

nharm : int, optional

Number of harmonics to fit for each frequency. Defaults to 8.

nfreq : int, optional

Number of frequencies to fit. Defaults to 3.

tone_control : float, optional

Defaults to 5.0.

normalize : boolean, optional

Whether to normalize the time series before fitting. This can help with instabilities seen at large values of nharm. Defaults to False.

detrend_order : int, optional

Order of polynomial to fit to the data while fitting the dominant frequency. Defaults to 1.

freq_grid : dict or None, optional

Grid parameters to use. If None, then calculate the frequency grid automatically. To supply the grid, keys [“f0”, “fmax”] are expected. “f0” is the smallest frequency in the grid and “df” is the difference between grid points. If “df” is given (grid spacing) then the number of frequencies is calculated. If “numf” is given then “df” is inferred.

Returns:
dict

Dictionary containing fitted parameter values. Parameters specific to each fitted frequency are stored in a list of dicts at model_dict['freq_fits'], each of which contains the output of fit_lomb_scargle(…).
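The model equation for lomb_scargle_model can be evaluated directly once the coefficients are known. The sketch below does so for user-supplied arrays. The function name evaluate_model_sketch, its argument layout, and the per-(i, j) phase array are assumptions for illustration; they do not describe the structure of the returned dictionary.

```python
import numpy as np


def evaluate_model_sketch(t, t0, trend_coeffs, freqs, amps, phases):
    """Evaluate y(t) = Sum_k Ck*t^k + Sum_i Sum_j A_ij sin(2*pi*j*fi*(t-t0) + phi)."""
    t = np.asarray(t, dtype=float)
    amps = np.asarray(amps, dtype=float)      # shape (nfreq, nharm)
    phases = np.asarray(phases, dtype=float)  # shape (nfreq, nharm), assumed layout
    y = np.zeros_like(t)
    # Polynomial trend term.
    for k, c_k in enumerate(trend_coeffs):
        y += c_k * t ** k
    # Harmonics j = 1..nharm of each frequency f_i.
    for i, f_i in enumerate(freqs):
        for j in range(1, amps.shape[1] + 1):
            y += amps[i, j - 1] * np.sin(
                2.0 * np.pi * j * f_i * (t - t0) + phases[i, j - 1]
            )
    return y


t = np.linspace(0.0, 10.0, 5)
y = evaluate_model_sketch(
    t, t0=0.0, trend_coeffs=[1.0, 0.5], freqs=[0.2], amps=[[2.0]], phases=[[0.0]]
)
```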

max_slope

cesium.features.graphs.max_slope(t, x)

Compute the largest rate of change in the observed data.

maximum

cesium.features.graphs.maximum(x)

Maximum observed value.

median

cesium.features.graphs.median(x)

Median of observed values.

median_absolute_deviation

cesium.features.graphs.median_absolute_deviation(x)

Median absolute deviation (from the median) of the observed values.

minimum

cesium.features.graphs.minimum(x)

Minimum observed value.

normalize_hist

cesium.features.graphs.normalize_hist(hist, total_time)

Normalize histogram such that integral from t_min to t_max equals 1. cf. np.histogram(..., density=True).
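A minimal sketch of this normalization, assuming equal-width bins that together span total_time. The name normalize_hist_sketch is hypothetical.

```python
import numpy as np


def normalize_hist_sketch(hist, total_time):
    """Scale bin counts so that sum(density * bin_width) equals 1."""
    hist = np.asarray(hist, dtype=float)
    bin_width = total_time / hist.size  # assumption: equal-width bins
    return hist / (hist.sum() * bin_width)


density = normalize_hist_sketch([2, 3, 5], total_time=30.0)
integral = density.sum() * (30.0 / 3)  # density times bin width, summed
```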

num_alias

cesium.features.graphs.num_alias(lomb_model)

Here we check for “1-day” aliases in ASAS / Deboss sources.

p2p_model

cesium.features.graphs.p2p_model(x, y, frequency)

Compute features that compare the residuals of data folded by estimated period from Lomb-Scargle model with residuals folded by twice the estimated period.

peak_bin

cesium.features.graphs.peak_bin(peaks, i)

Return the (bin) index of the ith largest peak. Peaks is a list of tuples (i, x[i]) of peak indices i and values x[i], sorted in decreasing order by peak value.

peak_ratio

cesium.features.graphs.peak_ratio(peaks, i, j)

Compute the ratio of the values of the ith and jth largest peaks. Peaks is a list of tuples (i, x[i]) of peak indices i and values x[i], sorted in decreasing order by peak value.
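Given a peaks list in the format described above, both peak helpers reduce to simple lookups. The sketch below assumes i and j are zero-based positions in the sorted list and returns NaN when a requested peak does not exist. Both choices, and the _sketch names, are assumptions.

```python
import numpy as np


def peak_bin_sketch(peaks, i):
    """Index (bin) of the i-th largest peak, or NaN if there is none."""
    return peaks[i][0] if len(peaks) > i else np.nan


def peak_ratio_sketch(peaks, i, j):
    """Ratio of the values of the i-th and j-th largest peaks, or NaN."""
    if len(peaks) > max(i, j) and peaks[j][1] != 0:
        return peaks[i][1] / peaks[j][1]
    return np.nan


peaks = [(6, 6.0), (3, 5.0), (1, 3.0)]  # (index, value), sorted by value
second_bin = peak_bin_sketch(peaks, 1)       # index of the second-largest peak
ratio = peak_ratio_sketch(peaks, 1, 0)       # 5.0 / 6.0
```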

percent_amplitude

cesium.features.graphs.percent_amplitude(x, base=10.0, exponent=-0.4)

Returns the largest distance from the median value, measured as a percentage of the median.

Assumes data is log-scaled; by default we assume inputs are scaled as x=10^(-0.4*y), corresponding to units of magnitudes. Computations are performed on the corresponding linear-scale values.

percent_beyond_1_std

cesium.features.graphs.percent_beyond_1_std(x, e)

Percentage of values more than 1 std. dev. from the weighted average.
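One way to read this definition is sketched below. It assumes inverse-variance weights 1/e^2 for the weighted average, uses the unweighted standard deviation of x as the threshold, and reports the result as a fraction between 0 and 1. All three choices, and the name percent_beyond_1_std_sketch, are assumptions not confirmed by this page.

```python
import numpy as np


def percent_beyond_1_std_sketch(x, e):
    """Fraction of values farther than one std. dev. from the weighted mean."""
    x = np.asarray(x, dtype=float)
    e = np.asarray(e, dtype=float)
    weighted_mean = np.average(x, weights=1.0 / e ** 2)  # assumed weighting
    return np.mean(np.abs(x - weighted_mean) > np.std(x))


x = [10.0, 10.0, 10.0, 10.0, 14.0]
e = [1.0, 1.0, 1.0, 1.0, 1.0]
frac = percent_beyond_1_std_sketch(x, e)  # only the value 14.0 is beyond 1 std
```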

percent_close_to_median

cesium.features.graphs.percent_close_to_median(x, window_frac=0.1)

Percentage of values within window_frac*(max(x)-min(x)) of median.
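A short sketch of the window test, returning a fraction between 0 and 1. The name percent_close_to_median_sketch and the strict inequality at the window edge are assumptions.

```python
import numpy as np


def percent_close_to_median_sketch(x, window_frac=0.1):
    """Fraction of values within window_frac * (max - min) of the median."""
    x = np.asarray(x, dtype=float)
    window = window_frac * (x.max() - x.min())
    return np.mean(np.abs(x - np.median(x)) < window)


x = [0.0, 4.8, 5.0, 5.2, 10.0]
frac = percent_close_to_median_sketch(x)  # window is 1.0 around the median 5.0
```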

percent_difference_flux_percentile

cesium.features.graphs.percent_difference_flux_percentile(x, base=10.0, exponent=-0.4)

Difference between the 95th and 5th percentiles of the data, expressed as a percentage of the median value. See Eyer (2005) arXiv:astro-ph/0511458v1, Evans & Belokurov (2005) (there the 98th and 2nd percentiles are used).

Assumes data is log-scaled; by default we assume inputs are scaled as x=10^(-0.4*y), corresponding to units of magnitudes. Computations are performed on the corresponding linear-scale values.

period_folding

cesium.features.graphs.period_folding(x, y, dy, lomb_model, sys_err=0.05)

This section is used to calculate Dubath (10. Percentile90:2P/P), which requires regenerating a model using 2P, where P is the originally found period.

NOTE: this essentially runs everything a second time, so makes feature generation take roughly twice as long.

periodic_model

cesium.features.graphs.periodic_model(lomb_model)

Compute features related to the extreme points of the fitted Lomb-Scargle model.

qso_fit

cesium.features.graphs.qso_fit(time, data, error, filter='g', mag0=19.0, sys_err=0.0, return_model=False)

Best-fit qso model determined for Sesar Strip82, ugriz-bands (the filter argument defaults to 'g'). See additional notes for underlying code qso_engine.

Input:

  time - measurement times [days]

  data - measured magnitudes in single filter (also specified)

  error - uncertainty in measured magnitudes

Output:

  chi^2/nu - classical variability measure

  chi^2_qso/nu - fit statistic

  chi^2_qso/nu_NULL - expected fit statistic for non-qso variable

  signif_qso - significance chi^2/nu<chi^2/nu_NULL (rule out false alarm)

  signif_not_qso - significance chi^2/nu>1 (rule out qso)

  signif_vary - significance that source is variable at all

  class - source type (ambiguous, not_qso, qso)

  model - time series prediction for each datum given all others (iff return_model==True)

  dmodel - model uncertainty, including uncertainty in data

Note on use (i.e., how class is defined):

  1. signif_vary < 3: ambiguous, else

  2. signif_qso > 3: qso, else

  3. signif_not_qso > 3: not_qso
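The three-step rule above can be written as a short decision function. The name classify_sketch is hypothetical, and the final fallback to "ambiguous" when none of the three conditions holds is an assumption, since the note does not state that case.

```python
def classify_sketch(signif_vary, signif_qso, signif_not_qso):
    """Apply the class rule in order: ambiguous, then qso, then not_qso."""
    if signif_vary < 3:
        return "ambiguous"
    if signif_qso > 3:
        return "qso"
    if signif_not_qso > 3:
        return "not_qso"
    return "ambiguous"  # assumption: unspecified case treated as ambiguous


label_a = classify_sketch(signif_vary=2.0, signif_qso=10.0, signif_not_qso=0.0)
label_b = classify_sketch(signif_vary=5.0, signif_qso=4.0, signif_not_qso=6.0)
label_c = classify_sketch(signif_vary=5.0, signif_qso=1.0, signif_not_qso=6.0)
```

Because the checks run in order, a source with low signif_vary is labeled ambiguous even if signif_qso is high, and qso takes precedence over not_qso when both exceed 3.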

scatter_res_raw

cesium.features.graphs.scatter_res_raw(t, m, e, lomb_model)

From Dubath et al., arXiv:1101.2406v1 (2011-01-12).

Scatter: res/raw

Median absolute deviation (MAD) of the residuals (obtained by subtracting model values from the raw light curve) divided by the MAD of the raw light-curve values around the median.
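The MAD ratio described above can be sketched with NumPy as follows. Here the model values are passed in directly as an array rather than derived from lomb_model, and the names mad_sketch and scatter_res_raw_sketch are hypothetical.

```python
import numpy as np


def mad_sketch(v):
    """Median absolute deviation of v around its own median."""
    v = np.asarray(v, dtype=float)
    return np.median(np.abs(v - np.median(v)))


def scatter_res_raw_sketch(m, model_values):
    """MAD of the residuals divided by MAD of the raw light curve."""
    m = np.asarray(m, dtype=float)
    residuals = m - np.asarray(model_values, dtype=float)
    return mad_sketch(residuals) / mad_sketch(m)


m = [1.0, 2.0, 3.0, 4.0, 5.0]
model_values = [1.1, 1.9, 3.0, 4.2, 4.8]
ratio = scatter_res_raw_sketch(m, model_values)
```

A ratio well below 1 indicates that the model accounts for most of the scatter in the raw light curve.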

shapiro_wilk

cesium.features.graphs.shapiro_wilk(x, e)

Shapiro-Wilk test statistic.

skew

cesium.features.graphs.skew(x)

Skewness of a dataset. Approximately 0 for Gaussian data.

std

cesium.features.graphs.std(x)

Standard deviation of observed values.

stetson_j

cesium.features.graphs.stetson_j(x, y=[], dx=0.1, dy=0.1)

Robust covariance statistic between pairs of observations x,y whose uncertainties are dx,dy. If y is not given, calculates a robust variance for x.

stetson_k

cesium.features.graphs.stetson_k(x, dx=0.1)

A robust kurtosis statistic.

weighted_average

cesium.features.graphs.weighted_average(x, e)

Arithmetic mean of observed values, weighted by measurement errors.