Product | Cloud costs | October 26, 2018 | 3 min read

A Time Series Machine Learning Model for Canary Deployments


Problem Statement

Demonstrate the time series machine learning model we use for canary analysis.

Model

Given two time series of equal length (e.g., from canary phases), sampled at the same frequency, detect the following:

  • They are similar if the time series patterns are similar and the values are within an acceptable deviation range.
  • They are dissimilar if the patterns are different, or the values fall outside an acceptable deviation range. The acceptable deviation range is something the model infers from the training data.
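The model we apply later in this post is based on SAX and an HMM. As background, SAX (Symbolic Aggregate approXimation) discretizes a z-normalized series into a short symbol string using breakpoints that divide the standard normal distribution into equiprobable regions, which makes pattern comparison tractable. This is a minimal background sketch, not the implementation used in the experiments below; the function name and parameters are illustrative.

```python
import numpy as np

def sax(series, word_len=6, breakpoints=(-0.4307, 0.4307)):
    """Minimal SAX sketch: z-normalize, piecewise-aggregate into word_len
    segments, then map each segment mean to a symbol ('a', 'b', or 'c')
    using standard-normal breakpoints for an alphabet of size 3."""
    x = np.asarray(series, dtype=float)
    x = (x - x.mean()) / (x.std() + 1e-12)        # z-normalize
    segments = np.array_split(x, word_len)        # piecewise aggregation
    means = np.array([seg.mean() for seg in segments])
    symbols = np.digitize(means, breakpoints)     # 0, 1, or 2 per segment
    return "".join("abc"[s] for s in symbols)
```

For example, a steadily increasing ramp maps to a string that climbs through the alphabet, while two series with similar shapes map to similar strings.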

Dataset

We generate a synthetic dataset inspired by the UCI Synthetic Control dataset (see the screenshot below), which is commonly used for validating time series models in the academic community.

The time series within this dataset follow a normal pattern with no clear trend. Each series can be written as y(t) = m + s, where s captures the variation, or noise, drawn from a normal distribution s ~ N(0, 1), and m is the level. To introduce anomalies into the dataset, we randomly vary m between 1 and a specified upper limit. The implementation is given by the code snippet below.

import numpy as np

def normal(n_samples=100, t_samples=30, m_max=1):
    # Generate n_samples series of length t_samples, each with a random
    # level m in [1, m_max] plus standard normal noise. Note that
    # np.random.randint excludes its upper bound (and requires
    # high > low), so we pass m_max + 1 to include m_max.
    data = []
    for i in range(n_samples):
        m = np.random.randint(1, m_max + 1)
        sample = m + np.random.normal(0, 1, t_samples)
        data.append(sample)
    return data
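For reference, a self-contained run of the generator might look like the following. The snippet restates the function above (with the randint upper bound made inclusive, so the m_max=1 default does not error) purely so the example is runnable on its own.

```python
import numpy as np

def normal(n_samples=100, t_samples=30, m_max=1):
    # Level m drawn uniformly from 1..m_max, plus N(0, 1) noise.
    data = []
    for _ in range(n_samples):
        m = np.random.randint(1, m_max + 1)
        data.append(m + np.random.normal(0, 1, t_samples))
    return data

# One dataset of 30 time series, each 30 points long, with m in 1..5.
data = np.array(normal(n_samples=30, t_samples=30, m_max=5))
print(data.shape)  # (30, 30)
```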

We generate multiple datasets with 30 time series each. Each dataset is generated by varying the range of values for the variable m from 1 to m_max. The higher the range, the higher the probability of finding dissimilar time series in the dataset.

Methodology

For each dataset, we compare all pairs of time series (including each series with itself), amounting to 30 × 30 = 900 comparisons.

We plot the percentage of dissimilarities detected in each dataset against the upper limit of the variable m (m_max) for that dataset. We expect the percentage of pairs detected as dissimilar to grow with m_max and taper off at some point.

The code snippet for this is given below. The key line is the call to our SAX HMM model, SAXHMMDistanceFinder.

n_samples = 30
n_comparison = n_samples * n_samples
for m_max in range(2, 30, 1):
    data = np.array(normal(n_samples=n_samples, m_max=m_max))
    error = 0.
    for i in range(n_samples):
        test = data[i, :]
        for j in range(n_samples):
            control = data[j, :]
            # The call to our SAX HMM model
            sdf = SAXHMMDistanceFinder(control, test)
            result = sdf.compute_dist()
            if result['risk'] == 1:
                error += 1
    print(m_max, error / n_comparison)
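SAXHMMDistanceFinder is our internal model, so the loop above cannot be run as-is outside Harness. To illustrate the experimental harness end to end, here is the same sweep with a stand-in comparator that simply flags a pair as dissimilar when the mean levels differ by more than a threshold; the comparator, its threshold, and the resulting rates are illustrative only and will not match the SAX HMM numbers reported below.

```python
import numpy as np

rng = np.random.default_rng(0)

def normal(n_samples=30, t_samples=30, m_max=2):
    # Same generator as above: level m drawn from 1..m_max, plus N(0,1) noise.
    return np.array([rng.integers(1, m_max + 1) + rng.normal(0, 1, t_samples)
                     for _ in range(n_samples)])

def stand_in_dissimilar(control, test, max_offset=1.5):
    # Stand-in for SAXHMMDistanceFinder: flag a pair as dissimilar when
    # the mean levels differ by more than an acceptable deviation.
    return abs(control.mean() - test.mean()) > max_offset

rates = {}
n_samples = 30
for m_max in (2, 5, 15):
    data = normal(n_samples=n_samples, m_max=m_max)
    errors = sum(stand_in_dissimilar(data[i], data[j])
                 for i in range(n_samples) for j in range(n_samples))
    rates[m_max] = errors / (n_samples * n_samples)

# The dissimilarity rate should grow as m_max widens the range of levels.
```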

Results

[Figure: percentage of time series pairs labeled dissimilar vs m_max]
We see a 75% dissimilarity rate at m_max = 5, reaching its peak by m_max = 15. As expected, the percentage of pairs labeled dissimilar grows with m_max and tapers off at about 93%.

Conclusion

The results showcase the SAX HMM machine learning model for time series canary analysis. The dataset was synthetically generated, much like the UCI Synthetic Control dataset, and the dissimilarity detection rate grows with m_max, as expected.

Reference: Synthetic Control Chart Time Series, Dr. Robert Alcock.

Thanks for reading!
Sriram
