Automation behaviour of rule triggers after incidents which have caused trigger failure

Hi everyone,

I'm here because of the recent incident on 08/07 Dec. 2021, which caused problems for automation rules.

In my case specifically, it caused one of my rules not to trigger on time. The rule was scheduled to create a task on 08.12.2021 at 12:00 AM (midnight). The task was not created, and when I checked when the next rule execution was expected, it showed next week, 15.12.2021 (it's a weekly Wednesday rule). In addition, there was no audit log entry showing whether the rule ran or failed.

I checked all of my automation rules because of this, panicking that this might in the future cause us to miss tasks. While I was doing that, the problematic rule triggered at 11AM and created the task...

So here's my question: was this trigger run manually by Jira as a correction for all the problems caused by the outage, or is there a cron service that existed before this outage and makes sure that all rules are run after an outage?

Basically, I want to know whether there is a contingency in place to run these failed rules in case another outage like this happens, so I can sleep at night :D

Thanks in advance!

-Nev

Answer accepted

Hi Nev,

My experience is that when Automation comes back up, it does indeed catch up on all of the automations that should have fired. So, in effect, they are just delayed but not lost.

Hi John,

Thanks, that sounds great! Can we be certain it will work every time, though?

The reason I'm asking is that a single missed task creation would be business-critical for us, as we rely heavily on automated quarterly/annual/biennial task creation to keep things like business continuity, backups, etc. going.

Have there been any cases where this catching up of Automation failed? What's the logic behind it, and is there anything we can do to decrease the chance of failure (for example, setting the trigger time during business hours rather than at midnight)?

Appreciate the swift reply!

Same here - they are critical for us. There have been multiple outages over the months and they have always ended up running when things got cleared up.

I can’t speak to that as a guarantee but that’s my experience.

As far as I know, Automation puts every triggered execution into a queue even if it cannot start it right away. For example, if it can execute at most N automations in parallel and there is a spike, all automations above N are inserted into the queue and wait.

When a new "worker" becomes available to execute a rule, it picks an item from the queue and executes it. And so on.

This is a standard scalability and resiliency pattern.
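To illustrate the general pattern, here is a minimal Python sketch of a bounded worker pool draining a queue. It is not Atlassian's implementation, which I have no visibility into. The names `run_rules` and `MAX_PARALLEL` and the rule names are all made up for the example.

```python
import queue
import threading

MAX_PARALLEL = 3  # hypothetical value for the limit "N"


def run_rules(rule_names, max_parallel=MAX_PARALLEL):
    """Queue every triggered rule, then let a fixed pool of workers drain the queue."""
    pending = queue.Queue()
    for name in rule_names:
        pending.put(name)  # queued even if no worker is free yet

    executed = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                name = pending.get_nowait()  # pick the next waiting rule
            except queue.Empty:
                return  # nothing left to do, worker exits
            with lock:
                executed.append(name)  # stand-in for actually running the rule
            pending.task_done()

    workers = [threading.Thread(target=worker) for _ in range(max_parallel)]
    for t in workers:
        t.start()
    for t in workers:
        t.join()
    return executed


if __name__ == "__main__":
    # A spike of 10 triggers with only 3 workers: the rest wait, none are dropped.
    done = run_rules([f"rule-{i}" for i in range(10)])
    print(len(done))
```

The point of the sketch is only that items above the parallel limit wait in the queue instead of being discarded. Whether queued items survive a real outage depends on how the queue is persisted, which this example does not model.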

Greetings all!

An FYI on what I am reading in this thread:

The Atlassian support team told our company (for a prior outage ticket) that "catching up" on scheduled and triggered rules is subject to the severity and specifics of the outage. The expectation is that rules may eventually run (as Aron notes for queued events) or miss a schedule/trigger, and there is no expectation of when those triggers will happen after the outage.

Kind regards,
Bill
