Amazon’s small typo causes huge Canvas shutdown


Amazon’s system outage made national news this past week, but few people realize how much it affected Bellevue College’s students and staff. At around 10:30 a.m. on Feb. 28, Canvas began gradually locking people out.

The website was not completely down until around 11:00 a.m., and the outage lasted until about 2:00 p.m.

Manager of Technical Support Services Jamie Osborne explained that Amazon was not solely responsible for the shutdown, as Instructure was also involved. Instructure is a company that works very closely with Amazon and is the reason Amazon is connected to Canvas. Instructure works with both websites and many more from its headquarters in Virginia. According to Osborne, “somebody made a small typo. That’s all it was.” And yet that small typo ended up causing Canvas, a website used by almost every BC teacher and their students, to malfunction.

Even though Amazon forbade its employees from speaking specifically on the subject, one employee, speaking on condition of anonymity, stated, “We have a cloud server service that many companies use. Somebody types something wrong in the code and it brought down lots of businesses.”

The actual coding error was made by an Amazon employee, so the protocol was for Instructure to give Amazon 24 hours to fix the mistake. Amazon met this deadline, finding and fixing the problem in three and a half hours and getting everything back online and in running order without Instructure needing to intervene.

Mistakes do happen, however, and Instructure has three backup plans in place in the form of three tiers of support. Each tier has a different response time when something like this happens, and each tier is more expensive than the one below it. Canvas and all other websites under Instructure’s wing have the first tier of support. After this past Tuesday, Bellevue College employees have suggested upgrading to the second tier.

“Your failover time should be a little more rapid than 24 hours,” Osborne said.

When asked about the outage and Amazon’s place in it, Osborne said, “we hold Instructure accountable for all outages.” Yet because we are working with the same company that works with much larger websites, our success rate is bound to be much higher. According to the Director of Technical Support Services Jason Aqui, “since 2012 this is only the second outage,” which corresponds to all Instructure websites being up and running 99 percent of the time. That’s an extremely high rate, and it’s likely due to the fact that Instructure works with clients “from Netflix to thousands of other companies.” Netflix, being a website that’s used around the world, needs to be running nearly constantly and therefore invests in the company that will make this possible.

“Technology fails,” said Aqui, and stated that what matters is “what happens when it does fail.”
