Service Interruption (Resolved)

Jan 15 2020, 11:56 PM

Today, one of our network providers, Level3, had a catastrophic network gear failure at their datacenter in Los Angeles, CA. This resulted in 10 full hours of downtime while we tried to communicate with the datacenter and waited on their resolution. We're deeply sorry for this downtime and will be reevaluating our services with them going forward. Here is a brief timeline of events.

  • Approx. 1pm: network services go out, and the datacenter is notified.
  • 2pm: the datacenter's network operations center works to locate the root cause.
  • 3pm: the datacenter is aware of the issue and suspects a network-wide DDoS attack.
  • 4pm: the datacenter notifies us that it is not an attack but a hardware failure, and that the equipment will be replaced.
  • 5pm-7pm: the datacenter continues to work on the issue, promising to move traffic to a working system quickly.
  • 8pm: the traffic migration is reportedly "completed", but our services are still affected.
  • 9pm-11pm: we contact the datacenter for elevated assistance, as our services are still not working. They state that they still cannot give an ETA and that the failed equipment has still not been replaced, so we are attempting emergency measures to route traffic.
  • 3am CST: the datacenter notifies us that a 'workaround' has finally been applied to restore service to those affected. As a result of this repair, email services are back online and service to domain owners has been restored.

As we are taking a different route as an emergency measure:

-> If you have a domain name, it will not work at this time; only *.b1.jcink.com and *.jcink.net URLs are in service. (See the 3am CST update.)

-> The service may not be available to you right away, since the routing change doesn't take effect immediately. For most it will take about an hour; for others it can take up to 24 hours.

-> Since this is an emergency measure, we will have to switch back at some point, and we will post a notice when that is planned.

-> Email services will not function at this time, since email depends on the affected system. (See the 3am CST update.)

We are very disappointed in this entire experience with the datacenter, which we have used with very few problems since 2013. We apologize for the massive frustration this caused today; as of now, it is still not over. We will continue to keep everyone updated via Twitter and every other avenue we can. Your patience and support during this time are very much appreciated.

Comments

  1. John Says:

    Per the datacenter, their network was repaired at 3AM CST. We evaluated their network status from that point and, after testing, decided to transition back from our emergency solution. We seamlessly transitioned back to the datacenter's network at 4pm CST and are closely monitoring it; traffic flow is normal at this time. The engineers at the datacenter told us that no further interruptions are planned, and that any future interruption would be scheduled. We will roll back to the emergency network setup if the need arises, but as of now everything is stable.

    Your patience throughout this process has been greatly appreciated, and we apologize for all of the frustration this event has caused. We will always keep everyone as informed as we can regarding any significant service impact, both now and in the future. Going forward, we will also reevaluate our provider and consider other options should the need arise.