Incidents | Sirv
Incidents reported on the status page for Sirv: https://status.sirv.com/

Sirv API recovered (Tue, 20 May 2025 00:52:58 +0000)

Sirv API went down (Tue, 20 May 2025 00:47:02 +0000)

Control panel at my.sirv.com recovered (Wed, 02 Apr 2025 14:39:30 +0000)

Control panel at my.sirv.com went down (Wed, 02 Apr 2025 14:30:28 +0000)

Control panel at my.sirv.com recovered (Wed, 02 Apr 2025 08:12:14 +0000)

Control panel at my.sirv.com went down (Wed, 02 Apr 2025 08:06:22 +0000)

Control panel at my.sirv.com recovered (Tue, 01 Apr 2025 15:41:44 +0000)

Control panel at my.sirv.com went down (Tue, 01 Apr 2025 15:35:43 +0000)

Control panel at my.sirv.com recovered (Tue, 01 Apr 2025 11:32:35 +0000)

Control panel at my.sirv.com went down (Tue, 01 Apr 2025 11:23:41 +0000)

Control panel at my.sirv.com recovered (Tue, 01 Apr 2025 00:59:21 +0000)

Control panel at my.sirv.com went down (Tue, 01 Apr 2025 00:53:22 +0000)

Some CDN requests failing in California (Mon, 31 Mar 2025 15:16:00 -0000)
https://status.sirv.com/incident/537441
The issue has been resolved. It was caused by packet loss affecting our new Los Angeles datacenter. The datacenter traced the problem to its upstream provider, Arelion, which was experiencing network instability, and resolved it by temporarily disabling Arelion as an upstream provider and rerouting traffic through other providers. We require our datacenters to identify and reroute traffic automatically and quickly in such scenarios. However, the resolution of this issue was not fast enough, and given that it occurred soon after adopting this new datacenter, we have decided to cease using it. All traffic is now being routed from our original Los Angeles datacenter (as of 7 April 2025).
Some CDN requests failing in California (Mon, 31 Mar 2025 14:32:00 -0000)
https://status.sirv.com/incident/537441
Some requests from California are slow and a small number (under 2%) are failing. This is being investigated and is likely related to new CDN servers that were deployed today in Los Angeles.

Some uploads failing (Sun, 16 Mar 2025 15:25:00 -0000)
https://status.sirv.com/incident/529019
The issue has been resolved. It was caused by two servers in one cluster failing simultaneously, which impacted the uploading of new files to about 20% of Sirv accounts. Uploads are designed to continue as normal when one server is down, but when two servers are down, some uploads can fail, which is what happened. The servers failed due to an exporter service that stopped running on one server, then the other, and errors prevented the services from restarting. A second issue then made the resolution take much longer than expected, because the servers could not be rebooted. This was due to an outdated BIOS. Once the BIOS had been updated, the servers were rebooted and the underlying cause was resolved. To prevent this from happening again, we have implemented a new BIOS management process with our datacenter. We are also shortening our hardware refresh cycle, for more frequent hardware upgrades. To reduce the chance of the exporter service failure recurring, we have disabled a process and are monitoring the server metrics.

Some uploads failing (Sun, 16 Mar 2025 08:40:00 -0000)
https://status.sirv.com/incident/529019
This issue remains in progress. The cause has been identified and we are working to resolve it as soon as possible.
Some uploads failing (Sat, 15 Mar 2025 17:56:00 -0000)
https://status.sirv.com/incident/529019
Some file uploads are not completing due to a server cluster issue. The cause is being investigated. The issue is affecting a small number of accounts. If your file upload fails, please wait until this issue has been resolved.

502 responses for some files (Wed, 26 Feb 2025 09:32:00 -0000)
https://status.sirv.com/incident/522075
The issue was resolved within 3 minutes. The cause was the rebooting of multiple servers at the same time, when only 1 server should have been rebooted. This was an oversight in our standard operating procedures. We are investigating ways to automatically prevent this from recurring.

502 responses for some files (Wed, 26 Feb 2025 09:29:00 -0000)
https://status.sirv.com/incident/522075
For a 3-minute period between 09:29 and 09:32 UTC, some requests were not returned. This issue affected only certain requests and only certain accounts. Most accounts were unaffected.

File uploads interrupted (Thu, 19 Dec 2024 18:09:00 -0000)
https://status.sirv.com/incident/484452
The issue has been resolved. An unusually large number of servers failed at the same time. Sirv is designed to continue normal operation during a server outage and all systems did continue as normal, but some file uploads failed. We are considering what changes could be made to assist in such situations in future.
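The 502 incident of 26 February was caused by multiple servers rebooting at once, when standard procedure called for one at a time. One way to enforce such a procedure automatically is a rolling guard that refuses to touch the next server until the previous one is healthy again. The sketch below is purely illustrative and not Sirv's actual mechanism; the `reboot` and `healthy` callables are hypothetical stand-ins for real ops tooling.

```python
import time

def rolling_reboot(hosts, reboot, healthy, timeout=600, poll=5):
    """Reboot hosts strictly one at a time.

    reboot(host) and healthy(host) are caller-supplied callables
    (hypothetical stand-ins for real ops tooling). The next host is
    only rebooted once the previous one reports healthy again, so a
    procedural slip can never take down more than one server at once.
    """
    for host in hosts:
        reboot(host)
        deadline = time.monotonic() + timeout
        while not healthy(host):
            if time.monotonic() > deadline:
                raise RuntimeError(f"{host} did not recover; aborting rollout")
            time.sleep(poll)
```

The key design point is the health gate between steps: an operator (or script) physically cannot start reboot number two while reboot number one is still in flight.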
File uploads interrupted (Thu, 19 Dec 2024 14:20:00 -0000)
https://status.sirv.com/incident/484452
A server fault is causing some file uploads to fail for a limited number of accounts. The issue is being resolved. If you experience failed uploads, please pause and try again later.

Datacentre network maintenance (Fri, 06 Dec 2024 06:36:58 -0000)
https://status.sirv.com/incident/473047
Scheduled maintenance on part of Sirv's network will take place during a 2-hour window between 0330 and 0530 UTC on 6 December 2024. The actual duration of impact is expected to be less than 60 minutes. The purpose is to upgrade infrastructure at Sirv's primary datacenter. During the scheduled period:
1. Some accounts won't be able to upload files.
2. Some accounts may not see some files in my.sirv.com.
3. Some accounts won't be able to generate new images.
4. Some API requests may fail.
All existing files will be delivered as normal from the CDN. We recommend not attempting to upload files during the period or you may receive an error.

Datacentre network maintenance (Fri, 06 Dec 2024 05:30:00 +0000)
https://status.sirv.com/incident/473047
Maintenance completed.

Datacentre network maintenance (Thu, 05 Dec 2024 05:42:00 -0000)
https://status.sirv.com/incident/473046
The maintenance has been completed. All core services continued operating as normal during this period. If you were not seeing files in your my.sirv.com control panel, you will see them now.

Datacentre network maintenance (Thu, 05 Dec 2024 04:49:00 -0000)
https://status.sirv.com/incident/473046
Scheduled maintenance on part of Sirv's network will take place during a 2-hour window between 0330 and 0530 UTC on 5 December 2024. The actual duration of impact is expected to be less than 60 minutes. The maintenance is to datacentre infrastructure, not to Sirv servers. All core Sirv services are expected to continue as normal: file serving, file processing, file uploads, CDN delivery. During the scheduled period, some customers may not see their files listed in my.sirv.com, but the files will exist and will be processed and served as normal.

Datacentre network maintenance (Wed, 04 Dec 2024 16:22:00 -0000)
https://status.sirv.com/incident/473058
The maintenance has been completed.

Datacentre network maintenance (Wed, 04 Dec 2024 04:09:00 -0000)
https://status.sirv.com/incident/473058
Scheduled maintenance on part of Sirv's network will take place during a 2-hour window between 0330 and 0530 UTC on 4 December 2024. The actual duration of impact is expected to be minimal. The purpose is to upgrade infrastructure at Sirv's primary datacenter. Primary cluster and Riak will be down: no UI, no uploads, no FTP, no REST API, and logs will be down. Some accounts will switch to failover. During the scheduled period:
1. The control panel at my.sirv.com may be unavailable.
2. Some API requests may fail.
3. File uploads may fail.
4. Some accounts won't be able to generate new images.
All files served from the CDN will be delivered as normal. We recommend not uploading files during the period or you may receive an error.

Elevated 504 responses for API upload requests (Fri, 15 Nov 2024 04:42:00 -0000)
https://status.sirv.com/incident/461633
The issue has been fully resolved. It started when heavy image upscaling API requests overloaded the upload API, causing some API requests to return 504 errors. This was solved, but a knock-on effect caused approximately 5% of API requests to fail with a 502 response. To prevent this issue from happening again, improvements have been made to reduce the load of image upscaling and to better distribute load. Changes are also being made to logging and alerting, for faster identification of such scenarios. The greatest impact of this issue was on the Sirv WordPress plugin. The plugin was sending an API request on all admin pages, which blocked the page from loading until the response had been received - typically less than 300ms, but up to 20 seconds if API requests timed out.
Those API requests were not required, so a new WordPress plugin version, v7.3.3, was released at 1805 UTC (14 November) to remove such requests and prevent this issue from recurring: https://wordpress.org/plugins/sirv/
If you use the Sirv API, we recommend writing your scripts to retry requests that receive a 502 response. Recommendations for handling 502 and other error responses are provided here: https://sirv.com/help/articles/sirv-rest-api/#error-handling-guide

Elevated 504 responses for API upload requests (Thu, 14 Nov 2024 19:06:00 -0000)
https://status.sirv.com/incident/461633
The API upload overload is almost fully resolved.

Elevated 504 responses for API upload requests (Thu, 14 Nov 2024 15:23:00 -0000)
https://status.sirv.com/incident/461633
Changes have been made and the issue is almost resolved. The majority of upload requests are successful.

Elevated 504 responses for API upload requests (Thu, 14 Nov 2024 13:19:00 -0000)
https://status.sirv.com/incident/461633
There continues to be heavy load on the API from upload requests. A contributing factor is the large number of AI image upscaling requests, which are taking longer than usual to process. The delay caused by the queue of requests is causing some API file upload requests to fail with a 504 response.

Elevated 504 responses for API upload requests (Thu, 14 Nov 2024 09:52:00 -0000)
https://status.sirv.com/incident/461633
An elevated number of 504 timeouts are being returned for API file upload requests.
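The advice above, to retry API requests that receive a 502 (or 504) response, can be sketched as a small helper. This is an illustrative pattern, not part of the Sirv API: `with_retries` and its parameters are made up for this example, `send` stands in for whatever performs the actual HTTP request, and the linked error-handling guide remains the authoritative reference.

```python
import random
import time

def with_retries(send, max_attempts=5, base_delay=0.5, retry_statuses=(502, 503, 504)):
    """Call send() until it returns a non-retryable status or attempts run out.

    send is a zero-argument callable returning a response object with a
    .status_code attribute (e.g. a lambda wrapping an HTTP upload call).
    Retries use exponential backoff with a little jitter, the usual
    pattern for transient 502/504 responses.
    """
    for attempt in range(max_attempts):
        response = send()
        if response.status_code not in retry_statuses:
            return response
        if attempt < max_attempts - 1:
            # back off: 0.5s, 1s, 2s, ... plus up to 250 ms of jitter
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.25))
    return response  # still failing after all attempts; surface the last response
```

A caller would wrap its upload in a closure, e.g. `with_retries(lambda: session.post(upload_url, data=payload))`, so transient gateway errors are absorbed instead of surfacing as failed uploads.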
Elevated 404 responses for 12 accounts (Thu, 22 Feb 2024 12:06:00 -0000)
https://status.sirv.com/incident/330584
The issue has been resolved. At 1052 UTC, about 11% of requests to a small number of accounts (12 accounts, out of 50,000+) were returning 404 responses (file not found) when they should have returned 200 responses (OK). When the Sirv team identified the issue, they worked quickly to solve it, and by 1206 UTC all such requests were returning the normal 200 response. The issue occurred due to a misconfiguration by Sirv engineers while they were configuring a new storage cluster shard: some requests were going to the new shard before it was ready to receive them. Classified as severity level 3: Minor impact.

Elevated 404 responses for 12 accounts (Thu, 22 Feb 2024 10:52:00 -0000)
https://status.sirv.com/incident/330584
Some files are not being returned for a small number of accounts. This issue is being addressed urgently.

Elevated errors in some CDN locations (Fri, 05 Jan 2024 21:07:00 -0000)
https://status.sirv.com/incident/309134
The issue has been resolved. It was caused by the failure of a software daemon simultaneously on multiple servers. This daemon is a lightweight and highly efficient service manager that had never failed before. During the period, between 3% and 58% of HTTPS requests returned an error, depending on which of the 25 CDN locations received the request. We are now investigating why this happened and will be taking actions to prevent this from impacting service availability in the future. Classified as severity level 2: Meaningful impact.
Elevated errors in some CDN locations https://status.sirv.com/incident/309134 Fri, 05 Jan 2024 20:21:00 -0000 https://status.sirv.com/incident/309134#d5fe1590f7e72f2f44053996bee77826b2bf5b438fbf894f0cc0130cd3c6aa73 A high number of requests to Sirv's 25 CDN locations are returning an error. Our engineers are working quickly to identify and resolve the issue. Elevated rate of 415 errors https://status.sirv.com/incident/296342 Thu, 21 Dec 2023 10:05:00 -0000 https://status.sirv.com/incident/296342#8f83495777a388ec1b89924c741fefd5286e5bad55b71cf2b9332067632c6b90 More than 99% of the remaining truncated files have also now been restored. Any impacted user has been provided with an exact list of files that were impacted. There is no residual impact remaining from this issue. Classified as severity level 1: Critical impact. Elevated rate of 415 errors https://status.sirv.com/incident/296342 Wed, 06 Dec 2023 19:47:00 -0000 https://status.sirv.com/incident/296342#afae594595f68cdbe19e8b3db5350d42aa1a5211a7a41ee8e1592f8bb44419a6 The issue that caused the elevated rate of 415 errors has been resolved. The truncated files were caused by a software issue that wrote some files with zero bytes. As it was software, this caused the truncated files to be replicated across all versions of the master file. Sirv triplicates every file to protect against hardware issues (server or hard drive failures), however, the updated files passed all checks and faithfully triplicated the files. It existed without detection because cached versions of the files were used for processing and delivery. All systems worked perfectly, until yesterday's infrastructure upgrade when cached files were cleared. The root cause was solved by identifying and correcting the software issue that caused truncation. This was completed by 1500 UTC on 5 December. Then the task was to restore those files that had been truncated. Across the platform, approximately 0.2% of files had been affected. 
Of those files, we were able to restore between 95% and 100%, depending on the account. Any account which had less than 100% restoration is being sent a list of zero-byte files that should be restored by uploading the files again. Please look out for our email with a list of files. Please message us via the following URL if you need any help: https://sirv.com/help/support/#support A thorough investigation of the issue is underway, which will be reported in detail by email. Within the report will be a list of actions being taken to prevent any possible repeat of this issue, along with improvements and new features that are already being designed. We apologise to all customers who were impacted by this issue. It is the first time master data has been lost. We will learn every possible lesson from it and make Sirv even stronger and reliable than before. Elevated rate of 415 errors https://status.sirv.com/incident/296342 Wed, 06 Dec 2023 12:12:00 -0000 https://status.sirv.com/incident/296342#0ce9e24f73a7be560a9abe7737a84a2019709e5997fe75335f4e18ddc4880f34 More files have been restored. The rate of 415 errors continues to fall. You can see a list of 415 errors on this page of your account: https://my.sirv.com/#/analytics/errors/415?from=20231206&to=20231206 You can upload files to solve any 415 errors in your files. We can provide a list of zero byte files if you need - please ask the Sirv support team via this form: https://sirv.com/help/support/#support We are continuing to restore as many files as possible. Elevated rate of 415 errors https://status.sirv.com/incident/296342 Wed, 06 Dec 2023 01:48:00 -0000 https://status.sirv.com/incident/296342#0e4f9bba5503e370c65fa943692f87ffa1e7e8d458156f5fbd166d3de0501a64 We are continuing to recover the small number of remaining truncated files. This is estimated to require a further 12-24 hours. If you have any files that are not being served, you can solve this straight away by reuploading them. 
We are creating a tool to generate a list of zero byte files. If you'd like a list of any zero byte files in your account, please submit a request at https://sirv.com/help/support/#support and we'll prepare a list once the tool is ready. Regardless, we will continue work to recover all files. If you upload files using Sirv's Autofetch feature, you can simply delete the zero byte files and they'll be re-fetched from your remote location upon the next request. Elevated rate of 415 errors https://status.sirv.com/incident/296342 Tue, 05 Dec 2023 23:33:00 -0000 https://status.sirv.com/incident/296342#0780eeb5397f8dfbd61f175ede1291eb3e92c5177208d5616ddd0751a2eddbf9 Approximately 80% of truncated files have been recovered. The number of 415 errors have fallen significantly, though they remain elevated. To fully resolve the problem, the remaining truncated files need to be replaced with the original file. We are working on ways to recover those remaining files, though they are hard to obtain and it may take hours or days. For the fastest possible resolution, it is recommended that you re-upload the files. Such files will show in your account as having 0 bytes. Start at the following URL, to see a list of all files uploaded in the last 30 days: https://my.sirv.com/#/search/?created=30d Find any zero byte files by sorting the results by Size, Ascending. Another update will be posted here once we have news on the remaining files. Elevated rate of 415 errors https://status.sirv.com/incident/296342 Tue, 05 Dec 2023 22:56:00 -0000 https://status.sirv.com/incident/296342#b1b607af2b1c008998bbae2948f8342ca666aac10f4e8b2c043c138f21be06c3 The 415 error issue was caused by some files becoming truncated, meaning they lost part of their contents. These files show in the control panel at my.sirv.com as having 0 bytes. Files uploaded today were more likely to become truncated, though it seems that truncation has been happening to files for a long time. 
This was not identified because they performed entirely as normal, due to caching of non-truncated versions. The truncated file issue only became noticeable today, after an infrastructure upgrade. Server caches were cleared and the truncated versions became apparent. The cause of truncation is believed to have been a software bug that has now been resolved. We are currently working to restore truncated files - a process that may take from hours to days. If you are able to reupload any files that are not currently loading, please do that now. The issue is estimated to be affecting about ~0.2% of files. Elevated rate of 415 errors https://status.sirv.com/incident/296342 Tue, 05 Dec 2023 18:32:00 -0000 https://status.sirv.com/incident/296342#c9d08dd6e0810c9603055cb74c8f1de32d5992a2f2c552a9bd11cc1f7602fb05 We are continuing to investigate the cause of this issue. The nature of it is complex. It represents itself as a 415 error, from files of zero bytes. There is also an elevated though lower rate of 500 errors. The majority of requests are successful 200 responses. Our engineers are working hard to understand and resolve this. Elevated rate of 415 errors https://status.sirv.com/incident/296342 Tue, 05 Dec 2023 15:03:00 -0000 https://status.sirv.com/incident/296342#374edf33bc3b48a634a2118a6f914a08658f299e7059f43c0b03830bf7f4a47f An elevated level of 415 errors is occurring on the CDN. Our engineers are investigating the cause. Elevated rate of 429 errors https://status.sirv.com/incident/295487 Sat, 02 Dec 2023 10:31:00 -0000 https://status.sirv.com/incident/295487#a2cc0abae5e7af790fe76be719d13a6aa381903c546b00ac504a960654999db7 The issue was resolved at 1031 UTC. The cause was a server misconfiguration which made the Sydney CDN location see all requests coming from the same IP address, instead of the true client IP address. This triggered DDOS protection, to rate-limit the excessive number of requests from the same IP. 
429 responses were returned for many of the requests. The issue appeared to have been an extremely short issue when the Sydney datacentre network became unavailable. That issue lasted approximately 1 minute only. However, the actual incident commenced immediately after our team had restored the Sydney CDN location, which is when the server misconfiguration was applied. The issue was solved once we discovered the misconfiguration and corrected it. Multiple actions are being taken to prevent this issue from recurring and to reduce the impact of similar but different issues in future: - Additional monitoring and alerting - New issue submission form - Context sensitive resolution advice - Wider internal alerting Only the Sydney CDN location was affected. Classified as severity level 3: Minor impact. Elevated rate of 429 errors https://status.sirv.com/incident/295487 Sat, 02 Dec 2023 04:19:00 -0000 https://status.sirv.com/incident/295487#8bab38de149693ef95e6941ca02a07707ff7bcec7cad23cff122fcd229259ed8 From 0419 UTC, an elevated level of 429 errors were returned from the Sydney CDN location. API searches failing https://status.sirv.com/incident/291643 Thu, 23 Nov 2023 10:35:00 -0000 https://status.sirv.com/incident/291643#7f035692e540a71251b8f9f4ce22ab412805a94660ecdfd33507fb4dc52a35b4 At 1035 UTC on 23 November, the upgrade was complete and searches resumed normal operation. There was no downtime and all systems functioned as normal, except that API search requests had been returning error responses. The issue was not identified because other API requests were returning as normal, so total API error responses were only slightly elevated and no alerts were triggered. The issue was identified after the event, after a report by a user prompted an investigation. To automatically identify such issues in future, we are evaluating ways to monitor the search API specifically. 
We will also be updating the Sirv WordPress plugin to show a clearly-described message in case of a failed search. Classified as severity level 4: Minimal impact. API searches failing https://status.sirv.com/incident/291643 Wed, 22 Nov 2023 18:30:00 -0000 https://status.sirv.com/incident/291643#800aba4fa13317ab477236a11eb3b2459d3315cb81267225fcf1c4a0f7a64660 At approximately 1830 UTC on 22 November, we began migrating the primary file database to a new, more powerful cluster. While the upgrade took place, API search queries were failing. Control panel at my.sirv.com loading slower than normal https://status.sirv.com/incident/286637 Wed, 15 Nov 2023 14:09:00 -0000 https://status.sirv.com/incident/286637#cdcb090827fef26871ae8c4399d33b952d597b4ce6412da5f37ab4efee30d956 The new server was deployed on the evening of 14 November and gradually filled its caches. The control panel has now returned to its original fast performance. Additional servers are now being upgraded, so the control panel will load faster still over the coming 2 weeks. Classified as severity level 4: Minimal impact. Control panel at my.sirv.com loading slower than normal https://status.sirv.com/incident/286637 Tue, 14 Nov 2023 21:33:00 -0000 https://status.sirv.com/incident/286637#ccf259d137e972251a05b40e398ba368da229a33a06e6338a464acbb4f91f057 We are waiting for provisioning of a new server to add to the cluster, after which UI navigation will become much faster. Washington D.C. CDN requests loading slowly https://status.sirv.com/incident/277693 Tue, 14 Nov 2023 13:37:00 -0000 https://status.sirv.com/incident/277693#859167c1cfa9469ff566297c7026c5d95059042876ef2902cc42a92080dac30e The detailed investigation into this issue found two contributing factors. The first was an update to AWS Route53, which appears to have been implemented by AWS on 17 October. This change was not made by Sirv to its Route53 rules but was a change within the Route53 service itself. 
The Los Angeles POP had slightly different routing logic to other POPs and the AWS update caused routing to behave differently, with traffic becoming routed to the Washington DC fallback. This went unnoticed because Washington DC was able to handle the additional load: the higher load was still within its capacity and requests were returned successfully. The second contributing factor was higher than normal traffic during peak US trading hours on 23 and 24 October. The two factors combined caused the Washington DC hard disks to return requests significantly slower than normal. Since the issue occurred, we have taken multiple actions to prevent it from repeating; to mitigate similar possible incidents; and to accelerate resolution time in the event of an incident. Actions include: - New routing logic has been applied. - Washington DC capacity has been increased. - Reserve capacity has been increased in all other CDN locations. - New tolerance introduced for a CDN location to be removed from routing. - Additional disk monitoring, to help early preparation of capacity upgrades. - HDDs are being replaced with SSDs. - Updated SOP for our support team to investigate and respond to similar issues. Classified as severity level 3: Minor impact. Control panel at my.sirv.com loading slower than normal https://status.sirv.com/incident/286637 Mon, 13 Nov 2023 20:47:00 -0000 https://status.sirv.com/incident/286637#05b2a5259ce2119f883975d93113a848e7acfdd63ae56d59657fc18a33fb9646 The file management tools at my.sirv.com are loading slowly due to a failed server. File listing and file searching is taking from 2 to 20 seconds, whereas it normally takes 1 to 4 seconds. The failed server is being replaced and we expect speed to return to normal within 72 hours. This will be followed by a major upgrade of the entire control panel server cluster over the next 3 months. Washington D.C. 
CDN requests loading slowly https://status.sirv.com/incident/277693 Wed, 25 Oct 2023 05:48:00 -0000 https://status.sirv.com/incident/277693#061f56cc4602e5f94738da82a66740b5990a7e8a804c01bd536263dae30af59b The issue was resolved by 06:48 UTC on 25 October. Investigation points towards two causes - the spike was caused by Sirv's DNS provider routing Los Angeles POP requests to the Washington D.C. POP instead. This would have been fine in isolation but the datacentre appears to have throttled the bandwidth of the receiving servers, causing the slow responses. The investigation is ongoing, has not reached definitive conclusions yet and requires collaboration with our DNS and datacentre providers. Actions are already being taken to prevent any possible repeat of this issue. A fully detailed report will be provided here later. Washington D.C. CDN requests loading slowly https://status.sirv.com/incident/277693 Mon, 23 Oct 2023 13:10:00 -0000 https://status.sirv.com/incident/277693#859305f0431f2ed40b35a5c635516883ab76ef7bb4157a1ab6109124e335c77d A dramatic increase in requests to the Washington D.C. 
CDN location occurred at 14:10 UTC on 23 October, causing a small proportion of requests to be returned slowly - greater than 2 seconds. The issue subsided over the next 3 hours, though average response time remained elevated. The issue returned the following day at 14:32 UTC on 24 October, more severely than before, causing 17% of requests to load slowly - greater than 2 seconds and some as long as 30 seconds. The issue impacted two of Sirv's 25 CDN locations. Some API upload requests failing https://status.sirv.com/incident/210599 Mon, 22 May 2023 15:06:00 -0000 https://status.sirv.com/incident/210599#21d468723cd1ee0d8e851c9624a28ad715982395fdf22b264fe7247996bfeab9 The issue resolved itself within 9 minutes once upload volume normalised. During the period, approximately 98% of uploads succeeded and 2% failed. REST API uploads are limited per hour, to the amount specified on the Usage page of your Sirv account https://my.sirv.com/#/account/usage. If you use the REST API for uploading images, best practice is to check the response for each upload request and if you receive an error response (400, 401, 429 etc.) then pause uploads and retry later. 
Refer to the docs here: https://sirv.com/help/articles/sirv-rest-api/#error-handling-guide Classified as severity level 4: Minimal impact. Some API upload requests failing https://status.sirv.com/incident/210599 Mon, 22 May 2023 14:57:00 -0000 https://status.sirv.com/incident/210599#87a425ab5505502069bed6f860575b2a30988179a7fc72fb1379c7416b0df005 The earlier issue of high upload usage returned, with a small number of API uploads failing. Some API upload requests failing https://status.sirv.com/incident/210578 Mon, 22 May 2023 12:37:00 -0000 https://status.sirv.com/incident/210578#ba9a647a7c6a86800c5d004886613eb169490c0d77ed1b8e8bf29786b5e4ed8c This issue resolved itself once the load on the upload API started falling. During the period, approximately 97% of uploads succeeded and 3% failed. REST API uploads are limited per hour, to the amount specified on the Usage page of your Sirv account https://my.sirv.com/#/account/usage. If you use the REST API for uploading images, best practice is to check the response for each upload request and if you receive an error response (400, 401, 429 etc.) then pause uploads and retry later. Refer to the docs here: https://sirv.com/help/articles/sirv-rest-api/#error-handling-guide Classified as severity level 4: Minimal impact. Some API upload requests failing https://status.sirv.com/incident/210578 Mon, 22 May 2023 11:31:00 -0000 https://status.sirv.com/incident/210578#d684febf3133b39176d59e64fbbf7e6b72ef15c43d1432f66363e1b086bfc4f6 A small number of REST API upload requests are failing due to very high load. Some Los Angeles CDN POP requests failing https://status.sirv.com/incident/201273 Thu, 27 Apr 2023 23:03:00 -0000 https://status.sirv.com/incident/201273#d7d795e9d541e9921fa1b8864ca09a8cd69aab833f362ab9a0666add6b0cee10 The issue was automatically resolved within 3 minutes. Classified as severity level 4: Minimal impact. 
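The retry guidance in the upload incidents above (check each upload response; on an error such as 400, 401 or 429, pause uploads and retry later) can be sketched in a few lines of Python. This is a minimal illustration, not official Sirv client code: the set of statuses treated as retryable and the backoff values are assumptions, and `do_upload` stands in for whatever function performs the actual REST API upload request.

```python
import time

# Assumed set of retryable statuses, based on the examples (400, 401,
# 429 etc.) given in the incident notes above.
RETRYABLE = {400, 401, 429, 500, 502, 503, 504}

def upload_with_retry(do_upload, max_retries=5, base_delay=30, sleep=time.sleep):
    """Call do_upload() until it returns HTTP 200, pausing with
    exponential backoff whenever a retryable error status is returned."""
    delay = base_delay
    for _ in range(max_retries):
        status = do_upload()
        if status == 200:
            return True          # upload succeeded
        if status not in RETRYABLE:
            return False         # unexpected status: give up immediately
        sleep(delay)             # pause uploads, then retry later
        delay *= 2               # back off further after each failure
    return False                 # exhausted all retries
```

Making `sleep` injectable lets the backoff behaviour be verified without real waiting; in production you would pass the real upload call and leave `sleep` at its default.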
Some Los Angeles CDN POP requests failing https://status.sirv.com/incident/201273 Thu, 27 Apr 2023 23:00:00 -0000 https://status.sirv.com/incident/201273#ba7eaea74b3012a27b70161d8a7e278638d8446e167c0b640f8176d5847c6d9e One of the Los Angeles CDN servers became unhealthy and started returning 4xx errors at 0000 UTC. It was removed from routing after 90 seconds and within a further 60 seconds, all traffic was being routed to healthy servers. Some London CDN POP requests failing https://status.sirv.com/incident/185707 Thu, 16 Mar 2023 16:22:00 -0000 https://status.sirv.com/incident/185707#fe2295292ddcd8248641979eeb5e0e495a99cf4633af1472ddb774e6270ec402 Some requests received by one of the London CDN servers failed due to a combination of local DNS resolution failures and intermittent network timeouts. The issue began at 1215 UTC and was resolved at 1332 UTC. During this period, 25% of requests from the UK and Ireland returned a 404 or 500 error. The issue would have been solved sooner, but an unrelated SSL certificate validation failure a few hours earlier had blocked alerts from being received by our monitoring system. Changes have already been deployed to prevent this issue from recurring. The SSL certificate issue has been resolved; a second level of independent monitoring has been configured; and an additional health automation is being trialled that has the potential to resolve such issues automatically. Classified as severity level 3: Minor impact. Some London CDN POP requests failing https://status.sirv.com/incident/185707 Thu, 16 Mar 2023 13:33:00 -0000 https://status.sirv.com/incident/185707#1fb280c104e661e33a3c15bd5776ae02e795d89bd217535fa7e7ff4942296b84 The issue has been resolved and is now being monitored. 
Some London CDN POP requests failing https://status.sirv.com/incident/185707 Thu, 16 Mar 2023 12:17:00 -0000 https://status.sirv.com/incident/185707#52a027b19b50482badc3364e7bb5ff0c0d2d09e9df9c2f74f5b6f9da745837e5 Some requests to the London POP of Sirv CDN are returning errors due to high load. This POP serves the UK and Ireland. We are working on a resolution. Some requests from India CDN return 404 https://status.sirv.com/incident/178357 Thu, 23 Feb 2023 07:55:00 -0000 https://status.sirv.com/incident/178357#3055fe9fca2d523f758ec58f3a179b035b9d59ae81c7798153563a8ff9dde361 The issue was resolved at 0755 UTC. At 0625 UTC, some requests to the Indore CDN POP, which serves India, started to return a 404 response (file not found) instead of a 200 response (OK). The issue occurred 14 hours after an update to all CDN servers worldwide. All POPs appeared to behave healthily, but the update had not deployed successfully to the Indore POP, which eventually started to return errors. Once the issue had been identified, it was resolved by re-deploying the update to the India POP. During the incident, 94% of requests to the POP returned an error. Requests to all other countries were returned as normal. To prevent this issue from recurring, a new automated deployment rule is being created for a part of the deployment process where there had been only a human check. An additional alerting metric is also being created, to identify a situation in which a deployment has failed. We apologise to all customers who were impacted by this incident and we are evaluating other changes that may help prevent any possible repeat of the issue. If a CDN issue like this occurs, one immediate solution within your control is to disable the CDN temporarily in your Sirv settings here: https://my.sirv.com/#/account/settings/domains Classified as severity level 2: Meaningful impact. 
Elevated errors on one London CDN server https://status.sirv.com/incident/165003 Mon, 16 Jan 2023 07:04:00 -0000 https://status.sirv.com/incident/165003#c8621612ef52052dac4bd2d6a5e6c4b137349b2f6b7e68f91ab61b489e1ad021 The server became healthy again and returned to service automatically, only to become unhealthy again after some time and be removed from service. Health checks alternated the server in and out of service a few times, until automatic management was overridden by manual removal at 0704 UTC. The issue was caused by a faulty hard disk. Rather than failing outright, the disk had random periods of errors, then would operate normally. Sirv already employs health checks to identify unhealthy disks but the behaviour of this disk permitted it to pass health checks and resume service. We are reviewing what more can be done to triage disks that behave in such a way. Actions being taken include: 1. Expansion of our proactive disk replacement regime, to handle a wider range of disk behaviours. This regime catches potentially failing disks early, so expanding it to more scenarios will reduce the likelihood of repeat issues. 2. An additional alert has been added to track elevated 5xx errors per CDN server, alongside the existing alert which monitors each CDN POP (each POP has multiple servers). Classified as severity level 3: Minor impact. Some repeat visitors seeing "too many redirects" error https://status.sirv.com/incident/149182 Tue, 27 Dec 2022 06:22:00 -0000 https://status.sirv.com/incident/149182#08759d4290f60dc0d564927a6497f10a40fe948cc82907c9ec1d353e0700618a The issue has been resolved. Chrome browsers started caching 307 Temporary Redirect responses, which caused a local infinite loop on the user's device. 307 responses are temporary, so they should not be cached by browsers. Indeed, Sirv has always used the same 307 logic and responses have never been cached, so this may be a Chrome bug. 
We solved the issue by adding a header explicitly telling browsers not to cache the response. This should also prevent the issue from occurring in future. We estimate that fewer than 1% of requests were impacted, affecting repeat visitors for files that had been previously cached and since deleted from the CDN cache. Classified as severity level 3: Minor impact. Some CDN requests returning 502 responses from Washington DC POP https://status.sirv.com/incident/149189 Mon, 26 Dec 2022 20:55:00 -0000 https://status.sirv.com/incident/149189#7156666649a5e510689045dd3c91717bc88f539b6e3432f13473a4121081f560 The issue has been fully resolved. The errors were returned by the Washington CDN server after its hard drive rapidly reached maximum capacity. Such an issue has not occurred before because each CDN server automatically deletes less-frequently accessed files, to maintain space. However, unusually heavy load of CDN MISSes outpaced the rate at which other files were deleted, causing heavy filesystem load and the failure of some requests. The 502s occurred before the server had detected the matching account, so a count of the exact number cannot be displayed in account analytics. Approximately 117,000 responses were affected, less than 1% of total requests over the period. To prevent this from happening again, the Washington DC location is being upgraded with faster drives and updated cache clearing logic. Classified as severity level 3: Minor impact. Cannot login to my.sirv.com https://status.sirv.com/incident/146795 Tue, 20 Dec 2022 12:24:00 -0000 https://status.sirv.com/incident/146795#9280b86f8808f60097d7a1473029f27aa03079595c9565dc5f8dd20d42247043 The control panel at my.sirv.com is now returning all requests as normal. Classified as severity level 4: Minimal impact. 
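The fix described in the "too many redirects" incident above - a header explicitly telling browsers not to cache the 307 response - can be illustrated with a small Python HTTP handler. This is a hypothetical sketch, not Sirv's actual server configuration; the redirect target is a placeholder.

```python
from http.server import BaseHTTPRequestHandler

class TemporaryRedirectHandler(BaseHTTPRequestHandler):
    """Return a 307 redirect that browsers are explicitly told not to cache."""

    def do_GET(self):
        self.send_response(307)
        # Placeholder target; a real CDN would redirect to the origin/processing URL.
        self.send_header("Location", "https://example.invalid/target" + self.path)
        # The key fix: forbid caching, so a browser cannot replay a stale
        # 307 from its local cache and loop on the user's device.
        self.send_header("Cache-Control", "no-store")
        self.end_headers()

    def log_message(self, *args):
        pass  # suppress per-request logging
```

Because 307 responses are temporary, `Cache-Control: no-store` makes the non-cacheable intent explicit even for clients that would otherwise cache the redirect.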
Cannot login to my.sirv.com https://status.sirv.com/incident/146795 Mon, 19 Dec 2022 16:36:00 -0000 https://status.sirv.com/incident/146795#e68f6cdff77be528f2990d6e11a22a8e6666673813266552c5609f872324a571 Some users may receive an error message when logging in to their account at my.sirv.com. This is due to a server failure. The server is being replaced and normal operation will resume shortly after that. If you need to perform any administration within your account before then, please try again as another attempt may be successful. All image serving and processing continues to work as normal. Empty response returned for some CDN requests https://status.sirv.com/incident/146903 Thu, 15 Dec 2022 18:04:00 -0000 https://status.sirv.com/incident/146903#409f1ccb711129bb051e39f7a4c052d3243cc23b0ee42aec0efd6fb493655d5e The issue was resolved at 1804hrs UTC, after a period of 40 minutes. The cause was a misconfiguration in newly deployed code for improving cache logic. Additional automated and manual checks have been implemented, to prevent this from recurring. Classified as severity level 3: Minor impact. Some API requests failing https://status.sirv.com/incident/147195 Wed, 23 Nov 2022 17:27:00 -0000 https://status.sirv.com/incident/147195#9f05335d16768589d2fb42fb4fa4d3c9f55d3bc6649b724aa53d063f41f144e2 The sporadic API error was resolved. It impacted approximately 5% of API requests. The issue was caused by a misconfiguration when our engineers deployed a minor update to the API. Testing of the deployment by our engineers was cut short by a Russian missile attack on Kyiv, causing electricity and internet to be lost. To prevent a repeat of such an issue, new deployments will now be tested by a secondary team in the UK. We would like to thank all nations who are providing air defence systems to help protect Ukrainian infrastructure and civilians from future attacks. 
Despite Russia's war, it is business-as-usual here at Sirv - during 2022, we have grown our team, expanded our CDN and launched many new features. Thank you for your support! Classified as severity level 4: Minimal impact. Some API requests failing https://status.sirv.com/incident/147195 Wed, 23 Nov 2022 10:28:00 -0000 https://status.sirv.com/incident/147195#2515d254e473549edd3dacdd4cd5229c22d383b3936802a855b965d0c50c944d A small number of requests to api.sirv.com are returning 502 or 504 responses. The issue is being investigated. Some operations failing (upload/copy/move) https://status.sirv.com/incident/85991 Thu, 05 May 2022 13:19:00 -0000 https://status.sirv.com/incident/85991#15d6ffa8b289d7615d4622578214452664d08e5304d3271a4de345d51475e457 The issue has been resolved. It was caused by internal connectivity issues between two servers on one cluster. Recently uploaded files had not been stored correctly, so 415 errors were returned for those new files. Any recently uploaded images with 0 bytes should be reuploaded. Only 6 Sirv accounts were impacted and the account owners have been informed. Fewer than 0.01% of all requests to those accounts were impacted and no other accounts were impacted. Some operations failing (upload/copy/move) https://status.sirv.com/incident/85991 Thu, 05 May 2022 12:56:00 -0000 https://status.sirv.com/incident/85991#071a997be5ce05741ebe0687613d71a453e470aae173cc7349e312ffd8cdd043 The failure is also affecting serving of recently uploaded images. We are continuing to investigate the issue. Some operations failing (upload/copy/move) https://status.sirv.com/incident/85991 Thu, 05 May 2022 11:55:00 -0000 https://status.sirv.com/incident/85991#bcddef92271147375eb9520e9ca0fc0e95eeeb674464c246d38a96e70b276778 A small number of operations are failing for some accounts. This is isolated to one server and we are investigating the issue. It affects file management only - it does not affect file processing or serving. 
New file uploads interrupted https://status.sirv.com/incident/84567 Mon, 27 Dec 2021 12:33:00 -0000 https://status.sirv.com/incident/84567#c0498b0911bb5a2718d00c729ec5184429b3ce93d5ec1a479a91a01abed93654 Two servers failed simultaneously, without triggering a health warning, leading to interrupted file upload capability for approximately 90 minutes for some accounts. During the period, some users may also have experienced reduced ability to browse files at my.sirv.com. The issue was resolved when the servers were removed from the cluster. All core file processing, delivery and CDN services remained fully operational throughout. Additional health warning checks have been planned, to identify and automatically resolve any repeat of such an issue. Some requests slow to load https://status.sirv.com/incident/84568 Wed, 15 Dec 2021 21:03:00 -0000 https://status.sirv.com/incident/84568#29e5c77120d8fd908a54750ce3b4d3e187d4b81dab75eba7d4e400b1067ff275 Sirv's primary image processing datacentre in Germany experienced sporadic degraded performance due to backbone maintenance. Approximately 1% of requests for unprocessed images were slow to load, taking a few seconds to be returned. To minimise the number of slow requests, a higher than usual proportion of requests were routed to Sirv's backup image processing datacentre. Requests fulfilled by the CDN were unaffected, typically returning requests at nominal speeds of 20-100ms. Classified as severity level 4: Minimal impact. my.sirv.com outage https://status.sirv.com/incident/84569 Thu, 02 Sep 2021 10:54:00 -0000 https://status.sirv.com/incident/84569#b586da61138fab23616f95ead1f7eb153ea90ac7a7b348f8ba8e075a849a295d The web app at my.sirv.com was unavailable for 54 minutes due to slow responses. The cause was identified and normal web app access restored. All core services continued operation as normal. 
Upgrade to my.sirv.com control panel https://status.sirv.com/incident/84570 Tue, 20 Jul 2021 09:12:00 -0000 https://status.sirv.com/incident/84570#5da8391811821be974ba0e225f87831cde03af5a19e9eb2a6c9d23f1241438fb The web app at my.sirv.com was unavailable while receiving an upgrade at 10:12 UTC. All core services were available as normal. File uploads/downloads could be performed with the REST API, FTP or S3 if needed. The upgrade was completed after 14 minutes, at 10:26 UTC. File uploads impacted by datacentre network issue https://status.sirv.com/incident/84571 Mon, 12 Jul 2021 10:59:00 -0000 https://status.sirv.com/incident/84571#5825f00247b87fbd40c3591c16ab52b170bd5de26992888d0cc6dc0bc5b50edf A network issue in the primary Sirv datacentre at 11:59 UTC caused some file uploads to fail metadata processing for 16 minutes. This caused 336 newly uploaded files to be unavailable for serving until the network issue was resolved at 12:15 UTC. The files were then rescanned and were successfully served. Datacentre network connectivity issue https://status.sirv.com/incident/84572 Fri, 05 Mar 2021 05:27:00 -0000 https://status.sirv.com/incident/84572#538c216f41cc3f79e5de9d11a3f4f0c727084f2238b4832a9bc0a5f912cd3b98 A network issue in the primary datacentre caused some sporadic issues from 0527 to 0533 UTC and 0548 to 0555 UTC. A very small number of uploads failed; some image processing requests timed out; some non-CDN requests timed out; and my.sirv.com might have shown an inaccurate file count. Once the datacentre resolved the connectivity issue, operations returned to normal. The CDN was unaffected and returned files as normal. Router failure https://status.sirv.com/incident/84573 Thu, 02 Jan 2020 10:42:00 -0000 https://status.sirv.com/incident/84573#e7406b450e17079b986b36af232c3de06012fb3acabf4904beff6f3a1d2633b7 Simultaneous router failures in two datacenters at 10:42 UTC caused a 10 minute outage for new image processing on Business and Free accounts. 
Normal service continued for Enterprise accounts, delivered from Sirv's failover datacenter. Our primary datacentre provider had resolved the issue by 10:52 UTC. About 1 in 5 requests were impacted. The CDN remained operational throughout, successfully returning cached images. Some image processing requests failed for 7 minutes https://status.sirv.com/incident/84576 Thu, 19 Sep 2019 13:48:00 -0000 https://status.sirv.com/incident/84576#4ebddc210d4241e7e368d049af0f602d98e79bfd947950b062fa24f91ea61841 For a 7 minute period between 13:48 and 13:55 UTC, there was a partial outage for new image processing requests. The issue occurred during the rollout of a new feature to whitelist domains (thus blocking requests from other domains). It was caused by a misconfiguration of an updated server module for one of Sirv's image processing clusters. The error was identified and resolved as quickly as possible. During the period, Sirv's other image processing clusters operated as normal and all CDN requests were completed as normal - only fresh requests received by the misconfigured cluster were affected. This affected approximately 1.5% of total requests. We are reviewing our processes to identify what more can be done to avoid such issues in the future. 
Reduced file delivery speed for some accounts https://status.sirv.com/incident/84577 Tue, 06 Aug 2019 17:08:00 -0000 https://status.sirv.com/incident/84577#90abc57fe9c25c7715a2ae60934102e05f06088efa606c8539fb53d43161aea0 A server failure on one of the Sirv storage clusters at 1008hrs UTC caused some Sirv accounts to experience slightly slower uploads, image processing and video-to-spin conversions. The server was taken offline and requests were redirected to healthy servers, so all operations continued as normal. The faulty hardware was replaced and at 1549hrs UTC, all services were operating at full speed again. Delayed response for 2% of processing requests https://status.sirv.com/incident/84578 Mon, 01 Jul 2019 14:43:00 -0000 https://status.sirv.com/incident/84578#a287e00d0f34749ce3e184d73564229648afe331d41da5f2477595a5c7166c92 At 3.43pm UTC on 1 July 2019, a hardware failure on a processing server combined with heavy processing load caused some new image processing requests to take longer than usual to process. This degraded performance affected approximately 2% of new image processing requests over a 3 hour period (those taking more than 10 seconds to process). The faulty hardware was replaced by 5.51pm UTC and average image processing time had been reduced back to normal (150ms) by 6.59pm UTC. All other services continued as normal throughout the incident. The slow requests made up less than 0.2% of all requests during the period - the other 99.8% of requests were returned rapidly as normal. Some URLs are experiencing a redirection issue https://status.sirv.com/incident/84579 Mon, 03 Jun 2019 09:24:00 -0000 https://status.sirv.com/incident/84579#0318196f3ae2173743d17edfd1e3c540cbf766ba70a484a0e443e856052c3b64 At 10.24am UTC, a redirection issue caused some images not to load. The requests which failed had previously been processed, cached, then purged from the cache as part of a cleanup process. 
New requests for those images should cause NGINX to invalidate its cache and request image reprocessing. Due to recent NGINX configuration changes, the cache did not invalidate correctly and reprocessing requests were not sent, causing the error "The page isn't redirecting properly". The issue was resolved at 10.59am UTC by mass purging cache entries for all cleaned processed images. During the period, the majority of requests were successfully returned. To prevent the issue from recurring, additional tests are being implemented for NGINX configuration changes. Additional monitoring alerts are also being designed, for faster issue detection. Heavy processing load caused Magento extension issue https://status.sirv.com/incident/84580 Wed, 17 Apr 2019 14:19:00 -0000 https://status.sirv.com/incident/84580#d5a65744c63f63d14e05f94b039432b0776d436799d9bac113158c364f32b717 Heavy S3 load caused image processing time to rise from the usual 150ms to 250ms at 15.19hrs UTC. Some requests took up to 20 seconds and 1 in 1,200 requests were rejected. It exposed a weakness in the Sirv Magento 2 extension, which incorrectly handled S3 warning messages, preventing images from loading for certain extension versions. The S3 load issue was rapidly identified and resolved by 15.54hrs UTC. The extension weakness has been resolved by the release of a new Magento extension. Router fault resolved https://status.sirv.com/incident/84581 Fri, 31 Aug 2018 12:02:00 -0000 https://status.sirv.com/incident/84581#36b2802de088c3fbe929e8a4a69e68cf0c619331abf185053d51ed03316e4c5a A router fault prevented access to image processing for Business plans for 9 minutes between 13:02hrs and 13:11hrs UTC. Enterprise plans continued as normal. The CDN continued as normal. The control panel at http://my.sirv.com was unavailable. The issue was caused by a router fault at Sirv's primary datacentre, which prevented access to image processing. 
Datacenter issue https://status.sirv.com/incident/84582 Thu, 24 May 2018 08:06:00 -0000 https://status.sirv.com/incident/84582#27ec7cc8f6f0229838efe14b3ce18a50a3df19a195a2a40e677111f09afe5f24 An outage at Sirv's primary datacentre commenced at 09:06 UTC, triggering Sirv to switch to its failover cluster. Normal service continued for all Enterprise plans. Other accounts were temporarily offline. The issue was caused by a primary datacentre power & UPS failure. The UPS issue is under investigation by the datacenter. Normal service was resumed within 19 minutes at 09:25 UTC. During the issue, the CDN continued file delivery, serving 82% of all requests (18% were for non-processed images). 
Datacenter router issue https://status.sirv.com/incident/84584 Thu, 23 Nov 2017 12:51:00 -0000 https://status.sirv.com/incident/84584#16fb05e802f9ac761ce369ead3d622abed53685dafc000623d4549b58b0b31d9 A router fault at Sirv's primary datacenter made the primary cluster inaccessible at 12:51hrs UTC, causing Sirv to switch to its failover cluster. Normal service continued for all Enterprise accounts. Business and Free accounts returned an HTTP 503 status for the period. All services returned to normal after 49 minutes, at 13:40 UTC, once the router fault had been resolved in the primary datacenter. This was a rare issue from our usually reliable datacenter. If you need 100% availability backed by an SLA, talk with us about the failover service that is part of our Enterprise plans.

Image processing outage https://status.sirv.com/incident/84585 Tue, 12 Sep 2017 17:00:00 -0000 https://status.sirv.com/incident/84585#ae5dccb855a0778a3e064adfc1b71ec74a5721a53fb11b1a6b3182753b2a4a08 A software issue caused an 80-minute image processing outage between 18:00 and 19:20hrs UTC. Image delivery continued as normal from the CDN. Uploaded images were received correctly, with zero data loss.
The issue was caused by the search software used by Sirv. Search has been disabled and a new search system is being developed with a separate Solr cluster. File cacheability has been increased.

1% of requests incomplete https://status.sirv.com/incident/84586 Thu, 08 Oct 2015 13:19:00 -0000 https://status.sirv.com/incident/84586#77c9d05f7c19e5943a6355922490ba373676fb4daa1b7a0738e6454185fb512f A software daemon failure caused 47,147 lost requests (out of 7,064,192) over a 26-minute period. Normal service continued for more than 99% of requests. Our support team was immediately alerted to the issue at 14:19hrs UTC and it was resolved by 14:45hrs UTC. The cause of the problem was identified and a solution has been implemented to prevent a recurrence.

FTP inaccessible https://status.sirv.com/incident/84587 Fri, 24 Jul 2015 15:18:00 -0000 https://status.sirv.com/incident/84587#9b7b3c9ac23a7941da01ce93d931ba42cc39a011d76a8230a397ba8e67be7e0d FTP access was unavailable for 35 minutes commencing 16:18hrs UTC. Users could upload and download files as normal via the my.sirv.com user interface or via S3 file transfers. The cause was identified and full FTP access was restored within 35 minutes.

Admin UI temporarily unavailable https://status.sirv.com/incident/84588 Sat, 07 Feb 2015 14:40:00 -0000 https://status.sirv.com/incident/84588#64973b1e84aa9ec1582501672198b3e97b835840ae60f121b3780d75295cea25 A set of UI updates went live with an unforeseen conflict, causing the UI at https://my.sirv.com/ to not load. The cause has been identified and we're fixing it. All core services continue as normal. Images cannot currently be uploaded through the UI, so please use FTP or S3 until this has been resolved.

Riak AAE throttle issue https://status.sirv.com/incident/84565 Fri, 07 Nov 2014 10:13:00 -0000 https://status.sirv.com/incident/84565#5078a111b4bc4a0ed74179e5ab285842eb498117745952dba6de7b59b60656d7 A hard disk failure caused Sirv to temporarily pause new image processing.
Service continued as normal for existing images and image uploads. The hard disk was replaced within 45 minutes, during which 359 of 37,293 processing requests were declined. The Active Anti-Entropy (AAE) system in Riak repaired the resulting inconsistency but consumed 100% CPU, overloading the Riak cluster. This caused about 3,000 lost requests before the issue was overcome (less than 1% of requests during the period). The configuration of Riak AAE throttling has been updated to avoid this in future.
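For context, Riak 2.x exposes AAE throttling as tiered settings in riak.conf, where higher vnode mailbox sizes trigger progressively longer delays between AAE exchanges. The sketch below illustrates the kind of change involved; the tier values are illustrative examples, not Sirv's actual configuration:

```
## Enable the Active Anti-Entropy throttle (values are illustrative,
## not Sirv's production settings)
anti_entropy.throttle = on

## When vnode mailboxes are empty, apply only a small delay
anti_entropy.throttle.tier1.mailbox_size = 0
anti_entropy.throttle.tier1.delay = 5ms

## As mailboxes back up, slow AAE exchanges down further so repair
## traffic cannot starve normal requests of CPU
anti_entropy.throttle.tier2.mailbox_size = 50
anti_entropy.throttle.tier2.delay = 50ms

anti_entropy.throttle.tier3.mailbox_size = 100
anti_entropy.throttle.tier3.delay = 500ms
```

Tuning these tiers caps how aggressively AAE rebuilds hash trees after an event such as a disk replacement, trading slower convergence for predictable cluster load.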