Incidents | Informaten | Status
Incidents reported on the status page for Informaten: https://status.informaten.com/

Automated monitor events:

- Rootservers - Maincubes FRA01 recovered (Thu, 23 Apr 2026 19:29:04 +0000)
- Rootservers - Maincubes FRA01 went down (Thu, 23 Apr 2026 19:18:47 +0000)
- Rootservers - Maincubes FRA01 recovered (Thu, 23 Apr 2026 18:56:04 +0000)
- Rootservers - Maincubes FRA01 went down (Thu, 23 Apr 2026 18:54:03 +0000)
- Rootservers - Maincubes FRA01 recovered (Thu, 23 Apr 2026 18:45:27 +0000)
- Rootservers - Maincubes FRA01 went down (Thu, 23 Apr 2026 18:39:51 +0000)
- Rootservers - Maincubes FRA01 recovered (Thu, 23 Apr 2026 18:31:16 +0000)
- Rootservers - Maincubes FRA01 went down (Thu, 23 Apr 2026 18:30:46 +0000)
- Rootservers - Maincubes FRA01 recovered (Thu, 23 Apr 2026 18:22:04 +0000)
- Rootservers - Maincubes FRA01 went down (Thu, 23 Apr 2026 18:20:59 +0000)
- Website & CP recovered (Sat, 28 Mar 2026 05:04:57 +0000)
- Website & CP went down (Sat, 28 Mar 2026 05:02:33 +0000)
- Website & CP recovered (Sat, 28 Mar 2026 04:55:54 +0000)
- Website & CP went down (Sat, 28 Mar 2026 04:53:46 +0000)
- Website & CP recovered (Sat, 28 Mar 2026 04:40:54 +0000)
- Website & CP went down (Sat, 28 Mar 2026 04:38:42 +0000)
- Website & CP recovered (Sat, 28 Mar 2026 04:31:52 +0000)
- Website & CP went down (Sat, 28 Mar 2026 04:29:48 +0000)
- Rootservers - FirstColo FRA4 recovered (Sun, 15 Mar 2026 01:39:12 +0000)
- Rootservers - FirstColo FRA4 went down (Sun, 15 Mar 2026 01:33:31 +0000)
- Webspaces recovered (Sat, 31 Jan 2026 02:46:42 +0000)
- Webspaces went down (Sat, 31 Jan 2026 02:42:22 +0000)

KVM service currently disrupted – we are working on a solution
https://status.informaten.com/incident/772646 (Mon, 24 Nov 2025 08:00:00 -0000)

Root Cause Analysis – Cluster Outage on 23 November 2025

1. Overview
On Sunday, 23 November 2025, our Ceph cluster at the DE – Maincubes FRA1 site experienced a disruption that led to temporary unavailability of the production RBD storage. The root cause was the combined failure of two NVMe OSDs from the same manufacturing batch during an active rebalance, which left several Placement Groups (PGs) irreparably damaged. As a consequence, a full restore from backup was required.

2. Timeline
- 23 November 2025, approx. 04:00: Our monitoring system reported the failure of a single OSD in the cluster. Because of the existing redundancy, this was initially classified as non-critical; the cluster is designed to tolerate the loss of an individual OSD without service impact.
- 23 November 2025, approx. 09:00: A technician arrived at the data center and replaced the failed OSD. The cluster then automatically initiated a rebalance.
- 23 November 2025, approx. 13:50: Monitoring triggered another alert: under the additional load of the ongoing rebalance, a second OSD failed. Post-incident analysis revealed that both failed OSDs originated from the same production batch.
- Later on 23 November 2025: As a result of the second OSD failure, 18 Placement Groups (PGs) were permanently lost. These PGs contained metadata critical to the RBD storage, which led to significant degradation of the cluster and, ultimately, to the unavailability of the RBD storage.
- 23 November 2025, late afternoon/evening: Extensive efforts were made to recover the affected OSDs and PGs. After thorough internal analysis and consultation with external experts, the affected OSDs had to be marked as "lost"; direct recovery of the corrupted PGs from the cluster was no longer possible.
- 23 November 2025, approx. 21:00 to 24 November 2025, approx. 09:00: To restore the environment, we reverted to our incremental backups, based on the backup taken on 23 November 2025 at 03:00. The restore of the entire cluster (nearly 500 VMs) ran overnight and completed at approximately 09:00 on 24 November. At that point, all customer systems were back online.

3. Technical Root Cause
The incident can essentially be attributed to the following factors:
1. Failure of two OSDs from the same batch: Both failed NVMe OSDs came from the same manufacturing batch, suggesting a batch-specific quality or reliability issue.
2. Increased load due to rebalance: The second OSD failed during an active rebalance, which imposed additional I/O load on the drives involved. This increased load likely contributed significantly to the second drive's failure.
3. Loss of critical Placement Groups: Because two OSDs failed across the relevant failure domains, 18 PGs were permanently lost, including PGs holding essential metadata for the RBD pool. This left the affected pool in an inconsistent and unusable state.

4. Impact
- Temporary unavailability of the RBD storage in the cluster at the DE – Maincubes FRA1 site.
- Service impact on nearly 500 virtual machines hosted on this cluster.
- A full restore from backup was required, based on the backup state from 03:00 on 23 November 2025.

5. Preventive and Corrective Measures
To significantly reduce the risk of similar incidents in the future, we have implemented the following measures:
1. Increased replication level
   - The replica size of the affected cluster has been increased to 3, providing additional fault tolerance in the event of simultaneous OSD failures.
2. Expansion and distribution of storage capacity
   - Six additional NVMe OSDs were added to the cluster to improve data distribution and reduce the impact of load peaks (e.g., during rebalancing).
3. Enhanced hardware quality control for OSDs
   - Proactive removal of another OSD from the same batch as the failed drives, to prevent potential follow-up failures.
   - Introduction of a standardized validation process for new batches, including S.M.A.R.T. checks, burn-in tests, and benchmark/stress tests before drives are put into production.
4. Internal processes and SOPs
   - Creation of an internal Standard Operating Procedure (SOP) covering regular S.M.A.R.T. analysis as well as benchmark and stress testing of all OSDs.
   - Clear definition of procedures for handling OSD failures during active rebalance operations.
5. Monitoring improvements
   - Tightened proactive monitoring policies, in particular closer tracking of latency, I/O errors, and reallocations of individual OSDs.
   - Additional alert thresholds for rebalance load and cluster degradation.

6. Customer Communication and Compensation
We sincerely regret the inconvenience caused by this incident. All affected customers will be informed separately about the compensation applicable to their individual case. We greatly appreciate our customers' patience and understanding during the disruption and the subsequent restoration.

KVM service currently disrupted – we are working on a solution
https://status.informaten.com/incident/772646 (Mon, 24 Nov 2025 04:51:00 -0000)
Many of our services have already been restored. We are continuing to work hard to get all KVM products running fully and stably again. Please note: some functions will only be available to a limited extent until services are fully restored.

KVM service currently disrupted – we are working on a solution
https://status.informaten.com/incident/772646 (Sun, 23 Nov 2025 22:24:00 -0000)
We are still experiencing disruptions to some of our root server and web hosting services. The cause is a problem on our storage platform, which we have now clearly identified. Our engineering team is working hard to gradually restore all systems to normal operation and to keep downtime to a minimum.
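The standardized batch-validation process described in the preventive measures (S.M.A.R.T. checks before new drives go into production) could be gated with a small script. This is only an illustrative sketch: the health-attribute names mirror typical NVMe SMART fields as reported by tools like smartctl, and all threshold values are assumptions, not Informaten's actual policy.

```python
# Illustrative batch-validation gate for new NVMe OSDs.
# Attribute names and thresholds are assumptions for illustration only.

def validate_drive(smart: dict) -> list:
    """Return the list of reasons a drive fails validation (empty list = pass)."""
    failures = []
    if smart.get("media_errors", 0) > 0:
        failures.append("media errors reported")
    if smart.get("percentage_used", 0) >= 5:  # endurance already consumed on a "new" drive
        failures.append("endurance used >= 5% on a new drive")
    if smart.get("critical_warning", 0) != 0:
        failures.append("critical warning flag set")
    if smart.get("unsafe_shutdowns", 0) > 10:
        failures.append("excessive unsafe shutdowns")
    return failures

def gate_batch(drives: dict) -> dict:
    """Map serial -> failure reasons for every failing drive in a delivery batch.

    A batch only goes to production if this returns an empty dict.
    """
    results = {serial: validate_drive(smart) for serial, smart in drives.items()}
    return {serial: issues for serial, issues in results.items() if issues}

# Hypothetical delivery batch of two drives:
batch = {
    "SN-A1": {"media_errors": 0, "percentage_used": 0, "critical_warning": 0, "unsafe_shutdowns": 2},
    "SN-A2": {"media_errors": 3, "percentage_used": 1, "critical_warning": 0, "unsafe_shutdowns": 1},
}
rejected = gate_batch(batch)  # only SN-A2 fails, due to its media errors
```

In practice the per-drive dicts would be populated from the JSON output of a real health-reporting tool rather than hard-coded, and a drive from a failing batch would additionally trigger review of its batch siblings, as the RCA's measure 3 describes.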
KVM service currently disrupted – we are working on a solution
https://status.informaten.com/incident/772646 (Sun, 23 Nov 2025 13:00:00 -0000)
We have a disruption in our server service and are working at full speed to restore all affected services.

Scheduled maintenance of the CP
https://status.informaten.com/incident/719634 (Sun, 07 Sep 2025 16:00:43 -0000)
Maintenance completed.

Scheduled maintenance of the CP
https://status.informaten.com/incident/719634 (Sat, 06 Sep 2025 22:00:43 -0000)
We are currently performing maintenance work on the customer panel of our website. During this time, access to the customer area may be restricted or unavailable. We apologize for any inconvenience and are working to restore the service as quickly as possible. Our services such as root servers, web space, and domains, as well as all managed services and colocation, are not affected.
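The tightened monitoring described in the RCA's measure 5 (alert thresholds for rebalance load and cluster degradation) could be expressed as a simple alert rule over figures a Ceph cluster already reports. Again a sketch under stated assumptions: the 5% degraded-PG and 25% misplaced-object thresholds are illustrative values, not the provider's actual policy.

```python
# Illustrative alert rule for cluster degradation and rebalance load.
# Threshold values are assumptions for illustration only.

def cluster_alerts(pg_total: int, pg_degraded: int, misplaced_ratio: float) -> list:
    """Evaluate cluster health figures against alert thresholds.

    pg_total / pg_degraded: total and degraded Placement Group counts.
    misplaced_ratio: fraction of objects being moved by an active rebalance.
    Returns a list of alert strings (empty list = healthy).
    """
    alerts = []
    degraded_ratio = pg_degraded / pg_total if pg_total else 0.0
    if degraded_ratio > 0.05:  # assumed threshold: more than 5% of PGs degraded
        alerts.append("degraded PGs at {:.1%}".format(degraded_ratio))
    if misplaced_ratio > 0.25:  # assumed threshold: heavy rebalance in progress
        alerts.append("rebalance load high ({:.1%} misplaced)".format(misplaced_ratio))
    return alerts

# A mild rebalance with few degraded PGs stays below both thresholds:
quiet = cluster_alerts(2048, 18, 0.02)    # -> no alerts
# A degraded cluster under heavy rebalance trips both:
noisy = cluster_alerts(2048, 150, 0.30)   # -> two alerts
```

The point of the second threshold is exactly the failure mode in this incident: a rebalance that is itself stressing the remaining drives should page an operator before a second failure, rather than after.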