Uptime over the past 7 days 100.000 % uptime
Closed | Jul 19, 2024 | 12:50 GMT+01:00
Services are now stable, so we are closing this issue. We will continue to monitor application health.
Monitoring | Jul 19, 2024 | 04:16 GMT+01:00
We have worked with Azure and brought our services back online. We are currently monitoring the situation.
Status from Azure:
https://azure.status.microsoft/en-in/status
Impact Statement: Starting at 21:56 UTC on 18 Jul 2024, you have been identified as a customer using Virtual Machines in Central US who may experience connection failures when trying to access some Virtual Machines hosted in the region. These Virtual Machines may have also restarted unexpectedly.
Current Status: We are aware of this issue and have engaged multiple teams. We have determined the underlying cause. A backend cluster management workflow deployed a configuration change causing backend access to be blocked between a subset of Azure Storage clusters and compute resources in the Central US region. This resulted in the compute resources automatically restarting when connectivity was lost to virtual disks. We are currently applying mitigation. Customers should see signs of recovery at this time as mitigation applies across resources in the region. The next update will be provided in 60 minutes, or as events warrant.
Open | Jul 19, 2024 | 00:25 GMT+01:00
We are currently facing downtime in the Portal and User Authentication applications within our US environment, caused by a disruption in Azure services in the Central US region. Our team is actively investigating this issue.
Resolved | Aug 08, 2024 | 10:11 GMT+01:00
Closing this issue since the root cause has been identified as a false alert.
Open | Aug 08, 2024 | 10:10 GMT+01:00
The service degradation alert received in the US environment is a false alarm caused by a DNS issue with the monitored endpoint.
Resolved | Jul 29, 2024 | 13:02 GMT+01:00
The issue is now resolved.
Resolved | Jun 29, 2024 | 06:06 GMT+01:00
We have confirmed that the fix mitigated the service disruption, and the system is now stable.
Monitoring | Jun 29, 2024 | 05:00 GMT+01:00
We have identified the root cause and fixed the issue. The service degradation was limited to articles with tags. We are currently monitoring the situation.
Identified | Jun 29, 2024 | 04:30 GMT+01:00
We faced service degradation in the EU and US environments during scheduled maintenance.
Resolved | Jul 29, 2024 | 11:00 GMT+01:00
The slowness was caused by a recent change made to improve code stability. We have identified the root cause and fixed the issue.
Open | Jul 29, 2024 | 09:56 GMT+01:00
We are facing intermittent slowness in the portal API. We have identified the root cause and are working on a fix.
Resolved | May 22, 2024 | 17:17 GMT+01:00
Start Time: 2024-05-22 16:23:00
We are experiencing slowness and downtime on the portal site due to a disruption in Azure services. We are actively working to resolve this issue and will share more updates once it is resolved.
Thank you for your understanding and continued support.
Should you have any further inquiries or require additional information, please do not hesitate to reach out to our support team.
Resolved Time: 2024-05-22 16:50:00
This issue has been resolved now. We'll share more details shortly.
Resolved | Feb 26, 2024 | 10:40 GMT+00:00
Incident:
On February 20 from 09:21 to 10:41 PM IST and on February 21 from 07:26 to 09:11 PM IST, a production outage occurred in the Identity Server, disrupting service for multiple customers.
Duration:
The slowness persisted for 185 minutes in total, causing interruptions in service availability and connectivity.
Impact:
Customers using the Document360 portal and private projects had intermittent timeouts.
Cause:
During peak traffic periods, the system experienced performance degradation characterized by delayed response times and subsequent timeouts. Automatic horizontal scaling mechanisms were activated to manage the increased load on our servers. However, as the horizontal scaling reached a critical threshold, it exacerbated connection issues with the SQL database, resulting in timeouts.
Resolution:
To address this challenge, we implemented a solution by vertically scaling our server infrastructure to higher-core machines capable of handling increased loads with high availability. This adjustment ensures improved performance and reliability during peak usage periods, thereby mitigating potential disruptions caused by resource limitations.
Thank you for your understanding and continued support.
Should you have any further inquiries or require additional information, please do not hesitate to reach out to our support team.
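As a quick cross-check of the Duration figure above, the two outage windows can be summed directly. This is only an illustrative sketch: the times are the IST windows from the Incident section written in 24-hour form, and the year is assumed from the date of this report.

```python
from datetime import datetime

# Outage windows from the Incident section (IST), written in 24-hour form.
# The year 2024 is an assumption taken from the report's resolution date.
WINDOWS = [
    ("2024-02-20 21:21", "2024-02-20 22:41"),
    ("2024-02-21 19:26", "2024-02-21 21:11"),
]
FMT = "%Y-%m-%d %H:%M"


def window_minutes(start: str, end: str) -> int:
    """Return the length of one outage window in whole minutes."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return int(delta.total_seconds() // 60)


per_window = [window_minutes(start, end) for start, end in WINDOWS]
total_minutes = sum(per_window)
print(per_window, total_minutes)  # [80, 105] 185
```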
Monitoring | Feb 22, 2024 | 07:01 GMT+00:00
We have taken mitigation measures for the identity service slowness. The application is working normally now, and we are monitoring the situation.
Investigating | Feb 21, 2024 | 15:25 GMT+00:00
We are facing slowness in our identity/authentication servers and are currently investigating the issue.
Resolved | Feb 26, 2024 | 10:40 GMT+00:00
Incident:
On February 20 from 09:21 to 10:41 PM IST and on February 21 from 07:26 to 09:11 PM IST, a production outage occurred in the Identity Server, disrupting service for multiple customers.
Duration:
The slowness persisted for 185 minutes in total, causing interruptions in service availability and connectivity.
Impact:
Customers using the Document360 portal and private projects had intermittent timeouts.
Cause:
During peak traffic periods, the system experienced performance degradation characterized by delayed response times and subsequent timeouts. Automatic horizontal scaling mechanisms were activated to manage the increased load on our servers. However, as the horizontal scaling reached a critical threshold, it exacerbated connection issues with the SQL database, resulting in timeouts.
Resolution:
To address this challenge, we implemented a solution by vertically scaling our server infrastructure to higher-core machines capable of handling increased loads with high availability. This adjustment ensures improved performance and reliability during peak usage periods, thereby mitigating potential disruptions caused by resource limitations.
Thank you for your understanding and continued support.
Should you have any further inquiries or require additional information, please do not hesitate to reach out to our support team.
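One way to picture why scaling out can make database connection problems worse, as the Cause section describes, is to count connections. The sketch below assumes each application instance keeps its own fixed-size connection pool and that the SQL database enforces a hard connection limit. Both assumptions, and every number used, are hypothetical and are not taken from the actual Document360 configuration.

```python
def total_connections(instances: int, pool_size_per_instance: int) -> int:
    """Worst-case connections opened when every instance fills its pool."""
    return instances * pool_size_per_instance


def max_safe_instances(db_connection_limit: int, pool_size_per_instance: int) -> int:
    """Largest instance count whose combined pools still fit under the limit."""
    return db_connection_limit // pool_size_per_instance


# Hypothetical figures for illustration only.
DB_CONNECTION_LIMIT = 400
POOL_SIZE = 100

for instances in (2, 4, 6):
    demand = total_connections(instances, POOL_SIZE)
    status = "ok" if demand <= DB_CONNECTION_LIMIT else "over limit"
    print(f"{instances} instances -> {demand} connections ({status})")

print("max safe instances:", max_safe_instances(DB_CONNECTION_LIMIT, POOL_SIZE))
```

Under these assumptions, adding instances beyond the safe count increases database pressure instead of relieving it, which is consistent with the choice to scale vertically.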
Identified | Feb 20, 2024 | 04:02 GMT+00:00
Between 04:02 PM and 04:57 PM UTC on 20 February 2024, customers who had set up private mode were unable to access the Knowledgebase and portal site pages, causing a disruption in service. Requests for the home page or portal would consistently time out after approximately 30 seconds.
Root Cause:
After conducting an investigation, it was determined that the identity server encountered a sudden and substantial increase in traffic, characterized by an unusually high volume of requests. This surge led to the database system reaching its maximum concurrent requests limit, resulting in data write issues.
Mitigation:
The delay in the data write process resulted in temporary system slowness, which, in turn, caused a delay in the scaling up process as defined in the system.
Next Steps:
We are currently engaged in a comprehensive analysis to identify potential areas for improvement to prevent similar incidents in the future.
We appreciate the understanding and patience of our customers during this incident, and we are dedicated to continuously enhancing our systems to provide a seamless and reliable service.
Resolved | Dec 26, 2023 | 23:22 GMT+00:00
Earlier today, our monitoring system identified a ping URL as being offline, attributing it to a DNS issue. Subsequent to a meticulous investigation, we have determined that this alert was, indeed, a false positive.
Our findings affirm that the ping URL was fully operational throughout the specified period, and there was no authentic disruption in service. We sincerely apologize for any confusion or inconvenience this erroneous alert may have caused.
Should you have further inquiries, concerns, or insights pertaining to this incident, please don't hesitate to contact our dedicated support team. Your feedback is of great value to us as we strive to enhance our systems and processes continually.
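A monitoring check can reduce this kind of false positive by reporting a failed DNS lookup separately from an unreachable endpoint. The following is a minimal sketch of that idea, not a description of our monitoring system: the function names, the port, the timeout, and the example hostname are all hypothetical.

```python
import socket
from typing import Callable


def default_resolve(hostname: str) -> str:
    """Resolve a hostname to an IPv4 address; raises socket.gaierror on failure."""
    return socket.gethostbyname(hostname)


def default_probe(address: str, port: int = 443, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the address succeeds."""
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        return False


def classify_check(
    hostname: str,
    resolve: Callable[[str], str] = default_resolve,
    probe: Callable[[str], bool] = default_probe,
) -> str:
    """Classify one check as 'dns_error', 'down', or 'up'."""
    try:
        address = resolve(hostname)
    except socket.gaierror:
        # The lookup failed, so the endpoint itself was never tested.
        return "dns_error"
    return "up" if probe(address) else "down"


def failing_resolver(hostname: str) -> str:
    """Stub resolver that simulates a DNS failure on the monitoring side."""
    raise socket.gaierror("simulated lookup failure")


# Stubbed usage: the endpoint is healthy, but the monitor's DNS lookup fails.
print(classify_check("kb.example.invalid", resolve=failing_resolver,
                     probe=lambda address: True))  # dns_error
```

Treating "dns_error" as a warning to retry or verify from another location, instead of as confirmed downtime, is one possible design choice.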
Resolved | Dec 14, 2023 | 15:24 GMT+00:00
Impact:
Customers who had set up private mode experienced inaccessibility to the Knowledgebase and portal site's pages, causing a disruption in service. Requests for the home page or portal would consistently time out after approximately 30 seconds.
Cause:
Upon investigation, it was found that the identity server experienced a sudden and significant surge, marked by an unusually high number of requests. This influx, in turn, triggered a timeout situation in the SQL server. Our analysis revealed a notable increase in thread connections, directly attributable to the surge in requests. This surge, coupled with connection pool starvation, led to the inability to establish necessary connections with the database, resulting in the observed timeouts.
Mitigation:
To address the immediate impact, our auto-heal setup and process efficiently identified the elevated server load and initiated the scaling out of resources. This reactive measure ensured that the system could adapt to the increased demand, mitigating the severity of the issue and restoring accessibility to the Knowledgebase and portal pages for users in private mode.
Next Steps:
We are currently engaged in a comprehensive analysis to identify potential areas for improvement to prevent similar incidents in the future. Our aim is to implement proactive measures that will enhance the system's resilience and responsiveness, ensuring a more robust and reliable experience for our users even during periods of unexpected demand.
We appreciate the understanding and patience of our customers during this incident, and we are dedicated to continuously enhancing our systems to provide a seamless and reliable service.
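The connection pool starvation described in the Cause section can be illustrated with a small simulation. It is a simplified sketch under stated assumptions: the pool is modeled as a semaphore, every request holds its connection for the whole run (standing in for slow queries during the surge), and the pool size, request count, and timeout are hypothetical values.

```python
import threading


class PoolTimeout(Exception):
    """Raised when no connection becomes free within the timeout."""


class BoundedPool:
    """Toy connection pool: a fixed number of slots guarded by a semaphore."""

    def __init__(self, size: int) -> None:
        self._slots = threading.BoundedSemaphore(size)

    def acquire(self, timeout: float) -> None:
        if not self._slots.acquire(timeout=timeout):
            raise PoolTimeout("no free connection in the pool")

    def release(self) -> None:
        self._slots.release()


def simulate(pool_size: int, requests: int, timeout: float = 0.05) -> tuple[int, int]:
    """Return (served, timed_out) when no request releases its connection."""
    pool = BoundedPool(pool_size)
    served = timed_out = 0
    for _ in range(requests):
        try:
            pool.acquire(timeout)
            served += 1  # connection stays checked out, mimicking a slow query
        except PoolTimeout:
            timed_out += 1
    return served, timed_out


print(simulate(pool_size=2, requests=5))  # (2, 3)
```

Once every slot is held, each additional request waits for the full timeout and then fails, which matches the timeout pattern customers observed.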
Resolved | Nov 15, 2023 | 09:56 GMT+00:00
Root Cause Analysis (RCA) Report
Incident Summary: An unusual surge in traffic was directed towards one instance of our authentication server. The increased traffic load overwhelmed the affected instance, leading to a partial outage and impacting the availability and performance of our authentication services. The incident was identified as an Azure outage in traffic manager services affecting our hosting infrastructure.
Incident Resolution: We have taken comprehensive measures to enhance the reliability and availability of our authentication and other services. This includes strengthening our auto-scaling configurations to better handle future demands and potential challenges. Our team has worked diligently to ensure that these services are not only more resilient but also more efficient in responding to varying loads and usage patterns.
Conclusion: We apologize for any inconvenience caused and are taking steps to prevent future disruptions. Thank you for your understanding and support. For further questions or information, please contact our support team.
Open | Nov 15, 2023 | 09:55 GMT+00:00
Beginning on Friday, November 10, 2023, at 10:34 UTC, the Document360 user authentication service experienced high response times, impacting some of our customers. This issue is impacting the KB site and portal login functionality.
We are actively working on restoring the services. We apologize for any impact this incident has had on your business. We treated the disruption as our highest priority to ensure resolution.
Resolved | Nov 08, 2023 | 09:27 GMT+00:00
Root Cause Analysis (RCA) Report
Incident Summary: One of the Analytics cluster nodes was identified to be in an unhealthy state. This situation was triggered by an overwhelming surge in the volume of incoming requests and data processing in our analytics servers. Consequently, the delayed data synchronization in the secondary node resulted in timeouts for new requests, leading to the failure of the home page to load when the widget was configured to display on the home page. However, other sections such as the Docs page and articles pages remained unaffected.
Incident Resolution: The database system responded to the issue by initiating an automatic recovery process, scaling up to the next available premium tier once the node's status transitioned back to a healthy state. This process ensured the restoration of the Analytics services, ultimately resolving the issue.
Conclusion: We deeply regret any disruption this incident may have caused to your business operations. We are committed to implementing the necessary measures to prevent such occurrences and ensure the continued seamless functioning of our services. Thank you for your understanding and continued support.
Should you have any further inquiries or require additional information, please do not hesitate to reach out to our support team.
Open | Nov 07, 2023 | 21:22 GMT+00:00
Beginning on Tuesday, November 7, 2023, at 17:02 UTC, the Document360 knowledge base site home page and analytics services experienced an outage impacting some of our customers. This issue is impacting the KB site home page and analytics-related services.
We are actively working to restore the services. We apologize for any impact this incident has had on your business. We treated the disruption as our highest priority to ensure resolution.
Resolved | Aug 11, 2023 | 00:00 GMT+01:00
Earlier today, our monitoring system flagged a ping URL as being down due to a DNS issue. After a thorough investigation, we've concluded that this alert was, in fact, a false positive. The ping URL was fully operational during the specified period, and there was no genuine disruption in service.
We apologize for any confusion or inconvenience this false alert may have caused.
If you have any further questions, concerns, or insights related to this incident, please feel free to reach out to our support team. Your feedback is invaluable as we work to improve our systems and processes.
Resolved | Jul 07, 2023 | 12:08 GMT+01:00
The issue is resolved and the API is stable now.
Monitoring | Jul 07, 2023 | 10:34 GMT+01:00
We have observed a network-level issue in our portal API and are actively investigating the situation.
Update: 10 AM UTC
We have taken measures to address the problem and are actively monitoring the situation to ensure stability. Additionally, we are conducting an investigation to determine the underlying cause of the issue.
Resolved | Jan 25, 2023 | 07:10 GMT+00:00
We are encountering an issue accessing our Document360 portal application due to an outage in Azure services. We are waiting for an update from the cloud platform and will provide more updates shortly.
Summary of Impact:
Customers experienced issues with networking connectivity, manifesting as network latency and/or timeouts when attempting to connect to Azure resources.
Mitigation:
Microsoft Azure identified a recent change to the WAN as the underlying cause and has rolled back that change. The issue is resolved and all services are functioning now.
Closed | Jul 21, 2022 | 05:15 GMT+01:00
We are encountering an issue accessing our Document360 portal application due to an outage in one of the Azure services (SQL Database). We are waiting for an update from the cloud platform and will post an update once it is resolved.
We observed an incident in Azure SQL Database West Europe region which affects our EU customers accessing portal and KB sites.
Azure SQL engineers identified a configuration change on the metadata drop operation which has caused the overall issue.
This incident has been resolved.
Resolved | Jul 18, 2022 | 11:11 GMT+01:00
We observed a false alert (malfunction) in our monitoring system, which caused a downtime incident notification to be sent. We have reported this to the monitoring system's team so that it is resolved as soon as possible. All our servers are functioning normally.
Closed | Feb 02, 2022 | 17:32 GMT+00:00
The situation is now resolved, and all the sites and the portal are up and running. We are continuing to monitor the status closely.
Open | Feb 02, 2022 | 15:46 GMT+00:00
We are currently investigating an issue that's affecting both the portal and customer-facing website. Our team is actively working on resolving this issue.
Closed | Apr 05, 2021 | 08:34 GMT+01:00
Currently, we are experiencing issues related to our SSL certificate blocking users from accessing our portal. We are currently working on resolving this issue.
update - 05th April, 8:45 GMT
-------------------------------------------
The certificate issue is now resolved and all the services are back to normal.
Closed | Apr 01, 2021 | 22:21 GMT+01:00
There was a DNS issue with Azure cloud, which caused a short outage of our service. The services are now restored. We are monitoring the situation.
Link to the Azure incident page: https://status.azure.com/en-in/status/history/
The issue is resolved by Microsoft Azure.
Resolved | Mar 28, 2021 | 05:57 GMT+01:00
We have identified an issue affecting our customers' knowledge base sites. The engineering team is currently investigating the issue. We will keep updating the status here.
Update
-----------
We identified that the issue was caused by one of the application instances and affected the customers hosted on that instance. The issue is now resolved and everything is back to normal. We are monitoring the environment closely.
Resolved | May 19, 2020 | 10:44 GMT+01:00
We are investigating some slowness in our APIs. We will update the progress shortly.
Resolved | Feb 26, 2020 | 21:29 GMT+00:00
We are currently experiencing some outages in our Azure infrastructure. This is being investigated by our technical team. We will keep you posted on the progress here.
Resolved | Jan 06, 2020 | 22:39 GMT+00:00
Resolved and root cause being investigated.
Open | Jan 06, 2020 | 21:00 GMT+00:00
We are currently experiencing some outages in our Azure infrastructure. This is being investigated by our technical team. We will keep you posted on the progress here.
Resolved | Dec 19, 2019 | 19:58 GMT+00:00
Auth0 service outage is now resolved and Document360 Portal login function is back to normal.
Identified | Dec 19, 2019 | 19:33 GMT+00:00
Our authentication partner Auth0 is experiencing some service outage, causing delayed response while logging in to Document360 Portal.
Closed | Nov 07, 2019 | 11:00 GMT+00:00
The outage in Azure cloud is now resolved and all our dependent services are functioning as expected.
Closed | Nov 07, 2019 | 08:00 GMT+00:00
We are facing some downtime because our cloud provider, Microsoft Azure, is experiencing an outage in its West Europe region.
Link to Azure status page (select West Europe): https://status.azure.com/en-us/status
Resolved | Oct 18, 2019 | 02:00 GMT+01:00
Hi there, we are aware of the problem affecting the public websites. We are working on it at the moment. Apologies for the inconvenience.
Resolved | Aug 13, 2019 | 20:10 GMT+01:00
The issue in the Document360 APIs that caused the disruption has been identified and fixed.
Identified | Aug 13, 2019 | 19:40 GMT+01:00
[13th August 2019, 7.40]: We experienced a service disruption in the Document360 portal (https://portal.document360.io), caused by higher response times from the APIs. The issue has been identified and resolved.
Closed | Aug 02, 2019 | 18:00 GMT+01:00
2nd August 2019 | Document360 Portal was down for more than an hour during the release, caused by an unexpected technical issue in the production environment. The release was rolled back and the application is back to normal.
Closed | Nov 28, 2018 | 19:47 GMT+00:00
Auth0 has confirmed that the issue they were facing was resolved.
Open | Nov 28, 2018 | 17:17 GMT+00:00
We are facing an issue with our login page due to an outage at our authentication service provider, Auth0. We are following up with them to get the issue resolved as soon as possible.
Link to the incident: https://status.auth0.com/incidents/rjhcwj1d2r61
Closed | Oct 03, 2018 | 11:30 GMT+01:00
Issue was identified and resolved immediately.
Open | Oct 03, 2018 | 07:00 GMT+01:00
Certain customers faced downtime in their knowledge base due to a change in license policy.