Root causes:
1. Database Cluster Instability: Two nodes in our database cluster became unstable, unbalancing load distribution across the cluster and degrading the services that depend on this infrastructure.
2. Challenges with Recovery and Data Restoration: Initial recovery efforts were slowed by reliance on untested, experimental fixes and by the inherent complexity of restoring data from a large database.
Immediate actions have been taken:
1. Enhanced Infrastructure: We have reinforced all nodes in the database cluster to handle higher loads and improved our monitoring systems so that issues can be detected and resolved proactively.
2. Revised Recovery Procedures: Critical data is being migrated to a more resilient environment, and our disaster recovery plan is being reviewed and updated.
3. Data Management Optimization: To streamline future recoveries, we are planning data cleanup and segmentation to reduce both the likelihood and the impact of failures.
Next steps:
1. Improved Data Governance Procedures: Collaborating closely with our product team to refine current data governance protocols and enable more efficient storage practices.
2. Enhanced Monitoring and Alerting: Strengthening our monitoring systems and alert mechanisms so that anomalies are detected earlier.
We deeply value your trust and remain dedicated to providing a more reliable and resilient product experience. We sincerely apologize for the inconvenience this incident caused and appreciate your patience and understanding.