Incidents/20151022-EventLogging

From Wikitech

Summary

On Monday, October 26, 2015, we found that data for about 80 tables was no longer being replicated from m4-master to analytics-store (dbstore1002, not dbstore2002), as seen from stat1003. All dashboards, research, etc. against these schemas (listed below) will be invalid until the data is backfilled from m4.

This was caused by the schema changes made for task T108856, combined with a replication model that does not use standard MySQL replication and instead relies on a third-party script to clone data. As a result, schema changes were not replicated, and the copy script failed on all tables that come alphabetically after the first one with a different schema.

The schema change has been applied to all tables, and the script is slowly backfilling all rows.
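
The failure mode described in this summary can be illustrated with a minimal Python sketch. It is not the actual copy script, whose internals are not documented here. It assumes a loop that walks tables in alphabetical order and stops at the first error, and contrasts it with a loop that isolates failures per table. All names (`copy_table`, `copy_all_abort_on_error`, `copy_all_isolated`, the example tables and columns) are hypothetical.

```python
class SchemaMismatch(Exception):
    """Raised when a table's columns differ between master and replica."""


def copy_table(name, master_schemas, replica_schemas):
    # Stand-in for the real copy step (an assumption): it only compares
    # column lists and does not move any rows.
    if master_schemas[name] != replica_schemas[name]:
        raise SchemaMismatch(name)


def copy_all_abort_on_error(master_schemas, replica_schemas):
    """Copy tables alphabetically and stop at the first failure."""
    copied = []
    for name in sorted(master_schemas):
        try:
            copy_table(name, master_schemas, replica_schemas)
        except SchemaMismatch:
            break  # every later table is silently skipped
        copied.append(name)
    return copied


def copy_all_isolated(master_schemas, replica_schemas):
    """Copy tables alphabetically, recording failures without stopping."""
    copied, failed = [], []
    for name in sorted(master_schemas):
        try:
            copy_table(name, master_schemas, replica_schemas)
        except SchemaMismatch:
            failed.append(name)
        else:
            copied.append(name)
    return copied, failed


# Made-up example: only B_2 has a column the replica lacks.
master = {
    "A_1": ["id", "uuid"],
    "B_2": ["id", "uuid", "new_column"],
    "C_3": ["id", "uuid"],
}
replica = {
    "A_1": ["id", "uuid"],
    "B_2": ["id", "uuid"],
    "C_3": ["id", "uuid"],
}
```

With this made-up input, `copy_all_abort_on_error(master, replica)` copies only `A_1`, while `copy_all_isolated(master, replica)` copies `A_1` and `C_3` and reports `B_2` as failed, so a single mismatched table no longer blocks the tables after it.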

Timeline

  • 2015-10-08 (not directly related) I delete some tables from the log schema and notice that they are recreated automatically. I am puzzled by it, but cannot do anything, as they do not reappear after that. (They were being recreated by the "custom replication" process.)
  • 2015-10-22 15:52 A schema change is rolled out on db1046. Everything seems OK after the change, but from that time on, results may not be accurate.
  • 2015-10-26 14:39 milimetric informs jynus on chat of an issue with the m4 (eventlogging) database.
  • 2015-10-26 19:08 jynus finds the custom script failing to copy the events. Ori and others confirm its existence some minutes later. The schema is changed a few minutes after that, and the script starts to backfill events from the master.
  • 2015-10-27 08:24 All events are synced again with the master.

Ticket: [1]

Actionables

  • Status: Done. Backfill missing data from m4.
  • Status: Done. Document the analytics replication model better.
  • Status: Done. Puppetize the copy script (otherwise, it will fail on server reboot).
  • Status: Done. Implement better monitoring of the custom replication on db1047 and dbstore1002. This can be improved further by also monitoring the lag.
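
The lag monitoring mentioned in the last item could look roughly like the Python sketch below. It is an illustration under stated assumptions, not the monitoring that was actually deployed: it assumes the newest event time per table has already been fetched from the master and from the replica, and it flags tables whose replica copy is further behind than a threshold. The function name `find_lagging_tables`, the table names, and the timestamps are all hypothetical.

```python
from datetime import datetime, timedelta


def find_lagging_tables(master_latest, replica_latest, max_lag):
    """Return {table: lag} for tables whose replica lags more than max_lag.

    master_latest and replica_latest map table names to the datetime of
    the newest event seen on each side. A table missing from the replica
    is reported with a lag of None.
    """
    lagging = {}
    for table, master_ts in sorted(master_latest.items()):
        replica_ts = replica_latest.get(table)
        if replica_ts is None:
            lagging[table] = None  # table not present on the replica
            continue
        lag = master_ts - replica_ts
        if lag > max_lag:
            lagging[table] = lag
    return lagging


# Made-up timestamps: TableB_2 stopped receiving events days ago.
master_latest = {
    "TableA_1": datetime(2015, 10, 26, 14, 0),
    "TableB_2": datetime(2015, 10, 26, 14, 0),
}
replica_latest = {
    "TableA_1": datetime(2015, 10, 26, 13, 55),
    "TableB_2": datetime(2015, 10, 22, 15, 52),
}
result = find_lagging_tables(master_latest, replica_latest, timedelta(hours=1))
```

Here `result` contains only `TableB_2`. Checking per table rather than per server matters for this kind of incident, because the copy kept working for some tables while silently stalling for others.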


Schemas with missing data

CompletionSuggestions_13630018
GatherClicks_12114785
GeoFeatures_12518424
GeoFeatures_12914994
GettingStartedRedirectImpression_7355552
GuidedTourButtonClick_13869649
GuidedTourExited_8690566
GuidedTourExternalLinkActivation_8690560
GuidedTourGuiderHidden_8690549
GuidedTourGuiderImpression_8694395
ImageMetricsCorsSupport_11686678
ImageMetricsLoadingTime_10078363
MediaViewer_10867062
MobileAppCategorizationAttempts_5359208
MobileAppLoginAttempts_5257721
MobileAppUploadAttempts_5334329
MobileOptionsTracking_14003392
MobileWebBrowse_12119641
MobileWebClickTracking_5929948
MobileWebDiffClickTracking_10720373
MobileWebEditing_8599025
MobileWebMainMenuClickTracking_11568715
MobileWebSearch_12054448
MobileWebUIClickTracking_10742159
MobileWebWatching_11761466
MobileWebWatchlistClickTracking_10720361
MobileWikiAppAppearanceSettings_10375462
MobileWikiAppAppearanceSettings_9378399
MobileWikiAppArticleSuggestions_10590869
MobileWikiAppArticleSuggestions_11448426
MobileWikiAppArticleSuggestions_12443791
MobileWikiAppCreateAccount_8240702
MobileWikiAppCreateAccount_9135391
MobileWikiAppDailyStats_12637385
MobileWikiAppEdit_9003125
MobileWikiAppInstallReferrer_12601905
MobileWikiAppLangSelect_12588733
MobileWikiAppLinkPreview_12143205
MobileWikiAppLogin_8234533
MobileWikiAppLogin_9135390
MobileWikiAppMediaGallery_10923135
MobileWikiAppMediaGallery_12588701
MobileWikiAppNavMenu_12732211
MobileWikiAppOnboarding_9123466
MobileWikiAppOperatorCode_8983918
MobileWikiAppProtectedEditAttempt_8682497
MobileWikiAppSavedPages_10375480
MobileWikiAppSavedPages_8909354
MobileWikiAppSearch_10641988
MobileWikiAppShareAFact_11331974
MobileWikiAppShareAFact_12588711
MobileWikiAppStuffHappens_8955468
MobileWikiAppTabs_12453651
MobileWikiAppToCInteraction_10375484
MobileWikiAppToCInteraction_11014396
MobileWikiAppToCInteraction_8461467
MobileWikiAppWidgets_11312870
MultimediaViewerAttribution_9758179
MultimediaViewerDimensions_10014238
MultimediaViewerDuration_10427980
MultimediaViewerNetworkPerformance_12458951
MultimediaViewerVersusPageFilePerformance_7907636
NavigationTiming_10785754
NavigationTiming_12405818
NavigationTiming_13317958
NavigationTiming_13332008
NewEditorEdit_6792669
PageContentSaveComplete_5588433
PageCreation_7481635
PageDeletion_7481655
PageMove_7495717
PageRestoration_7758372
Popups_11625443
PrefUpdate_5563398
SaveTiming_12236257
ServerSideAccountCreation_5487345
TestSearchSatisfaction2_13223897
TestSearchSatisfaction2_14098806
UniversalLanguageSelector_7327441
UploadWizardErrorFlowEvent_11772725
UploadWizardFlowEvent_11772723
UploadWizardStep_11772724
UploadWizardTutorialActions_5803466
UploadWizardUploadFlowEvent_11772717
WikimediaBlogVisit_5308166