Talk:Data Platform/Data Lake/Project History

From Wikitech
(Redirected from Talk:Analytics/Data Lake)

Public data access

I am really excited about the tool you presented in the latest Metrics and Activities Meeting. When can we expect that this will be (publicly) available (in Pivot or somewhere else)? (I am not sure if this is the best place for this question, I hope it will find the right person/people :)) Samat (talk) 19:52, 30 January 2017 (UTC)

  • @Samat: Hey there, glad you're interested! I believe the major issue right now is that this data is being pulled from the production databases, so it could include sensitive information that shouldn't be publicly available. I believe the Analytics team plans to address that by setting up an alternate data warehouse with data pulled from the Labs databases, but I don't know the exact timeline. Dan Andreescu would probably know better. --Neil P. Quinn-WMF (talk) 23:10, 30 January 2017 (UTC)
  • @Samat: Hi! Yes, Neil is right, the task we need to complete is this one. Once we do that we'll see how easy/well we can pull in public data and go from there. If everything goes well, I'd guess either end of this quarter or next quarter we'll have some public way of accessing this data in Labs. Hopefully with something like Quarry on top of it. Do you have thoughts on how you'd like to access the data? And what kinds of stuff do you hope to get out of it? I'm very interested in learning how this is valuable to as many people as I can talk to :) milimetric (talk) 17:14, 31 January 2017 (UTC)