From Wikitech
This talk page is not a Cloud VPS or Toolforge support forum. Contact information for Cloud VPS and Toolforge can be found at Help:Cloud Services communication.

New PAWS tutorials

Hey all, I'm working on a set of notebook-based tutorials for PAWS. The first one is a PAWS Getting Started tutorial to help get users started with the service. I'd love some feedback to incorporate into this. I'm also working on a tutorial with examples and recipes for tasks folks can do with PAWS, and a notebook-based tutorial for Pywikibot as well. Here is the Getting Started tutorial: Getting Started with PAWS SRodlund (talk)

The link was broken; I replaced it with a working link to what I surmise is the same thing (same title, under the WMF umbrella). --RudolfoMD (talk) 21:31, 22 December 2023 (UTC)

The https://tools.wmflabs.org/paws/ link ultimately fails with an internal server error. Should this link be https://paws.wmflabs.org/paws? -- Dave Braunschweig (talk) 01:05, 24 April 2016 (UTC)

PAWS limitations on traffic and kernels

@Yuvipanda: I wanted to know the limitations of the PAWS system. Can I get some 10-15 users at a pywikibot workshop I'm organizing to use PAWS to get used to pywikibot before continuing the heavy-duty stuff locally? Can the server sustain the traffic and open up 30 or so kernels (let's say 2 notebooks per person)? --AbdealiJK (talk) 07:42, 14 September 2016 (UTC)

WCQS error

See my problems in KMB dataroundtripping.ipynb; it looks like I get a "too many redirects" error. - Salgo60 (talk) 11:27, 12 April 2021 (UTC)

The service is down

I don't know why, but the service is down: https://hub.paws.wmcloud.org/hub/login. Do you have any explanation? PAC2 (talk) 20:16, 13 February 2023 (UTC)

See https://lists.wikimedia.org/hyperkitty/list/cloud-announce@lists.wikimedia.org/thread/3IPA3YL7XVRF5WRF3RG27EKRRU7NBRLR/. TL;DR: there was a potential data corruption issue in the Cloud VPS storage servers, which triggered a global outage while the underlying issue was diagnosed and corrected. -- BryanDavis (talk) 20:45, 13 February 2023 (UTC)
Thanks for your answer. PAC2 (talk) 22:18, 13 February 2023 (UTC)

Need stringi.so

I run PAWS with an R kernel (https://public.paws.wmcloud.org/User:PAC2/ExploreSPARQLDataset.ipynb). I need to use the stringi package (https://cran.r-project.org/web/packages/stringi/index.html). Unfortunately, I'm unable to install the stringi package since it depends on libicu-dev (apt install libicu-dev). Would it be possible to install libicu-dev for PAWS users? PAC2 (talk) 22:23, 3 March 2023 (UTC)

That's probably fine to install. Could you open a Phabricator ticket with this request? Tag it with "paws".
Thanks! Vivian Rook (talk) 11:40, 4 March 2023 (UTC)
Thanks for your answer. PAC2 (talk) 21:56, 6 March 2023 (UTC)

Any objections to advertising the OpenRefine instance on Wikimedia PAWS more widely?

Hello PAWS people! I notice that updates to OpenRefine and to the OpenRefine Wikimedia Commons extension are diligently applied on Wikimedia PAWS as soon as they are released. I am very grateful for that. I notice on Phabricator that @Vivian Rook is very much on top of things. Thank you!

This online instance of OpenRefine will be very useful for many Wikimedians who want to use OpenRefine but don't have the means to install it locally, or who run into problems doing so (which, unfortunately, is quite common). I think it will even be useful for people who want to use OpenRefine outside of Wikimedia's ecosystem. I am wondering if you have any objections (re: stress due to increased traffic? more demands from this hosted version?) if I advertise the Wikimedia-PAWS-OpenRefine instance more widely?

With 'advertising more widely', I mean things like

  • Listing it as the first / preferred way to use OpenRefine in various online documentation related to Wikidata and Wikimedia Commons editing via OpenRefine (with installing locally as the second option)
  • Announcing the PAWS instance on OpenRefine's user forum as one reliable online / cloud instance of OpenRefine. This would probably attract users who are not Wikimedians per se, but who would create Wikimedia accounts to use the instance.
  • Perhaps listing the Wikimedia-PAWS-OpenRefine online instance somewhere, or in various places, in OpenRefine's own user documentation (it would be up to the OpenRefine team to decide whether they want to do that, and where).

All of these may attract more users, and hence produce more traffic, potentially more questions, bug reports and feature requests. Hence my question if you are OK with this :-)

Many thanks! Spinster (talk) 07:29, 5 April 2023 (UTC)

Hi! Any people doing wiki kinds of things should use OpenRefine in PAWS as much as they like. As for non-wiki kinds of things: well, ideally those users read the wikitech page and comport themselves in that spirit, though, in my view, that would be on the user; most of Cloud Services relies on users behaving themselves.
The only thing I would suggest adding to the advertising as described is a mention that the PAWS instance is intended for work that furthers the Wikimedia mission, to encourage good behavior. Vivian Rook (talk) 13:23, 5 April 2023 (UTC)
Thanks, @Vivian Rook, I agree, and that makes a lot of sense. I'll focus on 'advertising' OpenRefine on PAWS for Wikimedians and Wikimedia-related purposes. Just generally curious - do you have any ways in which you can see / track usage (traffic...?) of/to PAWS in general and of specific tools offered through it (both to give an impression of popularity of certain tools, but also to identify if there's too heavy use for some reason)? Spinster (talk) 13:34, 7 April 2023 (UTC)
Glad to help! As for data, there is no data collection in PAWS, at least none that I'm aware of. I do keep an eye on resource usage. Since rather few users make legitimate use of their full CPU allotment, heavy usage usually turns out to be a cryptocurrency miner. We verify that it is, and if so (or if it is some other abuse case), block the user.
Your question reminds me of an ongoing conversation in this venue about "What is a useful project?" It is a question we would like to answer with data, but we have a very difficult time quantifying what it means to be "useful". The current effect, subject to change at any time, is a general sense that usage data isn't very valuable, so collecting it doesn't get prioritized. This view can be changed, though. Vivian Rook (talk) 13:57, 7 April 2023 (UTC)

Hello @Vivian Rook, it's amazing to be able to use OpenRefine without any install! Could you add to the doc whether it's safe for users to enter their wiki credentials in the "login" form inside OpenRefine? Is it stored publicly? I want to use Special:BotPasswords; would it be possible to know what IP range PAWS is using (as seen from prod servers)? Thanks! --Framawiki (talk) 13:41, 1 June 2023 (UTC)

Hello! Nothing in PAWS is meant to be treated as secret; see https://phabricator.wikimedia.org/T226110. As for the IP, I don't know for sure, though I've always assumed prod sees PAWS from (which I just got from `curl ifconfig.me`) Vivian Rook (talk) 13:47, 1 June 2023 (UTC)
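
For anyone who wants to repeat the `curl ifconfig.me` check from inside a notebook rather than a terminal, here is a minimal Python sketch. It is only an illustration, not an official PAWS method: the function name `fetch_public_ip` and the `opener` parameter are made up for this example, and the `https://ifconfig.me/ip` plain-text endpoint is an assumption about that third-party service. The address it reports is what an external service sees, which may or may not match what the production wikis see.

```python
import ipaddress
import urllib.request


def fetch_public_ip(url="https://ifconfig.me/ip", timeout=10,
                    opener=urllib.request.urlopen):
    """Return the address an external echo service reports for this host.

    `url` is an assumed plain-text endpoint of a third-party service;
    `opener` is a hypothetical hook so the function can be exercised
    without network access.
    """
    with opener(url, timeout=timeout) as response:
        text = response.read().decode("utf-8").strip()
    # Parse the reply so an HTML error page is not mistaken for an address;
    # ipaddress.ip_address raises ValueError on anything that is not an IP.
    return ipaddress.ip_address(text)


# In a notebook cell (requires outbound network access):
#     print(fetch_public_ip())
```

The result is a single address at one moment in time, so it should not be read as a stable IP range for bot password restrictions.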

All in the topic … It’s a simple feature to share results; unfortunately the data is lost in the code blocks, so it’s not very pretty. tom (talk) 21:36, 25 September 2023 (UTC)

Performance (OpenRefine)

Speaking of the OpenRefine instance on Wikimedia PAWS... I'm using OpenRefine to import data to Wikidata, and my instance seems to be several times slower than it was when I started using it weeks ago. The current job looks like it will take a few hours (I'm ~90 minutes in and it's at 66%), whereas I recall the first one took a few minutes. The project is https://hub-paws.wmcloud.org/user/RudolfoMD/openrefine/project?project=2363175164991. I'm doing a "Reconcile cells in column Column 1 to type Q11173" on ~1600 rows. Can/should I switch to Toolserver? (I just created a dev account so I could post to this page, so now I have one.)

Am I doing something wrong? For example, is failing to shut something down when I'm not using it causing a problem? RudolfoMD (talk) 22:58, 22 December 2023 (UTC)