Help:Toolforge/Dumps
This page describes how to access Wikimedia dumps from Toolforge.
Dumps generated by Wikimedia projects
Toolforge has access to a directory that stores the public datasets (dumps) generated by Wikimedia projects.
Recent dumps
The two most recent dumps can be found in:
/public/dumps/public
This directory is read-only, but you can copy the files to your tool's home directory if necessary. Ideally you can find (or build!) a library that can be used to read data from the dumps without decompressing them. See meta:Data dumps/Other tools for some examples.
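As a minimal sketch of reading a dump without decompressing it to disk, the Python snippet below streams a bz2-compressed XML dump and yields page titles one at a time. It assumes the file is a MediaWiki XML export containing `<page>` and `<title>` elements. The function name `iter_page_titles` is hypothetical and chosen only for illustration. Only the standard library is used.

```python
import bz2
import xml.etree.ElementTree as ET


def iter_page_titles(dump_path):
    """Yield page titles from a bz2-compressed XML dump (hypothetical helper).

    The file is decompressed on the fly as it is read, so nothing is
    written to disk and the whole dump is never held in memory at once.
    """
    with bz2.open(dump_path, "rb") as stream:
        for _event, elem in ET.iterparse(stream, events=("end",)):
            # Tags are namespace-qualified, e.g. "{http://...}title";
            # keep only the local name after the closing brace.
            tag = elem.tag.rsplit("}", 1)[-1]
            if tag == "title":
                yield elem.text
            elif tag == "page":
                # Drop the finished page's children to keep memory use low.
                elem.clear()
```

To use it, pass the path of an actual dump file under /public/dumps/public and iterate over the result, for example `for title in iter_page_titles(path): print(title)`. Because the directory is read-only, any output should be written to your tool's home directory.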
Older dumps
- Older dumps can be downloaded manually from the Wikimedia downloads server, or from mirrors, which may have better bandwidth.
- /public/dumps/pagecounts-raw contains several years of pagecount/projectcount data.
Dumps in general
- Portal:Data Services#Wikimedia_Dumps
- meta:Data dumps
- https://dumps.wikimedia.org
- Operational runbooks for maintaining the Dumps generation system
Communication and support
Support and administration of the WMCS resources is provided by the Wikimedia Foundation Cloud Services team and Wikimedia movement volunteers. Please reach out with questions and join the conversation:
- Chat in real time in the IRC channel #wikimedia-cloud, the bridged Telegram group, or the bridged Mattermost channel
- Discuss via email after you have subscribed to the cloud@ mailing list
- Subscribe to the cloud-announce@ mailing list (all messages are also mirrored to the cloud@ list)
- Read the News wiki page
- Use a subproject of the #Cloud-Services Phabricator project to track confirmed bug reports and feature requests about the Cloud Services infrastructure itself
- Read the Cloud Services Blog (for the broader Wikimedia movement, see the Wikimedia Technical Blog)