Command line access
Access to HTTP GUIs in the Analytics Cluster is currently very restricted. You must have shell accounts on analytics nodes.
At the very minimum, you must have a shell account on the primary NameNode (analytics1001), because HDFS uses the POSIX accounts on that host to grant access to files.
The Hue (Hadoop User Experience) GUI is available at https://hue.wikimedia.org. Log in using your shell username and your LDAP credentials. If you already have cluster access but can't log into Hue, your LDAP account probably needs to be manually synced. Ask an Analytics Opsen (ottomata (email@example.com) or elukey (firstname.lastname@example.org)) for help.
Admin Instructions to sync a Hue LDAP account
When a new Hadoop user is added, an admin should give them a Hue account. Once this ticket is resolved, this process should be automatic.
- Log into https://hue.wikimedia.org
- In the upper right, click on your username and select Manage Users. (You will only be able to do this if you are a Hue admin; another admin can make you one.)
- Click 'Add/Sync LDAP User'
- Fill in the form with their shell username (not LDAP/Wikitech login), deselect both 'Distinguished name' and 'Create home directory', and click 'Add/Sync user'
If you are in the wmf LDAP group (every WMF employee/contractor) and you only care about the YARN ResourceManager UI, you can log in directly at yarn.wikimedia.org.
Otherwise, if you have access to the nodes you want to send HTTP requests to, you can access specific HTTP services using direct ssh tunneling.
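To access the Hadoop ResourceManager job browser, try running one of the following. Both commands forward local port 8088 to analytics1001 through a different intermediate host, so only one can be active at a time: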
ssh -N stat1004.eqiad.wmnet -L 8088:analytics1001.eqiad.wmnet:8088
ssh -N bast1001.wikimedia.org -L 8088:analytics1001.eqiad.wmnet:8088
And then navigate to http://localhost:8088/cluster in your browser.
You might want to check out the FairScheduler interface here too. It will show you usage of the cluster per user: http://localhost:8088/cluster/scheduler
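The tunnel commands above share one pattern: local port, target host, target port. The following is a minimal sketch that builds the command from variables so it can be inspected before running. The names JUMP_HOST, TARGET_HOST, PORT, and tunnel_cmd are illustrative assumptions, not part of any cluster tooling; the default values are taken from the commands above.

```shell
#!/bin/sh
# Sketch only: assemble the ssh port-forwarding command shown above.
# JUMP_HOST, TARGET_HOST and PORT are hypothetical variable names
# chosen for this example; override them in the environment if needed.

JUMP_HOST="${JUMP_HOST:-stat1004.eqiad.wmnet}"        # host you ssh into
TARGET_HOST="${TARGET_HOST:-analytics1001.eqiad.wmnet}" # host serving the UI
PORT="${PORT:-8088}"                                  # same port locally and remotely

# Print the tunnel command instead of running it, so it can be
# checked (or copied) before a connection is opened.
tunnel_cmd() {
    printf 'ssh -N %s -L %s:%s:%s\n' "$JUMP_HOST" "$PORT" "$TARGET_HOST" "$PORT"
}

tunnel_cmd
```

To go through the bastion instead, set JUMP_HOST=bast1001.wikimedia.org before running the script. Once the printed command is running in another terminal, the UI should be reachable at http://localhost:8088/cluster as described above.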