Wikimedia Cloud Services team/EnhancementProposals/2020 Network refresh/2020-11-10-checkin
2020-11-10 WMCS network checkin
- Status updates
- Questions, feedback?
- Next to-dos
status updates from arturo
- cloudsw in codfw could be interesting
- test changes before introducing in eqiad
- cloudgw PoC in codfw going well:
- neutron works fine without doing the SNAT itself, which is now being done by cloudgw.
- VMs now use floating IPs (if they have one) for connections outside the cloud. No shortcomings detected so far, other than the need to refresh firewalling and other ACLs.
- neutron is happily running without our custom hacks.
- next steps, address some limitations:
- no NIC bonding in data plane. Would like to try it out, requires DCops work for the additional cable patch.
- no HA. Would like to try it out, requires another server in codfw.
- step after that: start thinking about eqiad1
- if we are fully happy with the codfw setup, then we should start planning for eqiad1
- this requires procuring 2 HW servers (could be misc spares with 1x 1Gb NIC and 2x 10Gb NICs)
- our plans for NFS will likely leverage this setup: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/notes/NAT_loophole/NFS
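As an illustration of the egress behaviour described in the cloudgw PoC notes above (a VM uses its floating IP for outside connections if it has one, otherwise the generic SNAT applies), here is a minimal sketch. All addresses, the `FLOATING_IPS` table and the `egress_source` helper are hypothetical and only model the address selection; they do not show where neutron or cloudgw actually perform each translation.

```python
from ipaddress import IPv4Address
from typing import Dict

# Hypothetical addresses (RFC 1918 and RFC 5737 documentation ranges),
# not the real WMCS allocations.
GENERIC_EGRESS = IPv4Address("203.0.113.1")  # assumed generic SNAT address

# Assumed mapping of VM internal address -> floating IP.
FLOATING_IPS: Dict[IPv4Address, IPv4Address] = {
    IPv4Address("172.16.0.10"): IPv4Address("198.51.100.10"),
}


def egress_source(vm_ip: IPv4Address) -> IPv4Address:
    """Return the source address seen outside the cloud for a given VM."""
    # A VM with a floating IP is seen under that address;
    # every other VM falls back to the generic SNAT address.
    return FLOATING_IPS.get(vm_ip, GENERIC_EGRESS)


if __name__ == "__main__":
    for vm in (IPv4Address("172.16.0.10"), IPv4Address("172.16.0.11")):
        print(vm, "->", egress_source(vm))
```

A table like this is also a convenient input when refreshing firewalling and ACLs, since it lists the source addresses that outside services can expect to see.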
Questions
- faidon: netns doesn't provide enough isolation of both realms (context: NFS design ideas)
- arturo: openstack does exactly the same in other services
- brooke: NFS doesn't provide direct escalation paths
- arturo: we are assuming risks when running the cloud in the same DC as prod, we share stuff. Openstack bridges things by nature.
- faidon: defense in depth approach
- andrew: we take prod management for granted. Is this in scope?
- faidon: can we drop VMs reaching the wikis using cloud addresses now?
- arturo: not in our KRs for this quarter
- arzhel: can we please work on reviewing ACLs this quarter? https://phabricator.wikimedia.org/T264993
- arturo: yes, I can do that
- birgit: key takeaway for NFS: how many layers of security are we adding vs the current model?
- brooke: we are in the early stage of NFS evaluation: we are in the research phase, not in the feedback collection phase.
- birgit: can we have 3 top goals for the NFS project? like guiding principles, to make sure we keep on track (e.g. high performance for toolforge users, more security layers, etc).
- faidon: add realm boundaries to diagrams, to make clear where prod/cloud start and end.
- arturo: any concerns with the cloudgw project moving forward?