Help:Object storage user guide
This page explains how to use the Swift and S3-compatible object storage service in Cloud VPS. The service is provided by the Ceph RADOS Gateway.
What is object storage?
Object storage treats entire files as the largest addressable unit. Files can be created, read, or deleted via REST APIs but cannot be edited or randomly accessed. When an object is created in a public container, it can be accessed on the open internet via a single URL.
Object storage is most often used to store static content for websites, but it has many other uses.
Considerably more context on the subject is available on English Wikipedia.
Interacting with the Cloud VPS object storage service
There are two main interfaces for interacting with the object storage service:
- The OpenStack Swift-like API
- The Amazon S3-like API
Horizon and other OpenStack-native clients use the Swift interface, but third-party software is much more likely to support the S3 interface than the Swift interface. In general, you should not mix the two interfaces on the same bucket, as access control lists and similar settings do not translate directly between them.
Horizon
Logged-in project members can access Object Storage via the 'Object Store' tab in Horizon.
Each object is stored in a container. Containers can be marked as public or private. Files in public containers will be viewable to the entire internet; access to private containers is limited to users with authorized API access.
Horizon will display a link to the URL for public containers. All files are visible as subpaths of that URL.
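For example, for the testlabs project used in the CLI examples below, a public container named mycontainer would be served from a URL of the following form, with each file available as a subpath:
$ curl https://object.eqiad1.wikimediacloud.org/swift/v1/AUTH_testlabs/mycontainer/<filename>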
OpenStack CLI
Objects and containers can be accessed via the OpenStack command line. The process of authorization and access is similar to that for other OpenStack services, and access is only available via application credentials. Containers are accessed via 'openstack container' subcommands, and objects via 'openstack object' subcommands.
$ cat ./smallfile.txt
I am a small file
$ openstack --os-cloud cloudvps container create mycontainer --public
+---------------+-------------+-----------------------------------------------------+
| account | container | x-trans-id |
+---------------+-------------+-----------------------------------------------------+
| AUTH_testlabs | mycontainer | tx00000ebe291249fc76624-00652837dc-2b59b075-default |
+---------------+-------------+-----------------------------------------------------+
$ openstack --os-cloud cloudvps container show mycontainer
+----------------+-------------------+
| Field | Value |
+----------------+-------------------+
| account | AUTH_testlabs |
| bytes_used | 18 |
| container | mycontainer |
| object_count | 1 |
| read_acl | .r:*,.rlistings |
| storage_policy | default-placement |
+----------------+-------------------+
$ openstack --os-cloud cloudvps object create mycontainer ./smallfile.txt
+-----------------+-------------+----------------------------------+
| object | container | etag |
+-----------------+-------------+----------------------------------+
| ./smallfile.txt | mycontainer | ee02f0e585723ad3ea8f3f3bedf711c9 |
+-----------------+-------------+----------------------------------+
$ openstack --os-cloud cloudvps object show mycontainer smallfile.txt
+----------------+----------------------------------+
| Field | Value |
+----------------+----------------------------------+
| account | AUTH_testlabs |
| container | mycontainer |
| content-length | 18 |
| content-type | text/plain |
| etag | ee02f0e585723ad3ea8f3f3bedf711c9 |
| last-modified | Thu, 12 Oct 2023 18:16:53 GMT |
| object | smallfile.txt |
+----------------+----------------------------------+
$ curl https://object.eqiad1.wikimediacloud.org/swift/v1/AUTH_testlabs/mycontainer/smallfile.txt
I am a small file
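Cleanup uses the matching delete subcommands. A minimal sketch, reusing the container and object from the transcript above (a container must be empty before it can be deleted):
$ openstack --os-cloud cloudvps object delete mycontainer smallfile.txt
$ openstack --os-cloud cloudvps container delete mycontainer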
Swift API
The OpenStack Swift API endpoint mostly behaves like any other OpenStack API on Cloud VPS. See Help:Using OpenStack APIs for details. As with the other APIs, you should use a dedicated developer account for any credentials that are stored in shared environments (like in Cloud VPS or Toolforge).
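For raw HTTP access, the Swift API can also be driven directly with a Keystone token. A minimal sketch, assuming the cloudvps clouds.yaml entry and the AUTH_testlabs account from the examples above; a GET on the account URL lists its containers:
$ TOKEN=$(openstack --os-cloud cloudvps token issue -f value -c id)
$ curl -H "X-Auth-Token: $TOKEN" https://object.eqiad1.wikimediacloud.org/swift/v1/AUTH_testlabs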
S3 API
The service also supports an Amazon S3-compatible API, which is the de facto standard interface supported by most third-party software. The Ceph documentation contains details about particular quirks of the API. The base S3 API URL is https://object.eqiad1.wikimediacloud.org, and the S3 region is default. The public URL for each bucket (if configured with a public ACL) is https://object.eqiad1.wikimediacloud.org/PROJECT:BUCKET/.
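For example, a public object named test.txt in a hypothetical bucket demobucket in the testlabs project would be reachable at:
$ curl https://object.eqiad1.wikimediacloud.org/testlabs:demobucket/test.txt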
The Cloud VPS setup is configured to use Keystone-backed authentication, which means that you interact with the OpenStack API to create AWS-style credentials. Again, you should use a dedicated developer account for any credentials that are stored in shared environments (like in Cloud VPS or Toolforge).
To create an AWS/EC2-style credential, you need an authenticated OpenStack CLI. The credential will be bound to the current user and project.
$ openstack ec2 credential create
+------------+-----------------------------------------------------------------------------------------------------------+
| Field | Value |
+------------+-----------------------------------------------------------------------------------------------------------+
| access | someid |
| links | {'self': 'https://openstack.eqiad1.wikimediacloud.org:25000/v3/users/someuser/credentials/OS-EC2/someid'} |
| project_id | someproject |
| secret | somesecret |
| trust_id | None |
| user_id | someuser |
+------------+-----------------------------------------------------------------------------------------------------------+
Take note of the access and secret values.
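Existing credentials can be listed and revoked with the matching subcommands; the access value identifies the credential to delete:
$ openstack --os-cloud cloudvps ec2 credentials list
$ openstack --os-cloud cloudvps ec2 credentials delete someid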
s3cmd example
Here is an example of how to use s3cmd with the S3 API.
$ s3cmd --configure
Enter new values or accept defaults in brackets with Enter.
Refer to user manual for detailed description of all options.
Access key and Secret key are your identifiers for Amazon S3. Leave them empty for using the env variables.
Access Key: access
Secret Key: secret
Default Region [US]: default
Use "s3.amazonaws.com" for S3 Endpoint and not modify it to the target Amazon S3.
S3 Endpoint [s3.amazonaws.com]: object.eqiad1.wikimediacloud.org
Use "%(bucket)s.s3.amazonaws.com" to the target Amazon S3. "%(bucket)s" and "%(location)s" vars can be used
if the target S3 system supports dns based buckets.
DNS-style bucket+hostname:port template for accessing a bucket [%(bucket)s.s3.amazonaws.com]: object.eqiad1.wikimediacloud.org
Encryption password is used to protect your files from reading
by unauthorized persons while in transfer to S3
Encryption password:
Path to GPG program [/usr/bin/gpg]:
When using secure HTTPS protocol all communication with Amazon S3
servers is protected from 3rd party eavesdropping. This method is
slower than plain HTTP, and can only be proxied with Python 2.7 or newer
Use HTTPS protocol [Yes]:
On some networks all internet access must go through a HTTP proxy.
Try setting it here if you can't connect to S3 directly
HTTP Proxy server name:
New settings:
Access Key: access
Secret Key: secret
Default Region: default
S3 Endpoint: object.eqiad1.wikimediacloud.org
DNS-style bucket+hostname:port template for accessing a bucket: object.eqiad1.wikimediacloud.org
Encryption password:
Path to GPG program: /usr/bin/gpg
Use HTTPS protocol: True
HTTP Proxy server name:
HTTP Proxy server port: 0
Test access with supplied credentials? [Y/n] y
Please wait, attempting to list all buckets...
Success. Your access key and secret key worked fine :-)
Now verifying that encryption works...
Not configured. Never mind.
Save settings? [y/N] y
Configuration saved to '/home/taavi/.s3cfg'
$ s3cmd ls
2023-10-04 17:47 s3://tf-state
$ s3cmd mb s3://demobucket
Bucket 's3://demobucket/' created
$ s3cmd info s3://demobucket
s3://demobucket/ (bucket):
Location: default
Payer: BucketOwner
Expiration Rule: none
Policy: none
CORS: none
ACL: metricsinfra: FULL_CONTROL
$ s3cmd setacl s3://demobucket --acl-public
s3://demobucket/: ACL set to Public
$ s3cmd info s3://demobucket
s3://demobucket/ (bucket):
Location: default
Payer: BucketOwner
Expiration Rule: none
Policy: none
CORS: none
ACL: *anon*: READ
ACL: metricsinfra: FULL_CONTROL
URL: http://object.eqiad1.wikimediacloud.org/demobucket/
$ s3cmd put test.txt s3://demobucket --acl-public
upload: 'test.txt' -> 's3://demobucket/test.txt' [1 of 1]
13 of 13 100% in 0s 19.21 B/s done
Public URL of the object is: http://object.eqiad1.wikimediacloud.org/demobucket/test.txt
The URLs s3cmd generates are wrong, however; the real base URL for a bucket is https://object.eqiad1.wikimediacloud.org/PROJECT:BUCKET/, as described above.
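Other standard S3 clients work the same way against this endpoint. A minimal sketch with the aws CLI, assuming the access and secret values from the credential created above are exported as the usual environment variables:
$ export AWS_ACCESS_KEY_ID=access
$ export AWS_SECRET_ACCESS_KEY=secret
$ aws --endpoint-url https://object.eqiad1.wikimediacloud.org --region default s3 ls s3://demobucket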
Quotas and other limitations
By default, any project may store up to 4096 objects and use a total of 8 GB of space. These objects may be distributed over any number of containers.
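Current usage can be checked from the CLI; a minimal sketch using the account summary, which reports total bytes, containers, and objects:
$ openstack --os-cloud cloudvps object store account show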
If you require more storage space, open a quota request ticket.
Toolforge
Right now there is no Toolforge-specific way to use the object storage service. If you have a use case in Toolforge that would benefit from object storage, you can request a Cloud VPS project dedicated to that purpose.
Data persistence
Cloud VPS object storage is not backed up, and it is stored using 'erasure coding', a slightly less redundant form of Ceph storage than that used for VMs and Cinder volumes. The object store should probably not be the only place you keep your critical data.
What doesn't work
- Object storage will not work at all for projects with a - (dash) in their name due to limitations in the software. Feel free to open a project request for a new project with a simpler name, either to replace your project or to use alongside your current project for object storage.
- Large files (greater than about 200 MB) cannot be uploaded from the Horizon UI. Large file support works fine via the openstack CLI using application credentials.
Communication and support
Support and administration of the WMCS resources is provided by the Wikimedia Foundation Cloud Services team and Wikimedia movement volunteers. Please reach out with questions and join the conversation:
- Chat in real time in the IRC channel #wikimedia-cloud or the bridged Telegram group
- Discuss via email after you have subscribed to the cloud@ mailing list
- Subscribe to the cloud-announce@ mailing list (all messages are also mirrored to the cloud@ list)
- Read the News wiki page
- Use a subproject of the #Cloud-Services Phabricator project to track confirmed bug reports and feature requests about the Cloud Services infrastructure itself
- Read the Cloud Services Blog (for the broader Wikimedia movement, see the Wikimedia Technical Blog)