Introduction to Google Cloud Storage
Google Cloud Storage is a persistent location for your files. It is designed to store any file format, and this kind of storage is generally referred to as BLOB (Binary Large Object) storage.
Cloud Storage is a useful staging area. You will often need to place your data here before you can load it into a service like Cloud SQL, BigQuery, or Dataproc.
One way of accessing Cloud Storage is via a command line tool called gsutil. Other ways of accessing it are via the GCP console, a REST API, or from a supported programming language. You will need to have the Google Cloud SDK installed before you can use gsutil.
Cloud Storage makes use of buckets, which you could think of as individual storage locations. When you copy a file to Cloud Storage, you will need to specify a bucket as the target location. For example:
gsutil cp file.ext gs://bucket_name
It’s possible to copy files in a multi-threaded manner using the -m parameter.
gsutil -m cp file*.ext gs://bucket_name
If you wanted to place the file in a particular folder, it would be:
gsutil cp file.ext gs://bucket_name/folder_name/
gsutil supports other operations besides cp, for example rm, mv, and ls.
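To make those other operations concrete, here is a minimal sketch that wraps ls, mv, and rm in small shell functions. The function names are my own invention, and bucket_name, folder_name, and the file names are placeholders rather than anything that exists in your project.

```shell
# Hypothetical helper functions around gsutil; all names are placeholders.

# List the objects under a folder in a bucket.
list_folder() {
    gsutil ls "gs://$1/$2/"
}

# Rename (move) an object within the same bucket.
rename_object() {
    gsutil mv "gs://$1/$2" "gs://$1/$3"
}

# Delete a single object.
delete_object() {
    gsutil rm "gs://$1/$2"
}

# Usage (needs the Google Cloud SDK and access to the bucket):
#   list_folder bucket_name folder_name
#   rename_object bucket_name folder_name/file.ext folder_name/renamed.ext
#   delete_object bucket_name folder_name/renamed.ext
```

Keep in mind that rm is permanent unless object versioning is enabled on the bucket, so it is worth running ls first to check what a path matches.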
Recall that you can create a bucket from the console. The following image illustrates that.
Similarly, you could make use of the console to upload files to this bucket.
Besides copying files, you could set up a transfer service. With this, you set up monitoring on a location, and as new files appear in that location, they get copied over to your bucket. This can be done from the console, as well as programmatically. There are a bunch of options, so I recommend you look up the relevant documentation.
Access control is available at both the bucket and the object level. By default, when you upload files to your bucket, they are private: they are not readable by the public or by users you have not granted access to. You can grant access from the console as shown below.
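Access can also be granted from the command line. The sketch below shows one object-level grant and one bucket-level grant. The function names, the email address, and the bucket and file names are placeholders I made up for illustration.

```shell
# Hypothetical helper functions; all names and addresses are placeholders.

# Object level: give one user read access to a single object via its ACL.
grant_object_read() {
    gsutil acl ch -u "$1:R" "gs://$2/$3"
}

# Bucket level: let anyone on the internet read every object in the bucket.
make_bucket_public() {
    gsutil iam ch allUsers:objectViewer "gs://$1"
}

# Usage (needs the Google Cloud SDK and permission to change access):
#   grant_object_read user@example.com bucket_name file.ext
#   make_bucket_public bucket_name
```

Object-level ACLs are not available on buckets that have uniform bucket-level access turned on, so on those buckets only the IAM form applies.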
Cloud Storage can work much like a CDN (content delivery network). You can host your files here, and your web applications can easily make use of them. Recall the access control features of Cloud Storage. This will let you grant your users permission to both store and retrieve files.
Cloud Storage comes with the idea of storage classes, and you pick one when creating a bucket. Here is a closer look at what they are:
Multi-regional is where you would store files for use by your applications that are public facing. Regional is best for internal jobs that require storage. Nearline is for backups. Coldline is for disaster recovery, basically data you will not be touching but that needs to be kept for when regulatory authorities ask for it. The cost of storage goes down in that order (Multi-regional, Regional, Nearline, Coldline), from left to right as illustrated below.
Multi-regional and Regional buckets have no retrieval fee, although the usual network and operation charges still apply. Fetching from Nearline adds a retrieval charge, and fetching from Coldline costs even more.
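The storage class can also be chosen from the command line when you create a bucket. This is a rough sketch, assuming the class names gsutil accepts for the four classes above (multi_regional, regional, nearline, coldline). The function names, bucket names, and locations are placeholders.

```shell
# Hypothetical helper functions; bucket names and locations are placeholders.

# Create a bucket with a given storage class and location.
make_bucket() {
    gsutil mb -c "$1" -l "$2" "gs://$3"
}

# Move an existing object to a different storage class by rewriting it.
change_object_class() {
    gsutil rewrite -s "$1" "gs://$2/$3"
}

# Usage (needs the Google Cloud SDK and permission to create buckets):
#   make_bucket regional us-central1 bucket_name
#   make_bucket nearline us-east1 backup_bucket_name
#   change_object_class coldline backup_bucket_name file.ext
```

The location has to suit the class: a Regional bucket takes a single region such as us-central1, while a Multi-regional bucket takes a multi-region such as us.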
You can also use your Cloud Storage as a static web server!
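As a rough illustration of the static web server idea, the sketch below sets the main page and the error page for a bucket. The function name, the bucket name www.example.com, and the page names index.html and 404.html are all placeholders, and the files are assumed to be already uploaded and publicly readable.

```shell
# Hypothetical helper function; bucket and page names are placeholders.

# Tell Cloud Storage which object to serve as the main page (-m)
# and which to serve when a requested object is not found (-e).
configure_website() {
    gsutil web set -m "$2" -e "$3" "gs://$1"
}

# Usage (needs the Google Cloud SDK and ownership of the bucket):
#   configure_website www.example.com index.html 404.html
```

These settings take effect when the bucket is served through a domain that points at Cloud Storage, which is why the bucket in the usage line is named after a domain.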
I hope you found this quick tutorial useful. Have fun using Google Cloud Storage.