Uploading and Retrieving images on Google Cloud Storage

As you may already know, Google Cloud Platform offers multiple options for storing data, and Google's documentation explains when to use which one.
For static content like files and videos, Google recommends Google Cloud Storage (GCS). There is also the older Blobstore, which serves a similar purpose but is on its way to being deprecated. This page covers using GCS to store images.
See Google's setup documentation for the basic requirements for getting started with GCS. In the Cloud Storage Browser, the buckets already created for your project are listed; selecting a bucket shows the objects inside it.
In this example, the image file lives in the 'jda-pd-slo-sandbox.appspot.com' bucket. Without proper access you won't be able to add or delete files or folders from the browser, but code running with the service account should have no such problem.
Objects on GCS are immutable, so you can't edit an object after it has been created; you can only overwrite it with a new one.
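In other words, "updating" an image really means writing a complete new object under the same name. A minimal sketch using the same appengine-gcs-client call as the code below (gcsService, the bucket name, and newImageBytes are placeholders):

  GcsFilename file = new GcsFilename("bucket_name", "image.png");
  GcsFileOptions options = GcsFileOptions.getDefaultInstance();
  //createOrReplace atomically replaces the old object with the new bytes.
  gcsService.createOrReplace(file, options, ByteBuffer.wrap(newImageBytes));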

Code

The following code accepts an image file, resizes it, uploads it to a Cloud Storage bucket, and returns a URL. To test it, use Postman to send a POST request to https://peaceful-guide-133423.appspot.com/uploadData/{imageName}: set the body type to Binary, attach an image, and send. You should get back the URL of the resized image. (A minimal Java test client is sketched after the listing.)
Note that this won't work unless it is deployed on the cloud: when running locally the code can't access any bucket on GCS, and you will get all sorts of errors and exceptions.
//This code uploads an image to GCS in a bucket named 'bucket_name', naming the object after the
//{imageName} path variable with a .png extension. If you run it multiple times, it overwrites the image.
//In the second part it fetches the image back, applies a transform to it using ImagesService, saves it
//to GCS again, and returns a URL to access the image.
@RequestMapping(
        value = "/uploadData/{imageName}",
        method = RequestMethod.POST,
        produces = MediaType.APPLICATION_JSON_VALUE)
public String uploadImage(@PathVariable(value = "imageName") String imageName, HttpServletRequest request) {
  GcsService gcsService = GcsServiceFactory.createGcsService(new RetryParams.Builder()
      .initialRetryDelayMillis(10)
      .retryMaxAttempts(10)
      .totalRetryPeriodMillis(15000)
      .build());
  String bucketName = "bucket_name";
  String userFileName = imageName + ".png";
  GcsFilename fileName;
  GcsFileOptions options = new GcsFileOptions.Builder().cacheControl("no-cache").build();
  try {
    //Get the GcsFilename using the bucket name and the desired file name.
    fileName = createGcsFileName(bucketName, userFileName);

    //Open a write channel to save the image on GCS.
    GcsOutputChannel outputChannel = gcsService.createOrReplace(fileName, options);

    //Actually write the image to the GCS bucket.
    copy(request.getInputStream(), Channels.newOutputStream(outputChannel));
  } catch (Exception e) {
    System.out.println(e.getMessage());
  }

  //To transform the image, get it back from GCS.
  Image originalImage = getImageFromGCS(bucketName, userFileName);
  ImagesService imagesService = ImagesServiceFactory.getImagesService();

  //Transform the image.
  Image modifiedImage = transformImage(imagesService, originalImage);

  fileName = createGcsFileName(bucketName, userFileName);
  //Save the transformed image on GCS.
  try {
    gcsService.createOrReplace(fileName, options, ByteBuffer.wrap(modifiedImage.getImageData()));
  } catch (IOException e) {
    e.printStackTrace();
  }

  //Get a URL to access the image saved on GCS.
  String url = imagesService.getServingUrl(ServingUrlOptions
      .Builder.withGoogleStorageFileName("/gs/" + bucketName + "/" + userFileName)
      .secureUrl(true));

  //Do whatever you want with the URL!
  request.getSession().setAttribute("URL", url);
  return "imageUpload";
}

/*
 This method creates a GcsFilename from a bucket name and a file name. The file name is the name you
 want the object to have on GCS; it is not necessarily the name of the original file.
*/
private GcsFilename createGcsFileName(String bucketName, String userFileName) {
  return new GcsFilename(bucketName, userFileName);
}

/*
 This method fetches an image from GCS using BlobstoreService. To create the 'BlobKey', the file name
 must be given in the format /gs/<bucketName>/<fileName>. If, while saving, you gave a path like
 <bucketName>/some/directory/fileName, then in the browser you will see fileName created inside
 some/directory, but to access the image you still have to give the whole path, not just the name.
*/
private Image getImageFromGCS(String bucketName, String userFileName) {
  BlobstoreService blobstoreService = BlobstoreServiceFactory.getBlobstoreService();
  BlobKey blobKey = blobstoreService.createGsBlobKey("/gs/" + bucketName + "/" + userFileName);
  return ImagesServiceFactory.makeImageFromBlob(blobKey);
}
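
/*
 Alternative sketch: since Blobstore is on its way to deprecation, the same read can be done purely
 through GcsService. This assumes the appengine-gcs-client library plus java.io.ByteArrayOutputStream;
 openPrefetchingReadChannel and ImagesServiceFactory.makeImage are existing API calls, and the loop
 mirrors the copy() helper below.
*/
private Image getImageViaGcsService(GcsService gcsService, String bucketName, String userFileName)
    throws IOException {
  GcsFilename fileName = new GcsFilename(bucketName, userFileName);
  //Read the object's bytes through a prefetching channel instead of going via a BlobKey.
  try (InputStream in = Channels.newInputStream(
      gcsService.openPrefetchingReadChannel(fileName, 0, BUFFER_SIZE))) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    byte[] buffer = new byte[BUFFER_SIZE];
    int bytesRead;
    while ((bytesRead = in.read(buffer)) != -1) {
      bytes.write(buffer, 0, bytesRead);
    }
    //Build the Image directly from the raw bytes.
    return ImagesServiceFactory.makeImage(bytes.toByteArray());
  }
}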

/*
 This method applies a transformation to an image and returns the modified image. Here it simply resizes.
*/
private Image transformImage(ImagesService imagesService, Image originalImage) {
  Transform transform = ImagesServiceFactory.makeResize(225, 250, true);
  return imagesService.applyTransform(transform, originalImage);
}
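
/*
 For reference, ImagesServiceFactory offers other transforms besides resize, and they can be chained.
 A hedged sketch (none of this is used above):

   Transform rotate = ImagesServiceFactory.makeRotate(90);             //rotate by 90 degrees
   Transform crop = ImagesServiceFactory.makeCrop(0.0, 0.0, 0.5, 0.5); //crop to the top-left quarter
   Transform both = ImagesServiceFactory.makeCompositeTransform()
       .concatenate(rotate)
       .concatenate(crop);
   Image result = imagesService.applyTransform(both, originalImage);
*/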

//Buffer size for the copy. The original listing does not show this constant; 2 MB is an assumed value.
private static final int BUFFER_SIZE = 2 * 1024 * 1024;

private void copy(InputStream input, OutputStream output) throws IOException {
  try {
    byte[] buffer = new byte[BUFFER_SIZE];
    int bytesRead = input.read(buffer);
    while (bytesRead != -1) {
      output.write(buffer, 0, bytesRead);
      bytesRead = input.read(buffer);
    }
  } finally {
    input.close();
    output.close();
  }
}
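If you'd rather exercise the endpoint from code than from Postman, here is a minimal test-client sketch using plain java.net.HttpURLConnection and java.nio.file.Files; the URL is the sample deployment from above, and 'myImage' and 'local-image.png' are placeholders:

  URL url = new URL("https://peaceful-guide-133423.appspot.com/uploadData/myImage");
  HttpURLConnection conn = (HttpURLConnection) url.openConnection();
  conn.setRequestMethod("POST");
  conn.setDoOutput(true);
  try (OutputStream out = conn.getOutputStream()) {
    //Send the raw image bytes as the request body, just like Postman's Binary mode.
    Files.copy(Paths.get("local-image.png"), out);
  }
  System.out.println("Response code: " + conn.getResponseCode());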
Known Issue
If you run the above code multiple times, you will notice that even though the image gets overwritten when you send different images, the URL you get is still the one you got the first time. This appears to be due to caching in intermediate proxy servers, and currently I don't know how to resolve it. Workarounds are to use the MD5 hash of the file as its name, or to store different images under different names.
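A quick sketch of the first workaround: derive the object name from the MD5 hash of the image bytes, so a different image always gets a fresh name (and hence a fresh serving URL). This uses java.security.MessageDigest; the helper name is made up for illustration:

  private String hashedFileName(byte[] imageBytes) throws NoSuchAlgorithmException {
    byte[] digest = MessageDigest.getInstance("MD5").digest(imageBytes);
    StringBuilder name = new StringBuilder();
    for (byte b : digest) {
      name.append(String.format("%02x", b)); //two hex characters per byte
    }
    return name.append(".png").toString();
  }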
 
